Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!news.ececs.uc.edu!news.kei.com!news.mathworks.com!www.nntp.primenet.com!nntp.primenet.com!news1.best.com!nntp1.best.com!flash.noc.best.net!not-for-mail
From: dillon@best.com (Matthew Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: FreeBSD as news-server??
Date: 17 Oct 1996 00:30:36 -0700
Organization: Best Internet Communications, Inc. (info@best.com)
Lines: 292
Distribution: world
Message-ID: <544nas$b5h@flash.noc.best.net>
References: <537ddl$3cc@amd40.wecs.org> <53ucuj$8qh@twwells.com> <543urf$ar3@flash.noc.best.net> <544bat$41o@twwells.com>
NNTP-Posting-Host: flash.noc.best.net

:In article <544bat$41o@twwells.com>, T. William Wells <bill@twwells.com> wrote:
:>In article <543urf$ar3@flash.noc.best.net>,
:>Matthew Dillon <dillon@best.com> wrote:
:>: :>Also, experience (and my theoretical analysis) shows that multiple
:>: :>parallel feeds generally work better than streaming.
:...
:>
:>This is contrary to my expectations, if one is disk bottlenecked.
:>This, I suspect, is the difference between your system and mine.
:>The disks I use are pretty generic; I suspect that they're really
:>not suited to the task.

    I agree with this in regard to INND's single-process nature.  You can
    bottleneck on one disk and hold up everything else, whereas in a
    multi-process incoming-feed situation, one process may bottleneck on
    one disk while the others are free to run on other disks.

:>Anyway, the reason this is important is that if the overhead of
:>writing articles gets too large, it exceeds the ability of the
:>protocol to overlap it with network latency. Once that starts
:>happening, the protocol slows down dramatically and things like
:>increasing the TCP window size will only make things worse.
:
:>One other thing: you simply cannot run streaming and nonstreaming
:>feeds into the same server. Or, you can, but the nonstreaming
:>feeds will get so far behind as to be pointless. Even with fast
:>disks, this will be true....

    Well, the article-writing overhead *could* be decoupled from INND
    relatively easily.  It would be a 'one-hour hack' in programming
    terms: you just pipe the data to another process and go on to the
    next article.  So far, it hasn't been a problem for us.

:>: :>operations can be done somewhat asynchronously, you don't get
:>: :>much "win" by minimizing history accesses.
:>:
:>: Perhaps it is pointless with a single feed, but it certainly
:>: is NOT pointless if you have multiple redundant feeds.
:>
:>I have four feeds. My disk statistics really don't reflect your
:>opinion. Or, put it this way: if you have ten incoming feeds and
:>they all require a disk hit, that's twenty five disk hits per
:>second. This doesn't strain the disks at all....
:>
:>: History file caching is EXTREMELY important, because it means
:>: that 6 out of 7 responses to IHAVE requests will be cached
:>: (because the response is 'I've already got the article'), and
:>: thus involve *NO* disk activity whatsoever.
:>
:>Alas, this is only true if your feeds are all so close to "real
:>time" that things remain in the cache. Otherwise, caching doesn't
:>do anything for you. (In my system, I solve this problem with a
:>message id daemon, which eliminates most redundant history
:>lookups.)

    You can cache a *lot* of history file.  Sure, the cache will not be
    as optimal, but it will still be there, and it will be a disk read
    rather than a file create.
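    To illustrate the point (this is just a rough sketch, NOT INN's dbz
    code), the common case can be answered out of an in-core hash of
    recently-offered Message-IDs, and the history file only has to be
    touched for IDs the cache has never seen:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKET 65536                   /* power of two */

struct ent {
    struct ent *next;
    char id[1];                         /* Message-ID string */
};

static struct ent *table[NBUCKET];

static unsigned
hashid(const char *s)
{
    unsigned h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return (h & (NBUCKET - 1));
}

/*
 * Return 1 if the Message-ID has been offered before (we can answer
 * "already have it" with no disk activity at all), 0 if it is new
 * (the caller falls through to the real history lookup).
 */
int
seen_or_add(const char *id)
{
    unsigned h = hashid(id);
    struct ent *e;

    for (e = table[h]; e != NULL; e = e->next) {
        if (strcmp(e->id, id) == 0)
            return (1);
    }
    if ((e = malloc(sizeof(*e) + strlen(id))) == NULL)
        return (0);                     /* out of core: punt to history */
    strcpy(e->id, id);
    e->next = table[h];
    table[h] = e;
    return (0);
}

int
main(void)
{
    printf("%d\n", seen_or_add("<544nas$b5h@flash.noc.best.net>"));  /* 0 */
    printf("%d\n", seen_or_add("<544nas$b5h@flash.noc.best.net>"));  /* 1 */
    return (0);
}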
    Even so, if you have streaming turned on (I know you hate
    streaming!), you actually *do* get some locality of reference
    insofar as history file lookups go, since a backed-up feed still
    tends to be time-ordered in a fuzzy statistical sense.

:>: There are a thousand things that cause create references
:>: to unlinked history files... literally!
:>
:>I don't have them. Ever. Maybe it's just luck. :-)

    I didn't have such problems either until my active nnrpds went over
    100.  As with many other things, it isn't a problem until your
    statistical sample is large enough and then something goes slightly
    wrong.  Boom!

:>: :>-- otherwise overchan spends a lot of time directory searching.
:>:
:>: The overview file is normally near the beginning of the directory.
:>
:>No it is not. Because when you do an expireover, it makes a new
:>history file and renames it to the old one. There's no guarantee
:>that that will end up near the beginning of the directory. In
:>fact, odds are pretty good it won't be.

    It is near the beginning of the directory.  Hey!  This should be
    easy to prove!  I'll write a little program that scans the directory
    and tells me what slot .overview is in.  Hold on....

#include <sys/types.h>
#include <sys/dir.h>
#include <stdio.h>
#include <string.h>

int
main(int ac, char **av)
{
    DIR *dir;
    struct direct *den;
    int i;

    /* for each directory argument, report which slot .overview is in */
    for (i = 1; i < ac; ++i) {
        int count = 0;

        if ((dir = opendir(av[i])) == NULL)
            continue;
        while ((den = readdir(dir)) != NULL) {
            ++count;
            if (strcmp(den->d_name, ".overview") == 0) {
                printf("%s\t%d\n", av[i], count);
                break;
            }
        }
        closedir(dir);
    }
    return (0);
}

    ok... now:

news1:/home/news/spool/news/comp/sys# /tmp/x */
3b1/        3
acorn/      3
aix/        3
alliant/    3
amiga/      3
apollo/     3
apple/      3
apple2/     3
arm/        3
atari/      7
att/        3
be/         3
cbm/        3
cdc/        3
convex/     3
...

news1:/home/news/spool/news/rec/arts# /tmp/x */
animation/  3
anime/      3
ascii/      3
bodyart/    3
bonsai/     3
books/      3
comics/     4
dance/      3
disney/     3
drwho/      3
erotica/    3
fine/       3
...

news1:/home/news/spool/news/rec/arts/disney# /tmp/x */
animation/      3
announce/       3
merchandise/    3
misc/           3
parks/          3

news1:/home/news/spool/news/rec/arts/disney# ls -1 */ | wc
     877     872    5235

    See?  The .overview file is near the beginning of the directory.
    Ergo, it is likely to be cached, since the article file was just
    written to the same directory.

:>: Besides, overchan is an asynchronous
:>: process. It does not really matter if it takes a little extra
:>: overhead...
:>
:>Yes it does. Because if innd can't buffer it, you get entries lost
:>into the batch file. Unless you go to pains to ensure that those
:>entries get processed, you end up with nnrpds wasting time
:>recreating those entries.

    Huh?  I have no idea what you are talking about here.  nnrpd does
    not go around creating .overview entries.  overchan is asynchronous,
    and it has no effect whatsoever on innd unless it gets behind.  I
    have NEVER seen overchan get behind... ever... the system could be
    dying and overchan still wouldn't get behind.

:>: it's in the noise because the directory in question
:>: has *already* been cached by the act of writing out the article
:>: file in the first place. The namei caching works for .overview
:>: files as well.
:>
:>Alas not, because overchan is asynchronous. By the time it's
:>ready to fiddle with the overview file, that directory stuff is
:>likely to be long gone.

    This is not true at all.  A 4K buffer is equivalent to less than a
    hundred articles.  It's still cached.  We aren't talking about hour
    delays here, or even 5-minute delays; we are talking about 30
    seconds of delay.  You can tune the buffer sizes to whatever you
    want.  On MY system it's cached.
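    For reference, an asynchronous channel consumer is nothing fancy.
    Here is a rough sketch in the spirit of overchan -- the input format
    ("<group dir><TAB><overview data>") is made up for the example and
    is NOT the real overchan format.  innd writes one line per article
    down the pipe, and the consumer appends overview data at its own
    pace:

#include <stdio.h>
#include <string.h>

int
main(void)
{
    char line[8192];
    char path[4096];
    char *tab;
    FILE *ov;

    /* one line per article from innd: "<group dir>\t<overview data>\n" */
    while (fgets(line, sizeof(line), stdin) != NULL) {
        if ((tab = strchr(line, '\t')) == NULL)
            continue;                   /* malformed line, skip it */
        *tab = '\0';
        snprintf(path, sizeof(path), "%s/.overview", line);
        if ((ov = fopen(path, "a")) == NULL)
            continue;                   /* group directory missing */
        fputs(tab + 1, ov);             /* overview data keeps its \n */
        fclose(ov);
    }
    return (0);
}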
:>: Any UNIX that implements vfork() will not care at all, and FreeBSD
:>: doesn't care whether you use fork() *or* vfork(). It's a big zero
:>: time-wise, even with huge data segments.
:>
:>Irrelevant because, even if FreeBSD doesn't copy or write the
:>data, it _does_ allocate swap space. Get a bunch of these all at
:>once and your server will refuse to fork. There are certain news
:>clients which have a bad habit of making large numbers of nntp
:>connections all at once. This makes random things fail on the
:>server.

    No, FreeBSD does not allocate swap space.  Lookee here, program #2:

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>

int
main(void)
{
    char *ptr;
    int i;
    int pid;

    /* dirty a 4MB data segment so every child inherits it */
    ptr = malloc(4 * 1024 * 1024);
    bzero(ptr, 4 * 1024 * 1024);
    system("pstat -s");
    for (i = 0; i < 100; ++i) {
        if ((pid = fork()) == 0) {
            sleep(30);
            exit(0);
        }
        if (pid < 0) {
            perror("fork() failed!");
            exit(1);
        }
    }
    sleep(10);
    system("pstat -s");
    return (0);
}

apollo:/home/dillon> ./x
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      141920    22180   119676    16%    Interleaved
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      141920    22180   119676    16%    Interleaved
apollo:/home/dillon>

    See?  I just forked a program with a 4MB run size 100 times... and I
    only have 140MB of swap space.  Did fork() fail?  No.... and, yes, I
    did a ps to verify that:

101 11144 10412   1  18  0  4236 4504 pause  S   p2   0:00.18 ./x
101 11147 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11148 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11149 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11150 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11151 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11152 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11153 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11154 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11155 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11156 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
.... on for 101 processes

    IRIX does reserve the swap, SUN might.  FreeBSD does not.

:>: if
:>: you have a full active file, on the order of the size of the
:>: active file (500K to 1MB usually) per nnrpd process.
:>
:>Yeah. For awhile, I was running a hacked nnrpd that read the
:>active file for each new newsgroup the user wanted. 300K images.
:>:-) Only problem was some newsreaders that positively insisted on
:>looking up several hundred newsgroups all at once, so I
:>regretfully had to retire that hack.
:>
:>: :>screw you up memory-wise. Basically, it's a bad idea to run
:>: :>channel feeds. For that matter, I think I'm going to remove the
:>: :>last of mine (for overview). Then innd will *never* fork -- and
:>: :>that's one less thing to get in the way of shovelling articles as
:>: :>fast as possible. :-)
:>:
:>: Hmm.. file batching overview :-) :-) :-)
:>
:>No particular reason not to. It certainly works for C news and
:>the nnrpds will deal with records not in the file quite yet.

						-Matt

--
    Matthew Dillon    Engineering, BEST Internet Communications, Inc.
    <dillon@best.net>  [always include a portion of the original email
                        in any response!]