Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!news.ececs.uc.edu!news.kei.com!news.mathworks.com!www.nntp.primenet.com!nntp.primenet.com!news1.best.com!nntp1.best.com!flash.noc.best.net!not-for-mail
From: dillon@best.com (Matthew Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: FreeBSD as news-server??
Date: 17 Oct 1996 00:30:36 -0700
Organization: Best Internet Communications, Inc. (info@best.com)
Lines: 292
Distribution: world
Message-ID: <544nas$b5h@flash.noc.best.net>
References: <537ddl$3cc@amd40.wecs.org> <53ucuj$8qh@twwells.com> <543urf$ar3@flash.noc.best.net> <544bat$41o@twwells.com>
NNTP-Posting-Host: flash.noc.best.net

:In article <544bat$41o@twwells.com>, T. William Wells <bill@twwells.com> wrote:
:>In article <543urf$ar3@flash.noc.best.net>,
:>Matthew Dillon <dillon@best.com> wrote:
:>: :>Also, experience (and my theoretical analysis) shows that multiple
:>: :>parallel feeds generally work better than streaming.
:...
:>
:>This is contrary to my expectations, if one is disk bottlenecked.
:>This, I suspect, is the difference between your system and mine.
:>The disks I use are pretty generic; I suspect that they're really
:>not suited to the task.

    I agree with this in regard to INND's single-process nature.  You can
    bottleneck on one disk and hold up everything else, whereas in a
    multi-process incoming-feed situation, one process may bottleneck on
    one disk while the others are free to run on other disks.

:>Anyway, the reason this is important is that if the overhead of
:>writing articles gets too large, it exceeds the ability of the
:>protocol to overlap it with network latency. Once that starts
:>happening, the protocol slows down dramatically and things like
:>increasing the TCP window size will only make things worse.
:
:>One other thing: you simply cannot run streaming and nonstreaming
:>feeds into the same server. Or, you can, but the nonstreaming
:>feeds will get so far behind as to be pointless. Even with fast
:>disks, this will be true....

    Well, the article-writing overhead *could* be decoupled from INND
    relatively easily.  It would be a 'one-hour hack' in programming
    terms: you just pipe the data to another process and go on to the
    next article.  So far, it hasn't been a problem for us.

:>: :>operations can be done somewhat asynchronously, you don't get
:>: :>much "win" by minimizing history accesses.
:>:
:>: Perhaps it is pointless with a single feed, but it certainly
:>: is NOT pointless if you have multiple redundant feeds.
:>
:>I have four feeds. My disk statistics really don't reflect your
:>opinion. Or, put it this way: if you have ten incoming feeds and
:>they all require a disk hit, that's twenty five disk hits per
:>second. This doesn't strain the disks at all....
:>
:>: History file caching is EXTREMELY important, because it means
:>: that 6 out of 7 responses to IHAVE requests will be cached
:>: (because the response is 'I've already got the article'), and
:>: thus involve *NO* disk activity whatsoever.
:>
:>Alas, this is only true if your feeds are all so close to "real
:>time" that things remain in the cache. Otherwise, caching doesn't
:>do anything for you. (In my system, I solve this problem with a
:>message id daemon, which eliminates most redundant history
:>lookups.)

    You can cache a *lot* of history file.  Sure, the cache will not be
    as optimal, but it will still be there, and it will be a disk read
    rather than a file create.
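    To illustrate the point (this is just a rough sketch, NOT INN's dbz
    code), the common case can be answered out of an in-core hash of
    recently-offered Message-IDs, and the history file only has to be
    touched for IDs the cache has never seen:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKET 65536                   /* power of two */

struct ent {
    struct ent *next;
    char id[1];                         /* Message-ID string */
};

static struct ent *table[NBUCKET];

static unsigned
hashid(const char *s)
{
    unsigned h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return (h & (NBUCKET - 1));
}

/*
 * Return 1 if the Message-ID has been offered before (we can answer
 * "already have it" with no disk activity at all), 0 if it is new
 * (the caller falls through to the real history lookup).
 */
int
seen_or_add(const char *id)
{
    unsigned h = hashid(id);
    struct ent *e;

    for (e = table[h]; e != NULL; e = e->next) {
        if (strcmp(e->id, id) == 0)
            return (1);
    }
    if ((e = malloc(sizeof(*e) + strlen(id))) == NULL)
        return (0);                     /* out of core: punt to history */
    strcpy(e->id, id);
    e->next = table[h];
    table[h] = e;
    return (0);
}

int
main(void)
{
    printf("%d\n", seen_or_add("<544nas$b5h@flash.noc.best.net>"));  /* 0 */
    printf("%d\n", seen_or_add("<544nas$b5h@flash.noc.best.net>"));  /* 1 */
    return (0);
}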
    Even so, if you have streaming turned on (I know you hate
    streaming!), you actually *do* get some locality of reference
    insofar as history file lookups go, since a backed-up feed still
    tends to be time-ordered in a fuzzy statistical sense.

:>: There are a thousand things that cause create references
:>: to unlinked history files... literally!
:>
:>I don't have them. Ever. Maybe it's just luck. :-)

    I didn't have such problems either until my active nnrpds went over
    100.  As with many other things, it isn't a problem until your
    statistical sample is large enough and then something goes slightly
    wrong.  Boom!

:>: :>-- otherwise overchan spends a lot of time directory searching.
:>:
:>: The overview file is normally near the beginning of the directory.
:>
:>No it is not. Because when you do an expireover, it makes a new
:>history file and renames it to the old one. There's no guarantee
:>that that will end up near the beginning of the directory. In
:>fact, odds are pretty good it won't be.

    It is near the beginning of the directory.  Hey!  This should be
    easy to prove!  I'll write a little program that scans the directory
    and tells me what slot .overview is in.  Hold on....

#include <sys/types.h>
#include <sys/dir.h>
#include <stdio.h>
#include <string.h>

int
main(int ac, char **av)
{
    DIR *dir;
    struct direct *den;
    int i;

    /* for each directory argument, report which slot .overview is in */
    for (i = 1; i < ac; ++i) {
        int count = 0;

        if ((dir = opendir(av[i])) == NULL)
            continue;
        while ((den = readdir(dir)) != NULL) {
            ++count;
            if (strcmp(den->d_name, ".overview") == 0) {
                printf("%s\t%d\n", av[i], count);
                break;
            }
        }
        closedir(dir);
    }
    return (0);
}

    ok... now:

news1:/home/news/spool/news/comp/sys# /tmp/x */
3b1/        3
acorn/      3
aix/        3
alliant/    3
amiga/      3
apollo/     3
apple/      3
apple2/     3
arm/        3
atari/      7
att/        3
be/         3
cbm/        3
cdc/        3
convex/     3
...

news1:/home/news/spool/news/rec/arts# /tmp/x */
animation/  3
anime/      3
ascii/      3
bodyart/    3
bonsai/     3
books/      3
comics/     4
dance/      3
disney/     3
drwho/      3
erotica/    3
fine/       3
...

news1:/home/news/spool/news/rec/arts/disney# /tmp/x */
animation/      3
announce/       3
merchandise/    3
misc/           3
parks/          3

news1:/home/news/spool/news/rec/arts/disney# ls -1 */ | wc
     877     872    5235

    See?  The .overview file is near the beginning of the directory.
    Ergo, it is likely to be cached, since the article file was just
    written to the same directory.

:>: Besides, overchan is an asynchronous
:>: process. It does not really matter if it takes a little extra
:>: overhead...
:>
:>Yes it does. Because if innd can't buffer it, you get entries lost
:>into the batch file. Unless you go to pains to ensure that those
:>entries get processed, you end up with nnrpds wasting time
:>recreating those entries.

    Huh?  I have no idea what you are talking about here.  nnrpd does
    not go around creating .overview entries.  overchan is asynchronous,
    and it has no effect whatsoever on innd unless it gets behind.  I
    have NEVER seen overchan get behind... ever... the system could be
    dying and overchan still wouldn't get behind.

:>: it's in the noise because the directory in question
:>: has *already* been cached by the act of writing out the article
:>: file in the first place. The namei caching works for .overview
:>: files as well.
:>
:>Alas not, because overchan is asynchronous. By the time it's
:>ready to fiddle with the overview file, that directory stuff is
:>likely to be long gone.

    This is not true at all.  A 4K buffer is equivalent to less than a
    hundred articles.  It's still cached.  We aren't talking about hour
    delays here, or even 5-minute delays; we are talking about 30
    seconds of delay.  You can tune the buffer sizes to whatever you
    want.  On MY system it's cached.
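    For reference, an asynchronous channel consumer is nothing fancy.
    Here is a rough sketch in the spirit of overchan -- the input format
    ("<group dir><TAB><overview data>") is made up for the example and
    is NOT the real overchan format.  innd writes one line per article
    down the pipe, and the consumer appends overview data at its own
    pace:

#include <stdio.h>
#include <string.h>

int
main(void)
{
    char line[8192];
    char path[4096];
    char *tab;
    FILE *ov;

    /* one line per article from innd: "<group dir>\t<overview data>\n" */
    while (fgets(line, sizeof(line), stdin) != NULL) {
        if ((tab = strchr(line, '\t')) == NULL)
            continue;                   /* malformed line, skip it */
        *tab = '\0';
        snprintf(path, sizeof(path), "%s/.overview", line);
        if ((ov = fopen(path, "a")) == NULL)
            continue;                   /* group directory missing */
        fputs(tab + 1, ov);             /* overview data keeps its \n */
        fclose(ov);
    }
    return (0);
}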
:>: Any UNIX that implements vfork() will not care at all, and FreeBSD
:>: doesn't care whether you use fork() *or* vfork(). It's a big zero
:>: time-wise, even with huge data segments.
:>
:>Irrelevant because, even if FreeBSD doesn't copy or write the
:>data, it _does_ allocate swap space. Get a bunch of these all at
:>once and your server will refuse to fork. There are certain news
:>clients which have a bad habit of making large numbers of nntp
:>connections all at once. This makes random things fail on the
:>server.

    No, FreeBSD does not allocate swap space.  Lookee here, program #2:

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>

int
main(void)
{
    char *ptr;
    int i;
    int pid;

    /* dirty a 4MB data segment so every child inherits it */
    ptr = malloc(4 * 1024 * 1024);
    bzero(ptr, 4 * 1024 * 1024);
    system("pstat -s");
    for (i = 0; i < 100; ++i) {
        if ((pid = fork()) == 0) {
            sleep(30);
            exit(0);
        }
        if (pid < 0) {
            perror("fork() failed!");
            exit(1);
        }
    }
    sleep(10);
    system("pstat -s");
    return (0);
}

apollo:/home/dillon> ./x
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      141920    22180   119676    16%    Interleaved
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      141920    22180   119676    16%    Interleaved
apollo:/home/dillon>

    See?  I just forked a program with a 4MB run size 100 times... and I
    only have 140MB of swap space.  Did fork() fail?  No.... and, yes, I
    did a ps to verify that:

101 11144 10412   1  18  0  4236 4504 pause  S   p2   0:00.18 ./x
101 11147 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11148 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11149 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11150 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11151 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11152 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11153 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11154 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11155 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
101 11156 11144   1  18  0  4236 4236 pause  S   p2   0:00.01 ./x
.... on for 101 processes

    IRIX does reserve the swap, SUN might.  FreeBSD does not.

:>: if
:>: you have a full active file, on the order of the size of the
:>: active file (500K to 1MB usually) per nnrpd process.
:>
:>Yeah. For awhile, I was running a hacked nnrpd that read the
:>active file for each new newsgroup the user wanted. 300K images.
:>:-) Only problem was some newsreaders that positively insisted on
:>looking up several hundred newsgroups all at once, so I
:>regretfully had to retire that hack.
:>
:>: :>screw you up memory-wise. Basically, it's a bad idea to run
:>: :>channel feeds. For that matter, I think I'm going to remove the
:>: :>last of mine (for overview). Then innd will *never* fork -- and
:>: :>that's one less thing to get in the way of shovelling articles as
:>: :>fast as possible. :-)
:>:
:>: Hmm.. file batching overview :-) :-) :-)
:>
:>No particular reason not to. It certainly works for C news and
:>the nnrpds will deal with records not in the file quite yet.

						-Matt

--
    Matthew Dillon    Engineering, BEST Internet Communications, Inc.
    <dillon@best.net>  [always include a portion of the original email
                        in any response!]