Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!news.sdsmt.edu!news.mid.net!mr.net!www.nntp.primenet.com!nntp.primenet.com!news1.best.com!nntp1.best.com!flash.noc.best.net!not-for-mail
From: dillon@best.com (Matthew Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: FreeBSD as news-server??
Date: 14 Oct 1996 11:42:20 -0700
Organization: Best Internet Communications, Inc. (info@best.com)
Lines: 162
Distribution: world
Message-ID: <53u1ic$61i@flash.noc.best.net>
References: <537ddl$3cc@amd40.wecs.org> <Dz375G.76v@news2.new-york.net> <53ott7$579@adv.IAEhv.nl> <53pm5c$5ks@twwells.com>
NNTP-Posting-Host: flash.noc.best.net

:In article <53pm5c$5ks@twwells.com>, T. William Wells <bill@twwells.com> wrote:
:>In article <53ott7$579@adv.IAEhv.nl>, Arjan de Vet <devet@adv.IAEhv.nl> wrote:
:>: Shouldn't make too much difference. However an upgrade to INN would...
:>
:>No, it wouldn't. Almost certainly, INN is slower for a single
:>incoming newsfeed than C news. In this day of huge news spool
:>directories, it is absolutely necessary that the process
:>accepting incoming NNTP *not* write the articles to the spool.
:>The latency this introduces into the protocol slows it down way
:>too much. (No, streaming doesn't help -- many providers have
:>found quite the opposite and have stopped using it....)
:>
:>With bare INN, you cannot even get 2 articles/second on typical

    woa woa!  Not true any more!  Just make sure all of your feeds
    understand INN's streaming mode.  I get about a 5 articles/sec
    transfer rate from my main news machine to my nntp machine under
    medium load conditions (around 200 nnrpd users).  It's harder to
    tell on the newsfeeds machine, since it has a dozen incoming feeds,
    but I would say the aggregate burst transfer is on the order of
    10 articles/sec.

:>C news is, in effect, a decoupled system and, more importantly,
:>unlike INN, isn't a memory hog.
:>Thus you can expect C news to
:>function more efficiently than INN for a given piece of hardware
:>and a single incoming feed. It still won't be fast enough for a
:>full feed on typical hardware.
:>

    This is true to a degree, but you hit the big problem with CNews
    on the nose below:

:>If you have more than one incoming feed, things get complex. I'll
:>save my fingers explaining why, as I have no reason to believe
:>that this person has more than one feed.

    ... which is why most people run INN now rather than CNews.

    I'm sure all of these points have been repeated, but I have a little
    time so I'll write down my own list.  These comments pertain to one
    or more full feeds:

    * You need lots of RAM.  The machine cannot afford to swap *at all*,
      plus you need enough to keep most of the history file and all of
      the history file page table in core, plus you need enough to be
      able to keep your feeds coming in *while* an expire run is going
      on.  Expire has about the same memory utilization as innd,
      effectively doubling your in-core memory requirements.  Finally,
      you need lots of left over memory for filesystem caching.

    * Lots of spindles.  Separate by functionality.

        (a) /usr/local/lib/news or /news or whatever you want to call it
            should own its own disk.  The logs *can* be put on the same
            disk.  The partition containing the history file MUST have
            at *least* 1GB free.  The reason is that it must not only
            support the potentially 100-200MB history file, it must also
            support the expire run's history file rebuild *AND* support
            active references to unlinked history files by nnrpd and
            other programs that will prevent the 'old' history file's
            space from being reclaimed.

        (b) /var/spool/news or whatever you call it.. the news spool,
            should generally own several disks.  I suggest a minimum of
            three disks.

        (c) Overview...
            it is not strictly necessary to put the overview files on a
            separate physical disk if you (1) have three or more disks
            for your main spool and (2) buffer the overview records in
            the newsfeeds file correctly.

    * If you normally have more than a dozen or so active NNTP users,
      have a *second* machine.  That is, use one machine for your
      newsfeeds machine with a minimal spool, and a second machine for
      your reader machine with a huge spool.  nnrpd processes *kill*
      INN.  You can afford to saturate your disks on the newsreader
      machine, which means you can throw four times the number of nnrpd
      processes on it that you could otherwise throw on a combination
      newsfeeds and newsreader machine.  If you batch the articles
      between the newsfeeds machine and the newsreader machine in
      bursts, you can even afford to swap on the newsreader machine.

    * Make sure INN is compiled properly

        (a) use the history file page table in-core option or the
            history file mmap() option.  I actually suggest the page
            table in-core option because most UNIX systems' buffer
            caching algorithms seem to work better with lseek()/read()
            than with mmap()/access, even though the overhead is
            greater with lseek()/read().

        (b) Buffer writes to the log file.  It's another configuration
            option.  Be generous :-)  This allows you to put the logs on
            the same physical disk as the history file.

        (c) Use the absolute latest INN release, with the streaming
            mode extensions.

        (d) If you run nnrpd, for god's sake use the shared-active
            patched version!

    * Make sure INN is configured properly

        (a) Use the 'B' option in your newsfeeds file with reasonable
            buffering numbers.  For example:

                overview:\
                        *:B4096/1024,Tc,WO:/home/news/bin/overchan

            It is stupid not to do so.  This reduces the number of
            context switches and synchronous write-to-pipe calls by an
            order of magnitude, and tends to bunch up and batch the
            overchan process.  If you do this, it is not strictly
            necessary to put the overview files on their own disk.

        (b) Use the 'B' option for ALL OUTGOING FEEDS.  If you don't,
            then every time INN parses an incoming article for
            redistribution, it will do a synchronous write-to-file or
            write-to-pipe for each outgoing feed on every incoming
            article, which kills performance.  For example:

                nntp1/nntp1.best.com\
                        :*\
                        :B8192/1024,Tf,Wnm\
                        :nntp1

            It is up to you whether you want to run your outgoing feeds
            as channels or files.  I tend to run mine to files, then run
            a 5 minute batch from cron, but this is more due to
            historical memory leaks in INND.  I also like to do things
            that way in order to have more control over the backlog.

        (c) Use innfeed or something similar that supports streaming
            mode for your outgoing feeds.

    A lot of people are so blasted worried about propagating articles
    instantly that they use low 'B' option buffering or no buffer at
    all, and use real time channels rather than 5 or 10 minute file
    batches.... and get really horrible performance as a result.  I also
    hear people complain about all the fork/exec's... I point out to
    such people that (a) channels have to fork/exec too, and with much
    greater overhead doing so from innd rather than cron, and (b) unless
    you have > 20 feeds, doing 20 fork/exec's from cron once every 5
    minutes has almost no overhead, and you can even stagger them from
    cron to create less disk contention.  This is versus the real time
    channel feeds which, even when buffered, give you NO ability to
    stagger their operational starts to reduce disk contention.

    If you are not afraid of a piddly 5 or 10 minute propagation delay,
    then use proper buffering and 5/10 minute file batches rather than
    channels.  If you really are in love with channels, then at least
    use relatively large buffering parameters.

                                                -Matt

-- 
    Matthew Dillon   Engineering, BEST Internet Communications, Inc.
                     <dillon@best.net>
    [always include a portion of the original email in any response!]
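
The effect of the 'B' high/low numbers on write counts can be sketched roughly as below. This is a simplified model and not INN's actual code: it assumes the feed's records pile up in memory until `high` bytes are queued, then drain in a single write down to `low` bytes. The function name `count_writes` and the 100-byte record size are illustrative assumptions.

```python
def count_writes(record_sizes, high=0, low=0):
    """Count write calls under a simplified high/low watermark model.

    Assumption: one write call drains the buffer from `high` down to
    `low` bytes. With high=0 and low=0 every record is written at once,
    which models an unbuffered feed. Expects low < high when buffering.
    """
    buffered = 0
    writes = 0
    for size in record_sizes:
        buffered += size
        if buffered >= high:
            writes += 1                    # one drain = one write call
            buffered = min(buffered, low)  # keep at most `low` bytes queued
    return writes


if __name__ == "__main__":
    sizes = [100] * 1000                   # assumed: 1000 records of 100 bytes
    print("unbuffered:", count_writes(sizes))
    print("B4096/1024:", count_writes(sizes, high=4096, low=1024))
```

Under these assumed record sizes the buffered run issues far fewer writes than the unbuffered one, which is the shape of the saving the post describes; real numbers depend on article rate and record length.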
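
Staggering file batches from cron, as suggested in the post, might look like the sketch below. It only generates crontab lines; the command path `/path/to/send-batch` and the site names are hypothetical placeholders, not INN programs, and the `N-59/5` minute field assumes a cron that accepts range/step syntax (Vixie cron, as shipped with FreeBSD, does).

```python
def staggered_crontab(sites, interval=5, command="/path/to/send-batch"):
    """Return crontab lines that spread batch jobs across an interval.

    `command` is a hypothetical placeholder. Each site runs once every
    `interval` minutes, offset by (index mod interval) minutes so that
    not all batches start in the same minute. The schedule is only
    evenly spaced when `interval` divides 60.
    """
    lines = []
    for index, site in enumerate(sites):
        offset = index % interval                   # start minute in the cycle
        minute_field = f"{offset}-59/{interval}"    # e.g. 2-59/5 -> 2,7,12,...
        lines.append(f"{minute_field} * * * * {command} {site}")
    return lines


if __name__ == "__main__":
    for line in staggered_crontab(["feed-a", "feed-b", "feed-c"]):
        print(line)
```

With 20 feeds and a 5 minute interval this puts four batch starts in each minute instead of twenty in one, which is the kind of control over disk contention that real time channels do not offer.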