Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.mira.net.au!news.netspace.net.au!news.mel.connect.com.au!munnari.OZ.AU!news.ecn.uoknor.edu!solace!news.stealth.net!demos!news1.best.com!nntp1.best.com!flash.noc.best.net!not-for-mail
From: dillon@best.com (Matthew Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: FreeBSD as news-server??
Date: 17 Oct 1996 14:17:26 -0700
Organization: Best Internet Communications, Inc. (info@best.com)
Lines: 169
Distribution: world
Message-ID: <5467p6$bl4@flash.noc.best.net>
References: <537ddl$3cc@amd40.wecs.org> <544bat$41o@twwells.com> <544nas$b5h@flash.noc.best.net> <5462it$r37@twwells.com>
NNTP-Posting-Host: flash.noc.best.net

:In article <5462it$r37@twwells.com>, T. William Wells <bill@twwells.com> wrote:
:>In article <544nas$b5h@flash.noc.best.net>,
:>Matthew Dillon <dillon@best.com> wrote:
:>: :>One other thing: you simply cannot run streaming and nonstreaming
:>: :>feeds into the same server. Or, you can, but the nonstreaming
:>: :>feeds will get so far behind as to be pointless. Even with fast
:>: :>disks, this will be true....
:>:
:>:    Well, the article writing overhead *could* be decoupled relatively
:>:    easily from INND. It would be a 'one-hour hack' in programming
:>:    terms. You just pipe the data to another process and go on to the
:>:    next article.
:>
:>Even if you do this, I think the nonstreaming feeds will
:>still get crunched. Innd processes the stuff from each stream all
:>at once, so this introduces latency in all the other feeds. This
:>is a *real* problem for nonstreaming feeds because any latency
:>above the network latency directly slows the nonstreaming feed.
:>(This is, in fact, the problem that streaming was invented to
:>solve....)

    I can see the logic to that, but I think reasonable history file
    caching pretty much fixes the problem.  Even with multiple feeds
    coming in, you still only have a relatively fixed article creation
    rate, with the remaining overhead going to history file lookups and
    sending back 'I've already got it' codes.

    A streaming feed tends to be bursty, with a *lot* of idle time
    between bursts... for example:

Oct 17 12:47:01 5H:news1 newslink[13484]: nntp1.best.com:nntp1.S43565  final  113 secs 739 acc 44 dup 0 rej, 783 tot (415/min latency us/them: 43/28 mS)
Oct 17 12:50:31 5H:news1 newslink[13642]: nntp-best.primenet.com:primenet.S40440  final  15 secs 8 acc 807 dup 0 rej, 815 tot (3260/min latency us/them: 0/17 mS)

    The first line is our newsfeed -> newsreader feed.  With most
    articles accepted, it takes approximately 113 seconds to transmit
    739 articles.  Batches are 5 minutes apart, so there is another 187
    seconds of 100% IDLE.

    The second line shows what history file caching does for you... in
    this particular case, primenet got behind in their feed to us and
    then caught up... but we already had the articles.  The article
    rate reflects the history file caching - over 54 articles/sec, most
    NOT accepted.  In this case, the entire 5 minute batch took 15
    seconds of INN's time.

    So, from my point of view, innd tends to have plenty of free cycles
    for non-streaming feeds even in the face of streaming feeds.  While
    it might lose some in latency when several streaming feeds are in
    operation, it gains it right back when those same feeds are in
    their idle period.
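    (As an aside, the 'pipe the data to another process and go on to
    the next article' decoupling mentioned at the top would look
    roughly like the sketch below.  This is only an illustration of the
    idea, not INN source; the helper process and all of its names are
    made up.)

/*
 * Sketch only: hand each article to a child "writer" process over a
 * pipe so the main loop never waits on the disk write itself.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static int
spawn_writer(void)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) < 0 || (pid = fork()) < 0) {
        perror("pipe/fork");
        exit(1);
    }
    if (pid == 0) {
        /* child: read article data from the pipe, do the slow writes */
        char buf[8192];
        ssize_t n;

        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof(buf))) > 0)
            (void)write(STDOUT_FILENO, buf, n);  /* spool to disk here */
        _exit(0);
    }
    close(fds[0]);
    return fds[1];              /* parent keeps the write side */
}

int
main(void)
{
    int writer = spawn_writer();
    const char *article = "Path: example!not-for-mail\n\nfake body\n";
    int i;

    /* main loop: queue the article, immediately go on to the next one */
    for (i = 0; i < 3; i++)
        (void)write(writer, article, strlen(article));
    close(writer);
    wait(NULL);
    return 0;
}

    (A pipe only buffers a few K, of course, so the writer still has to
    keep up on average; the win is just that the main loop isn't
    stalled on each individual article write.)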
:>: :>Alas, this is only true if your feeds are all so close to "real
:>: :>time" that things remain in the cache. Otherwise, caching doesn't
:>: :>do anything for you. (In my system, I solve this problem with a
:>: :>message id daemon, which eliminates most redundant history
:>: :>lookups.)
:>:
:>:    You can cache a *lot* of history file. Sure, the cache will not be
:>:    as optimal, but it will still be there, and it will be a disk read
:>:    rather than a file create.
:>
:>Well maybe. There's an awful lot of disk activity going on on a
:>news server and most of it isn't in the history file. This one, I
:>suspect, isn't going to be answered without tools that can
:>examine the buffering directly.

    The history file has its own partition.  And, as I said, you need
    memory for filesystem caches.  But if you set it up right, there
    isn't a problem.

:>: :>Yes it does. Because if innd can't buffer it, you get entries lost
:>: :>into the batch file. Unless you go to pains to ensure that those
:>: :>entries get processed, you end up with nnrpds wasting time
:>: :>recreating those entries.
:>:
:>:    Huh? I have no idea what you are talking about here. nnrpd does not
:>:    go around creating .overview entries. It's asynchronous, and it has
:>:    no effect whatsoever on innd unless it gets behind. I have NEVER seen
:>:    overchan get behind... ever... the system could be dying and overchan
:>:    still wouldn't get behind.
:>
:>Ok, here's what happens. If overchan gets behind, innd starts
:>creating a batch file for it. That goes in your out.going
:>directory. This *does* happen and did happen for me until I moved
:>the overview to a separate disk. This batch file doesn't ever get
:>processed. Thus some entries are lost from the overview. This is
:>not a catastrophe: nnrpd considers the spool to be the master; if
:>there is a file in the directory which doesn't have an overview
:>entry, it creates one on the fly. This entry is *not* written to
:>the overview file, it's purely internal to the nnrpd. The "wasting
:>time" I was referring to is the time to open the articles with
:>missing entries and read them for overview data.

    overchan should never get behind.  If it does, one has other
    problems to deal with.  I've run two busy news machines for two
    years now.  Overchan has not ever, not once, gotten behind.  The
    rest of the system could be dying and overchan still wouldn't get
    behind.

    Also, keep in mind my original comment... I said that if you had
    three or more spool disks, you would not have to separate the
    overview files.  In fact, I would guess that reserving a single
    spindle just for overview would cause more problems than it would
    solve.  If it's in the spool, overview disk activity gets spread
    around the N spindles like everything else in the spool.

    It's much more scalable to stripe a single spool across three or
    four (or more!) physical disks and just put the overview in the
    spool.  I suppose if you were paranoid, you could create a second
    directory hierarchy on the same (striped) partition as the spool.
    But as I said... 99% of the time (in my view), the .overview file
    will already be in the vnode or namei cache, or the blocks relating
    to the directory will already be cached from the article file's
    creation.

>: :>Alas not, because overchan is asynchronous. By the time it's
>: :>ready to fiddle with the overview file, that directory stuff is
>: :>likely to be long gone.
>:
>:    This is not true at all. A 4K buffer is equivalent to less than
>:    a hundred articles. It's still cached. We aren't talking about hour
>:    delays here, or even 5 minute delays. We are talking about 30 seconds
>:    of delay here.
>
>Well, I can't show you directory statistics anymore (because of my
>directory structure changes) but when your popular directories
>are hundreds of k's and you have a lot of nnrpds floating around
>reading from the disk, the cache turnover is pretty damned fast.
>This is another one of those where it would be nice to instrument
>the cache....
>
:>: :>Irrelevant because, even if FreeBSD doesn't copy or write the
:>: :>data, it _does_ allocate swap space. Get a bunch of these all at
:>: :>once and your server will refuse to fork. There are certain news
:>: :>clients which have a bad habit of making large numbers of nntp
:>: :>connections all at once. This makes random things fail on the
:>: :>server.
:>:
:>:    No, FreeBSD does not allocate swap space. Lookee here, program #2:
:>
:>OK, then *you* tell *me* what EAGAIN from fork means. :-) When I
:>checked the kernel code, it looked like nothing short of a swap
:>shortage would cause it. (Well, running out of slots for child
:>processes could, too, but I don't think that's the case here.
:>Even at peak times, I've still got about 50% leeway in my process
:>slots before I hit the limit.)

    I think you jumped to conclusions as to what exactly was being
    duplicated.  The only memory the kernel really needs to allocate to
    do a fork() is a few pages here and there... the page directory,
    process structure, descriptor array, plus a few pages that get
    written to immediately, such as one page in the stack, maybe a page
    or two of data... it really isn't much.  We are talking perhaps
    16 KBytes, maybe a little more if the page table is huge (though
    mayhaps FreeBSD makes the pagetable pages copy-on-write as well :-)).

    The only other thing that could cause fork to fail is the user
    process limit, which is normally 20 or 40 on FreeBSD machines by
    default.  For my test, I unlimited that resource.

    On FreeBSD machines, actual swap space is only allocated when a
    pageout must occur, and even then we are only talking about one
    page no matter how many tasks are sharing the data.
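    (The 'program #2' mentioned above isn't quoted here, but the gist
    of that kind of test is simple enough to sketch: dirty a large
    buffer, then fork children that only read it.  The sketch below is
    not the original program; the sizes and names are arbitrary.)

/*
 * Sketch (not the original "program #2"): touch a large buffer, then
 * fork children that never write to it.  With copy-on-write fork()
 * the children just share the parent's pages, so no swap needs to be
 * reserved up front and the forks succeed.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define BUFSIZE (64 * 1024 * 1024)      /* 64 MB - pick something big */
#define NCHILD  20

int
main(void)
{
    char *buf = malloc(BUFSIZE);
    int i;

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    memset(buf, 1, BUFSIZE);            /* make every page resident */

    for (i = 0; i < NCHILD; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            perror("fork");             /* EAGAIN would show up here */
            return 1;
        }
        if (pid == 0) {
            /* child: read the shared buffer, never write to it */
            volatile char c = buf[BUFSIZE - 1];

            (void)c;
            sleep(10);
            _exit(0);
        }
    }
    printf("forked %d children sharing a %d MB buffer\n",
           NCHILD, BUFSIZE / (1024 * 1024));
    while (wait(NULL) > 0)
        ;
    return 0;
}

    (On a system that reserved swap at fork time, the loop would start
    failing once the children's combined image exceeded swap; with
    copy-on-write the pages stay shared and, as noted above, swap only
    gets consumed when a pageout actually occurs.)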
						-Matt

--
    Matthew Dillon    Engineering, BEST Internet Communications, Inc.
                      <dillon@best.net>
    [always include a portion of the original email in any response!]