Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.Hawaii.Edu!news.uoregon.edu!arclight.uoregon.edu!feed1.news.erols.com!howland.erols.net!cam-news-hub1.bbnplanet.com!uunet!in1.uu.net!twwells!twwells!not-for-mail
From: bill@twwells.com (T. William Wells)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: FreeBSD as news-server??
Date: 20 Oct 1996 18:27:45 -0400
Organization: None, Mt. Laurel, NJ
Lines: 105
Message-ID: <54e911$24a@twwells.com>
References: <537ddl$3cc@amd40.wecs.org> <5462it$r37@twwells.com> <5467p6$bl4@flash.noc.best.net> <546dd4$bn7@flash.noc.best.net>
NNTP-Posting-Host: twwells.com

In article <546dd4$bn7@flash.noc.best.net>,
Matthew Dillon <dillon@best.com> wrote:
: (b) Note the dead time. There is one point 14:50:27 to 14:50:42 in this
: particular sample where innd is 100% idle for 15 seconds! (and, no,
: innd was not swapped out :-)). (this occurs a lot, but I am not
: going to post thousands of lines of log files to prove it :-)).

I've noticed exactly this same dead time. However, on my system, at
least, the dead time is mostly illusory. Keep in mind that the times
in the log aren't the actual transaction time but the time that innd
got control from the select. Consistent with my experience is the
hypothesis that those dead times represent merely how long it takes
to process any given channel. This *will* depend on how fast your
disks are....

None of that is especially relevant to streaming vs. nonstreaming,
though.

Look at it this way: if you have no streaming feeds, each NNTP "step"
takes time T. If you add streaming, you don't decrease *any* T -- but
you do increase it for those that happen to hit during one of the
streaming feeds, if the streaming feed processing time is greater
than the transaction time.

Thus you *will* see an effect on nonstreaming feeds. How much of one?
That'll depend directly on what fraction of the time you can get in
nonstreaming entries of time T.
If that fraction is small, you will find that the nonstreaming feed
continually gets further behind. If it's larger, it'll get delayed
until increased dups cause the times to balance again. As it gets
larger, this delay time will decrease until it gets lost in
statistical noise.

: * Until you get up to a dozen or more *full* feeds, the only thing
: that counts are article-creation rates. That is, the
: 'I already have it' response tends to be cached and therefore
: ignorable.

As above, you're ignoring that systems with less speedy disks will
find that that dead time approaches the actual stream processing
time, starving out the nonstreaming feeds. (NB: it may not be obvious
but this starving occurs *before* you actually run out of processing
time. That's because it's a latency effect. If that time spent in
streaming were magically edited out of reality, there would be enough
time left over to do the nonstreaming feeds.)

: * That streaming mode is more efficient (caveat: in the face of
: non-streaming mode, see below)... for several reasons. It saves
: TCP packets, it allows disk and network latencies to overlap, and
: it allows statistically significant locality of reference to propagate
: in the face of a large number of incoming feeds.

A foolish efficiency is the hobgoblin of little programmers. With
apologies to Emerson. :-)

Anyway, the point is that that's an irrelevant efficiency. News
machines are almost invariably disk limited. It's nice that streaming
mode conserves CPU cycles but not especially important.

For the overlap, sure, it's an improvement for streaming feeds -- for
nonstreaming feeds sharing the same daemon, it can be a disaster. The
streaming feeds take so long that for many cycles, there is no useful
overlap.

And, for the locality of reference, this is plain hogwash. Innd does
not place a heavy strain on the history file. Ten feeds would mean 25
accesses to the history file/second, which simply isn't going to be a
problem.
On a large server, however, there will be a *lot* of reader activity.
This will result in the history file being flushed from the cache.
Locality of reference only helps if your feeds are very nearly "real
time". If your cache is 8M (it is, on my system), a history block
will live there until 8M of data have been read. That's not very
long, and if your repeated history references don't happen within a
few seconds of each other, you don't get any locality of reference.

Streaming mode doesn't help here, either. That's because, unless
you're doing "takethis" (not likely with multiple streaming feeds),
you still send the "check" and the article in separate transactions,
meaning that innd has to go all the way around its processing loop to
handle it.

: * That while non-streaming mode feeds will suffer, I suggest that the
: dead time is sufficient to handle most lower-latency non-streaming
: mode feeds.

It didn't help mine. Or a lot of other people's. :-)

: My frank opinion is that everyone should run streaming-mode feeds.

And a number of people disagree, from the experience of running
streaming vs. multiple parallel feeds.

The problem is that innd's structure doesn't give "fair" allocation
to all channels when streaming is there. The original design was
intended for lock-step protocols like nonstreaming NNTP -- streaming
breaks the design assumptions, with a number of unhappy consequences.
Multiple parallel feeds have their own problems but are compatible
with innd's design assumptions, so work better.

: Most real full-feed hubs use streaming nowadays anyway... it is not as if
: you will have much of a choice. In the last 12 months, all but one
: of my incoming full feeds went from non-streaming to streaming.

In your part of the world, perhaps. In mine, several people have
switched from streaming to multiple parallel feeds because they
simply work better.
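
To put rough numbers on the latency argument made earlier in this
post, here is a toy model. It is not innd's code, only a sketch under
stated assumptions: one single-threaded loop, the lock-step
(nonstreaming) feed gets exactly one step of cost T per pass, and the
same pass spends some time S on the streaming channels. Network round
trips are ignored. The function names, S, and the sample numbers are
all made up for illustration; only T comes from the discussion above.

```python
def lockstep_steps_per_second(step_time, streaming_time_per_pass=0.0):
    """Upper bound on steps/second for a feed served once per loop pass.

    step_time: seconds to process one nonstreaming step (T in the post).
    streaming_time_per_pass: seconds the same pass spends on streaming
        channels (S, a hypothetical parameter, not a measured value).
    """
    return 1.0 / (step_time + streaming_time_per_pass)


def backlog_growth_per_second(offered_rate, step_time, streaming_time_per_pass):
    """Steps/second by which the lock-step feed falls behind (0.0 if it keeps up)."""
    served = lockstep_steps_per_second(step_time, streaming_time_per_pass)
    return max(0.0, offered_rate - served)


if __name__ == "__main__":
    T = 0.05        # assumed cost of one lock-step step, in seconds
    offered = 5.0   # assumed steps/second the peer wants to push
    for S in (0.0, 0.05, 0.45):
        served = lockstep_steps_per_second(T, S)
        behind = backlog_growth_per_second(offered, T, S)
        print(f"S={S:.2f}s  served<={served:.1f}/s  falling behind by {behind:.1f}/s")
```

In this model T never changes; only the time between the lock-step
feed's turns does. Once S is large next to T, the feed's ceiling
drops toward 1/S regardless of how cheap its own steps are, which is
the sense in which it is a latency effect.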