Path: euryale.cc.adfa.oz.au!olive.mil.adfa.oz.au!navmat.navy.gov.au!posgate.acis.com.au!warrane.connect.com.au!news.syd.connect.com.au!news.mel.connect.com.au!munnari.OZ.AU!metro!metro!inferno.mpx.com.au!news.mel.aone.net.au!imci4!imci5!pull-feed.internetmci.com!news.internetMCI.com!newsfeed.internetmci.com!in1.uu.net!brighton.openmarket.com!decwrl!olivea!strobe!jerry
From: jerry@strobe.ATC.Olivetti.Com (Jerry Aguirre)
Newsgroups: news.software.nntp,comp.unix.bsd.freebsd.misc
Subject: Re: Poor performance with INN and FreeBSD.
Date: 20 Feb 1996 22:01:42 GMT
Organization: Olivetti ATC; Cupertino, CA; USA
Lines: 46
Message-ID: <4gdgc6$ron@olivea.ATC.Olivetti.Com>
References: <311F8C62.4BC4@pluto.njcc.com> <DMu8B6.6Jn@twwells.com>
NNTP-Posting-Host: strobe.atc.olivetti.com
Xref: euryale.cc.adfa.oz.au news.software.nntp:20015 comp.unix.bsd.freebsd.misc:14108

In article <DMu8B6.6Jn@twwells.com> bill@twwells.com (T. William Wells) writes:
>I think you can see where I'm heading....but just for completeness,
>with those numbers, I can expect only 1.2 articles per second.

Yes, the limit of conventional NNTP is one article for every 2*RTT.
Each transfer requires two exchanges, one to send the IHAVE and another
to send the article, and each waits for its response while the link
sits idle.  (Your 1.2 articles per second works out to a round-trip
time of roughly 0.4 seconds.)  In practice the timing is worse, because
just about everything adds to it.  If the history lookup for the IHAVE
requires a disk read, either directly or as a result of swapping, that
delay stalls all other incoming processing.  The open and read on the
sending system, and the create and write on the receiving system, each
hang everything until they complete.  The links for cross-posted
articles also take considerable time.

The answer to that was streaming NNTP.  It sends several articles ahead
before waiting for responses, which keeps the incoming buffer full so
the server always has something ready to process.  (There is a rough
sketch of the idea at the end of this post.)

The disk I/O delays are another problem though.  As you have noticed,
the bigger the directory, the more the overhead.  My misc.jobs.offered
is nearly half a meg (for the directory, not the articles), with over
22,000 entries.  Every new article requires Unix to scan that directory
and compare every existing article name against the new one.  That
really cranks up the CPU usage.  Then there is the overhead of finding
a free spot, an inode, and disk blocks.

A hardware write cache helps a little, but not with much of the
overhead.  The writes are already buffered, so the saving is on the
file create itself; it does nothing to help with scanning a large
directory.  Also, when the file is created, Unix still probably needs
to read the disk to find a free inode and to allocate the data blocks.
These reads block innd, and during that time no other incoming feeds
are processed.

What I have been thinking about is having innd set up N (configurable)
channels to "innwd" programs.  When an article was ready to be written
to disk, a free innwd channel would be used to send it the article
pathnames and data, and the innwd would do the actual create and write
to disk.  That way innd could be doing other things (rejecting
duplicates, reading in the next article, etc.) while the innwd was
blocked on disk I/O.  With more than one innwd dividing up the load
there could be more than one simultaneous I/O request outstanding,
which would allow overlapping article creations, especially where the
spool is split across multiple drives.  The down side would be greater
CPU consumption, as the data would have to be copied from innd to
innwd.
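
To make the streaming point concrete, here is a rough sketch of a
feeder that keeps a window of offers on the wire.  It is untested, and
the WINDOW size, the host argument, and the CHECK-only exchange (a real
feeder would follow each 238 reply with a TAKETHIS and the article
body) are all my own inventions, not INN's actual transmit code:

/*
 * stream.c -- rough sketch of a streaming feed: keep up to WINDOW
 * offers on the wire instead of stopping for a reply after each one.
 *
 *     cc -o stream stream.c
 *     ./stream news.peer.example < message-ids
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define WINDOW 20                       /* offers allowed in flight */

static FILE *nntp_open(const char *host)
{
    struct addrinfo hints, *res;
    int fd;

    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, "nntp", &hints, &res) != 0)
        return NULL;
    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        freeaddrinfo(res);
        return NULL;
    }
    freeaddrinfo(res);
    return fdopen(fd, "r+");
}

int main(int argc, char **argv)
{
    FILE *peer;
    char id[256], reply[512];
    int sent = 0, acked = 0, done = 0;

    if (argc != 2 || (peer = nntp_open(argv[1])) == NULL) {
        fprintf(stderr, "usage: stream nntp-host < message-ids\n");
        return 1;
    }
    fgets(reply, sizeof reply, peer);           /* 200 greeting */
    fprintf(peer, "MODE STREAM\r\n");
    fflush(peer);
    fgets(reply, sizeof reply, peer);           /* expect 203 */

    /*
     * The whole trick: top the window up, then read one reply.
     * Responses are consumed while later offers are already in
     * flight, so the link never sits idle for a full round trip.
     */
    while (!done || acked < sent) {
        while (!done && sent - acked < WINDOW) {
            if (fgets(id, sizeof id, stdin) == NULL) {
                done = 1;
                break;
            }
            id[strcspn(id, "\r\n")] = '\0';
            fprintf(peer, "CHECK %s\r\n", id);
            sent++;
        }
        fflush(peer);
        if (acked < sent && fgets(reply, sizeof reply, peer) != NULL)
            acked++;            /* 238 here means "send TAKETHIS" */
        else if (acked < sent)
            break;              /* peer went away */
    }
    fprintf(peer, "QUIT\r\n");
    fflush(peer);
    fclose(peer);
    return 0;
}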
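
On the directory overhead: the create has to prove the name is not
already there, and in a flat directory that means comparing the new
name against every existing entry in turn.  The loop below does at user
level roughly what the kernel's lookup does; the program name and
arguments are made up for illustration:

/*
 * dirscan.c -- why a 22,000-entry spool directory hurts: every
 * create pays for a linear scan much like this one.
 *
 *     cc -o dirscan dirscan.c
 *     ./dirscan /news/spool/misc/jobs/offered 22001
 */
#include <stdio.h>
#include <string.h>
#include <dirent.h>

int main(int argc, char **argv)
{
    DIR *d;
    struct dirent *e;
    long compares = 0;

    if (argc != 3 || (d = opendir(argv[1])) == NULL) {
        fprintf(stderr, "usage: dirscan directory new-name\n");
        return 1;
    }
    /* Compare the proposed name against every entry, one by one. */
    while ((e = readdir(d)) != NULL) {
        compares++;
        if (strcmp(e->d_name, argv[2]) == 0) {
            printf("found after %ld compares\n", compares);
            closedir(d);
            return 0;
        }
    }
    printf("not found: %ld compares just to create one article\n",
           compares);
    closedir(d);
    return 0;
}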
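
And here is roughly the shape of the innd/innwd split, as a sketch
only: a parent standing in for innd fans articles out over pipes to N
writer children, each of which can block on the create and write
without stalling the parent.  The wire format (pathname, length, then
data) and the round-robin channel choice are assumptions of mine;
picking a genuinely free channel would need more care in the real
thing:

/*
 * innwd.c -- sketch of the innd-to-innwd idea: N writer children
 * soak up the disk waits while the parent keeps processing feeds.
 *
 *     cc -o innwd innwd.c && ./innwd
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

#define NCHAN 4                 /* the configurable N */

/* One innwd: read "path\nlength\n" then the data, write the file. */
static void innwd(int fd)
{
    FILE *in = fdopen(fd, "r");
    char path[1024];
    long len;

    while (fgets(path, sizeof path, in) != NULL) {
        path[strcspn(path, "\n")] = '\0';
        if (fscanf(in, "%ld", &len) != 1 || fgetc(in) != '\n')
            break;
        char *buf = malloc(len);
        if (buf == NULL || fread(buf, 1, len, in) != (size_t)len)
            break;
        int out = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (out >= 0) {         /* the create that used to block innd */
            write(out, buf, len);
            close(out);
        }
        free(buf);
    }
    exit(0);
}

int main(void)
{
    int chan[NCHAN];

    for (int i = 0; i < NCHAN; i++) {
        int p[2];

        if (pipe(p) < 0)
            return 1;
        switch (fork()) {
        case -1:
            return 1;
        case 0:                 /* child becomes an innwd */
            close(p[1]);
            innwd(p[0]);        /* never returns */
        default:                /* parent (the innd side) */
            close(p[0]);
            chan[i] = p[1];
        }
    }

    /*
     * innd's side: hand each finished article to the next channel
     * and go straight back to the feeds; the child takes the disk
     * wait.  With the spool split across drives, several of these
     * writes can be in progress at once.
     */
    const char *body = "Path: sketch!jerry\n\ntest article\n";
    for (int n = 0; n < 8; n++) {
        char hdr[64];
        int len = snprintf(hdr, sizeof hdr, "/tmp/art.%d\n%zu\n",
                           n, strlen(body));
        write(chan[n % NCHAN], hdr, len);
        write(chan[n % NCHAN], body, strlen(body));
    }
    for (int i = 0; i < NCHAN; i++)
        close(chan[i]);         /* EOF lets the children exit */
    while (wait(NULL) > 0)
        ;
    return 0;
}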