Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.eng.convex.com!newshost.convex.com!newsgate.duke.edu!news.mathworks.com!hunter.premier.net!uunet!inXS.uu.net!news.artisoft.com!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.os.linux.networking,comp.unix.bsd.netbsd.misc,comp.unix.freebsd.misc
Subject: Re: TCP latency
Date: Thu, 04 Jul 1996 14:39:21 -0700
Organization: Me
Lines: 57
Message-ID: <31DC3A09.30D463F@lambert.org>
References: <4paedl$4bm@engnews2.Eng.Sun.COM> <4pf7f9$bsf@white.twinsun.com> <4qad7d$a5l@verdi.nethelp.no> <4qaui4$o5k@fido.asd.sgi.com> <4qc60n$d8m@verdi.nethelp.no> <31D2F0C6.167EB0E7@inuxs.att.com> <31D9AF0D.4C3AE08C@lambert.org> <31D9ECC5.41C67EA6@dyson.iquest.net>
NNTP-Posting-Host: hecate.artisoft.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 2.01 (X11; I; Linux 1.1.76 i486)
Xref: euryale.cc.adfa.oz.au comp.os.linux.networking:44043 comp.unix.bsd.netbsd.misc:3928

John S. Dyson wrote:
] > ] All this TCP latency discussion is interesting, but how does this
] > ] significantly impact performance when streaming data through the
] > ] connection?  Isn't TCP a streaming protocol?
] >
] > It's relevant for Samba and HTTP, which are request/response
] > protocols that tend not to take advantage of the sliding window.
] > For these protocols, the latency counts against every
] > request/response exchange, instead of only once per packet run,
] > as it does for FTP or similar bulk-transfer protocols.
]
] So I guess it is time to look at it.  Isn't this likely an
] artifact of the software interrupt/ast type scheduling in the
] BSD code?

I think maybe the way to "fix" it is to virtualize interrupts to
increase interleave.

For Samba, the way to "fix" it is to turn the reads around in the
kernel.  8-).

It's possible to save ~50% of the copies Samba does between kernel
and user space with judicious mmap'ing.

It would probably be a good idea to implement a system call that
does a write and doesn't return until it does a read; this would
halve the call overhead in a work-to-do engine model.

"Hot engine scheduling" -- using a kernel packet mux to LIFO the
pending read satisfaction order for write/read requests -- would go
a long way towards increasing cache/data locality and reducing
paging overhead.

The stack overhead is probably only about half of the removable
latency for Samba -- but it's certainly far from zero, and it can
be improved.

Kernel preemption when requests have been satisfied would probably
help more than anything else.

I was astonished at the size of a quantum in BSD (100ms?!?!); divide
that by 5 and your context switch overhead goes up, but your latency
goes down.  I really have no statistics on average quantum
utilization, which is something you'd need in order to pick the
ideal value.  If a process typically gives away its quantum after
20ms, then dividing by 5 would be a definite improvement.

What's the non-blocking "half life" of an average process, given no
involuntary preemption?  8-).


                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
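
P.S.  To make the mmap'ing point concrete, here is a minimal userland
sketch -- my illustration, not code from smbd.  Instead of read()ing
file data into a private buffer (one copy) and then write()ing it to
the socket (a second copy), map the file and write() straight from the
mapping, so the read()-side copy disappears.  The function name is made
up and the offset is assumed page aligned for brevity.

#include <sys/types.h>
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

/* Send "len" bytes starting at "off" in file "fd" down socket "sock". */
int
send_file_mmap(int sock, int fd, off_t off, size_t len)
{
	void	*p;
	size_t	 done = 0;
	ssize_t	 n;

	/* "off" is assumed page aligned; a real version rounds it down. */
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
	if (p == MAP_FAILED) {
		perror("mmap");
		return (-1);
	}
	while (done < len) {
		/* write() straight from the mapping: no read() copy. */
		n = write(sock, (char *)p + done, len - done);
		if (n < 0) {
			perror("write");
			(void)munmap(p, len);
			return (-1);
		}
		done += n;
	}
	(void)munmap(p, len);
	return (0);
}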
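
P.P.S.  The write-then-block-for-read idea, in outline: a work-to-do
engine currently pays two kernel boundary crossings per request
turnaround, and the proposed call would pay one.  The writeread() name
below is purely hypothetical -- no such call exists in any BSD or Linux
I know of; this just shows the shape of it.

#include <sys/types.h>
#include <unistd.h>

/*
 * What an smbd-style work loop has to do today: two traps per
 * turnaround (short writes ignored for brevity).
 */
ssize_t
reply_and_wait(int sock, const void *reply, size_t rlen,
    void *req, size_t reqmax)
{
	if (write(sock, reply, rlen) != (ssize_t)rlen)	/* trap #1 */
		return (-1);
	return (read(sock, req, reqmax));		/* trap #2 */
}

/*
 * With the proposed call the same turnaround is a single trap, and the
 * kernel gets to pick which blocked server process to satisfy next
 * (LIFO, for the "hot engine" effect) while it still holds the data:
 *
 *	nread = writeread(sock, reply, rlen, req, reqmax);
 */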
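
And a back-of-envelope on the quantum numbers -- arithmetic, not a
measurement, and the count of four runnable competitors is made up:

#include <stdio.h>

int
main(void)
{
	const int nrun = 4;			/* runnable competitors (assumed) */
	const int quantum_ms[] = { 100, 20 };	/* BSD default vs. default / 5 */
	int i;

	for (i = 0; i < 2; i++) {
		int q = quantum_ms[i];

		/*
		 * A freshly woken process can wait behind everyone
		 * else's full quantum; a full quantum also bounds the
		 * involuntary context switch rate.
		 */
		printf("quantum %3dms: worst-case wait %3dms, "
		    "at most %d forced switches/sec per CPU\n",
		    q, (nrun - 1) * q, 1000 / q);
	}
	return (0);
}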