Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!spool.mu.edu!howland.reston.ans.net!nntp.crl.com!news.PBI.net!news.mathworks.com!hunter.premier.net!netnews.worldnet.att.net!cbgw2.att.com!nntphub.cb.lucent.com!news
From: "John S. Dyson" <dyson@inuxs.att.com>
Newsgroups: comp.os.linux.networking,comp.unix.bsd.netbsd.misc,comp.unix.bsd.freebsd.misc
Subject: Re: TCP latency
Date: Fri, 12 Jul 1996 09:44:59 -0500
Organization: Lucent Technologies, Columbus, Ohio
Lines: 87
Message-ID: <31E664EB.167EB0E7@inuxs.att.com>
References: <4paedl$4bm@engnews2.Eng.Sun.COM> <31E106AF.41C67EA6@dyson.iquest.net> <4rvmtf$ven@linux.cs.Helsinki.FI> <31E3D9E2.41C67EA6@dyson.iquest.net> <4s5bl2$qpg@linux.cs.Helsinki.FI>
NNTP-Posting-Host: dyson.inh.lucent.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 2.0 (X11; I; FreeBSD 2.1-STABLE i386)
Xref: euryale.cc.adfa.oz.au comp.os.linux.networking:44963 comp.unix.bsd.netbsd.misc:4004 comp.unix.bsd.freebsd.misc:23380

Linus Torvalds wrote:
>
> In article <31E3D9E2.41C67EA6@dyson.iquest.net>,
> John S. Dyson <toor@dyson.iquest.net> wrote:
> >
> >One other thing, the numbers show that the DRIVER used on BSD is slower -- the
> >networking code is NOT SHOWN to be slower... Refer to the numbers...
>
> No, read the numbers again. Linux was faster on loopback too.
>
Given the same kernel compile options, that has not been shown to be true.
The difference of 20usecs is well within the range that compile options
alone can account for.

>
> >Do you know that my localhost results on my P5-166 are 200usecs?
> >That is faster than the Linux measurements that are being espoused as a
> >"record" isn't it???
>
> Ooohh.. "FreeBSD is faster over loopback, when compared to Linux over
> the wire". Film at 11.
>
> linux$ ./lat_tcp linux
> $Id: lat_tcp.c,v 1.2 1995/03/11 02:25:31 lm Exp $
> TCP latency using linux: 181 microseconds
>
> That's on a P166 too. With a stable kernel.
> What were you saying
> again?
>
Not the same machine :-(. I see that percentage here is not as important
as the absolute latency is. Seems like a pretty small difference to me
given a total reimplementation. I guess a lot of performance problems are
being fixed? Hmmm... Looks like the NEW IMPROVED Linux TCP suite is
about the same perf as the BSD code... Luckily, there is movement afoot to
clean up the BSD networking code, and I wouldn't be too awful surprised if
it betters Linux. (Some pieces of it haven't been reworked in years.)

> (And if you think you will get 10% better numbers by just changing
> compiler options, I'd suggest you _try_ it first, without spouting it on
> the newsgroups as facts with no backing).
>
I get big differences from kernel compile options (I have seen 10% or better
given -O vs. -O2 -fomit-frame-pointer, especially on code that uses lots of
registers.) You are still not controlling the experiment. Sigh... Certain
kinds of operations show big differences. One note: it is interesting that
the latency differences are the same "20usecs" on both benchmarks...

>
> And if you don't like latency numbers, what are your throughput numbers?
> (Btw, check your bcopy() speed first to see if the hardware really _is_
> comparable, see below)
>
> linux$ ./bw_tcp linux 50m
> $Id: bw_tcp.c,v 1.3 1995/06/21 21:02:49 lm Exp $
> Socket bandwidth using linux: 17.14 MB/sec
>
I also get about 17-19 MB/sec on localhost on FreeBSD. The MBUF code is
not very inefficient in reality. Again, it is hard to come to any conclusions
given different hardware.

>
> Yes, the machine was idle while doing this. I guess you can just do them
> in parallell, though, to get _some_ idea about the degradation under
> load (admittedly not a lot of sockets, but at least some activity for
> context switches etc):
>
Geesh, do you understand that your example tests only three connections
to the same machine? You are not showing scalability at all.
(Mostly you are showing that you aren't busting the cache.) The scalability
issues on the old Linux context switch didn't come into effect until about
20 processes, did they? Herein, you are showing that the localhost code under
very little, if not NO, load runs the same speed (at least to me.) But you
STILL are not addressing the issue of scalability (especially to/from
multiple TCP/IP addresses.)

>
> (This machine does memory copies at 43MB/s - don't bother comparing to
> wildly different hardware: it's memcpy() bound. I get 55MB/s on my
> alpha with the same kernel)
>
I don't bother comparing OSes unless it is the SAME hardware... (Actually,
I'll compare the results, but certainly NOT come to any conclusions.) At
least you are trying to use more information than just NO-LOAD latency to
compare the TCP suites... You ARE making progress.

John
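
For anyone wanting to try the no-load versus several-connections comparison argued about above, here is a rough sketch of a loopback ping-pong test. It is not lmbench's lat_tcp and makes no claim to match its method: the function names (measure_loopback_latency, ping_pong, echo_server) and the round count are made up for illustration, and because it is written in Python the interpreter overhead will dominate the figures. Treat the output as a way to watch how the per-round-trip time moves as connections are added on one machine, not as a number to compare across machines or OSes.

```python
import socket
import threading
import time


def echo_server(listener, n_conns):
    """Accept n_conns connections and echo every byte back, one thread each."""
    def serve(conn):
        with conn:
            while True:
                data = conn.recv(1)
                if not data:          # client closed its end
                    break
                conn.sendall(data)

    workers = []
    for _ in range(n_conns):
        conn, _addr = listener.accept()
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        t = threading.Thread(target=serve, args=(conn,), daemon=True)
        t.start()
        workers.append(t)
    for t in workers:
        t.join()


def ping_pong(port, rounds, results, idx):
    """Send one byte, wait for the echo, repeat; store mean usecs per round trip."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            s.sendall(b"x")
            if s.recv(1) != b"x":
                raise RuntimeError("unexpected echo")
        elapsed = time.perf_counter() - start
    results[idx] = elapsed / rounds * 1e6


def measure_loopback_latency(n_conns, rounds=2000):
    """Run n_conns concurrent ping-pong clients against a local echo server."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(n_conns)
    port = listener.getsockname()[1]

    server = threading.Thread(target=echo_server, args=(listener, n_conns),
                              daemon=True)
    server.start()

    results = [None] * n_conns
    clients = [threading.Thread(target=ping_pong, args=(port, rounds, results, i))
               for i in range(n_conns)]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    server.join()
    listener.close()
    return results


if __name__ == "__main__":
    for n in (1, 3, 20):
        usecs = measure_loopback_latency(n)
        print(f"{n:2d} connection(s): mean {sum(usecs) / len(usecs):.1f} usecs per round trip")
```

The connection counts 1, 3 and 20 are picked only to mirror the figures mentioned in the thread. Everything here talks to a single loopback address, so it says nothing about the multiple-address case raised above.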