Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.mel.connect.com.au!news.mira.net.au!inquo!in-news.erinet.com!newsfeeder.sdsu.edu!nntp.primenet.com!news.mathworks.com!enews.sgi.com!fido.asd.sgi.com!neteng!lm
From: lm@neteng.engr.sgi.com (Larry McVoy)
Newsgroups: comp.os.linux.networking,comp.unix.bsd.netbsd.misc,comp.unix.bsd.freebsd.misc
Subject: Re: TCP latency
Date: 24 Jul 1996 21:07:40 GMT
Organization: Silicon Graphics Inc., Mountain View, CA
Lines: 98
Message-ID: <4t63as$f2q@fido.asd.sgi.com>
References: <4paedl$4bm@engnews2.Eng.Sun.COM> <4sadde$qsv@linux.cs.Helsinki.FI> <31E9E3A7.41C67EA6@dyson.iquest.net> <4sefde$f0l@fido.asd.sgi.com> <4socfr$3ot@dworkin.wustl.edu>
Reply-To: lm@slovax.engr.sgi.com
NNTP-Posting-Host: neteng.engr.sgi.com
X-Newsreader: TIN [version 1.2 PL2]
Xref: euryale.cc.adfa.oz.au comp.os.linux.networking:46228 comp.unix.bsd.netbsd.misc:4156 comp.unix.bsd.freebsd.misc:24335

Chuck Cranor (chuck@ccrc.wustl.edu) wrote:
: In article <4sefde$f0l@fido.asd.sgi.com>,
: Larry McVoy <lm@slovax.engr.sgi.com> wrote:
: >Umm, I'd be happy to entertain suggestions for a better measurement of
: >a null entry into the system. I don't want something that anyone special
: >cases - that's just worthless. I want something that is actually measuring
: >all the work you need to do to get to the point that you can do something in
: >the kernel.

: I took Larry's lat_syscall.c and a few of J Wunsch's suggestion for
: different system calls to try and ran a few tests. Here are the results:

: [note: sparc 2 is running SunOS 4.1.3_U1 (48MB RAM), P5-133MHz is running
: NetBSD/i386 (32MB RAM) ... both systems unloaded.
: all numbers are microseconds]

: program       description              Sparc2  P5-133MHz
: lat_syscall   write 1 to /dev/null         61          6
: lat_gettime   gettimeofday(&tv,0)          27          5
: lat_kill      kill(1,0)                    23          2
: lat_umask     umask(0)                     19          2
: lat_getppid   getppid()                    17          1

: Given that data, it seems like lat_syscall's writing 1 byte to /dev/null
: is indeed a poor measurement of "Null syscall." This leaves me with
: two questions:

: 1. Larry, when you were designing lat_syscall, did you see numbers like
:    the above? If not, then I would consider that a mistake. If so,
:    then why did you stick with "write 1 to /dev/null" as a measurement
:    of "Null syscall" (which I also consider a mistake)?

Sure did. If there is a mistake here, it is my choice of name for the
benchmark. What I wanted was an entry into the kernel that represented
the approximate real, average cost of getting to the point of being able
to do something useful & common. I'll try to provide some insight into
the thinking that went into this:

getpid()
	It can be trivially optimized down to a memory read. The variance
	from one system to the next does not reflect anything that can be
	used as a performance comparison.

getppid()
	This one turns into a trap plus a read. It is a "read only" type
	of benchmark; I wanted one that had to do some work.

gettimeofday()
	Some systems have a global variable that gets updated out of
	hardclock() every tick (1/HZ seconds, typically 10 millisecs).
	So this also can degenerate into a trap plus a memory load. But
	other systems actually read a high resolution clock for this
	system call, and reading it takes variable amounts of time, again
	making the results not very useful.

kill(), umask()
	These are the best "null system call" benchmark choices I've seen.
	I'd vary the mask in the umask one so that it was actually
	changing state.

The only rationale for not using these is this: the main reason I wanted
a "null system call" test was that for all of the other benchmarks, I
wanted to be able to "decompose" them into the various costs.
For example, the pipe latency benchmark is really

	process 1			process 2
	write()
			ctx switch ->
					read()
					write()
			<- ctx sw
	read()

So it is 4 system calls and 2 context switches. I wanted to be able
to look at the pipe benchmark and have the numbers roughly add up. And
they typically do.

So, I'm willing to cop to the criticism that the labeling was crappy and
perhaps I should call it the "null I/O syscall". I'm also willing (and
interested) to find a different syscall that just measures trap overhead,
but I haven't seen one yet that I really like. The getppid() call may be
the best out there, though; it's hard to cache that. Thoughts?

: 2. How much of the difference between FreeBSD lat_syscall and Linux
:    lat_syscall can be attributed to VFS overhead in FreeBSD? Or
:    more generally, how does the overhead and functionality of Linux's
:    VFS layer compare with FreeBSD's VFS layer?

Both FreeBSD and Linux offer roughly the same VFS interface. It took me
a while to wrap my brain around Linus' thinking in his stuff, having come
from a SunOS/BSD background and having spent a lot of time working in that
area, but at this point, I think I can do everything in Linux that I could
do in *BSD or SunOS, in the VFS areas.
--
---
Larry McVoy     lm@sgi.com     http://reality.sgi.com/lm     (415) 933-1804
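
A rough sketch of the decomposition argument in the post (pipe round trip
= 4 system calls + 2 context switches), for anyone who wants to try it.
This is not the lmbench code: it is written in Python for brevity, so
interpreter overhead inflates every number, and the names time_call,
pipe_round_trip and context_switch_estimate are made up for this example.
On a multiprocessor the two processes may run on different CPUs, in which
case the "context switch" term is closer to a cross-CPU wakeup cost.

```python
import os
import time


def time_call(fn, iterations=100_000):
    """Return the mean wall-clock cost of fn() in microseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


def pipe_round_trip(iterations=10_000):
    """Mean cost in microseconds of bouncing one byte between two processes."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back until the parent closes its write end.
        os.close(to_child_w)
        os.close(to_parent_r)
        while os.read(to_child_r, 1):
            os.write(to_parent_w, b"x")
        os._exit(0)
    # Parent: drop the ends that belong to the child.
    os.close(to_child_r)
    os.close(to_parent_w)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(to_child_w, b"x")  # write(), then switch to the child
        os.read(to_parent_r, 1)     # child read()s and write()s, we read()
    elapsed = time.perf_counter() - start
    os.close(to_child_w)            # EOF ends the child's loop
    os.close(to_parent_r)
    os.waitpid(pid, 0)
    return elapsed / iterations * 1e6


def context_switch_estimate(round_trip_us, syscall_us):
    """Back out one switch from: round trip = 4 syscalls + 2 switches."""
    return (round_trip_us - 4 * syscall_us) / 2


if __name__ == "__main__":
    devnull = os.open("/dev/null", os.O_WRONLY)
    write_us = time_call(lambda: os.write(devnull, b"x"))
    os.close(devnull)
    getppid_us = time_call(os.getppid)
    round_trip_us = pipe_round_trip()
    switch_us = context_switch_estimate(round_trip_us, write_us)
    print(f"write 1 byte to /dev/null: {write_us:.2f} us")
    print(f"getppid():                 {getppid_us:.2f} us")
    print(f"pipe round trip:           {round_trip_us:.2f} us")
    print(f"implied context switch:    {switch_us:.2f} us")
```

Using the write-to-/dev/null figure as the per-syscall term follows the
reasoning in the post: the four calls in the pipe test are all I/O calls,
so an I/O-style "null" call is the closer match than getppid().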