Xref: sserve comp.os.386bsd.misc:3063 comp.os.linux.misc:21227
Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!msuinfo!agate!dog.ee.lbl.gov!news.cs.utah.edu!u.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (Terry Lambert)
Newsgroups: comp.os.386bsd.misc,comp.os.linux.misc
Subject: Re: STREAMS (was I hope this wont ignite ...)
Date: 6 Aug 1994 07:29:48 GMT
Organization: Weber State University, Ogden, UT
Lines: 57
Message-ID: <31ve5c$4b1@u.cc.utah.edu>
References: <31d5ls$8e9@quagga.ru.ac.za> <Cu0w8x.923@seas.ucla.edu> <Cu2Ey9.2oM@calcite.rhyolite.com>
NNTP-Posting-Host: cs.weber.edu

In article <Cu2Ey9.2oM@calcite.rhyolite.com> vjs@calcite.rhyolite.com (Vernon Schryver) writes:
] Unfortunately, all of those put and service functions and the generic
] nature of the stream head and scheduler ensure that STREAMS are never
] as fast as sockets.  I think you can make "page flipping" and "hardware
] checksumming" work with STREAMS (two primary techniques for fast
] networking), but I doubt it is possible to make a "squashed STREAMS
] stack" without doing fatal violence to the fundamental ideas of STREAMS.
] The fastest TCP/IP implementations are based on sockets, not STREAMS,
] and they run 2 to 20 times faster (yes, twenty, as in Gbit/sec).

You can build a "stack compiler" that takes I/O and connection
specifications for multiple stacks and "squashes" them into a single
stack with apparently discrete interfaces.  There is at least one
commercial implementation that does this (I would have to look at my
notes at work to see which one).

The page flipping and HW checksumming are both good points.  Another
technique is to "pre-know" how much you need to read at the card level;
you can do this with incestuous knowledge on a per-protocol basis in
the drivers.  This can nearly triple burst rate (but won't do anything
for propagation delay).

Another "cheat" is to start pushing a packet and shove it all the way
down at a high priority.
This can't be combined with "squashing", and leads to some cute
problems unless a lot of thought goes in beforehand.

One "trick" that does do "fatal violence to the fundamental ideas of
STREAMS" (I like that phrase) is doubly mapping the buffers, pinning
the pages, and passing the address rather than the data itself.  This
requires preparing the page mappings in advance so that the kernel and
user space mappings are the same.  Packet assembly at the stream "tail"
must take this into account, but if done correctly, this will save two
copies and a *lot* of page overhead on a 386 (less so on a 486 or other
rational kernel page protecting architecture).

Another "trick" is to preallocate the buffers to include the protocol
header and thus avoid the assembly entirely (leaving only the copy to
card memory, and only that if that is a consideration and the card
doesn't DMA from main memory).  This does violence to the buffer return
and the stream head, and generally doubles the buffer memory
consumption (to be safe).  The user space copyin is done into the real
buffer as a unit instead of into "real" (separate) mbufs.  This
technique is not usable simultaneously with the previous one, unless
the user space application has incestuous knowledge of the protocol and
can handle skipping the encapsulation (header) data in dealing with the
buffer contents.

STREAMS can be high performance, but, as you note, almost at the
penalty of not being STREAMS any more except in the technical sense.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.
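
To make the header-preallocation "trick" above concrete, here is a
minimal user space sketch in C.  It is not STREAMS or mbuf code; the
names (struct pkt, pkt_alloc, pkt_put_payload, pkt_push_header,
pkt_free) are hypothetical and exist only for illustration.  The idea
shown is just that the payload is copied once into a buffer that
already has room reserved in front of it, so each protocol header is
written in place instead of being assembled into a new buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer: one allocation with reserved headroom. */
struct pkt {
    unsigned char *head; /* start of the allocation */
    unsigned char *data; /* first valid byte (header or payload) */
    size_t len;          /* number of valid bytes starting at data */
    size_t cap;          /* total size of the allocation */
};

/* Allocate a buffer with `headroom` bytes reserved for headers. */
struct pkt *pkt_alloc(size_t headroom, size_t payload_max)
{
    struct pkt *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->cap = headroom + payload_max;
    p->head = malloc(p->cap);
    if (p->head == NULL) {
        free(p);
        return NULL;
    }
    p->data = p->head + headroom; /* payload starts after the headroom */
    p->len = 0;
    return p;
}

/* Stand-in for the single copyin: append payload after any valid bytes. */
int pkt_put_payload(struct pkt *p, const void *src, size_t n)
{
    assert(p != NULL);
    size_t used = (size_t)(p->data - p->head) + p->len;
    if (n > p->cap - used)
        return -1; /* payload does not fit */
    memcpy(p->data + p->len, src, n);
    p->len += n;
    return 0;
}

/* Prepend a header in place; the payload is never moved or recopied. */
int pkt_push_header(struct pkt *p, const void *hdr, size_t n)
{
    assert(p != NULL);
    if ((size_t)(p->data - p->head) < n)
        return -1; /* not enough headroom left */
    p->data -= n;
    memcpy(p->data, hdr, n);
    p->len += n;
    return 0;
}

void pkt_free(struct pkt *p)
{
    if (p != NULL) {
        free(p->head);
        free(p);
    }
}
```

Each layer on the way down would call pkt_push_header() in turn, and
the driver would then hand p->data and p->len to the card.  The cost is
the one noted above: every buffer carries worst-case header space
whether or not it gets used.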