Xref: sserve comp.os.386bsd.misc:3076 comp.os.linux.misc:21295
Newsgroups: comp.os.386bsd.misc,comp.os.linux.misc
Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!msuinfo!uwm.edu!lll-winken.llnl.gov!decwrl!decwrl!netcomsv!calcite!vjs
From: vjs@calcite.rhyolite.com (Vernon Schryver)
Subject: Re: STREAMS (was I hope this wont ignite ...)
Message-ID: <Cu58wD.E1G@calcite.rhyolite.com>
Organization: Rhyolite Software
Date: Sun, 7 Aug 1994 02:43:25 GMT
References: <Cu0w8x.923@seas.ucla.edu> <Cu2Ey9.2oM@calcite.rhyolite.com> <31ve5c$4b1@u.cc.utah.edu>
Lines: 84

In article <31ve5c$4b1@u.cc.utah.edu> terry@cs.weber.edu (Terry Lambert) writes:
>In article <Cu2Ey9.2oM@calcite.rhyolite.com> vjs@calcite.rhyolite.com (Vernon Schryver) writes:
>] Unfortunately, all of those put and service functions and the generic
>] nature of the stream head and scheduler ensure that STREAMS are never
>] as fast as sockets. I think you can make "page flipping" and "hardware
>] checksumming" work with STREAMS (two primary techniques for fast
>] networking), but I doubt it is possible to make a "squashed STREAMS
>] stack" without doing fatal violence to the fundamental ideas of STREAMS.
> ...
>One "trick" that does do "fatal violence to the fundamental ideas of
>STREAMS" (I like that phrase) is doubly mapping the buffers, pinning
>the pages, and passing the address rather than the data itself. This
>requires pre-preparing the page mapping so the kernel and user space
>mapping is the same. Packet assembly at the stream "tail" must take
>this into account, but if done correctly, this will save two copies
>and a *lot* of page overhead on a 386 (less so on a 486 or other
>rational kernel page protecting architecture).

That's exactly what I call "page flipping." I don't think it does
violence to STREAMS. Simply create a new STREAMS buffer type. It's
easier to create STREAMS buffer types than fancy mbuf clusters. I don't
know why fewer people play such games with STREAMS buffers than mbufs.
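
As a rough illustration of what "a new STREAMS buffer type" could look
like, here is a minimal user-space sketch: a descriptor that points at
storage it did not allocate, carries a reference count, and calls a
routine when the last reference is dropped. This is similar in spirit
to esballoc() in SVR4 STREAMS, but it is not that interface. The names
`struct extbuf` and `extbuf_*` are made up for this example, and no
page mapping or pinning is actually done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor, NOT the real STREAMS mblk_t/dblk_t or a BSD mbuf. */
struct extbuf {
    unsigned char *base;             /* storage owned by someone else, e.g. a pinned page */
    size_t         len;              /* bytes of valid data at base */
    int            refcnt;           /* holders currently sharing the storage */
    void         (*free_fn)(void *); /* run once when the last holder lets go */
    void          *free_arg;         /* argument handed to free_fn */
};

/* Wrap caller-supplied storage; returns NULL if the descriptor cannot be allocated. */
struct extbuf *extbuf_wrap(unsigned char *base, size_t len,
                           void (*free_fn)(void *), void *free_arg)
{
    struct extbuf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->base = base;
    b->len = len;
    b->refcnt = 1;
    b->free_fn = free_fn;
    b->free_arg = free_arg;
    return b;
}

/* Share the buffer with another holder without copying the data. */
struct extbuf *extbuf_dup(struct extbuf *b)
{
    assert(b != NULL && b->refcnt > 0);
    b->refcnt++;
    return b;
}

/* A module may write in place only if nobody else holds the buffer. */
int extbuf_writable(const struct extbuf *b)
{
    return b->refcnt == 1;
}

/* Drop one reference; the storage goes back to its owner on the last one. */
void extbuf_free(struct extbuf *b)
{
    assert(b != NULL && b->refcnt > 0);
    if (--b->refcnt > 0)
        return;
    if (b->free_fn != NULL)
        b->free_fn(b->free_arg);
    free(b);
}
```

In a real kernel the free routine would presumably unpin or remap the
page; to try the sketch out, malloc'd storage with free() as the free
routine is enough.
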
"Type 3" mbufs were the rage at Sun in 1986. My FDDI code has been "page flipping" mbufs for years, with gratifying performance results. HP's FDDI code also page flips, with performance almost as good. Output page flipping is quite easy if you have copy-on-write. Input is harder, but modifying ld(1) to page align big buffers by default or special option makes it practical. >Another "trick" is to preallocate the buffers to include the protocol >header and thus avoid the assembly entirely (leaving only the copy >to card memory, and only that if that is a considration and the card >doesn't DMA from main memory). This does violence to the buffer return >and the stream head, and generally doubles the buffer memory consumption >(to be safe). The user space copyin is done into the real buffer as >a unit instead of into "real" (seperate) mbufs. This techniques is >not usable simultaneously with the previous one, unless the user space >application has incestuous knowledge of the protocol and can handle >skipping the encapsulation (header) data in dealing with the buffer >contents. This is an ancient BSD mbuf trick. I don't think it does any violence to STREAMS. At most your STREAMS modules have to peak at STREAMS buffer reference counts and know more than they should about the underlying implementation of the buffers (e.g. to do as you say and avoid writing on buffers that are not really simple buffers.) My first commerical STREAMS code in 1986 played such games to make tty's go faster on 68000 some based systems. (That's not intended as a brag, but proof it's not rocket science.) >STREAMS can be high performance, but, as you note, at almost the >penalty of not being STREAMS any more except in the technical sense. I disagree. I don't think you can build what I understand Van Jacobson calls a "squashed stack" without changing the STREAM head code beyond recognition. 
Remember that Jacobson's neat idea (as I understand it) is to cache the
entire pile of headers, from TCP through MAC, and when the user makes a
write(2) call, combine a copy of that cached glob of headers with the
user's data while doing the TCP checksum, make the mindless
modifications to about 10 bytes among those 54 bytes (for Ethernet) or
64 bytes (for FDDI with typical MACs), and stick the result on the MAC
chip's DMA queue. Those mindless modifications consist of adding values
to the previous contents--e.g. TCP seq #, IP ID, and IP cksum. Note that
there is no ARP lookup, no running through TCP state machine switches,
and no IP fiddling. It's just "header prediction" or "header
compression" taken to its obvious conclusion ("obvious" once you're
told about it, that is).

Think about how the STREAMS head would have to be smart enough to do
all of this and bypass all of the put and service functions, except
when something exceptional has happened, in which case it must do the
old fashioned stuff. Note also that the STREAMS head would have to
arrange to keep the user data around on some queue somewhere in case of
retransmissions.

On the other hand, not having seen Van Jacobson's code, but having
thought a little about it, this seems to me like fairly straightforward
violence to the BSD sosend() function--yeah, I understand the protocol
switch is much changed and sosend() may not be called sosend() anymore,
but those are not a big deal.


Vernon Schryver    vjs@rhyolite.com
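
The cached-header fast path described above can be sketched roughly as
follows. This is a user-space illustration only, not Van Jacobson's
code and not sosend(). It assumes the 54-byte Ethernet case with no IP
or TCP options; `struct conn`, `fast_send()` and the helper names are
made up for the example. For simplicity it overwrites the sequence
number and IP ID from saved state and recomputes the 20-byte IP header
checksum, rather than adding deltas to the previous contents. There is
no retransmission queue, no slow path for exceptional cases, and the
"DMA queue" is just the caller's frame buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { IP_LEN = 20, TCP_LEN = 20, HDR_LEN = 54, MAX_DATA = 1460 };
enum { IP_OFF = 14, IP_TOTLEN = 16, IP_ID = 18, IP_CKSUM = 24, IP_SRC = 26,
       TCP_OFF = 34, TCP_SEQ = 38, TCP_CKSUM = 50 };

struct conn {                 /* hypothetical per-connection cache */
    uint8_t  hdr[HDR_LEN];    /* Ethernet + IP + TCP headers as last sent */
    uint32_t snd_nxt;         /* next TCP sequence number */
    uint16_t ip_id;           /* next IP identification value */
};

void put16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v >> 8); p[1] = (uint8_t)v; }
void put32(uint8_t *p, uint32_t v) { put16(p, (uint16_t)(v >> 16)); put16(p + 2, (uint16_t)v); }

/* Add big-endian 16-bit words to a running sum; an odd last byte is zero-padded. */
uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;
    return sum;
}

/* Fold carries back in and complement, giving the Internet checksum. */
uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Copy the user data and sum it in the same pass over the bytes. */
uint32_t copy_and_sum(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        dst[i] = src[i];
        dst[i + 1] = src[i + 1];
        sum += ((uint32_t)src[i] << 8) | src[i + 1];
    }
    if (i < len) {
        dst[i] = src[i];
        sum += (uint32_t)src[i] << 8;
    }
    return sum;
}

/* Build one frame from the cached headers plus data; returns the frame length. */
size_t fast_send(struct conn *c, const uint8_t *data, size_t len, uint8_t *frame)
{
    uint32_t sum;

    assert(len <= MAX_DATA);
    memcpy(frame, c->hdr, HDR_LEN);                  /* the cached glob of headers */
    sum = copy_and_sum(frame + HDR_LEN, data, len);  /* data copy + checksum */

    /* Patch the few header bytes that change from packet to packet. */
    put16(frame + IP_TOTLEN, (uint16_t)(IP_LEN + TCP_LEN + len));
    put16(frame + IP_ID, c->ip_id);
    put16(frame + IP_CKSUM, 0);
    put16(frame + IP_CKSUM, fold(sum_words(frame + IP_OFF, IP_LEN, 0)));

    put32(frame + TCP_SEQ, c->snd_nxt);
    put16(frame + TCP_CKSUM, 0);
    sum = sum_words(frame + TCP_OFF, TCP_LEN, sum);  /* TCP header */
    sum = sum_words(frame + IP_SRC, 8, sum);         /* pseudo-header addresses */
    sum += (uint32_t)(6 + TCP_LEN + len);            /* protocol 6 and TCP length */
    put16(frame + TCP_CKSUM, fold(sum));

    c->snd_nxt += (uint32_t)len;
    c->ip_id++;
    return HDR_LEN + len;
}
```

Note what is absent from fast_send(): no ARP lookup, no walk through
the TCP state machine, no routing decision. Everything except the
patched fields comes straight from the cached template, which is why
it maps more naturally onto a rewritten sosend() than onto a generic
stream head.
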