Return to BSD News archive
Xref: sserve comp.os.386bsd.development:1281 comp.os.386bsd.misc:1141 Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!elroy.jpl.nasa.gov!usc!howland.reston.ans.net!spool.mu.edu!uwm.edu!ogicse!psgrain!agora!davidg From: davidg@agora.rain.com (David Greenman) Newsgroups: comp.os.386bsd.development,comp.os.386bsd.misc Subject: Revision 2.6 of 'ed' device driver (part 3 of 3) Message-ID: <CECD5J.HoL@agora.rain.com> Date: 3 Oct 93 22:00:53 GMT Article-I.D.: agora.CECD5J.HoL Organization: Open Communications Forum Lines: 178 Reminder: these release notes are out of date! I'm going to move to a manual page rather than this, so I haven't been too interested in keeping this current. -DG ed.relnotes: Release Notes for 'ed' Device Driver David Greenman, 24-May-1993 ------------------------------------ Last updated: 8-September-1993 INTRODUCTION ------------ The 'ed' device driver is a new, high performance device driver supporting Western Digital/SMC 80x3 series (including 'EtherCard PLUS' 'Elite16') and the 3Com 3c503. All of the ethernet controllers use the DS8390 or 83C690 Network Interface Controller (NIC). The differences between the boards are in their memory capacity, bus width (8/16 bits), and special logic (asic) used to configure the shared memory and other things. Every effort has been made to conform to the manufactures' specifications for the NIC and asic. This includes both normal operation and error recovery. PERFORMANCE ----------- transmit -------- The 8390 doesn't provide a mechanism for chained write buffers, so it is very important for maximum performance to queue the next packet for transmission as soon as the current one has completed. On boards with 16k or more of memory, the shared memory is divided in a way that allows enough space for two full size packets to be buffered for transmission. When sufficient data is available for transmission, a packet is copied into the shared memory, the transmission is started, and then an additional packet is copied into the shared memory (to a different memory area). As soon as the first packet has completed, transmission of a second packet can then be started immediately - in less time than the 9.6us interframe gap. This results in the highest performance possible from ethernet. Packets go out on the 'wire' with the following format: preamble dest-addr src-addr type data FCS intr-frame 64bits 48bits 48bits 16bits 1500bytes 32bits 96bits With 10Mbits/sec, each bit is 100ns in duration. All of the above fields, except for data are of fixed length. With full sized packets (1500 bytes), the maximum unidirectional data rate can be calculated as: 6.4us + 4.8us + 4.8us + 1.6us + 1200us + 3.2us + 9.6us = 1230.4us/packet = 812.74382 packets/second = 1219115.7 (1190k) bytes/second. With TCP, there is a 40 byte overhead for the IP and TCP headers, so there is only 1460 bytes of data per packet. This reduces the maximum data rate to 1186606 bytes/second. With TCP, there will also be periodic acknowledgments which will reduce this figure somewhat both because of the additional traffic in the reverse direction and because of the occasional collisions that this will cause. Despite this, the data rate has still been consistantly measured at 1125000 (~1100k) bytes/second through a TCP socket. In these tests, the TCP window was set to 16384 bytes. With UDP, there is less overhead for the headers, and with 1472 bytes of data per packet, a data rate of 1196358.9 (1168k) bytes/sec is possible. UDP performance hasn't been precisely measured with this device driver, but less precise tests show this to be true (measured at around 1135k/second). receive ------- The 8390 implements a shared memory ring-buffer to store incoming packets. The 8bit boards (3c503, and 8003) usually have only 8k bytes of shared memory. This is only enough room for about 4 full size (1500 byte) packets. This can sometimes be a problem, especially on the original WD8003E and 3c503. This is because these boards' shared memory access speed is also quite slow compared to newer boards - typically only about 1MB/second. The additional overhead of this slow memory access, and the fact that there is only room for 4 full-sized packets means that the ring-buffer will occassionally overflow. When this happens, the board must be reset to avoid a lockup problem in early revision 8390's. Resetting the board will cause all of the data in the ring-buffer to be lost - requiring it to be re-transmitted/received...slowing things even further. Because of these problems, maximum throughput on boards of this type is only about 400-600k per second. The 16bit boards (8013 series), however, have 16k of memory as well as much faster memory access speed. Typical memory access speed on these boards is about 4MB/second. These boards generally have no problems keeping up with full ethernet speed. The only problem I've seen with these boards is related to the (slow) performance of 386BSD's malloc code when additional mbufs must be added to the pool. This can sometimes increase the total time to remove a packet enough for a ring-buffer overflow to occur. This tends to be highly transient, and quite rare on fast machines. I've only seen this problem when doing tests with large amounts of UDP traffic without any acknowledgments (uni-directional). Again, this has been very rare. All of the above tests were done using a 486DX2/66, 486DX/33, 386DX/40, 8-9Mhz ISA bus, with Bruce Evans' high speed spl()/interrupt modifications, a high performance version of in_cksum.c from Bakul Shah, with tcp_sendspace/ tcp_recvspace set to 16k, and MCLBYTES set to 2048 (to allow full MTU sized packets). TCP tests were done with the 'ttcp' performance test utility, and also with FTP client/server. UDP tests were done with a modified version of ttcp (to work around a bug in 386BSD's UDP code related to queue depth), and also with NFS. KERNEL INSTALLATION ------------------- To 'install' this driver, the files if_ed.c and if_edreg.h must be copied into the i386/isa kernel source directory and the following line must be added into the file i386/conf/files.i386: i386/isa/if_ed.c optional ed device-driver To build a kernel that includes this driver, first comment out any 'we' or 'ec' devices in your kernel config file. Then, add a line similar to the following (modify to match your cards configuration): device ed0 at isa? port 0x300 net irq 10 iomem 0xcc000 vector edintr Note that the 'iosiz' option is not needed because the driver automatically figures this out. However, if you have problems with this, it can be specified to force the use of the size you specify. On 3Com boards, the tranceiver must be enabled in software (there is no hardware default). In this driver, this is controlled using the "LLC0" link- level control flag. The default for this flag can be set in your kernel config file by specifying 'flags 0x01' in the 'ed' device specification line (this is necessary for diskless support). Otherwise, the tranceiver is easily enabled and disabled with a command like "ifconfig ed0 -llc0" to enable the tranceiver or "ifconfig ed0 llc0" to disable it; this assumes that you have the modified ifconfig(8) that originally appeared in the 386BSD patchkit. To specify the 'flags' option, use a line similar to: device ed0 at isa? port 0x300 net irq 10 flags 0x01 iomem 0xcc000 vector edintr Flags can be similarly specified to force 8 or 16bit mode, and disabling the use of transmitter double buffering. The various supported flag values are: FORCE_8BIT_MODE 0x02 FORCE_16BIT_MODE 0x04 DISABLE_DOUBLE_BUFFERING 0x08 To use multiple flags, simply add the values together. Note that these numbers are in hexadecimal. If the FORCE_8BIT and FORCE_16BIT flags are both specified, the 8BIT flag has precedence. The use of the above flags should only be necessary under exceptional conditions where the probe routine incorrectly determines the board type. Or where the high performance of the transmitter causes problems with other vendors hardware. KNOWN PROBLEMS -------------- 1) Early revision DS8390B chips have problems. They lock-up whenever the receive ring-buffer overflows. They occassionally switch the byte order of the length field in the packet ring header (several different causes of this related to an off-by-one byte alignment) - resulting in "shared memory corrupt - invalid length NNNNN" messages. The board is reset whenever these problems occur, but otherwise there is no problem with recovering from these conditions. 2) 16bit boards can conflict with 8bit BIOSs, BIOS extensions (like the VGA), and 8bit devices with shared memory (again like the VGA, or perhaps a second ethernet board). There is a work-around for this in the driver, however. The problem is that the ethernet board stays in 16bit mode, asserting its '16bit' signal on the ISA bus. This signal is shared by other devices/ROMs in the same 128K memory segment as the ethernet card - causing the CPU to read the 8bit ROMs as if they were 16bit wide. The work-around involves setting the host access to the shared memory to 16bits only when the memory is actually accessed, and setting it back to 8bit mode all other times. Without this work-around, the machine will hang whenever a reboot is attempted. This work-around also allows the board to co-exist with other 8bit devices that have shared memory (like 8bit VGA boards or perhaps a second ethernet board). This has only been implemented for SMC/WD boards, but I haven't seen this problem with 3Com boards (i.e. if you have a 3Com board, you might experiance the above problem - I haven't specifically tested for it). 3) The shared memory on 3Com boards is much slower than it is on SMC boards; it's less than 1MB on 8bit boards and less than 2MB/second on the 16bit boards. This can lead to ring-buffer overruns resulting in additional lost data in heavy network traffic. 4) Because the 16bit 3Com boards bank switch their memory in 8k banks, the double buffering has not been implemented in those boards. Doing so would significantly complicate the code. Not really a bug, but rather a limitation that might be removed in the future.