Xref: sserve comp.arch:27981 comp.unix.bsd:7602 comp.os.linux:15102
Path: sserve!manuel.anu.edu.au!munnari.oz.au!spool.mu.edu!agate!dog.ee.lbl.gov!horse.ee.lbl.gov!torek
From: torek@horse.ee.lbl.gov (Chris Torek)
Newsgroups: comp.arch,comp.unix.bsd,comp.os.linux
Subject: Re: IDE faster than Mips SCSI disk?????
Date: 8 Nov 1992 12:00:26 GMT
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 123
Message-ID: <27298@dog.ee.lbl.gov>
References: <1992Nov6.033942.21194@ntuix.ntu.ac.sg> <1992Nov6.142946.17430@walter.bellcore.com>
Reply-To: torek@horse.ee.lbl.gov (Chris Torek)
NNTP-Posting-Host: 128.3.112.15

In article <1992Nov6.142946.17430@walter.bellcore.com>
mo@bellcore.com writes:
>one must be very careful that one isn't comparing the
>performance of two disk controllers and not the attached drives.
>... the problem, however, is really *knowing* what one is measuring.

Indeed.  Mike O'Dell knows whereof he speaks.  In this case, of
course, the comparison is between the overall performance of two
complete systems.

Whenever you run any benchmark, all you are *definitely* measuring is
the speed of the system at that particular benchmark.  Running bonnie
or iozone on a MIPS and a 386 box will tell you how fast those
systems are at running bonnie or iozone.

Now, the authors of benchmarks try hard to write them such that they
correlate well with performance at other things.  With any luck, the
performance you get running bonnie is fairly indicative of the
performance you will get running other I/O-bound tasks.  But all you
have really measured is overall system performance, NOT individual
disk, controller, protocol, bus, CPU, memory, or kernel code
performance.  If you can hold most of these variables constant, you
may be able to get meaningful comparisons between IDE and SCSI
drives, but in practice you have to change at least two variables
simultaneously.

As far as `knowing your measurements' goes: in I/O there are at least
two interesting numbers, `latency' and `throughput'.  Latency is
essentially turnaround time---the lag from when you ask for something
until it actually happens.  Throughput is `total work done' per unit
time.  Usually you want small latencies and large throughput: things
get done immediately, and lots of things get done.  But getting both
is expensive.  As a general rule, things you do to reduce latency
worsen throughput, and things you do to increase throughput increase
latency.  (There are exceptions to this rule.)

Large latencies do not necessarily reduce throughput.  At a busy
supermarket, for instance, one generally has to wait in line quite a
while (high latency), yet the volume of people moving through all the
checkouts tends to be high (good throughput).  Supermarkets use
parallelism: lots of lines.  This is one of the exceptions:
parallelism generally improves throughput without having much effect
on latency---but not always; it only helps if you do not run into
another, separate bottleneck.

With disk drives and controllers, fancy protocols can improve
throughput (by allowing more stuff to run in parallel), but they
often have a high cost in latency.  As others have noted,
manufacturers are notoriously cheap, and tend to use 8085s and Z80s
and such in their I/O devices.  These CPUs are horrendously slow, and
take a long time to interpret the fancy protocols.

Raw disk throughput depends entirely on bit density.  If the platters
spin at 3600 rpm, and there are 60 512-data-byte sectors on each
track, exactly 3600 sectors pass under the read/write head each
second (assuming only one head is active at a time).  This is 1.8 MB
of data, and therefore the fastest the disk could possibly read or
write is 1.8 MB/s.  Seek times and head-switch or settle delays will
only reduce this.
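
That arithmetic is easy to mechanize.  Here is a minimal sketch in C;
the spindle speed and geometry are just the hypothetical numbers from
the example above, not any particular drive's:

    #include <stdio.h>

    int
    main(void)
    {
        double rpm = 3600.0;    /* spindle speed (hypothetical) */
        double spt = 60.0;      /* sectors per track */
        double bps = 512.0;     /* data bytes per sector */
        double sectors_per_sec = (rpm / 60.0) * spt;
        double bytes_per_sec = sectors_per_sec * bps;

        /* prints: 3600 sectors/s = 1.84 MB/s peak media rate */
        printf("%.0f sectors/s = %.2f MB/s peak media rate\n",
            sectors_per_sec, bytes_per_sec / 1e6);
        return 0;
    }
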
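
And to tie this back to the earlier point about benchmarks: any
number you get from a program like the one below includes the drive,
the controller, the bus, the kernel, the buffer cache, and the timer,
all at once.  This is only a sketch (the name `probe' and its
constants are invented for illustration), assuming nothing beyond
POSIX read() and gettimeofday():

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define BLKSIZE 8192            /* bytes per read: arbitrary */
    #define NREADS  1000            /* reads to time */

    int
    main(int argc, char **argv)
    {
        static char buf[BLKSIZE];
        struct timeval t0, t1;
        double secs;
        int fd, i;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0) {
            fprintf(stderr, "usage: probe file-or-raw-device\n");
            exit(1);
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < NREADS; i++)
            if (read(fd, buf, BLKSIZE) != BLKSIZE)
                break;              /* short read or EOF: stop */
        gettimeofday(&t1, NULL);
        secs = (t1.tv_sec - t0.tv_sec) +
            (t1.tv_usec - t0.tv_usec) / 1e6;
        if (i == 0 || secs <= 0.0) {
            fprintf(stderr, "probe: nothing measured\n");
            exit(1);
        }
        /* Two views of one run: average turnaround per request
         * (latency) and total work done over time (throughput).
         * Neither isolates any single component of the system. */
        printf("%d reads, %.2f ms/read, %.2f MB/s\n",
            i, secs * 1000.0 / i, (double)i * BLKSIZE / secs / 1e6);
        return 0;
    }

Run it twice in a row on the same file and the second set of numbers
will mostly be measuring your buffer cache; that is not a bug in the
program, it is the point.
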
Disk latency is influenced by a number of factors.  Seek time is
relevant, but the so-called `average seek time' is computed from a
largely baseless assumption, namely that sectors are requested in a
uniform random fashion.  In all real systems, sector requests follow
patterns; the exact patterns depend greatly on systemic details that
are not well suited to analysis.  So manufacturers fall back on the
randomness assumption and quote the seek time for 1/3 the maximum
stroke distance.  (Actuators tend to be nonlinear, so time(1/3 max
distance) != 1/3 time(max distance).  The latter number is typically
smaller; manufacturers may thus use it instead, whether out of
`specsmanship' or simple ignorance....)

Rotational rate also affects latency: the faster the disk spins, the
sooner any particular sector comes around under the heads.  On the
other hand, if the disk spins quickly and the overall system runs
slowly, the next sector may already have `flown by' by the time it is
requested.  This is called `blowing a rev', and it is the basis of
disk interleaving.  One can also compensate with track buffers, but
those require hardware, which is considered expensive.  Interleaving
can be done in software: in the drive, in the controller, in the
driver, in the file system, or in any combination thereof.  Figuring
out where and when to interleave, and how much, is horrendously
confusing, because all the factors affect each other and they are
usually not plainly labelled (they are all just called `the
interleave factor', as if there were only one).
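
To make the interleave arithmetic concrete, here is one common way of
laying out a single track in software; the 17 sectors per track and
the 3:1 factor are made-up numbers, and the drive, controller, and
driver each do their own variant of this:

    #include <stdio.h>

    #define SPT 17              /* sectors per track (hypothetical) */

    int
    main(void)
    {
        int track[SPT];         /* physical slot -> logical sector */
        int i, pos = 0, k = 3;  /* k = interleave factor (3:1) */

        for (i = 0; i < SPT; i++)
            track[i] = -1;      /* mark every slot empty */
        for (i = 0; i < SPT; i++) {
            while (track[pos] != -1)    /* slot taken: slip forward */
                pos = (pos + 1) % SPT;
            track[pos] = i;             /* place logical sector i */
            pos = (pos + k) % SPT;      /* skip ahead k slots */
        }
        printf("physical order at %d:1 interleave:", k);
        for (i = 0; i < SPT; i++)
            printf(" %d", track[i]);
        printf("\n");
        return 0;
    }

At 3:1, consecutive logical sectors sit three slots apart, so a slow
host gets two full sector times to turn the next request around
without blowing a rev; the price is that a host fast enough to keep
up now needs three revolutions to read the track instead of one.  Now
imagine that factor applied in the drive, the controller, and the
driver at once, each unaware of the others.
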
When you start building complete systems, other effects creep in.
For instance, separate disk drives can seek separately, in parallel.
Some busses permit this, some do not.  The SCSI protocol includes a
SEEK command, which may or may not be implemented in any particular
SCSI unit.  SCSI also allows `disconnecting', so that a target can
release the bus while a disk seeks, but this requires cooperation
from the host adapter.  Early versions of SCSI, and the original SASI
from which SCSI evolved, did not permit disconnecting, so it was
added as an option.

Disk controllers often have caches; these may or may not be separate
from track buffers.  The algorithms used to fill and flush these
caches affect both latency and throughput.  The busses involved put
their own constraints on timing, again affecting both, and the CPU's
involvement, if any, affects both as well.  The design and coding of
the kernel and the various drivers all have some effect.

Most Unix systems do memory-to-memory copies, whether as DMA from
device memory into a buffer cache followed by buffer-to-user copies,
or DMA directly to user space, or whatever; often this means that the
memory bus speed controls total throughput.  (It used to be that
memory speeds outpaced peripheral speeds comfortably, and hence
kernels did not need to worry about this.  Now the equations have
changed, but many kernels have not.  Incidentally, note that DMA to
user space has its drawbacks: if the resulting page is not
copy-on-write or otherwise marked clean, a later request for the same
sector must return to the disk, rather than reusing the cached copy.)

All in all, system characterisation is extremely difficult.
Single-number measures should be taken with an entire salt lick.
Good benchmarks must include more information---if nothing else, at
least a complete system description.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)
Berkeley, CA                    Domain: torek@ee.lbl.gov