Newsgroups: comp.unix.bsd
Path: sserve!manuel.anu.edu.au!munnari.oz.au!uunet!zaphod.mps.ohio-state.edu!cs.utexas.edu!sun-barr!ames!agate!dog.ee.lbl.gov!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: cache terms (was Adding Swapspace ??)
Message-ID: <1992Oct25.224950.3098@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: University of Utah Computer Center
References: <Bw7H4L.LLB@cosy.sbg.ac.at> <1992Oct16.162729.3701@ninja.zso.dec.com> <1992Oct16.201806.21519@fcom.cc.utah.edu> <Bw8Mw5.IFC@pix.com> <1992Oct18.082017.22382@fcom.cc.utah.edu> <BwLLxp.7Bt@flatlin.ka.sub.org> <1992Oct25.111525.25782@fcom.cc.utah.edu> <26965@dog.ee.lbl.gov>
Date: Sun, 25 Oct 92 22:49:50 GMT
Lines: 152

In article <26965@dog.ee.lbl.gov>, torek@horse.ee.lbl.gov (Chris Torek) writes:
|> In <1992Oct18.082017.22382@fcom.cc.utah.edu> terry@cs.weber.edu
|> (A Wizard of Earth C) claimed:
|> >>>the write to the disk is done through a write-through cache.
|>
|> In article <BwLLxp.7Bt@flatlin.ka.sub.org> bad@flatlin.ka.sub.org
|> (Christoph Badura) pointed out:
|> >>The UNIX FS buffer cache has since its invention been write-behind
|> >>and not write-through.
|>
|> In article <1992Oct25.111525.25782@fcom.cc.utah.edu> terry@cs.weber.edu
|> (A Wizard of Earth C) writes:
|> >I tend to use these terms synonymously.  When can a cache be write
|> >through but not write behind?
|>
|> The various terms for describing caches are pretty standard.  In
|> hardware, a `write through' cache is one where each write updates both
|> the cache and main memory `simultaneously'.  In contrast, in a `write
|> back' cache, writes update only the cache line; main memory is updated
|> only when the line is kicked out, either by an explicit cache flush or
|> by replacement with new contents.

Of course, this still leaves the issue of `write behind' unresolved; is
it synonymous with `write through' or `write back'?
The distinction I would make, were I to draw one, would be that in a
`write behind' cache there is enforced latency for seek and rotational
delays between the time the data is written to the cache and the time
it gets written to disk.  The difference between this and a `write
through' cache would then be seek policy, since there is always
rotational latency to consider (and possibly queue around) in both
cases.  As someone else pointed out, a faster head positioning
mechanism makes rotational delay relatively more important.  Since I
think we can all agree that a block of data has to go to memory before
that memory is written to disk, the operation is either `write through'
or `write behind' based on your definition of simultaneity (something
can't be going into a bounce buffer and being written at the same time).

|> In the Unix kernel, the buffer cache code simulates a hardware `write
|> back' cache.  All else being equal, write-back caches are usually more
|> efficient than write-through.  (All else is rarely equal.)  In this
|> case, cache `flush' occurs only on sync() or fsync() calls, or in some
|> systems, through timers.  Replacement occurs when a buffer is reused.

So it's neither `write through' nor `write behind'; the issue being
that writing a block of virtual memory through a traditional swap
mechanism and writing a block of virtual memory through the file system
page mechanism differ in (1) a copy taking place, and (2) the fact that
you are trading cache memory for virtual memory.  The overhead of the
copy is obvious, but the overhead of the "cache buffers dedicated to
swap data rather than real file data" is questionable; certainly it
would be faster to "swap in" from a cache buffer than from real disk,
but it would be faster to swap in from real disk directly than into a
cache buffer, followed by a copy.  I think this is probably acceptable
overhead for the benefits derived, and that the cost of the copy in
kernel space is negligible.
|> The BSD kernel does not, however, use a strict write-back policy.
|> Instead, whenever it seems important for consistency (directory
|> operations and indirect blocks), and/or whenever it seems likely that a
|> block will not be rewritten soon, the kernel uses a synchronous
|> bwrite() call or an asynchronous but immediate bawrite() call.  More
|> detail can be found in the Bach and BSD books.

So it can, at times, act as `write through' for critical data.  I had
actually put swap in this category, although in retrospect it matters
little whether swap data is reliably on the disk before a system crash,
since by definition the data is invalid anyway (unless you attempt to
recover the system state at the time of the crash).

|> >Just curious as to why you draw such a sharp distinction, the point
|> >being that there is negligible overhead in cached writes for swap no
|> >matter how you slice the pie.
|>
|> This is not really true, since swapping/paging occurs mainly when the
|> machine is low on memory.  This tends to coincide with the machine
|> being `active', which implies that every bit of overhead counts.  With
|> unified VM/buffer caches, the effect is even worse: `heavy paging' and
|> `overloaded buffers' can become completely synonymous without some sort
|> of policy to prevent the buffer cache from taking over all of physical
|> memory.  (Current BSD systems have an enforced limit on buffer cache
|> size, namely `bufpages' in machdep.c.)

I wasn't really thinking of permitting this; rather, I was thinking of
going the other way (virtual memory steals buffer cache).  I can
definitely see the drawbacks in the former; going this way potentially
blocks processes on a resource other than VM, which is good, and it
also has the effect of limiting the sector-to-cache-buffer mappings
during times of heavy swapping.
Basically, if there are fewer cache buffers for users, there are fewer
seek offsets represented by user cache buffers, and thus disk
throughput (assuming good placement of the swap file) should increase
under heavy load.  Of course, implementing this doesn't require that
the file system be used for swapping.  If 10% of the cache buffers were
reserved for swapping as a low watermark, with some higher value (30%?)
as the high watermark on cache buffers being used for swapping, this
would increase system availability in a "memory bound" kernel (one in
which swapping is required to occur).

|> This is the core of the idea behind `dribble' buffer write policies
|> (the timers mentioned above): the machine can best afford the writes
|> when it is not busy doing other stuff.  At the time the write occurs,
|> it is busy (obviously so: someone is busy writing).  If the write is
|> merely cached, a huge queue can build up, and then when demand
|> increases *everyone* will have to wait.  A `dribble-back' cache avoids
|> all of this, but requires extra mechanism and trades off total
|> throughput for decreased latency.  Systems with big queues tend to
|> have greater overall throughput.

Right; more memory == better performance.  A `dribble-back' cache is
certainly a potential loss of granularity for swapping implemented on
top of it (in the sense of some form of unified but managed
implementation of VM/buffer cache).  After all, one doesn't want to
delay swapping until the machine is less busy ....8-).  This assumes
the buffer caching is `dribble-back' and the swap-to-disk mechanism
isn't.

I still think it's desirable to swap to a file.  The best arguments
against this are still Christoph's, which basically boil down to
penalties when trading cache memory for buffer memory, the cost of the
additional copies coming and going, and the allocation policy for files
being [potentially] bad for swap space.
Arguments based on VM/buffer-cache unification, and on the actual I/O
(which has to be done anyway) going through the file system rather than
the current swapping mechanisms, are much less important to my mind, as
they represent negligible overhead compared to the work that has to be
done anyway (given promiscuous preallocation of the swap file to get a
"good" geometry on disk).

None of the arguments so far have convinced me that I shouldn't swap to
a file, and in particular to one on an NFS mounted partition, since not
doing so means there are 36 machines that are doomed to sit here and
run DOS that could be providing CPU cycles for 386BSD.  I think it
would be nice if a student could put in a boot disk, get a login prompt
first thing, run X on 386BSD, and have the machine reboot on logout.
It would be even better if they could run a "DOS program" off the
Novell server to boot 386BSD without having to have a local disk.
Swapping over the net is the only way to achieve this (short of "please
insert swap floppy in A: and press any key to continue" 8-)), and
swapping to a file is the easiest way to swap over the net.  Regardless
of how much overhead (I see it at about 6-8%) going through the file
system causes, anything is better than DOS.

					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
--
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------