Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!newshost.telstra.net!act.news.telstra.net!psgrain!iafrica.com!pipex-sa.net!plug.news.pipex.net!pipex!weld.news.pipex.net!pipex!tank.news.pipex.net!pipex!news.mathworks.com!newsfeed.internetmci.com!usenet.eel.ufl.edu!nntp.neu.edu!camelot.ccs.neu.edu!nntp.ccs.neu.edu!albert
From: albert@krakatoa.ccs.neu.edu (Albert Cahalan)
Newsgroups: comp.unix.bsd.freebsd.misc,comp.os.linux.development.system
Subject: Re: The better (more suitable)Unix?? FreeBSD or Linux
Date: 27 Feb 1996 20:59:49 GMT
Organization: Northeastern University, College of Computer Science
Lines: 40
Message-ID: <ALBERT.96Feb27155949@krakatoa.ccs.neu.edu>
References: <4er9hp$5ng@orb.direct.ca> <311C5EB4.2F1CF0FB@freebsd.org> <CBITMEAD.96Feb26173656@versant.versant.com.au> <4gt6mb$pv@park.uvsc.edu>
NNTP-Posting-Host: krakatoa.ccs.neu.edu
In-reply-to: Terry Lambert's message of 26 Feb 1996 20:54:35 GMT
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:14632 comp.os.linux.development.system:18381

>>>>> "T" == Terry Lambert <terry@lambert.org> writes:

T> cbitmead@versant.versant.com.au wrote:
T> ] >Sync metadata is an implementation of ordered writes. It's
T> ] >about as trivial an implementation as you can possibly devise,
T> ] >but it *is* one.
T> ]
T> ] Except that it is the wrong order. The correct way is to write the
T> ] data first and then the meta-data. This ensures consistent data.

T> How, in your proposed implementation, would you distinguish allocated
T> blocks that have been written from allocated blocks that have not been
T> written in a two user delete/create case?

T> Which is to say, Bob deletes file "foo", Jim copies secure file "fum",
T> writes some sensitive data to "fum" in a block that belonged to "foo", and
T> the system crashes before "foo" is really deleted.

Nope, blocks are not deallocated until the metadata is written.
The filesystem always has a consistent state: all fsck needs to do is
check a few CRC-protected, timestamped blocks to make sure they all agree.
This is so trivial that the kernel can do it at mount time. This ideal
filesystem wastes space, though.

How it works:

User changes a file, and the new data is put into free space on disk.
This changes the directory/inode/whatever, so these data structures
are also copied into free space. Note that nothing points to this
information yet, so the changes would not exist if the system crashed.

At some point, a block is written that points to the root directory,
inode table, and free block list. Since this is the one point of
failure (the block could be half written at a crash), the block is
written to several places with timestamps and CRC codes. When this
block is successfully written, the filesystem has advanced to a new
state and all changes on disk are committed. Only at this time can
"deallocated" blocks get put back into the free pool.

--
Albert Cahalan
albert@ccs.neu.edu
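
A minimal sketch of the mount-time check on the commit block described above, not taken from any real filesystem. The record layout, the field names, and the use of a monotonically increasing counter in place of a wall-clock timestamp are all assumptions made for illustration. Each copy of the commit block carries its own CRC, and the newest copy whose CRC verifies is taken as the current state.

```python
import struct
import zlib

# Hypothetical on-disk layout (an assumption, not from the post):
# stamp, root directory block, inode table block, free list block,
# followed by a CRC32 of those four fields.
_BODY = struct.Struct("<QQQQ")
_CRC = struct.Struct("<I")
RECORD_SIZE = _BODY.size + _CRC.size


def pack_commit(stamp, root_dir, inode_table, free_list):
    """Build one copy of the commit block, CRC appended."""
    body = _BODY.pack(stamp, root_dir, inode_table, free_list)
    return body + _CRC.pack(zlib.crc32(body))


def unpack_commit(raw):
    """Return (stamp, root_dir, inode_table, free_list), or None if damaged."""
    if len(raw) != RECORD_SIZE:
        return None
    body = raw[:_BODY.size]
    (stored_crc,) = _CRC.unpack(raw[_BODY.size:])
    if zlib.crc32(body) != stored_crc:
        return None  # half-written or corrupted copy
    return _BODY.unpack(body)


def newest_valid_commit(copies):
    """Pick the valid copy with the highest stamp; None if no copy is valid."""
    valid = [c for c in map(unpack_commit, copies) if c is not None]
    # Tuples compare element by element, so the stamp decides first.
    return max(valid, default=None)


# Usage: the newer commit was torn by a crash, so the older state wins.
old = pack_commit(1, 10, 20, 30)
new = pack_commit(2, 11, 21, 31)
torn = new[:-1] + bytes([new[-1] ^ 0xFF])  # simulate a damaged CRC field
state = newest_valid_commit([old, torn])
```

Blocks freed by the changes in a newer commit would be returned to the free pool only after that commit's copies verify, which is the rule stated at the end of the post.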