Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!solace!nntp.se.dataphone.net!nntp.uio.no!newsfeed.nacamar.de!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsxfer3.itd.umich.edu!news1.best.com!nntp1.ba.best.com!not-for-mail
From: dillon@flea.best.net (Matt Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc,comp.sys.sgi.misc
Subject: Re: no such thing as a "general user community"
Date: 29 Mar 1997 01:58:46 -0800
Organization: BEST Internet Communications, Inc.
Lines: 98
Message-ID: <5hip4m$ss7@flea.best.net>
References: <331BB7DD.28EC@net5.net> <5hfl3n$a3t@fido.asd.sgi.com> <5hh5n2$9q8@flea.best.net> <5hhi67$1gl@fido.asd.sgi.com>
NNTP-Posting-Host: flea.best.net
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:37980 comp.unix.bsd.bsdi.misc:6493 comp.sys.sgi.misc:29496

:In article <5hhi67$1gl@fido.asd.sgi.com>,
:Ray Chen <rcc@tilt.engr.sgi.com> wrote:
:>Ok, I've stayed out of this so far because I've been busy
:>fixing bugs and making some of those performance improvements
:>people have been grumping about us not doing :-) but I've got
:>to jump in now.
:>
:>In article <5hh5n2$9q8@flea.best.net>,
:>Matt Dillon <dillon@flea.best.net> wrote:
:>>    I don't think a log based filesystem will be much of a win over FFS
:...
:>
:>Matt, if you can give me more data to make it easier for us to
:>reproduce the scenario's you've seen, I'll try and see that the
:>problems are fixed.  Just because most of our customers don't do
:>sustained paging for example, that doesn't mean IRIX shouldn't do
:>it well.  I'll email you on this.
:>But I'm sorry.  The filesystem comments are just flat-out wrong.
:>
:>FFS will *always* be slower doing file creates than XFS or for that
:>matter, any good journalling filesystem.
:>
:>The fundamental problem with FFS is that to guarantee safety, when
:>you do file deletion/creation, the writes have to be ordered.
:>The first set of updates have to hit the disk regardless of how you order
:>the directory update and inode deallocation before the second set hits.

    Actually, this isn't entirely true.  To guarantee safety on file
    create, the only thing you need to do is guarantee that the inode
    is pre-cleared (before the create).  You can then update the inode
    and directory entry asynchronously.  If a crash occurs, fsck will
    find either a directory entry pointing to a clear inode, or an
    inode without an associated directory entry.

    To guarantee safety on file delete, the inode must be cleared
    synchronously, but the directory entry (if it does not split itself
    across a sector boundary) can be updated asynchronously.  If a crash
    occurs, the worst fsck will find is a directory entry pointing to a
    cleared inode.

:>Otherwise, you can get nice anomalies like a file changing to a named
:>pipe because the inode happened to be a named pipe before it was deleted
:>and reallocated as a file.

    ... which is easy to fix: since you have to update the inode on
    delete anyway, you might as well clear it (or mark it as
    unallocated).

:>Enough about journalling vs. FFS.  People have talked about XFS's
:>main claims to fame.  I'd like to set the record straight.
:>
:>XFS's main claim to fame are the S-words: speed, scalability, safety.
:>
:>Speed: we're fast.  We hit >300 MB/sec the first day we shipped and
:>that number's been going up ever since.  As the I/O hardware gets
:>faster, so do we.  We've done >500 MB/sec for something like over
:...
:>Scalability: we can work on big files and filesystems.  80 GB
:>filesystems are routine.  So are 12 GB files.  We work on large
:...
:>directories.  Put a million files into a directory.  The filesystem
:>still runs fast.
:>
:>Safety: if your computer crashes for some reason, the 80 GB filesystem
:>recovers in < 15 seconds and it's just fine.

    These are all good points.  I agree completely.
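    The create/delete ordering argument earlier in this reply can be
    checked with a toy model.  This is only a sketch, not FFS code: the
    two-field "disk" (one inode, one directory entry), the state names,
    and the fsck() classifier are all hypothetical simplifications made
    up for illustration.  It enumerates every state a crash could leave
    behind when asynchronous writes land in any order, and compares a
    pre-cleared inode against a stale one.

```python
from itertools import permutations

CLEAR = "clear"   # inode zeroed / marked unallocated on disk
FILE = "file"     # inode describes a regular file
FIFO = "fifo"     # stale contents left over from a deleted named pipe


def freeze(disk):
    """Hashable snapshot of the toy disk."""
    return tuple(sorted(disk.items()))


def reachable(initial, sync_writes, async_writes):
    """All on-disk states a crash can expose.

    sync_writes land in the given order; async_writes may land in any
    order afterwards.  A crash may happen after any single write.
    """
    disk = dict(initial)
    seen = {freeze(disk)}
    for key, value in sync_writes:
        disk[key] = value
        seen.add(freeze(disk))
    for order in permutations(async_writes):
        d = dict(disk)
        for key, value in order:
            d[key] = value
            seen.add(freeze(d))
    return seen


def fsck(state):
    """Classify what a checker could conclude from one state."""
    inode, dirent = state["inode"], state["dirent"]
    if dirent and inode == CLEAR:
        return "dangling-dirent"      # detectable: drop the entry
    if not dirent and inode != CLEAR:
        return "orphan-inode"         # detectable: reclaim the inode
    if not dirent:
        return "no-file"
    return "looks-valid:" + inode     # checker has nothing to complain about


def outcomes(initial, sync_writes, async_writes):
    states = reachable(initial, sync_writes, async_writes)
    return {fsck(dict(s)) for s in states}


create_writes = [("inode", FILE), ("dirent", True)]

# Create on a pre-cleared inode: both updates asynchronous.
create_precleared = outcomes({"inode": CLEAR, "dirent": False}, [], create_writes)

# Create on a stale inode (delete never cleared it): the anomaly case.
create_stale = outcomes({"inode": FIFO, "dirent": False}, [], create_writes)

# Delete: clear the inode synchronously, update the directory later.
delete = outcomes({"inode": FILE, "dirent": True},
                  [("inode", CLEAR)], [("dirent", False)])

if __name__ == "__main__":
    print("create, pre-cleared:", sorted(create_precleared))
    print("create, stale inode:", sorted(create_stale))
    print("delete, sync clear: ", sorted(delete))
```

    In this model the pre-cleared create and the synchronous-clear
    delete only ever produce states the checker can recognise and
    repair, while the stale-inode create can leave a directory entry
    pointing at what looks like a perfectly valid named pipe.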
:>> crash very often), especially if FFS is further adjusted to set the
:>> clean bit on mounted filesystems that have been synced up and are idle.
:>>
:>>                                        -Matt
:>
:>We have 24x7 customers running high-availability configurations who
:>would disagree with you about fsck.  They don't *ever* want to run
:>fsck on a 40 GB filesystem.  If they crash, they want to be back up
:>fast.  fsck is just too slow.

    This is a valid point too, though I would never personally design
    such a system myself... too much can go wrong with the complex
    hardware AND software that makes up such a configuration.  I might
    use the configuration, but it would be in a dual-redundant machine
    setup rather than a quick-reboot setup.  Either that or I would use
    a dedicated NFS box.  I just don't trust complex operating systems
    (UNIX is a relatively complex operating system) enough... the worst
    thing that can happen is that a kernel bug in some unrelated
    subsystem will corrupt filesystem data.

                                        -Matt

:>--
:>Raymond C. Chen, PhD            rcc@sgi.com
:>Member of Technical Staff       Silicon Graphics, Inc. (SGI)
:>High-End Operating Systems      Generic Disclaimer: I speak only for me.