Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!newshost.telstra.net!act.news.telstra.net!psgrain!usenet.eel.ufl.edu!gatech!newsfeed.internetmci.com!inet-nntp-gw-1.us.oracle.com!news.caldera.com!news.cc.utah.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.os.linux.development.system,comp.unix.bsd.freebsd.misc
Subject: Re: Ideal filesystem
Date: 4 Apr 1996 04:10:58 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 126
Message-ID: <4jvi4i$oim@park.uvsc.edu>
References: <4hptj4$cf4@cville-srv.wam.umd.edu> <3140C968.20699696@netcom.com> <4istou$ri9@floyd.sw.oz.au> <4j0bmo$ftv@park.uvsc.edu> <jlemonDoqBq5.1Bx@netcom.com> <4jerrj$f12@park.uvsc.edu> <4joiil$r75@narses.hrz.tu-chemnitz.de>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.os.linux.development.system:20645 comp.unix.bsd.freebsd.misc:16614

fachat@physik.tu-chemnitz.de (Andre Fachat) wrote:
]
] Terry Lambert (terry@lambert.org) wrote:
] : jlemon@netcom.com (Jonathan Lemon) wrote:
] : ] Hm. Shells just look at the inode to see if the file is executable,
] : ] in order to add it to the hash list. So what if there's some (unspecified)
] :
] : For a "binary" that was a directory, there is a need to search
] : every directory in every directory in the path for "a.out" or
] : whatever you call your actual non-fork binary.
]
] You need to search only _one_ directory for a.out - the one directory
] with the name of the program you want to start.
] Only for each program execution, there is One more directory search,
] in a (assumption) small bundle directory.

A modern shell will search its path as a result of an event (startup,
user command, path change, etc.) and produce a name-to-full-path hash
list for executables in the path.

So when I type "ls", instead of statting or trying to run an "ls"
command in every possible path in path search order, it knows that
"ls" means "/bin/ls", even if "/bin" is at the end of my path.
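
As a rough illustration of the name-to-full-path hash described above, here is a minimal Python sketch. It is not how any particular shell implements its hash; the function name `build_command_hash` is hypothetical, and the test for "executable" (a regular file with any execute bit set) is a simplifying assumption. It scans each path directory once, in path order, and records the first match for each command name.

```python
import os
import stat
import tempfile


def build_command_hash(path_dirs):
    """Scan each directory once, in path order; map name -> full path."""
    table = {}
    for directory in path_dirs:
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # missing or unreadable path entries are skipped
        for name in names:
            full = os.path.join(directory, name)
            try:
                st = os.stat(full)
            except OSError:
                continue
            # Assumption: "executable" means a regular file with an x bit.
            is_exec = stat.S_ISREG(st.st_mode) and (st.st_mode & 0o111) != 0
            # The first hit in path order wins, as in a normal path search.
            if is_exec and name not in table:
                table[name] = full
    return table


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "first")
        last = os.path.join(root, "last")
        os.mkdir(first)
        os.mkdir(last)
        ls_path = os.path.join(last, "ls")
        with open(ls_path, "w") as f:
            f.write("#!/bin/sh\n")
        os.chmod(ls_path, 0o755)
        table = build_command_hash([first, last])
        # One dictionary lookup replaces a stat() in every path directory.
        print(table["ls"])
```

The cost is paid once when the table is built; each later command is a single lookup, which is the startup-time versus execution-time trade the post refers to.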

The search being discussed is the path search to build the hash.

The current design which allows this is based on the idea of trading
startup time for execution time, with the idea that you run a shell
once per some number of commands, so it's a good trade.

If you have to search each potential executable as a directory to
see if it has an "a.out" (executable file) "fork", then you have
increased the search time one order of magnitude on an exponential
curve (directories rarely contain one file).

Which means you need to now go back and question the time assumptions
that made you build the hash in the first place.

Rob's suggested optimization, unfortunately, adds a failure mode that
an exhaustive search for possible executables doesn't have.

Using Rob's suggestion, you add possibly bad hash entries for
directories that are marked as executable but which have no a.out
"fork". Further, because these are directories, not attribute lists,
there's no way to make the attribute on the directory "go away" when
the a.out is deleted, or "magically appear" when the a.out is added
the first time.

The result is a lot of mucking around with the programs that create
a.out files to make them do the job for you.

] But then the linear search for a file in a directory is,
] IMHO a hack and should be replaced by something fast like binary tree
] or so... (when talking about performance problems.)

Agreed.

I believe Rob's position is "change as little as possible to make
things work on a case-by-case basis" (he can correct me if he wants).
Changing UFS flexname support into HPFS style btree storage would
contradict that philosophy.

IMO, we should not be content to "just get by with directories",
even if we can solve half of the problems we want to solve by doing
that.

[ ... ]

] This is a point to think about. But then, most filesystem corruption comes
] from not written (meta)data when writing something to disk. Normal
] operation should not write something to a binary...

Or intentional mucking around with the subfiles of these "executables".
You don't need a crash to cause a failure.

I'd point out, though, that directory entries are metadata, so you
are doing exactly what exposes you to risk when you manipulate the
fork namespace, if it's a directory.

] How do you handle the "single focal file system object" in a Unix
] filesystem. Is it a "Unix-File" with a builtin structure that contains
] fields for EAs, binaries...? You then also have to search the file
] for the binary field when executing (see above).
]
] Are the EAs stored in a special block, that is pointed to by something in
] the standard inode? Then this special block can get lost like
] any other block.

Both good questions.

The magic here is that by default, a linker will create a file with
one or more forks (it might, for instance, put in default icon
information as part of the "link"). But since the operations have to
go through the user/kernel interface, you can track the file system
events and make sure that the updates to what's actually on disk are
idempotent. That is, all changes will be recovered, or none of them
will.

A good example of why you might want this is the Windows95 version of
the "4DOS" program, which has a text version string as part of its
icon. If you are in the middle of an update of your 4DOS program from
one version to another, you want both the "a.out" and "default_icon"
forks updated, or you want none of them updated.

You just can't make that guarantee about multiple files in a directory.

So it boils down to "it's irrelevant (only for answering these
particular questions) how the association is actually made in the
kernel" -- what is important is that it is made in the kernel (not in
user space, and not in an exposed transaction space, like a directory,
where the fork manipulation isn't guaranteeably idempotent).


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.
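
The mixed-version failure described in the post for a directory-based bundle can be sketched with a small Python simulation. This is only an illustration under stated assumptions: the "crash" is a raised exception rather than a real power failure, the helper names (`update_bundle`, `read_bundle`, `SimulatedCrash`) are hypothetical, and the fork names "a.out" and "default_icon" are taken from the post. The all-or-nothing property the post asks for is what is usually called atomicity; the sketch shows its absence when two fork files are rewritten one after the other.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for a crash or kill between two separate file writes."""


def update_bundle(bundle_dir, new_forks, crash_after=None):
    """Rewrite each fork file in turn; optionally 'crash' part-way through."""
    for count, (name, data) in enumerate(new_forks.items(), start=1):
        with open(os.path.join(bundle_dir, name), "w") as f:
            f.write(data)
        if crash_after is not None and count == crash_after:
            raise SimulatedCrash(name)


def read_bundle(bundle_dir):
    """Return the current contents of every fork file in the bundle."""
    result = {}
    for name in sorted(os.listdir(bundle_dir)):
        with open(os.path.join(bundle_dir, name)) as f:
            result[name] = f.read()
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as bundle:
        update_bundle(bundle, {"a.out": "v1 code", "default_icon": "v1 icon"})
        try:
            # The upgrade is interrupted after the first fork is rewritten.
            update_bundle(
                bundle,
                {"a.out": "v2 code", "default_icon": "v2 icon"},
                crash_after=1,
            )
        except SimulatedCrash:
            pass
        # The bundle now mixes a new "a.out" with an old "default_icon".
        print(read_bundle(bundle))
```

Nothing in the directory itself records that the two files belong to one update, which is why the post argues for making the association in the kernel.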