*BSD News Article 64430

Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!newshost.dca.gov.au!news.mel.aone.net.au!inferno.mpx.com.au!goliath.apana.org.au!news.syd.connect.com.au!news.mel.connect.com.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.ysu.edu!usenet.ins.cwru.edu!gatech!newsfeed.internetmci.com!sgigate.sgi.com!sdd.hp.com!hamblin.math.byu.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.os.linux.development.system,comp.unix.bsd.freebsd.misc
Subject: Re: Ideal filesystem
Date: 28 Mar 1996 20:55:13 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 291
Message-ID: <4jeubh$f12@park.uvsc.edu>
References: <4hptj4$cf4@cville-srv.wam.umd.edu> <3140C968.20699696@netcom.com> <4ilgto$861@floyd.sw.oz.au> <4j6if4$15gk@news.missouri.edu>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.os.linux.development.system:20150 comp.unix.bsd.freebsd.misc:16144

rhys@vortex.cc.missouri.edu (Justin "Rhys Thuryn" McNutt) wrote:
] Jeremy Fitzhardinge (jeremy@suede.sw.oz.au) wrote:
] : In article <4if9gb$4kh@park.uvsc.edu>,
] : 	Terry Lambert <terry@lambert.org> writes:
] : >The 'non-silly feature' list includes:
] : >1)	Application icon information
] 
] : Feh.  May as well build it into the executable.  ELF is good for that
] : kind of thing, if you're really excited by it.
] 
] I agree.  The icons themselves should be built into the executable.  This 
] is probably one of the *very* *very* few things that Windows 3.0 did 
] right.  The icon is *in* the binary.  Simple.  Useful.  Effective.

Ugh.  Yeah, I want the vendor telling me what my desktop looks
like, riiiiiiiight.

The icon information (which I hold to be distinct from icon data)
is an identifier that the desktop then maps to an icon based on
user configuration information... with the vendor getting to
supply a default, nothing more.
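
A minimal sketch of that lookup, assuming the desktop keeps a per-user
table keyed by icon identifier.  The identifiers, paths, and function
name below are hypothetical, purely for illustration:

```python
# Hypothetical tables: the identifiers and paths are illustrative only.
VENDOR_DEFAULTS = {"acme.wordproc.document": "/usr/share/icons/acme-doc.xpm"}
GENERIC_ICON = "/usr/share/icons/generic.xpm"


def resolve_icon(icon_id, user_map, vendor_defaults=VENDOR_DEFAULTS):
    """Map an icon identifier to icon data.

    The user's configuration wins; the vendor only supplies a default;
    anything unknown falls back to a generic icon.
    """
    if icon_id in user_map:
        return user_map[icon_id]
    return vendor_defaults.get(icon_id, GENERIC_ICON)
```

Since the file carries only the identifier, deleting the vendor's
executable removes none of the icon data other files map to.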

The Windows "in the binary" approach (and the similar X
implementation for default icon, which may be overridden by
the desktop manager) *SUCKS* because it severely limits your
options.

This is readily apparent in Windows95, which has an "ID to
icon" map in the registry, since it is difficult to pick "one
icon out of all possible icons" for a file without examining
every executable.

Further, since icons are associated with executables instead
of being in a common database (for non-default icons), if you
remove an executable with a rich set of icons in it, you can
damage the appearance of many unrelated files by virtue of
removing an object which (improperly) was acting as a container
object both for the icons you wanted to keep and the data you
wanted to destroy.

No, icon databases should not be in the binary, or in a fork
(or ELF or COFF segment) in the binary.


] : >2)	Desktop position information for an icon
] 
] : Bad for any multiuser machine.
] 
] And way too complicated to store in an inode, and useless.  The location 
] of a file in the filesystem *means* something.  Where an icon sits on the 
] desktop is almost completely irrelevant.  Leave icon placement to the 
] window manager (or file browser).

Which can operate on the EAs (extended attributes) for the file
using a per user section of the combined file.

What isn't obvious (and what you both seem to have neglected)
is that hard links mean that placement information for a
directory as a container object varies by directory as well
as binary.  Thus the proper placement information is per
directory.

Consider the current AppleTalk servers, where moving icons
around in a directory can allow two users of the same directory
to "fight" over icon location.
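
One way to picture this, as a sketch only: key the placement by user
and by directory entry rather than by inode.  The class and method
names here are hypothetical, not any real desktop's API:

```python
class PlacementStore:
    """Icon positions keyed by (user, directory, entry name).

    Keying on the directory entry instead of the inode lets two hard
    links to one file sit at different positions, and the per-user key
    keeps two users of one directory from fighting over a location.
    """

    def __init__(self):
        self._positions = {}

    def set_position(self, user, directory, name, x, y):
        self._positions[(user, directory, name)] = (x, y)

    def get_position(self, user, directory, name, default=(0, 0)):
        return self._positions.get((user, directory, name), default)
```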


] : >3)	Character set attribution of files; you can use this
] 
] : You mean having multiple representations for the one file, or
] : just the filename?  Or just having encoding information associated
] : with the file?
] 
] I think he is trying to solve the following problem:  I am using a French 
] ANSI character set.  When you, the Brit, read my file, a lot of it comes 
] out as unreadable characters.  He wants to solve that problem by putting 
] the character set information in the inode.  It's not a bad idea, but it 
] could get confusing.  What about devices?  What about directories?  Some 
] more discussion on this would be helpful.

See the comp.std.internat archives of two years ago for a discussion
between myself and M. OHTA about language attribution of files,
interaction with "cat" and other programs, etc.  This has been
discussed to death.

] : >4)	Compression headers for compressed storage on a[...]
] : >6)	Data migration information for migrating the file to/from[...]
] : >8)	ACL's (Access Control Lists), ala VMS, to allow finer[...]
] : >9)	NetWare style "trustee" information.[...]
] : >11)	"Installed image" privileges, ala the SVR4 protections[...]
] 
] : You mean a general way for layers over the filesystem to store
] : per-file metadata?
] 
] 8, 9, and 11 above sound like a good idea, particularly number 11.  
] SUID/SGID is still a good idea, but in many cases, like the sendmail 
] example earlier, it is too powerful.  Anything that allows me to get 
] closer to "minimum necessary access", especially in automated apps and 
] daemons, gets my support.
] 
] As far as 4 goes, I like things the way they are now.  Gzip is a great 
] program.  If we start making the OS compress everything (disks, memory, 
] etc...) we're going to see these wonderful new processors run like 386s.  
] In other words, I think that making any part of the OS deal with 
] compression is adding an unnecessary layer of overhead.  I want my files 
] messed with by *me* and *me* *only*.

1)	There exist compressed images on media already; for
	instance, most CDROMs, if they aren't cheaping out
	on you, have > 800M or so of data on media where it
	would not fit otherwise.

2)	Never underestimate the willingness to trade compute
	cycles for space.  In point of fact, most modern systems
	are I/O bound, not CPU bound (tell that to your 33MHz
	I/O bus on your 200 MHz P6), and it is worthwhile to
	compress to effectively double the I/O bandwidth.

3)	Interoperability with "DoubleSpace" and "DriveSpace"
	drives which already exist.  Deal with them.

4)	"Old" files can be data-migrated to a compressed form
	with near-zero impact.

These are data migration issues, not "compressed drive" issues.
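
The arithmetic behind point 2 can be sketched as below.  The pipeline
model (throughput limited by the slower of the bus and the
decompressor) and all of the numbers are my assumptions for
illustration, not measurements:

```python
def effective_bandwidth(bus_mb_s, compression_ratio, decompress_mb_s):
    """Uncompressed MB/s delivered when compressed data crosses the bus.

    Assumed model: each MB moved over the bus expands to
    compression_ratio MB, and the CPU can emit at most decompress_mb_s
    of uncompressed data, so the slower stage sets the rate.
    """
    return min(bus_mb_s * compression_ratio, decompress_mb_s)


# I/O bound: a 2:1 ratio doubles what a 10 MB/s bus delivers.
io_bound = effective_bandwidth(10.0, 2.0, 50.0)    # 20.0
# CPU bound: a slow decompressor caps the gain instead.
cpu_bound = effective_bandwidth(10.0, 2.0, 15.0)   # 15.0
```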

] : >5)	Name/location attribution so that they don't end up as
] : >	an inode number in lost and found: with full referential
] : >	integrity, fsck can put the files back where they belong.
] : >10)	Parent/predecessor/location information; ideally, you'd[...]
] 
] : Unix files don't have *a* name or location.  And if you're writing
] : a new filesystem, one hopes it will eliminate fsck altogether.
] 
] One would hope so, but anyway, here's his point (I think) again:  If an 
] inode contained the information about its location inside itself, you 
] could do away with lost+found altogether.  If the directory structure got 
] screwed up, fsck (or whatever is "in charge" of the filesystem) could 
] just do a MacOS style "Rebuilding the desktop (read: filesystem)" and go 
] through the location information in the inodes to reconstruct the 
] directory structure (if a directory "file" got trashed, the inode would 
] contain its location and permissions, and the "contents" would be 
] recreated from other inodes).

Yes.  I would, of course, prefer some form of ordered writes
mechanism to make this completely unnecessary, and kill the
lost+found altogether.

] I am unclear about this next part, so I am going to state how I *assume* 
] it works, and then suggest how this would fit into the above:
] 
] A hard link is just another inode pointing to the same data as another 
] inode.  DOS would throw a fit, calling this "cross-linked files".  Unix 
] doesn't care.  Anyway, each hard link would contain location information 
] for itself, not the file contents.

Not quite.

You need to divorce the concept of a directory node from a
data node on disk.  Thus you could have a linked list of
directory nodes, all of which have parent + sibling data
as well as pointing to the data node on the disk.

Thus each hierarchy is computationally N-P complete, as long
as you prohibit hard links on directories (which are a historical
kludge for the lack of an atomic rename system call).

So if I have a hard link to inode 7501 in directory "foo" named
"foofile" and in directory "fee" named "feefile", I can recover
the path used to access the node if the link identifiers were the
in core objects.

Effectively, this is a separation of in core inode structural
members from in core inode data members.

Clearly, you would have to manipulate the list when pushing
the link count above one (the predominant case would be one
link, anyway).
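
A rough sketch of the split, assuming each link node carries its name
and a parent pointer and each data node keeps a list of the links that
reference it.  The class names are hypothetical, and sibling pointers
are left out for brevity:

```python
class DataNode:
    """The data side: a number in the flat inode name space."""

    def __init__(self, number):
        self.number = number
        self.links = []          # every link node naming this data


class LinkNode:
    """The structural side: one name in one directory."""

    def __init__(self, name, parent, data):
        self.name = name
        self.parent = parent     # link node of the directory; None at root
        self.data = data
        data.links.append(self)

    def path(self):
        """Walk parent pointers to rebuild the path for this link."""
        parts = []
        node = self
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))


def paths_for(data):
    """All paths that reach one data node, one per hard link."""
    return sorted(link.path() for link in data.links)
```

With root directories "foo" and "fee" each linking inode 7501,
paths_for() on that data node yields both /fee/feefile and
/foo/foofile, which is what fsck would need to put a file back.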

To really explain this, you have to consider the inode name
space as being a flat numeric name space, and the imposition of
directory hierarchy as a totally separate idea.

This makes a complete file system, populated with files,
directories, and hard links, into a three dimensional
manifold, with a complex connected projection onto the
numeric name space.

You can think of the block management system as being simply an
n/m mapping mechanism that the numeric name space is projected
onto, based on object granularity and length.

] A symbolic link would be the same way.  The "contents" of the link are 
] just a pointer to the "real" file.  The location information would 
] contain the location of the link.  For example the link in /lib "libm.so" 
] would *contain*:  "libm.so.5".  However, its *location* information in 
] the inode would be "/lib", since the *link* is in /lib and points to 
] libm.so.5.

A symlink can be a symlink, as they currently exist, without
impacting anything.

An inherited permission set *demands* that links into hierarchies
with permission differences set an explicit permission on the
object and divorce inheritance.  For a symlink, this means it is
still correct to return the actual path to the link target
instead of returning the path taken to the link target.

This is an important distinction, actually, since it demands
the ability to calculate location of a file target -- unlike
directories, files don't have a ".." to aid this calculation
in current FS implementations.

] So you'd only get messed up dir structures if *inodes* got screwed up, in 
] which case you have bigger problems than just a trashed FS.  Your files 
] themselves are probably nuked too, and your disk is probably making funny 
] noises.  :)

Yes. If you order idempotent operations to guarantee atomicity
of the transactions (using sync writes, delayed ordered writes,
soft updates, or some other ordering mechanism), then you are
safe from anything but hardware failure... of course you are
always at risk from hardware failure.
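
A toy model of why the ordering matters, and nothing more: this is
not soft updates or any real FS code.  The "disk" is a dictionary and
every name is hypothetical.  Creating a file takes two writes; with
the inode written before the directory entry, a crash after any
number of writes never leaves an entry naming a missing inode:

```python
def write_inode(disk, ino):
    disk["inodes"][ino] = {"nlink": 1}


def write_entry(disk, dirname, name, ino):
    disk["dirs"][dirname][name] = ino


def is_consistent(disk):
    """True if every directory entry names an inode that exists."""
    return all(ino in disk["inodes"]
               for entries in disk["dirs"].values()
               for ino in entries.values())


def crash_after(n_writes, ordered=True):
    """Create /foofile, 'crashing' after n_writes of the two writes."""
    disk = {"inodes": {2: {"nlink": 1}}, "dirs": {"/": {}}}
    writes = [lambda: write_inode(disk, 7501),
              lambda: write_entry(disk, "/", "foofile", 7501)]
    if not ordered:
        writes.reverse()         # entry first: the unsafe order
    for write in writes[:n_writes]:
        write()
    return disk
```

In the ordered case the worst outcome is an inode nothing references,
which is harmless to reclaim; crash_after(1, ordered=False) leaves a
dangling entry instead.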

] : >7)	Creator application information, so that a document can
] : >	"know" the application needed to access it, and a desktop
] 
] : That's a pretty disgusting way of going about it.  One doesn't need
] : to know who created a file, but what the type of the file is.  That
] : works so far, but often a file is one of many types (a TeX source file
] : is just plain ascii text; it's something which TeX can format; it's
] : something which emacs knows how to handle specially).  If you have
] : unlimited filename lengths, encoding that kind of info into the filename
] : is a useful shorthand, though it might be nice to have a MIME type
] : associated with files.
] 
] This is where I differ from many people.  I don't *like* the OS, or 
] anything else, assuming what's in a file based on the filename, inode 
] information, or anything else (exception:  difference between a file and 
] a directory).  When I tell grav to open a file, I want it to open that 
] file!  If it isn't a graphic, I want it to complain, but I want it to 
] *try* first.

There is nothing in my suggestion precluding opening the file
that way.

The question is whether the file will have the capability to
say "reference application through file to operate on file".
I think the answer should be "yes".

If you want to use a command line, or drag the icon onto an
application other than the default application for that file
class, then fine; that is allowable.

By default, if I double-click a GIF file, I want an application
that can display (and preferably, edit) GIF files to be run
with the file I clicked on as its argument.

] Perhaps if when I "execute" a data file the OS or the shell or whatever 
] brought up a customizable list of apps that I can choose from that are in 
] some way related to the file, it would be nice, BUT:
] 
] a)  This shouldn't be a part of the filesystem.  The programs associated 
] with files will differ from system to system.  Filesystems shouldn't have 
] to know anything about the contents of your machine.  In other words, the 
] filesystem shouldn't have to know whether or not you have one or another 
] app installed.  It should only have to know hard information about the 
] file, not abstractions like: "what is this file used for?".


See my argument against the Windows95 "registry/icon" mechanism
in another post.  I believe it is valid here as well.

] b)  I can't tell you how much I *hate* the way the Macs implement this.  
] Once a file has been "associated" with a particular app, it's a pain in 
] the rear to change the "creator" application (once a Word doc, always a 
] Word doc).  See a) above.

This is because the Mac attributes the file by creator instead
of by type, meaning that you can't separately change the type
to application mapping.  This is what's broken about
Mac file attribution... there is no indirection mechanism to
allow you to customize and/or upgrade to new applications that
operate on the same file type.
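
A sketch of the missing indirection, assuming the file carries only a
type attribute and the type to application table lives elsewhere.
The table contents and application names are hypothetical examples:

```python
# Hypothetical system-wide defaults: file type -> application.
SYSTEM_APPS = {"image/gif": "gifview", "text/plain": "vi"}


def default_app(file_type, user_apps, system_apps=SYSTEM_APPS):
    """Pick the application for a file type.

    The file records its type, never an application.  The user's table
    overrides the system table, so switching or upgrading applications
    is one table change with no per-file rewrite.
    """
    if file_type in user_apps:
        return user_apps[file_type]
    return system_apps.get(file_type)


# One entry retargets every GIF file for this user.
user_apps = {"image/gif": "gifedit"}
chosen = default_app("image/gif", user_apps)   # "gifedit"
```

Opening a file with some other application from the command line
bypasses this table entirely; it only decides the default.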


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.