Newsgroups: comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!csn!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: File Truncation Philosophy
Message-ID: <1993Apr11.035322.19610@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <1993Apr2.072443.790@cm.cf.ac.uk> <1993Apr8.002028.2376@fcom.cc.utah.edu> <1993Apr8.025858.22137@uvm.edu>
Date: Sun, 11 Apr 93 03:53:22 GMT
Lines: 136

In article <1993Apr8.025858.22137@uvm.edu> wollman@sadye.emba.uvm.edu (Garrett Wollman) writes:
>In article <1993Apr8.002028.2376@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:
>>I can live with the canonical fix, but want the prettier fix, since
>>the file would act like it *wasn't* a swap store... the same actions
>>would be required on any inode write attempt, not just truncation.
>
>I can't live with the prettier fix, unless you have a way to make
>other memory-mapped files work correctly.  Remember that, as far as
>the VM system is concerned, an executable is just another
>memory-mapped file.  This is the reason why the ``obvious'' fix could
>not be made in the vnodepager: it doesn't know that the file it's
>paging is an executable---only execve() knows that.  I would argue
>that that is as it should be.

Well, I've promised some people to elaborate on "the pretty fix" in some
private mail, and it's kinda been misinterpreted here.  I am *not*
suggesting going back to the previous VM code (even though this will work)
because it would mean losing the "instant start" benefits, as well as the
non-modified text page benefits (non-modified text pages take vm cache but
do not take swap).

First, let's consider the issues, boiled down to the essentials, based on
0.1 PL0.2.2 with none of the recently suggested patches installed:

  o   Processes swap pages from text files instead of swap
      -   startup is faster because the copy-to-swap is avoided
      -   this saves on swap
      -   due to the lack of a unified VM/buffer cache, this is slower,
          since a copy to vm cache from FS buffer cache is required for
          each page-in
  o   The VTEXT flag is not correctly set on files during exec to
      indicate that an EBUSY or ETXTBSY should result from an attempt to
      open or truncate the file.
      -   Setting VTEXT on the vnode in exec returns the proper error
          codes on attempts to truncate or write a running program's
          original image.
      -   If the image is open before VTEXT is set by it being run,
          however, write and truncate operations subsequent to executing
          the opened image do *NOT* correctly return errors.
      -   Returning an ETXTBSY is *NOT* Posix compliant; the error
          ETXTBSY is *NOT* supported in Posix.
      -   Images open for write or in such a way as to allow truncation
          should not be allowed to execute (EBUSY?).
      -   Disallowing the running of images which may be potentially
          modified is also *NOT* Posix compliant.
      -   NFS does not provide a way of sharing current vnode flags
          across exported/imported file systems; there is no way to
          solve this problem in the current implementation of NFS.  We
          are lucky that this is unlikely to be a problem, given that
          "normal" usage of NFS does not export executable directories
          as writable, nor is concurrent access of user images by
          writing and execution on different hosts likely (although it
          is possible).

Implementing the "pretty" solution on top of the "EBUSY/ETXTBSY" will need
the following:

  o   Posix compliance.
      -   The EBUSY/ETXTBSY returns have to be hidden (preferably in the
          VFS/vncalls layer) to ensure the presentation of a Posix
          compliant interface to the user.  Basically, corrective action
          is taken at the hiding layer, and the operation is retried.
          The resulting EBUSY/ETXTBSY are *internal* and not exposed to
          the services consumer, who expects Posix compliance.
  o   Writing to files cannot be allowed to crash the system.  As an
      alternative to the *UGLY* (non-Posix-compliant) solution *and* an
      alternative to copying the program to swap on start up as in the
      old VM system, the following can be done:
      -   On open for write/truncation, the text pages from the file
          belonging to the executable image are copied to swap or are
          copied to memory pages marked as swappable and dirty.  This
          will result in protection of the image from overwrite of its
          swap store (since it will no longer be using the file as the
          swap store).  A file can be determined to be open as a swap
          store for an image by examining its flags to determine if
          VTEXT is set *before* the vnode reference count is bumped.
      -   Potentially, an additional flag indicating the process image
          was copied to swap could be used to allow subsequent
          invocations even during or following modification of the file.
      -   Files already open for text access must be assumed to be open
          for writing (unless we add another flag to the in core vnode
          to indicate whether or not the vnode is considered writable),
          and running the process must require one of:
          +   refuse to run an image undergoing changes.
          +   copy the current file image to swap as per the old VM
              approach and *then* run as a process.
          The first solution is, to my mind, superior, although it again
          raises the spectre of Posix compliance.
  o   Speedups.
      -   The VM and buffer cache must be unified to minimize the swap
          overhead on swap-from-file for the file-as-swapping-store case
          during normal use.

>I haven't tried out the execve implementation of the ``obvious'' fix
>to see if it works yet.  (I'm so reluctant to reboot my machine when
>my NTP is doing
>	xntpd[78]: offset 0.006774 freq -62.73834 comp 4
>so well.)

Good numbers -- and the "obvious" fix fails in the "file already open for
writing/truncation when execution takes place" case.
A write to an executing image is not prevented provided the open occurred
prior to VTEXT being set.  The setting of VTEXT needs to respect a non-zero
reference count when VTEXT is not already set.  None of the currently
posted fixes resolves this issue.  Mark Tinguely is currently researching
the issue (he's one of the people I promised this writeup to).

PS to Garrett: I have been unable to contact you via email -- please
contact me with any information or preferences on inclusion of your
loadable module interface in the 0.1.5 release, and any information you
feel relevant to Sun-style shared libs in 0.1.5.  I have some stuff, but
I'd have to remove a lot of "not ready for public consumption" code to use
them instead (plus it reduces friction if I concede beforehand 8-).


					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of
my present or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------
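
For anyone who wants the "file already open for writing/truncation when
execution takes place" case spelled out, here is a rough user-space sketch
in C.  It is a toy model, not 386BSD kernel code: struct model_vnode,
MODEL_VTEXT, the usecount field and the three model_* functions are all
made up for illustration, and the choice of EBUSY for the refused exec is
only the "(EBUSY?)" guess from the list above.  It contrasts an exec that
sets the text flag unconditionally with one that respects a non-zero
reference count when the flag is not already set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_VTEXT 0x01  /* hypothetical stand-in for the VTEXT flag */

/* Hypothetical user-space stand-in for an in-core vnode. */
struct model_vnode {
    int flags;     /* MODEL_VTEXT once the file backs a running image */
    int usecount;  /* references taken by opens */
};

/* Open for write/truncate: the flag is examined *before* the
 * reference count is bumped, and the open is refused for text. */
int model_open_write(struct model_vnode *vp)
{
    assert(vp != NULL);
    if (vp->flags & MODEL_VTEXT)
        return ETXTBSY;
    vp->usecount++;
    return 0;
}

/* Exec that marks the vnode and looks at nothing else.  An open
 * taken earlier is never noticed, so a later write goes through. */
int model_exec_naive(struct model_vnode *vp)
{
    assert(vp != NULL);
    vp->flags |= MODEL_VTEXT;
    return 0;
}

/* Exec that respects a non-zero reference count when the flag is
 * not already set: the earlier open is assumed to be writable and
 * the image is refused (assumed errno: EBUSY). */
int model_exec_checked(struct model_vnode *vp)
{
    assert(vp != NULL);
    if (!(vp->flags & MODEL_VTEXT) && vp->usecount > 0)
        return EBUSY;
    vp->flags |= MODEL_VTEXT;
    return 0;
}
```

With a vnode that has already been opened for write, model_exec_naive()
succeeds and leaves the writer holding its reference, while
model_exec_checked() refuses to run the image.  In the other order (exec
first, open second) both variants make model_open_write() fail.  The
sketch deliberately picks the "refuse to run" option; the copy-to-swap
alternative and the layer that hides the error from the user are not
modeled.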