Newsgroups: comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!csn!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: File Truncation Philosophy
Message-ID: <1993Apr11.035322.19610@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <1993Apr2.072443.790@cm.cf.ac.uk> <1993Apr8.002028.2376@fcom.cc.utah.edu> <1993Apr8.025858.22137@uvm.edu>
Date: Sun, 11 Apr 93 03:53:22 GMT
Lines: 136
In article <1993Apr8.025858.22137@uvm.edu> wollman@sadye.emba.uvm.edu (Garrett Wollman) writes:
>In article <1993Apr8.002028.2376@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:
>>I can live with the canonical fix, but want the prettier fix, since
>>the file would act like it *wasn't* a swap store... the same actions
>>would be required on any inode write attempt, not just truncation.
>
>I can't live with the prettier fix, unless you have a way to make
>other memory-mapped files work correctly. Remember that, as far as
>the VM system is concerned, an executable is just another
>memory-mapped file. This is the reason why the ``obvious'' fix could
>not be made in the vnodepager: it doesn't know that the file it's
>paging is an executable---only execve() knows that. I would argue
>that that is as it should be.
Well, I've promised some people to elaborate on "the pretty fix" in some
private mail, and it's kinda been misinterpreted here. I am *not*
suggesting going back to the previous VM code (even though this will work)
because it would mean losing the "instant start" benefits, as well as the
non-modified text page benefits (non-modified text pages take vm cache
but do not take swap).
First, let's consider the issues, boiled down to the essentials, based on
0.1 PL0.2.2 with none of the recently suggested patches installed:
o Processes swap pages from text files instead of swap
- startup is faster because the copy-to-swap is avoided
- this saves on swap
- due to the lack of a unified VM/buffer cache, this is slower,
since a copy to vm cache from FS buffer cache is required for
each page-in
o The VTEXT flag is not correctly set on files during exec to
indicate an EBUSY or ETXTBSY should result from an attempt to
open or truncate the file.
- Setting VTEXT on the vnode in exec returns the proper error
codes on attempts to truncate or write a running program's
original image.
- If the image is already open before VTEXT is set by running it,
however, write and truncate operations that follow the exec
do *NOT* correctly return errors.
- Returning an ETXTBSY is *NOT* Posix compliant; the error
ETXTBSY is *NOT* supported in Posix.
- Images open for write or in such a way as to allow truncation
should not be allowed to execute (EBUSY?).
- Disallowing the running of images which may be potentially
modified is also *NOT* Posix compliant.
- NFS does not provide a way of sharing current vnode flags
across exported/imported file systems; there is no way to
solve this problem in the current implementation of NFS. We
are lucky that this is unlikely to be a problem given "normal"
usage of NFS does not export executable directories as
writable, nor is concurrent access of user images by writing
and execution on different hosts likely (although it is possible).
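
The "already open before VTEXT is set" hole listed above can be modelled in a few lines of user-space C. This is only a toy sketch: `struct toy_vnode`, `TOY_VTEXT`, `toy_open_write`, `toy_exec` and `toy_write` are hypothetical names made up for illustration, not the 386BSD kernel interfaces, and placing the check at open time is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT 0x01          /* hypothetical stand-in for the VTEXT flag */

struct toy_vnode {
    int flags;                  /* TOY_VTEXT once the file is being executed */
    int writers;                /* descriptors currently open for writing */
};

/* Open for write: refused only if the image is already running. */
int toy_open_write(struct toy_vnode *vp)
{
    assert(vp != NULL);
    if (vp->flags & TOY_VTEXT)
        return ETXTBSY;
    vp->writers++;
    return 0;
}

/* Naive exec-time fix: mark the vnode as text without looking at writers. */
int toy_exec(struct toy_vnode *vp)
{
    assert(vp != NULL);
    vp->flags |= TOY_VTEXT;
    return 0;
}

/* Write through a descriptor that is already open. Nothing is rechecked
 * here, so a writer that opened before the exec still gets through. */
int toy_write(struct toy_vnode *vp)
{
    assert(vp != NULL);
    if (vp->writers == 0)
        return EBADF;           /* no descriptor open for writing */
    return 0;
}
```

In this model, exec followed by open-for-write fails with ETXTBSY as intended, but open-for-write followed by exec followed by write succeeds, which is the case the simple VTEXT fix misses.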
Implementing the "pretty" solution on top of the "EBUSY/ETXTBSY" fix will
need the following:
o Posix compliance.
- The EBUSY/ETXTBSY returns have to be hidden (preferably in the
VFS/vncalls layer) to ensure the presentation of a Posix
compliant interface to the user. Basically, corrective action
is taken at the hiding layer, and the operation is retried.
The resulting EBUSY/ETXTBSY are *internal* and not exposed to
the services consumer, who expects Posix compliance.
o Writing to files can not be allowed to crash the system. As an
alternative to the *UGLY* (non-Posix-compliant) solution *and*
an alternative to copying the program to swap on start up as in
the old VM system, the following can be done:
- On open for write/truncation, the text pages from the file
belonging to the executable image are copied to swap or are
copied to memory pages marked as swappable and dirty. This
will protect the image from having its swap store overwritten
(since it will no longer be using the file as the
swap store). A file can be identified as the swap
store for an image by examining its flags to determine if
VTEXT is set *before* the vnode reference count is bumped.
- Potentially, an additional flag indicating the process image
was copied to swap could be used to allow subsequent invocations
even during or following modification of the file.
- Files already open for text access must be assumed to be open
for writing (unless we add another flag to the in core vnode
to indicate whether or not the vnode is considered writable),
and running the process must require one of:
+ refuse to run an image undergoing changes.
+ copy the current file image to swap as per the old VM approach
and *then* run as a process.
The first solution is, to my mind, superior, although it again
raises the spectre of Posix compliance.
o Speedups.
- The VM and buffer cache must be unified to minimize the swap
overhead on swap-from-file for the file-as-swapping-store case
during normal use.
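
The "hide the error, take corrective action, retry" idea from the outline above might look roughly like this. It is a user-space toy under stated assumptions: every name here (`struct toy_vnode`, `TOY_VTEXT`, `TOY_VSWAPPED`, `toy_copy_text_to_swap`, and so on) is hypothetical, and the page copy is reduced to a counter rather than real VM work.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT    0x01   /* file is the swap store of a running image */
#define TOY_VSWAPPED 0x02   /* hypothetical: image text now lives in swap */

struct toy_vnode {
    int flags;
    int writers;            /* descriptors open for writing */
    int text_pages;         /* text pages backed by the file */
    int pages_in_swap;      /* text pages moved off the file */
};

/* Internal open: may return ETXTBSY, which never reaches the caller. */
static int toy_open_write_internal(struct toy_vnode *vp)
{
    if (vp->flags & TOY_VTEXT)
        return ETXTBSY;
    vp->writers++;
    return 0;
}

/* Corrective action: detach the running image from the file so the
 * file is no longer its swap store. */
static void toy_copy_text_to_swap(struct toy_vnode *vp)
{
    vp->pages_in_swap = vp->text_pages;
    vp->flags &= ~TOY_VTEXT;
    vp->flags |= TOY_VSWAPPED;
}

/* Hiding layer: the caller sees a Posix-style success, not ETXTBSY. */
int toy_open_write(struct toy_vnode *vp)
{
    int error;

    assert(vp != NULL);
    error = toy_open_write_internal(vp);
    if (error == ETXTBSY) {
        toy_copy_text_to_swap(vp);
        error = toy_open_write_internal(vp);   /* retry once */
    }
    return error;
}
```

The copy cost is paid only when somebody actually opens a running image for writing, so the common case keeps the fast start and the swap savings.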
>I haven't tried out the execve implementation of the ``obvious'' fix
>to see if it works yet. (I'm so reluctant to reboot my machine when
>my NTP is doing
> xntpd[78]: offset 0.006774 freq -62.73834 comp 4
>so well.)
Good numbers -- and the "obvious" fix fails in the "file already open
for writing/truncation when execution takes place" case. A write to
an executing image is not prevented provided the open occurred prior
to VTEXT being set. The setting of VTEXT needs to respect a non-zero
reference count when VTEXT is not already set. None of the currently
posted fixes resolves this issue. Mark Tinguely is currently researching
the issue (he's one of the people I promised this writeup to).
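
As a rough illustration of "respect a non-zero reference count when VTEXT is not already set", here is a toy check in user-space C. The names `struct toy_vnode`, `TOY_VTEXT`, `usecount` and `toy_exec_mark_text` are hypothetical, and returning EBUSY is only one of the possible policies (the other being a copy to swap before running).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT 0x01      /* hypothetical stand-in for the VTEXT flag */

struct toy_vnode {
    int flags;
    int usecount;           /* references held before exec takes its own */
};

/* Mark a vnode as text at exec time. If the vnode is not yet text and
 * somebody already holds a reference, assume it may be open for writing
 * and refuse, rather than setting the flag blindly. */
int toy_exec_mark_text(struct toy_vnode *vp)
{
    assert(vp != NULL);
    if (!(vp->flags & TOY_VTEXT) && vp->usecount > 0)
        return EBUSY;
    vp->flags |= TOY_VTEXT;
    vp->usecount++;         /* the new process holds its own reference */
    return 0;
}
```

Treating every existing reference as a potential writer is deliberately conservative; a separate "writable" flag on the in-core vnode would let read-only opens through.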
PS to Garrett: I have been unable to contact you via email -- please
contact me with any information or preferences on inclusion of your
loadable module interface in the 0.1.5 release, and any information you
feel relevant to Sun-style shared libs in 0.1.5. I have some stuff, but
I'd have to remove a lot of "not ready for public consumption" code to
use them instead (plus it reduces friction if I concede beforehand 8-).
Terry Lambert
terry@icarus.weber.edu
terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
--
-------------------------------------------------------------------------------
"I have an 8 user poetic license" - me
Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------