Newsgroups: comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!elroy.jpl.nasa.gov!swrinde!zaphod.mps.ohio-state.edu!uwm.edu!caen!nic.umass.edu!news.mtholyoke.edu!news.byu.edu!ns.novell.com!gateway.univel.com!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: VM problems w/unlimited memory?
Message-ID: <1993Mar18.183443.6397@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University  (Ogden, UT)
References: <1o4spvINNl1v@usenet.INS.CWRU.Edu> <1993Mar16.221837.10302@fcom.cc.utah.edu> <1o81joINNieh@usenet.INS.CWRU.Edu>
Date: Thu, 18 Mar 93 18:34:43 GMT
Lines: 143

In article <1o81joINNieh@usenet.INS.CWRU.Edu> chet@odin.ins.cwru.edu (Chet Ramey) writes:
>In article <1993Mar16.221837.10302@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:
>
>>Sorry, but the fcntl fails not because it found allocated memory for the
>>closed fd, but because the fd exceeds fd_lastfile in the per-process
>>open file table.  The location, since it doesn't exist, isn't addressable.
>
>This requires that the programmer be aware of the implementation
>technique used by the kernel to manage the process `open file table'.
>It's a kernel problem -- we all agree on that.

The failure mode is a kernel one, yes; however, the selection of an fd
at the extreme high end of the range is a programming bogosity that
should not be occurring anyway.  Yes, it will cause problems on 386bsd
-- it will also, however, cause problems on SVR4 and AIX, both of which
dynamically allocate their per-process file tables.  Certainly this
problem *must* have been solved in other environments in the bash code,
right?

Allocation of all memory on an AIX 3.2 box will result not in a crash,
but in the killing of the process with the largest image size (rather
than the most recently run process or the one doing the excessive
allocation).  This is arguably a *bad thing* for bash to do on AIX,
right?

I think any time the programmer makes assumptions about the kernel
architecture (as one is doing when one allocates a real high fd number
and assumes that fd's are going to be allocated in ascending order by
the "open()" call, such that a real high choice is guaranteed to be a
safe one), one has to be aware of it.  Use of fd 19 with only
single-digit redirection allowed is keying off an architectural
assumption.

>>The safest would be the call I suggested to return the largest open fd
>>and add one to it for your use of the fd; I don't understand from the
>>code why it's necessary to get anything other than the next available
>>descriptor; the ones the shell cares about, 0, 1, and 2, are already
>>taken at that point; if nothing else a search starting at 3 would be
>>reasonable.
>
>Nope.  Not reasonable.  If a user were to use fd 3 in a redirection
>spec, perhaps to save one of the open file descriptors to, or to open
>an auxiliary input, the file descriptor we so carefully opened to
>/dev/tty and count on to be connected to the terminal would suddenly
>be invalid.  Bash attempts to choose an `unlikely' file descriptor --
>nothing is totally safe, but this minimizes problems.

Of course, this brings into question whether or not it is reasonable to
have a fixed open fd at all for this particular purpose.

Not only is it simply a probabilistic exercise -- one can't guarantee
the shell isn't execed from a fork of some program that doesn't protect
the "unlikely" nature of its fd choice (ie: not another shell, and not
login, but another program that makes the same assumptions without the
same protections) -- the fact is, it's unnecessary.  Being able to get
to the controlling tty device at a later date is what "/dev/tty" is
about; as long as the controlling tty isn't blown away, opening
/dev/tty at some future time as a transient fd (ie: close it when done)
is just as effective.  If you can come up with an example of a shell
that can open /dev/tty early in its life, but not later in its life
(ie: its controlling tty changes), I'd like to see it.  Otherwise, what
we are talking about is an access-time optimization by bash based on
some assumptions about architecture which are no longer valid.
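A minimal sketch of the transient approach (the function name and shape
are mine for illustration, not anything out of the bash sources):

	#include <fcntl.h>
	#include <unistd.h>

	/*
	 * Read from the controlling tty without reserving an fd for it:
	 * open /dev/tty, do the I/O, close it again.  No fd number is
	 * held open for a user's redirection to collide with.
	 */
	int
	read_ctty( char *buf, int len)
	{
		int	fd, cc;

		if( ( fd = open( "/dev/tty", O_RDONLY)) < 0)
			return( -1);		/* controlling tty is gone*/
		cc = read( fd, buf, len);
		(void) close( fd);		/* transient fd: closed when done*/
		return( cc);
	}

If the controlling tty *has* been blown away, the open fails at the
point of use, which is exactly when you want to find out about it.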
>Bash uses the same technique in shell.c to protect the file descriptor
>it's using to read a script -- all shells do pretty much the same
>thing.  (Well, at least it does now.  The distributed version of 1.12
>does not do this.)

There doesn't seem to be a good reason for this if the shell script is
in core in the shell; the descriptor should be closed after reading.
Since reading takes place entirely before execution, there is no
conflict with the shell script itself.  In any event, a shell script is
run by a sub-shell, not the active shell, unless one is playing games
with a disk-based interpreter and context frames within the shell.  In
that case, the traditional recursion within a shell script played by
many install packages would fail anyway.

>Before Posix.2 it was `safe' to assume that fd 19
>could be used for this -- a process was guaranteed to have at least 20
>file descriptors available to it, and there were no multi-digit file
>descriptors in redirections.  This is no longer the case.  Since bash
>currently uses stdio for reading scripts, and stdio provides no
>portable way to change the file descriptor associated with a given
>FILE *, bash attempts to avoid having to do so.  We used to just use
>the file descriptor returned from opening the script, but ran into a
>number of mysterious problems as a result.

If one is going to make fundamental assumptions about the OS, such as
"there are a finite number of fd's on which collisions can occur" or
"the number of fd's reported by getdtablesize can all be opened, or
opened out of sequence, without repercussions", one might as well
either directly manipulate the FILE * contents, including the fd, or
prepare for failure.  There *is* a documented, *portable* way of
replacing the fd associated with a stream:

	FILE	*newfp;
	FILE	savfp_str;

	memcpy( &savfp_str, oldfp, sizeof( FILE));	/* save old contents*/
	newfp = fdopen( fd, "r+");	/* actual mode derivable from oldfp*/
	memcpy( oldfp, newfp, sizeof( FILE));

Of course &savfp_str can be treated as if it were a FILE *; but if the
contents of the file struct were what you wanted to modify, this will
do it.

Traditionally, the solution has been to use a "non-portable method" to
directly manipulate the fd in the fp; this is what most shells do; this
is no less non-portable than making assumptions about "safe fd's" or
about range limits equalling operational limits (by using the highest
known fd possible).
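For comparison, the "non-portable method" is a one-liner.  This sketch
assumes a 4.3BSD-derived stdio, where the descriptor is kept in the
_file member of the FILE structure; other stdio implementations keep it
under other names and widths, which is precisely why it's non-portable:

	#include <stdio.h>

	/*
	 * Point an existing stream at a new descriptor by poking the
	 * stdio internals directly.  The _file member is an
	 * implementation detail, not an interface; on implementations
	 * where it is a char, it also quietly limits the usable fd
	 * range -- one more architectural assumption.
	 */
	void
	stream_setfd( FILE *fp, int newfd)
	{
		fp->_file = newfd;
	}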
>>I do *NOT* suggest 19, as this would tend to be bad for
>>most SVR4 systems, as it would have a tendency to push the number of
>>fd's over the bucket limit if NFPCHUNK were 20 and allocation were done
>>from index 20 on up.
>
>I don't really see how a loop going down starting at 19 will cause the
>fd `table' to grow beyond 20.  (That is the code we were talking about,
>right?)

Sorry; I assumed that the traversal would be "up" from 19 to insure
that the lower numbers (deemed "most important") weren't tromped on.

>I'll probably put something in the code to clamp the max file descriptor
>to something reasonable, like 256, and press for its inclusion in the
>next release of bash, whenever that is.

This will certainly prevent outright failure on AIX and SVR4 (as well
as 386bsd), but it is still a non-optimal solution from the perspective
of unnecessary resource utilization.


					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------