Xref: sserve comp.os.linux:18357 comp.unix.bsd:8769
Path: sserve!manuel.anu.edu.au!munnari.oz.au!news.hawaii.edu!ames!olivea!uunet!haven.umd.edu!decuac!pa.dec.com!vixie
From: vixie@pa.dec.com (Paul A Vixie)
Newsgroups: comp.os.linux,comp.unix.bsd
Subject: Re: [386bsd] cp something to /bin/cp and cp core dumps; bug or feature?
Message-ID: <VIXIE.92Dec5145556@cognition.pa.dec.com>
Date: 5 Dec 92 22:55:56 GMT
References: <Byn6uL.2oM@ra.nrl.navy.mil> <1992Dec2.185331.57@unislc.uucp> <1fjcsuINN2vf@hrd769.brooks.af.mil>
Followup-To: comp.unix.bsd
Organization: DEC Network Software Lab
Lines: 107
NNTP-Posting-Host: cognition.pa.dec.com
In-reply-to: burgess@hrd769.brooks.af.mil's message of 2 Dec 1992 16:21:18 -0600

[Dave Burgess]
> I have also noticed something that bothers me a little bit.  I was ftping
> a bunch of files from point A to point B and managed to ftp /usr/bin/ftp.
> The execed ftp core dumped.
>
> Why?
>
> I have seen the same thing with several other programs (cp, for example).
> What umbilical link is there between a running program and its image on
> disk?

Paged "virtual memory", as BSD implements it, means that programs are
brought into memory in itty bitty pieces called "pages", and various lies
are told that make the program believe that its text and its data and its
stack are all contiguous in memory even though most of it could be missing
and what's there could be in random order in the real RAM.

Each page of "virtual memory" -- meaning, memory as viewed by a user
program -- has several possible states.  It can be "invalid", meaning that
the program did not specify anything for that page and so using it results
in "segmentation violation - core dumped".
It can be "read/write data or stack", which means the program has specified
that something exists there; if the program tries to access it after the
kernel has done something else with that memory, the kernel has to catch
the exception ("page fault"), allocate a page of real memory, change the
page tables to make that real memory look like it's in the place the
program expects it to be, and then fill the memory with the contents the
program gave it (usually this means reading from the swap area, since the
contents were put there when the page was "stolen" by the kernel in the
first place).

Finally, a page can be "read-only text", as in your particular case
(overwriting /usr/bin/ftp while running it).  "Read-only text" is a page
that cannot change while the program is running.  If the kernel has to
steal this page -- or if the program has specified it but not tried to use
it yet -- then it is NOT written out to "swap", since the kernel assumes
that, as read-only text, the original file from which it was loaded will
still be there if the page has to come back.

This is why the file system won't let you write(2) a file that someone
else is running, as shown by...

	% cp /bin/cat mycat
	% ./mycat &
	[7] 10382
	[7] + Suspended (tty input)        ./mycat
	% cp /bin/cat mycat
	cp: mycat: Text file busy
	%

However, the file system is less stringent about allowing you to remove
the file entirely.  That is, you can remove any link ("name") of the file,
but the blocks are not supposed to be deallocated (returned to the free
list where other files can grow into them) until the last program that has
it open closes it.  So, continuing my example:

	% rm mycat
	rm: override protection 755 for mycat? yes
	% cp /bin/cat mycat
	%

"rm" saw that the file seemed to be open, so it asked me if I was sure I
wanted to remove it.  After I did that, I was able to create another file
with the same name (and, as it happens, with the same contents).
I can run this new "mycat" but it will NOT share pages with the one that
I'm running in %7.  You can see the blocks being held away from the free
list by the following continuation of my example:

	% ls -s mycat
	28 mycat*
	% df .
	Filesystem        Total    kbytes   kbytes     %
	node              kbytes   used     free      used  Mounted on
	/dev/rz0f         521885   289461   180236    62%   /a1
	% kill %7
	[7]  Terminated     ./mycat
	% df .
	Filesystem        Total    kbytes   kbytes     %
	node              kbytes   used     free      used  Mounted on
	/dev/rz0f         521885   289433   180264    62%   /a1

The file is (and was before) 28K.  When I killed the job that was running
the old version of the file, 28K magically appeared in my free list --
even though the "rm" command was executed several minutes ago.  Those
blocks were still needed by the kernel, in case %7 had had any of its
read-only text pages stolen by the kernel (or in case it needed one it
hadn't used yet).

Note that NFS makes this harder on everyone, since the server won't keep
track of who has files open (this is because NFS is "stateless").  When
someone on the server removes a file that is being executed by some
client, the blocks GO AWAY IMMEDIATELY.  In recent years, NFS was fixed so
that the client's old block-numbers are invalid to the server after the
file is removed (but not when it's written to!  hahahahaha but I digress).
The symptom you saw with your own /usr/bin/ftp process getting a segfault
because you overwrote the executable still happens when NFS is involved,
but for purely local files (as I expect your /usr/bin/ftp was) it is not
supposed to happen.

I don't have a 386BSD machine to try this on.  I tried ref.tfs.com but
there's some kind of network problem between me and it right now.  I would
love to see someone else run the above examples to see if 386BSD knows how
to keep you from writing on running executables, and whether it hangs onto
"busy blocks" until the last close.
I know that Bill had to rewrite the buffer cache, which is what handles
all of this stuff, and it's possible that this somewhat-obscure boundary
condition didn't get tested in this way.
--
Paul Vixie, DEC Network Systems Lab
Palo Alto, California, USA           "Don't be a rebel, or a conformist;
<vixie@pa.dec.com>  decwrl!vixie      they're the same thing, anyway.  Find
<paul@vix.com>      vixie!paul        your own path, and stay on it."  -me