Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mel.connect.com.au!munnari.OZ.AU!news.hawaii.edu!ames!agate!howland.reston.ans.net!usc!news.cerf.net!news.titan.com!news.tcst.com!op.net!candle.pha.pa.us!not-for-mail
From: root@candle.pha.pa.us (Bruce Momjian)
Newsgroups: comp.unix.bsd.bsdi.misc
Subject: Re: BSDI 2.0.1 swapspace leak fix?
Date: 11 Dec 1995 00:32:33 GMT
Organization: a consultant's basement
Lines: 216
Message-ID: <4afu71$ai@picasso.op.net>
References: <4a0hk0$20d@news2.ucsd.edu> <4aa9mf$6pr@picasso.op.net> <4aaehi$eav@moon.igcom.net>
NNTP-Posting-Host: s1-03.ppp.op.net

David Bauman (david@terra.igcom.net) wrote:
: : Brian Kantor (brian@nothing.ucsd.edu) wrote:
: : : We're suffering from running out of swapspace; it appears that it's a
: : : known problem with BSDI 2.0.1 (and probably earlier) where processes
: : : that fork chew up swapspace.  Any patch to fix this yet?
:
: : I asked BSDI if the next release will fix this problem and was told the
: : "swap overallocation" bug has been improved but not eliminated in the
: : 2.0/2.0.1 release.  I read this to say they do not have a fix for this
: : in the next release.
:
: This is unacceptable.  I have multiple machines running BSDI BSD/OS 2.0.1
: and my whole business relies on BSD/OS.  The price that BSD charges for
: their software should run with NO bugs whatsoever.

I don't know how realistic that standard is.  If you required all
software you paid for to ship with no bugs you wouldn't have much
software.

I worked with Mike Karels to identify the bug in 1.0.  He has looked
into it and talked to the initial Mach developer and the solution is not
easy.  It exists in all 386 BSD implementations as far as I know.  They
have made some changes to make 2.0 less prone to this problem.

Attached is a posting addressing the issue.

---------------------------------------------------------------------------

Because the topic has come up, I would like to just clear up the cause
of the swap overallocation bug, and to confirm that it can lock up the
machine completely with no warning.  I know because I worked with Mike
Karels to find the bug.  Basically with 5MB of RAM and no X-Windows, I
locked up every seven days.  "pstat -T" showed swap allocated getting
bigger and bigger until the system locked up.

This is probably not the problem this particular person is having, but
it is possible.  I have advised the user to log "pstat -T" from a cron
job to eliminate this as a possible cause.

The new version of BSD/OS does not fix this bug, though Mike Karels is
aware of it and certainly wants to fix it, but it is a major job.  The
easiest solution for users is to add more RAM so the condition does not
occur.  With 16MB RAM running X, I get no lockups.

Attached is an old posting outlining the problem:

---------------------------------------------------------------------------

>From maillist Tue Dec 14 23:52:45 1993
Subject: Swap overallocation
To: bsdi-users@bsdi.com (BSDI mailing list)
Date: Tue, 14 Dec 1993 23:52:45 -40962758 (EST)
Cc: karels@bsdi.com
X-Mailer: ELM [version 2.4 PL20]
Content-Type: text
Content-Length: 5046
Status: OR

I am running BSD/386 from BSDI.  When running with 5MB of RAM, I found
that the system locked up about once every week.  In researching the
problem with Mike Karels of BSDI, I think we have found a bug that
exists on BSD/386 and most free 386-based *BSD systems.  Here are the
details.

First, let me define copy-on-write (COW):

When a process forks, the OS maps the address space of both the parent
and child to the same memory pages, and both processes start running.
If either process makes changes to its shared memory pages, the OS makes
a copy of the shared page.  One process gets the original, another gets
the copy.

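As a rough illustration of the copy-on-write definition above, here is a
minimal sketch that is not part of the original posting.  The helper
name cow_demo, its parameters, and the 4096-byte buffer size are
hypothetical choices for the example.  A program cannot see the kernel
copying a page; it can only observe the effect, which is that a write
made by the parent after fork() does not show up in the child.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Buffer in the data segment; after fork() parent and child share it. */
static char page[4096];        /* 4096 is an assumed, typical page size */

/*
 * cow_demo() is a hypothetical helper, not code from the posting.
 * It fills the buffer with `before`, forks, lets the parent overwrite
 * the buffer with `after`, and then asks the child which byte it still
 * sees.  Returns that byte, or -1 on error.
 */
int cow_demo(char before, char after)
{
    int to_child[2], to_parent[2];
    char go = 'g', seen = 0;
    int ok;
    pid_t pid;

    memset(page, before, sizeof page);

    if (pipe(to_child) == -1)
        return -1;
    if (pipe(to_parent) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(to_child[0]);
        close(to_child[1]);
        close(to_parent[0]);
        close(to_parent[1]);
        return -1;
    }

    if (pid == 0) {
        /* child: wait for the parent's write, then report page[0] */
        close(to_child[1]);
        close(to_parent[0]);
        if (read(to_child[0], &go, 1) != 1)
            _exit(1);
        if (write(to_parent[1], &page[0], 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* parent: this write is what forces the private copy of the page */
    close(to_child[0]);
    close(to_parent[1]);
    memset(page, after, sizeof page);

    ok = write(to_child[1], &go, 1) == 1 &&
         read(to_parent[0], &seen, 1) == 1;

    close(to_child[1]);
    close(to_parent[0]);
    waitpid(pid, NULL, 0);     /* reap the child */

    return ok ? (unsigned char)seen : -1;
}
```

Called from a main() as cow_demo('a', 'b'), this returns 'a': the child
still holds the original contents even though the parent has already
overwritten its own copy with 'b'.
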
Ok, here is the bug we have found:

If a process forks a child, and the parent writes to its memory pages
(forcing a COW), and those pages are paged out to swap before the child
exec's or exits, the parent's and child's swap space is not released
until the parent exits.

The ramification is that if you have a long-running process that forks
a lot, like a shell, and your system does a lot of paging, those
long-running processes will allocate more and more swap until they exit.
It is particularly a problem with non-csh shells (csh uses vfork and
exec), because they often run scripts by forking themselves, and the
child running the script may exist for quite some time without exec'ing
or exit'ing.

Here are Mike Karels's more detailed words on the subject:

---------------------------------------------------------------------------

...

The problem here is that if the process forks, and the parent modifies
data pages while the child exists, it must make copies of those pages
(copy-on-write after fork).  If those copies are paged out, then both
the copies and the originals will occupy space until the parent exits,
even if the child exits.  I think I described the chains of shadow
objects that were accumulating, and the fact that those are supposed to
get coalesced.  It turns out that the code to coalesce does not work if
an object has been paged out.

This is the scenario that causes problems:

    - a long-lived program forks repeatedly,
    - the parent modifies data space before the child does exec or
      exit, and
    - the parent's modified pages get paged out before child does exec
      or exit.

The only situation in which this seems to be a problem is if a login
shell (or any long-running interactive shell) runs scripts by forking
and running them directly.  This will not happen with csh; I don't know
about ksh or bash.  (It does not happen with csh because it uses vfork,
and re-exec's itself if running a csh script).
It also does not happen if the scripts are "executable" scripts, i.e.
those that start with #!/bin/sh.  It is also a problem only if the
script or other system activity uses enough memory for the shell to be
paged out while the script is running.

The bad news is that this problem is not easy to solve...  However, I
think there are some workarounds that can be used for the moment.

---------------------------------------------------------------------------

My experience with 5MB of RAM and 20MB of swap running several screens
(no X, no networking) was that because I never logged out, my shell
accumulated swap space until it ran out.  About every 7 days the system
had to be rebooted (everything had stopped running).

I hope this helps explain some lockup problems some people may be
having.  Has anyone solved this problem?  I don't know the specifics of
why it is occurring, or why it is hard to solve, but if someone has
already solved it, I would love to hear about it.

Attached is a program that illustrates the problem.  With MAKE_CHILD
undefined, swap space is allocated the first time through the loop, and
stays pretty constant.  With MAKE_CHILD defined, free swap decreases
rapidly each time through the loop until the system runs out of swap
space and locks up.  Note that each child is killed before the loop is
restarted, yet the swap space continues to decline rapidly.

You will need to define some things at the top before you compile,
including your system's program for monitoring swap space.

---------------------------------------------------------------------------

/* show swap overallocation bug in child processes */
/* Bruce Momjian, root@candle.uucp */
/* tabs = 4 */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAKE_CHILD

/* make this higher if you have more than 8 MB of RAM */
#define SYSTEM_RAM 4

/* program to show remaining swap space, vmstat? */
#define SHOWSWAP "swaptotal"

int k = 1024;

int main(void)
{
    char *y;
    pid_t c_pid;
    int j;
    char *t;

    /* make my address space big */
    if ( (y=malloc(SYSTEM_RAM*k*k)) == NULL)
    {
        perror("Malloc");
        exit(1);
    }

    while (1)
    {
#ifdef MAKE_CHILD
        if ((c_pid = fork()) == -1)
        {
            /* bail out: kill(-1, ...) below would signal every process */
            perror("fork");
            exit(1);
        }
        if (c_pid == 0)
        {
            /* child just idles until the parent kills it */
            sleep(1000);
            _exit(0);
        }
#endif
        /* parent touches memory to force COW copy */
        for (j=0,t=y; j < SYSTEM_RAM*k*k; j+=k)
        {
            *t = 'x';
            t += k;
        }
#ifdef MAKE_CHILD
        kill(c_pid,SIGHUP);
        /* reap the child so zombies do not accumulate */
        waitpid(c_pid, NULL, 0);
#endif
        puts("done ");
        system(SHOWSWAP);
    }
    /* NOT REACHED */
}

-- 
  Bruce Momjian                          |  830 Blythe Avenue
  root@candle.pha.pa.us                  |  Drexel Hill, Pennsylvania 19026
  + If your life is a hard drive,        |  (610) 353-9879(w)
  + Christ can be your backup.           |  (610) 853-3000(h)