Xref: sserve comp.os.386bsd.bugs:1915 comp.unix.internals:6594 comp.unix.bsd:13083
Newsgroups: comp.os.386bsd.bugs,comp.unix.internals,comp.unix.bsd
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!spool.mu.edu!uwm.edu!msuinfo!netnews.upenn.edu!dsinc!jabber!candle!root
From: root@candle.uucp (Bruce Momjian)
Subject: Swap overallocation in i386 BSD's
Organization: a consultant's basement
Date: Wed, 15 Dec 1993 04:38:11 GMT
X-Newsreader: TIN [version 1.2 PL2]
Message-ID: <CI27Jo.DGy@candle.uucp>
Lines: 146

I am running BSD/386 from BSDI. When running with 5MB of RAM, I found
that the system locked up about once every week. In researching the
problem with Mike Karels of BSDI, I think we have found a bug that
exists on BSD/386 and most free 386-based *BSD systems. Here are the
details.

First, let me define copy-on-write (COW): When a process forks, the OS
maps the address space of both the parent and child to the same memory
pages, and both processes start running. If either process makes changes
to its shared memory pages, the OS makes a copy of the shared page. One
process gets the original, the other gets the copy.

Ok, here is the bug we have found: If a process forks a child, and the
parent writes to its memory pages (forcing a COW), and those pages are
paged out to swap before the child exec's or exits, the parent's and
child's<!> swap space is not released until the parent exits.

The ramification is that if you have a long-running process that forks
a lot, like a shell, and your system does a lot of paging, that
long-running process will allocate more and more swap until it exits.
It is particularly a problem with non-csh shells (csh uses vfork and
exec), because they often run scripts by forking themselves, and the
child running the script may exist for quite some time without exec'ing
or exit'ing.

Here are Mike Karels' more detailed words on the subject:

---------------------------------------------------------------------------
...

The problem here is that if the process forks, and the parent modifies
data pages while the child exists, it must make copies of those pages
(copy-on-write after fork). If those copies are paged out, then both
the copies and the originals will occupy space until the parent exits,
even if the child exits. I think I described the chains of shadow
objects that were accumulating, and the fact that those are supposed to
get coalesced. It turns out that the code to coalesce does not work if
an object has been paged out. This is the scenario that causes
problems:

	- a long-lived program forks repeatedly,

	- the parent modifies data space before the child does exec or
	  exit, and

	- the parent's modified pages get paged out before child does
	  exec or exit.

The only situation in which this seems to be a problem is if a login
shell (or any long-running interactive shell) runs scripts by forking
and running them directly. This will not happen with csh; I don't know
about ksh or bash. (It does not happen with csh because it uses vfork,
and re-exec's itself if running a csh script). It also does not happen
if the scripts are "executable" scripts, i.e. those that start with
#!/bin/sh. It is also a problem only if the script or other system
activity uses enough memory for the shell to be paged out while the
script is running.

The bad news is that this problem is not easy to solve... However, I
think there are some workarounds that can be used for the moment.
---------------------------------------------------------------------------

My experience with 5MB of RAM and 20MB of swap running several screens
(no X, no networking) was that because I never logged out, my shell
accumulated swap space until it ran out. About every 7 days the system
had to be rebooted (everything had stopped running).

I hope this helps explain some lockup problems some people may be
having. Has anyone solved this problem? I don't know the specifics of
why it is occurring, or why it is hard to solve, but if someone has
already solved it, I would love to hear about it.

Attached is a program that illustrates the problem. With MAKE_CHILD
undefined, swap space is allocated the first time through the loop, and
stays pretty constant. With MAKE_CHILD defined, swap decreases rapidly
each time through the loop until the system runs out of swap space and
locks up. Note that each child is killed before the loop is restarted,
yet the swap space continues to decline rapidly.

You will need to define some things at the top before you compile,
including your system's program for monitoring swap space.

---------------------------------------------------------------------------
/* show swap overallocation bug in child processes */
/* Bruce Momjian, root@candle.uucp */

/* tabs = 4 */

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAKE_CHILD

/* size in MB of the region to touch */
/* make this higher if you have more than 8 MB of RAM */
#define SYSTEM_RAM  4

/* program to show remaining swap space, vmstat? */
#define SHOWSWAP    "swaptotal"

int k = 1024;

int main(void)
{
    char *y;
    pid_t c_pid;
    int j;
    char *t;

    /* make my address space big */
    if ( (y=malloc(SYSTEM_RAM*k*k)) == NULL)
    {
        perror("Malloc");
        exit(1);
    }

    while (1)
    {
#ifdef MAKE_CHILD
        if ((c_pid = fork()) == -1)
        {
            /* never pass -1 to kill(): that would signal every process */
            perror("fork");
            exit(1);
        }
        if (c_pid == 0)
        {
            /* child just holds the shared pages until it is killed */
            sleep(1000);
            _exit(0);
        }
#endif
        /* parent touches memory to force COW copy */
        for (j=0,t=y; j < SYSTEM_RAM*k*k; j+=k)
        {
            *t = 'x';
            t += k;
        }
#ifdef MAKE_CHILD
        kill(c_pid,SIGHUP);
        /* reap the child so it is really gone before the next pass */
        waitpid(c_pid, NULL, 0);
#endif
        puts("done ");
        system(SHOWSWAP);
    }
    /* NOT REACHED */
}

-- 
Bruce Momjian                    |  830 Blythe Avenue
root%candle.uucp@bts.com         |  Drexel Hill, Pennsylvania 19026
 +  If your life is a hard drive, |  (215) 353-9879(w)
 +  Christ can be your backup.    |  (215) 853-3000(h)