Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!simtel!zombie.ncsc.mil!news.mathworks.com!news.kei.com!nntp.et.byu.edu!news.byu.edu!hamblin.math.byu.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@cs.weber.edu>
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: Eliminating kernel panics
Date: 14 Jun 1995 20:14:46 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 114
Message-ID: <3rnfvm$1qa@park.uvsc.edu>
References: <3rlecq$i2r@felix.junction.net>
NNTP-Posting-Host: hecate.artisoft.com

Michael Dillon <michael@junction.net> wrote:
] I was just browsing a WWW page at Amdahl, the mainframe
] manufacturer when I came across the following:
]
] http://www.amdahl.com/doc/products/oes/cb.uts/utshist.html
]
] UTS 4.2 was engineered to eliminate all kernel panics (other UNIX
] operating systems based on a simple port of the base SVR4 source
] contain "panic" code that will stop the machine in unexpected
] situations). In the development of UTS 4.2, the base SVR4 code
] was methodically "scrubbed" to create a run-time environment as
] reliable as the S/390 hardware platform it serves.
]
] If they can do it, why can't FreeBSD do the same? I'm thinking that this
] problem is similar to the problems with TCP/IP congestion and that
] solutions could be found similarly.

Is that "marketing eliminated" or "engineering eliminated"?

I guaran-damn-tee you if there is a hardware fault, the machine is
going to suck mud, no matter what they do to the software. Just like
a short in the Ethernet will take out a NetWare SFT (Software Fault
Tolerance) server.

Now there *are* two classes of panic.

One is the result of an unrecoverable failure mode. UTS has
unrecoverable failure modes, too -- don't let them kid you. You
handle these by panicking.

The second type of panic is one where the kernel agrees to do
something, then reneges on the agreement.
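
As a rough illustration of that second class, here is a toy sketch in
Python. Everything in it is hypothetical (the ToyKernel class, its
method names, and the page counts are made up for illustration, not
taken from UTS, SVR4, or FreeBSD). It only models the difference
between refusing a request up front and agreeing to it and failing
later.

```python
class ToyKernel:
    """Hypothetical model of a fixed pool of pages; not a real kernel API."""

    def __init__(self, total_pages, overcommit):
        self.total_pages = total_pages
        self.overcommit = overcommit
        self.reserved = 0  # pages promised up front (strict mode only)
        self.used = 0      # pages actually dirtied so far

    def start_process(self, max_pages):
        """Return True if the kernel agrees to run the process."""
        if not self.overcommit:
            # Strict mode: refuse now if the declared worst case
            # cannot be covered by what is left of the pool.
            if self.reserved + max_pages > self.total_pages:
                return False
            self.reserved += max_pages
        # Overcommit mode: always agree and hope the average case holds.
        return True

    def touch_pages(self, count):
        """Dirty some pages; the earlier promise is only tested here."""
        if self.used + count > self.total_pages:
            # The kernel agreed earlier and cannot deliver now.
            raise RuntimeError("panic: out of pages after agreeing to run")
        self.used += count
```

With a pool of 100 pages, the strict variant accepts one process that
declares 80 pages and refuses a second one at start time. The
overcommitting variant accepts both, and the failure only shows up
when the second process actually touches its pages. The strict
variant's guarantee holds only if each process stays within the
maximum it declared, which this sketch does not enforce.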

There are a lot of cases, mostly based on probability, where the
kernel will commit to doing something that it thinks it can most
likely do, but is not 100% certain it can. For instance, allowing a
process to start at all without knowing beforehand the maximum number
of dirty data pages it will use during its lifetime.

It's possible to get around most of these problems by not allowing
the overcommitting of resources; the problem with that is that, on
average, it's OK to overcommit resources, and doing so means fewer
overall resources are required for the average case.

One of my favorite hobby-horses is memory overcommit.

The good things about memory overcommit are:

o Your total available memory is swap size + RAM size.

o You don't require real swap for clean text (and data, if correctly
  implemented) pages, since they can be reloaded from the file (this
  is called using the file as backing store).

o Precommitting resources takes time, so not doing it means you can
  start executing code before you have it all in core.

o The copy costs for the pages can be amortized over the runtime of
  the program. The plus to this is that it grants the appearance of
  speed; the downside is that it actually detracts from overall speed
  during runtime binding (a problem most shared library
  implementations also have).

The bad things are:

o Unless your total available memory is limited to swap size (meaning
  that you have real swap space reserved as backing store for RAM),
  you can't guarantee hot shutdown/restart, and you can't guarantee
  enough space to support kernel dumps (in case of unrecoverable
  errors).

o Using a program file as backing store causes problems: if the
  program was loaded over NFS, the NFS server must stay up to swap in
  pages; therefore the image is fragile to network outages (anyone
  who has used a diskless Sun would agree).

  The "fix" for this would be to special case remote file systems to
  load remote images entirely into local swap.
  That only works in "dataless" configurations, not "diskless", since
  swap is also remote in the second case. FreeBSD, NetBSD, SVR4, etc.
  typically don't implement this "fix".

o Using the program image as a swap store makes the program fragile
  to modification. This is the purpose of the VTEXT flag on an
  in-core vnode on such systems, and attempts to modify the image
  result in an error return of ETXTBSY (a non-POSIX error return
  "extension").

  The "fix" for this one is to fault the image to swap (and make the
  VM system "prefer" swap pages to disk pages -- something you want
  anyway, since a page reference from swap is much faster than one
  through the file system) and allow the modification to proceed.
  Again, this is not typically implemented, and there are problems if
  the modification is not local to the machine doing the running,
  since the non-standard VTEXT flag is not propagated to a remote
  host (NFS/RFS). In combination with forcing remotely executed code
  to local swap, this window is (mostly) closed.

o Delayed startup (obviously: related to the size of the image being
  copied to swap).

And this is just *one* of the overcommitted resources on the machine.

Obviously, it is a set of trade-offs between what the user is willing
to spend on hardware vs. what they get for their money.


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.