Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!simtel!zombie.ncsc.mil!news.mathworks.com!news.kei.com!nntp.et.byu.edu!news.byu.edu!hamblin.math.byu.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@cs.weber.edu>
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: Eliminating kernel panics
Date: 14 Jun 1995 20:14:46 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 114
Message-ID: <3rnfvm$1qa@park.uvsc.edu>
References: <3rlecq$i2r@felix.junction.net>
NNTP-Posting-Host: hecate.artisoft.com
Michael Dillon <michael@junction.net> wrote:
] I was just browsing a WWW page at Amdahl, the mainframe
] manufacturer when I came across the following:
]
] http://www.amdahl.com/doc/products/oes/cb.uts/utshist.html
]
] UTS 4.2 was engineered to eliminate all kernel panics (other UNIX
] operating systems based on a simple port of the base SVR4 source
] contain "panic" code that will stop the machine in unexpected
] situations). In the development of UTS 4.2, the base SVR4 code
] was methodically "scrubbed" to create a run-time environment as
] reliable as the S/390 hardware platform it serves.
]
] If they can do it, why can't FreeBSD do the same? I'm thinking that this
] problem is similar to the problems with TCP/IP congestion and that
] solutions could be found similarly.
Is that "marketing eliminated" or "engineering eliminated"?
I guaran-damn-tee you if there is a hardware fault, the machine
is going to suck mud, no matter what they do to the software.
Just like a short in the ethernet will take out a NetWare SFT
(Software Fault Tolerance) server.
Now there *are* two classes of panic. One is the result of an
unrecoverable failure mode. UTS has unrecoverable failure modes,
too -- don't let them kid you. You handle these by panicking.
The second type of panic is one where the kernel agrees to do
something, then reneges on the agreement. There are a lot of
cases, mostly based on probability, where the kernel will commit
to doing something that it thinks it can most likely do, but is
not 100% certain it can. For instance, allowing a process to
start at all without knowing beforehand the maximum number of
dirty data pages it will use during its lifetime.
It's possible to get around most of these problems by not allowing
the overcommitting of resources; the problem with that is that on
average, it's OK to overcommit resources, and doing so
will result in fewer overall resources being required for the
average case.
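The difference between the two policies can be shown with a toy
model. This is only a sketch, not kernel code: the Proc record, the
idea that a process could declare its worst case up front, and the
page counts are all assumptions made for illustration. Strict
reservation refuses work that would in fact have fit; overcommit
admits everything and only discovers at fault time whether it can
keep its promise.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    declared_max: int  # worst-case dirty pages (hypothetical: real kernels rarely know this)
    actual_use: int    # pages the process really dirties during its lifetime


def admit_strict(procs, capacity):
    """No overcommit: admit a process only if its worst case can be reserved."""
    admitted, reserved = [], 0
    for p in procs:
        if reserved + p.declared_max <= capacity:
            admitted.append(p.name)
            reserved += p.declared_max
    return admitted


def run_overcommitted(procs, capacity):
    """Overcommit: admit everything, charge pages only as they are dirtied.

    Simplification: processes dirty their pages one after another and all
    stay alive. Returns the name of the process whose fault cannot be
    satisfied (the point where the kernel must renege), or None.
    """
    used = 0
    for p in procs:
        used += p.actual_use
        if used > capacity:
            return p.name
    return None


procs = [Proc("a", 600, 200), Proc("b", 600, 250), Proc("c", 600, 300)]
print(admit_strict(procs, 1000))                              # ['a']
print(run_overcommitted(procs, 1000))                         # None
print(run_overcommitted(procs + [Proc("d", 600, 400)], 1000)) # d
```

With 1000 pages, strict reservation runs one process where overcommit
runs three without trouble; the fourth process is the case where the
bet fails.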
One of my favorite hobby-horses is memory overcommit. The good
things about memory overcommit are:
o Your total available memory is swap size + RAM size
o You don't require real swap for clean text (and data, if
correctly implemented) pages, since they can be reloaded
from the file (this is called using the file as backing
store).
o Precommitting resources takes time, so not doing it means
you can start executing code before you have it all in core.
o The copy costs for the pages can be amortized over the
runtime of the program. The plus to this is that it
grants the appearance of speed; the downside is that it
actually detracts from overall speed during runtime binding
(a problem most shared library implementations also have).
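The first two points come down to simple arithmetic. The sketch
below assumes everything is counted in pages; the function and
parameter names are made up for illustration and do not correspond
to any real kernel interface.

```python
def total_available(ram_pages, swap_pages, overcommit):
    """Pages the system can hand out.

    With overcommit, RAM and swap add together. Without it, every RAM
    page has real swap reserved behind it, so swap size is the ceiling.
    """
    if overcommit:
        return ram_pages + swap_pages
    return swap_pages


def swap_needed(text_pages, clean_data_pages, dirty_pages, file_backed):
    """Swap pages one program image needs.

    When the executable file is the backing store, clean text and clean
    data can be reloaded from the file, so only dirty pages need swap.
    """
    if file_backed:
        return dirty_pages
    return text_pages + clean_data_pages + dirty_pages


print(total_available(4096, 8192, overcommit=True))    # 12288
print(total_available(4096, 8192, overcommit=False))   # 8192
print(swap_needed(100, 50, 30, file_backed=True))      # 30
print(swap_needed(100, 50, 30, file_backed=False))     # 180
```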
The bad things are:
o Unless your total available memory is limited to swap size
(meaning that you have real swap space reserved as backing
store for RAM), you can't guarantee hot shutdown/restart,
and you can't guarantee enough space to support kernel
dumps (in case of unrecoverable errors).
o Using a program file as backing store causes problems: if
the program was loaded over NFS, the NFS server must stay
up to swap in pages; therefore the image is fragile to
network outages (anyone who has used a diskless Sun would
agree). The "fix" for this would be to special case remote
file systems to load remote images entirely into local swap.
That only works in "dataless" configurations, not "diskless",
since swap is also remote in the second case. FreeBSD,
NetBSD, SVR4, etc. typically don't implement this "fix".
o Using the program image as a swap store makes the program
fragile to modification. This is the purpose of the VTEXT
flag on an in core vnode on such systems, and attempts to
modify the image result in an error return of ETXTBSY (a
non-POSIX error return "extension"). The "fix" for this
one is to fault the image to swap (and make the VM system
"prefer" swap pages to disk pages -- something you want
anyway, since a page reference from swap is much faster
than one through the file system) and allow the modification
to proceed. Again, this is not typically implemented, and
there are problems if the modification is not local to the
machine doing the running, since the non-standard VTEXT
flag is not propagated to a remote host (NFS/RFS). In
combination with forcing remotely executed code to local
swap, this window is (mostly) closed.
o Delayed startup (obviously: related to the size of the image
being copied to swap).
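The VTEXT behaviour and the copy-to-swap "fix" can be mimicked in a
few lines. This is a toy model only: the Vnode and Image classes are
invented for illustration, "swap" is just a private copy of the
bytes, and nothing here reflects how any particular kernel lays out
its data structures. The only real identifier is errno.ETXTBSY.

```python
import errno


class Vnode:
    """Toy in-core vnode: file contents plus a VTEXT-style flag."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.vtext = False  # set while the file backs a running image

    def write(self, offset, new):
        if self.vtext:
            # The file is someone's backing store: refuse the modification.
            raise OSError(errno.ETXTBSY, "text file busy")
        self.data[offset:offset + len(new)] = new


class Image:
    """Toy running program."""

    def __init__(self, vnode, copy_to_swap):
        if copy_to_swap:
            # The "fix": copy the whole image up front. Startup cost grows
            # with image size, but the file is free to change afterwards.
            self.pages = bytes(vnode.data)
            self.vnode = None
        else:
            # File as backing store: pages are read from the file on demand.
            self.pages = None
            self.vnode = vnode
            vnode.vtext = True

    def read(self, offset, length):
        src = self.pages if self.pages is not None else self.vnode.data
        return bytes(src[offset:offset + length])


busy = Vnode(b"original code")
running = Image(busy, copy_to_swap=False)
try:
    busy.write(0, b"patched!")
except OSError as e:
    print(e.errno == errno.ETXTBSY)   # True

free = Vnode(b"original code")
copied = Image(free, copy_to_swap=True)
free.write(0, b"patched!")            # allowed: the image no longer needs the file
print(copied.read(0, 8))              # b'original'
```

The model covers only the local case; it says nothing about the
remote (NFS/RFS) window, where the flag is not visible to the host
doing the modification.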
And this is just *one* of the overcommitted resources on the machine.
Obviously, it's a set of trade-offs between what the user is willing
to spend on hardware vs. what they get for their money.
Terry Lambert
terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.