Newsgroups: comp.unix.bsd
Path: sserve!manuel!munnari.oz.au!spool.mu.edu!caen!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: Fixed: Runs at 8MHz, Crashes at 33MHz, 386bsd
Message-ID: <1992Sep11.222258.2144@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <1992Sep8.070731.21159@bernina.ethz.ch> <1992Sep11.200736.20247@qualcomm.com>
Date: Fri, 11 Sep 92 22:22:58 GMT
Lines: 93

In article <1992Sep11.200736.20247@qualcomm.com> karn@servo.qualcomm.com (Phil Karn) writes:
>In article <1992Sep8.070731.21159@bernina.ethz.ch> torda@igc.ethz.ch (Andrew Torda) writes:
>>
>> At 8 MHz, my machine appears perfectly stable.
>> At 33 MHz, I get repeated trap type 12 panics.
>[...]
>>The most concrete suggestions were to either add wait states or buy
>>faster memory. Couldn't add any more wait states, but I managed to
>>swap 8Mb of 80ns simms for 70 ns simms.
>>
>>Instantly, I could rebuild kernels or run my little crash program
>>which simply allocated ever increasing amounts of memory and scribbled
>>through it.
>>The peculiarity is that with the old memory, I had been able to run
>>dos, windows in enhanced mode and even SCO unix.
>>It would still be nice to know what the cause is and why 386bsd
>>provokes the problem.
>
>Very interesting. I've been having similar problems with my 486-50
>(with 16 meg, Adaptec SCSI controller and NE-2000). A good way to
>crash it is to go into one of the source trees and run make. Often I
>couldn't get through half a dozen nroff's of man pages before a panic,
>usually a message from vm_fault() that I interpret to be the kernel
>dereferencing a bogus pointer. Sometimes it wouldn't even get through
>the reboot before it would panic again. Applying every patch in sight
>didn't seem to help the problem.
>
>So, inspired by your note, I just tried hitting my machine's Turbo
>switch, knocking its clock speed down to 10 Mhz (at least that's what
>the display on the front panel says). And the machine now seems *much*
>more stable. It's gotten through several source directories without
>incident so far, albeit much more slowly.
>
>One possible theory (stress *theory*): many modern PC chipsets provide
>registers to control things like bus clock speeds, memory wait states,
>etc. Much more convenient than the hardware jumpers on old motherboards.
>Since these are usually set by the BIOS setup program and forgotten,
>perhaps something in 386BSD is scribbling over them (or their CMOS
>save areas) unintentionally? Going to faster memory, or slowing the
>machine down, would let the machine run with these unintentionally
>changed settings. This theory would also explain why the same machine
>could run other systems at full speed without problem, because they
>leave the control registers alone.
>Comments?

One. Bus mastering controllers using DMA. Most of these controllers have
clocks you can set to tell them how long they *MUST* relinquish the bus
and how frequently they have to do so. I ran into this problem while
writing an Am33C93A SCSI interface driver for a WD7000-FASST2. The system
would crash occasionally.

From the Western Digital documentation [with comments]:

"The maximum on time [where the controller owns the bus] should be 15uS
less all overhead time required to allow the host to service memory
refresh cycles, including DMA bus arbitration time."

My theory is that the aha1542b isn't letting the memory refresh. When you
start actually using a lot of memory (say during a compile), you get up
into the region where it isn't refreshing (since the refresh proceeds up
to the point that the bus is grabbed away, the lower the memory, the
"safer" it is). *This* is why the "memory problem" can't be identified
with a memory test program (other than 386bsd, of course ;-)).
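
A minimal sketch of the arithmetic in the Western Digital rule quoted
above: the bus-on budget is the 15uS window less the overhead terms. Only
the 15uS figure comes from the quoted documentation. The function names
and the overhead values (2uS arbitration, 1uS refresh service) are
hypothetical, chosen only to make the example concrete.

```python
# Refresh window from the quoted Western Digital documentation, in microseconds.
REFRESH_WINDOW_US = 15.0


def max_bus_on_us(arbitration_us, refresh_service_us):
    """Longest time the controller may own the bus, in microseconds.

    Implements "15uS less all overhead": the window minus DMA bus
    arbitration time and the time the host needs to service refresh.
    Clamped at zero if the overhead alone uses up the window.
    """
    budget = REFRESH_WINDOW_US - arbitration_us - refresh_service_us
    return max(budget, 0.0)


def bus_on_is_safe(configured_us, arbitration_us, refresh_service_us):
    """True if the configured bus-on time leaves room for memory refresh."""
    return configured_us <= max_bus_on_us(arbitration_us, refresh_service_us)


if __name__ == "__main__":
    # Hypothetical overhead figures, for illustration only.
    arbitration_us = 2.0
    refresh_service_us = 1.0

    print("max bus-on time (uS):",
          max_bus_on_us(arbitration_us, refresh_service_us))
    for configured_us in (11.0, 14.0):
        verdict = bus_on_is_safe(configured_us, arbitration_us,
                                 refresh_service_us)
        print(configured_us, "uS bus-on safe?", verdict)
```

With these assumed overheads the budget is 12uS, so an 11uS bus-on
setting fits and a 14uS setting would starve refresh. Real overhead
figures depend on the motherboard and controller and would have to be
measured or taken from their documentation.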

SCO has pessimistic assumptions about the speed of the machine, or
actually tests to see how much time it can grab, and so doesn't have a
problem.

Test: Anyone have a non-SCSI system that has the "works OK at 8MHz but
not at 33MHz" problem?

I realize this isn't a definitive test, as I might get responses from
someone running 200ns RAM saying "Yeah; funny... no one else seems to
have the problem", but it should give a weight of SCSI-with-problem vs.
not-SCSI-with-problem.

Not to discount the "low core being overwritten" theory, but if you were
getting the problem after warm boot *only*, then I could see it;
otherwise, it's unlikely that low core would be getting blown on one
machine and not another.


                                        Terry Lambert
                                        terry_lambert@gateway.novell.com
                                        terry@icarus.weber.edu
---
Any opinions in this posting are my own and not those of
my present or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------