Newsgroups: comp.unix.bsd
Path: sserve!manuel!munnari.oz.au!uunet!mcsun!sunic!psinntp!psinntp!dg-rtp!ponds!rivers
From: rivers@ponds.uucp (Thomas David Rivers)
Subject: Some more on NMI problems (some meager advancement)
Message-ID: <1992Sep7.014351.946@ponds.uucp>
Date: Mon, 7 Sep 1992 01:43:51 GMT
Lines: 63

Well, I thought I would relay my current status with the NMI
investigation.

Right now, I'm thinking it has something to do with an IDE
controller/disk drive, so I have been examining the wd.c driver trying
to divine what it might be, without much luck. (I know very little
about the IDE/WD disk controllers.)

The common thread seems to be:

  1) It happens during prolonged disk I/O (e.g. rebuilding the
     kernel over and over, or building X).

  2) It happens with IDE drives. Other people have run my test
     (rebuilding the kernel) on a 486-33 with a SCSI drive and
     16 MB without seeing any NMIs.

I have tried several switches on my controller:

  1) Having the disk drive/controller assert IOCHRDY (by default
     it doesn't).

  2) Changing the "precompensation" (which I don't believe is
     related to disk precompensation) from 125ns to 187ns.

  3) Changing the speed of the processor from 20 MHz to 8 MHz.

None of these changes seems to affect the problem.

Several people have suggested it could be a cache problem, but I'm
running on a very old 20 MHz 386 that doesn't have any caches.

I'm still reluctant to believe it's actually a memory problem, since:

  1) It doesn't occur with version 0.0.

  2) It only occurs *once*; once I get one NMI, it never happens
     again. You wouldn't think the memory could repair itself...

  3) It happens within 2 hours of running the kernel compiles,
     often within two minutes. 38+ hours of memory tests (reading
     and writing double/single words randomly) found nothing.

One last item: I did discover where the empty /var/log/messages line
was produced, and why you only got the empty line on the console,
without the NMI messages. In isa.c, the function that handles the
Non-Maskable Interrupt (isa_nmi) calls log(), but the string contains
an initial newline. Removing that newline fixes that problem, at
least.

Again, suggestions are always welcome. I would especially appreciate
it if someone with an IDE setup would try compiling the kernel over
and over (e.g. in a shell "for" loop) to see if the problem can be
reproduced by more people.

My next approach is to replace the wd.c driver with Tom Ivar
Helbekkmo's new driver, to see if he has altered things enough to
either make the problem go away or make its occurrence more reliable.
Unfortunately, I don't seem to be able to get to barsoom.nhh.no right
now... (trans-Atlantic links are difficult at best.)

	- Still trying!! -
	- Dave Rivers -
	(rivers@ponds.uucp)
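
A rough illustration of the isa_nmi/log() point, for anyone who wants to see why a leading newline could produce an empty line. This is only a sketch under an assumption: that whatever reads the kernel message buffer cuts it into records at each newline. That assumption is not taken from the 386BSD sources. first_record() is a hypothetical helper, not kernel code, and the message text used with it is a placeholder rather than the real NMI message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a line-oriented log reader (NOT kernel code).
 * Copies the first newline-terminated record of msg into out, without the
 * newline, truncating to fit outlen. Returns a pointer to the text that
 * follows that record.
 */
const char *first_record(const char *msg, char *out, size_t outlen)
{
    size_t len = strcspn(msg, "\n");   /* bytes before the first newline */
    size_t n = len;

    assert(outlen > 0);                /* need room for the terminator */
    if (n >= outlen)
        n = outlen - 1;                /* truncate the copy, not the scan */
    memcpy(out, msg, n);
    out[n] = '\0';

    msg += len;                        /* skip the record itself */
    if (*msg == '\n')
        msg++;                         /* and the newline that ended it */
    return msg;
}
```

Called on "\nNMI: placeholder text\n", the first record comes back empty and the real text is left over for the next record. Called on the same string without the leading newline, the first record carries the text.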