Newsgroups: comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!howland.reston.ans.net!pipex!uunet!world!hd
From: hd@world.std.com (HD Associates)
Subject: Re: SCSI disk I/O error
Message-ID: <CFK6Fu.Grw@world.std.com>
Organization: The World Public Access UNIX, Brookline, MA
References: <1993Oct23.203652.4718@diana.ocunix.on.ca>
Date: Wed, 27 Oct 1993 13:50:17 GMT
Lines: 52

In article <1993Oct23.203652.4718@diana.ocunix.on.ca>,
Dyane Bruce <db@diana.ocunix.on.ca> wrote:
>I am having a problem with NetBSD 0.9 on a ISA 486 DX/66
>Adaptec 1542C Controller, internal Seagate ST-2383N, external
>Panasonic LF-7010 (optical R/W) and NEC Multispin CDR-74-1 (CDROM).
>
>I sometimes get a "sd0:reset" console error with subsequent consistent
>"I/O error" on any command from the shell. Extracting gsrc
>triggers this everytime. This then forces the use of the
>"big red switch" on the machine. Before I dig into the SCSI driver has
>anyone seen this as well? Or yet better fixed this? I have noted problems
>intermittent soft errors under NeXTSTEP and this same machine
>(I am dumping NeXTSTEP 3.1 for NetBSD 0.9) which NeXTSTEP was
>able to recover from. I have been completely unable to determine
>where these errors are coming from. (Yes, I have checked terminations
>put brand new cables in, the works. The point is NeXTSTEP was able
>to deal with these errors and NetBSD 0.9 doesn't.)

When you get UNIT ATTENTION ("removable medium may have been changed or
the target has been reset") from the disk, the sd driver sets a "not
valid" flag and disallows further I/O to the disk until it is fully
closed and reopened. Thus the big red switch.

I've had problems with multiple initiators on a SCSI bus because some
of the initiators always insist on resetting the bus. The sd driver
then does what you are seeing.

I think the sd driver could be extended to look at the additional
sense code.
ASC=0x28 is "Not ready to ready transition, medium may have changed"
and ASC=0x29 is "Power on, reset, or bus device reset occurred". We
could ignore ASC=0x29 and treat ASC=0x28 the same way as we do now,
that is, no more I/O to an open device that someone may have changed.
Both Sun and SGI are more tolerant of SCSI bus resets.

Two points:

1. I just looked through the source and don't see the "sd0:reset"
   message anywhere in any of the revs I have. NetBSD is packed up
   right now, though, so it could have been changed to say that in
   there. You want to look around for the SDVALID flag.

2. Is your disk really giving back a UNIT ATTENTION? If so, why? It
   would be interesting to dump the full sense information when you
   get that condition and see what your drive is telling you.

Peter
-- 
Peter Dufault               Real Time Machine Control and Simulation
HD Associates               Voice: 508 433 6936
hd@world.std.com            Fax:   508 433 5267
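
For illustration, the additional-sense-code check suggested in this post
might look roughly like the following. This is a stand-alone userland
sketch in C, not code from the sd driver: the names classify_sense,
sense_dump, and enum ua_action are made up for the example, and it
assumes fixed-format SCSI-2 sense data (response code 0x70 or 0x71,
sense key in the low nibble of byte 2, additional length in byte 7,
ASC in byte 12). When the ASC is missing or unrecognized it falls back
to the existing conservative behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* SCSI-2 values referred to in the post. */
#define SENSE_KEY_UNIT_ATTENTION    0x06
#define ASC_MEDIUM_MAY_HAVE_CHANGED 0x28 /* not ready to ready transition */
#define ASC_POWER_ON_OR_RESET       0x29 /* power on, reset, or bus device reset */

/* Hypothetical result type: what the caller should do with the device. */
enum ua_action {
    UA_NOT_UNIT_ATTENTION, /* some other condition; handle as before */
    UA_IGNORE_RESET,       /* ASC 0x29: bus/device reset, keep the device valid */
    UA_INVALIDATE          /* ASC 0x28 or unknown: refuse I/O until reopened */
};

/* Classify fixed-format sense data of `len` bytes (assumed layout, see above). */
enum ua_action classify_sense(const uint8_t *sense, size_t len)
{
    assert(sense != NULL);

    if (len < 3)
        return UA_NOT_UNIT_ATTENTION;

    uint8_t response = sense[0] & 0x7f;
    if (response != 0x70 && response != 0x71)
        return UA_NOT_UNIT_ATTENTION;

    if ((sense[2] & 0x0f) != SENSE_KEY_UNIT_ATTENTION)
        return UA_NOT_UNIT_ATTENTION;

    /* The ASC is only present if the buffer and the additional sense
     * length (byte 7) both reach byte 12. Otherwise stay conservative. */
    if (len < 13 || (size_t)8 + sense[7] < 13)
        return UA_INVALIDATE;

    if (sense[12] == ASC_POWER_ON_OR_RESET)
        return UA_IGNORE_RESET;

    /* ASC_MEDIUM_MAY_HAVE_CHANGED and anything unrecognized. */
    return UA_INVALIDATE;
}

/* Dump the raw sense bytes in hex so the drive's report can be inspected. */
void sense_dump(FILE *out, const uint8_t *sense, size_t len)
{
    for (size_t i = 0; i < len; i++)
        fprintf(out, "%02x%c", sense[i], (i + 1 == len) ? '\n' : ' ');
}
```

In a driver, the UA_IGNORE_RESET case would presumably leave the valid
flag alone and retry the command, while UA_INVALIDATE would keep the
current behavior. Calling something like sense_dump whenever a UNIT
ATTENTION comes back is one way to see what the drive is reporting.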