Newsgroups: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc
Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!cancer.vividnet.com!hunter.premier.net!www.nntp.primenet.com!nntp.primenet.com!howland.erols.net!newsxfer2.itd.umich.edu!news.sprintlink.net!news-chi-8.sprintlink.net!rockyd!dnn.rockefeller.edu!dan
From: dan@dnn.rockefeller.edu (Dan Ts'o)
Subject: Re: Why one should buy parity memory for reliability?
X-Nntp-Posting-Host: dnn.rockefeller.edu
Message-ID: <DyApK7.DL8@rockyd.rockefeller.edu>
Followup-To: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc
Sender: notes@rockyd.rockefeller.edu (News Administrator)
Organization: Rockefeller University
X-Newsreader: TIN [version 1.2 PL2]
References: <32485B0D.41C6@austin.ibm.com>
Date: Wed, 25 Sep 1996 15:55:19 GMT
Lines: 45
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:27987 comp.unix.bsd.bsdi.misc:4983

Tushar Patel (tpatel@austin.ibm.com) wrote:
: If the Board supports the parity memory and error occurs then in
: theory the OS should be notified and the access should be repeated.

I don't know if the FreeBSD kernel attempts any repeats. That would be
nice, but in most UNIX systems you just get an error message, and
sometimes just a panic. In most real-world cases a repeat won't do any
good, though, as the SIMM is dead. In addition, I believe that parity is
only checked on reads, in which case a location that reads wrong once
will generally read wrong again.

What you really want, of course, is ECC memory, so that the memory
values get corrected on the fly, software continues to run, data is
intact, and you get the warning that hardware needs replacing. No
serious mission-critical computer should be without ECC.
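
To make the parity/ECC distinction above concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions: the
function names are made up for this example, and the Hamming(7,4) code
is used purely because it is small. Real ECC memory controllers use
wider codes over whole memory words, in hardware. The point is just
that one parity bit can flag a single flipped bit but cannot say which
bit it was, while an error-correcting code can locate and repair it.

```python
def parity_bit(byte):
    """Even-parity bit for an 8-bit value (hypothetical helper):
    1 if the number of 1 bits is odd, else 0."""
    return bin(byte & 0xFF).count("1") & 1


def parity_ok(byte, stored_parity):
    """Check done on a read: recompute parity and compare with the stored bit."""
    return parity_bit(byte) == stored_parity


def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword.
    Illustrative only; not the code any particular memory controller uses.
    Bit layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = [(nibble >> i) & 1 for i in range(4)]
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(bits):
    """Return (nibble, syndrome). A nonzero syndrome is the 1-based
    position of a single flipped bit, which is corrected before decoding."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    syndrome = s1 | (s2 << 1) | (s3 << 2)
    if syndrome:
        b[syndrome - 1] ^= 1  # repair the bit the syndrome points at
    nibble = b[2] | (b[4] << 1) | (b[5] << 2) | (b[6] << 3)
    return nibble, syndrome
```

With these helpers, flipping one bit of a stored byte makes parity_ok()
return False (detected, not fixed), flipping two bits makes it return
True again (undetected), and flipping any one bit of a Hamming codeword
still decodes to the original nibble, with the syndrome serving as the
"hardware needs replacing" warning.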

: What happens in the case of the DMA transfer from the DISK to the
: memory or from memory to disk, if the memory error occurs then
: processor is not looking at the data bus, so does that mean that the
: DMA master (SCSI controller) will detect the parity error and
: retransfer the data?

The memory controller does the parity/ECC check. The processor is not
involved unless there is an error, in which case the CPU usually gets an
NMI (non-maskable interrupt). Everything going into and out of the
memory gets checked on the fly, DMA transfers included.

: There is a big difference in the price between the parity and non
: parity memory so I am trying to justify the parity memory purchase.

The danger with non-parity or fake-parity memory is that memory errors
will go undetected. You could be computing a payroll or other important
transaction and writing corrupted data to the disk, a printer, or the
screen, and you may *never* know it. If you base your computing work on
non-parity memory, you either don't care whether the results are
accurate (as when you are playing games) or you are gambling that a
memory failure will be so catastrophic that it will take down the
machine or exhibit some other very obvious behavior, which isn't
necessarily the case. What the statistics are on this gamble, I don't
know. Serious computing must at least have true parity, if not ECC.

--
	Cheers,
	Dan Ts'o                        212-327-7671
	Dept. of Neurobiology           FAX: 212-327-7671
	The Rockefeller University
	1230 York Ave. Box 138          dantso@cris.com
	New York, NY 10021              dan@dna.rockefeller.edu