Xref: sserve comp.periphs:4025 comp.os.386bsd.development:978
Newsgroups: comp.periphs,comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!olivea!decwrl!decwrl!csus.edu!netcom.com!jmonroy
From: jmonroy@netcom.com (Jesus Monroy Jr)
Subject: The DMA Problem with slow devices.
Message-ID: <jmonroyCB6IrJ.G2y@netcom.com>
Keywords: QIC FDC DMA problem
Organization: NETCOM On-line Communication Services (408 241-9760 guest)
Date: Tue, 3 Aug 1993 10:35:42 GMT
Lines: 143

Comment from John Sokol about "timing problems" with IBM-type clones
and 100% IBM compatibles. (as told to jmonroy@netcom.com)

------------oOo--------------

Before you read: this may be old news for some of you, but for others
it might be new. With the way messages get chained and message threads
get contorted (with non-subjects), some of you may have missed it.

Also, this is (as I see it) the reason why some (unnamed) OS groups
are at a loss for system solutions when working with the IBM AT. The
problem, basically, is that if you can't read the EE (Electrical
Engineer's) notes, you're lost. Solution: make a new friend (associate).

I also want to thank John for talking to me at 1 a.m. about a
non-money-making problem. Or better yet: Thanks John, this is what
"Midnight Engineering" is all about.

------------oOo--------------

Problem: The FDC driver loses about 1 in every 30 requests due to a
reported "DMA transfer lost".

A few terms:

DMA - Direct Memory Access (Controller) (Intel 8237A)
FDC - Floppy Drive Controller (NEC uPD765A)
ACK - Acknowledge

------------oOo--------------

John Sokol: OK, let's take it from the top. First you set up the DMA
controller to wait for a request on channel 2 (from the FDC). Second,
the FDC is set up for the transfer. At this point the FDC is waiting
for the requested sector to come around with the appropriate
information. When it gets that sector, it triggers the DMA request.
Then it waits for the DMA ACK(nowledge) to come back from the DMA
controller chip.

Jmonroy: OK.

John Sokol: Now it (the DMA) has to be back -in time- for the arrival
of the first byte from the FDC. That data (on the FDC) is then ready
and queued up; if it (the DMA) doesn't catch that first byte, it (the
transfer) is lost.

So to clarify, the disk is spinning, and there is a long latency
before the actual transfer may start. If the diskette is not in
position for an immediate transfer, the system is essentially waiting
for the DMA transfer to start. The window from DMA request to
ACK(nowledge) for the floppy drive is probably ----- no, it is
definitely no longer than the gap length after the sector header.

It is at this point that the DMA arbitration may have a problem.
There is a DMA RAM refresh occurring on DMA channel 0, controlled by
timer channel (?). If that RAM refresh is occurring when the requested
sector "becomes" ready, the DMA is not going to be able to
ACK(nowledge) in time. So it is just a matter of fact that the disk
happens to be at the wrong place (when the DMA RAM refresh is
occurring) and that the transfer will then fail.

Jmonroy: I think this leads to a problem I was discussing with you
before about harmonic ringing. That is, when more than one timer is
running in parallel on the system, the timers, which run almost in
harmony, collide at some point when a harmonic occurs. So maybe there
is a harmonic somewhere where the timers overlap?

John Sokol: Well, with the floppy disk it is not the timer; it is the
revolution of the disk and its overlap with the timing of the DMA RAM
refresh, which you don't control. That (the refresh) is set at boot
time by the BIOS.

Jmonroy: So because it's so slow, the FDC is the most likely to fail.
The ethernet might have the same problem... the COM driver...

John Sokol: ANYTHING asking for a DMA request that has timely data
(will fail).
That is, if there is no buffering beyond a one-byte buffer, it is
going to fail if there is a DMA RAM refresh occurring at that exact
instant in time. RAM refresh has the highest priority.

Jmonroy: The highest priority, and nobody can tell what is going on
with it.

John Sokol: You can't tell when it (the refresh) is going to come up
or anything.

Jmonroy: OK. So profiling the system wouldn't help the problem.

John Sokol: I wouldn't expect it to.

Jmonroy: And the best thing to do is work around it?

John Sokol: If you can't afford to work with a buffered device, it
means you just retry.

Jmonroy: Expect transfer failures with a non-buffered device?

John Sokol: YES.

Jmonroy: OK.

John Sokol: NOW, I know there have been quite a few people
experimenting with "slowing down" the RAM refresh cycle, which
actually makes the system go faster. You get a 5% (or so) performance
increase by optimizing the RAM refresh (spacing the refreshes as far
apart as possible), but you are running the risk of data loss in your
DRAM (by not refreshing it as often as the manufacturer specifies).

Jmonroy: So, you might get things like parity errors popping out of
nowhere?

John Sokol: Yes.

Jmonroy: OK, thanks. I will send this off to Bill when I get a
chance, and maybe that will get some solution ideas working.
Thanks again.

___________________________________________________________________________
Jesus Monroy Jr                                          jmonroy@netcom.com
/386BSD/device-drivers /fd /qic /clock /documentation
___________________________________________________________________________
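
------------oOo--------------

A note on the "just retry" workaround John describes for non-buffered
devices: it might be sketched roughly as below. This is only an
illustration under stated assumptions, not code from any actual 386BSD
driver. The status codes, the xfer_fn callback type, xfer_with_retry()
and the fake_dev simulation are all hypothetical names made up for the
example. A real driver would re-program the DMA channel and re-issue
the FDC command inside the callback on every attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; not taken from any real driver. */
enum xfer_status { XFER_OK = 0, XFER_DMA_LOST = 1, XFER_HARD_ERROR = 2 };

/* Hypothetical callback: performs ONE complete attempt (set up the DMA
 * channel, issue the device command, wait for the result) and reports
 * how that attempt ended. */
typedef enum xfer_status (*xfer_fn)(void *ctx);

/*
 * Repeat a transfer while it fails with a "DMA transfer lost" condition,
 * up to max_tries attempts. Any other result (success or a hard error)
 * ends the loop at once, since retrying would not change it.
 * The number of attempts made is stored in *attempts if it is non-NULL.
 */
enum xfer_status xfer_with_retry(xfer_fn attempt, void *ctx,
                                 int max_tries, int *attempts)
{
    enum xfer_status st = XFER_DMA_LOST;
    int n = 0;

    assert(attempt != NULL && max_tries > 0);

    while (n < max_tries) {
        st = attempt(ctx);
        n++;
        if (st != XFER_DMA_LOST)
            break;
    }
    if (attempts != NULL)
        *attempts = n;
    return st;
}

/* Simulated device for demonstration only: the first fail_first
 * attempts report a lost DMA transfer, later ones succeed. */
struct fake_dev { int fail_first; int calls; };

enum xfer_status fake_attempt(void *ctx)
{
    struct fake_dev *d = ctx;

    d->calls++;
    return (d->calls <= d->fail_first) ? XFER_DMA_LOST : XFER_OK;
}
```

With a simulated device that loses its first two attempts, calling
xfer_with_retry(fake_attempt, &dev, 5, &n) should come back with
XFER_OK after three attempts. The retry limit matters: a failure that
repeats on every attempt is probably not a refresh collision and
should be reported rather than retried forever.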