Newsgroups: comp.unix.bsd
Path: sserve!manuel!munnari.oz.au!spool.mu.edu!agate!dog.ee.lbl.gov!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: Program dies with FP Exception
Message-ID: <1992Sep13.083846.6134@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <STARK.92Sep13002650@sbstark.cs.sunysb.edu>
Date: Sun, 13 Sep 92 08:38:46 GMT
Lines: 67

In article <STARK.92Sep13002650@sbstark.cs.sunysb.edu> stark@cs.sunysb.edu (Gene Stark) writes:
>Here's a tough one I've been trying to track down -- maybe somebody out there
>who knows more can guess what is going on.
>
>I am running 386BSD on a 486/33 system with 4MB RAM and a 210MB Connor IDE
>drive. A program I was working on dies on Signal 8 (Floating point exception)
>in a perfectly repeatable fashion. It is not so easy to tell where the
>exception actually comes from, though, because the signal seems to be getting
>delivered to the process much later, when it is leaving the system after
>a call to "write". I haven't been able to get a small test program that
>repeats the bug, however there seem to be several crucial elements involved:
>
>	(1) A call to "atof", which returns a double that is then
>	    stored in a temporary on the stack. Removing the call
>	    removes the error.
>
>	(2) The actual magnitude of the number being converted by "atof".
>	    I found that the string "1e10" and "1e12" cause the error,
>	    but "1e9", "1e6", and "0.0" do not.
>
>	(3) Some later "write" system calls. The signal is actually
>	    delivered on the fourth call to write after the atof.
>	    What is happening in the interim is just C code without
>	    any other system calls. I do not know what causes the
>	    signal to get delivered when it actually does.

First of all, like all other signals, the SIGFPE gets delivered to a process
as a result of the sigtrampoline code. The *only* way you get a signal is on
return from a system call.
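
One way to see when the signal actually lands is to catch it and record that
it arrived. What follows is only a sketch, assuming a POSIX-style sigaction()
is available (which may not hold for every libc of this vintage); the names
fpe_seen, on_fpe and install_fpe_handler are made up for illustration and are
not part of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX sigaction() declarations */
#include <signal.h>
#include <string.h>

/* Hypothetical flag: becomes 1 once a SIGFPE has been delivered. */
volatile sig_atomic_t fpe_seen = 0;

/* Hypothetical handler: it only records the delivery, which keeps it
 * async-signal-safe. */
static void on_fpe(int signo)
{
    (void)signo;
    fpe_seen = 1;
}

/* Install the handler for SIGFPE.
 * Returns 0 on success and -1 on failure, exactly as sigaction() does. */
int install_fpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

Calling install_fpe_handler() at the top of main() and checking fpe_seen after
each suspect step (the atof() call, then each write()) would show where the
process first notices the signal. Returning from a handler for a SIGFPE raised
by the hardware is undefined, so for a real fault the handler should report
and exit rather than return.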

The problem is that there appears to be no code in the library which forces
a check for the exception *immediately* after the floating point function
call. This is aggravated by the fact that GCC likes to in-line 386 floating
point (from what little experimentation I've done). This has the effect of
defeating any fixes made at the library level to hit the sigtrampoline code
to check for an exception.

Second, are you using a real FPU, or are you using the emulation? I know
that I *could* try it myself, but I prefer to arrive at an expected answer
before experimenting (I guess my physics background shows).

Third, you were aware that for a 16 bit value to be multiplied/divided, you
have to have a 32 bit area to receive the value, and for a 32 bit, you have
to have a 64 bit receiver? Perhaps you are truly getting an exception.

Fourth, I believe that the math stuff is actually not being done at the
highest floating point resolution (I read this in the newsgroup here, so I
could be totally wrong 8-)). This would lend credence to the idea that you
are actually getting an exception.

Fifth, there is a well known problem that causes 'ps' to die with the same
exception -- the problem occurs when you have a double lvalue and assign to
it the result of a function that is undeclared (and is therefore assumed to
return int). Are you sure that atof() is declared extern double somewhere?

Hope this helps narrow the problem.


					Terry Lambert
					terry_lambert@gateway.novell.com
					terry@icarus.weber.edu
---
Any opinions in this posting are my own and not those of
my present or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------
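
The fifth point in the reply, about atof() needing a visible declaration, can
be sketched roughly as follows. With <stdlib.h> included the compiler sees
atof() declared as returning double; with no declaration in scope, a pre-ANSI
or C89 compiler would assume it returns int, and the value stored into the
double would be garbage. The wrapper name parse_magnitude is hypothetical and
exists only for this illustration.

```c
#include <stdlib.h>   /* declares: double atof(const char *); */

/* Hypothetical wrapper: convert a decimal string to a double.
 * Because <stdlib.h> is included above, the call below is compiled as a
 * call to a function returning double, not an implicitly-int function. */
double parse_magnitude(const char *text)
{
    double value = atof(text);

    return value;
}
```

If <stdlib.h> cannot be included for some reason, an explicit
"extern double atof();" before the first use serves the same purpose.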