Newsgroups: comp.unix.bsd
Path: sserve!manuel!munnari.oz.au!uunet!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!wupost!darwin.sura.net!uvaarpa!cv3.cv.nrao.edu!laphroaig!cflatter
From: cflatter@nrao.edu (Chris Flatters)
Subject: Re: Jolitz 386BSD-0.1 -- floating point perform
Message-ID: <1992Jul24.161646.22896@nrao.edu>
Sender: news@nrao.edu
Reply-To: cflatter@nrao.edu
Organization: NRAO
References: <l6qc51INN1gu@neuro.usc.edu>
Date: Fri, 24 Jul 1992 16:16:46 GMT
Lines: 50

In article l6qc51INN1gu@neuro.usc.edu, merlin@neuro.usc.edu (merlin) writes:
>I have most of the US Army BRLCAD three dimensional CSG modeling and
>distributed ray tracing system ported to the Jolitz 386BSD-0.1. But,
>I am getting only about one fifth of the floating point performance
>previously measured using AT&T pcc and GNU gcc 1.4x on ATT UNIX SYSV.
>
>Does the compiler default to '387 emulation? Is there some flag which
>needs to be set to actually use the coprocessor? Or are there reasons
>386BSD-0.1 would exhibit relatively poor floating point performance?

I ran some checks last night and 386BSD is certainly exploiting the
coprocessor. These are the results from the Plum2 benchmark (see
section 8.2 of "C++ Programming Guidelines" by Thomas Plum and Dan
Saks). The results are the average time for a register int, auto
short, auto long and auto double operation and the average time to
call and return from an empty function. Times are in nominal
milliseconds (CLOCKS_PER_SEC was missing from <time.h>, so I guessed
a value of 100; I now think that it should have been 60). The tests
were performed on a CompuAdd 325s (25MHz 80386SX CPU) with a Cyrix
83S87 FasMath coprocessor.

                 register  auto    auto    function  auto
                 int       short   long    call+ret  double
386BSD gcc       0.178     0.448   0.474   1.62      4.94
386BSD gcc -O    0.159     0.207   0.159   1.75      3.37

The ratio of floating-point time to auto long time is 21.2 (with
optimization), which is in the correct ballpark for a 386SX/387SX
system but a little on the long side.

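The effect of the guessed CLOCKS_PER_SEC on these figures can be sketched in C. This is only an illustration, not the Plum2 source: the function names (rescale_time, time_ratio, time_double_add) and the fallback #define are hypothetical, and the values 100 and 60 are the two candidates mentioned above. The sketch assumes each reported time was computed as clock ticks divided by the guessed tick rate.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical fallback for a <time.h> that lacks the macro;
   100 is the guessed value mentioned in the post. */
#ifndef CLOCKS_PER_SEC
#define CLOCKS_PER_SEC 100
#endif

/* Convert a time that was computed with a guessed tick rate into
   the time implied by a different tick rate (same tick count). */
double rescale_time(double t, double guessed_cps, double actual_cps)
{
    return t * guessed_cps / actual_cps;
}

/* Ratio of two per-operation times. Any common scale factor,
   such as a wrong tick-rate guess, cancels out. */
double time_ratio(double fp_time, double long_time)
{
    return fp_time / long_time;
}

/* Average time in seconds for one addition on an auto double,
   measured over n iterations with clock(). Loop overhead is
   included, so this is only a rough figure. */
double time_double_add(long n)
{
    volatile double x = 0.0;   /* volatile keeps the loop from being optimized away */
    clock_t start, end;
    long i;

    assert(n > 0);
    start = clock();
    for (i = 0; i < n; i++)
        x = x + 1.0;
    end = clock();
    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)n;
}
```

Under that assumption, if the true tick rate were 60 rather than 100, every time in the table would be larger by a factor of 100/60, but the ratio of floating-point time to auto long time would be unchanged, so the comparison between machines would not depend on the guess.
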
As a control, I made a copy of the dist.fs disk with a compiled
version of bench2 on it and booted it on my portable: a 16 MHz
80386SX system without a coprocessor. The results were

                 register  auto    auto    function  auto
                 int       short   long    call+ret  double
386BSD gcc -O    0.240     0.317   0.242   2.32      346

Note that the ratio of f-p time to auto long time is now 1429.8; in
other words, emulation is more than 60 times slower than the
coprocessor. Unless BRLCAD uses very little floating-point, I believe
that the coprocessor is active on Alexander-James Annala's machine
too. (If Alexander-James wants to try these tests, I'll send him the
source code if he drops me a line.)

For a final comparison, I have some old figures from Linux with gcc
2.1. Using the register int time to place the results on the same
scale as the 25MHz results above, the mean time for a f-p operation
was 2.09 usec without optimization and 0.936 usec at -O1 and above.

	Chris Flatters
	cflatter@nrao.edu