Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!news.unimelb.EDU.AU!cs.mu.OZ.AU!munnari.OZ.AU!uunet!in2.uu.net!144.212.100.12!news.mathworks.com!enews.sgi.com!news.corp.sgi.com!news.sgi.com!news1.best.com!nntp1.ba.best.com!not-for-mail
From: dillon@flea.best.net (Matt Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc,comp.sys.sgi.misc
Subject: Re: no such thing as a "general user community"
Date: 26 Mar 1997 00:35:42 -0800
Organization: BEST Internet Communications, Inc.
Lines: 54
Message-ID: <5han4u$fnf@flea.best.net>
References: <331BB7DD.28EC@net5.net> <5h91l2$gua@innocence.interface-business.de> <5h9rr0$2sj@flea.best.net> <5h9vft$8eo@fido.asd.sgi.com>
NNTP-Posting-Host: flea.best.net
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:37944 comp.sys.sgi.misc:29484

:In article <5h9vft$8eo@fido.asd.sgi.com>,
:Nate Tuck <nate@blit.engr.sgi.com> wrote:
:>In article <5h9rr0$2sj@flea.best.net>,
:>Matt Dillon <dillon@flea.best.net> wrote:
:..
:>
:>How do you define these load metrics?  What is a heavy load on each of
:>the machines?  Is it measured by xload, response time on some app, or
:>what?  How many users can you stick on each machine before the load
:>becomes heavy?

    I generally define a 'heavy load' as 'undergoing paging quite often'
    or 'significant number of processes blocked in I/O wait states' and
    a 'medium-to-heavy load' as 'occasionally paging'.

    Our mail/www proxies each typically have around 80 sendmail
    processes running and 30-40 active WWW connections, plus named.  I
    consider this a medium load.  Our main news machine runs its disks
    and network maxed out half of the time, with around 70 incoming and
    95 outgoing processes (running Diablo).  That's heavily loaded.  The
    main newsreader machine, running inn 1.5.x, has around 250 nnrpd's
    going at any given moment but plenty of I/O and cpu cycles to spare,
    and runs its disks at around 30% saturation (at a guess)... that's
    medium loaded.
    Except for the newsreader machine, the machines have 128MB of ram
    in them at the moment.

:>Religious differences and rose-colored glasses aside, what is the
:>difference in throughput between the two platforms in the specific
:>case of BEST?  Where does SGI need to do some code tuning?
:>
:>I'm interested.
:>
:>nate

    Well, this is a biased answer (as probably all my SGI-related
    comments are), but my opinion is that the IRIX kernel needs a
    complete workover, especially the network, paging, and block I/O
    code.

    The problems are mainly related to inefficiency.  Odd situations
    can result in huge swings in performance.  Sometimes the rtnetd's
    take huge globs of cpu, sometimes not.  Paging often hits degenerate
    cases where the machine's performance drops by an order of magnitude
    which, given the rate at which new connections come in, generally
    spells a quick death.  It's so bad that we STILL have a
    once-a-minute cron job running on the two L's which allocates
    130 MBytes of ram, touches it all, then exits.

    The block I/O is extremely inefficient for a general multitasking
    load, mainly owing to terrible buffer cache management and 16K I/O
    operations (64 bit kernels).  Basically, we will see the performance
    drop from one moment to the next without any discernible cause.

    fork/exec overhead is also really, really bad, and shellx needs to
    do about 40 fork/execs a second at peak.  Even now, after midnight,
    it's doing 20/sec.

    The VM system is crazy... it reserves 'virtual' swap on a
    per-process basis even for read-only shared mmap()s, and if it does
    that, god only knows what it's doing with other shared objects.
    Complete insanity.

                                                -Matt
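
For readers wondering what the once-a-minute cron job mentioned above might look like, here is a minimal sketch. It is not the actual BEST script, which the post does not show. Only the 130 MByte figure and the allocate / touch / exit sequence come from the post; the choice of Python, the helper name `touch_memory`, and the one-write-per-page strategy are assumptions made for illustration.

```python
import mmap
import sys

MEGABYTE = 1024 * 1024


def touch_memory(size_bytes, page_size=mmap.PAGESIZE):
    """Allocate size_bytes of anonymous memory and write one byte per page.

    Hypothetical helper (not from the post). Returns the number of
    pages that were written to.
    """
    if size_bytes <= 0:
        return 0
    # Anonymous mapping: the kernel only has to back a page with real
    # memory once that page is actually written.
    region = mmap.mmap(-1, size_bytes)
    touched = 0
    try:
        for offset in range(0, size_bytes, page_size):
            region[offset] = 1  # one write per page is enough to touch it
            touched += 1
    finally:
        region.close()  # give the memory back, as the cron job does by exiting
    return touched


if __name__ == "__main__":
    # 130 MB is the figure quoted in the post; an optional argument overrides it.
    megabytes = int(sys.argv[1]) if len(sys.argv) > 1 else 130
    pages = touch_memory(megabytes * MEGABYTE)
    print(f"touched {pages} pages ({megabytes} MB)")
```

Run from cron once a minute (for example with a `* * * * *` schedule pointing at wherever the script is installed; the location is up to the administrator), this would make the kernel find 130 MB of pages for the process and then release them when it exits. The post does not say why this helps on IRIX, so treat the sketch as a description of the mechanics only.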