Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!news.unimelb.EDU.AU!cs.mu.OZ.AU!munnari.OZ.AU!uunet!in2.uu.net!144.212.100.12!news.mathworks.com!enews.sgi.com!news.corp.sgi.com!news.sgi.com!news1.best.com!nntp1.ba.best.com!not-for-mail
From: dillon@flea.best.net (Matt Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc,comp.sys.sgi.misc
Subject: Re: no such thing as a "general user community"
Date: 26 Mar 1997 00:35:42 -0800
Organization: BEST Internet Communications, Inc.
Lines: 54
Message-ID: <5han4u$fnf@flea.best.net>
References: <331BB7DD.28EC@net5.net> <5h91l2$gua@innocence.interface-business.de> <5h9rr0$2sj@flea.best.net> <5h9vft$8eo@fido.asd.sgi.com>
NNTP-Posting-Host: flea.best.net
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:37944 comp.sys.sgi.misc:29484
:In article <5h9vft$8eo@fido.asd.sgi.com>,
:Nate Tuck <nate@blit.engr.sgi.com> wrote:
:>In article <5h9rr0$2sj@flea.best.net>,
:>Matt Dillon <dillon@flea.best.net> wrote:
:..
:>
:>How do you define these load metrics? What is a heavy load on each of
:>the machines? Is it measured by xload, response time on some app, or
:>what? How many users can you stick on each machine before the load
:>becomes heavy?
I generally define a 'heavy load' as 'undergoing paging quite often'
or 'significant number of processes blocked in I/O wait states'
and a 'medium-to-heavy load' as 'occasionally paging'. Our mail/www
proxies each typically have around 80 sendmail processes running and
30-40 active WWW connections, plus named. I consider this a medium
load. Our main news machine runs its disks and network maxed out half
of the time, with around 70 incoming and 95 outgoing processes (running
Diablo). That's heavily loaded. The main newsreader machine, running
inn 1.5.x, has around 250 nnrpd's going at any given moment but plenty
of I/O and cpu cycles to spare, and runs its disks at around 30%
saturation (at a guess)... that's medium loaded. Except for the newsreader
machine, the machines have 128MB of ram in them at the moment.
:>Religious differences and rose-colored glasses aside, what is the
:>difference in throughput between the two platforms in the specific
:>case of BEST? Where does SGI need to do some code tuning?
:>
:>I'm interested.
:>
:>nate
Well, this is a biased answer (as probably all my SGI-related comments
are), but my opinion is that the IRIX kernel needs a complete workover,
especially the network, paging, and block I/O code. The problems are
mainly related to inefficiency. Odd situations can result in huge
swings in performance. Sometimes the rtnetd's take huge globs of cpu,
sometimes not. Paging often hits degenerate cases where the machine's
performance drops by an order of magnitude which, given the rate that
new connections come in, generally spells a quick death. It's so bad
that we STILL have a once-a-minute cron job running on the two L's which
allocates 130 MBytes of ram, touches it all, then exits. The block I/O
is extremely inefficient for a general multitasking load, mainly owing
to terrible buffer cache management and 16K I/O operations (64 bit
kernels). Basically, we will see the performance drop from one moment
to the next without any discernible cause. fork/exec overhead is also
really, really bad, and shellx needs to do about 40 fork/execs a second
at peak. Even now, after midnight, it's doing 20/sec. The VM system
is crazy... it reserves 'virtual' swap on a per-process basis even for
read-only shared mmap()s, and if it does that, god only knows what it's
doing with other shared objects. Complete insanity.
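For anyone wondering what the once-a-minute job amounts to, here is a rough
sketch of an "allocate, touch every page, exit" helper. The actual program is
not shown in this post, so the function name touch_memory, its parameters, and
the one-write-per-page approach are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the original program): allocate `bytes` of
 * memory, write one byte into each page so the kernel has to back the
 * allocation with real pages, then free it.
 * Returns the number of pages written, or 0 if the allocation failed.
 */
size_t touch_memory(size_t bytes, size_t page_size)
{
    assert(page_size > 0);

    /* volatile keeps the compiler from optimizing the writes away */
    volatile unsigned char *buf = malloc(bytes);
    if (buf == NULL)
        return 0;

    size_t touched = 0;
    for (size_t off = 0; off < bytes; off += page_size) {
        buf[off] = 1;   /* one write per page is enough to fault it in */
        touched++;
    }

    free((void *)buf);
    return touched;
}
```

A small main() run from cron could call this with 130 * 1024 * 1024 bytes and
the page size reported by sysconf(_SC_PAGESIZE), then return. Those arguments
simply mirror the 130 MBytes figure above; nothing else about the real job is
implied.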
-Matt