*BSD News Article 69371

Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mira.net.au!news.vbc.net!news.cais.net!bofh.dot!news.mathworks.com!newsfeed.internetmci.com!usenet.eel.ufl.edu!bofh.dot!warwick!lyra.csx.cam.ac.uk!news
From: Damian Reeves <damian@zeus.co.uk>
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: Linux vs. FreeBSD ... (FreeBSD extremely mem/swap hungry)
Date: Sat, 25 May 1996 10:39:29 +0100
Organization: Zeus Technology Ltd.
Lines: 161
Message-ID: <31A6D551.41C67EA6@zeus.co.uk>
References: <3188C1E2.45AE@onramp.net> <4o3ftc$4rc@zot.io.org> <31A5A8F6.15FB7483@zeus.co.uk> <31A5D0A8.59E2B600@zeus.co.uk> <DrxB6M.Iyn@kithrup.com>
NNTP-Posting-Host: jobbie.chu.cam.ac.uk
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 2.02 (X11; I; FreeBSD 2.1.0-RELEASE i386)

Sean Eric Fagan wrote:
> 
> In article <31A5D0A8.59E2B600@zeus.co.uk>,
> Damian Reeves  <damian@zeus.co.uk> wrote:

    >> "Our program is so old, no one who wrote the internals is even
    >> alive anymore, we couldn't possibly alter them, let alone
    >> describe the design decisions that contributed to this large
    >> chunk of code we don't understand."

    Sean> Uh... actually, everyone who wrote the internals of BSD is
    Sean> still quite alive.  In fact, I went and got a copy of the
    Sean> 4.4BSD book on Tuesday, and got it signed by one of the
    Sean> authors.  I'll try to get the others as time goes by.

It wasn't meant to be a description of the actual history of BSD,
rather a response to the "Our code is obviously perfect because we've
used it for so long" argument.

    >> Under BSD, memory is allocated on a binary buddy system causing
    >> all blocks to be allocated of sizes that are a power of 2.
    >> This wastes a lot of memory (ask for 2mb+1byte and the kernel
    >> will reserve 4mb of memory for you).  Linux on the other hand
    >> supports the allocation of arbitrary sized blocks.

    Sean> Uh... that is a library issue, not a kernel issue.
    Sean> Specifically, malloc(), which is not a system call.  And
    Sean> FreeBSD has a couple of different mallocs it comes with;
    Sean> they behave differently (space vs. time tradeoffs are
    Sean> common).

But did you not say that the OS is more than just a kernel?
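
To put a number on the 2mb+1byte case quoted above, here is a minimal
sketch of the rounding a binary buddy allocator performs.  The names
round_up_pow2() and pow2_waste() are hypothetical helpers made up for
illustration; this is not the code of any FreeBSD malloc.

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the next power of two, the way a binary buddy
 * allocator sizes its blocks.  Hypothetical helper for illustration. */
size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    /* Reject 0 and anything above the largest representable power of 2. */
    assert(n > 0 && n <= ((size_t)-1 >> 1) + 1);
    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes lost to rounding for a request of n bytes. */
size_t pow2_waste(size_t n)
{
    return round_up_pow2(n) - n;
}
```

For a request of 2*1024*1024 + 1 bytes, round_up_pow2() gives 4194304
(4mb), so pow2_waste() reports one byte short of 2mb.  A request that
is already a power of two is returned unchanged.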

    Sean> You can roll your own malloc, of course, and several
    Sean> packages provide their own.  The kernel-level memory
    Sean> allocation system calls are brk() (sbrk() on some systems is
    Sean> an actual system call; on others, it's a wrapper for the
    Sean> brk() system call) and mmap().

Indeed, I already have.  However, that's not going to make my
statically linked Netscape take any less memory, is it?

I have tracked some of my memory problems down to an interaction between
Netscape, XFree 3.1.2 and ctwm configured with backing store and
save-unders.

    >> Secondly, and the most significant difference, is the different
    >> semantics of the malloc() call.  Under BSD, a process
    >> performing a malloc() of 100k will actually reserve 100k of
    >> memory from the VM system, or return failure if that memory
    >> cannot be allocated.  Under Linux, no such pre-reservation
    >> occurs, and malloc() calls cannot fail.  The memory is allocated on
    >> demand as the pages that have been 'reserved' are dirtied by
    >> the process.

    Sean> Historically, BSD has backed memory with swap space.
    Sean> FreeBSD, however, does not -- it uses the same method you
    Sean> ascribe to Linux.  It is called "lazy allocation."  In fact,
    Sean> Net/2 used this same method -- it was a hold-over from the
    Sean> Mach VM code that Net/2 and later used.  (I believe that
    Sean> Lite or Lite2 tries to keep track of how much swap space is
    Sean> used, so won't allow it.  FreeBSD, however, does allow it.)

OK, I didn't realise that FreeBSD already did this.  I was trying to
come up with some reason why FreeBSD used so much swap; now it's even
harder to explain.

    >> For example, under Linux on a 8mb machine, I can malloc() a
    >> sparse array of 200mb which will succeed, write to the first
    >> and last bytes of this data, which will increase the process
    >> SIZE by only 2 4k pages.  On the same machine under BSD, this
    >> is impossible (unless you have > 200mb of swap space).

    Sean> And then, when you modify a byte somewhere in the middle,
    Sean> either your program, or some other randomly-chosen process,
    Sean> will be killed because you've run out of physical and swap
    Sean> memory.

Very true, no guarantees can be made about whether a program will SEGV
on Linux.  Important system daemons need to be very careful to
malloc(), touch, then free() enough memory on startup to grow their
data size, so that memory allocated to the process later on will
always be available.  Even then, fragmentation makes this almost
impossible to achieve.  It could be said that it is impossible to use
a Linux-style memory manager for mission-critical applications, unless
they do all their processing in fixed-size static buffers.  One could
install a SEGV signal handler that tried to restart the process in a
sensible state after a memory fault, but this is a gross hack.  Then
again, how much UNIX code actually checks whether malloc() returned
zero and handles it appropriately?
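
As a rough sketch of the two patterns discussed above: touch_ends()
mimics the sparse-array test (allocate, dirty only the first and last
byte), and alloc_prefaulted() is the "malloc(), then touch" idea for a
daemon that wants its pages backed up front.  Both names, and the 4k
PAGE_SIZE_GUESS constant, are assumptions made for illustration.
Whether the untouched pages consume memory or swap depends entirely on
the system's allocation policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_GUESS 4096  /* assumption: 4k pages, as in the example */

/* Allocate n bytes and dirty only the first and last byte.
 * Returns 0 on success, -1 if malloc() refused the request.
 * Under lazy allocation only the touched pages need backing. */
int touch_ends(size_t n)
{
    volatile unsigned char *p;

    assert(n > 0);
    p = malloc(n);
    if (p == NULL)          /* the check that is so often skipped */
        return -1;
    p[0] = 1;
    p[n - 1] = 1;
    free((void *)p);
    return 0;
}

/* Allocate n bytes and write to every page, so that any shortage shows
 * up at startup rather than later.  Returns NULL if malloc() fails. */
void *alloc_prefaulted(size_t n)
{
    unsigned char *p;
    size_t off;

    assert(n > 0);
    p = malloc(n);
    if (p == NULL)
        return NULL;
    for (off = 0; off < n; off += PAGE_SIZE_GUESS)
        p[off] = 0;
    p[n - 1] = 0;           /* make sure the final page is touched too */
    return p;
}
```

A caller would treat -1 or NULL as "out of memory" and back off, rather
than carrying on with a bad pointer.  Note that pre-touching only helps
for the block that was touched; it says nothing about later requests.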

One thing I noticed with our Linux box today is that, apart from
init/kswapd, the minimum text size of all the other processes on the
machine was 204k.  Now, a 'size' on /bin/bash (which is /bin/sh on
Linux) shows almost exactly 204k of code.  It would thus appear that
on a fork()/exec(), extraneous text pages are not freed back to the
OS.  Hopefully those extra pages are still shared between the other
processes; otherwise every unique program will take a minimum of 200k
of VM (although it shouldn't swap it out to swap space but re-read it
off the filesystem).  I haven't checked this on BSD yet.

    >> I am not discussing load here, that is irrelevant.  Load and
    >> paging are two totally different things.

    Sean> Not completely.  (I could back it up, but why bother?  You'd
    Sean> just claim it was irrelevant.)

If it helps explain, please don't hold back.

    >> Do you know if the mount_mfs() process invokes the kernel
    >> sbrk() call to manually return unused pages in its data segment
    >> back to the OS?

    Sean> Well, actually, that's not how MFS works.  It allocates a
    Sean> fixed size of memory (based on command line arguments, or
    Sean> the default if none are given), "formats" that memory so
    Sean> that it looks like a UFS filesystem, and then goes into
    Sean> kernel mode and never comes back until the filesystem is
    Sean> unmounted.  Since the VM allocation (in FreeBSD) is lazy,
    Sean> neither physical or swap memory are used until the memory is
    Sean> actually touched (read or written).

    Sean> When a file is removed, however, it doesn't get "returned to
    Sean> the system."  It will just sit there eating up space;
    Sean> however, the filesystem will be able to reuse it, hopefully.

Aha, so MFS is going to eat my swap and never return it then.
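
For comparison, here is a minimal sketch of how an ordinary process can
take a fixed-size region lazily and later hand all of it back, using
the mmap() call mentioned earlier together with munmap().  This is not
the mount_mfs source; region_reserve() and region_release() are
hypothetical names, and the sketch assumes a system that provides
anonymous mappings via MAP_ANON or MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Reserve a fixed-size, zero-filled anonymous region.
 * Returns NULL on failure.  Pages are typically not backed until
 * they are first touched. */
void *region_reserve(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand the whole region back to the system.  Returns 0 on success. */
int region_release(void *p, size_t len)
{
    return munmap(p, len);
}
```

The point of the sketch is the pairing: whatever was dirtied inside the
region stays charged to the process until region_release() is called,
which matches the "never comes back until unmounted" behaviour above.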

    Sean> You, on the other hand, appear to be trolling for flames in
    Sean> a Linux vs. BSD war.  Why, I don't know; I don't see any
    Sean> reason for baseless arguments -- and your ignorance about
    Sean> BSD internals, let alone FreeBSD internals, make most of
    Sean> your arguments baseless.

My argument has nothing to do with a Linux/BSD war; Linux is merely
something I can compare BSD against on the same hardware.  My
argument is: why should my xbiff take 1.3MB of RSS?  Do you think that
is a good use of memory?  Back in days gone by, one used to run a
UNIX server with 4MB of RAM which supported 20-odd interactive users.
Now you'd plump for 32MB or maybe even 64MB as a minimum to achieve
this, yet the requirements of the users have hardly changed.

This may not concern you in the slightest, but I find this a very
worrying trend.  One thing that is really annoying me at the moment,
from a purist point of view, is the amount of memory that
programs require under UNIX.  Coming from a history of RISC assembly
programming, I still find it quite incredible that an interactive
shell, basically a command line editor with pipe handling, requires a
200k binary with even larger memory requirements!  Yet no-one seems to
question where all this space is going and what it is used for.

I think UNIX needs a RISC-style overhaul; memory needs to 'pay its
way'.

Regards,
 Damian

-- 
Damian Reeves, <damian@zeus.co.uk>                 Zeus Technology Ltd.
Download the world's fastest webserver today!      http://www.zeus.co.uk