Xref: sserve comp.os.386bsd.questions:6947 comp.os.386bsd.bugs:1837 comp.os.386bsd.misc:1627 misc.test:29161
Path: sserve!newshost.anu.edu.au!munnari.oz.au!metro!sequoia!ultima!kralizec.zeta.org.au!godzilla.zeta.org.au!not-for-mail
From: bde@kralizec.zeta.org.au (Bruce Evans)
Newsgroups: comp.os.386bsd.questions,comp.os.386bsd.bugs,comp.os.386bsd.misc,misc.test
Subject: Re: [FreeBSD-1.0R] Epsilon -> Release patches - problems
Date: 10 Nov 1993 18:00:38 +1100
Organization: Kralizec Dialup Unix Sydney - +61-2-837-1183, v.32bis and v.42bis
Lines: 91
Message-ID: <2bq3imINN53k@godzilla.zeta.org.au>
References: <CG5LAE.4o3@agora.rain.com>
NNTP-Posting-Host: godzilla.zeta.org.au

In article <CG5LAE.4o3@agora.rain.com>,
David Greenman <davidg@agora.rain.com> wrote:
>>You are (again) incorrect. It simply means that assuming the worst
>>possible memory fragmentation you can still allocate a 4k buffer.
>>That's fine if you only care about 4k file systems. Given 8k file
>>systems, it is still not that difficult to get enough fragmentation to
>>cause noticeable performance degradation--the most likely case being if
>>you also try to use 4k file systems at the same time (say, on a floppy
>>disk).

There are much worse cases than that. E.g., 4K block allocated, 28K
hole, 4K block, 28K hole, ... Only 1/8 of the address space is
allocated, but no request for > 28K can be satisfied and the kernel
will panic. Such a pattern is very unlikely, but it's not easy to
prove that it is impossible. The buffer cache won't request > 28K
unless you have expanded MAXBSIZE to >= 32K, but other parts of the
kernel might.

>Because of the limit on number of buffers, even if all of the headers
>point to 4k buffers, and even if all of the 4k buffers occupy every
>other page in the malloc area, as soon as you want to expand a buffer
>to be 8k, the FS cache releases one of the 4k buffers, and you then
>have an 8k hole. Like I said, even with worst-case fragmentation,
>there is no problem.
Sorry, unless you have changed vfs__bio.c, the new space has to be
allocated before the old space can be freed, so that the old space can
be copied. My versions of vfs__bio.c and kern_malloc.c (almost) fix
this by releasing free buffers until malloc() succeeds. It's still
hard to prove that this works in all cases of interest, because there
might be a lot of buffers in use, or severe fragmentation in the
memory allocated for non-buffers. However, if

    (virtual address space size in pages) >=
        N * ((max memory required for non-buffers) + allocbufspace)

then the worst case is every N'th page allocated, so free blocks of
size ((N - 1) * NBPG) are guaranteed. Take N = (2 + 1) to support
MAXBSIZE = 8K, N = (4 + 1) to support MAXBSIZE = 16K, etc. This leaves
the problems of limiting the memory required for non-buffers, and of
allocbufspace all being used up by in-use buffers.

>>The worst case would obviously be alternating allocations of 4k and 8k
>>blocks; it is easy to see why this would cause many unfillable 4k
>>fragments in the address space. Assuming less adversarial timing, the

It's not that obvious. If there are a lot of 8K blocks, then
allocbufspace limits the total number of blocks (unless nbuf is a
stronger limit, as I think it is in FreeBSD). I think the worst case
is closer to alternating allocations of 4K blocks and holes; then
there's no way to allocate an 8K block. This case is almost handled by
the old hack to 0.1:

    bufpages = min( NKMEMCLUSTERS*2/5, bufpages );

This allows for NKMEMCLUSTERS*2/5 allocated blocks and the same number
of holes, so at most 4/5 of the address space is allocated for the
buffer cache. The remaining 1/5 of the address space will usually
provide a free block to coalesce with one of the holes to produce an
8K hole. The original version of the hack:

    bufpages = min( NKMEMCLUSTERS/2, bufpages );

is not so good because there is no remaining 1/5 of the address space.
>Again, the limit on the total number of buffers makes this problem null.
>
>FreeBSD's malloc code no longer holds on to freed page-sized allocations
>as the code in 386BSD did. This makes all the difference.

The bufpages limit should have been 5 times lower in 0.1 to handle
this problem for MAXBSIZE = 8K! Memory was wasted in the malloc
buckets for sizes 512, 1K, 2K, 4K and 8K (5 sizes gives the factor
of 5).

There's still the problem of internal fragmentation. The worst case
for the buffer cache is nbuf pages allocated for 512-byte fragments.
Then 8 times as much space will be allocated as when 512-byte
fragments are packed, so the allocbufspace limit is not much help.
Most versions of 386BSD depend on the nbuf limit being much stronger
than the allocbufspace limit for this case.

I don't use 8K file systems, but I did a lot of testing on DOS file
systems with block sizes of 512 and 2K while testing my fixes for
these problems. The limit on nbuf is unacceptable when the block size
is 512 and there are a lot of fragments. E.g., nbuf = 128 to suit a 1M
cache for an 8K file system reduces you to a 64K "cache" for 512-byte
file systems. I use nbuf = allocbufspace / 512 (which is equivalent to
no limit). One problem with my version is that this results in a lot
of empty buffers that clog up the LRU list.

-- 
Bruce Evans                bde@kralizec.zeta.org.au