*BSD News Article 86890

Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!nntp.coast.net!howland.erols.net!worldnet.att.net!news.mathworks.com!newsfeed.internetmci.com!demos!news1.best.com!nntp1.best.com!not-for-mail
From: dillon@flea.best.net (Matt Dillon)
Newsgroups: comp.unix.bsd.freebsd.misc,comp.arch,comp.benchmarks,comp.sys.super
Subject: Re: benchmarking discussion at Usenix?
Date: 15 Jan 1997 16:50:45 -0800
Organization: BEST Internet Communications, Inc.
Lines: 44
Distribution: inet
Message-ID: <5bju15$6d5@flea.best.net>
References: <5am7vo$gvk@fido.asd.sgi.com> <32D3EE7E.794B@nas.nasa.gov> <32D53CB1.41C6@mti.sgi.com> <32DAD735.59E2@nas.nasa.gov>
NNTP-Posting-Host: flea.best.net
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:34100 comp.arch:62447 comp.benchmarks:18905 comp.sys.super:6857

:In article <32DAD735.59E2@nas.nasa.gov>,
:Hugh LaMaster  <lamaster@nas.nasa.gov> wrote:
:>Dror Maydan wrote:
:>
:>> One more interesting category is the latency accessing objects bigger
:>> than 4 bytes.  On many cache based machines accessing everything in a
:>> cache line is just as fast as accessing one element.  I've never seen
:>> measurements, but my guess is that many data elements in compilers are
:>> bigger than 4 bytes; i.e., spatial locality works for compilers.
:>
:>Well, optimum cache line sizes have been studied extensively.
:>I'm sure there must be tables in H&P et al. showing hit rate
:>as a function of line size and total cache size.  For reasonably
:>large caches, I think the optimum used to be near 16 Bytes for 
:>32-bit byte-addressed machines.  I don't know that I have seen more 
:>recent tables for 64-bit code on, say, Alpha, but my guess is that
:>32 bytes is probably superior to 16 bytes given the larger address 
:>sizes, not to mention alignment considerations.  Just a guess.
:>Also, we often (but not always) have two levels of cache now,
:>and sometimes three, and the optimum isn't necessarily the
:>same on all three.  Numbers, anyone?

    The speed at which a program can access memory is limited by
    the maximum size of the data object you can read or write with
    a single load or store instruction, which is usually a long or
    quad word (4 or 8 bytes).  For the most part it is unrelated to
    the cache line size.

    What IS related to the cache line size is the memory-to-cache
    and secondary-to-primary cache bandwidth.  When you run an
    instruction that reads data element N into a register, the
    processor may end up transferring elements N+1, N+2, etc.
    into the primary cache at the same time, but you still have to
    issue instructions to read those elements to actually get hold
    of them.
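
    A rough sketch of that distinction, counting line fills and load
    instructions separately for a sequential scan.  The function names
    and the sizes used (32- and 64-byte lines, 4-byte words) are
    assumptions picked for illustration, not measurements of any real
    machine.

```python
def loads_per_line(line_bytes, word_bytes):
    """Load instructions needed to read every word held in one cache line."""
    if line_bytes <= 0 or word_bytes <= 0 or line_bytes % word_bytes != 0:
        raise ValueError("line size must be a positive multiple of the word size")
    return line_bytes // word_bytes


def fills_and_loads(total_bytes, line_bytes, word_bytes):
    """Return (line fills, load instructions) for a sequential scan.

    Assumes an aligned buffer and a cold cache, so each line is
    fetched exactly once (a simplification).
    """
    fills = -(-total_bytes // line_bytes)   # ceiling division
    loads = -(-total_bytes // word_bytes)   # one load per word
    return fills, loads


if __name__ == "__main__":
    for line in (32, 64):
        print(line, fills_and_loads(4096, line, 4))
```

    Doubling the line size halves the number of line fills for the
    4096-byte scan, but the number of load instructions stays at 1024
    because it depends only on the word size.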

    The cache line size is also a topological tradeoff in the design
    of the cache memory.  The larger the line size, the fewer tag
    bits you need AND the higher the ratio of data bits to tag bits.
    It's a two way street, though... if the cache line size is too
    large, you lose efficiency due to data address collisions.
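
    The tag arithmetic can be sketched as follows, assuming a
    direct-mapped cache, 32-bit byte addresses, power-of-two sizes and
    one tag per line, and ignoring valid/dirty bits.  The function
    name and the 8 KB cache size are hypothetical, chosen only to make
    the numbers concrete.

```python
def tag_overhead(cache_bytes, line_bytes, addr_bits=32):
    """Return (total tag bits, data-bits : tag-bits ratio).

    Models a direct-mapped cache with one tag per line; valid and
    dirty bits are ignored (an assumption of this sketch).
    """
    for size in (cache_bytes, line_bytes):
        if size <= 0 or size & (size - 1):
            raise ValueError("sizes must be powers of two")
    if line_bytes > cache_bytes:
        raise ValueError("line cannot be larger than the cache")

    num_lines = cache_bytes // line_bytes
    offset_bits = line_bytes.bit_length() - 1   # log2(line size)
    index_bits = num_lines.bit_length() - 1     # log2(number of lines)
    tag_width = addr_bits - index_bits - offset_bits

    total_tag_bits = num_lines * tag_width
    data_bits = cache_bytes * 8
    return total_tag_bits, data_bits / total_tag_bits


if __name__ == "__main__":
    for line in (16, 32, 64):
        print(line, tag_overhead(8192, line))
```

    Under these assumptions an 8 KB cache needs 9728 tag bits with
    16-byte lines and 4864 with 32-byte lines, so the data : tag ratio
    doubles from about 6.7 to about 13.5.  The sketch does not model
    the other side of the tradeoff: fewer, larger lines mean more
    addresses competing for each line.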

						-Matt