Xref: sserve comp.unix.bsd:12646 comp.os.linux:54902 comp.os.386bsd.misc:1028
Newsgroups: comp.unix.bsd,comp.os.linux,comp.os.386bsd.misc
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!dog.ee.lbl.gov!hellgate.utah.edu!hellgate!dbo
From: dbo%eroica.cs.utah.edu@cs.utah.edu (Doug Orr)
Subject: Re: shared libraries (was BSD UNIX)
Message-ID: <DBO.93Sep16161442@eroica.cs.utah.edu>
In-reply-to: terry@cs.weber.edu's message of Mon, 13 Sep 93 20:34:42 GMT
Organization: University of Utah Computer Science
References: <CCu0s1.29o@ssesco.com> <1993Sep7.162843.19294@fcom.cc.utah.edu> <33833@dog.ee.lbl.gov> <1993Sep13.203442.21808@fcom.cc.utah.edu>
Date: 16 Sep 93 16:14:42
Lines: 156

Hi, I noticed this and had a couple of comments.

   Subject: Re: shared libraries (was BSD UNIX)
   Message-ID: <1993Sep13.203442.21808@fcom.cc.utah.edu>
   Sender: news@fcom.cc.utah.edu
   Organization: Weber State University, Ogden, UT
   References: <CCu0s1.29o@ssesco.com> <1993Sep7.162843.19294@fcom.cc.utah.edu> <33833@dog.ee.lbl.gov>
   Date: Mon, 13 Sep 93 20:34:42 GMT

   In article <33833@dog.ee.lbl.gov> torek@horse.ee.lbl.gov (Chris Torek)
   writes:

   [ ... Static vs. Dynamic link tradeoffs ... ]

   >This makes the Utah approach (as described at the last USENIX) all the
   >more interesting.  In this case, `executing' a binary invokes the
   >linker, which can choose either to run a cached `pre-linked' version or
   >to construct a new one.  As Terry notes, most applications are run much
   >more often than they need re-linking (the shared libraries do not
   >change often).  Hence, the same cached `post-fixup' version can be
   >reused (saving time) and shared (saving space).  In effect, this is
   >the happy medium between `pure static' and `pure dynamic': resolved
   >on demand, then static until something forces a change.

   I thought that the numbers presented in the paper were, shall we say,
   optimistic, especially with regards to the relative frequency of cache
   hits.
   There is also the problem of converting the usage back to vanilla (or
   not so vanilla) C from the C++ required in their implementation.

I'm always in favor of optimism, but I'm not sure I understand your
comment.  If you're referring to the on-disk cache of prelinked
executables (the use of the word "cache" may have been unfortunate),
that's largely an (undefined) policy issue.  You can set it up in
whatever way works best for you.  If, for example, installing a program
(meta-object) *requires* generation of a cached copy (fully linked
executable), there will be a very good chance you'll have a copy when
you need one.  Similarly, the policy can dictate that only one cached
copy be maintained for a given set of executables/libraries, or
whatever, to avoid proliferation when in development mode.  The
requirement that a cached copy be generated at install time imposes on
the installer the cost of the final program link... not particularly
worse than a normal edit/compile/debug cycle with make and ld.

Pushing this off on policy is something of a copout, given that I have
never gen'd up a policy expressly for edit/compile/debug.  You can
certainly come up with boundary cases where a given policy won't work
too well.  But I do think it's feasible to serve the needs of
edit/compile/debug.

I don't understand the C/C++ comment.  OMOS is written in C++, but it
will work on any sort of collection of executables.  Various test
programs cited included alpha-one (several hundred thousand lines of C
code), ls, and xmh... we know what they're written in.  What am I
missing?
>The only thing that the choice of per-library address `constants' >affects is the ratio of successful `static-like' links to failed >`dynamic-like' links. For what it's worth, we do this "right". > >Assigning addresses to libraries would be the task of a program similar >to SunOS's `ldconfig'. A library that lacks a preselected address >would simply have one chosen dynamically. This would take more space, >to cache `preloaded' binaries and (at their preselected addresses) >libraries, but only those that are in fact used and/or only those that >meet some policy. Again, the fallback is, in essence, `pure dynamic' >linking; all else is merely optimization. The idea of attempting to place the tables at a known location in all binaries was the real interesting idea -- that way the post-fixup pages can be shared. The problem with assigned addresses remains, however... you still eat an address range for the library per version of the library. I don't think the work at the UofU went through very many generations of libraries, and so this problem didn't become evident. The dynamic fallback simply resolves the packing problem. Admittedly, this will alleviate the space issues somewhat, but with the excessively frequent revisions to libraries for systems undergoing active developement (like NetBSD/FreeBSD), this either implies a release authority with more frequent releases or a plethora of incompatable library images. Different versions can live at the same addresses (given that the average application won't access two versions of the same library, what with being unable to name things and all). The constraint system picks out a version that is compatible with the constraints given. The intent here was that OMOS would be the addressing authority, a given meta-object could provide a "hint" as to where it might be most desirably located, then OMOS would use a constraint system to resolve that desire with realities of what executables it has available, etc. 
   I think it is just as possible to get around the problems with fixed
   GOT locations; note that global data referenced by the libraries,
   unless it is const data, must be pushed into the process at initial
   symbol resolution time, even if the link is delayed (as it is in the
   Utah system).  This means that data references must go through the
   relocation table as well, and thus pushing the table out of the image
   will not be successful if there are external and internal symbol
   references taking place... this, in a nutshell, was the main
   impediment to using Pete's modification of Joerg's fixed-address
   libraries to produce workable X (or termcap) libraries.

Data is (are?) a real pain in the ass.  In a copy-on-write world, what
we do is push the global data up to the highest level used.  Stdio data
is linked in with libc, for example.  Applications are bound, directly
referencing "up".  This works as long as your libraries are basically
structured like a class with single inheritance.  If they are
structured more like a multiple-inheritance hierarchy, or there are
circular library references, you're screwed.  Happily, this is not
often the case.

Also, if you want to do data interposition (not unreasonable), you need
an indirection that's fixed up at load time.  As long as we stick to
our desired policy of not modifying compilers, this will be hard to
come by, so you're screwed there as well.  Since you can insert levels
of indirection, procedural interposition is possible, although mildly
painful.

We're looking at single address space systems, and data becomes even
more of a pain when COW is removed.  But we have plans there, as well.
   I think the basic justification for the Utah code comes from the
   ability to launch or relaunch an application rather quickly; this is
   not the highest item on my list of needs, and paying the
   per-application fixup penalty up front at exec time for the first
   instance of an application is acceptable, especially as a trade-off
   against incurring additional run-time overhead.

Good analysis.  A form of the libraries we didn't talk about as much
involves generating and exporting an executable image linked with what
amounts to a set of stubs that dynamically load the target library at
run time.  It amounts to the same sort of setup as the Sun shared
libraries.  The benefits of this approach include the fact that there
is a tangible amount of code available to the debugger (OMOS's other
mode, where it maintains maximum anal-retentive control of all code,
requires debugger changes to get at symbol tables), and for the user to
touch, cp, tar, or whatever.  The version of the library that is mapped
in can flex around at run time.  You pay more per access of a given
routine, but if you're in edit/compile/debug, who cares.

I didn't do that support, so I don't remember what we do about shared
variables... you can certainly ask OMOS where they are and reference
them that way.  I don't know if we do anything nicer.

I don't know if that addresses your concerns, but it's something else
that's available.

	-Doug