Xref: sserve comp.bugs.4bsd:1944 comp.os.386bsd.bugs:565
Newsgroups: comp.bugs.4bsd,comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!network.ucsd.edu!mvb.saic.com!unogate!news.service.uci.edu!usc!wupost!uunet!mcsun!sun4nl!eur.nl!pk
From: pk@cs.few.eur.nl (Paul Kranenburg)
Subject: Re: flock broken - I could use some help
Message-ID: <1993Apr26.141928.26083@cs.few.eur.nl>
Sender: news@cs.few.eur.nl
Reply-To: pk@cs.few.eur.nl
Organization: Erasmus University Rotterdam
References: <C5t8wH.Hs@moxie.hou.tx.us> <1993Apr21.184636.1121@cs.few.eur.nl> <C5yDEn.JCt@ns1.nodak.edu>
Date: Mon, 26 Apr 1993 14:19:28 GMT
Lines: 59

In <C5yDEn.JCt@ns1.nodak.edu> tinguely@plains.NoDak.edu (Mark Tinguely) writes:

>I think the lock wakeup routines clean the block structures out. I think the
>problem was a close() and the master's unlock cause a race and a double
>release. I wanted to make sure only one release (based on the ip->i_lockf),
>but all of this gets done in lf_clearlock(), so I used that.
> --------------
>*** /sys/ufs/ufs_lockf.c.orig	Fri Apr 23 14:02:27 1993
>--- /sys/ufs/ufs_lockf.c	Fri Apr 23 14:35:36 1993
>***************
>*** 155,161 ****
>  		}
>  #endif /* LOCKF_DEBUG */
>  		if (error = tsleep((caddr_t)lock, priority, lockstr, 0)) {
>! 			free(lock, M_LOCKF);
>  			return (error);
>  		}
>  	}
>--- 155,162 ----
>  		}
>  #endif /* LOCKF_DEBUG */
>  		if (error = tsleep((caddr_t)lock, priority, lockstr, 0)) {
>! 			/* free(lock, M_LOCKF); */
>! 			(void) lf_clearlock(lock);
>  			return (error);
>  		}
>  	}
>
>I could not cause my machine to panic again after installing this change.
>I sent it also to Greg Hackney to test before releasing it.

Besides preventing your machine from crashing, it also causes memory to
leak away each time a process is interrupted while waiting for a lock
(look at the output of `vmstat -m' for evidence).

Let's see what happens in this scenario. For the sake of argument, assume
that locks are requested for the whole file. We start with an inode without
any locks. The first process's locking request is granted without the need
to wait: its lock structure is added to the inode's i_lockf list. Along
comes process 2, which now finds the overlapping lock and decides to wait
for it to go away. In order to be woken up when the first process releases
its lock, process 2 adds its lock structure to the list of waiting locks
(this list hangs off the lockholder's lock structure) and goes to sleep.

Next, process 2's sleep is interrupted by some signal. A call to
lf_clearlock() would remove from the inode's list any lock in the given
range *belonging* to the calling process. Since this process was not
holding a lock yet but merely waiting for one, nothing happens. Since
process 2's lock structure isn't freed either, this piece of memory is
lost until the next panic.

Just freeing the lock is also wrong, because a pointer to it is still on
the lockholder's blocked list. When process 1 finally releases its lock,
it first removes (and frees) its lock on the i_lockf list and then walks
its blocked list to wake up all processes waiting to acquire a lock. Your
machine may crash there and then when handling the bogus list entry, or
it may just crash later under mysterious circumstances.

-pk
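
To make the failure mode concrete, here is roughly what the wakeup walk
looks like in the Net/2-derived ufs_lockf.c. This is a from-memory sketch,
not the verbatim source; it assumes blocked locks are chained through
their lf_block fields off the holder's lock structure, as described above:

	lf_wakelock(listhead)
		struct lockf *listhead;
	{
		register struct lockf *wakelock;

		/* Walk the chain of locks blocked on `listhead'. */
		while (listhead->lf_block != NOLOCKF) {
			wakelock = listhead->lf_block;
			/* Unlink the waiter from the chain. */
			listhead->lf_block = wakelock->lf_block;
			wakelock->lf_block = NOLOCKF;
			wakelock->lf_next = NOLOCKF;
			wakeup((caddr_t)wakelock);
		}
	}

If an interrupted sleeper has already free()d its lock structure, every
one of those dereferences of `wakelock' touches freed memory; that is
the panic described above.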
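
A fix along these lines should plug the leak without leaving a dangling
pointer. This is an untested sketch; it assumes, as in the Net/2 code,
that lf_setlock() records the blocking lock in lock->lf_next before
sleeping and that lf_wakelock() resets lf_next to NOLOCKF when it takes
a waiter off the chain:

	if (error = tsleep((caddr_t)lock, priority, lockstr, 0)) {
		register struct lockf *lf;

		/*
		 * If lf_next is still set, a signal interrupted us
		 * while we were still on the blocker's chain: unlink
		 * our lock before freeing it.  If lf_next is NOLOCKF,
		 * lf_wakelock() has already removed us.
		 */
		for (lf = lock->lf_next; lf != NOLOCKF; lf = lf->lf_block)
			if (lf->lf_block == lock) {
				lf->lf_block = lock->lf_block;
				break;
			}
		free(lock, M_LOCKF);
		return (error);
	}

The lf_next test also covers the race with a simultaneous release: a
waiter that has already been woken and unlinked finds lf_next == NOLOCKF
and merely frees its own structure. The fix that later appeared in 4.4BSD
takes essentially this shape, there with a doubly-linked blocked list and
a TAILQ_REMOVE instead of the hand-rolled chain walk.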