Detecting the use of a mount in another namespace
Eric W. Biederman
ebiederm at xmission.com
Sun Jan 18 17:51:11 UTC 2015
Alexander Larsson <alexl at redhat.com> writes:
> On Thu, 2015-01-15 at 10:34 +0000, Daniel P. Berrange wrote:
>> On Thu, Jan 15, 2015 at 09:56:05AM +0100, Alexander Larsson wrote:
>> > This is a bit of a weird request, but I'm working on an app sandboxing
>> > system where each container gets /usr read-only bind mounted from a
>> > hardlinked tree. When I update the /usr tree I write the new tree to a
>> > different directory, which avoids affecting any currently running apps
>> > against the old one.
>> > However, after updating I'd like to clean out the old version if it is
>> > not in use. I had a plan for this:
>> > 1) Move the old usr to a "has been deleted" location
>> > 2) Try to remove a file inside the usr (say ".ref") which the app when
>> > running has bind-mounted somewhere
>> > 3) if the remove returned EBUSY, then the usr is in use.
>> > However, with the recent changes to the semantics in this area this
>> > doesn't work. The remove always succeeds even if the file is mounted in
>> > some other namespace.
>> > I realize that this is better semantics in general, but that was a quite
>> > useful hack. Is there any other similar way I can detect that something
>> > is in use in "any other namespace"?
>> Presumably you want something more efficient than scanning /proc/$PID in
>> the host OS? E.g. you read /proc/$PID/mounts for each process, then iterate,
>> stat()ing /proc/$PID/root/<mount> to look up the st_dev+st_ino of the mount
>> location to see if the one you care about still exists in any process?
>> Not really going to scale nicely with large numbers of $PIDs, so perhaps
>> you could short circuit by keeping track of your container pid leaders ?
> Yeah, that doesn't sound very efficient. Keeping track of the pids is a
> bit painful, since the containers are not launched or monitored from
> some central place. Maybe there just is no good way to do this anymore.
> Just wanted to ask here to make sure I didn't miss any possibility.
The way I would recommend is to give each of your containers a read-only
snapshot of /usr, and then delete that snapshot when done.
    cp -ldr /usr /usr-snapshot
    # Some time later when you are done
    rm -rf /usr-snapshot
There are more elegant ways (btrfs snapshots etc) but the above will
work on every filesystem that supports hardlinks.
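A minimal sketch of the snapshot approach as two shell helpers, one to
call when a container starts and one when it exits. The function names
and the per-container path are hypothetical, and the flags assume a cp
that supports -l, -d and -r (GNU coreutils does). Because the snapshot
shares inodes with the source tree, this only isolates the container if
updates replace files rather than rewrite them in place.

```shell
#!/bin/sh
# Hypothetical helpers; names and paths are illustrative only.

# make_snapshot SRC DST: build DST as a hardlinked copy of SRC.
# -l makes hard links instead of copying data, -d keeps symlinks as
# symlinks, -r recurses. SRC and DST must be on the same filesystem.
make_snapshot() {
    cp -ldr "$1" "$2"
}

# drop_snapshot DST: remove the snapshot once the container is done.
# Only the extra links go away; file data that another tree still
# links to is left untouched.
drop_snapshot() {
    rm -rf -- "$1"
}
```

Usage would look something like "make_snapshot /usr /usr-snapshot.$$"
before launching the container and "drop_snapshot /usr-snapshot.$$"
after it exits. Once no snapshot links to an old file any more, deleting
the old tree frees its data, so no cross-namespace check is needed.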
As for what you were trying to do with mounts: in the general case the
kernel has never had enough information to do it. Think of remote
filesystems like NFS. Information from remote filesystems about who, if
anyone, has a mountpoint somewhere simply does not propagate between
kernels.