Containers and /proc/sys/vm/drop_caches

Matt Helsley matthltc at
Thu Jan 6 13:43:15 PST 2011

On Wed, Jan 05, 2011 at 07:46:17PM +0530, Balbir Singh wrote:
> On Wed, Jan 5, 2011 at 7:31 PM, Serge Hallyn <serge.hallyn at> wrote:
> > Quoting Daniel Lezcano (daniel.lezcano at
> >> On 01/05/2011 10:40 AM, Mike Hommey wrote:
> >> >[Copy/pasted from a previous message to lkml, where it was suggested to
> >> >  try containers@]
> >> >
> >> >Hi,
> >> >
> >> >I noticed that from within a lxc container, writing "3" to
> >> >/proc/sys/vm/drop_caches would flush the host page cache. That sounds a
> >> >little dangerous for VPS offerings that would be based on lxc, as in one
> >> >VPS instance root user could impact the overall performance of the host.
> >> >I don't know about other containers but I've been told openvz isn't
> >> >subject to this problem.
> >> >I only tested the current Debian Squeeze kernel, which is based on
> >> >
> >>
> >> There is definitively a big work to do with /proc.
> >>
> >> Some files should be not accessible (/proc/sys/vm/drop_caches,
> >> /proc/sys/kernel/sysrq, ...) and some other should be virtualized
> >> (/proc/meminfo, /proc/cpuinfo, ...).
> >>
> >> Serge suggested to create something similar to the cgroup device
> >> whitelist but for /proc, maybe it is a good approach for denying
> >> access a specific proc's file.
> >
> > Long-term, user namespaces should fix this - /proc will be owned
> > by the user namespace which mounted it, but we can tell proc to
> > always have some files (like drop_caches) be owned by init_user_ns.
> >
> > I'm hoping to push my final targeted capabilities prototype in the
> > next few weeks, and after that I start seriously attacking VFS
> > interaction.
> >
> > In the meantime, though, you can use SELinux/Smack, or a custom
> > cgroup file does sound useful.  Can cgroups be modules nowadays?
> > (I can't keep up)  If so, an out of tree proc-cgroup module seems
> > like a good interim solution.
> >
> Ideally a drop_cache should drop page cache in that container, but
> given container have a lot of shared page cache, what is suggested
> might be a good way to work around the problem

One gross hack that comes to mind: instead of a hard permission model,
limit the frequency with which the container can actually drop caches.
Then the container's ability to interfere with host performance is more
limited (but still non-zero). Or limit the frequency on a per-user basis
(more like Serge's design), because running more containers from a
compromised user account shouldn't allow more frequent cache dropping.
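
To make the per-user variant concrete, here is a minimal userspace
sketch in Python. It is not kernel code and not a proposed patch. The
names DropCachesLimiter, min_interval and allow(), and the injectable
clock, are all hypothetical and only illustrate the bookkeeping:
requests are keyed by uid rather than by container, so starting more
containers under the same account does not buy more drops.

```python
import time


class DropCachesLimiter:
    """Hypothetical sketch: allow at most one drop_caches request per
    `min_interval` seconds for each user ID."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the policy can be tested
        self._last = {}         # uid -> time of the last allowed request

    def allow(self, uid):
        """Return True if `uid` may drop caches now, False if it is too soon."""
        now = self.clock()
        last = self._last.get(uid)
        if last is not None and now - last < self.min_interval:
            return False        # still inside this user's quiet period
        self._last[uid] = now   # record only requests that were allowed
        return True
```

A caller would check allow(uid) before honouring a write to
drop_caches and refuse or ignore the write otherwise. Keying on the
uid is the design choice that matters here; the interval itself would
be a tunable.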

That said, the more important question is: why should we provide
drop_caches inside a container at all? My understanding is that it's
largely a workload-debugging tool and not something meant to truly
solve problems. If that's the case, then either we shouldn't provide it
in a container at all, or it should actually interfere with the host
cache.

	-Matt Helsley
