[PATCHv3] locks: Filter /proc/locks output on proc pid ns

Eric W. Biederman ebiederm at xmission.com
Wed Aug 3 21:09:42 UTC 2016


Jeff Layton <jlayton at poochiereds.net> writes:

> On Wed, 2016-08-03 at 11:23 -0500, Eric W. Biederman wrote:
>> Nikolay Borisov <kernel at kyup.com> writes:
>> 
>> > 
>> > On busy container servers reading /proc/locks shows all the locks
>> > created by all clients. This can cause large latency spikes. In my
>> > case I observed lsof taking up to 5-10 seconds while processing
>> > around 50k locks. Fix this by limiting the locks shown only to those
>> > created in the same pidns as the one the proc fs was mounted in. When
>> > reading /proc/locks from the init_pid_ns proc instance then perform
>> > no filtering.
>> 
>> If we are going to do this, this should be a recursive belonging test
>> (because pid namespaces are recursive).
>> 
>> Right now the test looks like it will filter out child pid namespaces.
>> 
>> Special casing the init_pid_ns should be an optimization, not something
>> that is necessary for correctness (as it appears here).
>> 
>> Eric
>> 
>> 
>
> Ok, thanks. I'm still not that namespace savvy -- so there's a
> hierarchy of pid_namespaces?

There is.

> If so, then yeah, that does sound better. Is there an interface that
> allows you to tell whether a pid is a descendant of a particular
> pid_namespace?

Yes.  And each pid has an array of the pid namespaces it is in, so it is
an O(1) operation to see if that struct pid is in a pid namespace.
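
As a rough illustration of why the check is O(1), here is a small
userspace model, not the kernel code itself.  The names pid_ns_model,
pid_model, pid_model_init and pid_model_in_ns are made up for this
sketch, and MAX_LEVEL is an arbitrary bound.  It assumes each namespace
knows its depth and each pid records the namespace it belongs to at
every depth, so membership is one bounds check plus one array lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LEVEL 8 /* arbitrary nesting bound for this model only */

/* Hypothetical model of a pid namespace: its depth and its parent. */
struct pid_ns_model {
	unsigned int level;           /* 0 for the initial namespace */
	struct pid_ns_model *parent;  /* NULL for the initial namespace */
};

/* Hypothetical model of a pid: one namespace entry per depth. */
struct pid_model {
	unsigned int level;                 /* depth of the owning namespace */
	struct pid_ns_model *ns[MAX_LEVEL]; /* ns[i] is the ancestor at depth i */
};

/* Record the owning namespace and all of its ancestors in the pid. */
static void pid_model_init(struct pid_model *pid, struct pid_ns_model *ns)
{
	struct pid_ns_model *cur;

	assert(ns->level < MAX_LEVEL);
	pid->level = ns->level;
	for (cur = ns; cur != NULL; cur = cur->parent)
		pid->ns[cur->level] = cur;
}

/*
 * O(1) membership test: a pid is visible in ns only if ns is no deeper
 * than the pid and the pid's entry at that depth is exactly ns.
 */
static bool pid_model_in_ns(const struct pid_model *pid,
			    const struct pid_ns_model *ns)
{
	return ns->level <= pid->level && pid->ns[ns->level] == ns;
}
```

With this model, a pid created in a grandchild namespace is reported as
being in the grandchild, its parent, and the initial namespace, but not
in a sibling of its parent.  That is the recursive belonging behaviour
discussed above, and the initial namespace needs no special case.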

Dumb question: does anyone know the difference between fl_nspid and
fl_pid off the top of your head?  I am looking at the code and I am
confused about why we have both.  I am afraid that there was some
sloppiness when the pid namespace was implemented and this was the
result.  I remember that file locks were a rough spot during the
conversion but I don't recall the details off the top of my head.

Eric

