[ISSUE] mm: Add a user_ns owner to mm_struct and fix ptrace_may_access

Cyrill Gorcunov gorcunov at gmail.com
Mon Oct 24 20:29:25 UTC 2016


On Mon, Oct 24, 2016 at 02:01:30PM -0500, Eric W. Biederman wrote:
> 
> Adding the containers list because that is the general place for these
> kinds of discussions.

Thanks!

> Cyrill Gorcunov <gorcunov at gmail.com> writes:
> 
> > Hi Eric! A few days ago we noticed that our zombie00 test case started
> > failing: https://ci.openvz.org/job/CRIU/view/All/job/CRIU-linux-next/406/console
> 
> > ---
> > ======================== Run zdtm/static/zombie00 in h =========================
> > Start test
> > ./zombie00 --pidfile=zombie00.pid --outfile=zombie00.out
> > Run criu dump
> > Run criu restore
> > Send the 15 signal to  30
> > Wait for zdtm/static/zombie00(30) to die for 0.100000
> > ################ Test zdtm/static/zombie00 FAIL at result check ################
> >
> > I've narrowed problem down to commit
> >
> >  | From ce99dd5fd5f600f9f4f0d37bb8847c1cb7c6e4fc Mon Sep 17 00:00:00 2001
> >  | From: "Eric W. Biederman" <ebiederm at xmission.com>
> >  | Date: Thu, 13 Oct 2016 21:23:16 -0500
> >  | Subject: [PATCH] mm: Add a user_ns owner to mm_struct and fix
> >  |  ptrace_may_access
> >  |
> >  | During exec dumpable is cleared if the file that is being executed is
> >  | not readable by the user executing the file.  A bug in
> >  | ptrace_may_access allows reading the file if the executable happens to
> >  | enter into a subordinate user namespace (aka clone(CLONE_NEWUSER),
> >  | unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER).
> >
> > and the reason is that zombie tasks do not have task::mm and as a result
> > we're obtaining -EPERM when trying to read task->exit_code from
> > /proc/pid/stat.
> 
> Hmm.  As I read the code exit_code should be returned to userspace as a
> 0.  It does not look to me as if userspace should see an error in
> that case.

I mean the ptrace check returns -EPERM, so we don't see @exit_code.
Sorry for the confusion.
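
For reference, here is a minimal userspace sketch of how a tool could pull
exit_code out of a /proc/<pid>/stat line. It assumes exit_code is field 52,
as documented in proc(5). The helper name stat_line_exit_code is made up for
this illustration and is not taken from CRIU. Note that, per the do_task_stat
lines quoted further down, the kernel prints 0 in this field when the ptrace
check fails, so a reader cannot tell a denied check from a real exit code of 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* exit_code is field 52 of /proc/<pid>/stat according to proc(5). */
#define STAT_EXIT_CODE_FIELD 52

/*
 * Hypothetical helper (illustration only): extract exit_code from one
 * line of /proc/<pid>/stat. Returns 0 on success, -1 if the line is
 * malformed or has fewer than 52 fields.
 */
static int stat_line_exit_code(const char *line, long *exit_code)
{
	const char *p;
	char *end;
	int field = 2;	/* pid and comm are fields 1 and 2 */

	assert(line != NULL && exit_code != NULL);

	/* comm may itself contain spaces and ')', so search from the end. */
	p = strrchr(line, ')');
	if (!p)
		return -1;
	p++;

	for (;;) {
		while (*p == ' ')
			p++;
		if (*p == '\0' || *p == '\n')
			return -1;	/* ran out of fields */
		field++;
		if (field == STAT_EXIT_CODE_FIELD) {
			*exit_code = strtol(p, &end, 10);
			return end == p ? -1 : 0;
		}
		/* Skip over the current field. */
		while (*p && *p != ' ' && *p != '\n')
			p++;
	}
}
```

A caller would read the stat file into a buffer and pass it to this helper;
the returned value is the raw exit_code as the kernel printed it.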

> 
> > Looking into the commit, I suspect that when mm == NULL we have to bring
> > back the test ptrace_has_cap(__task_cred(task)->user_ns, mode)?
> 
> Maybe.
> 
> We might want to consider if these lines from do_task_stat make
> any sense.
> 
> 	if (permitted)
> 		seq_put_decimal_ll(m, " ", task->exit_code);
> 	else
> 		seq_puts(m, " 0");
> 
> Looking at the code.  Nothing changes behavior except for privileged
> tasks looking at processes without an mm.  So yes it may be sane
> to tweak that part of the check.

I think so, otherwise we might break the API.

> AKA in the for-next branch the code currently says:
> 	mm = task->mm;
> 	if (!mm ||
> 	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
> 	     !ptrace_has_cap(mm->user_ns, mode)))
> 	    return -EPERM;
> 
> And in the case there is no mm there is no way to get
> past returning -EPERM.
> 
> Looking at why we use ptrace_may_access in the exit_code case
> I see a couple of relevant commits.
...
> 
> The commit that added task->exit_code:
> 
> commit 5b172087f99189416d5f47fd7ab5e6fb762a9ba3
> Author: Cyrill Gorcunov <gorcunov at openvz.org>
> Date:   Thu May 31 16:26:44 2012 -0700
> 
>     c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat
>     
>     We would like to have an ability to restore command line arguments and
>     program environment pointers but first we need to obtain them somehow.
>     Thus we put these values into /proc/$pid/stat.  The exit_code is needed to
>     restore zombie tasks.
>     
>     Signed-off-by: Cyrill Gorcunov <gorcunov at openvz.org>
>     Acked-by: Kees Cook <keescook at chromium.org>
>     Cc: Pavel Emelyanov <xemul at parallels.com>
>     Cc: Serge Hallyn <serge.hallyn at canonical.com>
>     Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu at jp.fujitsu.com>
>     Cc: Alexey Dobriyan <adobriyan at gmail.com>
>     Cc: Tejun Heo <tj at kernel.org>
>     Cc: Andrew Vagin <avagin at openvz.org>
>     Cc: Vasiliy Kulikov <segoon at openwall.com>
>     Cc: Alexey Dobriyan <adobriyan at gmail.com>
>     Cc: "Eric W. Biederman" <ebiederm at xmission.com>
>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> 

Yes, I added it for CRIU's sake.

> Looking at do_task_stat everything else that requires permitted
> in do_task_stat is an address.  exit_code is something else so
> I am not at all certain the ptrace_may_access permission check
> makes sense.

Well, I suspect @exit_code could let an attacker find out whether
accessing some address caused a SIGSEGV or something like that.

> 
> A process without an mm is fundamentally undumpable so an error should
> be returned in any case.  So I don't see any harm in failing
> ptrace_may_access in general.  At the same time I can see how not
> preserving the existing behavior is problematic.
> 
> So I am probably going to tweak the !mm case so that instead of failing
> we perform the old capable check in that case.  That seems the most
> certain way to avoid regressions.  With that said, why is exit_code
> behind a ptrace_may_access permission check?

Yes, this would be great! As for @exit_code, I think it is better to ask
Kees (CC'ed).
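
To make the difference concrete, here is a toy userspace model of the two
behaviours discussed above: the check as quoted from for-next, and one
possible reading of the proposed tweak, where a task without an mm falls back
to a capability check against the user namespace of its credentials. All
struct and function names below are simplified stand-ins invented for this
sketch. They are not the kernel definitions, and this is not the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SUID_DUMP_USER 1

/* Toy stand-ins for kernel objects; NOT the real kernel types. */
struct user_ns { bool caller_has_ptrace_cap; };
struct mm      { int dumpable; struct user_ns *user_ns; };
struct task    { struct mm *mm; struct user_ns *cred_user_ns; };

/* Stand-in for ptrace_has_cap(ns, mode): is the caller capable in ns? */
static bool has_cap(const struct user_ns *ns)
{
	return ns && ns->caller_has_ptrace_cap;
}

/* Models the for-next check quoted above: no mm always means -EPERM. */
static int check_for_next(const struct task *t)
{
	const struct mm *mm;

	assert(t != NULL);
	mm = t->mm;
	if (!mm ||
	    (mm->dumpable != SUID_DUMP_USER && !has_cap(mm->user_ns)))
		return -EPERM;
	return 0;
}

/*
 * Assumed shape of the tweak: without an mm (e.g. a zombie), fall back
 * to the capability check against the task's credential namespace.
 */
static int check_with_fallback(const struct task *t)
{
	const struct mm *mm;

	assert(t != NULL);
	mm = t->mm;
	if (!mm)
		return has_cap(t->cred_user_ns) ? 0 : -EPERM;
	if (mm->dumpable != SUID_DUMP_USER && !has_cap(mm->user_ns))
		return -EPERM;
	return 0;
}
```

In this model a privileged caller looking at a zombie is denied by
check_for_next() but allowed by check_with_fallback(), while tasks that still
have an mm are treated the same by both.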

	Cyrill