[RFC v3 1/1] ns: add binfmt_misc to the user namespace

Jann Horn jannh at google.com
Mon Oct 8 10:58:33 UTC 2018


On Sat, Oct 6, 2018 at 8:04 AM Andrei Vagin <avagin at gmail.com> wrote:
> On Thu, Oct 04, 2018 at 12:50:22AM +0200, Laurent Vivier wrote:
> > This patch allows to have a different binfmt_misc configuration
> > for each new user namespace. By default, the binfmt_misc configuration
> > is the one of the host, but if the binfmt_misc filesystem is mounted
> > in the new namespace a new empty binfmt instance is created and used
> > in this namespace.
> >
> > For instance, using "unshare" we can start a chroot of an another
> > architecture and configure the binfmt_misc interpreter without being root
> > to run the binaries in this chroot.
> >
> > Signed-off-by: Laurent Vivier <laurent at vivier.eu>
> > ---
> >  fs/binfmt_misc.c               | 85 +++++++++++++++++++++++-----------
> >  include/linux/user_namespace.h | 15 ++++++
> >  kernel/user.c                  | 14 ++++++
> >  kernel/user_namespace.c        |  9 ++++
> >  4 files changed, 95 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
> > index aa4a7a23ff99..78780bc87506 100644
> > --- a/fs/binfmt_misc.c
> > +++ b/fs/binfmt_misc.c
> > @@ -38,9 +38,6 @@ enum {
> >       VERBOSE_STATUS = 1 /* make it zero to save 400 bytes kernel memory */
> >  };
> >
> > -static LIST_HEAD(entries);
> > -static int enabled = 1;
> > -
> >  enum {Enabled, Magic};
> >  #define MISC_FMT_PRESERVE_ARGV0 (1 << 31)
> >  #define MISC_FMT_OPEN_BINARY (1 << 30)
> > @@ -60,10 +57,7 @@ typedef struct {
> >       struct file *interp_file;
> >  } Node;
> >
> > -static DEFINE_RWLOCK(entries_lock);
> >  static struct file_system_type bm_fs_type;
> > -static struct vfsmount *bm_mnt;
> > -static int entry_count;
> >
> >  /*
> >   * Max length of the register string.  Determined by:
> > @@ -85,13 +79,13 @@ static int entry_count;
> >   * if we do, return the node, else NULL
> >   * locking is done in load_misc_binary
> >   */
> > -static Node *check_file(struct linux_binprm *bprm)
> > +static Node *check_file(struct user_namespace *ns, struct linux_binprm *bprm)
> >  {
> >       char *p = strrchr(bprm->interp, '.');
> >       struct list_head *l;
> >
> >       /* Walk all the registered handlers. */
> > -     list_for_each(l, &entries) {
> > +     list_for_each(l, &ns->binfmt_ns->entries) {
> >               Node *e = list_entry(l, Node, list);
> >               char *s;
> >               int j;
> > @@ -133,17 +127,18 @@ static int load_misc_binary(struct linux_binprm *bprm)
> >       struct file *interp_file = NULL;
> >       int retval;
> >       int fd_binary = -1;
> > +     struct user_namespace *ns = current_user_ns();
> >
> >       retval = -ENOEXEC;
> > -     if (!enabled)
> > +     if (!ns->binfmt_ns->enabled)
> >               return retval;
> >
> >       /* to keep locking time low, we copy the interpreter string */
> > -     read_lock(&entries_lock);
> > -     fmt = check_file(bprm);
> > +     read_lock(&ns->binfmt_ns->entries_lock);
>
> It looks like ns->binfmt_ns isn't protected by any lock and
> ns->binfmt_ns can be changed between read_lock() and read_unlock().
>
> This can be fixed if ns->binfmt_ns will be dereferenced only once in
> this function:
>
>         struct binfmt_namespace *binfmt_ns = ns->binfmt_ns;

Technically, wouldn't you want READ_ONCE(ns->binfmt_ns)?


More information about the Containers mailing list