[PATCH v5 00/11] FUSE mounts from non-init user namespaces

Dongsu Park dongsu at kinvolk.io
Fri Dec 22 14:32:24 UTC 2017

This patchset v5 is based on work by Seth Forshee and Eric Biederman.
The latest patchset was v4:

At the moment, filesystems backed by physical medium can only be mounted
by real root in the initial user namespace. This restriction exists
because if it's allowed for root user in non-init user namespaces to
mount the filesystem, then it effectively allows the user to control the
underlying source of the filesystem. In case of FUSE, the source would
mean any underlying device.

However, in many use cases such as containers, it's necessary to allow
filesystems to be mounted from non-init user namespaces. Goal of this
patchset is to allow FUSE filesystems to be mounted from non-init user
namespaces. Support for other filesystems like ext4 are not in the
scope of this patchset.

Let me describe how to test mounting from non-init user namespaces. It's
assumed that tests are done via sshfs, a userspace filesystem based on
FUSE with ssh as backend. Testing system is Fedora 27.

$ sudo dnf install -y sshfs
$ sudo mkdir -p /mnt/userns

### workaround to get the sshfs permission checks
$ sudo chown -R $UID:$UID /etc/ssh/ssh_config.d /usr/share/crypto-policies

$ unshare -U -r -m
# sshfs root at localhost: /mnt/userns

### You can see sshfs being mounted from a non-init user namespace
# mount | grep sshfs
root at localhost: on /mnt/userns type fuse.sshfs

# touch /mnt/userns/test
# ls -l /mnt/userns/test
-rw-r--r-- 1 root root 0 Dec 11 19:01 /mnt/userns/test

Open another terminal, check the mountpoint from outside the namespace.

$ grep userns /proc/$(pidof sshfs)/mountinfo
131 102 0:35 / /mnt/userns rw,nosuid,nodev,relatime - fuse.sshfs
root at localhost: rw,user_id=0,group_id=0

After all tests are done, you can unmount the filesystem
inside the namespace.

# fusermount -u /mnt/userns

Changes since v4:
 * Remove other parts like ext4 to keep the patchset minimal for FUSE
 * Add and change commit messages
 * Describe how to test non-init user namespaces

 * Think through potential security implications. There are 2 patches
   being prepared for security issues. One is "ima: define a new policy
   option named force" by Mimi Zohar, which adds an option to specify
   that the results should not be cached:
   The other one is to basically prevent FUSE results from being cached,
   which is still in progress.

 * Test IMA/LSMs. Details are written in

Patches 1-2 deal with an additional flag of lookup_bdev() to check for
additional inode permission.

Patches 3-7 allow the superblock owner to change ownership of inodes, and
deal with additional capability checks w.r.t user namespaces.

Patches 8-10 allow FUSE filesystems to be mounted outside of the init
user namespace.

Patch 11 handles a corner case of non-root users in EVM.

The patchset is also available in our github repo:

Eric W. Biederman (1):
  fs: Allow superblock owner to change ownership of inodes

Seth Forshee (10):
  block_dev: Support checking inode permissions in lookup_bdev()
  mtd: Check permissions towards mtd block device inode when mounting
  fs: Don't remove suid for CAP_FSETID for userns root
  fs: Allow superblock owner to access do_remount_sb()
  capabilities: Allow privileged user in s_user_ns to set security.*
  fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
  fuse: Support fuse filesystems outside of init_user_ns
  fuse: Restrict allow_other to the superblock's namespace or a
  fuse: Allow user namespace mounts
  evm: Don't update hmacs in user ns mounts

 drivers/md/bcache/super.c           |  2 +-
 drivers/md/dm-table.c               |  2 +-
 drivers/mtd/mtdsuper.c              |  6 +++++-
 fs/attr.c                           | 34 ++++++++++++++++++++++++++--------
 fs/block_dev.c                      | 13 ++++++++++---
 fs/fuse/cuse.c                      |  3 ++-
 fs/fuse/dev.c                       | 11 ++++++++---
 fs/fuse/dir.c                       | 16 ++++++++--------
 fs/fuse/fuse_i.h                    |  6 +++++-
 fs/fuse/inode.c                     | 35 +++++++++++++++++++++--------------
 fs/inode.c                          |  6 ++++--
 fs/ioctl.c                          |  4 ++--
 fs/namespace.c                      |  4 ++--
 fs/proc/base.c                      |  7 +++++++
 fs/proc/generic.c                   |  7 +++++++
 fs/proc/proc_sysctl.c               |  7 +++++++
 fs/quota/quota.c                    |  2 +-
 include/linux/fs.h                  |  2 +-
 kernel/user_namespace.c             |  1 +
 security/commoncap.c                |  8 ++++++--
 security/integrity/evm/evm_crypto.c |  3 ++-
 21 files changed, 127 insertions(+), 52 deletions(-)


More information about the Containers mailing list