[PATCH v5 00/42] idmapped mounts

Darrick J. Wong djwong at kernel.org
Thu Jan 14 17:12:41 UTC 2021


On Tue, Jan 12, 2021 at 11:00:42PM +0100, Christian Brauner wrote:
> Hey everyone,
> 
> The only major change is the inclusion of hch's patch to port XFS to
> support idmapped mounts. Thanks to Christoph for doing that work.

Yay :)

> (For a full list of major changes between versions see the end of this
>  cover letter.
>  Please also note the large xfstests testsuite in patch 42 that has been
>  kept as part of this series. It verifies correct vfs behavior with and
>  without idmapped mounts including covering newer vfs features such as
>  io_uring.
>  I currently still plan to target the v5.12 merge window.)
> 
> With this patchset we make it possible to attach idmappings to mounts,
> i.e., simply put, different bind mounts can expose the same file or
> directory with different ownership.
> Shifting of ownership on a per-mount basis handles a wide range of
> long standing use-cases. Here are just a few:
> - Shifting of a subset of ownership-less filesystems (vfat) for use by
>   multiple users, effectively allowing for DAC on such devices
>   (systemd, Android, ...)
> - Allow remapping uid/gid on external filesystems or paths (USB sticks,
>   network filesystem, ...) to match the local system's user and groups.
>   (David Howells intends to port AFS as a first candidate.)
> - Shifting of a container rootfs or base image without having to mangle
>   every file (runc, Docker, containerd, k8s, LXD, systemd ...)
> - Sharing of data between host or privileged containers with
>   unprivileged containers (runC, Docker, containerd, k8s, LXD, ...)
> - Data sharing between multiple user namespaces with incompatible maps
>   (LXD, k8s, ...)

That sounds neat.  AFAICT, the VFS passes the filesystem a mount userns
structure, which is then carried down the call stack to whatever
functions actually care about mapping kernel [ug]ids to their ondisk
versions?

Does quota still work after this patchset is applied?  There isn't any
mention of that in the cover letter and I don't see a code patch, so
does that mean everything just works?  I'm particularly curious about
whether there can exist processes with CAP_SYS_ADMIN and an idmapped
mount?  Syscalls like bulkstat and quotactl present file [ug]ids to
programs, but afaict there won't be any translating going on?

(To be fair, bulkstat is an xfs-only thing, but quota control isn't.)

I'll start skimming the patchset...

--D

> 
> There has been significant interest in this patchset as evidenced by
> users commenting on previous versions of this patchset. They include
> containerd, ChromeOS, systemd, LXD and a range of others. There is
> already a patchset up for containerd, the default Kubernetes container
> runtime https://github.com/containerd/containerd/pull/4734
> to make use of this. systemd intends to use it in their systemd-homed
> implementation for portable home directories. ChromeOS wants to make use
> of it to share data between the host and the Linux containers they run
> on Chrome- and Pixelbooks.
> (Fwiw, for fun and since I wanted to do this for a long time I've ported
>  my home directory to be completely portable with a simple service file
>  that now mounts my home directory on an ext4 formatted usb stick with
>  an id mapping mapping all files to the random uid I'm assigned at
>  login.)
> 
> Making it possible to share directories and mounts between users with
> different uids and gids is itself quite an important use-case in
> distributed systems environments. It's of course especially useful for
> portable usb sticks, sharing data between multiple users in general,
> and sharing home directories between multiple users. The last example is
> now elegantly expressed in systemd's homed concept for portable home
> directories. As mentioned above, idmapped mounts also allow data from
> the host to be shared with unprivileged containers, between privileged
> and unprivileged containers simultaneously and in addition also between
> unprivileged containers with different idmappings whenever they are used
> to isolate one container completely from another container.
> 
> We have implemented and proposed multiple solutions to this before. This
> included the introduction of fsid mappings, a tiny filesystem I've
> authored with Seth Forshee that is currently carried in Ubuntu that has
> been shown to be the wrong approach, and the conceptual hack of calling
> override creds directly in the vfs. In addition to some of these
> solutions being hacky, none of them has covered all of the above
> use-cases.
> 
> Idmappings become a property of struct vfsmount instead of being tied to
> a process being inside of a user namespace, which has been the case for
> all other proposed approaches. This also allows passing the user
> namespace down into the filesystems, which is cleaner than violating
> calling conventions by strapping the user namespace information that is
> a property of the mount to the caller's credentials or similar hacks.
> Each mount can have a separate idmapping and idmapped mounts can even be
> created in the initial user namespace unblocking a range of use-cases.
> 
> To this end the vfsmount struct gains a new struct user_namespace
> member. The idmapping of the user namespace becomes the idmapping of the
> mount. A caller that is privileged with respect to the user namespace of
> the superblock of the underlying filesystem can create an idmapped
> mount. In the future, we can enable unprivileged use-cases by checking
> whether the caller is privileged wrt the user namespace that an
> already idmapped mount has been marked with, allowing them to change the
> idmapping. For now, keep things simple until the need arises.
> Note that with syscall interception it is already possible to intercept
> idmapped mount requests from unprivileged containers and handle them in
> a sufficiently privileged container manager. Support for this is already
> available in LXD and will be available in runC where syscall
> interception is currently in the process of becoming part of the runtime
> spec: https://github.com/opencontainers/runtime-spec/pull/1074.
> 
> The user namespace the mount will be marked with can be specified by
> passing a file descriptor referring to the user namespace as an argument
> to the new mount_setattr() syscall together with the new
> MOUNT_ATTR_IDMAP flag. By default vfsmounts are marked with the initial
> user namespace and no behavioral or performance changes are observed.
> All mapping operations are nops for the initial user namespace. When a
> file/inode is accessed through an idmapped mount the i_uid and i_gid of
> the inode will be remapped according to the user namespace the mount has
> been marked with.
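
(A minimal userspace sketch of the remapping described above, not the
kernel code. struct id_extent, id_into_mnt() and ID_UNMAPPED are names
made up for this illustration, and the model assumes a mapping with a
single extent. A NULL map stands in for a mount marked with the initial
user namespace, where the translation is a nop.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_UNMAPPED ((uint32_t)-1)

/* Hypothetical model of a single-extent idmapping attached to a mount. */
struct id_extent {
	uint32_t first;       /* first id as stored by the filesystem */
	uint32_t lower_first; /* first id as shown through the mount */
	uint32_t count;       /* number of consecutive ids covered */
};

/*
 * Translate an on-disk id into the id visible through the mount.
 * A NULL map models the initial user namespace: nothing changes.
 */
static uint32_t id_into_mnt(const struct id_extent *map, uint32_t disk_id)
{
	if (map == NULL)
		return disk_id;
	if (disk_id >= map->first && disk_id - map->first < map->count)
		return map->lower_first + (disk_id - map->first);
	return ID_UNMAPPED;
}

/* Self-check modelled on --map-mount b:1000:1001:1 from example 1. */
static void check_home_mount(void)
{
	const struct id_extent map = { 1000, 1001, 1 };

	assert(id_into_mnt(&map, 1000) == 1001);     /* remapped owner */
	assert(id_into_mnt(NULL, 1000) == 1000);     /* plain mount: unchanged */
	assert(id_into_mnt(&map, 0) == ID_UNMAPPED); /* id outside the extent */
}
```

The three numbers in the tool's b:1000:1001:1 argument are read here as
first, lower_first and count, which matches the listings in example 1.
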
> 
> In order to support idmapped mounts, filesystems need to be changed and
> mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The initial
> version contains fat, ext4, and xfs including a list of examples.
> Patches for other filesystems are actively being worked on and will be
> sent out separately. We are here to see this through and there are
> multiple people involved in converting filesystems. So filesystem
> developers are not left alone with this and are provided with a large
> testsuite to verify that their port is correct.
> 
> There is a simple tool available at
> https://github.com/brauner/mount-idmapped that allows creating idmapped
> mounts so people can play with this patch series. Here are a few
> illustrations:
> 
> 1. Create a simple idmapped mount of another user's home directory
> 
> u1001@f2-vm:/$ sudo ./mount-idmapped --map-mount b:1000:1001:1 /home/ubuntu/ /mnt
> u1001@f2-vm:/$ ls -al /home/ubuntu/
> total 28
> drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
> drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
> -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
> -rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
> -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
> -rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
> -rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
> -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo
> u1001@f2-vm:/$ ls -al /mnt/
> total 28
> drwxr-xr-x  2 u1001 u1001 4096 Oct 28 22:07 .
> drwxr-xr-x 29 root  root  4096 Oct 28 22:01 ..
> -rw-------  1 u1001 u1001 3154 Oct 28 22:12 .bash_history
> -rw-r--r--  1 u1001 u1001  220 Feb 25  2020 .bash_logout
> -rw-r--r--  1 u1001 u1001 3771 Feb 25  2020 .bashrc
> -rw-r--r--  1 u1001 u1001  807 Feb 25  2020 .profile
> -rw-r--r--  1 u1001 u1001    0 Oct 16 16:11 .sudo_as_admin_successful
> -rw-------  1 u1001 u1001 1144 Oct 28 00:43 .viminfo
> u1001@f2-vm:/$ touch /mnt/my-file
> u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file
> u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file
> u1001@f2-vm:/$ ls -al /mnt/my-file
> -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file
> u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
> -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file
> u1001@f2-vm:/$ getfacl /mnt/my-file
> getfacl: Removing leading '/' from absolute path names
> # file: mnt/my-file
> # owner: u1001
> # group: u1001
> user::rw-
> user:u1001:rwx
> group::rw-
> mask::rwx
> other::r--
> u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
> getfacl: Removing leading '/' from absolute path names
> # file: home/ubuntu/my-file
> # owner: ubuntu
> # group: ubuntu
> user::rw-
> user:ubuntu:rwx
> group::rw-
> mask::rwx
> other::r--
> 
> 2. Create mapping of the whole ext4 rootfs without a mapping for uid and gid 0
> 
> ubuntu@f2-vm:~$ sudo /mount-idmapped --map-mount b:1:1:65536 / /mnt/
> ubuntu@f2-vm:~$ findmnt | grep mnt
> └─/mnt                                /dev/sda2  ext4       rw,relatime
>   └─/mnt/mnt                          /dev/sda2  ext4       rw,relatime
> ubuntu@f2-vm:~$ sudo mkdir /AS-ROOT-CAN-CREATE
> ubuntu@f2-vm:~$ sudo mkdir /mnt/AS-ROOT-CANT-CREATE
> mkdir: cannot create directory ‘/mnt/AS-ROOT-CANT-CREATE’: Value too large for defined data type
> ubuntu@f2-vm:~$ mkdir /mnt/home/ubuntu/AS-USER-1000-CAN-CREATE
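
(Why root fails in example 2 can be sketched in userspace as follows.
This is an illustration under assumptions, not the kernel code:
struct id_extent and id_from_mnt() are hypothetical names, the mapping
is modelled as a single extent, and the return value -EOVERFLOW is
inferred from the "Value too large for defined data type" message in
the transcript, which is the glibc text for EOVERFLOW.)

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a single-extent idmapping attached to a mount. */
struct id_extent {
	uint32_t first;       /* first id as stored by the filesystem */
	uint32_t lower_first; /* first id as shown through the mount */
	uint32_t count;       /* number of consecutive ids covered */
};

/*
 * Map the id of a caller creating a file through the mount back to the
 * id that would be written to disk. Returns 0 and stores the result in
 * *disk_id, or -EOVERFLOW when the caller's id has no mapping.
 */
static int id_from_mnt(const struct id_extent *map, uint32_t caller_id,
		       uint32_t *disk_id)
{
	if (caller_id >= map->lower_first &&
	    caller_id - map->lower_first < map->count) {
		*disk_id = map->first + (caller_id - map->lower_first);
		return 0;
	}
	return -EOVERFLOW;
}

/* Self-check modelled on --map-mount b:1:1:65536 from example 2. */
static void check_rootfs_mount(void)
{
	const struct id_extent map = { 1, 1, 65536 };
	uint32_t disk_id = 0;

	/* uid 0 is not covered by the extent, so creation is refused. */
	assert(id_from_mnt(&map, 0, &disk_id) == -EOVERFLOW);

	/* uid 1000 is covered and maps to itself. */
	assert(id_from_mnt(&map, 1000, &disk_id) == 0);
	assert(disk_id == 1000);
}
```
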
> 
> 3. Create a vfat usb mount and expose to user 1001 and 5000
> 
> ubuntu@f2-vm:/$ sudo mount /dev/sdb /mnt
> ubuntu@f2-vm:/$ findmnt  | grep mnt
> └─/mnt                                /dev/sdb vfat       rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro
> ubuntu@f2-vm:/$ ls -al /mnt
> total 12
> drwxr-xr-x  2 root root 4096 Jan  1  1970 .
> drwxr-xr-x 34 root root 4096 Oct 28 22:24 ..
> -rwxr-xr-x  1 root root    4 Oct 28 03:44 aaa
> -rwxr-xr-x  1 root root    0 Oct 28 01:09 bbb
> ubuntu@f2-vm:/$ sudo /mount-idmapped --map-mount b:0:1001:1 /mnt /mnt-1001/
> ubuntu@f2-vm:/$ ls -al /mnt-1001/
> total 12
> drwxr-xr-x  2 u1001 u1001 4096 Jan  1  1970 .
> drwxr-xr-x 34 root  root  4096 Oct 28 22:24 ..
> -rwxr-xr-x  1 u1001 u1001    4 Oct 28 03:44 aaa
> -rwxr-xr-x  1 u1001 u1001    0 Oct 28 01:09 bbb
> ubuntu@f2-vm:/$ sudo /mount-idmapped --map-mount b:0:5000:1 /mnt /mnt-5000/
> ubuntu@f2-vm:/$ ls -al /mnt-5000/
> total 12
> drwxr-xr-x  2 5000 5000 4096 Jan  1  1970 .
> drwxr-xr-x 34 root root 4096 Oct 28 22:24 ..
> -rwxr-xr-x  1 5000 5000    4 Oct 28 03:44 aaa
> -rwxr-xr-x  1 5000 5000    0 Oct 28 01:09 bbb
> 
> 4. Create an idmapped rootfs mount for a container
> 
> root@f2-vm:~# ls -al /var/lib/lxc/f2/rootfs/
> total 68
> drwxr-xr-x 17 20000 20000 4096 Sep 24 07:48 .
> drwxrwx---  3 20000 20000 4096 Oct 16 19:26 ..
> lrwxrwxrwx  1 20000 20000    7 Sep 24 07:43 bin -> usr/bin
> drwxr-xr-x  2 20000 20000 4096 Apr 15  2020 boot
> drwxr-xr-x  3 20000 20000 4096 Oct 16 19:26 dev
> drwxr-xr-x 61 20000 20000 4096 Oct 16 19:26 etc
> drwxr-xr-x  3 20000 20000 4096 Sep 24 07:45 home
> lrwxrwxrwx  1 20000 20000    7 Sep 24 07:43 lib -> usr/lib
> lrwxrwxrwx  1 20000 20000    9 Sep 24 07:43 lib32 -> usr/lib32
> lrwxrwxrwx  1 20000 20000    9 Sep 24 07:43 lib64 -> usr/lib64
> lrwxrwxrwx  1 20000 20000   10 Sep 24 07:43 libx32 -> usr/libx32
> drwxr-xr-x  2 20000 20000 4096 Sep 24 07:43 media
> drwxr-xr-x  2 20000 20000 4096 Sep 24 07:43 mnt
> drwxr-xr-x  2 20000 20000 4096 Sep 24 07:43 opt
> drwxr-xr-x  2 20000 20000 4096 Apr 15  2020 proc
> drwx------  2 20000 20000 4096 Sep 24 07:43 root
> drwxr-xr-x  2 20000 20000 4096 Sep 24 07:45 run
> lrwxrwxrwx  1 20000 20000    8 Sep 24 07:43 sbin -> usr/sbin
> drwxr-xr-x  2 20000 20000 4096 Sep 24 07:43 srv
> drwxr-xr-x  2 20000 20000 4096 Apr 15  2020 sys
> drwxrwxrwt  2 20000 20000 4096 Sep 24 07:44 tmp
> drwxr-xr-x 13 20000 20000 4096 Sep 24 07:43 usr
> drwxr-xr-x 12 20000 20000 4096 Sep 24 07:44 var
> root@f2-vm:~# /mount-idmapped --map-mount b:20000:10000:100000 /var/lib/lxc/f2/rootfs/ /mnt
> root@f2-vm:~# ls -al /mnt
> total 68
> drwxr-xr-x 17 10000 10000 4096 Sep 24 07:48 .
> drwxr-xr-x 34 root  root  4096 Oct 28 22:24 ..
> lrwxrwxrwx  1 10000 10000    7 Sep 24 07:43 bin -> usr/bin
> drwxr-xr-x  2 10000 10000 4096 Apr 15  2020 boot
> drwxr-xr-x  3 10000 10000 4096 Oct 16 19:26 dev
> drwxr-xr-x 61 10000 10000 4096 Oct 16 19:26 etc
> drwxr-xr-x  3 10000 10000 4096 Sep 24 07:45 home
> lrwxrwxrwx  1 10000 10000    7 Sep 24 07:43 lib -> usr/lib
> lrwxrwxrwx  1 10000 10000    9 Sep 24 07:43 lib32 -> usr/lib32
> lrwxrwxrwx  1 10000 10000    9 Sep 24 07:43 lib64 -> usr/lib64
> lrwxrwxrwx  1 10000 10000   10 Sep 24 07:43 libx32 -> usr/libx32
> drwxr-xr-x  2 10000 10000 4096 Sep 24 07:43 media
> drwxr-xr-x  2 10000 10000 4096 Sep 24 07:43 mnt
> drwxr-xr-x  2 10000 10000 4096 Sep 24 07:43 opt
> drwxr-xr-x  2 10000 10000 4096 Apr 15  2020 proc
> drwx------  2 10000 10000 4096 Sep 24 07:43 root
> drwxr-xr-x  2 10000 10000 4096 Sep 24 07:45 run
> lrwxrwxrwx  1 10000 10000    8 Sep 24 07:43 sbin -> usr/sbin
> drwxr-xr-x  2 10000 10000 4096 Sep 24 07:43 srv
> drwxr-xr-x  2 10000 10000 4096 Apr 15  2020 sys
> drwxrwxrwt  2 10000 10000 4096 Sep 24 07:44 tmp
> drwxr-xr-x 13 10000 10000 4096 Sep 24 07:43 usr
> drwxr-xr-x 12 10000 10000 4096 Sep 24 07:44 var
> root@f2-vm:~# lxc-start f2 # uses /mnt as rootfs
> root@f2-vm:~# lxc-attach f2 -- cat /proc/1/uid_map
>          0      10000      10000
> root@f2-vm:~# lxc-attach f2 -- cat /proc/1/gid_map
>          0      10000      10000
> root@f2-vm:~# lxc-attach f2 -- ls -al /
> total 52
> drwxr-xr-x  17 root   root    4096 Sep 24 07:48 .
> drwxr-xr-x  17 root   root    4096 Sep 24 07:48 ..
> lrwxrwxrwx   1 root   root       7 Sep 24 07:43 bin -> usr/bin
> drwxr-xr-x   2 root   root    4096 Apr 15  2020 boot
> drwxr-xr-x   5 root   root     500 Oct 28 23:39 dev
> drwxr-xr-x  61 root   root    4096 Oct 28 23:39 etc
> drwxr-xr-x   3 root   root    4096 Sep 24 07:45 home
> lrwxrwxrwx   1 root   root       7 Sep 24 07:43 lib -> usr/lib
> lrwxrwxrwx   1 root   root       9 Sep 24 07:43 lib32 -> usr/lib32
> lrwxrwxrwx   1 root   root       9 Sep 24 07:43 lib64 -> usr/lib64
> lrwxrwxrwx   1 root   root      10 Sep 24 07:43 libx32 -> usr/libx32
> drwxr-xr-x   2 root   root    4096 Sep 24 07:43 media
> drwxr-xr-x   2 root   root    4096 Sep 24 07:43 mnt
> drwxr-xr-x   2 root   root    4096 Sep 24 07:43 opt
> dr-xr-xr-x 232 nobody nogroup    0 Oct 28 23:39 proc
> drwx------   2 root   root    4096 Oct 28 23:41 root
> drwxr-xr-x  12 root   root     360 Oct 28 23:39 run
> lrwxrwxrwx   1 root   root       8 Sep 24 07:43 sbin -> usr/sbin
> drwxr-xr-x   2 root   root    4096 Sep 24 07:43 srv
> dr-xr-xr-x  13 nobody nogroup    0 Oct 28 23:39 sys
> drwxrwxrwt  11 root   root    4096 Oct 28 23:40 tmp
> drwxr-xr-x  13 root   root    4096 Sep 24 07:43 usr
> drwxr-xr-x  12 root   root    4096 Sep 24 07:44 var
> root@f2-vm:~# lxc-attach f2 -- ls -al /my-file
> -rw-r--r-- 1 root root 0 Oct 28 23:43 /my-file
> root@f2-vm:~# ls -al /var/lib/lxc/f2/rootfs/my-file
> -rw-r--r-- 1 20000 20000 0 Oct 28 23:43 /var/lib/lxc/f2/rootfs/my-file
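
(Example 4 chains two translations: the mount's idmapping, then the
container's own uid_map. A rough userspace model of that chain follows.
All names here are hypothetical, each mapping is assumed to be a single
extent, and this is a sketch of the arithmetic rather than of the
kernel's helpers.)

```c
#include <assert.h>
#include <stdint.h>

#define ID_UNMAPPED ((uint32_t)-1)

/* One extent, in the same field order as a uid_map line. */
struct id_extent {
	uint32_t first;       /* first id on the "upper" side */
	uint32_t lower_first; /* first id on the "lower" side */
	uint32_t count;       /* number of consecutive ids covered */
};

/* Upper id to lower id. */
static uint32_t map_down(const struct id_extent *m, uint32_t id)
{
	if (id >= m->first && id - m->first < m->count)
		return m->lower_first + (id - m->first);
	return ID_UNMAPPED;
}

/* Lower id to upper id. */
static uint32_t map_up(const struct id_extent *m, uint32_t id)
{
	if (id >= m->lower_first && id - m->lower_first < m->count)
		return m->first + (id - m->lower_first);
	return ID_UNMAPPED;
}

/* On-disk id, through the mount, to the id seen inside the container. */
static uint32_t disk_to_container(const struct id_extent *mnt,
				  const struct id_extent *ns, uint32_t disk_id)
{
	uint32_t host_id = map_down(mnt, disk_id);

	return host_id == ID_UNMAPPED ? ID_UNMAPPED : map_up(ns, host_id);
}

/* Id of a caller inside the container to the id written to disk. */
static uint32_t container_to_disk(const struct id_extent *mnt,
				  const struct id_extent *ns, uint32_t ns_id)
{
	uint32_t host_id = map_down(ns, ns_id);

	return host_id == ID_UNMAPPED ? ID_UNMAPPED : map_up(mnt, host_id);
}

/* Self-check with the numbers from example 4. */
static void check_container_rootfs(void)
{
	const struct id_extent mnt = { 20000, 10000, 100000 }; /* b:20000:10000:100000 */
	const struct id_extent ns  = { 0, 10000, 10000 };      /* uid_map: 0 10000 10000 */

	assert(map_down(&mnt, 20000) == 10000);              /* ls -al /mnt on the host */
	assert(disk_to_container(&mnt, &ns, 20000) == 0);    /* root inside the container */
	assert(container_to_disk(&mnt, &ns, 0) == 20000);    /* /my-file on disk */
}
```
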
> 
> I'd like to say thanks to:
> Al for pointing me in the right direction to avoid inode alias issues during
> lookup. David for various discussions around this. Christoph for porting
> xfs, providing good reviews and for being involved in the original idea.
> Tycho for helping with this series and on future patches to convert
> filesystems. Alban Crequy and the Kinvolk peeps located just a few
> streets away from me in Berlin for providing use-case discussions and
> writing patches for containerd. Stéphane for his invaluable input on
> many things and level head and enabling me to work on this. Amir for
> explaining and discussing aspects of overlayfs with me. I'd like to
> especially thank Seth Forshee. He provided a lot of good analysis,
> suggestions, and participated in short-notice discussions in both chat
> and video for some nitty-gritty technical details.
> 
> This series can be found and pulled from the three usual locations:
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=idmapped_mounts
> https://github.com/brauner/linux/tree/idmapped_mounts
> https://gitlab.com/brauner/linux/-/commits/idmapped_mounts
> 
> /* v5 */
> - Address Christoph's feedback.
> - Use v5.11-rc3 as new base.
> - Add Christoph's xfs port.
> 
> /* v4 */
> - Split out several preparatory patches from the initial mount_setattr
>   patch as requested by Christoph.
> - Add new tests for file/directory creation in directories with the
>   setgid bit set. Specifically, verify that the setgid bit is correctly
>   ignored when creating a file with the setgid bit and the parent
>   directory's i_gid isn't in_group_p() and the caller isn't
>   capable_wrt_inode_uidgid() over the parent directory's inode when
>   inode_init_owner() is called.
>   Conversely, verify that the setgid bit is set when creating a file
>   with the setgid bit and the parent's i_gid is either in_group_p() or
>   the caller is capable_wrt_inode_uidgid() over the parent directory's
>   inode. In addition, verify that the setgid bit is always inherited
>   when creating directories.
>   Test all of this on regular mounts, idmapped mounts, and on idmapped
>   mounts in user namespaces.
> - Add new tests to verify that the i_gid of newly created files or
>   directories is correctly set to the parent directory's i_gid when the
>   parent directory has the setgid bit set.
> - Use "mnt_userns" as the de facto name for a vfsmount's user namespace
>   everywhere as suggested by Serge.
> - Reuse existing propagation flags instead of introducing new ones as
>   suggested by Christoph. (This is in line with Linus' request to not
>   introduce too many new flags as evidenced by prior discussions on
>   other patchsets such as openat2().)
> - Add first set of Acked-bys from Serge and Reviewed-bys from Christoph.
> - Fix commit messages to reflect the fact that we modify existing
>   vfs helpers but do not introduce new ones like we did in the first
>   version. Some commit messages still implied we were adding new
>   helpers.
> - Reformat all commit messages to adhere to the 73 char length limit and
>   wrap all lines in commits at 80 chars whenever this doesn't hinder
>   legibility.
> - Simplify various codepaths with Christoph's suggestions.
> 
> /* v3 */
> - The major change is the port of the test-suite from the
>   kernel-internal selftests framework to xfstests as requested by
>   Darrick and Christoph. The test-suite for xfstests is patch 38 in this
>   series. It has been kept as part of this series even though it belongs
>   to xfstests so it's easier to see what is tested and to keep it
>   in-sync.
> - Note, the test-suite now has been extended to cover io_uring and
>   idmapped mounts. The IORING_REGISTER_PERSONALITY feature allows
>   registering the caller's credentials with io_uring and returns an id
>   associated with these credentials. This is useful for applications
>   that wish to share a ring between separate users/processes. Callers
>   can pass in the credential id in the sqe personality field. If set,
>   that particular sqe will be issued with these credentials.
>   The test-suite now tests that the openat* operations with different
>   registered credentials work correctly and safely on regular mounts, on
>   regular mounts inside user namespaces, on idmapped mounts, and on
>   idmapped mounts inside user namespaces.
> 
> /* v2 */
> - The major change is the rework requested by Christoph and others to
>   adapt all relevant helpers and inode_operations methods to account for
>   idmapped mounts instead of introducing new helpers and methods
>   specific to idmapped mounts like we did before. We've also moved the
>   overlayfs conversion to handle idmapped mounts into a separate
>   patchset that will be sent out separately after the core changes have
>   landed. The converted filesystems in this series include fat and ext4.
>   As per Christoph's request the vfs-wide config option to disable
>   idmapped mounts has been removed. Instead the filesystems can decide
>   whether or not they want to allow idmap mounts through a config
>   option. These config options default to off. Having a config option
>   allows us to gain some confidence in the patchset over multiple kernel
>   releases.
> - This version introduces a large test-suite to test current vfs
>   behavior and idmapped mounts behavior. This test-suite is intended to
>   grow over time.
> - While working on adapting this patchset to the requested
>   changes, the runC and containerd crowd was nice enough to adapt
>   containerd to this patchset to make use of idmapped mounts in one of
>   the most widely used container runtimes:
>   https://github.com/containerd/containerd/pull/4734
> 
> The solution proposed here has its origins in multiple discussions
> at Linux Plumbers 2017, during and after the containers
> microconference.
> To the best of my knowledge this involved Aleksa, Stéphane, Eric, David,
> James, and myself. The original idea or a variant thereof has been
> discussed, again to the best of my knowledge, after a Linux conference
> in St. Petersburg in Russia in 2017 between Christoph, Tycho, and
> myself.
> We've taken the time to implement a working version of this solution
> over the last weeks to the best of my abilities. Tycho has signed up
> for this slightly crazy endeavour as well and he has helped with the
> conversion of the xattr codepaths and will be involved with others in
> converting additional filesystems.
> 
> Thanks!
> Christian
> 
> Christian Brauner (39):
>   namespace: take lock_mount_hash() directly when changing flags
>   mount: make {lock,unlock}_mount_hash() static
>   namespace: only take read lock in do_reconfigure_mnt()
>   fs: split out functions to hold writers
>   fs: add attr_flags_to_mnt_flags helper
>   fs: add mount_setattr()
>   tests: add mount_setattr() selftests
>   fs: add id translation helpers
>   mount: attach mappings to mounts
>   capability: handle idmapped mounts
>   namei: make permission helpers idmapped mount aware
>   inode: make init and permission helpers idmapped mount aware
>   attr: handle idmapped mounts
>   acl: handle idmapped mounts
>   fs: add file_user_ns() helper
>   commoncap: handle idmapped mounts
>   stat: handle idmapped mounts
>   namei: handle idmapped mounts in may_*() helpers
>   namei: introduce struct renamedata
>   namei: prepare for idmapped mounts
>   open: handle idmapped mounts in do_truncate()
>   open: handle idmapped mounts
>   af_unix: handle idmapped mounts
>   utimes: handle idmapped mounts
>   fcntl: handle idmapped mounts
>   notify: handle idmapped mounts
>   init: handle idmapped mounts
>   ioctl: handle idmapped mounts
>   would_dump: handle idmapped mounts
>   exec: handle idmapped mounts
>   fs: make helpers idmap mount aware
>   apparmor: handle idmapped mounts
>   ima: handle idmapped mounts
>   fat: handle idmapped mounts
>   ext4: support idmapped mounts
>   ecryptfs: do not mount on top of idmapped mounts
>   overlayfs: do not mount on top of idmapped mounts
>   fs: introduce MOUNT_ATTR_IDMAP
>   tests: extend mount_setattr tests
> 
> Christoph Hellwig (1):
>   xfs: support idmapped mounts
> 
> Tycho Andersen (1):
>   xattr: handle idmapped mounts
> 
>  Documentation/filesystems/locking.rst         |    6 +-
>  Documentation/filesystems/porting.rst         |    2 +
>  Documentation/filesystems/vfs.rst             |   19 +-
>  arch/alpha/kernel/syscalls/syscall.tbl        |    1 +
>  arch/arm/tools/syscall.tbl                    |    1 +
>  arch/arm64/include/asm/unistd32.h             |    2 +
>  arch/ia64/kernel/syscalls/syscall.tbl         |    1 +
>  arch/m68k/kernel/syscalls/syscall.tbl         |    1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl   |    1 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl     |    1 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl     |    1 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl     |    1 +
>  arch/parisc/kernel/syscalls/syscall.tbl       |    1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl      |    1 +
>  arch/powerpc/platforms/cell/spufs/inode.c     |    5 +-
>  arch/s390/kernel/syscalls/syscall.tbl         |    1 +
>  arch/sh/kernel/syscalls/syscall.tbl           |    1 +
>  arch/sparc/kernel/syscalls/syscall.tbl        |    1 +
>  arch/x86/entry/syscalls/syscall_32.tbl        |    1 +
>  arch/x86/entry/syscalls/syscall_64.tbl        |    1 +
>  arch/xtensa/kernel/syscalls/syscall.tbl       |    1 +
>  drivers/android/binderfs.c                    |    6 +-
>  drivers/base/devtmpfs.c                       |   12 +-
>  fs/9p/acl.c                                   |    8 +-
>  fs/9p/v9fs.h                                  |    3 +-
>  fs/9p/v9fs_vfs.h                              |    2 +-
>  fs/9p/vfs_inode.c                             |   36 +-
>  fs/9p/vfs_inode_dotl.c                        |   39 +-
>  fs/9p/xattr.c                                 |    1 +
>  fs/adfs/adfs.h                                |    3 +-
>  fs/adfs/inode.c                               |    5 +-
>  fs/affs/affs.h                                |   10 +-
>  fs/affs/inode.c                               |    7 +-
>  fs/affs/namei.c                               |   15 +-
>  fs/afs/dir.c                                  |   34 +-
>  fs/afs/inode.c                                |    9 +-
>  fs/afs/internal.h                             |    7 +-
>  fs/afs/security.c                             |    2 +-
>  fs/afs/xattr.c                                |    2 +
>  fs/attr.c                                     |  124 +-
>  fs/autofs/root.c                              |   13 +-
>  fs/bad_inode.c                                |   36 +-
>  fs/bfs/dir.c                                  |   12 +-
>  fs/btrfs/acl.c                                |    5 +-
>  fs/btrfs/ctree.h                              |    3 +-
>  fs/btrfs/inode.c                              |   45 +-
>  fs/btrfs/ioctl.c                              |   25 +-
>  fs/btrfs/tests/btrfs-tests.c                  |    2 +-
>  fs/btrfs/xattr.c                              |    2 +
>  fs/cachefiles/interface.c                     |    4 +-
>  fs/cachefiles/namei.c                         |   19 +-
>  fs/cachefiles/xattr.c                         |   16 +-
>  fs/ceph/acl.c                                 |    5 +-
>  fs/ceph/dir.c                                 |   23 +-
>  fs/ceph/inode.c                               |   17 +-
>  fs/ceph/super.h                               |   12 +-
>  fs/ceph/xattr.c                               |    1 +
>  fs/cifs/cifsfs.c                              |    4 +-
>  fs/cifs/cifsfs.h                              |   21 +-
>  fs/cifs/dir.c                                 |    8 +-
>  fs/cifs/inode.c                               |   26 +-
>  fs/cifs/link.c                                |    3 +-
>  fs/cifs/xattr.c                               |    1 +
>  fs/coda/coda_linux.h                          |    6 +-
>  fs/coda/dir.c                                 |   17 +-
>  fs/coda/inode.c                               |    9 +-
>  fs/coda/pioctl.c                              |    6 +-
>  fs/configfs/configfs_internal.h               |    7 +-
>  fs/configfs/dir.c                             |    3 +-
>  fs/configfs/inode.c                           |    5 +-
>  fs/configfs/symlink.c                         |    5 +-
>  fs/coredump.c                                 |   14 +-
>  fs/crypto/policy.c                            |    2 +-
>  fs/debugfs/inode.c                            |    9 +-
>  fs/ecryptfs/crypto.c                          |    4 +-
>  fs/ecryptfs/inode.c                           |   80 +-
>  fs/ecryptfs/main.c                            |    6 +
>  fs/ecryptfs/mmap.c                            |    4 +-
>  fs/efivarfs/file.c                            |    2 +-
>  fs/efivarfs/inode.c                           |    4 +-
>  fs/erofs/inode.c                              |    7 +-
>  fs/erofs/internal.h                           |    5 +-
>  fs/exec.c                                     |   12 +-
>  fs/exfat/exfat_fs.h                           |    8 +-
>  fs/exfat/file.c                               |   14 +-
>  fs/exfat/namei.c                              |   14 +-
>  fs/ext2/acl.c                                 |    5 +-
>  fs/ext2/acl.h                                 |    3 +-
>  fs/ext2/ext2.h                                |    5 +-
>  fs/ext2/ialloc.c                              |    2 +-
>  fs/ext2/inode.c                               |   15 +-
>  fs/ext2/ioctl.c                               |    6 +-
>  fs/ext2/namei.c                               |   22 +-
>  fs/ext2/xattr_security.c                      |    1 +
>  fs/ext2/xattr_trusted.c                       |    1 +
>  fs/ext2/xattr_user.c                          |    1 +
>  fs/ext4/acl.c                                 |    5 +-
>  fs/ext4/acl.h                                 |    3 +-
>  fs/ext4/ext4.h                                |   21 +-
>  fs/ext4/ialloc.c                              |    7 +-
>  fs/ext4/inode.c                               |   21 +-
>  fs/ext4/ioctl.c                               |   19 +-
>  fs/ext4/namei.c                               |   49 +-
>  fs/ext4/super.c                               |    2 +-
>  fs/ext4/xattr_hurd.c                          |    1 +
>  fs/ext4/xattr_security.c                      |    1 +
>  fs/ext4/xattr_trusted.c                       |    1 +
>  fs/ext4/xattr_user.c                          |    1 +
>  fs/f2fs/acl.c                                 |    5 +-
>  fs/f2fs/acl.h                                 |    3 +-
>  fs/f2fs/f2fs.h                                |    7 +-
>  fs/f2fs/file.c                                |   35 +-
>  fs/f2fs/namei.c                               |   23 +-
>  fs/f2fs/xattr.c                               |    4 +-
>  fs/fat/fat.h                                  |    6 +-
>  fs/fat/file.c                                 |   24 +-
>  fs/fat/namei_msdos.c                          |   12 +-
>  fs/fat/namei_vfat.c                           |   15 +-
>  fs/fcntl.c                                    |    3 +-
>  fs/fuse/acl.c                                 |    3 +-
>  fs/fuse/dir.c                                 |   45 +-
>  fs/fuse/fuse_i.h                              |    4 +-
>  fs/fuse/xattr.c                               |    2 +
>  fs/gfs2/acl.c                                 |    5 +-
>  fs/gfs2/acl.h                                 |    3 +-
>  fs/gfs2/file.c                                |    4 +-
>  fs/gfs2/inode.c                               |   59 +-
>  fs/gfs2/inode.h                               |    3 +-
>  fs/gfs2/xattr.c                               |    1 +
>  fs/hfs/attr.c                                 |    1 +
>  fs/hfs/dir.c                                  |   13 +-
>  fs/hfs/hfs_fs.h                               |    2 +-
>  fs/hfs/inode.c                                |    7 +-
>  fs/hfsplus/dir.c                              |   25 +-
>  fs/hfsplus/hfsplus_fs.h                       |    5 +-
>  fs/hfsplus/inode.c                            |   16 +-
>  fs/hfsplus/ioctl.c                            |    2 +-
>  fs/hfsplus/xattr.c                            |    1 +
>  fs/hfsplus/xattr_security.c                   |    1 +
>  fs/hfsplus/xattr_trusted.c                    |    1 +
>  fs/hfsplus/xattr_user.c                       |    1 +
>  fs/hostfs/hostfs_kern.c                       |   29 +-
>  fs/hpfs/hpfs_fn.h                             |    2 +-
>  fs/hpfs/inode.c                               |    7 +-
>  fs/hpfs/namei.c                               |   20 +-
>  fs/hugetlbfs/inode.c                          |   31 +-
>  fs/init.c                                     |   27 +-
>  fs/inode.c                                    |   50 +-
>  fs/internal.h                                 |    2 +-
>  fs/jffs2/acl.c                                |    5 +-
>  fs/jffs2/acl.h                                |    3 +-
>  fs/jffs2/dir.c                                |   32 +-
>  fs/jffs2/fs.c                                 |    7 +-
>  fs/jffs2/os-linux.h                           |    2 +-
>  fs/jffs2/security.c                           |    1 +
>  fs/jffs2/xattr_trusted.c                      |    1 +
>  fs/jffs2/xattr_user.c                         |    1 +
>  fs/jfs/acl.c                                  |    5 +-
>  fs/jfs/file.c                                 |    9 +-
>  fs/jfs/ioctl.c                                |    2 +-
>  fs/jfs/jfs_acl.h                              |    3 +-
>  fs/jfs/jfs_inode.c                            |    2 +-
>  fs/jfs/jfs_inode.h                            |    2 +-
>  fs/jfs/namei.c                                |   21 +-
>  fs/jfs/xattr.c                                |    2 +
>  fs/kernfs/dir.c                               |    7 +-
>  fs/kernfs/inode.c                             |   19 +-
>  fs/kernfs/kernfs-internal.h                   |    9 +-
>  fs/libfs.c                                    |   28 +-
>  fs/minix/bitmap.c                             |    2 +-
>  fs/minix/file.c                               |    7 +-
>  fs/minix/inode.c                              |    6 +-
>  fs/minix/minix.h                              |    3 +-
>  fs/minix/namei.c                              |   24 +-
>  fs/mount.h                                    |   10 -
>  fs/namei.c                                    |  513 ++++--
>  fs/namespace.c                                |  484 +++++-
>  fs/nfs/dir.c                                  |   25 +-
>  fs/nfs/inode.c                                |    9 +-
>  fs/nfs/internal.h                             |   10 +-
>  fs/nfs/namespace.c                            |   14 +-
>  fs/nfs/nfs3_fs.h                              |    3 +-
>  fs/nfs/nfs3acl.c                              |    3 +-
>  fs/nfs/nfs4proc.c                             |    3 +
>  fs/nfsd/nfs2acl.c                             |    4 +-
>  fs/nfsd/nfs3acl.c                             |    4 +-
>  fs/nfsd/nfs4acl.c                             |    4 +-
>  fs/nfsd/nfs4recover.c                         |    6 +-
>  fs/nfsd/nfsfh.c                               |    2 +-
>  fs/nfsd/nfsproc.c                             |    2 +-
>  fs/nfsd/vfs.c                                 |   47 +-
>  fs/nilfs2/inode.c                             |   13 +-
>  fs/nilfs2/ioctl.c                             |    2 +-
>  fs/nilfs2/namei.c                             |   19 +-
>  fs/nilfs2/nilfs.h                             |    4 +-
>  fs/notify/fanotify/fanotify_user.c            |    2 +-
>  fs/notify/inotify/inotify_user.c              |    3 +-
>  fs/ntfs/inode.c                               |    6 +-
>  fs/ntfs/inode.h                               |    3 +-
>  fs/ocfs2/acl.c                                |    5 +-
>  fs/ocfs2/acl.h                                |    3 +-
>  fs/ocfs2/dlmfs/dlmfs.c                        |   17 +-
>  fs/ocfs2/file.c                               |   17 +-
>  fs/ocfs2/file.h                               |   11 +-
>  fs/ocfs2/ioctl.c                              |    2 +-
>  fs/ocfs2/namei.c                              |   21 +-
>  fs/ocfs2/refcounttree.c                       |    4 +-
>  fs/ocfs2/xattr.c                              |    3 +
>  fs/omfs/dir.c                                 |   13 +-
>  fs/omfs/file.c                                |    7 +-
>  fs/omfs/inode.c                               |    2 +-
>  fs/open.c                                     |   50 +-
>  fs/orangefs/acl.c                             |    5 +-
>  fs/orangefs/inode.c                           |   20 +-
>  fs/orangefs/namei.c                           |   12 +-
>  fs/orangefs/orangefs-kernel.h                 |   13 +-
>  fs/orangefs/xattr.c                           |    1 +
>  fs/overlayfs/copy_up.c                        |   20 +-
>  fs/overlayfs/dir.c                            |   31 +-
>  fs/overlayfs/file.c                           |    6 +-
>  fs/overlayfs/inode.c                          |   26 +-
>  fs/overlayfs/overlayfs.h                      |   44 +-
>  fs/overlayfs/super.c                          |   19 +-
>  fs/overlayfs/util.c                           |    4 +-
>  fs/posix_acl.c                                |  101 +-
>  fs/proc/base.c                                |   28 +-
>  fs/proc/fd.c                                  |    5 +-
>  fs/proc/fd.h                                  |    3 +-
>  fs/proc/generic.c                             |   12 +-
>  fs/proc/internal.h                            |    5 +-
>  fs/proc/proc_net.c                            |    5 +-
>  fs/proc/proc_sysctl.c                         |   15 +-
>  fs/proc/root.c                                |    5 +-
>  fs/proc_namespace.c                           |    3 +
>  fs/ramfs/file-nommu.c                         |    9 +-
>  fs/ramfs/inode.c                              |   18 +-
>  fs/reiserfs/acl.h                             |    3 +-
>  fs/reiserfs/inode.c                           |    7 +-
>  fs/reiserfs/ioctl.c                           |    4 +-
>  fs/reiserfs/namei.c                           |   21 +-
>  fs/reiserfs/reiserfs.h                        |    3 +-
>  fs/reiserfs/xattr.c                           |   12 +-
>  fs/reiserfs/xattr.h                           |    3 +-
>  fs/reiserfs/xattr_acl.c                       |    7 +-
>  fs/reiserfs/xattr_security.c                  |    3 +-
>  fs/reiserfs/xattr_trusted.c                   |    3 +-
>  fs/reiserfs/xattr_user.c                      |    3 +-
>  fs/remap_range.c                              |    7 +-
>  fs/stat.c                                     |   26 +-
>  fs/sysv/file.c                                |    7 +-
>  fs/sysv/ialloc.c                              |    2 +-
>  fs/sysv/itree.c                               |    6 +-
>  fs/sysv/namei.c                               |   21 +-
>  fs/sysv/sysv.h                                |    3 +-
>  fs/tracefs/inode.c                            |    4 +-
>  fs/ubifs/dir.c                                |   30 +-
>  fs/ubifs/file.c                               |    5 +-
>  fs/ubifs/ioctl.c                              |    2 +-
>  fs/ubifs/ubifs.h                              |    5 +-
>  fs/ubifs/xattr.c                              |    1 +
>  fs/udf/file.c                                 |    9 +-
>  fs/udf/ialloc.c                               |    2 +-
>  fs/udf/namei.c                                |   24 +-
>  fs/udf/symlink.c                              |    7 +-
>  fs/ufs/ialloc.c                               |    2 +-
>  fs/ufs/inode.c                                |    7 +-
>  fs/ufs/namei.c                                |   19 +-
>  fs/ufs/ufs.h                                  |    3 +-
>  fs/utimes.c                                   |    4 +-
>  fs/vboxsf/dir.c                               |   12 +-
>  fs/vboxsf/utils.c                             |    9 +-
>  fs/vboxsf/vfsmod.h                            |    8 +-
>  fs/verity/enable.c                            |    2 +-
>  fs/xattr.c                                    |  136 +-
>  fs/xfs/xfs_acl.c                              |    5 +-
>  fs/xfs/xfs_acl.h                              |    3 +-
>  fs/xfs/xfs_file.c                             |    4 +-
>  fs/xfs/xfs_inode.c                            |   26 +-
>  fs/xfs/xfs_inode.h                            |   16 +-
>  fs/xfs/xfs_ioctl.c                            |   23 +-
>  fs/xfs/xfs_iops.c                             |   98 +-
>  fs/xfs/xfs_iops.h                             |    3 +-
>  fs/xfs/xfs_qm.c                               |    3 +-
>  fs/xfs/xfs_super.c                            |    2 +-
>  fs/xfs/xfs_symlink.c                          |    5 +-
>  fs/xfs/xfs_symlink.h                          |    5 +-
>  fs/xfs/xfs_xattr.c                            |    3 +-
>  fs/zonefs/super.c                             |    9 +-
>  include/linux/capability.h                    |   15 +-
>  include/linux/fs.h                            |  158 +-
>  include/linux/ima.h                           |   17 +-
>  include/linux/lsm_hook_defs.h                 |   15 +-
>  include/linux/lsm_hooks.h                     |    1 +
>  include/linux/mount.h                         |    7 +
>  include/linux/nfs_fs.h                        |    7 +-
>  include/linux/posix_acl.h                     |   15 +-
>  include/linux/posix_acl_xattr.h               |   12 +-
>  include/linux/security.h                      |   46 +-
>  include/linux/syscalls.h                      |    4 +
>  include/linux/xattr.h                         |   30 +-
>  include/uapi/asm-generic/unistd.h             |    4 +-
>  include/uapi/linux/mount.h                    |   17 +
>  ipc/mqueue.c                                  |    8 +-
>  kernel/auditsc.c                              |    5 +-
>  kernel/bpf/inode.c                            |   13 +-
>  kernel/capability.c                           |   14 +-
>  kernel/cgroup/cgroup.c                        |    2 +-
>  kernel/sys.c                                  |    2 +-
>  mm/madvise.c                                  |    4 +-
>  mm/memcontrol.c                               |    2 +-
>  mm/mincore.c                                  |    4 +-
>  mm/shmem.c                                    |   48 +-
>  net/socket.c                                  |    6 +-
>  net/unix/af_unix.c                            |    4 +-
>  security/apparmor/apparmorfs.c                |    3 +-
>  security/apparmor/domain.c                    |   13 +-
>  security/apparmor/file.c                      |    5 +-
>  security/apparmor/lsm.c                       |   12 +-
>  security/commoncap.c                          |  109 +-
>  security/integrity/evm/evm_crypto.c           |   11 +-
>  security/integrity/evm/evm_main.c             |    4 +-
>  security/integrity/evm/evm_secfs.c            |    2 +-
>  security/integrity/ima/ima.h                  |   19 +-
>  security/integrity/ima/ima_api.c              |   10 +-
>  security/integrity/ima/ima_appraise.c         |   22 +-
>  security/integrity/ima/ima_asymmetric_keys.c  |    2 +-
>  security/integrity/ima/ima_main.c             |   31 +-
>  security/integrity/ima/ima_policy.c           |   19 +-
>  security/integrity/ima/ima_queue_keys.c       |    2 +-
>  security/security.c                           |   25 +-
>  security/selinux/hooks.c                      |   22 +-
>  security/smack/smack_lsm.c                    |   18 +-
>  tools/include/uapi/asm-generic/unistd.h       |    4 +-
>  tools/testing/selftests/Makefile              |    1 +
>  .../selftests/mount_setattr/.gitignore        |    1 +
>  .../testing/selftests/mount_setattr/Makefile  |    7 +
>  tools/testing/selftests/mount_setattr/config  |    1 +
>  .../mount_setattr/mount_setattr_test.c        | 1424 +++++++++++++++++
>  338 files changed, 4718 insertions(+), 1731 deletions(-)
>  create mode 100644 tools/testing/selftests/mount_setattr/.gitignore
>  create mode 100644 tools/testing/selftests/mount_setattr/Makefile
>  create mode 100644 tools/testing/selftests/mount_setattr/config
>  create mode 100644 tools/testing/selftests/mount_setattr/mount_setattr_test.c
> 
> 
> base-commit: 7c53f6b671f4aba70ff15e1b05148b10d58c2837
> -- 
> 2.30.0
> 

