[RFC PATCH 00/27] Containers and using authenticated filesystems

Eric W. Biederman ebiederm at xmission.com
Tue Feb 19 16:35:20 UTC 2019


So you missed the main mailing lists for discussion of this kind of
thing, and the maintainer.  So I have reservations about the quality of
your due diligence already.

Looking at your description you are introducing a container id.
You don't describe which namespace your container id lives in.
Without the container id living in a container this breaks
nested containers and process migration aka CRIU.

So based on your description:

Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>



David Howells <dhowells at redhat.com> writes:

> Here's a collection of patches that containerises the kernel keys and makes
> it possible to separate keys by namespace.  This can be extended to any
> filesystem that uses request_key() to obtain the pertinent authentication
> token on entry to VFS or socket methods.
>
> I have this working with AFS and AF_RXRPC so far, but it could be extended
> to other filesystems, such as NFS and CIFS.
>
> The following changes are made:
>
>  (1) Add optional namespace tags to a key's index_key.  This allows the
>      following:
>
>      (a) Automatic invalidation of all keys with that tag when the
>      	 namespace is removed.
>
>      (b) Mixing of keys with the same description, but different areas of
>      	 operation within a keyring.
>
>      (c) Sharing of cache keyrings, such as the DNS lookup cache.
>
>      (d) Diversion of upcalls based on namespace criteria.
>
>  (2) Provide each network namespace with a tag that can be used with (1).
>      This is used by the DNS query, rxrpc, nfs idmapper keys.
>
>      [!] Note that it might still be better to move these keyrings into the
>      	 network namespace.
>
>  (3) Provide key ACLs.  These allow:
>
>      (a) Permissions to be split more finely, in particular separating
>      	 out Invalidate and Join.
>
>      (b) Permits to be granted to non-standard subjects.  So, for instance,
>      	 Search permission could be granted to a container object, allowing
>      	 a search of the container keyring by a denizen of the container to
>      	 find a key that they can't otherwise see.
>
>  (4) Provide a kernel container object.  Currently, this is created with a
>      system call and passed flags that indicate the namespaces to be
>      inherited or replaced.  It might be better to actually use something
>      like fsconfig() to configure the container by setting key=val type
>      options.
>
>      The kernel container object provides the following facilities:
>
>      (a) request_key upcall interception.  The manager of a container can
>      	 intercept requests made inside the container and, using a series
>      	 of filters, can cause the authkeys to be placed into keyrings that
>      	 serve as queues for one or more upcall processing programs.  These
>      	 upcall programs use key notifications to monitor those keyrings.
>
>      (b) Per-container keyring.  A keyring can be attached to the container
>      	 such that this is searched by a request_key() performed by a
>      	 denizen of the container after searching the thread, process and
>      	 session keyrings.  The keyring and the keys contained therein must
>      	 be granted Search for that container.
>
> 	 This allows:
>
>  	 (i) Authenticated filesystems to be used transparently inside of
> 	     the container without any cooperation from the occupant
> 	     thereof.  All the key maintenance can be done by the manager.
>
>          (ii) Keys to be made available to the denizens of a container (by
>              granting extra permissions to the container subject).
>
>      (c) Per-container ID that can be used in audit messages.
>
>      (d) Container object creation gives the manager a file descriptor that
>      	 can:
>
> 	 (i) Be passed to a dirfd parameter to a VFS syscall, such as
>      	     mkdirat(), allowing an operation to be done inside the
>      	     container.
>
>          (ii) Be passed to fsopen()/fsconfig() to indicate that the target
>              filesystem is going to be created inside a container, in that
>              container's namespaces.
>
>          (iii) Be passed to the move_mount() syscall as a destination for
>              setting the root filesystem inside a new mount namespace made
>              upon container creation.
>
>      (e) The ability to configure the container with namespaces or
>      	 whatever, and then fork a process into that container to 'boot'
>      	 it.
>
>
> Three sample programs are provided:
>
>  (1) test-container.  This:
>
> 	- Creates a kernel container with a blank mount ns.
> 	- Creates its root mount and moves it to the container root.
> 	- Mounts /proc therein.
> 	- Creates a keyring called "_container"
> 	  - Sets that as the container keyring.
> 	  - Grants Search permission to the container on that keyring.
> 	  - Removes owner permission on that keyring.
> 	- Creates a sample user key "foobar" in the container keyring.
> 	  - Grants various permissions to the container on that key.
> 	- Creates a keyring called "upcall"
> 	  - Intercepts "user" key upcalls from the container to there.
> 	- Forks a process into the container
> 	  - Prints the container keyring ID if it can
> 	  - Exec's bash.
>
>      This program expects to be given the device name for a partition it
>      can mount as the root and expects it to contain things like /etc,
>      /bin, /sbin, /lib, /usr containing programs that can be run and /proc
>      to mount procfs upon.  E.g.:
>
> 	./test-container /dev/sda3
>
>  (2) test-upcall.  This is a service program that monitors the "upcall"
>      keyring created by test-container for authkeys appearing, which it
>      then hands off to /sbin/request-key.  This:
>
> 	- Opens /dev/watch_queue.
> 	  - Sets the size to 1 page.
> 	  - Sets a filter to watch for "Link creation" key events.
> 	  - Sets a watch on the upcall keyring.
> 	- Polls the watch queue for events
> 	- When an event comes in:
> 	  - Gets the authkey ID from the event buffer.
> 	  - Queries the authkey.
> 	  - Forks off a handler which:
> 	    - Moves the authkey to its thread keyring
> 	    - Sets up a new session keyring with the authkey in it.
> 	    - Execs /sbin/request-key.
>
>      This can be run in a shell that shares the session keyring with
>      test-container, from which it will find the upcall keyring.
>      Alternatively, the keyring ID can be provided on the command line:
>
> 	./test-upcall [<upcall-keyring>]
>
>      It can be triggered from inside of the container with something like:
>
> 	keyctl request2 user debug:e a @s
>
>      and something like:
>
> 	ptrs h=4 t=2 m=2000003
> 	NOTIFY[00000004-00000002] ty=0003 sy=0002 i=01000010
> 	KEY 78543393 change=2 aux=141053003
> 	Authentication key 141053003
> 	- create 779280685
> 	- uid=0 gid=0
> 	- rings=0,0,798528519
> 	- callout='a'
> 	RQDebug keyid: 779280685
> 	RQDebug desc: debug:e
> 	RQDebug callout: a
> 	RQDebug session keyring: 798528519
>
>      will appear on stdout/stderr from it and /sbin/request-key.
>
>  (3) test-cont-grant.  This is a program to make the nominated key
>      available to a container's denizens.  It:
>
> 	- Grants search permission to the nominated key.
> 	- Links the nominated key into the container keyring.
>
>      It can be run from outside of the container like so:
>
> 	./test-cont-grant <key> [<container-keyring>]
>
>      If the keyring isn't given, it will look for one called "_container"
>      in the session keyring where test-container is expected to have placed
>      it.
>
>      With kAFS, it can be used as follows:
>
> 	kinit dhowells at REDHAT.COM
> 	kafs-aklog redhat.com
>
>      which would log into kerberos and then get a key for accessing an AFS
>      cell called "redhat.com".  This can be seen in the session keyring by
>      calling "keyctl show":
>
> 	 120378984 --alswrv      0     0  keyring: _ses
> 	 474754113 ---lswrv      0 65534   \_ keyring: _uid.0
> 	  64049961 --alswrv      0     0   \_ rxrpc: afs at redhat.com
> 	  78543393 --alswrv      0     0   \_ keyring: upcall
> 	 661655334 --alswrv      0     0   \_ keyring: _container
> 	 639103010 --alswrv      0     0       \_ user: foobar
>
>      Then doing:
>
> 	./test-cont-grant 64049961
>
>      will result in:
>
> 	 120378984 --alswrv      0     0  keyring: _ses
> 	 474754113 ---lswrv      0 65534   \_ keyring: _uid.0
> 	  64049961 --alswrv      0     0   \_ rxrpc: afs at redhat.com
> 	  78543393 --alswrv      0     0   \_ keyring: upcall
> 	 661655334 --alswrv      0     0   \_ keyring: _container
> 	 639103010 --alswrv      0     0       \_ user: foobar
> 	  64049961 --alswrv      0     0       \_ rxrpc: afs at redhat.com
>
>      Inside the container, the cell could be mounted:
>
> 	mount -t afs "%redhat.com:root.cell" /mnt
>
>      and then operations in /mnt will be done using the token that has been
>      made available.  However, this can be overridden locally inside the
>      container by doing kinit and kafs-aklog there with a different user.
>
>      More to the point, the container manager could mount the container's
>      rootfs, say, over authenticated AFS and then attach the token to the
>      container and mount the rootfs into the container and the container's
>      inhabitant need not have any means to gain a kerberos login.
>
>      [?] I do wonder if the possibility to use container key searches for
>      	 direct mounts should be controlled by a mount option, say:
>
> 		fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd);
>
>          where you have to have the container handle available.
>
>      [!] Note that test-cont-grant picks the container by name and does not
>      	 require the container handle when setting the key ACL - but the
>      	 name must come from the set of children of the current container.
>
>
> The patches can be found here also:
>
> 	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=container
>
> Note that this is dependent on the mount-api-viro, fsinfo, notifications
> and keys-namespace branches.
>
> David
> ---
> David Howells (27):
>       containers: Rename linux/container.h to linux/container_dev.h
>       containers: Implement containers as kernel objects
>       containers: Provide /proc/containers
>       containers: Allow a process to be forked into a container
>       containers: Open a socket inside a container
>       containers, vfs: Allow syscall dirfd arguments to take a container fd
>       containers: Make fsopen() able to create a superblock in a container
>       containers, vfs: Honour CONTAINER_NEW_EMPTY_FS_NS
>       vfs: Allow mounting to other namespaces
>       containers: Provide fs_context op for container setting
>       containers: Sample program for driving container objects
>       containers: Allow a daemon to intercept request_key upcalls in a container
>       keys: Provide a keyctl to query a request_key authentication key
>       keys: Break bits out of key_unlink()
>       keys: Make __key_link_begin() handle lockdep nesting
>       keys: Grant Link permission to possessers of request_key auth keys
>       keys: Add a keyctl to move a key between keyrings
>       keys: Find the least-recently used unseen key in a keyring.
>       containers: Sample: request_key upcall handling
>       container, keys: Add a container keyring
>       keys: Fix request_key() lack of Link perm check on found key
>       KEYS: Replace uid/gid/perm permissions checking with an ACL
>       KEYS: Provide KEYCTL_GRANT_PERMISSION
>       keys: Allow a container to be specified as a subject in a key's ACL
>       keys: Provide a way to ask for the container keyring
>       keys: Allow containers to be included in key ACLs by name
>       containers: Sample to grant access to a key in a container
>
>
>  arch/x86/entry/syscalls/syscall_32.tbl             |    3 
>  arch/x86/entry/syscalls/syscall_64.tbl             |    3 
>  arch/x86/ia32/sys_ia32.c                           |    2 
>  certs/blacklist.c                                  |    7 
>  certs/system_keyring.c                             |   12 
>  drivers/acpi/container.c                           |    2 
>  drivers/base/container.c                           |    2 
>  drivers/md/dm-crypt.c                              |    2 
>  drivers/nvdimm/security.c                          |    2 
>  fs/afs/security.c                                  |    2 
>  fs/afs/super.c                                     |   18 +
>  fs/cifs/cifs_spnego.c                              |   25 +
>  fs/cifs/cifsacl.c                                  |   28 +
>  fs/cifs/connect.c                                  |    4 
>  fs/crypto/keyinfo.c                                |    2 
>  fs/ecryptfs/ecryptfs_kernel.h                      |    2 
>  fs/ecryptfs/keystore.c                             |    2 
>  fs/fs_context.c                                    |   39 +
>  fs/fscache/object-list.c                           |    2 
>  fs/fsopen.c                                        |   54 ++
>  fs/namei.c                                         |   45 +-
>  fs/namespace.c                                     |  129 ++++-
>  fs/nfs/nfs4idmap.c                                 |   29 +
>  fs/proc/root.c                                     |   20 +
>  fs/ubifs/auth.c                                    |    2 
>  include/linux/container.h                          |  100 +++-
>  include/linux/container_dev.h                      |   25 +
>  include/linux/cred.h                               |    3 
>  include/linux/fs_context.h                         |    5 
>  include/linux/init_task.h                          |    1 
>  include/linux/key-type.h                           |    2 
>  include/linux/key.h                                |  122 +++--
>  include/linux/lsm_hooks.h                          |   20 +
>  include/linux/nsproxy.h                            |    7 
>  include/linux/pid.h                                |    5 
>  include/linux/proc_ns.h                            |    6 
>  include/linux/sched.h                              |    3 
>  include/linux/sched/task.h                         |    3 
>  include/linux/security.h                           |   15 +
>  include/linux/socket.h                             |    3 
>  include/linux/syscalls.h                           |    6 
>  include/uapi/linux/container.h                     |   28 +
>  include/uapi/linux/keyctl.h                        |   85 +++
>  include/uapi/linux/mount.h                         |    4 
>  init/Kconfig                                       |    7 
>  init/init_task.c                                   |    3 
>  ipc/mqueue.c                                       |   10 
>  kernel/Makefile                                    |    2 
>  kernel/container.c                                 |  532 ++++++++++++++++++++
>  kernel/cred.c                                      |   45 ++
>  kernel/exit.c                                      |    1 
>  kernel/fork.c                                      |  111 ++++
>  kernel/namespaces.h                                |   15 +
>  kernel/nsproxy.c                                   |   32 +
>  kernel/pid.c                                       |    4 
>  kernel/sys_ni.c                                    |    5 
>  lib/digsig.c                                       |    2 
>  net/ceph/ceph_common.c                             |    2 
>  net/compat.c                                       |    2 
>  net/dns_resolver/dns_key.c                         |   12 
>  net/dns_resolver/dns_query.c                       |   15 -
>  net/rxrpc/key.c                                    |   16 -
>  net/socket.c                                       |   34 +
>  samples/vfs/Makefile                               |   12 
>  samples/vfs/test-cont-grant.c                      |   84 +++
>  samples/vfs/test-container.c                       |  382 ++++++++++++++
>  samples/vfs/test-upcall.c                          |  243 +++++++++
>  security/integrity/digsig.c                        |   31 -
>  security/integrity/digsig_asymmetric.c             |    2 
>  security/integrity/evm/evm_crypto.c                |    2 
>  security/integrity/ima/ima_mok.c                   |   13 
>  security/integrity/integrity.h                     |    4 
>  .../integrity/platform_certs/platform_keyring.c    |   13 
>  security/keys/Makefile                             |    2 
>  security/keys/compat.c                             |   20 +
>  security/keys/container.c                          |  419 ++++++++++++++++
>  security/keys/encrypted-keys/encrypted.c           |    2 
>  security/keys/encrypted-keys/masterkey_trusted.c   |    2 
>  security/keys/gc.c                                 |    2 
>  security/keys/internal.h                           |   34 +
>  security/keys/key.c                                |   35 -
>  security/keys/keyctl.c                             |  176 +++++--
>  security/keys/keyring.c                            |  198 ++++++-
>  security/keys/permission.c                         |  446 +++++++++++++++--
>  security/keys/persistent.c                         |   27 +
>  security/keys/proc.c                               |   17 -
>  security/keys/process_keys.c                       |  102 +++-
>  security/keys/request_key.c                        |   70 ++-
>  security/keys/request_key_auth.c                   |   21 +
>  security/security.c                                |   12 
>  security/selinux/hooks.c                           |   16 +
>  security/smack/smack_lsm.c                         |    3 
>  92 files changed, 3696 insertions(+), 425 deletions(-)
>  create mode 100644 include/linux/container_dev.h
>  create mode 100644 include/uapi/linux/container.h
>  create mode 100644 kernel/container.c
>  create mode 100644 kernel/namespaces.h
>  create mode 100644 samples/vfs/test-cont-grant.c
>  create mode 100644 samples/vfs/test-container.c
>  create mode 100644 samples/vfs/test-upcall.c
>  create mode 100644 security/keys/container.c

