[PATCH V3 0/6] namespaces: log namespaces per task

Michael Kerrisk mtk.manpages at gmail.com
Thu May 22 10:20:57 UTC 2014


Richard,

On Tue, May 20, 2014 at 3:12 PM, Richard Guy Briggs <rgb at redhat.com> wrote:
> The purpose is to track namespaces in use by logged processes from the
> perspective of init_*_ns.
>
> 1/6 defines a function to generate them and assigns them.
>
> Use a serial number per namespace (unique across one boot of one kernel)
> instead of the inode number (for which the right to change has reportedly been
> reserved, and which is not necessarily unique if there is more than one proc
> fs).  It could be argued that the inode numbers have now become a de facto
> interface and can't change now, but I'm proposing this approach to see if it
> helps address some of the objections to the earlier patchset.
>
> 2/6 adds access functions to get to the serial numbers in a similar way to
> inode access for namespace proc operations.
>
> 3/6 implements, as suggested by Serge Hallyn, making these serial numbers
> available in /proc/self/ns/{ipc,mnt,net,pid,user,uts}_snum.  I chose "snum"
> instead of "seq" for consistency with inum and because there are a number of
> other uses of "seq" in the namespace code.
>
> 4/6 exposes proc's ns entries structure which lists a number of useful
> operations per namespace type for other subsystems to use.

Since patches 3/6 and 4/6 change the ABI, please CC iterations of this
patch series to linux-api at vger.kernel.org, as per
Documentation/SubmitChecklist.

Cheers,

Michael


> 5/6 provides an example of usage for audit_log_task_info() which is used by
> syscall audits, among others.  audit_log_task() and audit_common_recv_message()
> would be other potential use cases.
>
> Proposed output format:
> This differs slightly from Aristeu's patch because of the label conflict with
> "pid=" due to including it in existing records rather than it being a separate
> record.  The serial numbers are printed in hex.
>         type=SYSCALL msg=audit(1399651071.433:72): arch=c000003e syscall=272 success=yes exit=0 a0=40000000 a1=ffffffffffffffff a2=0 a3=22 items=0 ppid=1 pid=483 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="(t-daemon)" exe="/usr/lib/systemd/systemd" netns=97 utsns=2 ipcns=1 pidns=4 userns=3 mntns=5 subj=system_u:system_r:init_t:s0 key=(null)
>
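
For anyone consuming records in the proposed format, here is a minimal
sketch of pulling the namespace serial numbers out of a record.  It assumes
the six field names shown in the example above and hex-encoded values, as
the cover letter states; the function name and the shortened sample record
are only illustrative and are not part of the patch set.

```python
import re

# Field names taken from the proposed SYSCALL record format quoted above.
NS_FIELDS = ("netns", "utsns", "ipcns", "pidns", "userns", "mntns")

# key=value pairs whose value is a bare hex number followed by whitespace/EOL.
_HEX_FIELD = re.compile(r"(\w+)=([0-9a-fA-F]+)(?=\s|$)")


def parse_ns_serials(record):
    """Return {field: serial} for the namespace fields found in one record.

    Hypothetical helper: serial numbers are decoded as hex, per the
    cover letter ("The serial numbers are printed in hex").
    """
    serials = {}
    for key, value in _HEX_FIELD.findall(record):
        if key in NS_FIELDS:
            serials[key] = int(value, 16)
    return serials


# Shortened copy of the example record, for illustration only.
sample = (
    'type=SYSCALL msg=audit(1399651071.433:72): pid=483 comm="(t-daemon)" '
    "netns=97 utsns=2 ipcns=1 pidns=4 userns=3 mntns=5 key=(null)"
)
print(parse_ns_serials(sample))
```

Note that "netns=97" decodes to 151 decimal, so a consumer that treats the
value as decimal would silently get the wrong namespace.
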
> 6/6 tracks the creation and deletion of namespaces, listing the type of
> namespace instance, related namespace id if there is one and the newly minted
> serial number.
>
> Proposed output format:
>         type=NS_INIT msg=audit(1400217435.706:94): pid=524 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:mount_t:s0 type=20000 old_snum=0 snum=a1 res=1
>         type=NS_DEL msg=audit(1400217435.730:95): pid=524 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:mount_t:s0 type=20000 snum=a1 res=1
>
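
One thing a parser has to cope with in these records is that "type=" appears
twice: once as the record type and once as the namespace type.  Below is a
rough sketch of decoding them, assuming the second "type=" is the CLONE_NEW*
flag printed in hex.  That assumption is consistent with the example, since
20000 hex is CLONE_NEWNS and the subject is mount_t, but the cover letter
does not say so explicitly.  The helper name is made up for illustration.

```python
# CLONE_NEW* flag values from <linux/sched.h>, mapped to namespace names.
CLONE_FLAGS = {
    0x00020000: "mnt",   # CLONE_NEWNS
    0x04000000: "uts",   # CLONE_NEWUTS
    0x08000000: "ipc",   # CLONE_NEWIPC
    0x10000000: "user",  # CLONE_NEWUSER
    0x20000000: "pid",   # CLONE_NEWPID
    0x40000000: "net",   # CLONE_NEWNET
}


def parse_ns_event(record):
    """Decode one NS_INIT/NS_DEL record in the proposed format.

    Hypothetical helper.  Assumes whitespace-separated key=value tokens,
    hex serial numbers, and that the *last* "type=" is the namespace type.
    """
    fields = [tok.split("=", 1) for tok in record.split() if "=" in tok]
    types = [value for key, value in fields if key == "type"]
    last = dict(fields)  # for duplicate keys, the later occurrence wins

    ns_name = "unknown"
    if len(types) >= 2:
        ns_name = CLONE_FLAGS.get(int(types[-1], 16), "unknown")

    return {
        "event": types[0] if types else None,
        "ns": ns_name,
        "snum": int(last["snum"], 16),
        # old_snum is only present on NS_INIT in the examples above.
        "old_snum": int(last["old_snum"], 16) if "old_snum" in last else None,
    }


init_rec = (
    "type=NS_INIT msg=audit(1400217435.706:94): pid=524 uid=0 "
    "auid=4294967295 ses=4294967295 subj=system_u:system_r:mount_t:s0 "
    "type=20000 old_snum=0 snum=a1 res=1"
)
print(parse_ns_event(init_rec))
```
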
>
> v2 -> v3:
>         Use atomic64_t in ns_serial to simplify it.
>         Avoid function duplication in proc, keying on dentry.
>         Squash down audit patch to avoid rcu sleep issues.
>         Add tracking for creation and deletion of namespace instances.
>
> v1 -> v2:
>         Avoid rollover by switching from an int to a long long.
>         Change rollover behaviour from simply avoiding zero to raising a BUG.
>         Expose serial numbers in /proc/<pid>/ns/*_snum.
>         Expose ns_entries and use it in audit.
>
>
> Notes:
> There has been some progress made for audit in net namespaces and pid
> namespaces since this previous thread.  net namespaces are now served as peers
> by one auditd in the init_net namespace with processes in a non-init_net
> namespace being able to write records if they are in the init_user_ns and have
> CAP_AUDIT_WRITE.  Processes in a non-init_pid_ns can now similarly write
> records.  As for CAP_AUDIT_READ, I just posted a patchset to check capabilities
> of userspace processes that try to join netlink broadcast groups.
>
> This set does not try to solve the non-init namespace audit messages and
> auditd problem yet.  That will come later, likely with additional auditd
> instances running in another namespace with a limited ability to influence the
> master auditd.  I echo Eric B's idea that messages destined for different
> namespaces would have to be tailored for that namespace with references that
> make sense (such as the right pid number reported to that pid namespace, and
> not leaking info about parents or peers).
>
> Bugs:
> Patch 6/6 has a timing bug such that the initial mnt and net namespaces
> never get logged, I suspect because they are initialized before the audit
> subsystem.  I've tried moving audit from __initcall to subsys_initcall, but
> that doesn't help.
>
> Questions:
> Is there a way to link serial numbers of namespaces involved in migration of a
> container to another kernel?  It sounds like what is needed is a part of a
> management application that is able to pull the audit records from constituent
> hosts to build an audit trail of a container.
>
> What additional events should list this information?
>
> Does this present any problematic information leaks?  Only CAP_AUDIT_CONTROL
> (and proposed CAP_AUDIT_READ) in init_user_ns can get to this information in
> the init namespace at the moment from audit.  *However*, the addition of
> /proc/<pid>/ns/*_snum does make it available to other processes now.
>
>
> Richard Guy Briggs (6):
>   namespaces: assign each namespace instance a serial number
>   namespaces: expose namespace instance serial number in proc_ns_operations
>   namespaces: expose ns instance serial numbers in proc
>   namespaces: expose ns_entries
>   audit: log namespace serial numbers
>   audit: log creation and deletion of namespace instances
>
>  fs/mount.h                     |    1 +
>  fs/namespace.c                 |   12 +++++++++
>  fs/proc/namespaces.c           |   35 +++++++++++++++++++-------
>  include/linux/audit.h          |   15 +++++++++++
>  include/linux/ipc_namespace.h  |    1 +
>  include/linux/nsproxy.h        |    8 ++++++
>  include/linux/pid_namespace.h  |    1 +
>  include/linux/proc_ns.h        |    2 +
>  include/linux/user_namespace.h |    1 +
>  include/linux/utsname.h        |    1 +
>  include/net/net_namespace.h    |    1 +
>  include/uapi/linux/audit.h     |    2 +
>  init/version.c                 |    1 +
>  ipc/msgutil.c                  |    1 +
>  ipc/namespace.c                |   20 +++++++++++++++
>  kernel/audit.c                 |   53 +++++++++++++++++++++++++++++++++++++++-
>  kernel/nsproxy.c               |   17 +++++++++++++
>  kernel/pid.c                   |    1 +
>  kernel/pid_namespace.c         |   19 ++++++++++++++
>  kernel/user.c                  |    1 +
>  kernel/user_namespace.c        |   18 +++++++++++++
>  kernel/utsname.c               |   20 +++++++++++++++
>  net/core/net_namespace.c       |   20 ++++++++++++++-
>  23 files changed, 240 insertions(+), 11 deletions(-)
>
> _______________________________________________
> Containers mailing list
> Containers at lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers



-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/

