[PATCH 2/3] lib/kobject_uevent.c: add uevent forwarding function

Greg KH gregkh at linuxfoundation.org
Wed Sep 9 20:11:23 UTC 2015


On Wed, Sep 09, 2015 at 03:24:12PM -0400, Michael J Coss wrote:
> On 9/8/2015 11:55 PM, Greg KH wrote:
> > On Tue, Sep 08, 2015 at 10:10:29PM -0400, Michael J. Coss wrote:
> >> Adds capability to allow userspace programs to forward a given event to
> >> a specific network namespace as determined by the provided pid.  In
> >> addition, support for a per-namespace kobject_sequence counter was
> >> added.  Sysfs was modified to return the correct event counter based on
> >> the current network namespace.
> >>
> >> Signed-off-by: Michael J. Coss <michael.coss at alcatel-lucent.com>
> >> ---
> >>  include/linux/kobject.h     |  3 ++
> >>  include/net/net_namespace.h |  3 ++
> >>  kernel/ksysfs.c             | 12 ++++++
> >>  lib/kobject_uevent.c        | 90 +++++++++++++++++++++++++++++++++++++++++++++
> >>  4 files changed, 108 insertions(+)
> >>
> >> diff --git a/include/linux/kobject.h b/include/linux/kobject.h
> >> index 637f670..d1bb509 100644
> >> --- a/include/linux/kobject.h
> >> +++ b/include/linux/kobject.h
> >> @@ -215,6 +215,9 @@ extern struct kobject *firmware_kobj;
> >>  int kobject_uevent(struct kobject *kobj, enum kobject_action action);
> >>  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
> >>  			char *envp[]);
> >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE)
> >> +int kobject_uevent_forward(char *buf, size_t len, pid_t pid);
> >> +#endif
> >>  
> >>  __printf(2, 3)
> >>  int add_uevent_var(struct kobj_uevent_env *env, const char *format, ...);
> >> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> >> index e951453..a4013e5 100644
> >> --- a/include/net/net_namespace.h
> >> +++ b/include/net/net_namespace.h
> >> @@ -134,6 +134,9 @@ struct net {
> >>  #if IS_ENABLED(CONFIG_MPLS)
> >>  	struct netns_mpls	mpls;
> >>  #endif
> >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE)
> >> +	u64				kevent_seqnum;
> >> +#endif
> >>  	struct sock		*diag_nlsk;
> >>  	atomic_t		fnhe_genid;
> >>  };
> >> diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
> >> index 6683cce..4bc15fd 100644
> >> --- a/kernel/ksysfs.c
> >> +++ b/kernel/ksysfs.c
> >> @@ -21,6 +21,9 @@
> >>  #include <linux/compiler.h>
> >>  
> >>  #include <linux/rcupdate.h>	/* rcu_expedited */
> >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE)
> >> +#include <net/net_namespace.h>
> >> +#endif
> > #if isn't needed here, right?
> >
> >>  
> >>  #define KERNEL_ATTR_RO(_name) \
> >>  static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
> >> @@ -33,6 +36,15 @@ static struct kobj_attribute _name##_attr = \
> >>  static ssize_t uevent_seqnum_show(struct kobject *kobj,
> >>  				  struct kobj_attribute *attr, char *buf)
> >>  {
> >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE)
> >> +	pid_t p = task_pid_vnr(current);
> >> +	struct net *n = get_net_ns_by_pid(p);
> >> +
> >> +	if (n != ERR_PTR(-ESRCH)) {
> >> +		if (!net_eq(n, &init_net))
> >> +			return sprintf(buf, "%llu\n", n->kevent_seqnum);
> >> +	}
> >> +#endif
> >>  	return sprintf(buf, "%llu\n", (unsigned long long)uevent_seqnum);
> >>  }
> >>  KERNEL_ATTR_RO(uevent_seqnum);
> >> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
> >> index d791e33..7589745 100644
> >> --- a/lib/kobject_uevent.c
> >> +++ b/lib/kobject_uevent.c
> >> @@ -379,6 +379,96 @@ int kobject_uevent(struct kobject *kobj, enum kobject_action action)
> >>  }
> >>  EXPORT_SYMBOL_GPL(kobject_uevent);
> >>  
> >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE)
> >> +/**
> >> + * kobject_uevent_forward - forward event to specified network namespace
> >> + *
> >> + * @buf: event buffer
> >> + * @len: event length
> >> + * @pid: pid of network namespace
> >> + *
> >> + * Returns 0 if kobject_uevent_forward() is completed with success or the
> >> + * corresponding error when it fails.
> >> + */
> >> +int kobject_uevent_forward(char *buf, size_t len, pid_t pid)
> >> +{
> >> +	int retval = 0;
> >> +#if defined(CONFIG_NET)
> >> +	struct uevent_sock *ue_sk;
> >> +	struct net *pns;
> >> +	char *p;
> >> +	u64 num;
> >> +
> >> +	/* grab the network namespace of the provided pid */
> >> +	pns = get_net_ns_by_pid(pid);
> >> +	if (pns == ERR_PTR(-ESRCH))
> >> +		return -ESRCH;
> >> +
> >> +	/* find sequence number in buffer */
> >> +	p = buf;
> >> +	num = 0;
> >> +	while (p < (buf + len)) {
> >> +		if (strncmp(p, "SEQNUM=", 7) == 0) {
> >> +			int r;
> >> +
> >> +			p += 7;
> >> +			r = kstrtoull(p, 10, &num);
> >> +			if (r) {
> >> +				put_net(pns);
> >> +				return r;
> >> +			}
> >> +			break;
> >> +		}
> >> +		p += (strlen(p) + 1);
> >> +	}
> > Ok, that's crazy, replacing an existing sequence number with a sequence
> > number of the namespace?  An interesting hack, yes, but something we
> > want to maintain for the next 20+ years, no.  Surely there's a better
> > way to do this?
> >
> > But again, I'm not sold on this whole idea anyway.
> >
> > greg k-h
> >
> Re: the #if
> The #if is only needed because the new udevns code references a
> structure defined in that header.  Obviously it could be included
> without consequences but I #if it to show it was part of the forwarding
> code.

That's not an issue, we don't put #ifdefs in .c code if at all possible,
you will note that you added a bunch of them :(

> Re: sequence #
> I wanted it as transparent as possible, without this the udevd running
> inside the container has issues with the values of the sequence numbers
> seen, by making it tied to the namespace, udevd could run unchanged. 

Oh I know why you did it, I just don't like it :)

> Our goal was to minimize the changes to a linux distro and still be able
> to run a full desktop inside a container.  But even in absence of our
> use case, the first question is should uevents be broadcasts to every
> network namespace?  I don't think that broadcasting is the correct
> action.  And follow on questions are what if anything should be done
> with regards to uevents and containers.

I don't think you should be running udevd within a container, as a
device is never bound to a namespace (network devices are the only
exception), it's a false thing to think that a uevent is only for a
single namespace as well.  I understand your wish to change only the
kernel to get unmodified userspace to run, but I suggest modifying your
userspace instead :)

sorry,

greg k-h


More information about the Containers mailing list