uevent when moving nic between network namespaces?

Eric W. Biederman ebiederm at xmission.com
Fri Oct 12 19:38:47 UTC 2012


Serge Hallyn <serge.hallyn at canonical.com> writes:

> Quoting Eric W. Biederman (ebiederm at xmission.com):
>> Serge Hallyn <serge.hallyn at canonical.com> writes:
>> 
>> > Hi,
>> >
>> > Dan Kegel (cc:d) found an interesting nuisance relating to upstart
>> > and network interfaces with lxc containers.  In particular, when you
>> > start a container, two veths are created.  A uevent for their creation
>> > is sent, and so a 'network-interface' upstart job is created for each.
>> > One of the veths is passed into the container.  When the container
>> > shuts down, the veth in the init-net-ns gets a net-device-removed
>> > uevent, so the network-interface upstart job goes away.  But the veth
>> > in the container doesn't cause a net-device-removed upstart uevent
>> > to be sent.  So its network-interface upstart job sticks around.
>> >
>> > The details are at:
>> >
>> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
>> >
>> > I notice that when simply renaming a netdev (sudo ip link set veth1 name
>> > veth2) then udevadm monitor shows:
>> >
>> > KERNEL[17945.234850] move     /devices/virtual/net/veth2 (net)
>> > UDEV  [17945.235758] move     /devices/virtual/net/veth2 (net)
>> >
>> > but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
>> > monitor' shows nothing.
>> >
>> > When I do
>> >
>> > 	sudo ip link set veth1 netns 32296
>> > 	(in process 32296) sudo ip link set veth1 name veth2
>> >
>> > then, again, udevadm monitor shows nothing.
>> >
>> > So the question is, should the kernel be sending uevents for
>> > net-device-removed and then net-device-added when a nic is moved
>> > between network namespaces?  Or should lxc just fake that?
>> 
>> To the best of my memory I wired up those events, and they should be
>> delivered.  Now the uevents will only be delivered in the relevant
>> network namespace.
>> 
>> Hmm.  But the relevant code in the kernel is device_rename, and it
>> happens after we switch the network namespace on the device.
>> 
>> Which probably means that in practice only the new network namespace is
>> seeing uevents.
>> 
>> Grr.
>
> Ah, indeed.  A few more experiments show that:
>
> 1. 'sudo ip link add type veth' on the host ends up with some kernel
> messages, namely
>
> KERNEL[389.393581] add      /devices/virtual/net/veth1/queues/rx-0 (queues)
> KERNEL[389.394953] add      /devices/virtual/net/veth1/queues/tx-0 (queues)
>
> sent to all namespaces - though the 

Yes.  The queue uevents are not currently network namespace aware.  That
is a bug I would be happy to see fixed.

> UDEV  [389.405255] add      /devices/virtual/net/veth1 (net)
>
> only gets sent to the initial namespace.
>
> 2. Then when I 'sudo ip link set veth1 netns <pid-in-container>', I get
>
> KERNEL[405.041296] move     /devices/virtual/net/veth2 (net)
>
> only in the container's namespace - exactly as you said above should
> happen.
>
> Eric, are you working on a patch for this?  Should we just explicitly
> add a remove uevent before doing the transition, or is it more
> complicated than that?

I am not currently working on a patch for this, but I will be happy to
review one. At a quick glance it looks like this could just be as
simple as calling kobject_uevent at the proper time, but testing and
reading through the relevant code paths is probably a good idea as there
always seem to be gotchas in that code.

Eric
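
[Editor's sketch, not part of the original message.]  The symptom
discussed above is that a listener in the initial network namespace sees
an "add" for a veth but never a matching "remove" once the device has
been moved into the container.  A rough way to check a captured
'udevadm monitor' log for that pattern is sketched below.  The line
format is assumed from the monitor output quoted in this thread, and the
names parse_monitor_line and live_netdevs are hypothetical helpers, not
part of udev or lxc.  "move" events are parsed but deliberately not
tracked, because the one-line monitor output does not show the old
device path.

```python
import re
from typing import Iterable, NamedTuple, Optional, Set

# Line format assumed from the output quoted in this thread, e.g.
#   KERNEL[17945.234850] move     /devices/virtual/net/veth2 (net)
#   UDEV  [17945.235758] move     /devices/virtual/net/veth2 (net)
LINE_RE = re.compile(
    r"^(?P<source>KERNEL|UDEV)\s*\[(?P<time>\d+\.\d+)\]\s+"
    r"(?P<action>\S+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)\s*$"
)


class Uevent(NamedTuple):
    source: str      # "KERNEL" or "UDEV"
    time: float      # timestamp printed by the monitor
    action: str      # "add", "remove", "move", ...
    devpath: str     # e.g. /devices/virtual/net/veth1
    subsystem: str   # e.g. "net" or "queues"


def parse_monitor_line(line: str) -> Optional[Uevent]:
    """Parse one monitor line; return None if it is not an event line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Uevent(m["source"], float(m["time"]), m["action"],
                  m["devpath"], m["subsystem"])


def live_netdevs(lines: Iterable[str]) -> Set[str]:
    """Devpaths of net devices this listener saw added but never removed.

    Only KERNEL events in the "net" subsystem are considered, so the
    per-queue events (subsystem "queues") are ignored.
    """
    live: Set[str] = set()
    for line in lines:
        ev = parse_monitor_line(line)
        if ev is None or ev.source != "KERNEL" or ev.subsystem != "net":
            continue
        if ev.action == "add":
            live.add(ev.devpath)
        elif ev.action == "remove":
            live.discard(ev.devpath)
    return live
```

Feeding this the log captured on the host across a container start and
stop would list the devices for which something like the
network-interface job could be left behind.  It only reports what one
listener saw; it says nothing about what the kernel sent to other
namespaces.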