uevent when moving nic between network namespaces?
Eric W. Biederman
ebiederm at xmission.com
Fri Oct 12 19:38:47 UTC 2012
Serge Hallyn <serge.hallyn at canonical.com> writes:
> Quoting Eric W. Biederman (ebiederm at xmission.com):
>> Serge Hallyn <serge.hallyn at canonical.com> writes:
>>
>> > Hi,
>> >
>> > Dan Kegel (cc:d) found an interesting nuisance relating to upstart
>> > and network interfaces with lxc containers. In particular, when you
>> > start a container, two veths are created. A uevent for their creation
>> > is sent, and so a 'network-interface' upstart job is created for each.
>> > One of the veths is passed into the container. When the container
>> > shuts down, the veth in the init-net-ns gets a net-device-removed
>> > uevent, so the network-interface upstart job goes away. But the veth
>> > in the container doesn't cause a net-device-removed upstart uevent
>> > to be sent. So its network-interface upstart job sticks around.
>> >
>> > The details are at:
>> >
>> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
>> >
>> > I notice that when simply renaming a netdev (sudo ip link set veth1 name
>> > veth2) then udevadm monitor shows:
>> >
>> > KERNEL[17945.234850] move /devices/virtual/net/veth2 (net)
>> > UDEV [17945.235758] move /devices/virtual/net/veth2 (net)
>> >
>> > but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
>> > monitor' shows nothing.
>> >
>> > When I do
>> >
>> > sudo ip link set veth1 netns 32296
>> > (in process 32296) sudo ip link set veth1 name veth2
>> >
>> > then, again udevadm monitor shows nothing.
>> >
>> > So the question is, should the kernel be sending uevents for
>> > net-device-removed and then net-device-added when a nic is moved
>> > between network namespaces? Or should lxc just fake that?
>>
>> To the best of my memory I wired up those events, and they should be
>> delivered. Now the uevents will only be delivered in the relevant
>> network namespace.
>>
>> Hmm. But the relevant code in the kernel is device_rename, and it
>> happens after we switch the network namespace on the device.
>>
>> Which probably means that in practice only the new network namespace is
>> seeing uevents.
>>
>> Grr.
>
> Ah, indeed. A few more experiments show that:
>
> 1. 'sudo ip link add type veth' on the host ends up with some kernel
> messages, namely
>
> KERNEL[389.393581] add /devices/virtual/net/veth1/queues/rx-0 (queues)
> KERNEL[389.394953] add /devices/virtual/net/veth1/queues/tx-0 (queues)
>
> sent to all namespaces - though the

Yes. The queue uevents are not currently network namespace aware. That
is a bug I would be happy to see fixed.

> UDEV [389.405255] add /devices/virtual/net/veth1 (net)
>
> only gets sent to the initial namespace.
>
> 2. Then when I 'sudo ip link set veth1 netns <pid-in-container>', I get
>
> KERNEL[405.041296] move /devices/virtual/net/veth2 (net)
>
> only in the container's namespace - exactly as you said above should
> happen.
>
> Eric, are you working on a patch for this? Should we just explicitly
> add a remove uevent before doing the transition, or is it more
> complicated than that?

I am not currently working on a patch for this, but I will be happy to
review one. At a quick glance it looks like this could be as simple as
calling kobject_uevent at the proper time, but testing and reading
through the relevant code paths is probably a good idea, as there
always seem to be gotchas in that code.
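
For the testing part, something like the following could be used to
watch raw kernel uevents from inside each network namespace while a
device is moved. This is only a minimal Python sketch, not code from
the kernel tree or from udev: parse_uevent and monitor are hypothetical
helper names, and it assumes the usual NETLINK_KOBJECT_UEVENT protocol
number (15) and the kernel multicast group (1).

```python
import socket

# Assumed values: protocol number from <linux/netlink.h> and the
# multicast group on which the kernel broadcasts raw uevents.
NETLINK_KOBJECT_UEVENT = 15
KERNEL_GROUP = 1


def parse_uevent(data):
    """Split a raw kernel uevent into (summary, properties).

    A kernel message is a NUL-separated list such as
    b"move@/devices/virtual/net/veth2\\0ACTION=move\\0SUBSYSTEM=net\\0".
    Messages re-broadcast by udev start with b"libudev\\0"; those are
    skipped and None is returned.
    """
    if data.startswith(b"libudev\0"):
        return None
    fields = [f.decode("utf-8", "replace") for f in data.split(b"\0") if f]
    if not fields:
        return None
    summary = fields[0]  # e.g. "move@/devices/virtual/net/veth2"
    props = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return summary, props


def monitor():
    """Print net-subsystem uevents seen in the caller's network namespace."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_GROUP))  # pid 0: let the kernel pick the port id
    while True:
        parsed = parse_uevent(sock.recv(8192))
        if parsed and parsed[1].get("SUBSYSTEM") == "net":
            print(parsed[0])
```

Calling monitor() once in the initial namespace and once in the
container's namespace, then running 'ip link set ... netns ...', should
show which side sees the add, remove, or move events.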
Eric