[PATCH 2/6] C/R: Basic support for network namespaces and devices (v5)

Eric W. Biederman ebiederm at xmission.com
Mon Mar 8 09:53:20 PST 2010


Dan Smith <danms at us.ibm.com> writes:

> OL> I'm confused: in checkpoint_ns() inside the for_each_netdev() loop
> OL> you first test for dev->netdev_ops->ndo_checkpoint and then call
> OL> checkpoint_obj(... CKPT_OBJ_NETDEV) - which in turn will call
> OL> checkpoint_netdev(), which will again test for
> dev-> netdev_ops->ndo_checkpoint ...  am I reading it wrongly ?
>
> In the case of veth, yes.  It goes something like this:
>
> checkpoint_netns() {
>   foreach netdev in netns {
>     checkpoint_netdev {
>       if netdev is veth {
>         checkpoint_peer(); // Will call checkpoint_netdev again
>       }
>     }
>   }
> }
>
> It shouldn't happen, but it seems like since we could potentially add
> another checkpoint_obj(mydev) somewhere other than in
> checkpoint_netdev(), it is reasonable to check that there is actually
> something to call before we call it.
>
> Would you prefer a BUG()?
>
> OL> How about this - to me it feels simpler:
>
> OL> 	dev = rtnl_newlink(veth_new_link_msg, &veth, this_name);
> OL> 	if (IS_ERR(dev))
> OL> 		return dev;
>
> OL> 	peer = dev_get_by_name(current->nsproxy->net_ns, peer_name);
> OL> 	if (!peer) {
> OL> 		ret = -EINVAL;
> OL> 		goto err_dev;
> OL> 	}
> OL> 	ret = ckpt_obj_insert(ctx, peer, h->veth.peer_ref,
> OL> 			      CKPT_OBJ_NETDEV);
> OL> 	if (ret < 0)
> OL> 		goto err_peer;
>
> OL> 	dev_put(peer);
>
> OL> 	dq.dev = dev;
> OL> 	dq.peer = peer;
> OL> 	ret = deferqueue_add(ctx->deferqueue, &dq, sizeof(dq),
> OL> 			     netdev_noop, netdev_cleanup);
> OL> 	if (ret)
> OL> 		goto err_peer;
>
> If you fail here you need to unregister_netdev() because the dev_put()
> that the objhash will not cause it to happen.  Unless we add something
> to allow you to remove your object from the hash, you can't prevent
> that final put, so you have to have it in the deferqueue for
> later.  You can't check the refcount in the objhash function because it
> will differ depending on the number of addresses and protocols the
> device has, and those don't get released until unregister_netdev()
> which will block if you call it before you've released all of your
> references.  If the objhash put function could examine ctx->errno,
> then it could drop its reference and then call unregister_netdev(),
> but that would involve changing all the drop functions.  What am I
> missing?

Can we take advantage of the fact that when you destroy a network
namespace the virtual devices in that network namespace are also destroyed?

Eric



More information about the Containers mailing list