Eric W. Biederman
ebiederm at xmission.com
Thu Oct 3 09:17:17 UTC 2013
Amir Goldstein <amir at cellrox.com> writes:
> Excellent! Let's focus the discussion on a new device driver we want
> to write which is namespace aware. Let's call this device driver
> valarm-dev. Similarly to Android's alarm-dev, valarm-dev can be used
> to request RTC wakeup calls from user space and get/set RTC values,
> but with valarm-dev, every may use different values for current time.
> As you can see in our patch set, we already have a version of
> alarm-dev that maintains its state inside a context, instead of in a
> global variable, so it is capable of providing a different context
> per namespace.
> And now for the 1M$ question: per *which* namespace do we attribute
> the current realtime clock time?
To none of them. Just use a different minor per instance, then you
don't have a hard question to answer.
> To UTS namespace (because T historically stands for Time)? To device
> Even if a device namespace would exist, we do not want to tie the
> policy decision of "separate time" to a very wide definition of
> "separate devices".
> So what we want to create is an API for device driver writers that
> will enable them to write a namespace-aware device and allow
> userspace to configure when the namespace-aware device context is
> unshared.
> We would like to share with you our very initial thoughts about how
> this will be implemented:
> - Extend register_pernet_subsys/device(ops) API
> to register_perns_subsys/device(nstype, ops) API
> - Extend pernet_operations to perns_operations that include optional
> migrate() and/or unshare() ops
> - Let valarm-dev register_peruser_subsys/device(&alarm_userns_ops)
For the network subsystem that makes sense. But it doesn't make sense
for devices. It is just an unneeded extra complication.
> - Implement a new syscall (or netlink command if it makes more sense)
> setdevns(int dev_fd, int ns_fd, int nstype, int flags)
ioctl? master device? How do people communicate with raw devices these
days?
> - Unlike the netlink set netns case, this API is not used solely to
> *move* a device to a different namespace, but also to *unshare* a
> device context between namespaces, for those devices that
> registered unshare() ops.
I really think this all makes most sense a driver a virtual driver at a
> This is our missing piece of the puzzle.
> After that, whether we make changes to existing drivers (e.g. evdev)
> or write new virtualized drivers (e.g. vevdev)
> is a technicality. We care not which way to go, whichever way seems
> more maintainable.
> What do you think of this master plan?
I think that by making your devices' behavior depend on which namespace
they are in, you are making the drivers unnecessarily fragile, and
I think the code will be simpler/cleaner/better if you don't need to
have context outside of your drivers.
> P.S. Please try to refrain from addressing the validity of the use
> case of alarm-dev in particular, as we do not wish to get engaged in
> "Android sucks" wars.
> We simply want to present the case for improving the namespace
> infrastructure to cater to the needs of device driver writers that
> wish to tailor their drivers for container-based products.
I think this is a driver interface problem, not a namespace problem.
None of the similar drivers that exist in the network namespace
change their behavior depending on which namespace they are in.
The two practical choices I see are:
1) Use a bunch of minors for your driver.
2) Act roughly like /dev/pts and use different mounts of the filesystem
to create new instances.
I think different minors is probably easier, but we have two successful
models I am aware of, so I have mentioned both.
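
To make the "bunch of minors" option concrete, here is a minimal
userspace sketch in C of what per-minor state could look like. It is
only an illustration, not kernel code and not taken from the patch set:
the names valarm_ctx, valarm_ctx_for_minor, valarm_set_time,
valarm_get_time and the VALARM_MAX_MINORS limit are all hypothetical,
and the hardware clock is passed in as a plain number. The idea it
shows is that each minor owns its own context (here, an offset from the
shared clock), so no namespace lookup is needed at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on the number of instances (minors). */
#define VALARM_MAX_MINORS 8

/* Hypothetical per-instance state: one context per minor number. */
struct valarm_ctx {
    int64_t offset_sec; /* offset of this instance from the shared clock */
};

/* Zero-initialized, so every instance starts in sync with the hardware. */
struct valarm_ctx valarm_instances[VALARM_MAX_MINORS];

/* Map a minor number to its private context; NULL if out of range. */
struct valarm_ctx *valarm_ctx_for_minor(unsigned int minor)
{
    if (minor >= VALARM_MAX_MINORS)
        return NULL;
    return &valarm_instances[minor];
}

/*
 * "Set the time" on one instance. The shared hardware clock (hw_now) is
 * never touched; only this instance's offset is recorded.
 */
int valarm_set_time(unsigned int minor, int64_t hw_now, int64_t new_time)
{
    struct valarm_ctx *ctx = valarm_ctx_for_minor(minor);

    if (ctx == NULL)
        return -1;
    ctx->offset_sec = new_time - hw_now;
    return 0;
}

/* Read the time as seen by one instance: hardware time plus its offset. */
int valarm_get_time(unsigned int minor, int64_t hw_now, int64_t *out)
{
    struct valarm_ctx *ctx = valarm_ctx_for_minor(minor);

    if (ctx == NULL || out == NULL)
        return -1;
    *out = hw_now + ctx->offset_sec;
    return 0;
}
```

In a real driver the minor would presumably come from the opened device
node, and each container would simply be handed a node with its own
minor. Which container gets which minor is then a userspace policy
decision rather than something the driver has to infer.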