[cgl_discussion] Re: device enumeration

Sun Feb 9 05:36:20 PST 2003

On Fri, Feb 07, 2003 at 10:29:51AM -0700, Steven Dake wrote:

Sorry for the delay in responding, I'm somewhere in China right now,
with pretty flaky internet access for the next week or so.  Please make
sure to CC: and be patient on responses...

> Performance is critical; just ask any vendor.  It may not be critical in 
> PCI hotswap and it certainly isn't critical in USB hotswap, but in 
> next-gen architectures, the difference in speed would allow an 
> operator to remove a physical device before the OS is ready to have it 
> removed (surprise removal).  This works fine in pci (reads 0xff), but 
> other architectures don't like it so much.  I am working on surprise 
> removal support for advanced tca, but it's much more complicated than an 
> expected extraction and until I can say for sure it will work, 
> performance does matter.

As Pat has pointed out, it is entirely up to the driver within the kernel
to do this work in the proper amount of time, if you have speed
constraints.  Only after the device is gone is /sbin/hotplug notified.
The same holds when a device is added to the system: the kernel takes
care of it first and, when finished, calls /sbin/hotplug to report that
something has happened.
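
To make the ordering concrete, here is a minimal sketch of what a
userspace agent sees when it is finally run.  It assumes the usual
calling convention for /sbin/hotplug (subsystem name as the first
argument, ACTION and DEVPATH in the environment); the function name and
the output format are made up for illustration.  By the time this code
executes, the kernel has already finished its side of the add or remove,
so nothing here is timing critical.

```python
import os
import sys


def describe_hotplug_event(argv, environ):
    """Summarize a hotplug event from the agent's arguments and environment.

    Assumption: argv[1] is the subsystem name, and ACTION / DEVPATH are
    passed as environment variables.  Missing values fall back to
    placeholders instead of raising.
    """
    subsystem = argv[1] if len(argv) > 1 else "unknown"
    action = environ.get("ACTION", "unknown")
    devpath = environ.get("DEVPATH", "")
    # rstrip() drops the trailing space when no DEVPATH was supplied.
    return f"{subsystem}: {action} {devpath}".rstrip()


if __name__ == "__main__":
    # The agent only reports; the kernel has already handled the device.
    print(describe_hotplug_event(sys.argv, os.environ))
```

Because the agent is only reporting an event that has already been
handled, it can log, update a database, or forward the event to other
programs at its own pace.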

As such, there are no speed requirements on /sbin/hotplug itself, only
your kernel driver.  As an example, look at the time delay in pressing
the button on a pci hotplug slot on a Compaq based system.  I haven't
measured the response time in a while, but the only measurable delay in
that code path was the wait time that the hardware required to power
down the slot.

So your previous argument of /sbin/hotplug being slow and frying
hardware is moot, sorry :)

> I don't necessarily agree that _all_ kernel developers believe ioctl's 
> should be deprecated.  Just look at all of the rich ioctls in the kernel 
> currently.

Um, can you count how many new ioctls have been added to the main kernel
tree in the 2.5 kernel series?  I would be very surprised if there were
many at all.  If you want to add a new ioctl to the kernel tree, you
will have a very hard time convincing people that it is necessary.
There is one instance where an ioctl is the proper tool, but that is a
very small fraction of the wide and varied ways in which people have
abused the ioctl interface.

> The major problem without using ioctls (by using a 
> filesystem for accessing methods in the kernel) is that there is no way 
> to retrieve a return code.  Without a return code, how is the 
> application supposed to know what the kernel did was successful, but by 
> polling its state again?  Then the application may understand the 
> operation was faulted, but the exact failure reason is still up in the 
> air.

As Pat pointed out, please look at the return value of read() and
write().
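
As a rough userspace sketch of that point: a filesystem attribute is
just a file, so a failed open() or write() reports an errno that names
the failure reason, with no need to poll the state afterwards.  The
helper name and the attribute file used in the demo are hypothetical; a
temporary file stands in for a real driver attribute, and which errno a
given driver returns is up to that driver.

```python
import errno
import os
import tempfile


def write_attribute(path, value):
    """Write value to an attribute file.

    Returns (bytes_written, None) on success, or (None, errno_name) when
    open() or write() fails, so the caller learns the failure reason.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        written = os.write(fd, value.encode())
        return written, None
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        attr = os.path.join(tmp, "power")  # hypothetical attribute name
        open(attr, "w").close()            # stand-in for a driver attribute
        print(write_attribute(attr, "0"))
        print(write_attribute(os.path.join(tmp, "missing"), "0"))
```

In the demo, the first call succeeds and reports the number of bytes
written, while the second fails at open() because the file does not
exist, and the errno name says so.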

> I suppose if the community makes the decision that living without 
> return codes is acceptable, I could live with it.

The "community" has done no such thing.

> Here is how it works.  A telco has a fault in their system.  They figure 
> out what the exact failure is (bad switch, bad hub, bad disk, bad cpu 
> blade, bad whatever), and they dispatch a $10/hour worker to fix it. 
> The worker presses the hotswap request button, and while the OS is 
> busily executing /sbin/hotplug, the worker thinks it's ok to remove the 
> device (when the OS still is using it).  This may "work" in Linux for 
> PCI, but it certainly isn't correct that a device driver should expect 
> 0xff to be returned on pci operations (in the case of PCI).  Other 
> architectures don't return anything indicating any failure causing real 
> confusion.

It doesn't "work" for PCI today with 95% of the drivers, that's why we
shut the driver down before powering down the card.  That's also why the
PCI Hotplug spec says to do that :)

> Performance is _so_ critical here, because if the removal operation is 
> fast enough, there is no physical way to remove the device from the 
> slot/bus/whatever before the OS has removed the device from the 
> operating system data structures.

Well, look at how USB handles this: it can easily handle unexpected
removals of the device at any point in time.  It is not impossible to
do, and again, none of it requires /sbin/hotplug interaction at all;
it's all done in kernelspace.

> Yes I agree sysfs/taking advantage of the driver model is a superior 
> choice to mvista's chassis manager, but hey, we had to work with what we 
> had available at the time.  If sysfs were in 2.4, we would have used 
> that instead.  In future revs, we may backport sysfs to provide this 
> sort of functionality and ensure that HDI works for both 2.4 and 2.5 
> easily without a bunch of changes to parsing driver model information.

As Pat pointed out, the ddfs/driverfs/sysfs code has been around for a
long time.  And the chassis manager code is _very_ bad stuff, sorry.

> The key difference between MontaVista's HDI and whatever anyone else is 
> working on is the excellent mechanism by which insert and remove events 
> are transmitted (via the event broker) without the need to execute any 
> type of hotplug scripts.  Perhaps both mechanisms could be used in 
> MontaVista's implementation and let selection take its course.  This 
> would allow us to keep the userspace database/api/tools/etc without the 
> need to reimplement what everyone already agrees is the correct solution.

That's great, I agree that the HDI is a useful thing if you want to port
your notification aware programs across 2.4 and 2.5.

But realize that the existing /sbin/hotplug interface is present on 2.4
and 2.5.  I spent some time last week working on adding this
interface to the LSB to ensure that people can write portable code
across multiple kernel versions.

thanks,

greg k-h