[cgl_discussion] Some Initial Comments on DDH-Spec-0.5h.pdf

Howell, David P david.p.howell at intel.com
Tue Sep 24 07:34:15 PDT 2002

See my comments mixed in below w/[DPH].

Dave Howell

-----Original Message-----
From: Andy Pfiffer [mailto:andyp at osdl.org]
Sent: Monday, September 23, 2002 7:35 PM
To: cgl_discussion at osdl.org
Cc: rob.rhoads at intel.com; hardeneddrivers-discuss at lists.sourceforge.net
Subject: [cgl_discussion] Some Initial Comments on DDH-Spec-0.5h.pdf

Andy Pfiffer wrote:
>[ these are my initial comments on the draft release spec -- Andy ]
>Re: DDH-Spec-0.5h.pdf
	<< Stuff Deleted >>

>	General comment: the specification, as written, does not address
>	a way to enforce compliance, largely because the Stability
>	and Reliability section is based upon the list of Good Coding
>	Practices.

[DPH] Interesting point. Solaris has its specification and, as I recall,
	a test harness to qualify drivers. They cover a lot of ground in both
	to make sure that the advocated interfaces are documented, used, and
	used right. For example, there were several ways to do DMA, but the
	Solaris spec advocates a best API to use and the checker (I think)
	was able to identify if an unadvocated API is used. The driver
	Best Known Methods are also part of this, although I'm not sure how
	a harness could test if they are used.
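
	As a rough illustration of the kind of check described above, here is
	a minimal sketch in C. It is not the Solaris checker, whose internals
	are not described here. The function names and the banned API names
	used with it are hypothetical. It only reports whether a listed name
	appears as a whole identifier in the driver source text, and it does
	not skip comments or string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c can be part of a C identifier. */
int is_ident_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Returns 1 if `name` occurs in `src` as a whole identifier, 0 otherwise.
 * Hypothetical helper: a plain text scan, not a real parser.
 */
int uses_identifier(const char *src, const char *name)
{
    size_t len;
    const char *p = src;

    assert(src != NULL && name != NULL);
    len = strlen(name);
    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must not be embedded in a longer identifier. */
        int left_ok = (p == src) || !is_ident_char(p[-1]);
        int right_ok = !is_ident_char(p[len]);

        if (left_ok && right_ok)
            return 1;
        p++;
    }
    return 0;
}

/* Counts how many of the n names in banned[] are used in src. */
size_t count_unadvocated(const char *src, const char *const banned[], size_t n)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++)
        if (uses_identifier(src, banned[i]))
            hits++;
    return hits;
}
```

	A harness could run something like count_unadvocated() over each
	driver source file and fail qualification when the count is nonzero.
	Checking the Best Known Methods would need more than a name scan.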

	Are we considering any kind of test harness for this, kind of like
	we discussed in TLT 1.0 around fault injection testing? Possibly an
	open source project to provide this for validating drivers might be
	a part of this effort; it would need to be extensible for specific
	categories/environments. Is there anything like this now for Linux
	in an open source project?
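
	To make the fault injection idea concrete, here is a minimal sketch
	in C of how a harness might force a hardware access to fail and then
	check that the driver reports the error. Everything in it (the
	fault_plan structure, fi_read_reg(), toy_driver_poll()) is
	hypothetical and not taken from any existing project.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fault plan: fail the Nth register read (1-based); 0 = never. */
struct fault_plan {
    unsigned long fail_on_call;
    unsigned long calls_seen;
};

/*
 * Hypothetical stand-in for a device register read.
 * Returns 0 on success, or -EIO when the plan injects a fault.
 */
int fi_read_reg(struct fault_plan *plan, unsigned int *value)
{
    assert(plan != NULL && value != NULL);

    plan->calls_seen++;
    if (plan->fail_on_call != 0 && plan->calls_seen == plan->fail_on_call)
        return -EIO;

    *value = 0x1u; /* pretend the device reports "ready" */
    return 0;
}

/*
 * Toy driver routine under test: polls the device `polls` times and
 * propagates the first error instead of ignoring it.
 */
int toy_driver_poll(struct fault_plan *plan, unsigned int polls)
{
    unsigned int i, value;

    for (i = 0; i < polls; i++) {
        int rc = fi_read_reg(plan, &value);

        if (rc != 0)
            return rc;
    }
    return 0;
}
```

	A harness built this way would run each driver entry point once per
	possible failure position and confirm that the error comes back to
	the caller. Extending it to specific categories/environments would
	mean swapping in different fault plans.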

	No disagreement that the edge hardening would be good and should be 
	included in an effort like this. Again, any work ongoing in the

	<< Stuff Deleted >>

	> Comment:
	> I consider a "good driver" to have the following attributes:
	> 1. does not cause, directly or indirectly, fatal exceptions.
	> 2. does not cause, directly or indirectly, the system to hang.
	> 3. satisfies the relevant functions as specified,
	>   with "good performance" characteristics
	> 4. detects errors in configuration, operation, or other aspects
	>    of the hardware (or software) functions that are managed by
	>    the driver.
	> 5. is expressed in a maintainable form.

	[DPH] - Encapsulates errors, i.e. can shut down the subsystem cleanly
		  on fatal errors without harming the system: attempted I/Os
		  return EIO if a fatal error was detected and the subsystem
		  is taken offline, but the system keeps running.
		- Restartable from the shut down state, either by unloading and
		  reloading the driver or by some form of reset, to take it from
		  the EIO state back to active. Operator or fault manager
		  initiated.

		This would replace panic() for safe cases where errors are clearly
		not affecting things beyond driver state (hardware error, some
		Neither one of these is easy and there is little of this in use
		today. It would take an infrastructure to do this cleanly, and
		that infrastructure would need to be part of the specification.
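
		The two attributes above can be sketched as a small state
		machine in C. This is only an illustration under assumed
		names (struct subsys, subsys_fatal_error(), subsys_submit_io(),
		subsys_reset() are all hypothetical), not an existing kernel
		interface: a fatal error takes the subsystem offline instead
		of calling panic(), I/O attempted while offline fails with
		EIO, and a reset brings it back to active.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subsystem states: normal operation, or offline after a fatal error. */
enum subsys_state { SUBSYS_ACTIVE, SUBSYS_OFFLINE };

struct subsys {
    enum subsys_state state;
    unsigned long completed; /* number of I/Os that succeeded */
};

/* Called where a driver might otherwise panic(): only this subsystem goes offline. */
void subsys_fatal_error(struct subsys *s)
{
    assert(s != NULL);
    s->state = SUBSYS_OFFLINE;
}

/* Attempted I/O: fails with -EIO while the subsystem is offline. */
int subsys_submit_io(struct subsys *s)
{
    assert(s != NULL);
    if (s->state != SUBSYS_ACTIVE)
        return -EIO;
    s->completed++;
    return 0;
}

/*
 * Operator or fault manager initiated restart. `hw_ok` stands in for
 * whatever check decides that the hardware is usable again.
 */
int subsys_reset(struct subsys *s, int hw_ok)
{
    assert(s != NULL);
    if (!hw_ok)
        return -EIO; /* stay offline */
    s->state = SUBSYS_ACTIVE;
    return 0;
}
```

		The hard part is not this state machine but the surrounding
		infrastructure: deciding which errors are safe to contain,
		and draining or failing I/O already in flight.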

David Howell
Intel Corporation
Telco Server Development
Server Products Division

david.p.howell at intel.com

My opinions are my own and not necessarily those of Intel Corporation.

cgl_discussion mailing list
cgl_discussion at lists.osdl.org
