[cgl_discussion] ATCA Requirements Take 3

Somes, Richard Richard.Somes at fci.com
Sun Sep 14 07:45:12 PDT 2003

Steve, Eric, all,

PLT 2.0, IPMI 1.5 Support currently requires that:

"CGL shall provide the low-level hardware controls specified in the
Intelligent Platform Management Interface (IPMI) hardware control/monitoring
specification. CGL shall also provide interface drivers and support for
higher-level software such as that described in the SA Forum Hardware
Platform Interface (HPI) specification."

My original point was that IPMI 1.5 has been extended by ATCA to include new
data structures, sensors, and the commands to support them.  It's true that
the actual messages are normally handled by an Intelligent Platform
Management Controller (IPMC), but these controllers normally have interfaces
to the payload processor so that messages can be passed up to the resident OS.

In some cases this is just to keep the OS notified of the state of the
hardware hosting it.  Hot swap events need to be recognized by the payload
processor, for instance, so that it can begin the process of shedding load.
It's also possible that some part of the management stack, the system
manager for the whole shelf perhaps, is running on the payload.  

In any of these cases the OS must have support for the IPMI subsystem, and on
an ATCA blade that support must be capable of handling the IPMI extensions.
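
As a rough illustration of the hot swap case, here is a minimal sketch of
how payload software might react once a state change has been passed up
from the management controller.  The state names follow the ATCA M0-M7 FRU
hot swap states; the function name, the action enum, and the choice of
which transitions trigger load shedding are assumptions made for
illustration, not anything defined by the requirements.

```c
#include <assert.h>

/* ATCA FRU hot swap states M0..M7. */
enum fru_state {
    FRU_M0_NOT_INSTALLED = 0,
    FRU_M1_INACTIVE,
    FRU_M2_ACTIVATION_REQUEST,
    FRU_M3_ACTIVATION_IN_PROGRESS,
    FRU_M4_ACTIVE,
    FRU_M5_DEACTIVATION_REQUEST,
    FRU_M6_DEACTIVATION_IN_PROGRESS,
    FRU_M7_COMMUNICATION_LOST
};

/* Hypothetical actions the payload OS could take on a transition. */
enum payload_action {
    ACTION_NONE,
    ACTION_SHED_LOAD,
    ACTION_RESUME_SERVICE
};

/* Decide what the payload should do for a prev -> cur transition.
 * Assumption: leaving M4 toward deactivation means "start shedding
 * load", and M5 -> M4 means the deactivation request was withdrawn. */
enum payload_action on_hot_swap_event(enum fru_state prev,
                                      enum fru_state cur)
{
    assert((int)cur >= FRU_M0_NOT_INSTALLED &&
           (int)cur <= FRU_M7_COMMUNICATION_LOST);

    if (prev == FRU_M4_ACTIVE &&
        (cur == FRU_M5_DEACTIVATION_REQUEST ||
         cur == FRU_M6_DEACTIVATION_IN_PROGRESS))
        return ACTION_SHED_LOAD;

    if (prev == FRU_M5_DEACTIVATION_REQUEST && cur == FRU_M4_ACTIVE)
        return ACTION_RESUME_SERVICE;

    return ACTION_NONE;
}
```

How the event actually reaches such a handler (system interface, driver,
user space library) is exactly the part the requirement needs to cover.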


-----Original Message-----
From: Steven Dake [mailto:sdake at mvista.com]
Sent: Friday, September 12, 2003 11:48 AM
To: Eric.Chacron at alcatel.fr
Cc: Peter Badovinatz; cgl_discussion at osdl.org
Subject: Re: [cgl_discussion] ATCA Requirements Take 3


Thanks.  I tried to capture your comments in the past, but I'll do a
better job on Take 4 of the requirements.  Some comments are interspersed.

On Fri, 2003-09-12 at 01:13, Eric.Chacron at alcatel.fr wrote:
> Steven, Peter,
> ATCA support is an important item so I would like to comment on this
> requirement set.
> Maybe some of my previous comments have been lost, so I have to repeat:
> 1) ATCA IPMI support
> This is not a requirement on Linux but on the hardware implementations
> and the IPMI protocol. Why not remove it from this spec?
The OSD should provide abstractions for the SDRs, FRU data, and other
components.  The point of the requirement is to ensure that these
abstractions are provided (most likely in user space).

The IPMI driver itself has no knowledge of this functionality.

> 2) block device removal
> It is not clearly defined why you ask for such a requirement, and I
> don't see any link with reliability.
> From my "Carrier" point of view, removal of a block device while the
> application is operating is acceptable provided you first
> stop every access to it.
> There is no miracle: if you have open files on this device, what will be
> the result for the application using it if you remove the device? Do you
> expect the file to be accessible from a mirrored disk?
Stopping access is the key point.  The requirement is written such that
the user application doesn't have to stop access; instead, its
reference is terminated by the operating system.  This allows a much
simpler programming model.  Without this model, each application using
the devices in question would have to be able to receive notification to
stop using the device.  Notification may be a more controllable method
from a software perspective.  The downside is that, if the application
fails to respond, an impatient board operator is going to pull the board
before the OS has cleaned up its I/Os.  This "surprise removal"
case can be avoided for the most part if we guarantee the user speedy
removal times.

The application will see the same effect as if a forced unmount had been
processed, which is an implementation-dependent mechanism.  We could
return EBADF on file accesses after the block device removal.
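
To make the EBADF idea concrete, here is a small user space simulation,
not kernel code.  Every name in it (sim_blockdev, sim_open, sim_remove,
sim_read) is hypothetical; it only models the behaviour described above:
removal revokes outstanding references without asking the holders first,
and later accesses through those references fail with EBADF.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_MAX_HANDLES 8
#define SIM_DEV_SIZE    64

/* Hypothetical stand-in for a block device. */
struct sim_blockdev {
    bool present;
    unsigned char data[SIM_DEV_SIZE];
};

/* One open reference; dev == NULL means the slot is free. */
struct sim_handle {
    struct sim_blockdev *dev;
    bool revoked;
};

static struct sim_handle sim_handles[SIM_MAX_HANDLES];

/* Open a reference; returns a handle index or a negative errno value. */
int sim_open(struct sim_blockdev *dev)
{
    assert(dev != NULL);
    if (!dev->present)
        return -ENODEV;
    for (int i = 0; i < SIM_MAX_HANDLES; i++) {
        if (sim_handles[i].dev == NULL) {
            sim_handles[i].dev = dev;
            sim_handles[i].revoked = false;
            return i;
        }
    }
    return -EMFILE;
}

/* Remove the device: the "OS" revokes every reference to it without
 * requiring the holders to stop access first. */
void sim_remove(struct sim_blockdev *dev)
{
    assert(dev != NULL);
    dev->present = false;
    for (int i = 0; i < SIM_MAX_HANDLES; i++)
        if (sim_handles[i].dev == dev)
            sim_handles[i].revoked = true;
}

/* Read from offset 0; returns bytes copied, or -EBADF once revoked. */
long sim_read(int h, void *buf, size_t len)
{
    if (h < 0 || h >= SIM_MAX_HANDLES ||
        sim_handles[h].dev == NULL || sim_handles[h].revoked)
        return -EBADF;
    if (len > SIM_DEV_SIZE)
        len = SIM_DEV_SIZE;
    memcpy(buf, sim_handles[h].dev->data, len);
    return (long)len;
}
```

The application only has to treat EBADF as "the device is gone", which is
the simpler programming model argued for above.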

> 3) shutdown / blue LED
> Could you just add (for regular readers) that the blue LED is the hot
> swap one?
> Do we have to specify the same requirement for IPMI?
> Is there already a handler in Linux 2.4 / 2.6 that performs any action when
> the ejectors are removed or not?

Agreed on the hot swap LED.

No handler in Linux 2.4.  There may be a CompactPCI ENUM handler for
specific boards, but nothing that extends the shutdown system call to
light the blue LEDs as necessary.

> 4) Requirement: Multiple Host Synchronized Device Hotswap
> It seems to me that this is a clustering requirement related to shared
> devices.
> And also an extension of requirement 2) block device removal.
Block device removal is only for block devices.  Multiple-host
synchronization can apply to any shared device, be it character or block.
But I agree, these two requirements (and their implementations) are
definitely intertwined.  Any ideas on how the requirements could be
reworked with this in mind?
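
For discussion, the synchronized case can be pictured as a per-device set
of hosts that still hold references, with power-off gated on that set
being empty.  The sketch below is only an assumption about how such
bookkeeping might look; the struct and function names are hypothetical,
and how hosts would actually exchange this state is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_HOSTS 32

/* Hypothetical shelf-wide view of one shared device. */
struct shared_dev {
    uint32_t holders;        /* bit n set: host n still has references */
    bool removal_requested;  /* a user has asked to remove the device  */
};

/* Host takes a reference; refused once removal has been requested. */
bool host_acquire(struct shared_dev *d, unsigned host)
{
    assert(host < MAX_HOSTS);
    if (d->removal_requested)
        return false;
    d->holders |= UINT32_C(1) << host;
    return true;
}

/* Host reports that it has dropped all references to the device. */
void host_release(struct shared_dev *d, unsigned host)
{
    assert(host < MAX_HOSTS);
    d->holders &= ~(UINT32_C(1) << host);
}

/* User asks for the device to be removed. */
void request_removal(struct shared_dev *d)
{
    d->removal_requested = true;
}

/* The blade may be powered off (and the blue LED lit) only when
 * removal was requested and no host holds a reference any more. */
bool may_power_off(const struct shared_dev *d)
{
    return d->removal_requested && d->holders == 0;
}
```

Seen this way, the block device removal requirement is the single-host
instance of the same rule, which may be one way to merge the two.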

> Eric
> Steven Dake <sdake at mvista.com>@lists.osdl.org on 09/11/2003 07:55:03 PM
> Please respond to sdake at mvista.com
> Sent by:    cgl_discussion-bounces at lists.osdl.org
> To:    cgl_discussion at osdl.org
> cc:
> Subject:    [cgl_discussion] ATCA Requirements Take 3
> Requirement: ATCA IPMI Support
> The IPMI system shall implement the new IPMI commands, data structures,
> and sensors defined in the ATCA specification.
> Requirement: Block Device Removal
> The Linux kernel should allow removal of a block device while it is in
> use without degrading reliability of the system.  The block device shall
> be removable even if it is in use by an open file (fdisk /dev/sda), is a
> member of a RAID 1 volume, or has a filesystem mounted on it, or
> permutations thereof.
> Requirement: shutdown system call integrated with ATCA system management
> The Linux kernel shall ensure that the shutdown system call uses the
> ATCA system management IPMI interface to power down the CPU blade and
> light the blue LED.
> Requirement: Multiple Host Synchronized Device Hotswap
> When multiple hosts are using the same block or character device, and a
> user requests to remove the device, the device's blade won't be powered
> off (and the blue LED, if available, won't be lit) until all operating
> systems in the collection of CPU nodes using the device have removed all
> references to the device in the operating system.
> _______________________________________________
> cgl_discussion mailing list
> cgl_discussion at lists.osdl.org
>  http://lists.osdl.org/mailman/listinfo/cgl_discussion
