[cgl_discussion] ATCA Requirements Take 3

Eric.Chacron at alcatel.fr Eric.Chacron at alcatel.fr
Tue Sep 16 06:40:58 PDT 2003



>My original point was that IPMI 1.5 has been extended by ATCA to include new
>data structures, sensors, and the commands to support them.  It's true that
>the actual messages are normally handled by an Intelligent Platform
>Management Controller, but these controllers normally have interfaces to the
>payload processor so that messages can be passed up to the resident OS.
>
>In some cases this is just to keep the OS notified of the state of the
>hardware hosting it.  Hot swap events need to be recognized by the payload
>processor, for instance, so that it can begin the process of shedding load.


So you mean that with ATCA the OS now has to handle some events coming from
IPMI about the status of some devices. This is new compared to the IPMI
implementations I am used to dealing with, where IPMI is used only by the
management system and not by the OS (except maybe for the watchdog).
What is the list of events that IPMI has to report to the OS (shutdown, ...)?
I can imagine something like "do a shutdown" or "this device has been
removed".
Now, the IPMI master blade (BMC / IPMC) is the central unit that is able to
manage some failures the payloads cannot handle (like a power failure on the
5 V supply).
So what exactly do you request from the OS in terms of failure handling?

>It's also possible that some part of the management stack, the system
>manager for the whole shelf perhaps, is running on the payload.
OK, that's already the case with IPMI. But we have to separate requirements
on management from requirements on availability, platform, ...

>
>In any of these cases the OS must have support for the IPMI subsystem and on
>an ATCA blade that support must be capable of handling the IPMI extensions.
OK, that makes sense, but let's split this into several requirements
(existing or new).

Eric






"Somes, Richard" <Richard.Somes at fci.com> on 09/14/2003 04:45:12 PM

To:    sdake at mvista.com, Eric CHACRON/FR/ALCATEL at ALCATEL
cc:    Peter Badovinatz <tabmowzo at us.ibm.com>, cgl_discussion at osdl.org
Subject:    RE: [cgl_discussion] ATCA Requirements Take 3


Steve, Eric, all,

PLT 2.0, IPMI 1.5 Support currently requires that:

"CGL shall provide the low-level hardware controls specified in the
Intelligent Platform Management Interface (IPMI) hardware control/monitoring
specification. CGL shall also provide interface drivers and support for
higher-level software such as that described in the SA Forum Hardware
Platform Interface (HPI) specification."

My original point was that IPMI 1.5 has been extended by ATCA to include new
data structures, sensors, and the commands to support them.  It's true that
the actual messages are normally handled by an Intelligent Platform
Management Controller, but these controllers normally have interfaces to the
payload processor so that messages can be passed up to the resident OS.

In some cases this is just to keep the OS notified of the state of the
hardware hosting it.  Hot swap events need to be recognized by the payload
processor, for instance, so that it can begin the process of shedding load.
It's also possible that some part of the management stack, the system
manager for the whole shelf perhaps, is running on the payload.
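
To illustrate the kind of OS-side handling described above, here is a rough
sketch of a payload reacting to events passed up from the management
controller. The event names and the PayloadOS class are hypothetical and are
not taken from the IPMI or ATCA specifications; the point is only that a hot
swap event makes the payload begin shedding load, while other events merely
update its view of the hardware.

```python
from enum import Enum, auto


class PlatformEvent(Enum):
    # Hypothetical event names, not taken from the IPMI or ATCA specs.
    HOT_SWAP_REQUESTED = auto()
    STATE_NOTIFICATION = auto()


class PayloadOS:
    """Toy model of the resident OS receiving events from the controller."""

    def __init__(self):
        self.accepting_load = True
        self.log = []

    def handle(self, event):
        if event is PlatformEvent.HOT_SWAP_REQUESTED:
            # Begin shedding load before the blade is extracted.
            self.accepting_load = False
            self.log.append("shedding load")
        else:
            # Any other event only keeps the OS informed of hardware state.
            self.log.append("state noted")
```

A real implementation would receive these messages through the IPMI driver
rather than a direct method call.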

In any of these cases the OS must have support for the IPMI subsystem and on
an ATCA blade that support must be capable of handling the IPMI extensions.

Dick


-----Original Message-----
From: Steven Dake [mailto:sdake at mvista.com]
Sent: Friday, September 12, 2003 11:48 AM
To: Eric.Chacron at alcatel.fr
Cc: Peter Badovinatz; cgl_discussion at osdl.org
Subject: Re: [cgl_discussion] ATCA Requirements Take 3


Eric

Thanks.  I tried to capture your comments in the past, but I'll do a
better job on Take 4 of the requirements.  Some comments interspersed.

On Fri, 2003-09-12 at 01:13, Eric.Chacron at alcatel.fr wrote:
> Steven, Peter,
>
> ATCA support is an important item so I would like to comment on this
> requirement set.
> Maybe some of my previous comments have been lost, so I have to repeat:
>
> 1) ATCA IPMI support
> This is not a requirement for Linux but for the hardware implementations
> and the IPMI protocol. Why not remove it from this spec?
>
The OSD should provide abstractions for the SDRs, FRU data, and other
components.  The point of the requirement is to ensure that these
abstractions are provided (most likely in user space).

The IPMI driver itself has no knowledge of this functionality.

> 2) block device removal
> It is not clearly defined why you ask for such a requirement, and I don't
> see any link with reliability.
> From my "Carrier" point of view, removal of a block device while the
> application is operating is acceptable provided you first
> stop every access to it.
> There is no miracle: if you have open files on this device, what will the
> result be for the application using it if you remove the device? Do you
> expect the file to be accessible from a mirrored disk?
>
Stopping access is the key point.  The requirement is written such that
the user application doesn't have to stop access; instead, its
reference is terminated by the operating system.  This allows a much
simpler programming model.  Without this model, each application using
the devices in question would have to be able to receive notification to
stop using the device.  That notification approach may be more
controllable from a software perspective.  The downside is that, if the
application fails to respond, the board operator is going to pull the
board before the OS cleans up I/Os because they will be impatient.  This
"surprise removal" case can be avoided for the most part if we guarantee
the user speedy removal times.

The application will see the same effect as if a forced unmount had been
processed, which is an implementation-dependent mechanism.  We could
return EBADF on file accesses after the block device removal.
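
As a rough sketch of what that could look like from the application side,
assuming accesses really do fail with EBADF once the device is gone: the
removal is only simulated here by closing the descriptor, a temporary file
stands in for a file on the block device, and read_or_detect_removal is a
hypothetical helper name.

```python
import errno
import os
import tempfile


def read_or_detect_removal(fd, nbytes):
    """Read from fd; return None if the descriptor was invalidated (EBADF)."""
    try:
        return os.read(fd, nbytes)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            # Assumption: this is how a removed block device would surface.
            return None
        raise


def demo():
    # A temporary file stands in for a file on the block device.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"payload")
        os.lseek(fd, 0, os.SEEK_SET)
        before = read_or_detect_removal(fd, 7)
        # Simulate the OS terminating the reference: close the descriptor.
        os.close(fd)
        after = read_or_detect_removal(fd, 7)
        return before, after
    finally:
        os.unlink(path)
```

The application needs no advance notification in this model; it only has to
treat EBADF as "the device went away" and clean up.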

> 3) shutdown / blue LED
> Could you just add (for regular readers) that the blue LED is the HOT SWAP
> one?
> Do we have to specify the same requirement for IPMI?
> Is there already a handler in Linux 2.4 / 2.6 that performs any action when
> the ejectors are removed or not?
>

Agreed on the hot swap LED.

No handler in Linux 2.4.  There may be a CompactPCI enum handler for
specific boards, but nothing that extends the shutdown system call to
light the blue LEDs as necessary.

> 4) Requirement: Multiple Host Synchronized Device Hotswap
> It seems to me that this is a clustering requirement related to shared
> devices.
> It is also an extension of requirement 2) block device removal.
>
Block device removal is only for block devices.  Multiple host
synchronization can apply to any shared device, be it character or block.
But I agree, these two requirements (and their implementations) are
definitely intertwined.  Any ideas on how the requirements could be
reworked with this in mind?
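
To make the synchronization rule concrete, here is a minimal sketch of the
bookkeeping it implies: power-off (and the blue LED) is allowed only after a
removal has been requested and every host has dropped its references. The
class and method names are hypothetical, and how hosts would actually report
their releases to each other is left open.

```python
class SharedDeviceHotswap:
    """Tracks which hosts still reference a shared device (toy model)."""

    def __init__(self, hosts):
        self.holders = set(hosts)
        self.removal_requested = False

    def request_removal(self):
        # The user asks to remove the device; hosts must now let go of it.
        self.removal_requested = True
        return self.may_power_off()

    def release(self, host):
        # A host reports that it has removed all references to the device.
        self.holders.discard(host)
        return self.may_power_off()

    def may_power_off(self):
        # Safe only once removal was requested and no host holds a reference.
        return self.removal_requested and not self.holders
```

For example, with hosts "cpu1" and "cpu2" sharing a device, request_removal()
returns False, release("cpu1") still returns False, and only release("cpu2")
returns True.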

> Eric
>
>
>
>
> Steven Dake <sdake at mvista.com>@lists.osdl.org on 09/11/2003 07:55:03 PM
>
> Please respond to sdake at mvista.com
>
> Sent by:    cgl_discussion-bounces at lists.osdl.org
>
>
> To:    cgl_discussion at osdl.org
> cc:
> Subject:    [cgl_discussion] ATCA Requirements Take 3
>
>
> Requirement: ATCA IPMI Support
>
> The IPMI system shall implement the new IPMI commands, data structures,
> and sensors defined in the ATCA specification.
>
> Requirement: Block Device Removal
>
> The Linux kernel should allow removal of a block device while it is in
> use without degrading reliability of the system.  The block device shall
> be removable even if it is in use by an open file (fdisk /dev/sda), is a
> member of a RAID 1 volume, or has a filesystem mounted on it, or any
> combination thereof.
>
> Requirement: shutdown system call integrated with ATCA system management
>
> The Linux kernel shall ensure that the shutdown system call uses the
> ATCA system management IPMI interface to power down the CPU blade and
> light the blue LED.
>
> Requirement: Multiple Host Synchronized Device Hotswap
>
> When multiple hosts are using the same block or character device, and a
> user requests to remove the device, the device's blade won't be powered
> off, nor its blue LED (if available) lit, until all operating systems in
> the collection of CPU nodes using the device have removed all references
> to the device.
>
>
> _______________________________________________
> cgl_discussion mailing list
> cgl_discussion at lists.osdl.org
>  http://lists.osdl.org/mailman/listinfo/cgl_discussion
>
>
>
>
>








