[cgl_discussion] Project Review: Panic Handler Enhancements

Cress, Andrew R andrew.r.cress at intel.com
Tue Sep 10 12:29:55 PDT 2002

1. Requirements related to the Panic Handler Enhancements project:
4.11 Linux Panic Handler Enhancement
OSDL CGL shall support enriched capabilities on system panic. Currently the
default system panic behavior is to print a short message to the console and
halt the system. OSDL CGL shall provide a set of configurable functions
including log panic event to system event log as well as the options to
reboot, power down, or power cycle when panic event occurs.

2. How Panic Handler Enhancements meet the CGL requirements:
The Panic Handler Enhancements cover IPMI, the most widely adopted standard
for platform firmware APIs.  Other platforms can currently be added to this
feature by coding platform-specific handling in certain subroutines.
Integration of other platforms will become easier when a meta-standard API
is published that encompasses both IPMI and non-IPMI systems.  See the
design information below for functional descriptions.

3. Project design information:
This feature consists of a kernel module (bmc_panic) and a component rpm
(panicsel) containing various utilities.

The bmc_panic kernel module extends the Linux Panic Handler so that more
information can be saved and passed along when a Linux panic occurs.  This
package enables the bmc_panic kernel module to provide the features listed
below.
bmc_panic features:
 - write OS Critical Stop message to firmware System Event Log (SEL)
 - turn on the Critical Alarm LED on the Telco Alarms Panel
 - send SNMP trap via BMC LAN Alerting mechanism

The Panic Handler module (bmc_panic) inserts itself into the panic_notifier
list.  If a panic occurs, bmc_panic is notified and performs the functions
listed above: it writes an "OS Critical Stop" message to the firmware
System Event Log, turns on the Critical Alarm LED on the Telco Alarms Panel,
and sends a BMC LAN Alert via the firmware SNMP capability.  The alert is
sent by the BMC, so it goes out even after the OS is unavailable.  This
module contains a portion of the valinux IPMI driver in order to communicate
with the BMC via IPMI, but none of the IPMI interfaces used by bmc_panic are
exposed, so it will not conflict with any other IPMI driver module that may
be loaded by the kernel.
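
As a rough illustration of the register-then-notify flow described above,
here is a small userspace model in C.  It is not the bmc_panic source: the
struct, the function names (chain_register, chain_call, bmc_panic_event,
simulate_panic) and the printed messages are all hypothetical, and the real
module talks to the BMC via IPMI where this sketch only prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model of a notifier chain.  All names are hypothetical and
 * only mirror the general shape of the kernel mechanism. */
struct notifier {
    int (*call)(struct notifier *self, const char *msg);
    struct notifier *next;
    int priority;               /* higher priority runs first */
};

static struct notifier *panic_chain = NULL;
static int actions_done = 0;    /* number of panic actions performed */

/* Insert a notifier, keeping the chain sorted by descending priority. */
static void chain_register(struct notifier **head, struct notifier *n)
{
    assert(head != NULL && n != NULL);
    while (*head != NULL && (*head)->priority >= n->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Call every registered notifier in order; return how many were called. */
static int chain_call(struct notifier *head, const char *msg)
{
    int count = 0;
    for (; head != NULL; head = head->next) {
        head->call(head, msg);
        count++;
    }
    return count;
}

/* Stand-in for the bmc_panic handler: the three actions from the text,
 * reduced to print statements. */
static int bmc_panic_event(struct notifier *self, const char *msg)
{
    (void)self;
    printf("SEL: OS Critical Stop (%s)\n", msg);
    printf("Telco alarm panel: Critical Alarm LED on\n");
    printf("BMC LAN alert requested\n");
    actions_done = 3;
    return 0;
}

static struct notifier bmc_panic_notifier = { bmc_panic_event, NULL, 0 };

/* Model of a panic: walk the chain and notify every registered handler. */
static int simulate_panic(const char *msg)
{
    return chain_call(panic_chain, msg);
}
```

In this model, chain_register(&panic_chain, &bmc_panic_notifier) is called
once at load time, and a later simulate_panic("...") runs the handler.
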
The panicsel utilities below allow the user to access the firmware System
Event Log and configure the Platform Event Filter table for the new OS
Critical Stop records.  
showsel        - show the System Event Log records
pefconfig      - show and configure the Platform Event Filter table
                 to allow BMC LAN alerts from OS Critical Stop messages,
                 also shows and sets the BMC LAN configuration parameters
hwreset        - cause the BMC to hard reset the system
tmconfig       - set up the BMC Serial port for various modes, such as
                 Terminal Mode (not yet supported in this release).

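To give an idea of what a showsel-style tool does with an OS Critical Stop
record, here is a sketch in C that decodes one 16-byte SEL event record.
The field layout and the sensor type value 0x20 are assumed from the IPMI
specification's SEL event record format.  The struct and function names are
hypothetical and are not taken from the panicsel source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SEL_RECORD_LEN               16
#define SEL_TYPE_SYSTEM_EVENT        0x02  /* assumed: system event record */
#define SENSOR_TYPE_OS_CRITICAL_STOP 0x20  /* assumed: IPMI sensor type */

/* Decoded form of one SEL event record (hypothetical struct). */
struct sel_event {
    uint16_t record_id;
    uint8_t  record_type;
    uint32_t timestamp;        /* seconds, as stored by the BMC */
    uint16_t generator_id;
    uint8_t  evm_rev;
    uint8_t  sensor_type;
    uint8_t  sensor_num;
    uint8_t  event_dir_type;
    uint8_t  event_data[3];
};

/* Unpack the raw bytes; multi-byte fields are stored least significant
 * byte first. */
static void sel_decode(const uint8_t raw[SEL_RECORD_LEN], struct sel_event *ev)
{
    assert(raw != NULL && ev != NULL);
    ev->record_id      = (uint16_t)(raw[0] | (raw[1] << 8));
    ev->record_type    = raw[2];
    ev->timestamp      = (uint32_t)raw[3] | ((uint32_t)raw[4] << 8) |
                         ((uint32_t)raw[5] << 16) | ((uint32_t)raw[6] << 24);
    ev->generator_id   = (uint16_t)(raw[7] | (raw[8] << 8));
    ev->evm_rev        = raw[9];
    ev->sensor_type    = raw[10];
    ev->sensor_num     = raw[11];
    ev->event_dir_type = raw[12];
    ev->event_data[0]  = raw[13];
    ev->event_data[1]  = raw[14];
    ev->event_data[2]  = raw[15];
}

/* True if the record is a system event from an OS Critical Stop sensor. */
static int sel_is_os_critical_stop(const struct sel_event *ev)
{
    return ev->record_type == SEL_TYPE_SYSTEM_EVENT &&
           ev->sensor_type == SENSOR_TYPE_OS_CRITICAL_STOP;
}

/* Print one record on a single line, flagging OS Critical Stop events. */
static void sel_print(const struct sel_event *ev)
{
    printf("%04x type %02x time %lu sensor %02x/%02x data %02x %02x %02x%s\n",
           (unsigned)ev->record_id, (unsigned)ev->record_type,
           (unsigned long)ev->timestamp,
           (unsigned)ev->sensor_type, (unsigned)ev->sensor_num,
           (unsigned)ev->event_data[0], (unsigned)ev->event_data[1],
           (unsigned)ev->event_data[2],
           sel_is_os_critical_stop(ev) ? "  [OS Critical Stop]" : "");
}
```

A caller would read each raw record from the BMC through the IPMI driver,
pass it to sel_decode(), and then print or filter it.  The driver access
itself is left out here.
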

The Panic Handler Enhancements currently work with platforms that
support the IPMI standard.  If the platform does not support IPMI, these
changes are inert, but the code could be modified for another system
management interface.  The Service Availability Forum is working on a
meta-standard API that could group IPMI and other system management
interfaces under one interface.  When this becomes available, these Panic
Handler Enhancements will conform to that API so that non-IPMI platforms
can be integrated more easily.

The Panic Handler Enhancements depend on the CONFIG_BMCPANIC option being
set in the kernel config file (/usr/src/linux/.config).  Setting it exports
two key variables and includes the bmc_panic module in the build.

The panicsel utilities require an IPMI Driver, either the Intel IPMI package
(ipmidrvr, /dev/imb) or the valinux IPMI Driver (/dev/ipmikcs).

4. Code location:
The bmc_panic kernel patch and the panicsel utilities are located at:
