[cgl_discussion] Use case - Boot cycle detection
ikebe.takashi at lab.ntt.co.jp
Wed Apr 13 05:22:14 PDT 2005
The following is a use case for boot cycle detection. It
addresses SMM6.0 Boot Cycle Detection in CGL Specification 3.0.
Please feel free to send comments or suggestions.
OSDL CGL specifies that carrier grade Linux shall provide support for
detecting a repeating reboot cycle due to recurring failures. This detection should
happen in user space before system services are started. This type of
failure requires a response due to the negative impact of repeatedly
taking down services. A configurable policy is needed to set thresholds
of cycling and desired shutdown actions, such as exponential back off,
shutdown, or notifying administrators.
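As a rough illustration of such a configurable policy, here is a minimal sketch in Python. All names in it (BootCyclePolicy, threshold, base_delay_s, max_delay_s, action, decide) are hypothetical and are not taken from the CGL specification; the doubling back-off and the default values are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootCyclePolicy:
    """Hypothetical policy record; field names and defaults are assumptions."""
    threshold: int = 3           # consecutive failed boots tolerated
    base_delay_s: float = 30.0   # first back-off delay, in seconds
    max_delay_s: float = 3600.0  # upper bound for the back-off delay
    action: str = "backoff"      # "backoff", "shutdown", or "notify"


def backoff_delay(policy, failed_boots):
    """Return the exponential back-off delay for the given failure count."""
    if failed_boots <= policy.threshold:
        return 0.0
    excess = failed_boots - policy.threshold
    # The delay doubles for every failed boot beyond the threshold, up to the cap.
    return min(policy.base_delay_s * 2 ** (excess - 1), policy.max_delay_s)


def decide(policy, failed_boots):
    """Map a failure count to an (action, delay in seconds) pair."""
    if failed_boots <= policy.threshold:
        return ("continue", 0.0)
    if policy.action == "backoff":
        return ("backoff", backoff_delay(policy, failed_boots))
    return (policy.action, 0.0)
```

With the assumed defaults, decide(BootCyclePolicy(), 5) yields a back-off of 60 seconds, and a policy with action="shutdown" asks for a shutdown as soon as the threshold is exceeded.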
Mainline acceptance and distro acceptance.
System administrators use the function during server operation.
System administrators activate the function during setup.
During operation, system administrators generally monitor system
health from a remote operation center. The function lets them detect a
reboot cycle caused by recurring failures, because it shuts down the
system or notifies the operator via the network.
The function should have at least the following functions:
1. A counter of recurring reboots.
2. A function which resets the counter.
3. A function which powers off the machine.
The function increments the counter at each boot, and if the system
boots up normally, the function resets the counter. If the counter
exceeds the threshold, the function shuts down the system.
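The counter logic described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the counter is kept in a plain file on persistent storage, the file path and the function names (read_count, write_count, on_boot, on_boot_success) are hypothetical, and the actual power-off call is left to the caller.

```python
import os
from pathlib import Path


def read_count(path):
    """Return the stored boot counter, or 0 if the file is missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0


def write_count(path, value):
    """Store the counter by writing a temporary file and renaming it into place."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(f"{value}\n")
    os.replace(tmp, path)  # a real implementation would likely also fsync


def on_boot(path, threshold):
    """Increment the counter early in boot; True means the caller should shut down."""
    count = read_count(path) + 1
    write_count(path, count)
    return count > threshold


def on_boot_success(path):
    """Reset the counter once the system has booted up normally."""
    write_count(path, 0)
```

An early boot script would call on_boot() before system services are started and power off the machine if it returns True; a late script would call on_boot_success() once services are up. With a threshold of 2, the third consecutive boot without a reset triggers the shutdown.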
System administrators learn of the system error from the machine shutting down.
In addition to the above functions, the following functions may increase the
A function which reports the reboot status to a remote operation node via
(does not have function yet, but soon available.)
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : ikebe.takashi at lab.ntt.co.jp