[cgl_discussion] [Fwd: [Fastboot] [PATCH][WIP] Using kexec for crash dumps in LKCD]

Mika Kukkonen mika at osdl.org
Thu Feb 6 08:00:03 PST 2003


FYI; I deleted the actual patch, see relevant mailing list archives
for it.

--MiKu

-----Forwarded message-----

From: Suparna Bhattacharya <suparna at in.ibm.com>
To: Eric W. Biederman <ebiederm at xmission.com>, lkcd-devel at lists.sourceforge.net
Cc: fastboot at osdl.org, linux-kernel at vger.kernel.org
Subject: [Fastboot] [PATCH][WIP] Using kexec for crash dumps in LKCD
Date: Thu, 06 Feb 2003 21:26:15 +0530

This is an extension to LKCD that uses Eric Biederman's
kexec implementation to defer the actual writeout of a
crash dump to disk until after a memory-preserving reboot
into a new kernel.

The real thanks for this goes to Dave Winchell and the 
rest of the Mission Critical Linux folks for first
implementing such an approach in mcore using Werner
Almesberger's bootimg, and letting us learn and borrow
ideas from it.

There is a subtle but crucial difference in the design
of the scheme we use to get spare pages to save the dump.
It potentially enables us to save a complete memory
snapshot (not just kernel pages) if we get good
compression efficiency, i.e. it is theoretically limited
only by the degree of compressibility of the memory
state and by the working memory space that must be left
for the dump and kernel bootup code.

This code is still somewhat raw and there's a list of 
todo's and improvements in my mind, and loopholes to fix, 
but I decided it was high time to put this out for a start, 
so anyone who is interested could start taking a look and 
playing with it, and maybe help out if they like.

I plan to fold it into lkcd cvs tomorrow if possible, unless
anyone notices a major regression of existing lkcd
functionality (i.e. without CRASH_DUMP_MEMDEV and
CRASH_DUMP_SOFTBOOT). I have tried out Alt+Sysrq+d and a
simple panic from a module as a sanity check.

(I haven't tried it out for a true panic yet - going there
bit by bit :))

In any case, I'll tag the cvs tree before checking in.

Merging and testing has been rather time consuming, so
I would appreciate it if anyone planning to check in any
changes before I do would let me know ahead of time.

I'm also considering checking in a TODO file at the
top of the 2.5 directory in CVS to keep track of what
needs to be done. Would that be a good idea?
I'll probably also post the TODOs on the mailing list.

OK, going ahead:

Steps to use:
--------------

A. Patching the kernel:
1) Patch a vanilla 2.5.59 kernel with the kexec patches for
   2.5.59.
   I picked the ones from the OSDL site which Andy Pfiffer had
   mentioned in an earlier post:

	kexec for 2.5.59 (based upon the version for 2.5.54)
	http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1442

	hwfixes that make it work for me (same as for 2.5.58):
	http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1444

2) Apply the latest dump patches from lkcd cvs, i.e. apply
   the kernel patches under 2.5/patches and copy the dump
   driver files to the appropriate places.
   (Expect to see one reject in the 2nd hunk for reboot.c
   when applying notify_die.patch; you can ignore it for
   now.)

3) Apply the attached patch (kexecdump.patch)

B. Kernel build configuration settings
   CRASH_DUMP must be built into the kernel (not as a
   module) to be able to dump across a kexec boot.
   CRASH_DUMP_BLOCKDEV and CRASH_DUMP_COMPRESS_GZIP are
   needed, as we use them today.
   The new options you'll need are CRASH_DUMP_MEMDEV (memory
   dump driver) and CRASH_DUMP_SOFTBOOT (kexec based dumping).
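
   As a rough aid for checking the settings above, here is a
   minimal sketch that scans a kernel .config file. It assumes
   the option names appear with the usual CONFIG_ prefix, and
   it assumes that every option other than CRASH_DUMP may be
   either built in or a module (the mail only says they are
   needed). The function name check_dump_config is made up for
   this example.

```shell
# check_dump_config FILE
# Report dump-related options that are not enabled in a kernel .config.
# Assumption: Kconfig names from the mail carry the usual CONFIG_ prefix.
check_dump_config() {
    config_file=$1
    status=0

    # Per the mail, CRASH_DUMP itself must be built in (=y), not a module.
    if ! grep -q '^CONFIG_CRASH_DUMP=y$' "$config_file"; then
        echo "CONFIG_CRASH_DUMP must be =y"
        status=1
    fi

    # Assumption: for the remaining options either =y or =m is acceptable.
    for opt in CRASH_DUMP_BLOCKDEV CRASH_DUMP_COMPRESS_GZIP \
               CRASH_DUMP_MEMDEV CRASH_DUMP_SOFTBOOT; do
        if ! grep -q "^CONFIG_${opt}=[ym]\$" "$config_file"; then
            echo "CONFIG_${opt} is not enabled"
            status=1
        fi
    done

    return "$status"
}
```

   Running "check_dump_config .config" from the top of the
   kernel tree prints one line per problem and returns
   non-zero if anything is missing.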

C. Run-time setup
   A new dump flag, DUMP_FLAGS_SOFTBOOT (0x2), has been
   introduced for memory-save-and-dump-after-boot; it
   needs to be turned on in the dump flags.
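
   Turning the flag on amounts to OR-ing 0x2 into whatever
   flags value is already in use. The sketch below only does
   that arithmetic; where the resulting value is stored is not
   spelled out here, so that part is left out. The helper name
   add_softboot_flag is hypothetical.

```shell
# Value of DUMP_FLAGS_SOFTBOOT as given in the mail.
DUMP_FLAGS_SOFTBOOT=0x2

# add_softboot_flag CURRENT_FLAGS
# Print CURRENT_FLAGS with the soft-boot bit set, formatted as hex.
# Other bits in CURRENT_FLAGS are left untouched.
add_softboot_flag() {
    printf '0x%x\n' "$(( $1 | $DUMP_FLAGS_SOFTBOOT ))"
}
```

   For example, "add_softboot_flag 0x4" combines an existing
   flags value of 0x4 with the new bit.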

   After running lkcd config as usual, there is one
   extra step needed to load the kernel to be kexec'ed.
   This involves executing "kexec -l" with the regular
   command line options (derived from your /proc/cmdline)
   and one extra boot parameter, obtained as follows:
   crashdump=`cat /proc/sys/kernel/dump/addr`
   (This tells the new kernel where to find a saved
   in-memory crash dump from the previous boot.)

   e.g.
   kexec -l --command-line="root=806 console=tty0 console=ttyS0,38400 crashdump=`cat /proc/sys/kernel/dump/addr`" \
       <kernel bzImage>
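
   The command line for the new kernel can also be assembled
   by a small helper rather than typed by hand. This is a
   sketch only: the function name build_kexec_cmdline is made
   up, and dropping an old crashdump= parameter is my own
   assumption (it would otherwise be inherited from
   /proc/cmdline once the system has been through one kexec
   boot).

```shell
# build_kexec_cmdline CMDLINE_FILE ADDR_FILE
# Print the boot parameters from CMDLINE_FILE followed by
# crashdump=<contents of ADDR_FILE>.
# The body runs in a subshell so that disabling globbing (set -f)
# and the working variables do not leak into the caller.
build_kexec_cmdline() (
    set -f
    cmdline=$(cat "$1")
    addr=$(cat "$2")

    # Assumption: a crashdump= parameter left over from an earlier
    # kexec boot is stale and should be replaced, not duplicated.
    cleaned=
    for word in $cmdline; do
        case $word in
            crashdump=*) ;;
            *) cleaned="${cleaned:+$cleaned }$word" ;;
        esac
    done

    printf '%s\n' "${cleaned:+$cleaned }crashdump=$addr"
)
```

   It could then be used as
   kexec -l --command-line="$(build_kexec_cmdline /proc/cmdline /proc/sys/kernel/dump/addr)" <kernel bzImage>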

D. On panic, the dump is saved in memory and then kexec is
   used to boot up a new kernel (instead of a regular reboot).
   If Alt+Sysrq+d is pressed, the dump is just saved
   in memory without rebooting.

   [Note: The first few times you try it, it might be a
   good idea to drop into "init 1" and unmount most filesystems
   or remount them read-only before you force the panic
   - thanks to Andy Pfiffer for the tip]

E. After the reboot, running "lkcd config" triggers a writeout
   to the dump disk of the dump previously saved in memory.

F. From here on, one can run "lkcd save" as usual to generate
   the /var/log/dump/* files for analysis.


Regards
Suparna



-- 
Suparna Bhattacharya (suparna at in.ibm.com)
Linux Technology Center
IBM Software Labs, India
