[RFC][PATCH 0/4][user-cr]: First try at integrating LXC and USER-CR

Oren Laadan orenl at cs.columbia.edu
Mon Mar 1 13:22:35 PST 2010


Suka,

To make these patches available for those who want to try out
lxc + c/r, I pulled them to a separate branch: ckpt-v19-suka.

(The last two did not apply cleanly because of recent changes
to the Makefile, so I merged them manually. Let me know if
anything breaks.)

Oren


Sukadev Bhattiprolu wrote:
> The following two sets of patches are an early attempt to integrate LXC
> and USER-CR.
> 
> Overview:
> 
> Have USER-CR export the core checkpoint and restart functionality into a
> library (/lib/libcheckpoint.a and <usercr.h>) and have LXC link with this
> library.
> 
> TODO: 
> 
> 	1. For now, libcheckpoint.a implements only the restart functionality,
> 	   so only the lxc_restart command is implemented. The checkpoint
> 	   functionality and the lxc_checkpoint command can be implemented
> 	   similarly, and that is hopefully easier than the restart side.
> 
> 	2. The restart() functionality in user-cr makes extensive use of global
> 	   variables and debug code. The API must be extended so that these
> 	   variables and the debug code are properly exposed through it.
> 
> 	   Similarly, the 'struct restart_args' may need to be sanitized for
> 	   use in a formal API.
> 
> 	3. The lxc_restart command restarts entire containers only
> 	   (specifically, it simulates the --pidns --pids --mount-pty
> 	   arguments to /bin/restart).
> 
> 	4. Link lxc_restart and lxc_checkpoint with the shared library
> 	   liblxc.so (currently links statically)
> 
> 	5. ...
> 
> 
> STATUS:
> 	I was able to checkpoint/restart a simple '/bin/sleep 1000' LXC
> 	container, except for a cgroup naming issue after restart (see below).
> 
> STEPS:
> 
> 1. [USER-CR] Build/install /lib/libcheckpoint.a, /usr/include/usercr.h
>    
>    1.1 Apply the attached [user-cr] patches to the user-cr git tree
>        (I tested with the following commit as the base)
> 
> 	commit 67cfee9329670ab28eb1a52e94745252b614718f
> 	Author: Oren Laadan <orenl at cs.columbia.edu>
> 	Date:   Mon Feb 22 18:00:06 2010 -0500
> 
>    1.2 Build/install user-cr binaries/libraries/includes
> 
>    	$ make all
> 
> 	$ make install
> 
> 	This should install /lib/libcheckpoint.a and /usr/include/usercr.h
> 
> 2. [LXC] Build lxc_restart using USER-CR API (usercr.h, libcheckpoint.a)
> 
>    2.1 Apply attached [lxc] patches to Daniel Lezcano's lxc.git tree (0.6.5)
> 
>    2.2 Build lxc_restart (this uses static linking for now)
> 
>    	$ make -f Makefile2 lxc_restart
> 
> 3. Create and checkpoint a simple LXC container
> 
> 	$ lxc-execute --name foo --rcfile lxc-macvlan.conf -- /bin/sleep 1000
> 
> 	$ lxc-freeze --name foo
> 
> 	TODO:
> 		lxc_checkpoint --name foo should checkpoint the container.
> 		For now, use "lxc-ps --name foo" to find the pid of lxc-init
> 		and checkpoint using:
> 
> 		$ /bin/checkpoint --output=/tmp/sleep.ckpt <pid-of-lxc-init>
> 
> 	$ lxc-unfreeze --name foo
> 
> 	$ lxc-stop --name foo
> 
> 4. Restart a checkpointed LXC container
> 
> 	$ ./lxc_restart --statefile /tmp/sleep.ckpt --name bar
> 
>    	# Test some common lxc commands after restart
> 
> 	$ lxc-ps --name "bar/1"
> 	CONTAINER    PID TTY          TIME CMD
> 	bar/1       8511 ?        00:00:00 lxc-init
> 	bar/1       8512 ?        00:00:00 sleep
> 
> 	$ lxc-freeze --name "bar/1"
> 
> 	$ grep State /proc/8511/status 
> 	State:	D (disk sleep)
> 
> 	$ grep State /proc/8512/status 
> 	State:	D (disk sleep)
> 
> 	NOTE: 	For some reason, the container name after restart is "bar/1"
> 		instead of "bar".  Because of this, when lxc_restart exits,
> 		I get a '-EBUSY - failed to remove "/cgroup/bar"' error.
> 		I still need to fix this.
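
Until lxc_checkpoint exists, the manual pid lookup in step 3 could be scripted. A minimal sketch, assuming lxc-ps prints the column layout shown in step 4 (CONTAINER PID TTY TIME CMD); init_pid is a hypothetical helper name, not part of lxc or user-cr:

```shell
# Hypothetical helper: read `lxc-ps --name <container>` output on stdin
# and print the pid of the first process whose command is lxc-init.
# Assumes the layout: CONTAINER PID TTY TIME CMD (header on line 1).
init_pid() {
    awk 'NR > 1 && $NF == "lxc-init" { print $2; exit }'
}
```

With that, something like `pid=$(lxc-ps --name foo | init_pid)` would give the value to pass as <pid-of-lxc-init> to /bin/checkpoint. Nothing is printed if no lxc-init line is found, so the caller should check for an empty result.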
> 
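
The State check after lxc-freeze in step 4 could be wrapped the same way. This is only a sketch: is_frozen is a hypothetical name, and it assumes frozen tasks are reported as state D, as in the output quoted above, which may not hold on other kernels.

```shell
# Hypothetical helper: succeed if the status file given as $1
# (for example /proc/8511/status) reports state D on its State: line.
is_frozen() {
    state=$(awk '/^State:/ { print $2; exit }' "$1")
    [ "$state" = "D" ]
}
```

For example, `is_frozen /proc/8511/status && is_frozen /proc/8512/status` would confirm that both tasks of the restarted container look frozen.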

