LXC snapshot using overlayfs fsfreeze

Amir Goldstein amir73il at gmail.com
Tue Apr 11 10:37:53 UTC 2017

On Mon, Apr 10, 2017 at 5:20 PM, Tycho Andersen <tycho at docker.com> wrote:
> Hi Amir,
> On Sat, Apr 08, 2017 at 09:35:01PM +0200, Amir Goldstein wrote:
>> [moving this discussion over from fsdevel to containers list and
>> changing the title]
>> On Tue, Apr 4, 2017 at 9:07 PM, Tycho Andersen <tycho at docker.com> wrote:
>> > On Tue, Apr 04, 2017 at 09:59:16PM +0300, Amir Goldstein wrote:
>> >> On Tue, Apr 4, 2017 at 9:01 PM, Tycho Andersen <tycho at docker.com> wrote:
>> >> > On Tue, Apr 04, 2017 at 12:47:52PM -0500, Serge E. Hallyn wrote:
>> >> >> > Would lxc-snapshot gain anything from the ability to fsfreeze an overlay
>> >> >> > mount?
>> >> >>
>> >> >> lxc-snapshot only works on stopped containers.  'lxc snapshot' can do live
>> >> >> snapshots using criu.  Tycho, does that do anything right now to freeze the
>> >> >> fs?
>> >> >
>> >> > Not that I'm aware of (CRIU might, but we don't in liblxc).
>> >> >
>> >> >> I'm not sure that freezing all the tasks is necessarily enough to settle
>> >> >> the fs, but I assume you're doing something about that already?
>> >> >
>> >> > I suspect it's not, but we're not doing anything besides freezing the
>> >> > tasks. In fact, we freeze the tasks by using the freezer cgroup,
>> >> > which itself is buggy, since the freezer cgroup can race with various
>> >> > filesystems. So, freezing tasks is hard, and I haven't even thought
>> >> > about how to freeze the fs for real :)
>> >> >
>> >> > But in any case, an fs freezing primitive does sound useful for
>> >> > checkpoint restore, assuming that we're right and freezing the tasks
>> >> > is simply not enough.
>> >> >
>> >>
>> >> So I already asked Pavel that question and he said that freezing
>> >> the tasks is enough. I am not convinced it is really enough to bring
>> >> a file system image (i.e. underlying blockdev) to a quiescent state,
>> >> but I think it may be enough for getting a stable view of the mounted
>> >> file system, so the files could be dumped somewhere.
>> >> I am guessing that is what lxc snapshot does?
>> >
>> > Yes, lxc snapshot is basically just a frontend for CRIU.
>> >
>> >> I still didn't understand wrt lxc snapshot, is there a use case for
>> >> taking live snapshots without using CRIU? (because freezer cgroup
>> >> mentioned races or whatnot?).
>> >
>> > No, I think CRIU is the only project that will ever attempt to do
>> > checkpoint restore this way ;-).
>> I don't doubt that.
>> My question is whether it is interesting to snapshot a live container fs
>> without having to checkpoint or restore at all.
>> >  CRIU supports two different ways of
>> > freezing tasks: one using the freezer cgroup and one without. The one
>> > without doesn't work against fork bombs very well, and the one with
>> > doesn't work because of some filesystems. So it's mostly a container
>> > engine implementation choice which to use.
>> >
>> >> It's definitely possible with btrfs and if my overlayfs freeze patches
>> >> are not terribly wrong, then it should be easy with overlayfs as well.
>> >> Does lxc snapshot already support live snapshot of btrfs container?
>> >
>> > Yes, it does. It freezes the tasks via the cgroup freezer and then
>> > does a btrfs snapshot of the filesystem once the tasks are frozen.
>> >
>> So what I am not sure about is whether there are use cases where criu
>> cannot be used, or reasons not to use it, and whether for these cases
>> it may be interesting to support snapshot of the storage by:
>> - fsfreeze -f
>> - copy upper dir
>> - fsfreeze -u
> I don't see a reason for it, but perhaps I'm not being very
> imaginative. Without the memory state, the potentially inconsistent fs
> state doesn't seem very helpful.

Hi Tycho,

The use case is quite simple really.
It is the same use case as any LVM snapshot or btrfs snapshot on a
non-containerized system:
Before installing some stuff, sync, take a snapshot of the root fs, and
you can always restart your system from that snapshot of the root fs if
something goes wrong.

You don't need to save any memory state for that, and you don't need to
dump any process info either.
It's simply a snapshot that you can *start* from and not *resume* from.

I am quite surprised to learn that containers don't have that
functionality (they don't?).
I guess it may be because containers CAN freeze processes, so they do it,
but it's really not a prerequisite for a live *image* snapshot -
fsfreeze is enough.

The thing is, it is easy to snapshot a container image based on LVM or
btrfs today (the lvm snapshot command does fsfreeze on the file system on
top of the lvm volume), but it is not possible to snapshot a container
image based on overlayfs the same way.

My patches implement fsfreeze for overlayfs and, quite frankly, I am
taken by surprise that container users don't find this useful. I may be
missing something.
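
To make the sequence proposed earlier concrete (fsfreeze -f, copy upper dir,
fsfreeze -u), here is a minimal Python sketch. It assumes the overlay mount
accepts fsfreeze (which depends on the patches discussed here), that it runs
as root, and that the destination lives on a filesystem that is not frozen.
The function name snapshot_upper, the injectable run parameter, and the paths
in the usage note are hypothetical and only for illustration.

```python
import subprocess


def snapshot_upper(mountpoint, upperdir, dest, run=subprocess.run):
    """Freeze an overlay mount, copy its upper dir, then thaw it.

    `run` is injectable so the sequence can be exercised without root.
    """
    # fsfreeze -f: flush dirty data and block new writes on the mount.
    run(["fsfreeze", "-f", mountpoint], check=True)
    try:
        # cp -a keeps ownership, modes, xattrs and device nodes where the
        # caller is allowed to. dest is assumed NOT to be on the frozen
        # filesystem, otherwise the copy would block.
        run(["cp", "-a", upperdir, dest], check=True)
    finally:
        # fsfreeze -u: always thaw, even if the copy failed.
        run(["fsfreeze", "-u", mountpoint], check=True)
    return dest
```

A call might look like snapshot_upper("/mnt/c1", "/var/c1/upper",
"/backup/c1-upper") with made-up paths. The try/finally matters more than the
copy method: a mount left frozen after a failed copy would hang every writer
in the container.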

