C/R without "leaks"

Oren Laadan orenl at cs.columbia.edu
Thu Apr 16 11:39:05 PDT 2009



Chris Friesen wrote:
> Alexey Dobriyan wrote:
>> On Thu, Apr 16, 2009 at 12:42:17AM +0200, Greg Kurz wrote:
>>> On Wed, 2009-04-15 at 23:56 +0400, Alexey Dobriyan wrote:
>>
>>>> There are sockets and live netns as the most complex example. I'm not
>>>> prepared to describe it exactly, but people wishing to do C/R with
>>>> "leaks" should be very careful with their wishes.
>>> They should close their sockets before checkpoint and find/have some way
>>> to reconnect after. This implies some kind of C/R awareness in the code
>>> to be checkpointed.
>>
>> How do you imagine sshd closing sockets and reconnecting?
> 
> Don't you already have to handle the case where an sshd connection is
> checkpointed, then the system is shutdown and the restore doesn't happen
> until after the TCP timeout?

Any connection in that case is, of course, lost, and it's up to the
application to do something about it. If the application relies on
the state of the connection, it will have to give up (e.g. sshd, and
ssh, die).

However, there are many applications that can withstand a lost
connection without crashing. They simply retry (web browsers, irc
clients, db clients). With time, there may be more applications that
are 'c/r-aware'.
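
To make the "simply retry" behaviour concrete, here is a minimal sketch
of what such an application might do when its connection disappears
across a checkpoint/restart. It is an illustration only, not taken from
any of the applications mentioned: the function name with_reconnect, its
parameters, and the backoff policy are all assumptions.

```python
import time


def with_reconnect(connect, operation, max_attempts=5, base_delay=0.5,
                   sleep=time.sleep):
    """Run operation(conn) on a fresh connection, retrying on loss.

    connect   -- hypothetical factory returning an object with close()
    operation -- hypothetical callable doing the actual work on conn
    sleep     -- injectable so the backoff can be observed or skipped
    """
    for attempt in range(max_attempts):
        conn = None
        try:
            conn = connect()
            return operation(conn)
        except OSError:
            # ConnectionResetError, BrokenPipeError and TimeoutError are
            # all OSError subclasses: this is what a restarted process
            # would typically see on a socket whose peer has given up.
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide
            sleep(base_delay * 2 ** attempt)  # exponential backoff
        finally:
            if conn is not None:
                conn.close()
```

A client could pass something like
lambda: socket.create_connection((host, port)) as connect, where host
and port are whatever peer it normally talks to. This only helps
applications whose protocol can start over on a new connection; it does
nothing for ones that depend on the state of the old connection, such
as sshd.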

Moreover, in some cases you could, on restart, use a wrapper to
create a new connection to somewhere (*), then ask restart(2) to
use that socket instead of the original, such that from the user's
point of view things continue to work well, transparently.

(*) that somewhere could be the original peer, or another server
if it has a way to somehow continue a cut connection, or a special
wrapper server that you write for that purpose.

Oren.



More information about the Containers mailing list