[PATCH 1/1] s390: actually restart syscalls after sys_restart

Serge E. Hallyn serue at us.ibm.com
Thu Jan 21 20:11:59 PST 2010


Quoting Oren Laadan (orenl at cs.columbia.edu):
> On Thu, 21 Jan 2010, Serge E. Hallyn wrote:
> 
> > Quoting Oren Laadan (orenl at cs.columbia.edu):
> > > Hi,
> > > 
> > > I'm not sure thix fix is totally correct. 
> > > 
> > > First, a reminder of how -ERESTART... errors behave:
> > > 
> > > 1) If a (user) signal handler was *not* called then:
> > >   -ERESTARTSYS:    re-execute syscall
> > >   -ERESTARTNOHAND:    re-execute syscall   
> > >   -ERESTARTNOINTR:    re-execute syscall
> > >   -ERESTART_RESTARTBLOCK:   arrange to call sys_restart_syscall
> > > 
> > > 2) If a (user) signal handler *was* called then:
> > >   -ERESTARTSYS:    re-execute syscall iff SA_RESTART (else EINTR)
> > >   -ERESTARTNOHAND:    return -EINTR
> > >   -ERESTARTNOINTR:    re-execute syscall
> > >   -ERESTART_RESTARTBLOCK:   return -EINTR
> > > 
> > > The difference between x86 and s390 is _when_ (and how) the task state is
> > > altered (to arrange for re-execution), in case of task freezing:
> > > 
> > > * In x86, changes occur _after_ the task is frozen, and the work is split 
> > > between handle_signal() and do_signal().
> > > 
> > > * In s390, some changes occur _before_ the task is frozen, and possibly 
> > > undone after the freeze if a signal handler was called. All work is done 
> > > is do_signal(). Because of that, our checkpoint only sees a partial view 
> > > of the task's state.
> > > 
> > > Now back to the proposed fix: it isn't sufficient because:
> > > 
> > > (a) If the process has a signal pending when checkpointed, then the state 
> > > already reflects some changes by do_signal(). If the signal leads to a 
> > > (user) signal handler after restart, then do_signal() (after restart) will 
> > > _not_ undo the changes originally done before the checkpoint.
> > > 
> > > This issue holds for ERESTARTSYS, ERESTARTNOHAND and ERESTART_RESTARTBLOCK.
> > > 
> > > Note that do_signal() after restart will not do the changes again either
> > > in case of -ERESTART.... value, because the regs->svcnr was set to 0 and
> > > recorded that way.
> > > 
> > > (b) Same, but also if the process didn't have a signal at checkpoint 
> > > time, but will have one during/after restart but before resuming.
> > > 
> > > (c) Because do_restart() selects its return value from gprs[2] (upon 
> > > successful completion), then when it tries to resume userspace and 
> > > enters do_signal() it will have -EINTR (which isn't the original return 
> > > value!), and therefore not set your new TIF_RESTARTBLOCK bit. If the task 
> > > is now checkpointed again, that state will be lost.
> > 
> > Sorry if I'm misunderstanding what you're saying, but the return value
> > after exiting sys_restart() won't be -EINTR now, bc my patch sets it to
> > __NR_restart_syscall.
> 
> True ... but the consequences are the same:
> 
> Consider task A is checkpointed, then restarted, but before it completely
> resumes userspace, it is checkpointed again. For the second checkpoint,
> do_signal() will see svcnr==0, and even if you set svcnr, then gprs[2]>0.
> Therefore, do_signal() will not set (again) the TIF_RESTARTBLOCK flag
> because that happens only for svcnr!=0 and gprs[2]=-ERESTART_RESTARTBLOCK.
> 
> So the second checkpoint image lost that particular state you were 
> interested. Therefore, when you restart from that image, the gprs[2] will
> be set correctly (if ignoring issues a, b), but the TIF_RESTART_SVC won't
> be set again.
> 
> > 
> > > As a side note: I noticed that on x86 there are {checkpoint,restore}_thread() 
> > > that save the thread flags and restore the necessary flags. They also 
> > > check the flags - at checkpoint to ensure the task is checkpointable, and 
> > > at restore for sanity.  Is there not a need for something similar for s390?
> > > 
> > > That would be an appropriate place to svae and restore a TIF_RESTARTBLOCK
> > > thread flag and fix (c) above, should you stick to that method.
> > 
> > We actually want that flag cleared - the flag is only there so that
> > checkpoint can tell that it needs to set the 'should_restart' flag in
> > the thread info in the checkpoint image.  sys_estart does not then reset
> > that flag, it instead does the steps which are done in the last block
> > of do_signal(): set gprs[2] to NR_syscall_restart and add the TIF_RESTART_SVC
> > thread flag.
> 
> The problem is that sys_restart *does not even set* that flag, and 
> do_signal() will not set that flag either after restarting from a 
> checkpoint that had should_restart set.  Therefore a subsequent checkpoint
> that occurs _before_ the process really resumes to userspace will not 
> carry that information.

Good point.  But your earlier proposal of restoring TIF_RESTARTBLOCK in a
arch_restore_thread() isn't quite right, because it needs to get cleared
at some point, and we wouldn't know when to do that.  I'll have to think
about this more.

(One possibility is to have the top of do_signal() also set
TIF_RESTARTBLOCK if TIF_RESTART_SVC is already set, which it would be
if do_signal were called immediately after sys_restart and before
the original syscall were restarted.  But I suspect there is a complicating
subtlety there as well...)

thanks,
-serge


More information about the Containers mailing list