[PATCH] Clear the objhash before completing restart, but delay free until later
danms at us.ibm.com
Mon Oct 18 08:03:13 PDT 2010
MH> If we postpone clearing the object hash until restart returns to
MH> userspace, there can be a race where the restarted tasks behave
MH> differently due to the references held by the objhash. One
MH> specific example of this is restarting half-closed pipes. Without
MH> this patch we've got a race between the coordinator -- about to
MH> clear the object hash -- and two restarted tasks connected via a
MH> half-closed pipe. Because the object hash contains a reference to
MH> both ends of the pipe, one end of the pipe will not be closed and
MH> EPIPE/SIGPIPE won't be raised when, for instance, writing to the
MH> pipe after its read end was closed. As far as the restarted
MH> userspace task can tell, the pipe may briefly appear to re-open.
MH> Moving the object hash clear earlier prevents this race and others
MH> like it.
MH> Note that eventually the coordinator would close the pipe and
MH> correct behavior would be restored. Thus this bug would only
MH> affect the correctness of userspace -- after a close() the pipe
MH> may briefly re-open and allow more data to be sent before
MH> automatically closing again.
Sure, this sounds fine, and I'll be glad to put it into the patch.
MH> You might simplify this by making the queue portion into a
MH> separate patch. Then we can discuss that independently of moving
MH> the objhash clear call.
Hmm, I'm not sure what you mean exactly. The only thing in the patch
other than the queue is the one-line additional call to _clear().
Without the additional call, the queue is nothing but overhead...
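
For anyone following along, the half-closed pipe case quoted above can be
reproduced from userspace without any checkpoint/restart code at all. The
sketch below is only an analogy, not the kernel patch: a dup()'d descriptor
stands in for the extra reference the objhash holds, and closing it stands in
for the coordinator clearing the objhash. The names write_result, demo, and
held are made up for this illustration.

```python
import os


def write_result(fd):
    """Attempt a one-byte write and report whether it hit EPIPE."""
    try:
        os.write(fd, b"x")
        return "ok"
    except BrokenPipeError:
        # Python ignores SIGPIPE by default, so EPIPE surfaces as an exception.
        return "EPIPE"


def demo():
    r, w = os.pipe()
    held = os.dup(r)          # stand-in for the reference held by the objhash
    os.close(r)               # the task closes its read end
    before = write_result(w)  # read end is still alive through `held`
    os.close(held)            # stand-in for clearing the objhash
    after = write_result(w)   # no readers remain, so the write fails
    os.close(w)
    return before, after


results = demo()
print(results)  # ('ok', 'EPIPE')
```

The first write succeeds even though the task already closed its read end,
which is the "pipe briefly appears to re-open" behavior; only once the last
reference is dropped does the writer see EPIPE.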
IBM Linux Technology Center
email: danms at us.ibm.com