[PATCH] cr_tests: Fix hang when robust futex lists are not restored during restart

Matt Helsley matthltc at us.ibm.com
Thu Jul 9 12:22:07 PDT 2009


The robust futex test can hang if the kernel fails to set the robust list
pointer properly. This currently happens during restart. The test should
report failure instead of hanging.

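Use a timeout to ensure that hangs are caught and reported as failure.
When the timeout expires, futex() fails with ETIMEDOUT, which the test now
treats as a failure instead of retrying. The timeout also bounds how long
checkpoint/restart can take before the test fails, so a suitable value is
essential; this patch uses 5 seconds.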

Signed-off-by: Matt Helsley <matthltc at us.ibm.com>
Reported-by: Serge Hallyn <serue at linux.vnet.ibm.com>
--
Still needs testing.
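
For reference, a minimal sketch of the timeout behaviour the commit message
relies on, assuming Linux and a raw syscall(SYS_futex, ...) call.
futex_wait_timeout() is a hypothetical helper written for illustration; it is
not part of cr_tests and does not replace the futex() wrapper used in
robust.c.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from cr_tests): sleep on *uaddr for as long as it
 * still holds `expected`, giving up after timeout_ms milliseconds.
 * Returns 0 if woken by FUTEX_WAKE, otherwise the errno set by futex(2).
 */
static int futex_wait_timeout(uint32_t *uaddr, uint32_t expected, long timeout_ms)
{
	struct timespec timeout;

	assert(timeout_ms >= 0);
	timeout.tv_sec = timeout_ms / 1000;
	timeout.tv_nsec = (timeout_ms % 1000) * 1000000L;

	/* For FUTEX_WAIT the timeout is relative, not an absolute deadline. */
	if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &timeout, NULL, 0) == 0)
		return 0;
	return errno;
}
```

Waiting on a word that nobody wakes, e.g. futex_wait_timeout(&word, word, 50),
should come back with ETIMEDOUT after roughly 50 ms rather than blocking
forever. If the word no longer holds the expected value, the call fails
immediately with EAGAIN instead of sleeping.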

diff --git a/futex/robust.c b/futex/robust.c
index a52f638..4cda4f7 100644
--- a/futex/robust.c
+++ b/futex/robust.c
@@ -103,6 +103,10 @@ void add_rfutex(struct futex *rf)
 
 void acquire_rfutex(struct futex *rf, pid_t tid)
 {
+	struct timespec timeout = {
+		.tv_sec = 5,
+		.tv_nsec = 0
+	};
 	int val = 0;
 
 	rlist.list_op_pending = &rf->rlist; /* ARCH TODO make sure this assignment is atomic */
@@ -125,7 +129,7 @@ void acquire_rfutex(struct futex *rf, pid_t tid)
 		val = __sync_or_and_fetch(&rf->tid.counter, FUTEX_WAITERS);
 		log("INFO", "futex(FUTEX_WAIT, %x)\n", val);
 		if (futex(&rf->tid.counter, FUTEX_WAIT, val,
-			  NULL, NULL, 0) == 0)
+			  &timeout, NULL, 0) == 0)
 			break;
 		log("INFO", "futex returned with errno %d (%s).\n", errno, strerror(errno));
 		switch(errno) {
@@ -139,8 +143,9 @@ void acquire_rfutex(struct futex *rf, pid_t tid)
 				log("WARN", "EINTR while sleeping on futex\n");
 				continue;
 			case ETIMEDOUT:
-				log("WARN", "ETIMEDOUT while sleeping on futex\n");
-				continue;
+				log("FAIL", "ETIMEDOUT while sleeping on futex.\n");
+				fail++;
+				return;
 			case EACCES:
 				log("FAIL", "FUTEX_WAIT EACCES - no read access to futex memory\n");
 				fail++;

