[cgl_discussion] Question for TEMs/ISVs/OEMs regarding pthread requirements

Thu Feb 20 01:37:52 PST 2003

Perez-Gonzalez, Inaky wrote:
>>>>Inaky (me)
>>>
>>>George Anzinger
>>
>>Mohamed Abbas
> 
> 
>>>>// So now ZF = 0, eax = *futex = LID0 | WP, ecx = MY_LID
>>>>jz got_it // ZF = 0, so we don't take the jump
>>>>ecx = eax;  // ecx = LID0 | WP
>>>>ecx |= WP;  // mark the WP again, although it's already there in this case
>>>
>>>
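The compare-and-exchange step quoted above can be sketched with C11 atomics. This is only an illustration, not code from the thread: the bit position of WP, the convention that 0 means "unlocked", and the name lock_fast_path() are all assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: WP (waiters present) is the top bit and 0 means "unlocked". */
#define WP 0x80000000u

/*
 * Hypothetical user-space fast path.  Returns true if the lock was free
 * and now holds my_lid (the "jz got_it" branch).  Otherwise it stores in
 * *contended the observed value with WP marked, which is what the quoted
 * pseudo-assembly leaves in ecx, and leaves the futex word untouched.
 */
static bool lock_fast_path(_Atomic uint32_t *futex, uint32_t my_lid,
                           uint32_t *contended)
{
    uint32_t seen = 0;  /* plays the role of eax: expected value on entry */

    assert(my_lid != 0 && (my_lid & WP) == 0);

    if (atomic_compare_exchange_strong(futex, &seen, my_lid))
        return true;            /* ZF = 1: we own the lock */

    *contended = seen | WP;     /* ZF = 0: seen now holds *futex */
    return false;
}
```

Whether the contender should then write that WP-marked value back from user space, or leave the marking to the kernel, is exactly the point under discussion below.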
>>>I think this is an error.  I understand that you want to mark
>>>contention on the futex as soon as possible; however, I think it will
>>>cause problems if you mark it BEFORE you have set up the kernel
>>>structures to handle the exit from a contended futex.  Better, I
>>>think, to just leave it to the kernel to mark it contended once it
>>>either has all the structures in place, or at least will have prior
>>>to letting anyone else see the flag.  All that is at risk here is
>>>that the current holder will exit and leave the futex free (or that
>>>another will then grab it).  If you mark it here, you will have to
>>>handle the case of the holder beating the contender to the kernel and
>>>not knowing what to do...
> 
> 
> That case is easy to handle, because you can do it with the same mechanism
> [if the value changes, do something]. Actually, what I think is the best
> solution for all the problems [and the ones you comment on afterwards] is
> that value passing: if it changed, go to user space and re-contend. It
> could even be optimized by re-contending in the kernel [to avoid the
> overhead of going back to userspace]; I still don't know if it can be
> done, though.
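The value-passing idea can be illustrated with a small user-space simulation. This is a sketch under stated assumptions: wait_if_unchanged() is a hypothetical stand-in for the kernel entry point, and the choice of -EAGAIN as the "value changed, re-contend" result is an assumption, not something fixed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel side of a wait call.  The caller
 * passes the value it saw in user space; if the futex word no longer
 * matches, nothing is queued and the caller is told to re-contend.
 */
static int wait_if_unchanged(_Atomic uint32_t *futex, uint32_t expected)
{
    assert(futex != NULL);

    if (atomic_load(futex) != expected)
        return -EAGAIN;  /* value changed on the way in: re-contend */

    return 0;            /* a real kernel would queue the caller and sleep */
}
```

A caller would loop: read the word, try the fast path, and call the wait entry point again whenever it returns -EAGAIN.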
> 
> 
>>What I am saying is that the value is not important.  All it needs is 
>>a pointer to it.  Once the kernel has everything locked down it can 
>>look at the value and decide what to do.  I don't think the kernel 
>>needs to return to the user until the lock is granted in his favor. 
>>What are the possibilities?
> 
> 
> I agree with that, but whether it can be done is another matter:
>  
> 
>>a) The lock is held by another user and is uncontended, i.e. this is 
>>the first contention. (normal)
> 
> 
> Then you need to set the WAITERS_PRESENT bit atomically.
> 
> 
>>b) The lock is held by another user and is contended, i.e. this is a 
>>second or higher contender (also normal)
> 
> 
> You need to make sure the WAITERS_PRESENT bit is set.

If it is not, then you have a corrupt mutex.  Time to do something 
drastic.  Actually I assumed that the WAITERS_PRESENT bit is what made 
the kernel think it was case b.  I suppose there could be kernel 
tables for the mutex which also say the same thing.  In case they 
don't you have corruption :(
> 
> 
>>c) The lock is held by the caller (this is a recursion error and 
>>should cause termination of the caller with extreme prejudice)
> 
> 
> I'd return -EDEADLOCK or something like that, but probably termination makes
> sense too ...

Too many folks don't check for errors  :(
> 
> 
>>d) The lock is not held (i.e. the holder has gone away while the call 
>>was making its way to the kernel, fine, give the lock to the caller 
>>and continue)
> 
> 
> So you need to set your locker ID.

Yes, or go back and retry from user space.  May as well do it in the 
kernel as it avoids possible contention and you are already here.
> 
> Thus, doing a, b and d is the contend operation we did in user space. Now if
> we can only do it in kernel space, this works. 

Actually, a and b require kernel help.  For d, if you get to the 
kernel, something changed on the way in.
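To make cases a through d concrete, here is a minimal sketch of a contend operation written as a compare-and-exchange loop with C11 atomics. It is not the implementation discussed here: the word layout (WP in the top bit, locker ID in the low 31 bits, 0 meaning unlocked), the names contend() and contend_result, and the use of -EDEADLK for case c are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: top bit = WAITERS_PRESENT, low 31 bits = locker ID. */
#define WP        0x80000000u
#define LID_MASK  0x7fffffffu

enum contend_result {
    CONTEND_ACQUIRED  = 0,  /* case d: lock was free, caller now owns it */
    CONTEND_MUST_WAIT = 1   /* cases a and b: WP is set, caller must wait */
};

/* Returns a contend_result, or -EDEADLK for case c (caller is the owner). */
static int contend(_Atomic uint32_t *futex, uint32_t my_lid)
{
    uint32_t old = atomic_load(futex);

    assert(my_lid != 0 && (my_lid & WP) == 0);

    for (;;) {
        uint32_t owner = old & LID_MASK;
        uint32_t desired;

        if (owner == my_lid)
            return -EDEADLK;                /* case c: recursion error */

        if (owner == 0)
            desired = my_lid | (old & WP);  /* case d: take the lock */
        else
            desired = old | WP;             /* cases a and b: ensure WP */

        /* On failure, old is reloaded with the current value; retry. */
        if (atomic_compare_exchange_weak(futex, &old, desired))
            return owner == 0 ? CONTEND_ACQUIRED : CONTEND_MUST_WAIT;
    }
}
```

Because every decision is re-evaluated after a failed exchange, the state that caused entry does not matter: whichever case holds at the moment the exchange succeeds is the one acted on.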

> So question for the readers:
> 
> Can I use a normal pointer to access data in a user page that I have
> pinned in memory (by pinned I mean I know it is present and I did
> get_page() on it)? In other words, without having to use get_user(),
> something like mapping it into the kernel address space as well?
> 
> 
> 
>>I think that covers it.  Knowing what the state was that caused entry
>>into the kernel is basically a big so what.  One of the above will be
>>correct and we can continue.  I suppose we need to also contend with
>>the case:
>>e) The lock is garbage (should handle the same as c)
>>
>>The issue with e) is how to notice that it is garbage.  Just what does
>>a valid lock look like?  We also need to contend with locks that are
>>held by processes that have terminated.  Since, on the high speed
>>path, the kernel does not know about the lock, how do we handle locks
>>that are left by such processes?  Or must the process "register" a
>>mutex before first use?  This would allow the kernel to check each
>>mutex on exit to ensure that the exiting task does not hold it.
> 
> 
> This is taken care of by the robust mutex support. Any contender that jumps
> into the kernel and sees that the futex_q has no registered owner needs to
> look it up; if it finds none, the futex is assumed to be garbage and the
> recovery process is started, if so requested.
> 
> If it finds it, it sets it in the futex data and also sets a link from the
> task structure to the futex data, so that on process death, it can be
> accounted for.
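The bookkeeping described above (owner recorded in the futex data, plus a link from the task so that held futexes can be found on death) could look roughly like the following. This is a user-space sketch only; struct task, struct futex_data, register_owner() and release_on_exit() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; they only mirror the description. */
struct futex_data {
    uint32_t owner_lid;            /* 0 = no registered owner */
    bool owner_died;               /* set when the owner exits holding it */
    struct futex_data *next_held;  /* link in the owner's list */
};

struct task {
    uint32_t lid;
    struct futex_data *held;       /* futexes this task is registered on */
};

/* Record the owner in the futex data and link it from the task. */
static void register_owner(struct task *t, struct futex_data *fd)
{
    assert(t != NULL && fd != NULL);
    fd->owner_lid = t->lid;
    fd->owner_died = false;
    fd->next_held = t->held;
    t->held = fd;
}

/* On task death, account for every futex still held; returns the count. */
static int release_on_exit(struct task *t)
{
    int n = 0;

    assert(t != NULL);
    while (t->held != NULL) {
        struct futex_data *fd = t->held;

        t->held = fd->next_held;
        fd->next_held = NULL;
        fd->owner_lid = 0;
        fd->owner_died = true;  /* lets a later contender start recovery */
        n++;
    }
    return n;
}
```

A real implementation would also need locking around the list and a way to unlink a futex on normal unlock; both are left out to keep the sketch short.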

Great, I think we understand and agree at this point.  I assume that 
the mapping you want is possible, but I don't know how to do it.
> 
> Inaky Perez-Gonzalez -- Not speaking for Intel - opinions are my own [or my
> fault]
> 
> 

-- 
George Anzinger   george at mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml