[Openais] [Corosync] [Patch] [FreeBSD] coroipcc_service_connect() returns 0 in some cases

Jerome Flesch jerome.flesch at netasq.com
Tue Mar 30 06:54:21 PDT 2010


Steven Dake a écrit :
> On Tue, 2010-03-30 at 09:57 +0200, Jerome Flesch wrote:
>>> Patch looks good for commit
>>>
>> Committed.
>>
>>
>>> responses to questions inline
>>>
>>> On Mon, 2010-03-29 at 11:10 +0200, Jerome Flesch wrote:
>>>> Hello,
>>>>
>>>> I just had a weird case where SaClmInitialize() returned the value 0 to 
>>>> me. Since this value isn't valid for a enum SaAisErrorT, I investigated 
>>>> and I found out that corosync/coroipcc.c:coroipcc_service_connect() can 
>>>> return 0 in case Corosync was compiled with _POSIX_THREAD_PROCESS_SHARED 
>>>> <= 0 (ie on a FreeBSD system) when there is a problem with the IPC 
>>>> semaphores. This case is actually likely to happen on a FreeBSD system 
>>>> due to IPC default limits.
>>>>
>>>> In the process, I also changed the way the variable 'res' was used: 
>>>> Since it's an enum, I've assumed it doesn't really make sense to use it 
>>>> to store return values of system calls, so I replaced it by sys_res.
>>>>
>>>> Also, I was wondering if there could be a way to get more details when 
>>>> coroipcc_service_connect() fails. CS_ERR_LIBRARY and CS_ERR_TRY_AGAIN 
>>>> are not exactly what I would call meaningful error codes ... It's not 
>>>> very handy to have to try to guess if it failed because permissions on 
>>>> the Unix socket are wrong or because IPC limits have been reached or 
>>>> because an upper layer failed for some reason. I assume adding error 
>>>> codes like CS_ERR_SEM, CS_ERR_SHM, could be an option but letting such 
>>>> level of implementation details be visible in the error codes is not a 
>>>> valid option, right ?
>>>>
>>> yes i prefer to have less error codes with no implementation details.
>>>
>>>> Right now, when I have to diagnose this kind of issue, I'm running a 
>>>> patched version of Flatiron including the following macro (in lib/util.h):
>>>>
>>>> +#ifdef DEBUG
>>>> +#define DPRINTX(...) do {                                            \
>>>> +     fprintf(stderr, "Corosync lib: %s:L%d: ", __FILE__, __LINE__);  \
>>>> +     fprintf(stderr, __VA_ARGS__);                                   \
>>>> +     fprintf(stderr, "\n");                                          \
>>>> +   } while (0)
>>>> +#else
>>>> +#define DPRINTX(...)
>>>> +#endif
>>>>
>>>> Is there a more adequate way of getting debug information from the 
>>>> libraries themselves ?
>>>>
>>>> I was thinking of a slightly more elegant way of doing it. I could add 
>>>> functions like for example: 'void coroipcc_set_lasterror(const char 
>>>> *str)', 'const char *coroipcc_get_lasterror(void)' ? Using 
>>>> pthread_setspecific(), it would be possible to make it thread-safe. Do 
>>>> you think it would be worth I spend some time on making a patch for that ?
>>>>
>>> sounds interesting.  I wouldn't mind some diagnostic capability coming
>>> out of the apis, but the thread safety was a problem I didn't know how
>>> to solve.  After reading the man page for pthread_setspecific, it looks
>>> like it might solve the problem.
>>>
>>> I would prefer apis something like this in coroipcc.c
>>> static void coroipcc_lasterror_set (const char *fmt, ...);
>>>
>> It makes sense. It would allow to get more details like for instance 
>> strerror(errno).
>>
>>
>>> and in coroipcc.h
>>> extern const char *coroipcc_lasterror_get (void).
>>>
>>> Curious how you plan to create the key and garbage collect the
>>> pthread_key values...
>>>
>> It's pretty simple actually: pthread_key_create() allows to attach a 
>> destructor to a pthread key. This destructor is called each time a 
>> thread stops, with as argument the pointer stored with 
>> pthread_setspecific for this thread. So if we store a simple buffer with 
>> pthread_setspecific, we can simply attach 'free()' to the key :)
>>
>>
> 
> How do you pthread_key_create?  Use pthread_once in each api call?  I
> want to avoid a global init function for apis.  pthread_once may have
> performance implications if it takes a mutex each time it runs.  Another
> option is to use a constructor section in the library.
> 
> Like to see a patch..
> 
And here it is. Of course, it's more a draft than a definitive patch. 
Also, it would be nice to know if we go for this kind of implementation 
or if we go for Angus' idea using a callback.

FYI, with this patch, when I run test/cpgverify without having a 
Corosync running, I get:
% ./cpgverify 
 

Couldn't initialize CPG service 6 : coroipcc.c, L750: connect() failed 
(61: Connection refused)


> Regards
> -steve
> 
>>> This could have good implications for the other apis beyond coroipcc...
>>>
>>> Regards
>>> -steve
>>>
>>>> _______________________________________________
>>>> Openais mailing list
>>>> Openais at lists.linux-foundation.org
>>>> https://lists.linux-foundation.org/mailman/listinfo/openais
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: coroipcc_lasterror_get.patch
Type: text/x-patch
Size: 12749 bytes
Desc: not available
Url : http://lists.linux-foundation.org/pipermail/openais/attachments/20100330/55050e15/attachment-0001.bin 


More information about the Openais mailing list