[Openais] trouble getting started with corosync API
Dan Davis
dan.davis at indexengines.com
Mon Mar 29 13:40:45 PDT 2010
Hi,
I'm new to corosync, and I apologize in advance if this is a beginner's problem. I'm trying the API to see whether it is appropriate for solving some technical problems in getting multiple nodes of our software to cooperate. Initially I only need to bootstrap a very non-cluster-like cooperation mode, but later I'll use more of its capabilities.
When I start the corosync executive, it appears to work properly (now that I've figured out consensus 1201), but when I start another process that uses the CPG protocol, I get CS_ERR_TRY_AGAIN. I see no correlated activity in the corosync log when I do this. Here's the output of the executive:
[dan at ohio corosync-1.2.0]$ sudo corosync -f
Mar 29 16:22:42 corosync [MAIN ] Corosync Cluster Engine ('1.2.0'): started and ready to provide service.
Mar 29 16:22:42 corosync [MAIN ] Corosync built-in features:
Mar 29 16:22:42 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Mar 29 16:22:42 corosync [TOTEM ] Initializing transport (UDP/IP).
Mar 29 16:22:42 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Mar 29 16:22:42 corosync [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Mar 29 16:22:42 corosync [TOTEM ] The network interface [192.168.192.113] is now up.
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync extended virtual synchrony service
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync configuration service
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync cluster config database access v1.01
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync profile loading service
Mar 29 16:22:42 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
It reaches poll_run() in main - I checked in gdb.
Here's the output of testcpg:
[dan at ohio sando]$ cd corosync-1.2.0/test
[dan at ohio test]$ sudo ./testcpg
Local node id is 71c0a8c0
Could not join process group, error 6
[dan at ohio test]$
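Error 6 is the CS_ERR_TRY_AGAIN mentioned above, which is normally a "not ready yet, retry later" result rather than a hard failure. As a rough sketch of how a client might handle it, the loop below retries a join call a bounded number of times with a short pause in between. Everything here is a stand-in so that it compiles without the corosync headers: the SKETCH_* constants, join_fn, join_with_retry, sleep_ms and fake_join are hypothetical names, and only the value 6 comes from the output above (the value 1 for success is my assumption about the library's enum). If the executive never becomes ready, retrying will not help, so this only separates a transient condition from a persistent one.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Stand-in result codes (hypothetical names). 6 matches the "error 6"
 * printed by testcpg; 1 for success is an assumption. */
enum {
    SKETCH_CS_OK = 1,
    SKETCH_CS_ERR_TRY_AGAIN = 6
};

/* Callback that performs one join attempt and returns a result code.
 * In a real client this would wrap the actual join call. */
typedef int (*join_fn)(void *ctx);

/* Sleep for the given number of milliseconds. */
void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Call join() until it returns something other than "try again",
 * or until max_attempts calls have been made. Returns the last code. */
int join_with_retry(join_fn join, void *ctx, int max_attempts, long delay_ms)
{
    int rc = SKETCH_CS_ERR_TRY_AGAIN;

    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        rc = join(ctx);
        if (rc != SKETCH_CS_ERR_TRY_AGAIN)
            return rc;              /* success or a hard error */
        fprintf(stderr, "join attempt %d of %d: try again\n",
                attempt, max_attempts);
        if (attempt < max_attempts)
            sleep_ms(delay_ms);     /* pause before the next attempt */
    }
    return rc;                      /* still "try again" after all attempts */
}

/* Fake join for experimenting: ctx points to an int holding how many
 * times to report "try again" before reporting success. */
int fake_join(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_CS_ERR_TRY_AGAIN;
    }
    return SKETCH_CS_OK;
}
```

In a real client the callback would presumably wrap cpg_join() on an already initialized handle and compare against the library's own CS_OK and CS_ERR_TRY_AGAIN. If the loop still ends with "try again" after several seconds, the problem is more likely in the executive's state than in the client.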
A major clue is that corosync will not exit on a SIGINT or SIGTERM, even though it ought to do so. I need to kill -9 to get the executive to go down and try again. Is this a known issue in 1.2.0 or something likely correlated with my problem?
I've seen the same behavior with my hand-compiled corosync 1.2.0 on a Fedora 7 box and with an rpm on Fedora 12 with cluster-glue installed and started. Thanks to anyone who can help me get started,
Dan Davis
www.indexengines.com