[Openais] I have problems with openais, please help me! Thank you very much

Tiantian Liu tiantianl at gmail.com
Thu Sep 7 01:29:50 PDT 2006


Hi,
 I am a newcomer to openais. I am using 2 computers (I call them nodes below)
and a switch to build a cluster with openais. I downloaded openais-0.80.tar.gz
from the network and used "make" and "make install" to compile and install the
cluster software on both nodes.

 On node1:
  (1) I used "keygen" to generate an authkey file at /etc/ais/authkey.
  (2) I configured the node's NIC with the static IP address 192.168.5.100.
  (3) I copied openais.conf and amf.conf (which are in the /conf directory) to
/etc/ais.

  The original openais.conf is very simple: it has just a totem section, a
logging section, and an amf section. The content of openais.conf is:

 # Please read the openais.conf.5 manual page
 totem {
   version: 2
   secauth: off
   threads: 0
   interface {
     ringnumber: 0
     bindnetaddr: 192.168.5.0
     mcastaddr: 226.94.1.1
     mcastport: 5405
   }
 }

 logging {
  to_stderr: yes
  to_file: yes
  logfile: /tmp/ais
  debug: off
  timestamp: on
 }

 amf {
  mode: disabled
 }
  (4) I changed bindnetaddr to 192.168.5.0 and changed nothing else in
openais.conf.
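
 A minimal sketch of how the bindnetaddr value in step (4) relates to the node
address from step (2), assuming a 255.255.255.0 netmask (the netmask is not
given in this message, so it is an assumption). The helper name
network_address is hypothetical and only for illustration:

```python
import ipaddress


def network_address(ip: str, netmask: str = "255.255.255.0") -> str:
    """Return the network address that an interface address belongs to.

    Hypothetical helper; the default /24 netmask is an assumption.
    """
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return str(iface.network.network_address)


# Node address taken from step (2) above.
print("192.168.5.100 ->", network_address("192.168.5.100"))
```

 Under that netmask assumption, any address in the same subnet maps to the
same network address.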

 There is no "timeout" section like the one the QUICKSTART describes. Should I
add the timeout section myself?

 On node2:
  (1) I copied the authkey file from node1 and used this command to install the
key:
   install -D --group=0 --owner=0 --mode=0400 "/path_to_copy" /etc/ais/authkey
   Nothing is printed when I enter this command.
  (2) I configured the node2 NIC with the static IP address 192.168.5.200. Just
as I did on node1, I changed bindnetaddr to 192.168.5.0 and changed nothing
else in openais.conf.
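
 Because the install command in step (1) prints nothing, one way to confirm
its effect is to inspect the installed file. The following is a minimal
sketch, not part of openais: the function name check_authkey is hypothetical,
and the expected values (mode 0400, owner 0, group 0) are simply the ones
requested on the install command line above.

```python
import os
import stat


def check_authkey(path="/etc/ais/authkey", mode=0o400, uid=0, gid=0):
    """Return a list of differences from the requested mode and ownership.

    Hypothetical helper; an empty list means the file matches.
    """
    st = os.stat(path)
    problems = []
    actual_mode = stat.S_IMODE(st.st_mode)  # permission bits only
    if actual_mode != mode:
        problems.append(f"mode is {actual_mode:04o}, expected {mode:04o}")
    if st.st_uid != uid:
        problems.append(f"owner uid is {st.st_uid}, expected {uid}")
    if st.st_gid != gid:
        problems.append(f"group gid is {st.st_gid}, expected {gid}")
    return problems


if __name__ == "__main__":
    key_path = "/etc/ais/authkey"
    if os.path.exists(key_path):
        print(check_authkey(key_path) or "authkey matches the install options")
    else:
        print(f"{key_path} does not exist on this machine")
```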

 So, after these steps, the openais.conf files on the two nodes are identical.
I think there must be something wrong. Please tell me how to configure
openais.conf. Thank you very much!

  I first ran the openais executable, "aisexec", on node1. It printed many
messages like this:

[MAIN ] AIS Executive Service RELEASE 'subrev 1152 version 0.80'
[MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
[MAIN ] Copyright (C) 2006 Red Hat, Inc.
[MAIN ] openais component openais_cpg loaded.
[MAIN ] Registering service handler 'openais cluster closed process group
service v1.01'
[MAIN ] openais component openais_cfg loaded.
[MAIN ] Registering service handler 'openais configuration service'
[MAIN ] openais component openais_msg loaded.
[MAIN ] Registering service handler 'openais message service B.01.01'
[MAIN ] openais component openais_lck loaded.
[MAIN ] Registering service handler 'openais distributed locking service
B.01.01'
[MAIN ] openais component openais_evt loaded.
[MAIN ] Registering service handler 'openais event service B.01.01'
[MAIN ] openais component openais_ckpt loaded.
[MAIN ] Registering service handler 'openais checkpoint service B.01.01'
[MAIN ] openais component openais_amf loaded.
[MAIN ] Registering service handler 'openais availability management
framework B.01.01'
[MAIN ] openais component openais_clm loaded.
[MAIN ] Registering service handler 'openais cluster membership service
B.01.01'
[MAIN ] openais component openais_evs loaded.
[MAIN ] Registering service handler 'openais extended virtual synchrony
service'
[TOTEM] Token Timeout (1000 ms) retransmit timeout (238 ms)
[TOTEM] token hold (180 ms) retransmits before loss (4 retrans)
[TOTEM] join (100 ms) consensus (200 ms) merge (200 ms)
[TOTEM] downcheck (1000000 ms) fail to recv const (50 msgs)
[TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
[TOTEM] window size per rotation (50 messages) maximum messages per rotation
(17 messages)
[TOTEM] send threads (0 threads)
[TOTEM] RRP token expired timeout (238 ms)
[TOTEM] RRP token problem counter (2000 ms)
[TOTEM] RRP threshold (10 problem count)
[TOTEM] RRP mode set to none.
[TOTEM] heartbeat_failures_allowed (0)
[TOTEM] max_network_delay (50 ms)
[TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
[TOTEM] Receive multicast socket recv buffer size (221184 bytes).
[TOTEM] Transmit multicast socket send buffer size (221184 bytes).
[TOTEM] The network interface [192.168.5.200] is now up.
[TOTEM] Created or loaded sequence id 0.192.168.5.200 for this ring.
[TOTEM] entering GATHER state.
[SERV ] Initialising service handler 'openais extended virtual synchrony
service'
[SERV ] Initialising service handler 'openais cluster membership service
B.01.01'
[SERV ] Initialising service handler 'openais availability management
framework B.01.01'
[SERV ] Initialising service handler 'openais checkpoint service B.01.01'
[SERV ] Initialising service handler 'openais event service B.01.01'
[SERV ] Initialising service handler 'openais distributed locking service
B.01.01'
[SERV ] Initialising service handler 'openais message service B.01.01'
[SERV ] Initialising service handler 'openais configuration service'
[SERV ] Initialising service handler 'openais cluster closed process group
service v1.01'
[SYNC ] Not using a virtual synchrony filter.
[MAIN ] AIS Executive Service: started and ready to provide service.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
    ...................

 Then it enters an infinite loop printing "[TOTEM] The token was lost in state
2 from timer 8b90b60".
 What is wrong?
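
 To make the pattern in the output above easier to see, here is a minimal
sketch that summarizes an aisexec log: which interface came up, which totem
states were entered, and how often the token was lost. The function name
summarize and the SAMPLE excerpt are illustrative only; the regular
expressions are written against the message formats shown in this post and
may not match other openais versions.

```python
import re

# Patterns based on the log lines quoted in this message.
STATE_RE = re.compile(r"\[TOTEM\] entering (\w+) state\.")
IFACE_RE = re.compile(r"\[TOTEM\] The network interface \[([0-9.]+)\] is now up\.")
TOKEN_LOST = "[TOTEM] The token was lost"


def summarize(log_text):
    """Collect the bound interface, entered states, and token-loss count."""
    summary = {"interface": None, "states": [], "token_losses": 0}
    for raw in log_text.splitlines():
        line = raw.strip()
        iface = IFACE_RE.match(line)
        state = STATE_RE.match(line)
        if iface:
            summary["interface"] = iface.group(1)
        elif state:
            summary["states"].append(state.group(1))
        elif line.startswith(TOKEN_LOST):
            summary["token_losses"] += 1
    return summary


# Short excerpt copied from the output above.
SAMPLE = """\
[TOTEM] The network interface [192.168.5.200] is now up.
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
[TOTEM] entering GATHER state.
[TOTEM] The token was lost in state 2 from timer 8b90b60
"""

result = summarize(SAMPLE)
print(result)
print("reached OPERATIONAL:", "OPERATIONAL" in result["states"])
```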

 On node2, I also ran aisexec so that node2 would join the cluster, but these
messages were shown:

[MAIN ] AIS Executive Service RELEASE 'subrev 1152 version 0.80'
[MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
[MAIN ] Copyright (C) 2006 Red Hat, Inc.
[MAIN ] openais component openais_cpg loaded.
[MAIN ] Registering service handler 'openais cluster closed process group
service v1.01'
[MAIN ] openais component openais_cfg loaded.
[MAIN ] Registering service handler 'openais configuration service'
[MAIN ] openais component openais_msg loaded.
[MAIN ] Registering service handler 'openais message service B.01.01'
[MAIN ] openais component openais_lck loaded.
[MAIN ] Registering service handler 'openais distributed locking service
B.01.01'
[MAIN ] openais component openais_evt loaded.
[MAIN ] Registering service handler 'openais event service B.01.01'
[MAIN ] openais component openais_ckpt loaded.
[MAIN ] Registering service handler 'openais checkpoint service B.01.01'
[MAIN ] openais component openais_amf loaded.
[MAIN ] Registering service handler 'openais availability management
framework B.01.01'
[MAIN ] openais component openais_clm loaded.
[MAIN ] Registering service handler 'openais cluster membership service
B.01.01'
[MAIN ] openais component openais_evs loaded.
[MAIN ] Registering service handler 'openais extended virtual synchrony
service'
[TOTEM] Token Timeout (1000 ms) retransmit timeout (238 ms)
[TOTEM] token hold (180 ms) retransmits before loss (4 retrans)
[TOTEM] join (100 ms) consensus (200 ms) merge (200 ms)
[TOTEM] downcheck (1000000 ms) fail to recv const (50 msgs)
[TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
[TOTEM] window size per rotation (50 messages) maximum messages per rotation
(17 messages)
[TOTEM] send threads (0 threads)
[TOTEM] RRP token expired timeout (238 ms)
[TOTEM] RRP token problem counter (2000 ms)
[TOTEM] RRP threshold (10 problem count)
[TOTEM] RRP mode set to none.
[TOTEM] heartbeat_failures_allowed (0)
[TOTEM] max_network_delay (50 ms)
[TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
[TOTEM] Receive multicast socket recv buffer size (262142 bytes).
[TOTEM] Transmit multicast socket send buffer size (262142 bytes).
[TOTEM] The network interface [192.168.5.100] is now up.
[TOTEM] Created or loaded sequence id 5504.192.168.5.100 for this ring.
[TOTEM] entering GATHER state.
[SERV ] Initialising service handler 'openais extended virtual synchrony
service'
[SERV ] Initialising service handler 'openais cluster membership service
B.01.01'
[SERV ] Initialising service handler 'openais availability management
framework B.01.01'
[SERV ] Initialising service handler 'openais checkpoint service B.01.01'
[SERV ] Initialising service handler 'openais event service B.01.01'
[SERV ] Initialising service handler 'openais distributed locking service
B.01.01'
[SERV ] Initialising service handler 'openais message service B.01.01'
[SERV ] Initialising service handler 'openais configuration service'
[SERV ] Initialising service handler 'openais cluster closed process group
service v1.01'
[SYNC ] Not using a virtual synchrony filter.
[MAIN ] AIS Executive Service: started and ready to provide service.
[TOTEM] Creating commit token because I am the rep.
[TOTEM] Saving state aru 0 high seq received 0
[TOTEM] Storing new sequence id for ring 5508
[TOTEM] entering COMMIT state.
[TOTEM] entering RECOVERY state.
[TOTEM] position [0] member 192.168.5.100:
[TOTEM] previous ring seq 5504 rep 192.168.5.100
[TOTEM] aru 0 high delivered 0 received flag 0
[TOTEM] Did not need to originate any messages in recovery.
[TOTEM] Sending initial ORF token
[CLM  ] CLM CONFIGURATION CHANGE
[CLM  ] New Configuration:
[CLM  ] Members Left:
[CLM  ] Members Joined:
[SYNC ] This node is within the primary component and will provide service.
[CLM  ] CLM CONFIGURATION CHANGE
[CLM  ] New Configuration:
[CLM  ]  r(0) ip(192.168.5.100)
[CLM  ] Members Left:
[CLM  ] Members Joined:
[CLM  ]  r(0) ip(192.168.5.100)
[SYNC ] This node is within the primary component and will provide service.
[TOTEM] entering OPERATIONAL state.
[SYNC ] Synchronization barrier completed
[SYNC ] Synchronization actions starting for (openais cluster membership
service B.01.01)
[CLM  ] got nodejoin message 192.168.5.100
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais cluster membership service
B.01.01)
[SYNC ] Synchronization actions starting for (openais availability
management framework B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais availability management
framework B.01.01)
[SYNC ] Synchronization actions starting for (openais checkpoint service
B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais checkpoint service B.01.01)
[SYNC ] Synchronization actions starting for (openais event service B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais event service B.01.01)
[SYNC ] Synchronization actions starting for (openais cluster closed process
group service v1.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais cluster closed process group
service v1.01)
[MAIN ] AIS Executive Service RELEASE 'subrev 1152 version 0.80'
[MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
[MAIN ] Copyright (C) 2006 Red Hat, Inc.
[MAIN ] openais component openais_cpg loaded.
[MAIN ] Registering service handler 'openais cluster closed process group
service v1.01'
[MAIN ] openais component openais_cfg loaded.
[MAIN ] openais component openais_msg loaded.
[MAIN ] Registering service handler 'openais message service B.01.01'
[MAIN ] openais component openais_lck loaded.
[MAIN ] Registering service handler 'openais distributed locking service
B.01.01'
[MAIN ] openais component openais_evt loaded.
[MAIN ] Registering service handler 'openais event service B.01.01'
[MAIN ] openais component openais_ckpt loaded.
[MAIN ] Registering service handler 'openais checkpoint service B.01.01'
[MAIN ] openais component openais_amf loaded.
[MAIN ] Registering service handler 'openais availability management
framework B.01.01'
[MAIN ] openais component openais_clm loaded.
[MAIN ] Registering service handler 'openais cluster membership service
B.01.01'
[MAIN ] openais component openais_evs loaded.
[MAIN ] Registering service handler 'openais extended virtual synchrony
service'
[TOTEM] Token Timeout (1000 ms) retransmit timeout (238 ms)
[TOTEM] token hold (180 ms) retransmits before loss (4 retrans)
[TOTEM] join (100 ms) consensus (200 ms) merge (200 ms)
[TOTEM] downcheck (1000000 ms) fail to recv const (50 msgs)
[TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
[TOTEM] window size per rotation (50 messages) maximum messages per rotation
(17 messages)
[TOTEM] send threads (0 threads)
[TOTEM] RRP token expired timeout (238 ms)
[TOTEM] RRP token problem counter (2000 ms)
[TOTEM] RRP threshold (10 problem count)
[TOTEM] RRP mode set to none.
[TOTEM] heartbeat_failures_allowed (0)
[TOTEM] max_network_delay (50 ms)
[TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
[TOTEM] Transmit multicast socket send buffer size (262142 bytes).
[TOTEM] The network interface [192.168.5.100] is now up.
[TOTEM] Created or loaded sequence id 5508.192.168.5.100 for this ring.
[TOTEM] entering GATHER state.
[SERV ] Initialising service handler 'openais extended virtual synchrony
service'
[SERV ] Initialising service handler 'openais cluster membership service
B.01.01'
[SERV ] Initialising service handler 'openais availability management
framework B.01.01'
[SERV ] Initialising service handler 'openais checkpoint service B.01.01'
[SERV ] Initialising service handler 'openais event service B.01.01'
[SERV ] Initialising service handler 'openais distributed locking service
B.01.01'
[SERV ] Initialising service handler 'openais message service B.01.01'
[SERV ] Initialising service handler 'openais configuration service'
[SERV ] Initialising service handler 'openais cluster closed process group
service v1.01'
[SYNC ] Not using a virtual synchrony filter.
[MAIN ] AIS Executive Service: started and ready to provide service.
[TOTEM] Creating commit token because I am the rep.
[TOTEM] Saving state aru 0 high seq received 0
[TOTEM] Storing new sequence id for ring 5512
[TOTEM] entering COMMIT state.
[TOTEM] entering RECOVERY state.
[TOTEM] position [0] member 192.168.5.100:
[TOTEM] previous ring seq 5508 rep 192.168.5.100
[TOTEM] aru 0 high delivered 0 received flag 0
[TOTEM] Did not need to originate any messages in recovery.
[TOTEM] Sending initial ORF token
[CLM  ] CLM CONFIGURATION CHANGE
[CLM  ] New Configuration:
[CLM  ] Members Left:
[CLM  ] Members Joined:
[SYNC ] This node is within the primary component and will provide service.
[CLM  ] CLM CONFIGURATION CHANGE
[CLM  ] New Configuration:
[CLM  ]  r(0) ip(192.168.5.100)
[CLM  ] Members Left:
[CLM  ] Members Joined:
[CLM  ]  r(0) ip(192.168.5.100)
[SYNC ] This node is within the primary component and will provide service.
[TOTEM] entering OPERATIONAL state.
[SYNC ] Synchronization barrier completed
[SYNC ] Synchronization actions starting for (openais cluster membership
service B.01.01)
[CLM  ] got nodejoin message 192.168.5.100
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais cluster membership service
B.01.01)
[SYNC ] Synchronization actions starting for (openais availability
management framework B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais availability management
framework B.01.01)
[SYNC ] Synchronization actions starting for (openais checkpoint service
B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais checkpoint service B.01.01)
[SYNC ] Synchronization actions starting for (openais event service B.01.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais event service B.01.01)
[SYNC ] Synchronization actions starting for (openais cluster closed process
group service v1.01)
[SYNC ] Synchronization barrier completed
[SYNC ] Committing synchronization for (openais cluster closed process group
service v1.01)

 The program just stops here as if waiting for something. What is wrong?
 Then I ran "testclm" to test the cluster membership. The output is:
 Result of saClmClusterNodeGet 1
Node Information for saClmClusterNodeGet SA_CLM_LOCAL_NODE_ID result %d
 node id is c805a8c0
 node address is family=1 - address=192.168.5.200
 Node name is 192.168.5.200
 Member is 1
 Timestamp is 10109d0f3c8a8a00 nanoseconds
result is 1
result is 1
result is 1
result is 1
result is 1
track result is 1
Node Information for Results from SA_TRACK_CURRENT:
 node id is c805a8c0
 node address is family=1 - address=192.168.5.200
 Node name is 192.168.5.200
 Member is 1
 Timestamp is 10109d0f3c8a8a00 nanoseconds
select fd is 4
press the enter key to exit with track stop and finalize.
done with select
Node Information for NODEGETCALLBACK 55

 node id is c805a8c0
 node address is family=1 - address=192.168.5.200
 Node name is 192.168.5.200
 Member is 1
 Timestamp is 10109d0f3c8a8a00 nanoseconds
Node for invocation 60 not found (7)
Node for invocation 61 not found (7)
Node Information for NODEGETCALLBACK 59

 node id is c805a8c0
 node address is family=1 - address=192.168.5.200
 Node name is 192.168.5.200
 Member is 1
 Timestamp is 10109d0f3c8a8a00 nanoseconds
Node Information for NODEGETCALLBACK 57

 node id is c805a8c0
 node address is family=1 - address=192.168.5.200
 Node name is 192.168.5.200
 Member is 1
 Timestamp is 10109d0f3c8a8a00 nanoseconds
TrackStop result is 1 (should be 1)
Finalize  result is 1 (should be 1)

 The program stops here as if waiting for something, and there is no message
about node1.
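
 The node id c805a8c0 printed by testclm looks like the IPv4 address
192.168.5.200 stored as a 32-bit integer in little-endian byte order. That
reading is an inference from the output above, not something taken from the
openais documentation. A minimal sketch of the decoding, with the
hypothetical helper name node_id_to_ip:

```python
import socket
import struct


def node_id_to_ip(node_id_hex: str) -> str:
    """Interpret a hex node id as a little-endian IPv4 address.

    Assumption: the id is the address as stored on a little-endian machine.
    """
    node_id = int(node_id_hex, 16)
    # "<I" packs the value least-significant byte first.
    return socket.inet_ntoa(struct.pack("<I", node_id))


print(node_id_to_ip("c805a8c0"))  # prints 192.168.5.200
```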

 In all, I want to ask 3 questions:
 1. Did I make mistakes when I configured the openais.conf file on both nodes?
 2. In an openais cluster, which node is the master node? Is it the node that
generated the authkey?
 3. From the "aisexec" and "testclm" messages I gave above, what is wrong with
my cluster?
 Please help me. Thank you! I look forward to your answer.