[Openais] cman does not start ... corosync died

Christine Caulfield ccaulfie at redhat.com
Mon Mar 15 06:06:14 PDT 2010


On 15/03/10 12:44, Christian Brandes wrote:
> Hi all!
>
> At the moment I am trying to run RH-Cluster, but cman won't start.
> I got error messages from corosync.
>
> Do you have an idea what's wrong?
>
> Versions:
> Ubuntu 10.04 Alpha-3
> redhat-cluster-suite 3.0.2-2ubuntu2
> cman 3.0.2-2ubuntu2
> corosync 1.2.0-0ubuntu1
>
> /etc/init.d/cman start
> Starting cluster:
>     Global setup... [ OK ]
>     Loading kernel modules... [ OK ]
>     Mounting configfs... [ OK ]
>     Setting network parameters... [ OK ]
>     Starting cman... corosync died: Could not read cluster configuration
>
> I generated /etc/cluster/cluster.conf with system-config-cluster:
> <?xml version="1.0" ?>
> <cluster alias="ubu" config_version="4" name="ubu">
>          <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>          <clusternodes>
>                  <clusternode name="ubu1-24" nodeid="1" votes="1">
>                          <fence/>
>                  </clusternode>
>                  <clusternode name="ubu2-24" nodeid="2" votes="1">
>                          <fence/>
>                  </clusternode>
>          </clusternodes>
>          <cman expected_votes="1" two_node="1"/>
>          <fencedevices>
>                  <fencedevice agent="fence_manual" name="manual"/>
>          </fencedevices>
>          <rm>
>                  <failoverdomains/>
>                  <resources/>
>          </rm>
> </cluster>
>
> /etc/corosync/corosync.conf:
> totem {
>          version: 2
>          token: 3000
>          token_retransmits_before_loss_const: 10
>          join: 60
>          consensus: 4800
>          vsftype: none
>          max_messages: 20
>          clear_node_high_bit: yes
>          secauth: off
>          threads: 0
>          rrp_mode: none
>          interface {
>                  ringnumber: 0
>                  bindnetaddr: 192.168.24.221
>                  mcastaddr: 226.94.1.1
>                  mcastport: 5405
>          }
> }
>
> amf {
>          mode: disabled
> }
>
> service {
>          ver: 0
>          name: pacemaker
> }
>
> aisexec {
>          user: root
>          group: root
> }
>
> logging {
>          fileline: off
>          to_stderr: yes
>          to_logfile: no
>          to_syslog: yes
>          syslog_facility: daemon
>          debug: off
>          timestamp: on
>          logger_subsys {
>                  subsys: AMF
>                  debug: off
>                  tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>          }
> }
>
> /var/log/syslog:
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Unloading all Corosync service engines.
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync extended virtual synchrony service
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync configuration service
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync cluster config database access v1.01
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync profile loading service
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: openais checkpoint service B.01.01
> Mar 15 11:52:14 ubu1 corosync[3904]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
> Mar 15 11:52:14 ubu1 corosync[3904]: [MAIN ] Corosync Cluster Engine exiting with status -1 at main.c:158.
> Mar 15 11:52:14 ubu1 corosync[4043]: [MAIN ] Corosync Cluster Engine ('1.2.0'): started and ready to provide service.
> Mar 15 11:52:14 ubu1 corosync[4043]: [MAIN ] Corosync built-in features: nss
> Mar 15 11:52:14 ubu1 corosync[4043]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
> Mar 15 11:52:14 ubu1 corosync[4043]: [TOTEM ] Initializing transport (UDP/IP).
> Mar 15 11:52:14 ubu1 corosync[4043]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Mar 15 11:52:14 ubu1 corosync[4043]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
> Mar 15 11:52:14 ubu1 corosync[4043]: [TOTEM ] The network interface [192.168.24.221] is now up.
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service failed to load 'pacemaker'.
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync configuration service
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync profile loading service
> Mar 15 11:52:14 ubu1 corosync[4043]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
> Mar 15 11:52:14 ubu1 corosync[4043]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Mar 15 11:52:14 ubu1 corosync[4043]: [MAIN ] Completed service synchronization, ready to provide service.
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Corosync Cluster Engine ('1.2.0'): started and ready to provide service.
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Corosync built-in features: nss
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Successfully parsed cman config
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Successfully configured openais services to load
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] parse error in config: The consensus timeout parameter (4800 ms) must be atleast 1.2 * token (12000 ms).
> Mar 15 11:52:32 ubu1 corosync[4077]: [MAIN ] Corosync Cluster Engine exiting with status -9 at main.c:1359.
>
> I see:
> consensus (4800 ms) must be atleast 1.2 * token (12000 ms)
> But token is 3000!
>

It's a known bug in that version of corosync. You'll need to manually
increase consensus so that it is greater than 4800.
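
For picking a value, the rule quoted in the log is consensus >= 1.2 * token.
The log reports 12000 ms as that threshold, which would correspond to an
effective token of 10000 ms rather than the 3000 in corosync.conf (the failing
instance read its config from /etc/cluster/cluster.conf). A minimal sketch of
the check, assuming only that rule; the function names are hypothetical and
not part of corosync or cman:

```python
def min_consensus_ms(token_ms: int) -> int:
    """Smallest consensus (ms) that satisfies consensus >= 1.2 * token."""
    # Integer ceiling of 1.2 * token, to avoid floating-point rounding.
    return (token_ms * 12 + 9) // 10


def consensus_ok(token_ms: int, consensus_ms: int) -> bool:
    """True if the pair passes the rule quoted in the log message."""
    return consensus_ms >= min_consensus_ms(token_ms)


if __name__ == "__main__":
    # token from the posted corosync.conf, and the token implied by the log
    for token in (3000, 10000):
        print(token, min_consensus_ms(token), consensus_ok(token, 4800))
```

With token at 3000 the posted consensus of 4800 passes the rule; with an
effective token of 10000 it does not, which matches the parse error above.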

Chrissie

