[Openais] Corosync UDP ports
Colin
colin.hch at gmail.com
Mon Mar 15 02:53:55 PDT 2010
Hi All,
In a test that we started last week, we have two Pacemaker+Corosync
clusters, each with three hosts, where all six hosts are on the same
network(s). The two clusters are identically configured, with one
exception: the mcastport is 688 for one and 689 for the other.
This morning I found the clusters in a strange state: none of the
hosts could see any of the others, i.e. Pacemaker output was "as if"
Corosync wasn't running on the other nodes, although the network was
fine, as I could easily verify with ping etc.
I then noticed in the lsof output that Corosync seems to also use the
port below the configured mcastport, which leads me to my questions:
Is this normal? It doesn't seem to be documented in
http://corosync.org/doku.php?id=faq:configure_openais or in
corosync.conf(5).
Is this overlap created by the additional port a likely cause for the
cluster conking out?
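To make the suspected overlap concrete, here is a minimal Python sketch. It
assumes, based only on the lsof output further down, that each Corosync
instance uses its configured mcastport plus the port directly below it; the
function names are made up for illustration.

```python
def ports_used(mcastport):
    """UDP ports one Corosync instance is ASSUMED to use:
    the configured mcastport and the port directly below it."""
    return {mcastport, mcastport - 1}


def overlapping_ports(mcastport_a, mcastport_b):
    """Ports that two clusters on the same network would share."""
    return ports_used(mcastport_a) & ports_used(mcastport_b)


if __name__ == "__main__":
    # The two test clusters: mcastport 688 and 689.
    print(sorted(overlapping_ports(688, 689)))
    # A hypothetical alternative with a gap of two between the ports.
    print(sorted(overlapping_ports(688, 690)))
```

Under that assumption the two clusters share port 688, whereas mcastport
values at least two apart (688 and 690, say) would not share any port.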
Thanks, Colin
PS: I'm in the process of trying to revive the cluster;
/etc/init.d/corosync stop didn't work, but a few "kill -9" and "rm -f
/var/lib/heartbeat/crm/*" commands later I'm up-and-running again on
2x2 of the 2x3 nodes with the same config as previously, looking fine
so far...
root@h001:~# dpkg -l | grep corosync
ii  corosync      1.2.0-0ubuntu1  Standards-based cluster framework (daemon an
ii  libcorosync4  1.2.0-0ubuntu1  Standards-based cluster framework (libraries
root@h001:~# cat /etc/corosync/corosync.conf
totem {
    version: 2
    consensus: 1500
    vsftype: none
    clear_node_high_bit: yes
    secauth: off
    threads: 0
    rrp_mode: passive
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.50.32
        broadcast: yes
        mcastport: 688    <=== 689 for the other cluster
    }
    interface {
        ringnumber: 1
        bindnetaddr: 192.168.52.32
        broadcast: yes
        mcastport: 688    <=== 689 for the other cluster
    }
}
amf {
    mode: disabled
}
service {
    ver: 0
    name: pacemaker
}
aisexec {
    user: root
    group: root
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: on
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}
root@h001:~# lsof -n | grep corosync | grep UDP
corosync 17688 root  5u IPv4 89563 0t0 UDP 255.255.255.255:688
corosync 17688 root  6u IPv4 89564 0t0 UDP 192.168.50.40:687
corosync 17688 root  7u IPv4 89565 0t0 UDP 192.168.50.40:688
corosync 17688 root  8u IPv4 89612 0t0 UDP 255.255.255.255:688
corosync 17688 root  9u IPv4 89613 0t0 UDP 192.168.52.40:687
corosync 17688 root 10u IPv4 89614 0t0 UDP 192.168.52.40:688
root@h001:~#
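
For anyone who wants to repeat this check on their own nodes, here is a small
Python sketch that groups the UDP ports from such lsof output by address. The
sample text is the output above copied in by hand, and the function name is
just an illustration, not part of any Corosync tooling.

```python
from collections import defaultdict

# lsof output from h001 as shown above, one socket per line.
LSOF_OUTPUT = """\
corosync 17688 root 5u IPv4 89563 0t0 UDP 255.255.255.255:688
corosync 17688 root 6u IPv4 89564 0t0 UDP 192.168.50.40:687
corosync 17688 root 7u IPv4 89565 0t0 UDP 192.168.50.40:688
corosync 17688 root 8u IPv4 89612 0t0 UDP 255.255.255.255:688
corosync 17688 root 9u IPv4 89613 0t0 UDP 192.168.52.40:687
corosync 17688 root 10u IPv4 89614 0t0 UDP 192.168.52.40:688
"""


def udp_ports_by_address(lsof_text):
    """Map each address to the set of UDP ports listed for it."""
    ports = defaultdict(set)
    for line in lsof_text.splitlines():
        fields = line.split()
        # Expect "... UDP address:port" as the last two fields.
        if len(fields) < 2 or fields[-2] != "UDP":
            continue
        address, _, port = fields[-1].rpartition(":")
        ports[address].add(int(port))
    return dict(ports)


if __name__ == "__main__":
    for address, ports in udp_ports_by_address(LSOF_OUTPUT).items():
        print(address, sorted(ports))
```

On this node, both ring addresses show 687 in addition to the configured 688.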