[Openais] Help on mounting ocfs2 filesystems

Andreas Kurz andreas at hastexo.com
Wed Mar 21 14:44:20 UTC 2012


Hello,

... see comments inline ...

On 03/21/2012 03:24 PM, Carlos Xavier wrote:
> Thank you for the quick answer.
>>>
>>> I'm trying to make a cluster using Pacemaker on openSUSE 12.1 with
>>> DRBD + OCFS2 + MySQL on top of the filesystem.
>>> The system will have two DRBD resources to be mounted on /var/lib/mysql
>>> and on /export.
>>
>> Be sure you configured "resource-and-stonith" fencing policy for DRBD
>> and you use a correct fencing script like: goo.gl/O4N8f
>>
> 
> I have configured it this way
> 
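Good. For reference, that part of the DRBD resource definition should
then look roughly like this (just a sketch; the handler script paths may
differ on your installation):

resource export {
        disk {
                fencing resource-and-stonith;
        }
        handlers {
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
}
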
>>>        stonith-enabled="false" \
>>
>> Bad idea! You should really use stonith in such a setup ... in any
>> cluster setup.
>>
> 
> It's disabled while I'm getting the cluster to mount the filesystems.
> 
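Fair enough, but don't forget it. Re-enabling it later is just, e.g.:

crm configure property stonith-enabled="true"

plus a stonith primitive that matches your hardware.
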
>>>
>>
>> colocation col_ocfs2 inf: .......
>>
>> Use "crm configure help colocation" to find out more, the same for order.
>>
>> You could also add the two file system primitives to the cl_ocfs2_mgmt
>> group, then only the constraints between this group and DRBD are needed.
>>
>> Regards,
>> Andreas
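
To illustrate the group idea from the quoted mail: a sketch built from
the primitives in the configuration below, with made-up group and clone
names:

group grpOCFS2 resDLM resO2CB resFSexport
clone clOCFS2 grpOCFS2 \
        meta globally-unique="false" interleave="true"
colocation colOCFS2onDRBD inf: clOCFS2 msDRBD_export:Master
order ordDRBDbeforeOCFS2 0: msDRBD_export:promote clOCFS2:start

That would replace the three separate clones and collapse the
constraints into a single colocation/order pair.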
> 
> I was not able to get any of the partitions mounted, and I thought
> there was something very wrong in my configuration, so I changed it to
> resemble the configuration shown on
> http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2 and tried
> to get at least one of the partitions mounted.
> Now this is my running configuration:
> 
> node artemis
> node jupiter
> primitive ip_mysql ocf:heartbeat:IPaddr2 \
>        params ip="10.10.10.5" cidr_netmask="32" nic="vlan0" \
>        op monitor interval="30s"
> primitive resDLM ocf:pacemaker:controld \
>        op monitor interval="60" timeout="60"
> primitive resDRBD_export ocf:linbit:drbd \
>        params drbd_resource="export" \
>        operations $id="opsDRBD_export" \
>        op monitor interval="20" role="Master" timeout="20" \
>        op monitor interval="30" role="Slave" timeout="20" \
>        meta target-role="started"
> primitive resDRBD_mysql ocf:linbit:drbd \
>        params drbd_resource="mysql" \
>        operations $id="opsDRBD_mysql" \
>        op monitor interval="20" role="Master" timeout="20" \
>        op monitor interval="30" role="Slave" timeout="20" \
>        meta target-role="started"
> primitive resFSexport ocf:heartbeat:Filesystem \
>        params device="/dev/drbd/by-res/export" directory="/export" \
>        fstype="ocfs2" options="rw,noatime" \
>        op monitor interval="120s"
> primitive resO2CB ocf:ocfs2:o2cb \
>        op monitor interval="60" timeout="60"
> ms msDRBD_export resDRBD_export \
>        meta resource-stickiness="100" master-max="2" clone-max="2" \
>        notify="true" interleave="true"
> ms msDRBD_mysql resDRBD_mysql \
>        meta resource-stickiness="100" master-max="2" clone-max="2" \
>        notify="true" interleave="true"
> clone cloneDLM resDLM \
>        meta globally-unique="false" interleave="true"
> clone cloneFSexport resFSexport \
>        meta interleave="true" ordered="true"
> clone cloneO2CB resO2CB \
>        meta globally-unique="false" interleave="true"
> colocation colDLMDRBD inf: cloneDLM msDRBD_export:Master
> colocation colFSO2CB inf: cloneFSexport cloneO2CB
> colocation colO2CBDLM inf: cloneO2CB cloneDLM
> order ordDLMO2CB 0: cloneDLM cloneO2CB
> order ordDRBDDLM 0: msDRBD_export:promote cloneDLM

... should be cloneDLM:start
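
That is, the constraint should read:

order ordDRBDDLM 0: msDRBD_export:promote cloneDLM:start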

> order ordO2CBFS 0: cloneO2CB cloneFSexport
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore" \
>        last-lrm-refresh="1332330463" \
>        default-resource-stickiness="1000" \
>        maintenance-mode="false"
> 
> I committed the configuration to see if I would end up with /export
> mounted, but no luck there either.
> Then I stopped Pacemaker on both hosts and started it just on
> jupiter. The filesystem did not get mounted, and taking a look at
> /var/log/messages I could see these entries:
> 
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: unpack_rsc_op:
> Processing failed op resFSexport:0_last_failure_0 on jupiter: unknown
> error (1)
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness:
> Forcing cloneFSexport away from jupiter after 1000000 failures
> (max=1000000)
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness:
> Forcing cloneFSexport away from jupiter after 1000000 failures
> (max=1000000)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: rsc_expand_action:
> Couldn't expand cloneDLM_demote_0
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort:
> clone_update_actions_interleave: Triggered assert at clone.c:1200 :
> first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR:
> clone_update_actions_interleave: No action found for demote in resDLM:0
> (first)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort:
> clone_update_actions_interleave: Triggered assert at clone.c:1200 :
> first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR:
> clone_update_actions_interleave: No action found for demote in resDLM:0
> (first)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> ip_mysql#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDRBD_mysql:0#011(Master jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDRBD_mysql:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDRBD_export:0#011(Master jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDRBD_export:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDLM:0#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resDLM:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resO2CB:0#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resO2CB:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resFSexport:0#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave  
> resFSexport:1#011(Stopped)
> Mar 21 10:11:35 jupiter crmd: [28283]: info: do_state_transition: State
> transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=handle_response ]
> 
> 
> But looking back at the log gave no clues. Then I started Pacemaker on
> the second host and took a look at the log, where I found this:
> 
> Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 start[26]
> (pid 3315)
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation monitor[25] on
> resO2CB:1 for client 2432: pid 3314 exited with return code 0
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM
> operation resO2CB:1_monitor_60000 (call=25, rc=0, cib-update=26,
> confirmed=false) ok
> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3362]: INFO:
> Running start for /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output:
> (resFSexport:0:start:stderr) FATAL: Module scsi_hostadapter not found.
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output:
> (resFSexport:0:start:stderr) mount.ocfs2: Cluster stack specified does
> not match the one currently running while trying to join the group

You created the ocfs2 file system without Pacemaker running? You need
to run: tunefs.ocfs2 --update-cluster-stack <device>
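
For example, for the export volume from your logs, run on one node with
the file system unmounted on all nodes and the Pacemaker stack running:

tunefs.ocfs2 --update-cluster-stack /dev/drbd/by-res/export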

> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3382]: ERROR:
> Couldn't mount filesystem /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation start[26] on
> resFSexport:0 for client 2432: pid 3315 exited with return code 1
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM
> operation resFSexport:0_start_0 (call=26, rc=1, cib-update=27,
> confirmed=true) unknown error
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch:
> Update relayed from jupiter
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update:
> Sending flush op to all hosts for: fail-count-resFSexport:0 (INFINITY)
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update:
> Sent update 13: fail-count-resFSexport:0=INFINITY
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch:
> Update relayed from jupiter
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update:
> Sending flush op to all hosts for: last-failure-resFSexport:0 (1332336493)
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update:
> Sent update 16: last-failure-resFSexport:0=1332336493
> Mar 21 10:28:13 artemis crmd: [2432]: info: do_lrm_rsc_op: Performing
> key=8:10:0:0c5a17ef-3075-47e7-a0c0-a564ec772af8 op=resFSexport:0_stop_0 )
> Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 stop[27]
> (pid 3389)
> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3389]: [3423]: INFO:
> Running stop for /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation stop[27] on
> resFSexport:0 for client 2432: pid 3389 exited with return code 0
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM
> operation resFSexport:0_stop_0 (call=27, rc=0, cib-update=28,
> confirmed=true) ok
> 
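Side note: the fail counts above already forced cloneFSexport away from
jupiter, so after fixing the cluster stack remember to clear them, e.g.
with:

crm resource cleanup cloneFSexport
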
> 
> The weird thing is this line:
> 
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output:
> (resFSexport:0:start:stderr) FATAL: Module scsi_hostadapter not found

A leftover from older days; it's already gone in the latest resource
agents ... but not a problem for you here.

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> Why is Pacemaker looking for a SCSI device when it is configured to
> use DRBD?
> 
> Please, can someone shed some light on this?
> 
> Regards,
> Carlos
> 
> 
> 
> ----- Original Message ----- From: "Andreas Kurz" <andreas at hastexo.com>
> To: <openais at lists.linux-foundation.org>
> Sent: Wednesday, March 21, 2012 7:49 AM
> Subject: Re: [Openais] Help on mounting ocfs2 filesystems
> 
> 
