
SARK Pacemaker Cluster

Offline SARK devs

SARK Pacemaker Cluster
« on: February 18, 2014, 12:05:10 PM »
Hi all,

More recently we turned our attention to rewriting SARK/SAIL's high availability component. I'm pleased to announce that it is now ready and has been running in production at a couple of customer sites since early January. While it is still early days, the initial results are encouraging.

However, the news is somewhat bitter-sweet for SME lovers. V4HA uses Pacemaker, Corosync and DRBD to run a true clustered PBX. It won't, and probably never will, run on SME because of the fairly serious challenges involved in getting DRBD to play nicely with SME's automatic RAID allocation. I know there have been a few attempts in the past to get clusters running on SME but sadly, all of them appear to have fallen by the wayside. Our own V3 heartbeat cluster uses rsync to handle the data management, but that is nowadays generally regarded as the old way to do it; the maturity and sophistication of tools like DRBD make it the obvious choice going forward. As a result, the new V4HA clusters run on Debian. If anyone out there has the knowledge or ideas to make DRBD run on SME then we'd love to hear from you.

If you'd like to try V4HA, or just learn more about it, full details are on the wiki pages here:

http://www.sailpbx.com/mediawiki/index.php/V4_High_Availability

The ASHA cluster builder utility is available as a .deb, or you can clone it from GitHub here:

https://github.com/aelintra/asha.git
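
For anyone who just wants to grab the source, cloning is enough (assuming git is installed):

Code:
git clone https://github.com/aelintra/asha.git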

Kind Regards
S     
« Last Edit: February 18, 2014, 12:21:51 PM by SARK devs »

Offline fpausp

Re: SARK Pacemaker Cluster
« Reply #1 on: February 20, 2014, 08:51:24 PM »
Viribus unitis

Offline SARK devs

Re: SARK Pacemaker Cluster
« Reply #2 on: February 21, 2014, 11:58:24 AM »
All good.

How did you create the DRBD partition or LV?

S
« Last Edit: February 21, 2014, 12:00:22 PM by SARK devs »

Offline fpausp

Re: SARK Pacemaker Cluster
« Reply #3 on: February 22, 2014, 12:46:47 PM »
It's been a while. I used an additional hdd for drbd, on two virtual servers under Proxmox...

I haven't finished it, but drbd was working. Please take a look at:

Code:

### sme8 ha-drbd howto v0.2 ###



# Preparation: two SME8 servers, each with an additional hdd for drbd:
node1 - sme8test1 - 10.10.10.109
node2 - sme8test2 - 10.10.10.110



### Installation ###


# Install packages (on each node)
yum -y install drbd82 kmod-drbd82 --enablerepo=extras; \
yum -y install heartbeat --enablerepo=extras


# Generate public-key (on each node)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa


# copy the key to the other side on node1 (sme8test1)
ssh-copy-id -i ~/.ssh/id_rsa.pub "root@10.10.10.110 -p 11022"


# copy the key to the other side on node2 (sme8test2)
ssh-copy-id -i ~/.ssh/id_rsa.pub "root@10.10.10.109 -p 11022"
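
# (note, not part of the original howto: quoting "user@host -p port" as above is a
# workaround for older versions of ssh-copy-id that have no -p option; newer
# versions accept "ssh-copy-id -p 11022 root@host" directly)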


# Test connection on node1 (sme8test1)
ssh -p 11022 root@10.10.10.110


# Test connection on node2 (sme8test2)
ssh -p 11022 root@10.10.10.109


# create template on node1 (sme8test1)
mkdir -p /etc/e-smith/templates-custom/etc/hosts
nano /etc/e-smith/templates-custom/etc/hosts/30hosts

----------------------- content of 30hosts -----------------------
10.10.10.110   sme8test2.mydomain.at sme8test2
----------------------- content of 30hosts -----------------------


# create template on node2 (sme8test2)
mkdir -p /etc/e-smith/templates-custom/etc/hosts
nano /etc/e-smith/templates-custom/etc/hosts/30hosts

----------------------- content of 30hosts -----------------------
10.10.10.109   sme8test1.mydomain.at sme8test1
----------------------- content of 30hosts -----------------------


# expand template (on each node)
/sbin/e-smith/expand-template /etc/hosts



### DRBD Configuration ###


# configure drbd (on each node)

nano /etc/drbd.conf

----------------------- content of drbd.conf -----------------------
resource r0 {
protocol      C;

on sme8test1 {
device /dev/drbd1;
disk /dev/vdb1;
address 10.10.10.109:7789;
meta-disk internal;
}
on sme8test2 {
device /dev/drbd1;
disk /dev/vdb1;
address 10.10.10.110:7789;
meta-disk internal;
}
}
----------------------- content of drbd.conf -----------------------
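
# (note: protocol C means fully synchronous replication - a write completes only
# once it has reached the disk on both nodes, which is what you want here)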



# create a primary partition on the additional hdd (on each node)
fdisk /dev/vdb
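
# (assumption, not part of the original howto: the extra disk is empty and gets one
# primary partition covering the whole disk; inside fdisk the keystrokes are
# n, p, 1, <enter>, <enter>, w - or, non-interactively:)
echo -e "n\np\n1\n\n\nw" | fdisk /dev/vdb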


# create resource (on each node)
drbdadm create-md r0


# start drbd service (on each node)
/etc/init.d/drbd start


# choose a server as Primary - node1 (sme8test1) - and start the initial sync
drbdadm -- --overwrite-data-of-peer primary r0


# temporarily speed up the sync (680M is fine for a virtual lan)
drbdsetup /dev/drbd1 syncer -r 680M


# watch replication (on each node)
cat /proc/drbd
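
# (note, not part of the original howto: the initial sync is finished once both
# nodes report ds:UpToDate/UpToDate; "watch" gives a live view)
watch -n1 cat /proc/drbd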


# create mountpoint (on each node)
mkdir -p /media/drbd


# we can now format /dev/drbd1 and mount it on node1 (sme8test1)
mke2fs -j /dev/drbd1 ; mount /dev/drbd1 /media/drbd


# create a testfile on node1 (sme8test1)
touch /media/drbd/testfile-1.txt


# unmount drbd on node1 (sme8test1) and make it secondary
umount /media/drbd
drbdadm secondary r0


# make sme8test2 primary and mount drbd on node2 (sme8test2)
drbdadm primary r0
mount /dev/drbd1 /media/drbd


# create a testfile on node2 (sme8test2)
touch /media/drbd/testfile-2.txt


# unmount drbd and make sme8test2 secondary again on node2 (sme8test2)
umount /media/drbd
drbdadm secondary r0


# finally make sme8test1 primary again and mount drbd on node1 (sme8test1)
drbdadm primary r0
mount /dev/drbd1 /media/drbd


# check on node1 (sme8test1) that both testfiles are there
ls -alih /media/drbd


# make sure that drbd starts on boot (on each node)
ln -s /etc/rc.d/init.d/drbd /etc/rc7.d/K36drbd ; \
ln -s /etc/rc.d/init.d/drbd /etc/rc7.d/S98drbd
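
# (note: SME Server boots to runlevel 7 rather than the usual 3 or 5, hence the
# rc7.d symlinks above)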


# reboot and check that the services start on boot
reboot


### Heartbeat V2 Configuration ###


# Let's configure a simple /etc/ha.d/ha.cf file on node1 (sme8test1)

nano /etc/ha.d/ha.cf

----------------------- content of ha.cf -----------------------
keepalive 2
deadtime 30
warntime 10
initdead 120
bcast   eth0
node sme8test1
node sme8test2
crm yes
----------------------- content of ha.cf -----------------------


# Also create /etc/ha.d/authkeys on node1 (sme8test1)

nano /etc/ha.d/authkeys

----------------------- content of authkeys -----------------------
auth 1
1 sha1 MySecret
----------------------- content of authkeys -----------------------


# set permissions to 600 on node1 (sme8test1)
chmod 600 /etc/ha.d/authkeys


# start heartbeat service on node1 (sme8test1)
/etc/init.d/heartbeat start


# Now copy ha.cf and authkeys to node2 (sme8test2)
scp -P 11022 /etc/ha.d/ha.cf /etc/ha.d/authkeys root@sme8test2:/etc/ha.d/


# start heartbeat service on node2 (sme8test2)
/etc/init.d/heartbeat start


# Verify cluster with crm_mon on node2 (sme8test2)
crm_mon


# create the resource file on node1 (sme8test1)
nano /var/lib/heartbeat/myCluster.xml

---------------------------- content of myCluster.xml -----------------------------
<primitive id="IP-Addr" class="ocf" type="IPaddr2" provider="heartbeat">
           <instance_attributes id="IP-Addr_instance_attrs">
             <attributes>
               <nvpair id="IP-Addr_target_role" name="target_role" value="started"/>
               <nvpair id="2e967596-73fe-444e-82ea-18f61f3848d7" name="ip" value="10.10.10.222"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <instance_attributes id="My-DRBD-group_instance_attrs">
           <attributes>
             <nvpair id="My-DRBD-group_target_role" name="target_role" value="started"/>
           </attributes>
         </instance_attributes>
         <primitive id="DRBD_data" class="heartbeat" type="drbddisk" provider="heartbeat">
           <instance_attributes id="DRBD_data_instance_attrs">
             <attributes>
               <nvpair id="DRBD_data_target_role" name="target_role" value="started"/>
               <nvpair id="93d753a8-e69a-4ea5-a73d-ab0d0367f001" name="1" value="/media/drbd"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive id="FS_repdata" class="ocf" type="Filesystem" provider="heartbeat">
           <instance_attributes id="FS_repdata_instance_attrs">
             <attributes>
               <nvpair id="FS_repdata_target_role" name="target_role" value="started"/>
               <nvpair id="96d659dd-0881-46df-86af-d2ec3854a73f" name="fstype" value="ext3"/>
               <nvpair id="8a150609-e5cb-4a75-99af-059ddbfbc635" name="device" value="/dev/drbd1"/>
               <nvpair id="de9706e8-7dfb-4505-b623-5f316b1920a3" name="directory" value="/media/drbd"/>
             </attributes>
           </instance_attributes>
         </primitive>
---------------------------- content of myCluster.xml -----------------------------


# add resource to the cluster on node1 (sme8test1)
cibadmin -C -o resources -x /var/lib/heartbeat/myCluster.xml
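
# (optional, not part of the original howto: check that the new resources appear
# and start)
crm_mon -1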


# make sure that heartbeat starts on boot (on each node)
ln -s /etc/rc.d/init.d/heartbeat /etc/rc7.d/K35heartbeat ; \
ln -s /etc/rc.d/init.d/heartbeat /etc/rc7.d/S99heartbeat



### Test the HA-DRBD Cluster ###


???????????????????????????????????????????????????????????????
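
# (not part of the original howto - one possible failover test, assuming the
# resources defined above)

# on node1 (sme8test1): stop heartbeat so its resources fail over
/etc/init.d/heartbeat stop

# on node2 (sme8test2): watch it take over, then check the floating IP and the mount
crm_mon -1
ip addr show eth0 | grep 10.10.10.222
mount | grep drbd1

# when done, restart heartbeat on node1 (sme8test1); resources fail back or stay
# where they are depending on the CIB's resource stickiness
/etc/init.d/heartbeat start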




Best

fpausp
Viribus unitis

Offline SARK devs

Re: SARK Pacemaker Cluster
« Reply #4 on: February 22, 2014, 03:31:57 PM »
Hi fpausp

I assumed that you had used an extra drive to get it to work. We have done similar experiments with SME VM images. Sadly, I think the extra drive requirement may well put a lot of potential users off, especially when it isn't necessary with any of the other mainstream distros.

Having said that, all of our Corosync/DRBD/Pacemaker CIB configs are up on GitHub and we'd be happy to offer what help we can if anyone wanted to hack them over to SME.

Best
S
« Last Edit: February 22, 2014, 03:33:59 PM by SARK devs »