Flip1405 – The Failover Asterisk Solution

Posted: July 12th, 2008 | Author: admin | Filed under: Asterisk VoIP, Linux, Mac, Tech

20090327 – Update

Many people email me after using and optimizing my scripts. I don’t claim to be a ‘good’ programmer at all, but my bash scripts work. Functional programmer/admin = me.

Anyway, that shows you how great the open-source community is: without my even asking, my scripts are constantly improved.

The new version consists of only one file, which is set to run as master or slave by changing a variable.
The rsync commands are also included in the file.

Make sure to set bindaddr to the virtual IP address in /etc/asterisk/sip.conf or /etc/asterisk/sip_general_custom.conf, e.g.:
bindaddr=192.168.1.2 ;IP of Virtual interface
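To double-check that setting, something like the sketch below could be used. The function name sip_bindaddr is made up for illustration, the FLOAT value simply repeats the example address above, and the parsing assumes the usual "key=value ;comment" layout of the Asterisk config files.

```shell
#!/bin/sh
# Print the bindaddr value(s) set in an Asterisk SIP config file.
# Usage: sip_bindaddr /etc/asterisk/sip.conf
sip_bindaddr() {
    # Strip ';' comments, then match "bindaddr = value" and print the trimmed value.
    sed 's/;.*$//' "$1" | awk -F= '
        $1 ~ /^[ \t]*bindaddr[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
        }'
}

# Floating IP to compare against (example value; use your own).
FLOAT="192.168.1.2"
for conf in /etc/asterisk/sip.conf /etc/asterisk/sip_general_custom.conf; do
    [ -r "$conf" ] || continue
    if [ "$(sip_bindaddr "$conf")" = "$FLOAT" ]; then
        echo "$conf: bindaddr matches $FLOAT"
    fi
done
```

If neither file reports a match, Asterisk is probably still bound to another address and some handsets may not register against the virtual IP.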

The same file is now used on both servers. Here is the updated version of Flip1405:

#!/bin/sh
set -x
# FLIP1405 - Failover Server Solution
# RUN ON BOTH ASTERISK SERVERS (set MASTER below to pick the role)
# ORIGINAL AUTHOR GREGG HANSEN 20080208 <hansen.gregg@gmail.com>
# MODIFIED BY GREGORY BOEHNLEIN 20090201 <damin@nacs.net>
# DEPENDENCIES: nmap, arping

# Version 1.0 - 2009-03-25
# - Consolidated Master/Slave scripts into a single script
# - Converted hardcoded interface / IP configuration to variable based
# - Forced Asterisk to issue a "reload" to bind to the floating IP
# - Consolidated external rsync-replicate script to be self-contained

# If set to "1" this is the Master server.
# If commented out, or set to anything else, this will act as if it is a slave
MASTER="1"

# The Master and Slave IP Addresses for Replication / Testing
MASTERIP="192.168.1.5"
SLAVEIP="192.168.1.6"

# The IP address that will float between Master and Slave
FLOAT="192.168.1.4"

# The device on which the floating interface exists
DEVICE="eth0"

# The specific interface alias on the device
IFACE="$DEVICE:1"

# Main Code
# Note: tests use '=' rather than '==' so they also work under a strict POSIX /bin/sh
if [ "$MASTER" = "1" ] ; then
    # If local Asterisk is up, STATUS = 'open|filtered'
    STATUS=$(nmap --system-dns -p 4569 -sU 127.0.0.1 | awk '{print $2}' | grep open)
    # If this server owns the virtual IP, PRIMARYIP = "$FLOAT"
    PRIMARYIP=$(ifconfig "$IFACE" | grep "$FLOAT" | awk '{print $2}' | sed 's/addr://g')
    # If the virtual IP is not pingable, VIRTUALIP = 'down.'
    VIRTUALIP=$(nmap --system-dns -sP "$FLOAT" | grep down | awk '{print $4}')

    # Configuration File Replication from Master to Slave
    rsync -avzr --rsh=ssh /etc/asterisk "root@$SLAVEIP:/etc/"
    rsync -avzr --rsh=ssh /var/spool/asterisk/voicemail "root@$SLAVEIP:/var/spool/asterisk"
    rsync -avzr --rsh=ssh /var/lib/asterisk/moh "root@$SLAVEIP:/var/lib/asterisk"
    rsync -avzr --rsh=ssh /var/lib/asterisk/sounds "root@$SLAVEIP:/var/lib/asterisk"
    #rsync -avzr --rsh=ssh /usr/src root@$SLAVEIP:/usr/src

    #/root/rsync_replicate > /dev/null 2> /dev/null
    if [ "$STATUS" = "open|filtered" ] ; then          ### is primary asterisk up?
        if [ "$PRIMARYIP" != "$FLOAT" ] ; then         ### does primary not own virtual ip?
            if [ "$VIRTUALIP" = "down." ] ; then       ### is the virtual IP not pingable?
                ifconfig "$IFACE" "$FLOAT/24" up
                arping -U -c 5 -I "$DEVICE" "$FLOAT"   ### Gratuitous ARP request
                service asterisk reload
            fi
        fi
    else
        service asterisk start
        ifconfig "$IFACE" down
    fi
else # We must be running as Slave node
    ### If Primary Asterisk is up, PRISTATUS = 'open|filtered'
    PRISTATUS=$(nmap --system-dns -p 4569 -sU "$MASTERIP" | awk '{print $2}' | grep open)
    ### If Secondary Asterisk is up, SECSTATUS = 'open|filtered'
    SECSTATUS=$(nmap --system-dns -p 4569 -sU 127.0.0.1 | awk '{print $2}' | grep open)
    ### If this server owns the virtual IP, PRIMARYIP = "$FLOAT"
    PRIMARYIP=$(ifconfig "$IFACE" | grep "$FLOAT" | awk '{print $2}' | sed 's/addr://g')
    ### If the virtual IP is not pingable, VIRTUALIP = 'down.'
    VIRTUALIP=$(nmap --system-dns -sP "$FLOAT" | grep down | awk '{print $4}')

    if [ "$PRISTATUS" != "open|filtered" ] ; then          ### is primary asterisk down?
        if [ "$SECSTATUS" = "open|filtered" ] ; then       ### is secondary asterisk up?
            if [ "$PRIMARYIP" != "$FLOAT" ] ; then         ### does secondary not own virtual ip?
                if [ "$VIRTUALIP" = "down." ] ; then       ### is the virtual IP not pingable?
                    ifconfig "$IFACE" "$FLOAT/24" up
                    arping -U -c 5 -I "$DEVICE" "$FLOAT"   ### Gratuitous ARP request
                    service asterisk reload
                fi
            fi
        else
            service asterisk start
        fi
    else
        if [ "$SECSTATUS" = "open|filtered" ] ; then ### primary is up, is secondary up? (there can be only one!)
            service asterisk stop
        else
            echo
            if [ "$PRIMARYIP" = "$FLOAT" ] ; then
                # If the Primary is up but we still own the Virtual IP, shut it down
                ifconfig "$IFACE" down
            fi
        fi
    fi
fi
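
The script treats Asterisk as up only when nmap's STATE column reads exactly open|filtered. As far as I know, nmap reports a UDP port as plain "open" when it actually gets a reply, so a slightly more forgiving check might look like the sketch below. The helper names port_state and is_listening are made up for illustration and are not part of Flip1405; the sample output is canned, not taken from a live scan.

```shell
#!/bin/sh
# Extract the STATE column for one port from nmap's normal output (read on stdin).
# Usage: nmap -sU -p 4569 HOST | port_state 4569/udp
port_state() {
    awk -v port="$1" '$1 == port { print $2 }'
}

# Treat both "open" (a reply came back) and "open|filtered" (no reply, but no
# ICMP port-unreachable either) as "something is listening".
is_listening() {
    case "$1" in
        open|"open|filtered") return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with canned nmap output instead of a live scan.
sample='PORT     STATE         SERVICE
4569/udp open|filtered iax2'
state=$(printf '%s\n' "$sample" | port_state 4569/udp)
if is_listening "$state"; then
    echo "port 4569/udp looks up ($state)"
fi
```

In the script itself, a test such as [ "$STATUS" = "open|filtered" ] could then become is_listening "$STATUS", assuming you want both states to count as up.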

<Original release description>

A while back I wrote a bash script to manage a virtual IP between two Asterisk VoIP servers. I finally decided to release it to the public after extensive testing in production environments. Most smaller companies can cope with their phone system failing over to their cell phones for an hour or so in the event of a hardware failure on the Asterisk server. However, business/enterprise-level operations lose thousands or tens of thousands of dollars for every hour the phone system is down. With Flip1405 implemented, the hot-failover server picks up the virtual IP and there is less than 30 seconds of downtime.

The only dependencies for Flip1405 are nmap and arping
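
A quick way to confirm both tools are installed is sketched below. The function name missing_deps is made up for illustration. Note that the updated single-file version above also calls rsync over ssh, so you may want to add those to the list.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_deps nmap arping)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Install before running Flip1405:" $missing
fi
```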

1. Download the files here: flip1405_primary, flip1405_secondary (for your wgetting pleasure) :)

2. Create a cron.every1 folder on each server, then copy flip1405_primary and flip1405_secondary to their respective servers.

3. Make sure you edit the files with your IP addressing scheme. I commented what you need to find and replace.

4. Set up shared SSH keys so the servers can copy files between each other without user intervention (see “Setup Login without password”).

5. Schedule “rsync -avz /etc/asterisk/* root@secondaryIP:/etc/asterisk” as another cron job,
to replicate the configs from the primary to the secondary every 30 minutes or so.
You can also use the same command (with a different directory) to replicate all other files (sounds, voicemail, etc.).
Here is an example script I use to replicate all configs every hour: rsync_replicate
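
For step 2, the setup could look roughly like the sketch below. The function name setup_every1 and its optional root argument are made up for illustration. The crontab line assumes a system-wide /etc/crontab with a user column and a run-parts command, as found on Red Hat style systems; check how your own distribution handles this before relying on it.

```shell
#!/bin/sh
# Create a cron.every1 directory and register it in the system crontab so that
# everything inside it runs once a minute.
# The optional argument is a root prefix: leave it empty for the live system,
# or point it at a scratch directory to try the function out safely.
setup_every1() {
    root="${1:-}"
    dir="$root/etc/cron.every1"
    tab="$root/etc/crontab"
    entry="* * * * * root run-parts /etc/cron.every1"

    mkdir -p "$dir" || return 1
    touch "$tab" || return 1
    # Append the entry only if it is not already there, so re-running is harmless.
    grep -qxF "$entry" "$tab" || echo "$entry" >> "$tab"
}

# Example (as root):
#   setup_every1
#   install -m 755 flip1405_primary /etc/cron.every1/flip1405_primary
```

Some run-parts implementations skip file names containing dots, so keeping the script name free of an extension is the safer choice.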

This way you can reboot your primary box and the secondary will take over for the time being. As soon as the primary is back up, the secondary will resume slave/standby mode.
Simple and functional. Please leave a comment with any questions and I will respond ASAP.
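
To see which box currently holds the virtual IP during such a reboot test, a small check like the sketch below may help. The function name has_addr is made up for illustration, and the sample text is canned old-style ifconfig output with placeholder values; the parsing is meant to cope with both ifconfig and "ip addr" style output, but verify it against what your system prints.

```shell
#!/bin/sh
# Succeeds if the interface description on stdin lists the given IPv4 address.
# Usage: ifconfig eth0:1 | has_addr 192.168.1.4
has_addr() {
    # Compare whole tokens only, after dropping an "addr:" prefix and a
    # "/prefix-length" suffix, so 192.168.1.4 does not match 192.168.1.40.
    awk -v ip="$1" '
        {
            for (i = 1; i <= NF; i++) {
                tok = $i
                sub(/^addr:/, "", tok)
                sub(/\/[0-9]+$/, "", tok)
                if (tok == ip) found = 1
            }
        }
        END { exit !found }'
}

# Demo with canned output (placeholder addresses).
sample='eth0:1    Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0'
if printf '%s\n' "$sample" | has_addr 192.168.1.4; then
    echo "this node currently holds the floating IP"
else
    echo "this node is in standby"
fi
```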


44 Comments on “Flip1405 – The Failover Asterisk Solution”

  1. 1 balaji said at 12:49 am on July 14th, 2008:

    Hi

    can you explain where this script going to be installed.

    so i need 3 servers for this to run

    iam using user to register with Openser

    Openser intern send to PSTN using asteriskserver

    this where the problem come in if one of the Asterisk server go down, iam not able to send the traffic to another server, i can use failover, but if i have less number of loops in openser, it serve better

    so how does the network diagram looks yours

    user—openser—asterisk ( existing)

    with your script

    users—openser—-scriptserver——Asterisk1
    |———–asterisk2

    is this correct

    balaji

  2. 2 admin said at 6:15 am on July 14th, 2008:

    The script actually runs on the Asterisk server.

    You would schedule the primary script to run on your Asterisk (PSTN) gateway and set up a secondary Asterisk (PSTN) gateway with the secondary script. I’ve found that cron.every1 (every minute) works best.
    This way the script manages the virtual IP between both of the Asterisk (PSTN) gateways and also updates the ARP tables via a gratuitous ARP request.

    Your openser would always send traffic to the same IP and any failover would be seamless within 25 seconds. You could always set the script to run every 15 seconds; there isn’t much overhead at all (just a simple query to see if the other * server is responding on UDP 5060).

    user-openser—asterisk (primary script-live) PSTN
    \—asterisk (secondary script-standby)—/

    Let me know if you need any help setting it up. I will upload some scripts to replicate configs via rsync. This will make the install/config process much easier.

  3. 3 Ian said at 6:17 am on July 16th, 2008:

    Hi Does this see if Asterisk is up or just that the virtual ip is up ? if its the latter then why not use heartbeat ?

  4. 4 admin said at 7:34 am on July 16th, 2008:

    I’ve found that heartbeat is a pain to set up because of all of its dependencies and its configuration.

    My script actually manages the virtual IP based on Asterisk responding to requests. It queries UDP 5060 to see if the main server is running.

    I’ve also found too often that, when using heartbeat, the service will hang but the heartbeat will carry on.

    No matter what the situation, Flip1405 has yet to cause a system outage for more than 30 seconds in the event of a hardware failure for my client companies.

    Thank you for your interest and please let me know if you have any other questions.

  5. 5 Josh said at 3:21 pm on August 4th, 2008:

    I would really like some help going thru this set up. Send me an e-mail if you have some time.

  6. 6 Asterisk High Availability Solutions | Neil Hook said at 9:55 am on August 14th, 2008:

    [...] Flip1405 Manages virtual IP between two Asterisk servers and queries UDP5060 for state changes [...]

  7. 7 Awoof said at 12:25 am on August 16th, 2008:

    will give it a shot…Good job

  8. 8 Lawrence said at 2:37 pm on August 22nd, 2008:

    is there a way to replicate the mysql database for CDR records in case the primary server dies from hardware failure and then we could restore the data from the secondary box?

  9. 9 admin said at 5:53 am on August 23rd, 2008:

    Lawrence,

    You can use the MySQL Master-Slave replication configuration explained here.

    http://members.cox.net/midian/howto/mysqlReplication.htm

    Thank you.

  10. 10 Lawrence said at 7:43 am on August 24th, 2008:

    im having a brain issue here on the pri box can the IP be a live ip ie.. 208.x.x.x as static ip and the sec box have this same ip as its virtual? or do i need to set up separate live ip on both boxes and a 3rd ip as the virtual?

    i have a small call inbound call center for a taxi company 8 pots lines and 2 Sip providers 1 sip provider uses authentication for traffic, ( no need for a static ip) the other needs a static IP for sip traffic (authenticates by IP)

    i then have a second nic on 192.168.x.x for the polycom phones to authenticate and pass their traffic (polycoms dont NAT very well is the reason for this weirdness)

    my thinking is that i need to setup the live side for the failover, or do i need to set up both sides live and NAT to failover? or could i set up a virtual IP on both nics on the nat side example eth1:0 192.168.20.100 PRI box and eth1:0 192.168.20.100 on SEC box? since asterisk is not running on the SEC box it shouldnt process any sip requests?

    Also, I am using FreePBX as the front end and had to change the line “service asterisk start” to “amportal start”. With “service asterisk start”, because of the way FreePBX starts Asterisk, Asterisk is thrown into a start/fail loop.

    this is a great piece of work, the * community has needed something simple for the smaller installs, i to had given up on heartbeat with the dependacy hell and huge learning curve.

    thanks again

  11. 11 T said at 8:57 pm on August 26th, 2008:

    Have you seen heartbeat and the Linux-HA project? How is yours different from say heartbeat+DRBD?

  12. 12 Chris said at 3:13 pm on August 27th, 2008:

    This really is cool – I’ve played (tried) various methods like keepalived, and various other heartbeat methods but they all seem a bit too complicated for my limited experience. Does this work OK with two Asterisk boxes in different locations – or do they need to be on the same subnet? I have a couple of Servers, one in a managed Data Centre and one at home…

    Thanks,

  13. 13 admin said at 5:38 pm on August 27th, 2008:

    The Asterisk servers have to be on the same subnet because they share the (1) virtual IP. The real IP addresses can be whatever you want them to be.

    Let me know if you have any questions and I will be glad to help.

    Gregg

  14. 14 VanMan said at 4:04 pm on October 1st, 2008:

    I’m newbie to Linux and Asterisk and went through the process as described above but logically having some issues.
    Q: What IP address is used for Phone registration? Primary or Virtual?
    Q: When I take down the Primary Machine secondary machine should be the registrar for the phones and using Virtual IP for that make sense but in flip1405_primary it takes it down if the primary ip is responding.

    I may not be asking the right question please excuse me for that.

    VanMan

  15. 15 Jiltepolley said at 8:02 pm on October 5th, 2008:

    How i may contact admin this site? I have a question.
    iijiivei

  16. 16 Grant said at 9:47 pm on October 10th, 2008:

    So what happens to users voicemail and such?

    Say server 1 fails, and is down for two days…server 2 takes over for those two days.

    But when server 1 comes back up, any CDR and voicemail that has been saved on server 2 gets deleted? Or will the syncing work both ways?

  17. 17 jean-louis said at 7:09 am on October 16th, 2008:

    Hi,
    Thanks for your script, I just tested it.
    some time ago I had tought to use UCARP as failover solution for my asterisk servers , UCARP was find but was not suitable with all “clients “.
    IAX softphones could not register on asterisk when using a virtual IP ,
    SIP clients depends
    SCCP clients no problems

    I have the same result with your script,
    did you succeed registering IAX clients on a virtual IP of an asterisk server ?
    thanks,
    jl

  18. 18 VC said at 7:43 pm on October 17th, 2008:

    When Server1 fails, then all of the calls drop, right? But then Server2 comes online so that the next call 30s later is successfully connected. Correct?

    I’m not complaining. I just want to get some clarification.

    Thanks

  19. 19 Stewart said at 6:27 pm on October 19th, 2008:

    Hi

    I’m just planning a large Trixbox install that requires failover and came across Flip1405. Can this solution be applied to a Trixbox installation?

  20. 20 Paul said at 8:42 pm on November 5th, 2008:

    is it possible to failover but have manually interventention required to fail back? this will allow time to investigate the issue and control when the fail back outage will occur like outside of busy hour, plus it means less chance of a flip-flop issue arising.

    Thanks

  21. 21 Paul said at 9:38 pm on November 16th, 2008:

    I have implemented your script and added some logging and email alerting. I have come across an issue which I am wondering if you have experienced.
    I have both my Asterisk boxes on a privately addressed local LAN, with access to the internet through my Linux iptables firewall. All services run fine using the VIP for the port forwarding of SIP/RTP/RTCP traffic. The issue is that on failover the firewall does not seem to treat the GARP as a real-time change for the VIP port forwarding; instead it keeps the old MAC address and sends incoming responses to registration attempts to the Asterisk box that has failed. Have you seen this behaviour, and do you have any suggestions?

    Cheers

    Paul

  22. 22 Syed said at 8:24 am on December 16th, 2008:

    This is good job. Anyway my question is, I’ve about 50 Cisco 7960 phones registered to my Asterisk server, can I use this program? like in Phone’s setting I use virtual IP address as SIP proxy, so if primary Asterisk box goes down, will Cisco phones registered to 2ndary Asterisk box without restarting them?

    This is important question for me, would be better if you send me email aswell at syedsauds@gmail.com

  23. 23 Scot said at 8:48 pm on January 10th, 2009:

    I have seen this question asked but I havent seen any answers, what happens when server one comes back online, does server 2 (the failover server) copy and sync all the new files that have been created while server one was down, like new voicemail and new cdr info etc back to server one?

  24. 24 prashant said at 10:38 am on February 24th, 2009:

    I have followed all ur point step by step, I am trying to run your flip1405 primary script in my asterisk server system but its not running, can u tell me what’s the problem?
    what should I do to run the script in linux operating system?

    Thanks & Regards :)

  25. 25 Indiver Nehru said at 4:40 am on March 4th, 2009:

    I want to try the above procedure for failover of asterisk server. I’m new to Linux. I had some questions regarding this

    1) I downloaded and modified the rsync_replicate shell file according to my ipaddress. My question is where to keep rsync_replicate shell file in the primary or secondary. If in primary what is the destination path.

    2) I created cron.every1 folder in my primary and secondary servers. I had one more doubt is it necessary to keep them in cron_every1 folder or we can keep in cron.daily folder located in /etc.

  26. 26 Gregory Boehnlein said at 12:04 pm on March 25th, 2009:

    Gregg,
    Thanks for creating this! I’ve made several updates/changes to your script and have been using it successfully in production for a few weeks now. The changes are as follows:

    # Version 1.0 - 2009-03-25
    # - Consolidated Master/Slave scripts into a single script
    # - Converted hardcoded interface / IP configuration to variable based
    # - Forced Asterisk to issue a "reload" to bind to the floating IP
    # - Consolidated external rsync-replicate script to be self-contained

    I’d like to give these back to the community, but wanted to have a conversation with you first about the potential for setting up a little more structured way to release the code. Can you drop me an E-mail when you get a chance?

  27. 27 Lawrence said at 9:10 am on March 31st, 2009:

    I’m having a weird error when running the scripts. I changed to CentOS 5.2 with FreePBX, and when running the script I get “SIOCSIFFLAGS: Cannot assign requested address”. If I run “ifconfig eth0:1 192.168.xx.xx/24 up” manually from the command line, eth0:1 shows up until the next cron.every1 rotation.
    checked to see if i had nmap and arping installed and can see both packages….

    i have the scripts running and a few different locations , just cant seem to find what i am missing in this current build..
    any suggestions would be helpful

  28. 28 Crontab said at 3:33 pm on August 17th, 2009:

    I still think heartbeat is a much more reliable solution although agreed it is slightly more difficult to setup. That being said I can see where this is useful for those who need something immediately while they work on a more robust solution.

  29. 29 Saleem said at 2:02 pm on September 15th, 2009:

    This is a good work but my question is you mentioned the rsync method to sync the data between both servers but i want primary server should share only updated/changed data but not everything, because overwriting everything is not good approach. Plz guide. thanks

  30. 30 admin said at 2:15 pm on September 15th, 2009:

    Hi,
    It seems that an update from a reader/contributor to my code removed my existing rsync script.
    Please use the script code below (modify as needed); with --size-only, rsync only transfers files whose size has changed.

    #!/bin/bash
    # Each directory is copied INTO the destination path, so the destination is the parent folder.
    rsync -avz --size-only -e "ssh -i /root/rsync-key" /etc/asterisk/extensions.conf root@192.168.1.3:/etc/asterisk/extensions.conf
    rsync -avz --size-only -e "ssh -i /root/rsync-key" /etc/asterisk/voicemail.conf root@192.168.1.3:/etc/asterisk/voicemail.conf
    rsync -avz --size-only -e "ssh -i /root/rsync-key" /etc/asterisk/queues.conf root@192.168.1.3:/etc/asterisk/queues.conf
    rsync -avz --size-only -e "ssh -i /root/rsync-key" /etc/asterisk/sip.conf root@192.168.1.3:/etc/asterisk/sip.conf
    rsync -avzr --size-only -e "ssh -i /root/rsync-key" /var/spool/asterisk/voicemail root@192.168.1.3:/var/spool/asterisk
    rsync -avzr --size-only -e "ssh -i /root/rsync-key" /var/lib/asterisk/moh root@192.168.1.3:/var/lib/asterisk
    rsync -avzr --size-only -e "ssh -i /root/rsync-key" /var/lib/asterisk/sounds root@192.168.1.3:/var/lib/asterisk
    rsync -avzr --size-only -e "ssh -i /root/rsync-key" /home/PlcmSpIp root@192.168.1.3:/home
    rsync -avzr --size-only -e "ssh -i /root/rsync-key" /var/www/html root@192.168.1.3:/var/www

  31. 31 hb said at 5:07 pm on January 20th, 2010:

    I noticed a typo in your script
    rsync -avzr --rsh=ssh /var/lib/asterisk/moh root@$SLAVEIP:/var/lib/asterisk/moh causes the moh folder to be created one folder deeper.

    Removing the trailing /moh from the destination fixes the typo:
    rsync -avzr --rsh=ssh /var/lib/asterisk/moh root@$SLAVEIP:/var/lib/asterisk

  32. 32 admin said at 5:18 pm on January 20th, 2010:

    Thank you very much for the contribution! I edited the script.

  33. 33 RazaMetaL said at 6:43 pm on February 25th, 2010:

    Hi,

    I’m using the latest version but have some questions.

    I’ve added one cron task as root with crontab -e:
    */1 * * * * /usr/bin/flip1405.sh

    The script seems like is doing nothing, if I reboot the master server, the slave does not get the virtual ip, when the master server is up again i need to execute manually /usr/bin/flip1405.sh to have the virtual ip as secondary on my ethernet card.

    I don not know if is necessary to add any other paremeters to the cron command?

    Best regards,

  34. 34 Sam said at 9:49 am on March 24th, 2010:

    Hi,
    As Virtual IP requires both boxes to the on the same subnet, do you have any clue on how the redundancy can be archived without these two servers being on the same subnet?

  35. 35 Tony Cruz said at 10:56 am on March 25th, 2010:

    Hi I have a Elastix setup and works great I will like to implement this script but I need to know where to install it in the server and if there is anything else besides the script at the top of this post that needs to be added or if you have a guide on how to implement it I will appreciated it

    Thank you
    Tony Cruz

  36. 36 admin said at 12:18 pm on April 8th, 2010:

    Hi Tony,

    You just need to install the script in the default location for cron.
    Syncing the configuration data will be more of a challenge because you need to get the databases to sync. But for simple failover, you can get it all done by following the directions in the post.

    Hope that helps,
    Gregg

  37. 37 admin said at 12:20 pm on April 8th, 2010:

    Hi,

    The idea with the virtual IP is that the devices pointing to it will seamlessly go to either server when the IP changes ownership.
    If you can describe the environment you’d like to setup, I would be able to better understand and discuss a solution.

  38. 38 admin said at 12:23 pm on April 8th, 2010:

    You can step through the script, pasting each line into the Linux CLI to test.
    Paste in the:
    VARIABLE=$(nmap -sU -p 5060 blah blah blah etc etc)

    Then:
    echo $VARIABLE

    That way you can see if you are getting the correct data that the script needs to run. If you can send me your layout (IPs) then I would be able to help you.

    Hope that helps,
    Gregg

  39. 39 Sikander said at 12:01 am on April 10th, 2010:

    Looking for help for setting up asterisk failover. I have to setup 2 servers primary and secondary. Let me know your affordable price please.

  40. 40 kman said at 11:35 pm on May 19th, 2010:

    Awesone stuff, just finished testing it all and will be putting it into production soon.

    A note: Asterisk needs to be bound to the floating IP for some handsets to register/accept calls properly, using ‘bindaddr=x.x.x.x’. I put this in /etc/asterisk/sip_general_custom.conf

    Thanks much folks!

  41. 41 admin said at 7:38 am on May 20th, 2010:

    Thank you, I forgot to mention the bindaddr portion in the article. I will add that now.
    Cheers.

  42. 42 a visitor said at 9:17 am on June 5th, 2010:

    In the instructions above you mentioned “Make sure to set the bindport to the virtual IP address”; shouldn’t it be bindaddr, based on the comment right above?

  43. 43 Brian said at 4:02 pm on July 6th, 2010:

    Hi,

    Looks interesting! 2 Questions

    What about astdb?

    What if IAX2 support isn’t installed?

    Maybe the check would be better for SIP.

    Regards,
    Brian

  44. 44 Ron said at 12:24 pm on July 28th, 2010:

    Hi, over there,

    this is really a cool script and it does what it was designed for. Just tested it.
    Since you wanted feedback:I am just trying to have some more services checked. Maybe my ambitions are not so clever cause I could try it out with heartbeat2 … Because I am not only running httpd and asterisk on it. Furthermore, I have mysql-replication with Master and Slave (which is PRI and SEC, respectivly) and just because of this fact (keeping database clean) I have to “send” some mysql commands so that the slave becomes master.

    I don’t know if you care at all about that, but I just wanted to thank you for that good bash-template … but if YOU got any hints for me, my ears and eyes are wide opened ;)

    best wishes,
    r0n

