The VoIP Addict’s Guide – VoIP Redundancy in the Cloud
Make no mistake, almost everything is becoming a cloud based service. Still running Exchange? You’re living in the past, my friend. Phone systems are, of course, no different. While I’ll maintain there are huge advantages to running an on-prem system (mostly cost and low latency), there are a lot of conveniences of having your system in the cloud. Now, when I say cloud, I am referring to platforms like Microsoft Azure and Amazon Web Services for this specific post.
Let’s talk a little about the conveniences of a cloud hosted phone system. First, it makes deploying remote phones a much easier process, mostly because every phone is now remote. It also allows anyone traveling abroad to bring their phone with them, and with Internet access, they can make calls from Singapore as if they were calling from Buffalo, New York (for example) with no international toll charges. Of course, you can always call extension to extension for zero cost. That’s a pretty amazing concept.
You might be thinking, this can all be done with an on-prem system as well, and you’d be right, but why poke holes in your corporate firewall, and subject yourself to the fun of NAT traversal if you don’t need to? You can also accomplish redundancy with an on-premise system, but you will lack the flexibility of providing multi-region connectivity and redundancy (because it’s in the same building), which is what the above-mentioned cloud services can provide.
Why is multi-region connectivity important? Well, if you’ve been reading the news lately, you’ve probably heard that Amazon dropped an entire region for a couple of hours, causing mass panic and the zombie apocalypse (not really). This is the risk you take in exchange for convenience when you place an application or service in the cloud, but when you distribute that application or service across multiple regions, you mitigate that risk significantly. Some businesses went down entirely because they put all of their eggs in one basket (region).
It should be known that regions in these cloud services are treated as completely siloed entities. Instances in one region cannot simply ping an instance in another region via local IP address, even if they are on the same Amazon or Azure account. For that, you need some sort of connector, like a VPN. Be aware, however, that this is accomplished differently depending on which service you are using.
Amazon Web Services, for example, does not have any built-in tools at this time to connect regions together. If you’re planning on deploying FreePBX in both Oregon and Virginia for redundancy, you’ll need to create a VPN between the two systems with your own virtual appliance so that they can exchange configurations. This should not be confused with Sangoma’s High Availability module for FreePBX, as that requires the two systems to be on the same subnet with very low latency between them.
Microsoft Azure DOES provide the ability to create a region-to-region VPN without using a 3rd party VPN concentrator, and in my experience, the more natively supported tools and services you use, the better things work overall. Truthfully, a VPN may not always be necessary; that depends on the specific phone system and how it prefers to communicate with its slave or warm spare. It generally isn’t a bad thing to have regardless.
Before I get more into the strategy of multi-region redundancy, I’d be remiss not to mention a second option, which is connecting either Microsoft Azure or Amazon Web Services to your local corporate network. Both services have native tools to create a VPN to your network, provided you have a compatible firewall on your side of the equation. In this scenario, you would have a system on your local network, with a warm spare in the cloud, and the two can talk local IP to local IP. This option isn’t as flexible as moving all phone system communications to the cloud, but it would still provide redundancy in the event your on-prem system goes down while you still have a live Internet connection to your building. If your entire network takes a nose dive, you are SOL.
Strategy: I originally had the idea (when writing this post) of testing Wazo’s built-in high availability module, but I found that just installing the platform on Amazon was such a difficult and inconsistent process that I gave up. Back when it was called Xivo, I tested high availability and it worked great. It didn’t work as well as Sangoma’s High Availability module, but it did a decent enough job. The way that it works (or worked) is by moving the configuration from the master system to the slave via a secure tunnel; the slave would then synchronize and shut down its own Asterisk. Its job would then be to continuously ping the master and, in the event the master was unresponsive, start Asterisk and bring up the SIP trunks. The only thing you’d have to worry about is registering all of your phones to the slave PBX. That can be automated by using IP phones with a secondary SIP server.
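To make the slave’s watchdog role a little more concrete, here is a minimal Python sketch of the “ping the master, then take over” loop. It is not Wazo’s or Xivo’s actual code. The function name, the three-misses threshold, and the five second interval are all my own assumptions for illustration, and the probe and promote steps are passed in as plain callables so you can decide what “ping” and “start Asterisk” mean in your environment.

```python
import time
from typing import Callable


def watch_master(
    probe: Callable[[], bool],
    promote: Callable[[], None],
    max_failures: int = 3,        # assumed threshold, not a Wazo default
    interval_s: float = 5.0,      # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the master until it misses max_failures probes in a row,
    then call promote(). Returns the number of probes that were sent."""
    consecutive_failures = 0
    probes_sent = 0
    while consecutive_failures < max_failures:
        probes_sent += 1
        if probe():
            # Master answered, so any earlier misses were just a blip.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
        if consecutive_failures < max_failures:
            sleep(interval_s)
    # Master is considered down: start Asterisk, bring up trunks, etc.
    promote()
    return probes_sent
```

Requiring several consecutive misses is a design choice to avoid promoting the slave over a single dropped packet. In practice, probe might be a TCP connect or a SIP OPTIONS check against the master, and promote would do whatever your platform needs to start taking calls.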
So, because Wazo was such a PITA, I decided to go with something more mature in the open source space for this post: FreePBX. FreePBX can be configured as a warm spare similarly to Wazo, but it isn’t as automated a process. Take a look here to see what’s involved in the basic setup. You will STILL employ IP phones with a secondary SIP server (Sangoma’s phones do this BTW). As with Wazo, the configuration is transferred to the warm spare in the opposite region via a secure tunnel, but the difference is in the synchronization. Wazo will instantly synchronize, but FreePBX will require a restore to be performed, which can be automated. You will also need to exclude the network settings from the restore on the warm spare. We aren’t exactly replacing the production system, we are just providing an alternate for phones to register to. The only intervention that should be required in the event of a failover is activating the SIP trunks (because you would have chosen to turn them off in the warm spare’s restore).
To summarize: When your production phone system has an issue and goes down, your IP phones will attempt to register to the secondary SIP server (via public IP address), which resides in another region (using either Azure or Amazon). To complete the failover, you will need to log into the warm spare, which has now become the production system, and enable the SIP trunks. Within a reasonably short time, calls in and out will flow as if nothing happened.
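From the phone’s point of view, the failover summarized above boils down to “try the primary, then the secondary.” Here is a rough Python sketch of that ordering. Real phones do this in firmware, so treat this purely as an illustration; the function name and the example hostnames are made up.

```python
from typing import Callable, Optional, Sequence


def pick_registrar(
    servers: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first SIP server, in priority order, that answers.
    Returns None if neither the primary nor any spare responds."""
    for server in servers:
        if is_reachable(server):
            return server
    return None


# Hypothetical hostnames: production PBX first, warm spare second.
servers = ["pbx-oregon.example.com", "pbx-virginia.example.com"]
```

With both systems healthy the phone stays on the first entry. Once the production system stops answering, the same call returns the warm spare in the other region, which is why the spare needs a reachable public address.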
While this all works, the primary challenge is the timing of the synchronization between systems, since it is not instantaneous. Logically, you’ll want to back up and restore to the warm spare nightly, but if a lot of changes are expected on a system daily, you may want to schedule that more frequently.
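If you want a rough number for “more frequently,” here is some back-of-the-napkin arithmetic in Python. It assumes configuration changes are spread evenly across the day (they rarely are), and both the function name and the idea of a tolerable number of lost changes are my own framing, not anything built into FreePBX.

```python
def sync_interval_hours(changes_per_day: float, max_lost_changes: float) -> float:
    """Longest backup/restore interval, in hours, that keeps the expected
    number of unsynced changes at or below max_lost_changes.
    Capped at 24 so the answer is never worse than a nightly sync."""
    if changes_per_day <= 0:
        return 24.0
    return min(24.0, 24.0 * max_lost_changes / changes_per_day)
```

For example, a system that sees about 48 changes a day, where you can live with redoing 4 of them after a failover, works out to a sync every 2 hours. A system with only a couple of changes a day stays on the nightly schedule.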
If you plan on deploying your phone system to the cloud, and redundancy is going to be an important priority, well then, I hope I gave you something to think about. Stay tuned for my upcoming post on creating a quick and easy VPN between Amazon Web Services regions.
Happy VoIPing!
2 Comments
Your article raises some interesting points about Cloud Computing and VoIP redundancy. I’m a little surprised by some of the choices. You neglected to mention that the Sangoma high availability option costs thousands of dollars while the Wazo HA option is free. You mentioned Wazo being difficult to install. It’s worth mentioning that Wazo is a completely different Asterisk technology than FreePBX. FreePBX is essentially a code generator for Asterisk, meaning that, once you make choices in the GUI, FreePBX generates Asterisk code which thereafter can run without any reliance on FreePBX. Wazo, on the other hand, is a realtime Asterisk implementation that makes changes to the Asterisk operating environment in realtime as the changes are made. While changing a name associated with an extension takes a fraction of a second with Wazo, the same change using the FreePBX GUI requires a complete reload of every configuration file associated with Asterisk. In a small system, this is insignificant. But, with a system that has hundreds of extensions, the split second reload on Wazo can take 10 minutes or more using FreePBX. As for installation issues, Wazo is DNS sensitive and depends upon a fully-qualified domain name to bring up your server. I suspect most of your install issues were related to that. And it’s for that reason that we developed about a dozen tutorials at Nerd Vittles to walk administrators through the appropriate process on particular cloud platforms. We’ve never heard a complaint since. Finally, let me address cloud platforms. The two you have chosen from Microsoft and Amazon are extremely expensive, particularly if you also want redundancy. Our preferences, which yield as much as ten-fold cost savings, include OVH, Vultr, and Digital Ocean. All of these providers offer multiple sites and support for VPNs such as NeoRouter, which works extremely well for HA VoIP implementations and also happens to be free.
I would encourage you to walk through an Incredible PBX for Wazo install using our tutorial on Nerd Vittles and see if you don’t have better results on one of the cloud platforms we’ve identified.
Hi Ward,
Thank you for chiming in on this topic.
To be clear, I do like Wazo, and services like Vultr (I love Vultr) and Digital Ocean. OVH is a different story. While OVH’s prices are attractive, their setup process was archaic and they required me to send them PII to verify who I was. Not interested.
Amazon and Microsoft, while considerably more expensive, were the focus of the post because they offer arguably more robust platforms and are much more configurable from a virtual networking standpoint. For an enterprise solution, I would always suggest one of the two, and perhaps Google if they ever get into gear. I am not necessarily trying to suggest Vultr/DO are not worth using, but your cost savings come at the expense of flexibility (IMO).
I acknowledge that FreePBX and Wazo are two very different platforms, but my perspective in the post was more that of a customer/user experience. I may not have done a great job expressing that, so I apologize. The “out of box” experiences are different and one requires a lot of extra effort to get working. Is that my fault? Could be. I plan on spending a lot more time tinkering with Wazo on your recommendation, because platforms like that and Ombutel help move the ball forward with open source/Asterisk based VoIP, and that’s good for everyone.
Sangoma’s high availability option (module) is expensive from the perspective of a hobbyist, or SOHO user, but the cost may be negligible to a medical or financial organization because the cost of NOT having more sophisticated logic in their high availability solution might be higher. However, that module can’t be used in the cloud because it requires two physical network paths and for the systems to be in close proximity. Their “free” version of high availability is using a remote backup and restore strategy, which works, but I like Wazo’s implementation better.
Thanks,
Marc