I’ve been working with telco software for the last 7 years and far and away the most painful part on the server side is maintaining a reliable database setup. NAT would take the cake for the most painful thing overall. This post was inspired by the sipsorcery MySQL service crashing yet again.

I have the service configured to automatically restart so the crash would have hardly been noticeable.
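
For what it’s worth the automatic restart is nothing clever, just the standard Windows service recovery options, along these lines (assuming the MySQL instance is registered under the service name MySQL):

    rem Restart the service 60 seconds after the first, second and subsequent failures,
    rem and reset the failure counter once a day.
    sc failure MySQL reset= 86400 actions= restart/60000/restart/60000/restart/60000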

In the last couple of months the previously rock solid MySQL instance on the sipsorcery server has become increasingly flaky and the process is now crashing on average once a week. From the error logs the prime suspect is something to do with MySQL’s SSL handling, and I did find an issue on the MySQL bug list that “may” be the cause. I need to schedule time to upgrade to the latest MySQL version but that’s not a trivial matter and will probably mean between 30 and 60 minutes of downtime.

The recent problems I’ve had with MySQL highlight a recurring theme from my time working with VoIP/telecommunications software: building and maintaining a database with five 9’s reliability is either very expensive or very tough.

The database I used for my first VoIP platform was Postgresql. All in all it was pretty reliable but issues did crop up. In one case the data files got corrupted due to a Postgresql bug handling Unicode on Windows. After recovering from that we moved to a Linux platform and suffered a couple of outages due to hardware and operating system issues. That led us to bite the bullet and spend a lot of money (for a small company) on a Solaris SunFire server and storage array. That ended up being disastrous: first a firmware fault in the Fibre Channel controller on the storage array caused some major outages and a lot of messing around to replace, and after sorting all that out there were a few kernel panic incidents whose cause I can’t recall. The problem with Postgresql at the time was that it didn’t have a replication/failover solution. There were side projects which we tested out but they were all immature and some introduced prohibitive performance penalties. We ended up using archive log shipping, where the transaction logs from our main server were copied over to a standby Linux Postgresql server. Unfortunately that caused a couple of outages as well: the disk on the Linux server filled up, the Linux Postgresql instance stopped applying the transaction logs, and that caused the primary Postgresql instance on the Solaris server to shut down to preserve the integrity of the data. Eventually things settled down but it was serious enough at the time that it was considered a threat to the survival of the business.
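
For anyone curious, the log shipping arrangement was just Postgresql’s built-in WAL archiving. The exact settings depended on the Postgresql version, but the sketch below gives the flavour of it (paths and hostnames are made up for illustration):

    # postgresql.conf on the primary: copy each completed transaction log segment to the standby.
    archive_mode = on
    archive_command = 'rsync -a %p standby:/var/lib/pgsql/wal_archive/%f'

    # recovery.conf on the standby: keep replaying the shipped segments as they arrive.
    restore_command = 'pg_standby /var/lib/pgsql/wal_archive %f %p'

If the archive_command starts failing, for example because the standby’s disk is full, the primary can’t recycle its transaction logs and will eventually grind to a halt, which is roughly the failure mode described above.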

The Solaris experience caused us to appreciate even more how important a reliable database was, so we had a chat with Oracle. They promised a multi-server, real-time failover system but the price was exorbitant and way out of our league. We were desperate enough to consider it at the time but in the end common sense prevailed and we soldiered on with Postgresql.

With mysipswitch and sipsorcery it was a chance to try and find a better solution in a less demanding environment. By that I mean if something went wrong it was an inconvenience to people but they were warned in advance that the system was experimental and came with no guarantees. The mysipswitch service actually spent its whole life using the same Postgresql database I mentioned above and it was only when the service morphed into sipsorcery that a different database approach was attempted. The sipsorcery service was initially deployed for a very unhappy year on Amazon’s EC2 infrastructure, starting out as a single server deployment using a local MySQL database. When the EC2 instance started going down every second day the deployment model changed to two servers with an SQL Azure database. I actually thought at the time SQL Azure was finally the perfect solution. It was cheap at $10 a month and all the hard things about running a database were taken care of by Microsoft. However there were a few glitches that caused 5 minutes of downtime here and there and at the time I put it down to the fact that the service was brand new; this was in Jan 2010 when SQL Azure had only just been opened for service. The small outages were bearable but the real problem came when I finally had to give up on EC2 and move to a dedicated server. At that point the SQL Azure database I was using started getting the connection requests that were previously spread over two EC2 instances from a single dedicated server. It wasn’t a huge number, between 20 and 30 connections, but it was enough to cause the SQL Azure Denial of Service protection (called DOSGuard apparently) to drop all connections for up to an hour. SQL Azure support were happy to send me emails back and forth for nearly 3 months about the issue but in the end it was something they couldn’t or wouldn’t fix. In my opinion it’s a strange limitation and one that probably stems from the fact that SQL Azure is mainly pitched as a solution for web applications. Apparently the sipsorcery software was getting flagged because the connections were coming from seven different processes.

So after SQL Azure it’s back to a local MySQL instance on the dedicated sipsorcery server. It’s almost tempting to switch it to Postgresql to complete the circle.

At one point I did look at NoSQL options like Amazon’s SimpleDB but the latency was a killer, with even the simplest queries taking almost a second. I also checked out some of the other similar offerings but they all appeared geared up for web applications where response times of up to 500ms aren’t a problem. For sipsorcery the response times need to be well under 100ms.
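
Checking that sort of latency doesn’t need anything sophisticated, just timing a representative query from the same network as the application servers. A rough sketch of the kind of check, using the MySQL .NET connector as an example (the connection string and table name are placeholders; substitute whatever client the data store provides):

    using System;
    using System.Diagnostics;
    using MySql.Data.MySqlClient;

    class QueryLatencyCheck
    {
        static void Main()
        {
            // Placeholder connection string for illustration only.
            string connStr = "Server=localhost;Database=sipsorcery;Uid=user;Pwd=pass;";

            using (var conn = new MySqlConnection(connStr))
            {
                conn.Open();
                var cmd = new MySqlCommand("SELECT COUNT(*) FROM sipaccounts", conn);

                // Time a single round trip; anything consistently over ~100ms is too slow
                // for call processing.
                var sw = Stopwatch.StartNew();
                cmd.ExecuteScalar();
                sw.Stop();

                Console.WriteLine("Query round trip: " + sw.ElapsedMilliseconds + "ms");
            }
        }
    }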

I also know MySQL has replication and load balancing options and I have explored them. The problem is that to get automatic failover it needs something like 6 servers. Without automatic failover there’s not a lot of benefit for sipsorcery in replicating data to a standby node. That means Amazon’s RDS service is also not a great option.
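
To be clear, plain master-slave replication on its own is easy enough to set up, something along the lines of the sketch below (server names and credentials are placeholders); it’s the automatic failover layered on top that pushes the server count up:

    # my.cnf on the master: give it a unique id and enable the binary log.
    [mysqld]
    server-id = 1
    log-bin   = mysql-bin

    # my.cnf on the slave: just needs its own unique id.
    [mysqld]
    server-id = 2

    -- Run on the slave to point it at the master and start replicating.
    CHANGE MASTER TO
        MASTER_HOST = 'db-master.example.com',
        MASTER_USER = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',
        MASTER_LOG_POS = 4;
    START SLAVE;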

I’d love to hear if anyone knows of any other type of service out there that might be worth looking into.