The AWS Outage and Cloud Management

What The AWS Outage Teaches Us

On February 28, part of Amazon’s cloud service went down for several hours, causing widespread failures of websites and web apps hosted on the platform.  Amazon’s S3 service had an interruption at one of its data centers on the East Coast of the US, which left sites like Business Insider and Quora inaccessible and services like Slack and Amazon’s music app not working.  Amazon later noted in a very detailed explanation that the downtime was the result of human error and that its protocols and routines had been adjusted to avoid similar mistakes in the future.

So Is Cloud Computing Safe?

So does this mean that everyone who has been worried the cloud isn’t really safe for businesses has been right?

Well, not really.  It’s important to keep a few points in mind.

First, only one data center had issues on February 28.  Amazon’s guidelines for cloud usage advise keeping data in multiple data centers for redundancy.  Notably, Netflix did not suffer any service disruptions during the S3 incident, even though all of its data resides on AWS.  Netflix had taken care to ensure redundancy.  You have responsibilities as a business when using cloud services.  That being said, I think it’s pretty clear that no major cloud provider is going to be able to simply say ‘you should have been more careful in your setup’ after an outage.  That won’t cut it for major outages.
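To make the redundancy point concrete, here is a minimal sketch of reading from a second location when the first one is down.  It does not use any real AWS API: the location labels, the fake stores and the read_with_failover helper are all hypothetical names made up for illustration, and a real setup would also need to replicate the data between locations in the first place.

```python
class AllReplicasFailed(Exception):
    """Raised when no replica could serve the requested object."""


def read_with_failover(key, replicas):
    """Try each (name, fetch) pair in order and return the first success.

    `replicas` is a list of (location_name, fetch_function) tuples.
    Both are hypothetical stand-ins for real storage clients.
    """
    errors = []
    for name, fetch in replicas:
        try:
            return name, fetch(key)
        except Exception as exc:  # any failure means: try the next copy
            errors.append(f"{name}: {exc}")
    raise AllReplicasFailed("; ".join(errors))


def make_fake_store(data, healthy=True):
    """Build a fake fetch function that simulates one storage location."""
    def fetch(key):
        if not healthy:
            raise ConnectionError("location unavailable")
        return data[key]
    return fetch


# The same object is stored in two locations; the first one is "down".
objects = {"logo.png": b"fake image bytes"}
replicas = [
    ("east-coast", make_fake_store(objects, healthy=False)),
    ("west-coast", make_fake_store(objects, healthy=True)),
]

served_by, body = read_with_failover("logo.png", replicas)
print(f"served by {served_by}: {len(body)} bytes")
```

The idea is simply that the application never depends on a single location being up.  How Netflix actually implements its redundancy is not described here, so treat this only as an illustration of the principle.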

Let’s also keep in mind the second key point: AWS rarely goes down.  In fact, even with this latest outage, they are still meeting their SLA-guaranteed uptime metrics for customers.  Part of the reason the outage was so newsworthy is that such outages are so rare.

Finally, many enterprises are continuing with a hybrid cloud solution, keeping some ‘cloud’ services on-premises.  They are adopting public cloud, but not fully.  This is another way to make cloud computing ‘safer.’  This trend will probably continue for the foreseeable future.  Ultimately, though, I think the vast majority of enterprises will do all of their major IT lifting in the public cloud.  It is safe.  And the major cloud providers are investing huge resources in it, resources individual companies can’t match.  As Google CEO Sundar Pichai noted recently, Google has invested over $30 billion in its cloud infrastructure.  Why would any enterprise try to match that, rather than just using Google’s cloud?