AWS is the leader when it comes to the cloud, and for good reason. AWS is well ahead in the quality and breadth of services they offer.
However, when a service is running at the scale of AWS, it is natural to expect some failures to occur. According to AWS EBS availability is designed for 99.999%.
The annual failure rate (AFR) is 0.1% - 0.2%, where failure means a complete or partial failure. For example, if you had 1,000 EBS discs, you should expect 1 or 2 to have a failure per year. In our experience, partial failure is significantly more common than a complete loss. Even so, a partial loss can take a lot of time to resolve and can still be debilitating to a business.
Over the years, there have been some AWS failures that made news headlines due to havoc caused for both companies and their users. These incidents put a spotlight on AWS’ imperfections.