
Amazon Web Services (AWS) confirmed that its systems were recovering after a major outage disrupted dozens of websites and applications, including Facebook, Snapchat, Amazon, Coinbase and Robinhood. The Wall Street Journal, which reported the disruption, was among several media organisations affected.

The outage, which began around 3:00 AM Eastern Time, originated in the AWS US-EAST-1 region in Northern Virginia and affected major retailers, airlines, social media applications, financial services companies and productivity tools. Sites including Slack, United Airlines, AI tool Perplexity and video games Fortnite and Roblox experienced disruptions.

AWS traced the problem to DynamoDB, its managed database service, which websites and applications use to store and retrieve data. The service has more than one million customers across the retail, financial services, media and entertainment sectors, with clients including Disney+, Zoom, Airbnb, Lyft, Dropbox and Nike.

The company identified the root cause at 2:01 AM Pacific Daylight Time (5:01 AM Eastern) as a DNS resolution issue affecting the DynamoDB API endpoint in US-EAST-1. AWS said it was “working on multiple parallel paths to accelerate recovery” and noted that the issue was also affecting other services in the region.
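For readers who want to see what a DNS resolution check looks like in practice, here is a minimal Python sketch. It is an illustration only, not AWS tooling: the hostname follows the usual regional endpoint naming and should be treated as an assumption to verify against AWS documentation, and the `resolve` helper and its `resolver` parameter are hypothetical names introduced here.

```python
import socket
from typing import Callable, List

# Assumed regional endpoint hostname for DynamoDB in US-EAST-1;
# verify against AWS documentation before relying on it.
ENDPOINT = "dynamodb.us-east-1.amazonaws.com"


def resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted unique IP addresses for hostname, or [] if DNS lookup fails."""
    try:
        # getaddrinfo returns tuples of (family, type, proto, canonname, sockaddr).
        results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name could not be resolved, which is the failure mode described above.
        return []
    # sockaddr[0] is the IP address for both IPv4 and IPv6 results.
    return sorted({info[4][0] for info in results})
```

Calling `resolve(ENDPOINT)` returns a list of addresses when the name resolves and an empty list when it does not. An application that cannot resolve the endpoint cannot reach the service even if the service itself is healthy, which is why a DNS fault can spread so widely.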

Early signs of recovery

Engineers applied initial mitigations at 2:22 AM PDT with early signs of recovery appearing for some impacted services. By 3:35 AM PDT, AWS confirmed “the underlying DNS issue has been fully mitigated, and most AWS Service operations are succeeding normally now.”

However, requests to launch new EC2 instances continued to see increased error rates, as did services such as ECS that themselves launch EC2 instances. AWS recommended that customers launch EC2 instances without targeting specific Availability Zones and configure Auto Scaling Groups to use multiple zones.
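The multi-zone recommendation can be checked with a few lines of Python. The sketch below assumes a hypothetical inventory that maps each Auto Scaling Group name to its configured Availability Zones. The group names and the `single_zone_groups` helper are made up for illustration, and nothing here calls an AWS API.

```python
from typing import Dict, List


def single_zone_groups(groups: Dict[str, List[str]]) -> List[str]:
    """Return the names of groups configured with fewer than two distinct zones."""
    return sorted(
        name for name, zones in groups.items() if len(set(zones)) < 2
    )


# Hypothetical inventory: group name -> configured Availability Zones.
example = {
    "web": ["us-east-1a", "us-east-1b", "us-east-1c"],
    "batch": ["us-east-1a"],
}

print(single_zone_groups(example))  # prints ['batch']
```

A group pinned to one zone, like `batch` here, has nowhere else to launch capacity if that zone is impaired. A group spread across several zones does.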

Some services, including CloudTrail and Lambda, continued working through backlogs of events following initial recovery. AWS reported elevated polling delays for Lambda Event Source Mappings for SQS, which affected features that depend on Lambda’s SQS polling, including AWS Organizations policy updates.

Global services and features relying on US-EAST-1 endpoints, including IAM updates and DynamoDB Global Tables, also experienced issues during the outage before recovering at 3:03 AM PDT.

AWS infrastructure underpins millions of websites and platforms, providing cloud computing services such as servers and storage to major companies globally. The company is the largest cloud computing provider in the United States.
