Automated snapshot failures
Incident Report for ESS (Public)
Resolved
We’ve set up an automated process to snapshot affected clusters. The permanent fix for affected clusters will go out with our next scheduled deployment. As the impact is now remediated, we’re closing this incident. If you have any additional concerns, please reach out to your support team. Thank you for your patience!
Posted Mar 05, 2020 - 19:19 UTC
Update
We are still working on a fix for this issue. In the interim, we are manually snapshotting any affected deployments. We will update again in 4 hours.
Posted Mar 05, 2020 - 15:06 UTC
Update
We are still working on a fix for this issue. In the interim, we are manually snapshotting any affected deployments. We will update again in 4 hours.
Posted Mar 05, 2020 - 11:03 UTC
Update
We are continuing to work on a fix for this issue. In the interim, we are manually snapshotting any affected deployments. We will update again in 4 hours.
Posted Mar 05, 2020 - 07:10 UTC
Identified
We have identified the issue and are currently developing a fix. We have manually triggered snapshots for impacted clusters, so the impact of this incident is currently low. We will provide another update in 4 hours.
Posted Mar 05, 2020 - 02:12 UTC
Update
We are continuing to investigate this issue.

While the investigation is ongoing we are going to run a manual process to create snapshots for all impacted clusters.

We will have another update in approximately 4 hours.
Posted Mar 04, 2020 - 22:41 UTC
Investigating
We have identified an issue with automated snapshots that is currently impacting all regions and providers. Our investigation has determined that up to 2% of deployments are affected. We are working on a remediation for the core issue; in the meantime, if you notice that your deployments are not taking regular automated snapshots, you can trigger them manually as a temporary measure. Manual snapshots can be initiated through your Cloud management console, or by using the Elasticsearch snapshot API.
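As a sketch of the API route, the following assumes a deployment endpoint and `elastic` user credentials (both placeholders here), and that the deployment's managed snapshot repository is named `found-snapshots`; check your own repository name with `GET /_snapshot` before running:

```shell
# Trigger a manual snapshot (placeholder endpoint and credentials;
# "found-snapshots" is the assumed managed repository name):
curl -u "elastic:$ES_PASSWORD" -X PUT \
  "https://<your-deployment-endpoint>:9243/_snapshot/found-snapshots/manual-20200304?wait_for_completion=false"

# Check the snapshot's progress:
curl -u "elastic:$ES_PASSWORD" \
  "https://<your-deployment-endpoint>:9243/_snapshot/found-snapshots/manual-20200304"
```

With `wait_for_completion=false` the PUT returns immediately and the snapshot runs in the background; the follow-up GET reports its state (e.g. `IN_PROGRESS` or `SUCCESS`).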

We apologize for the inconvenience and will have another update for you in approximately two hours.
Posted Mar 04, 2020 - 21:04 UTC
This incident affected: Azure Netherlands (azure-westeurope) (Deployment snapshots: Azure azure-westeurope), GCP Mumbai (asia-south1) (Deployment snapshots: GCP asia-south1), GCP Oregon (us-west1) (Deployment snapshots: GCP us-west1), Azure Singapore (azure-southeastasia) (Deployment snapshots: Azure azure-southeastasia), AWS Sydney (ap-southeast-2) (Deployment snapshots: AWS ap-southeast-2), AWS N. Virginia (us-east-1) (Deployment snapshots: AWS us-east-1), Azure Washington (azure-westus2) (Deployment snapshots: Azure azure-westus2), AWS Singapore (ap-southeast-1) (Deployment snapshots: AWS ap-southeast-1), GCP Belgium (europe-west1) (Deployment snapshots: GCP europe-west1), AWS N. California (us-west-1) (Deployment snapshots: AWS us-west-1), GCP London (europe-west2) (Deployment snapshots: GCP europe-west2), GCP Iowa (us-central1) (Deployment snapshots: GCP us-central1), Azure Tokyo (azure-japaneast) (Deployment snapshots: Azure azure-japaneast), GCP Montreal (northamerica-northeast1) (Deployment snapshots: GCP northamerica-northeast1), AWS São Paulo (sa-east-1) (Deployment snapshots: AWS sa-east-1), AWS Tokyo (ap-northeast-1) (Deployment snapshots: AWS ap-northeast-1), GCP Frankfurt (europe-west3) (Deployment snapshots: GCP europe-west3), GCP Tokyo (asia-northeast1) (Deployment snapshots: GCP asia-northeast1), Azure Virginia (azure-eastus2) (Deployment snapshots: Azure azure-eastus2), GCP N. Virginia (us-east4) (Deployment snapshots: GCP us-east4), GCP Sydney (australia-southeast1) (Deployment snapshots: GCP australia-southeast1), AWS Oregon (us-west-2) (Deployment snapshots: AWS us-west-2), AWS Frankfurt (eu-central-1) (Deployment snapshots: AWS eu-central-1), AWS London (eu-west-2) (Deployment snapshots: AWS eu-west-2), and AWS Ireland (eu-west-1) (Deployment snapshots: AWS eu-west-1).