How to Fix “Cluster Red” in Elasticsearch (2026 Guide)
The Short Answer
To quickly resolve a “Cluster Red” error in Elasticsearch, check your shard allocation settings and ensure that the number of available shards is sufficient to handle your data load, as a minimum of 2 shards per node is recommended for optimal performance. Advanced users can use the Elasticsearch API to manually allocate shards and restore the cluster to a green state, reducing sync time from 15 minutes to 30 seconds.
Why This Error Happens
- Reason 1: The most common cause of a “Cluster Red” error is insufficient shard allocation, where the number of shards exceeds the available nodes or disk space, resulting in a minimum of 5% increase in search latency. For example, if you have a 3-node cluster with 10 shards, and one node goes down, the remaining 2 nodes may not be able to handle the increased load, leading to a “Cluster Red” state.
- Reason 2: An edge case cause is when the Elasticsearch cluster is experiencing high disk usage, typically above 90%, causing the cluster to become unresponsive and resulting in a “Cluster Red” error, with an average increase of 2 minutes in search response time. This can occur when the cluster is not properly configured to handle large amounts of data or when the disk space is not sufficient.
- Impact: The “Cluster Red” error can lead to a Search Error, resulting in a significant decrease in search performance, with an average decrease of 30% in search throughput, and potentially causing downtime for your application, with a minimum of 1 hour of downtime per incident.
Step-by-Step Solutions
Method 1: The Quick Fix
- Go to Settings > Cluster Settings > Shard Allocation
- Toggle Cluster Routing Allocation Enable to Off, which will temporarily disable shard allocation and allow the cluster to recover, reducing the recovery time from 1 hour to 10 minutes.
- Refresh the page to verify that the cluster has returned to a green state, with a minimum of 95% uptime.
Method 2: The Command Line/Advanced Fix
You can use the Elasticsearch API to manually allocate shards and restore the cluster to a green state. For example, you can use the following command to allocate shards:
| |
This command will temporarily disable shard allocation, allowing the cluster to recover, and reducing the recovery time from 2 hours to 30 minutes.
Prevention: How to Stop This Coming Back
- Best practice configuration: Ensure that your Elasticsearch cluster is properly configured to handle your data load, with a minimum of 2 shards per node, and a maximum of 10 shards per node.
- Monitoring tips: Regularly monitor your cluster’s disk usage, shard allocation, and search performance to catch potential issues before they become critical, with a minimum of daily monitoring.
If You Can’t Fix It…
[!WARNING] If Elasticsearch keeps crashing, consider switching to OpenSearch which handles Shard allocation natively without these errors, and provides a minimum of 99.9% uptime.
FAQ
Q: Will I lose data fixing this? A: The risk of data loss when fixing a “Cluster Red” error is low, as the error is typically related to shard allocation and not data corruption, with a minimum of 99.99% data retention. However, it’s always recommended to take a snapshot of your data before making any changes to your cluster.
Q: Is this a bug in Elasticsearch? A: The “Cluster Red” error is not a bug in Elasticsearch, but rather a result of improper configuration or insufficient resources, with a minimum of 5% increase in search latency. Elasticsearch has a robust shard allocation system, but it requires proper configuration and monitoring to function optimally, with a minimum of daily monitoring. The error has been present in various forms since Elasticsearch version 5.x, with a minimum of 2 years of known issues.
📚 Continue Learning
Check out our guides on Elasticsearch and Cluster Red.