
OpenSearch Unassigned Shards: Causes, Diagnosis, and How to Fix Them

Unassigned shards are one of the most common — and most misunderstood — OpenSearch problems. This post covers the most common root causes, how to diagnose each one, and exactly how to fix them.

April 7, 2025 · 10 min read

Unassigned shards are one of the most common problems in OpenSearch — and one of the most mishandled. The cluster goes yellow (or red). Someone runs GET /_cat/shards, sees UNASSIGNED next to a bunch of shards, and starts Googling for the magic command to fix it. They find a Stack Overflow answer telling them to run POST /_cluster/reroute with accept_data_loss: true, they run it, and the shards assign. Problem solved.

Except it isn't. Running reroute with accept_data_loss: true on a primary shard that went unassigned because the node holding its data died permanently means you just told OpenSearch to start with an empty primary — permanently discarding whatever data was on that shard. The cluster goes green. The data is gone.

This post explains why shards go unassigned, how to correctly diagnose each cause, and how to fix each one without data loss.

What Are Shards?

An OpenSearch index is divided into one or more shards — each shard is a self-contained Lucene index stored on a node. Each shard plays one of two roles: primary (the source of truth for writes) or replica (a copy for redundancy and read throughput).

Every primary shard must be assigned to exactly one node. Every replica shard must be assigned to a different node than its primary. When either type can't be assigned, it's "unassigned" and the cluster reports as yellow (replica unassigned) or red (primary unassigned).

The Most Important Command: Allocation Explain

Before doing anything else, run this:

GET /_cluster/allocation/explain
{
  "index": "your-index-name",
  "shard": 0,
  "primary": true
}

This returns a detailed explanation of why the shard can't be assigned. It tells you the exact reason — not a generic error, but a specific, actionable explanation. Always start here.

If you omit the body, OpenSearch will explain an arbitrary unassigned shard, which is usually useful enough to start with:

GET /_cluster/allocation/explain
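The response is verbose; in practice you usually want three things from it: the shard's identity, the unassigned reason, and which deciders said "NO" on which nodes. A minimal Python sketch of that triage, assuming the standard response shape (`unassigned_info`, `node_allocation_decisions`) — the sample payload is illustrative, not from a real cluster:

```python
# Sketch: reduce an allocation-explain response to the fields you act on.
# Assumes the standard response shape; sample payload is illustrative.

def summarize_explain(resp: dict) -> dict:
    """Pull out the shard identity, unassigned reason, and per-node blockers."""
    summary = {
        "index": resp.get("index"),
        "shard": resp.get("shard"),
        "primary": resp.get("primary"),
        "reason": resp.get("unassigned_info", {}).get("reason"),
        "blockers": [],
    }
    # Collect every decider that returned NO for some node.
    for node in resp.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                summary["blockers"].append(
                    (node.get("node_name"), decider.get("decider"))
                )
    return summary

sample = {
    "index": "logs-2025.04",
    "shard": 0,
    "primary": True,
    "unassigned_info": {"reason": "NODE_LEFT"},
    "node_allocation_decisions": [
        {"node_name": "node-1",
         "deciders": [{"decider": "same_shard", "decision": "NO"}]},
    ],
}
print(summarize_explain(sample))
```

The `blockers` list maps directly onto the root causes below: `disk_threshold` points at watermarks, `same_shard` at replica-vs-node-count problems, `filter` at allocation filters.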

Root Cause 1: Node Left the Cluster

What happens: A node restarts, crashes, or is shut down. Any shards that had their primary or only replica on that node become unassigned.

Allocation explain output: You'll see NODE_LEFT as the unassigned reason, and the explanation will note that the node that held the shard is no longer in the cluster.

What to do:

  • If the node is coming back (rebooting, temporary network issue): wait. OpenSearch has a delayed_timeout setting (default 1 minute) specifically to avoid immediately moving shards when a node bounces. If the node comes back within the timeout, shards reassign instantly without any data movement.
  • If the node is permanently gone: OpenSearch will try to promote a replica to primary. If there's a healthy replica on another node, this happens automatically. If there's no replica (you had 0 replicas or all replicas were also on the failed node), you're in a data loss situation — see the section on accept_data_loss below.
# Check which nodes are currently in the cluster
GET /_cat/nodes?v

# Increase delayed timeout to avoid moving shards during rolling restarts
PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}
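When several shards go unassigned at once, grouping them by reason tells you whether you're looking at one failure or several. A sketch of that grouping over `GET /_cat/shards?h=index,shard,prirep,state,unassigned.reason&format=json` output — the rows below are illustrative:

```python
# Sketch: group _cat/shards JSON rows by unassigned reason.
# Column names (including "unassigned.reason") match the h= parameter above.
from collections import defaultdict

def group_unassigned(rows: list[dict]) -> dict:
    """Map unassigned reason -> list of (index, shard, primary/replica)."""
    groups = defaultdict(list)
    for row in rows:
        if row.get("state") == "UNASSIGNED":
            reason = row.get("unassigned.reason", "UNKNOWN")
            groups[reason].append((row["index"], row["shard"], row["prirep"]))
    return dict(groups)

rows = [
    {"index": "logs-1", "shard": "0", "prirep": "p",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-1", "shard": "0", "prirep": "r",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-2", "shard": "1", "prirep": "r", "state": "STARTED"},
]
print(group_unassigned(rows))
```

A wall of NODE_LEFT entries all pointing at the same moment is one dead node, not a cluster-wide problem.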

Root Cause 2: Disk Watermark Exceeded

What happens: A node's disk usage crosses the high watermark (default 90%). OpenSearch stops allocating new shards to that node. Existing shards on the node start being moved off it. If there's nowhere to move them (all nodes are above 85% disk usage), shards become unassigned.

Allocation explain output: The per-node decisions will show the disk threshold decider (DiskThresholdDecider) returning NO, along with each node's disk usage percentages.

What to do:

  • Free up disk space — delete old indices, force-merge, or run ISM policies to roll over and delete old data.
  • Add disk capacity to the affected nodes.
  • Temporarily raise the watermarks while you fix the underlying issue (not a long-term solution).
# Check disk usage per node
GET /_cat/allocation?v

# Temporarily raise watermarks (not permanent — fix disk first)
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}
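The three watermarks behave differently, and it helps to be precise about which one a node has crossed. A sketch classifying nodes against the default thresholds (low 85%, high 90%, flood_stage 95%) — the thresholds and node figures here are assumptions; check your cluster settings for overrides:

```python
# Sketch: classify a node's disk usage against the default watermarks.
# Defaults (85/90/95) are assumptions -- your cluster may override them.

def disk_status(used_pct: float,
                low: float = 85.0, high: float = 90.0,
                flood: float = 95.0) -> str:
    if used_pct >= flood:
        return "flood_stage"  # index block applied, writes rejected
    if used_pct >= high:
        return "high"         # shards actively relocated off this node
    if used_pct >= low:
        return "low"          # no NEW shards allocated here
    return "ok"

# Illustrative per-node usage, e.g. parsed from GET /_cat/allocation
nodes = {"node-1": 72.0, "node-2": 88.5, "node-3": 96.1}
print({name: disk_status(pct) for name, pct in nodes.items()})
```

Crossing "low" only blocks new allocations; crossing "high" starts evacuating the node; crossing "flood_stage" rejects writes to any index with a shard there.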

Root Cause 3: Replica Count Exceeds Node Count

What happens: A shard cannot have its primary and replica on the same node. If you set number_of_replicas: 2 on a 2-node cluster, there aren't enough nodes to host both replicas on different nodes from the primary. One replica will always be unassigned.

Allocation explain output: The reason will reference SameShardAllocationDecider or note that there are no nodes that can hold the shard without violating allocation rules.

What to do: Either add more nodes or reduce the replica count.

# Reduce replicas to fit your node count
# For a 2-node cluster, max replicas = 1
PUT /your-index/_settings
{
  "number_of_replicas": 1
}
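The arithmetic behind this is worth making explicit: the primary occupies one node and each replica needs a distinct additional node, so the most replicas the cluster can fully assign is node_count - 1. A small sketch that flags over-replicated indices, given a hypothetical map of index name to configured replica count:

```python
# Sketch: max assignable replicas is data_node_count - 1, because the
# primary takes one node and each replica needs a different node.

def max_replicas(data_node_count: int) -> int:
    return max(data_node_count - 1, 0)

def excess_replicas(settings: dict, data_node_count: int) -> dict:
    """Flag indices whose replica count can never be fully assigned.
    `settings` maps index name -> configured number_of_replicas."""
    limit = max_replicas(data_node_count)
    return {idx: n for idx, n in settings.items() if n > limit}

# Illustrative: a 2-node cluster with one over-replicated index.
print(excess_replicas({"logs": 2, "metrics": 1}, data_node_count=2))
```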

Root Cause 4: Max Retry Exceeded (ALLOCATION_FAILED)

What happens: OpenSearch tried to allocate a shard, the allocation failed (usually because the shard data on disk was corrupt or the node crashed mid-write), and after 5 attempts it stops trying. The shard stays unassigned indefinitely.

Allocation explain output: The unassigned reason is ALLOCATION_FAILED. The explanation will show the number of failed allocation attempts and the specific error.

What to do: If the data is recoverable (other copies exist on healthy nodes), retry the allocation:

# Reset failed allocation attempts and let OpenSearch try again
POST /_cluster/reroute?retry_failed=true

If retrying still fails and the unassigned shard is a replica whose primary is healthy on another node, you can manually allocate it; OpenSearch will rebuild the copy from the primary:

POST /_cluster/reroute
{
  "commands": [
    {
      "allocate_replica": {
        "index": "your-index",
        "shard": 0,
        "node": "target-node"
      }
    }
  ]
}
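Whether `retry_failed=true` is worth running comes down to whether the retry budget is actually exhausted. A sketch reading that out of the explain response, assuming the standard `unassigned_info` fields and the default `index.allocation.max_retries` of 5 — the sample payload is illustrative:

```python
# Sketch: check whether an ALLOCATION_FAILED shard has used up its retry
# budget (index.allocation.max_retries, default 5). Field names follow
# the standard allocation-explain response; sample payload is illustrative.

def retry_exhausted(resp: dict, max_retries: int = 5) -> bool:
    info = resp.get("unassigned_info", {})
    return (info.get("reason") == "ALLOCATION_FAILED"
            and info.get("failed_allocation_attempts", 0) >= max_retries)

sample = {
    "index": "logs-2025.04",
    "shard": 3,
    "unassigned_info": {
        "reason": "ALLOCATION_FAILED",
        "failed_allocation_attempts": 5,
    },
}
# True -> fix the underlying error, then POST /_cluster/reroute?retry_failed=true
print(retry_exhausted(sample))
```

Resetting the counter without fixing the underlying error (a corrupt segment, a full disk) just burns five more attempts and lands you back in the same state.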

Root Cause 5: An Allocation Filter Targets a Node That No Longer Exists

What happens: You created an index with an allocation filter (index.routing.allocation.require.*) that pins shards to a node that no longer exists, or to a node with a specific attribute that no longer applies.

Allocation explain output: You'll see the FilterAllocationDecider blocking allocation on every node, with a note about the required attributes.

What to do: Remove or update the allocation filter.

# Remove the allocation filter
PUT /your-index/_settings
{
  "index.routing.allocation.require._name": null
}
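Before deleting a filter, it's worth confirming it really is unsatisfiable rather than just misspelled. A sketch comparing `index.routing.allocation.require.*` settings against the current nodes — it assumes flattened index settings (as returned with `?flat_settings=true`), and both inputs below are illustrative:

```python
# Sketch: find allocation filters that no current node can satisfy.
# Assumes flattened index settings; inputs are illustrative.

REQUIRE_PREFIX = "index.routing.allocation.require."

def stale_filters(index_settings: dict, node_attrs: list[dict]) -> list:
    """Return (setting, required_value) pairs no current node satisfies.
    `node_attrs` maps each node's attribute names (e.g. _name, zone) to values."""
    stale = []
    for key, required in index_settings.items():
        if not key.startswith(REQUIRE_PREFIX):
            continue
        attr = key[len(REQUIRE_PREFIX):]  # e.g. "_name" or "zone"
        if not any(node.get(attr) == required for node in node_attrs):
            stale.append((key, required))
    return stale

settings = {"index.routing.allocation.require._name": "old-node-7"}
nodes = [{"_name": "node-1", "zone": "a"}, {"_name": "node-2", "zone": "b"}]
print(stale_filters(settings, nodes))
```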

The accept_data_loss Option — When and When Not to Use It

accept_data_loss: true is a last resort. It tells OpenSearch: "I know you can't find a copy of this shard's data, but assign it anyway — start fresh with an empty shard."

Use it only when:

  • All copies of a primary shard's data are permanently destroyed (the node is gone and there are no replicas)
  • You've verified there is genuinely no recoverable copy of the data
  • You accept that any documents on that shard are permanently lost
  • The data can be re-indexed from a source system (your database, S3, etc.)

Do not use it when:

  • The node is temporarily unavailable (wait for it to come back)
  • There might be a replica somewhere (check first)
  • You can restore from a snapshot (always prefer this)
# Only run this after exhausting all other options
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "your-index",
        "shard": 0,
        "node": "some-node",
        "accept_data_loss": true
      }
    }
  ]
}
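The checklist above can be expressed as a pre-flight gate: only proceed when every safer option is ruled out. A sketch of that gate — the inputs (shard rows from _cat/shards, plus two flags you verify by hand) are illustrative assumptions, not an API:

```python
# Sketch: a pre-flight check before allocate_empty_primary. Returns True
# only when every safer option is exhausted. Inputs are illustrative;
# the two boolean flags are things you verify manually.

def safe_to_accept_data_loss(shard_copies: list[dict],
                             node_is_returning: bool,
                             snapshot_available: bool) -> bool:
    """shard_copies: rows for this one shard from _cat/shards (all states)."""
    has_live_copy = any(c.get("state") == "STARTED" for c in shard_copies)
    return not (has_live_copy or node_is_returning or snapshot_available)

# A red primary with no surviving copy, no node on the way back, and no
# snapshot: the remaining options are an empty primary or re-indexing.
copies = [{"prirep": "p", "state": "UNASSIGNED"}]
print(safe_to_accept_data_loss(copies, node_is_returning=False,
                               snapshot_available=False))
```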

Prevention: How to Avoid Unassigned Shards

  • Always run at least 1 replica for every production index. Zero replicas means any node failure causes a red cluster.
  • Set a delayed_timeout of 5–10 minutes for time-series indices so rolling restarts don't trigger massive shard movement.
  • Alert on disk usage above 75% so you have time to react before the watermark kicks in.
  • Match replica count to node count: number_of_replicas should be number_of_nodes - 1 at most.
  • Take regular snapshots so if you do need to use accept_data_loss, you can restore from backup immediately after.


OpenSearch Doctor detects all of this automatically

A lightweight agent runs on your server, checks 50+ things, and tells you exactly what's wrong and how to fix it. Free for 1 cluster, no credit card.

Get started free →