What it checks

The agent runs 11 diagnostic categories against your OpenSearch cluster using only read-only API calls. Each category is independent — a failure in one never blocks the others.

What is never read: document data, search query content, index field values, or your credentials. See the agent README for the full guarantee.
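
The independence between categories can be pictured with a small sketch. This is not the agent's actual code: `run_checks` and the sample check functions are hypothetical names, and the only idea illustrated is that each category runs inside its own error boundary.

```python
from typing import Any, Callable, Dict


def run_checks(checks: Dict[str, Callable[[], Any]]) -> Dict[str, Dict[str, Any]]:
    """Run every check; a failing check is recorded instead of aborting the run."""
    results: Dict[str, Dict[str, Any]] = {}
    for name, check in checks.items():
        try:
            results[name] = {"ok": True, "data": check()}
        except Exception as exc:  # isolate the failure to this one category
            results[name] = {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return results


def _failing_check() -> Any:
    # Hypothetical check that simulates a request timing out.
    raise TimeoutError("node stats request timed out")


report = run_checks({
    "cluster_health": lambda: {"status": "green"},
    "node_resources": _failing_check,
    "shard_distribution": lambda: {"unassigned": 0},
})
```

In this sketch the `node_resources` entry carries the error text while the other two categories still report their data.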

Cluster health

GET /_cluster/health

What is collected

  • ·Cluster status (green / yellow / red)
  • ·Total node count and data node count
  • ·Active and unassigned shard counts
  • ·Number of pending cluster tasks

What it detects

  • ·Cluster status is RED — data may be unavailable
  • ·Cluster status is YELLOW — some replicas are unassigned
  • ·Unassigned shards detected with root-cause reasons
  • ·Pending cluster tasks backlog
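
As a rough illustration of how these findings could be derived from a parsed `GET /_cluster/health` response, here is a minimal sketch. The function name and the `max_pending_tasks` threshold are assumptions made for the example; the field names are those returned by the cluster health API.

```python
def cluster_health_findings(health: dict, max_pending_tasks: int = 0) -> list:
    """Turn a parsed /_cluster/health response into human-readable findings."""
    findings = []
    status = str(health.get("status", "")).lower()
    if status == "red":
        findings.append("RED: data may be unavailable")
    elif status == "yellow":
        findings.append("YELLOW: some replicas are unassigned")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        findings.append(f"{unassigned} unassigned shard(s)")
    pending = health.get("number_of_pending_tasks", 0)
    if pending > max_pending_tasks:  # threshold is an assumption, not the agent's
        findings.append(f"{pending} pending cluster task(s)")
    return findings


sample_health = {
    "status": "yellow",
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_shards": 40,
    "unassigned_shards": 5,
    "number_of_pending_tasks": 0,
}
health_findings = cluster_health_findings(sample_health)
```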

Node resources

GET /_nodes/stats

What is collected

  • ·JVM heap used % per node
  • ·CPU % per node
  • ·Disk used % and available bytes per node
  • ·OS memory used % per node
  • ·Node uptime in milliseconds
  • ·Node roles (master, data, ingest, etc.)

What it detects

  • ·JVM heap above 85% — risk of GC pressure and OOM
  • ·Disk above 85% (the default low watermark): OpenSearch stops allocating new shards to the node, and at the 95% flood stage it marks indices read-only
  • ·High CPU % sustained across nodes
  • ·Single-node cluster (no redundancy)

Shard distribution

GET /_cat/shards

What is collected

  • ·Count and reasons of unassigned shards
  • ·Shard count per node
  • ·Average shard size in bytes

What it detects

  • ·Unassigned shards with ALLOCATION_FAILED, NODE_LEFT, or other reasons
  • ·Heavily imbalanced shard distribution across nodes
  • ·Oversized shards (above 50 GB) that impact recovery time
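
A possible way to compute these figures from `GET /_cat/shards` is sketched below. It assumes the request was made with `format=json`, `bytes=b`, and a column list that includes `unassigned.reason`; the function name and the imbalance ratio of 2.0 are assumptions, and 50 GB is interpreted here as 50 × 1024³ bytes. Nodes holding no shards do not appear in this output, so the imbalance test only compares nodes that hold at least one shard.

```python
from collections import Counter

FIFTY_GB = 50 * 1024 ** 3  # assumed interpretation of "50 GB"


def shard_findings(rows: list, max_size_bytes: int = FIFTY_GB,
                   imbalance_ratio: float = 2.0) -> dict:
    """Summarise unassigned reasons, per-node counts, and oversized shards."""
    per_node = Counter()
    reasons = Counter()
    oversized = []
    for row in rows:
        if row["state"] == "UNASSIGNED":
            reasons[row.get("unassigned.reason") or "UNKNOWN"] += 1
            continue
        per_node[row["node"]] += 1
        size = int(row.get("store") or 0)
        if size > max_size_bytes:
            oversized.append((row["index"], row["shard"], size))
    counts = list(per_node.values())
    imbalanced = bool(counts) and max(counts) > imbalance_ratio * min(counts)
    return {
        "unassigned_reasons": dict(reasons),
        "shards_per_node": dict(per_node),
        "imbalanced": imbalanced,
        "oversized": oversized,
    }


sample_rows = [
    {"index": "logs", "shard": "0", "prirep": "p", "state": "STARTED",
     "store": "64424509440", "node": "node-a"},  # 60 GiB
    {"index": "logs", "shard": "1", "prirep": "p", "state": "STARTED",
     "store": "1048576", "node": "node-a"},
    {"index": "logs", "shard": "2", "prirep": "p", "state": "STARTED",
     "store": "1048576", "node": "node-a"},
    {"index": "logs", "shard": "0", "prirep": "r", "state": "STARTED",
     "store": "1048576", "node": "node-b"},
    {"index": "logs", "shard": "1", "prirep": "r", "state": "UNASSIGNED",
     "store": None, "node": None, "unassigned.reason": "NODE_LEFT"},
]
shard_summary = shard_findings(sample_rows)
```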

Index health

GET /_cat/indices, GET /_all/_settings

What is collected

  • ·Per-index health status, open/closed state
  • ·Primary shard and replica count
  • ·Document count and store size
  • ·Read-only block flags (index.blocks.read_only, index.blocks.read_only_allow_delete)

What it detects

  • ·Indices in red or yellow health
  • ·Indices with read-only block set (usually triggered by low disk space)
  • ·Closed indices consuming disk but not serving queries
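
The sketch below shows one way the two responses could be combined. It assumes `GET /_cat/indices` was requested with `format=json` and that the settings response has the usual `{index: {"settings": {"index": {...}}}}` nesting with string values; the function names are hypothetical.

```python
def _is_true(value) -> bool:
    # Settings come back as strings ("true"/"false"); accept booleans too.
    return str(value).lower() == "true"


def index_findings(cat_rows: list, settings: dict) -> list:
    """Return (index, issue) pairs for unhealthy, closed, or blocked indices."""
    findings = []
    for row in cat_rows:
        name = row["index"]
        if row.get("status") == "close":
            findings.append((name, "closed"))
        elif row.get("health") in ("red", "yellow"):
            findings.append((name, f"health {row['health']}"))
        index_settings = settings.get(name, {}).get("settings", {}).get("index", {})
        blocks = index_settings.get("blocks", {})
        for flag in ("read_only", "read_only_allow_delete"):
            if _is_true(blocks.get(flag)):
                findings.append((name, f"index.blocks.{flag}"))
    return findings


sample_cat = [
    {"index": "orders", "health": "green", "status": "open"},
    {"index": "logs-old", "health": "yellow", "status": "open"},
    {"index": "archive", "health": "", "status": "close"},
]
sample_settings = {
    "orders": {"settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}},
    "logs-old": {"settings": {"index": {"number_of_replicas": "1"}}},
}
index_issues = index_findings(sample_cat, sample_settings)
```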

Performance metrics

GET /_nodes/stats/indices,thread_pool

What is collected

  • ·Indexing rate and total count
  • ·Search query rate and average latency
  • ·Thread pool rejection counts (write/search)
  • ·Query cache hit rate
  • ·Fielddata eviction count
  • ·Total segment count and merge time

What it detects

  • ·Thread pool rejections: the write or search queue is full, so indexing or search requests are being rejected
  • ·High search latency indicating slow queries or under-provisioned nodes
  • ·Very high segment count (fragmentation) — force merge may help
  • ·Fielddata evictions indicating memory pressure
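
Most values in the node stats response are cumulative counters, so rates and average latency have to be derived from the difference between two samples. The following sketch assumes two per-node entries taken `interval_seconds` apart; the function and helper names are hypothetical, and how the agent actually samples is not stated in this document.

```python
def performance_summary(prev: dict, curr: dict, interval_seconds: float) -> dict:
    """Derive rates and latency from two samples of one node's stats."""
    p, c = prev["indices"], curr["indices"]
    d_index = c["indexing"]["index_total"] - p["indexing"]["index_total"]
    d_query = c["search"]["query_total"] - p["search"]["query_total"]
    d_query_ms = c["search"]["query_time_in_millis"] - p["search"]["query_time_in_millis"]
    cache = c["query_cache"]
    lookups = cache["hit_count"] + cache["miss_count"]
    pools_prev, pools_curr = prev["thread_pool"], curr["thread_pool"]
    return {
        "indexing_rate_per_s": d_index / interval_seconds,
        "search_rate_per_s": d_query / interval_seconds,
        "avg_query_latency_ms": d_query_ms / d_query if d_query else 0.0,
        "write_rejections": pools_curr["write"]["rejected"] - pools_prev["write"]["rejected"],
        "search_rejections": pools_curr["search"]["rejected"] - pools_prev["search"]["rejected"],
        "query_cache_hit_rate": cache["hit_count"] / lookups if lookups else 0.0,
        "fielddata_evictions": c["fielddata"]["evictions"] - p["fielddata"]["evictions"],
        "segment_count": c["segments"]["count"],
    }


def _sample(index_total, query_total, query_ms, write_rejected) -> dict:
    # Hypothetical helper: a trimmed-down node entry with only the fields used.
    return {
        "indices": {
            "indexing": {"index_total": index_total},
            "search": {"query_total": query_total, "query_time_in_millis": query_ms},
            "query_cache": {"hit_count": 750, "miss_count": 250},
            "fielddata": {"evictions": 4},
            "segments": {"count": 120},
        },
        "thread_pool": {"write": {"rejected": write_rejected}, "search": {"rejected": 3}},
    }


perf = performance_summary(_sample(1000, 500, 10_000, 0),
                           _sample(1600, 800, 25_000, 12), interval_seconds=60)
```

With these sample numbers the write pool rejected 12 requests during the interval while search rejected none, which is the kind of difference a single cumulative reading would hide.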

Snapshot health

GET /_snapshot, GET /_snapshot/_all/_all

What is collected

  • ·Number of snapshot repositories configured
  • ·Timestamp of last successful snapshot
  • ·Number of failed snapshots in the last 7 days

What it detects

  • ·No snapshot repository configured — no backup in place
  • ·No successful snapshot in the past 7 days
  • ·Repeated snapshot failures in the past week
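
A minimal sketch of the seven-day logic follows. It assumes the repository map from `GET /_snapshot` and a flat list of snapshot entries carrying `state` and `end_time_in_millis`; the function name is hypothetical, "repeated" is taken to mean two or more, and only the `FAILED` state is counted (whether the agent also counts `PARTIAL` snapshots is not stated here).

```python
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000


def snapshot_findings(repositories: dict, snapshots: list, now_ms: int) -> list:
    """Report missing repositories, stale backups, and repeated failures."""
    if not repositories:
        return ["no snapshot repository configured"]
    findings = []
    cutoff = now_ms - SEVEN_DAYS_MS
    successes = [s["end_time_in_millis"] for s in snapshots if s["state"] == "SUCCESS"]
    last_success = max(successes, default=None)
    if last_success is None or last_success < cutoff:
        findings.append("no successful snapshot in the past 7 days")
    failed = sum(1 for s in snapshots
                 if s["state"] == "FAILED" and s["end_time_in_millis"] >= cutoff)
    if failed >= 2:  # "repeated" threshold is an assumption
        findings.append(f"{failed} failed snapshots in the past 7 days")
    return findings


DAY_MS = 24 * 60 * 60 * 1000
now = 1_700_000_000_000  # fixed timestamp so the example is reproducible
sample_repos = {"nightly": {"type": "fs", "settings": {"location": "/backups"}}}
sample_snapshots = [
    {"snapshot": "snap-1", "state": "SUCCESS", "end_time_in_millis": now - 10 * DAY_MS},
    {"snapshot": "snap-2", "state": "FAILED", "end_time_in_millis": now - 2 * DAY_MS},
    {"snapshot": "snap-3", "state": "FAILED", "end_time_in_millis": now - 1 * DAY_MS},
]
snapshot_issues = snapshot_findings(sample_repos, sample_snapshots, now)
```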

ISM policies

GET /_plugins/_ism/policies, GET /_plugins/_ism/explain/*

What is collected

  • ·Total ISM policy count
  • ·Indices without an assigned ISM policy
  • ·Indices with ISM execution errors

What it detects

  • ·Indices with ISM errors — lifecycle management is broken for those indices
  • ·Growing indices without any lifecycle policy (risk of unbounded disk growth)
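
The sketch below shows how an explain response might be split into unmanaged and errored indices. The response shape is an assumption and may differ between plugin versions: it is taken to be one entry per index with a `policy_id` that is null when no policy is attached, `action.failed` and `retry_info.failed` flags, and a top-level `total_managed_indices` counter. The function name is hypothetical.

```python
def ism_findings(explain: dict) -> dict:
    """Split indices into those without a policy and those with a failed step."""
    unmanaged, errored = [], []
    for index, info in explain.items():
        if not isinstance(info, dict):
            continue  # skip scalar entries such as total_managed_indices
        policy = (info.get("policy_id")
                  or info.get("index.plugins.index_state_management.policy_id"))
        if not policy:
            unmanaged.append(index)
            continue
        action = info.get("action") or {}
        retry = info.get("retry_info") or {}
        if action.get("failed") or retry.get("failed"):
            errored.append(index)
    return {"unmanaged": sorted(unmanaged), "errored": sorted(errored)}


# Assumed response shape, trimmed to the fields the sketch reads.
sample_explain = {
    "logs-2024": {"policy_id": "rollover",
                  "action": {"name": "rollover", "failed": True},
                  "retry_info": {"failed": True, "consumed_retries": 3}},
    "metrics-1": {"policy_id": "rollover",
                  "action": {"name": "rollover", "failed": False},
                  "retry_info": {"failed": False, "consumed_retries": 0}},
    "scratch": {"index.plugins.index_state_management.policy_id": None,
                "enabled": None},
    "total_managed_indices": 2,
}
ism_summary = ism_findings(sample_explain)
```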

Security configuration

GET /_plugins/_security/api/*

What is collected

  • ·TLS enabled on HTTP and transport layers
  • ·Audit logging enabled
  • ·Anonymous access enabled
  • ·Authentication backend configured

What it detects

  • ·TLS not enabled on HTTP: REST traffic between clients and the cluster is unencrypted
  • ·Anonymous access enabled — anyone can query without credentials
  • ·Audit logging disabled — no record of who did what

Installed plugins

GET /_cat/plugins

What is collected

  • ·All installed plugin names and versions
  • ·OpenSearch version number

What it detects

  • ·No findings of its own. The plugin list is used to cross-reference other checks (e.g. ISM checks only apply if the ISM plugin is installed)

Ingest pipelines

GET /_ingest/pipeline, GET /_all/_settings

What is collected

  • ·Total ingest pipeline count
  • ·Pipelines not referenced by any index (orphaned)

What it detects

  • ·Orphaned pipelines — configured but not used by any index, causing confusion
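
One way to find orphaned pipelines is to compare the pipeline IDs with the `index.default_pipeline` and `index.final_pipeline` settings of every index, as sketched below. The function name is hypothetical. Settings alone cannot see pipelines that are only named in a request's `pipeline` parameter or invoked from another pipeline, so a result from this sketch is a hint rather than proof that a pipeline is unused.

```python
def orphaned_pipelines(pipelines: dict, settings: dict) -> list:
    """Return pipeline IDs that no index names as default or final pipeline."""
    referenced = set()
    for body in settings.values():
        index_settings = body.get("settings", {}).get("index", {})
        for key in ("default_pipeline", "final_pipeline"):
            value = index_settings.get(key)
            if value and value != "_none":  # "_none" means no pipeline
                referenced.add(value)
    return sorted(set(pipelines) - referenced)


sample_pipelines = {
    "geoip-enrich": {"description": "add geo fields", "processors": []},
    "audit-stamp": {"description": "add audit timestamp", "processors": []},
    "legacy-cleanup": {"description": "old cleanup", "processors": []},
}
sample_index_settings = {
    "web-logs": {"settings": {"index": {"default_pipeline": "geoip-enrich",
                                        "final_pipeline": "audit-stamp"}}},
    "metrics": {"settings": {"index": {"default_pipeline": "_none"}}},
}
orphans = orphaned_pipelines(sample_pipelines, sample_index_settings)
```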

Index templates

GET /_index_template, GET /_cat/indices

What is collected

  • ·Total index template count
  • ·Templates with overlapping index patterns at the same priority
  • ·Templates that don't match any existing index

What it detects

  • ·Overlapping templates at the same priority — OpenSearch picks one arbitrarily
  • ·Unused templates — leftover from old indices, may cause confusion
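
Both template checks can be approximated by matching template patterns against existing index names, as in the sketch below. It is a heuristic, not necessarily the agent's method: two templates are reported as overlapping only when they share a priority and at least one existing index matches both. The function name is hypothetical, the response is assumed to have the `index_templates` list returned by `GET /_index_template`, and a missing priority is treated as 0.

```python
from fnmatch import fnmatchcase
from itertools import combinations


def template_findings(templates_response: dict, index_names: list) -> dict:
    """Find templates matching no index and same-priority pairs sharing an index."""
    matched = {}
    for entry in templates_response.get("index_templates", []):
        body = entry["index_template"]
        patterns = body.get("index_patterns", [])
        hits = {name for name in index_names
                if any(fnmatchcase(name, pattern) for pattern in patterns)}
        matched[entry["name"]] = (body.get("priority", 0), hits)
    unused = sorted(name for name, (_, hits) in matched.items() if not hits)
    overlapping = []
    for a, b in combinations(sorted(matched), 2):
        (prio_a, hits_a), (prio_b, hits_b) = matched[a], matched[b]
        if prio_a == prio_b and hits_a & hits_b:
            overlapping.append((a, b))
    return {"unused": unused, "overlapping": overlapping}


sample_templates = {"index_templates": [
    {"name": "logs", "index_template": {"index_patterns": ["logs-*"], "priority": 100}},
    {"name": "logs-app", "index_template": {"index_patterns": ["logs-app-*"], "priority": 100}},
    {"name": "metrics", "index_template": {"index_patterns": ["metrics-*"], "priority": 50}},
    {"name": "old", "index_template": {"index_patterns": ["old-*"]}},
]}
sample_indices = ["logs-app-2024", "logs-sys-2024", "metrics-1"]
template_summary = template_findings(sample_templates, sample_indices)
```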