Overview
Relevant Files
- README.asciidoc
- server/src/main/java/org/elasticsearch/package-info.java
- server/src/main/java/org/elasticsearch/bootstrap/Elasticsearch.java
- server/src/main/java/org/elasticsearch/node/Node.java
- server/src/main/java/org/elasticsearch/search/SearchModule.java
- server/src/main/java/org/elasticsearch/index/IndexService.java
Elasticsearch is a distributed search and analytics engine built on Apache Lucene, optimized for speed and relevance on production-scale workloads. It serves as a scalable data store and vector database supporting full-text search, logs, metrics, APM, and security analytics.
Core Architecture
The system is organized into several key layers:
Bootstrap & Node Initialization (org.elasticsearch.bootstrap)
The Elasticsearch class manages a three-phase startup process. Phase 1 initializes logging and reads CLI arguments. Phase 2 sets up security, native libraries, and entitlements. Phase 3 constructs the Node and starts accepting requests. This design ensures proper initialization order and security enforcement.
Node & Cluster Management (org.elasticsearch.node, org.elasticsearch.cluster)
The Node class represents a single Elasticsearch instance and coordinates with the cluster. It manages lifecycle components, plugins, and services. The cluster layer handles distributed state, shard allocation, and coordination across nodes.
Indexing & Search (org.elasticsearch.index, org.elasticsearch.search)
IndexService manages individual indices and their shards. SearchModule registers query parsers, aggregations, suggesters, and other search components. The search pipeline processes queries through multiple phases: query, fetch, and aggregation.
REST API & Transport (org.elasticsearch.rest, org.elasticsearch.transport)
REST handlers expose HTTP endpoints for client interaction. The transport layer enables node-to-node communication using a custom binary protocol with compression support.
Key Components
┌─────────────────────────────────────────────────────┐
│ REST API Layer (HTTP Endpoints) │
├─────────────────────────────────────────────────────┤
│ Action Framework (Requests & Responses) │
├─────────────────────────────────────────────────────┤
│ Search Module | Index Service | Cluster Service │
├─────────────────────────────────────────────────────┤
│ Lucene Integration | Storage Engine | Shards │
├─────────────────────────────────────────────────────┤
│ Transport Layer | Cluster Coordination │
└─────────────────────────────────────────────────────┘
Plugins System (org.elasticsearch.plugins)
Elasticsearch uses a modular plugin architecture. Plugins can extend search functionality, add analysis components, implement custom repositories, or provide transport implementations. The PluginsLoader manages plugin discovery and initialization.
Data Management
- Indices & Shards: Data is organized into indices, which are split into shards for distribution
- Mappings: Define field types and analysis rules
- Ingest Pipelines: Transform data before indexing
- Snapshots & Repositories: Backup and restore functionality
Module Organization
The codebase is organized as Java modules:
- org.elasticsearch.server - Core server functionality
- org.elasticsearch.base - Foundational utilities
- org.elasticsearch.xcontent - Content serialization
- org.elasticsearch.plugin - Plugin infrastructure
- org.elasticsearch.xcore - X-Pack core features
Request Processing Flow
- HTTP request arrives at REST handler
- Request is parsed and validated
- Action is dispatched to appropriate handler
- Handler coordinates with cluster/index services
- Lucene performs actual search/indexing
- Results are aggregated and returned
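The same flow can be triggered from inside the node through the internal client that REST handlers use. The sketch below is illustrative only: the client's package differs between releases and the index name is made up.

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.internal.Client;

void runSearch(Client client) {
    SearchRequest request = new SearchRequest("my-index");      // hypothetical index
    client.search(request, ActionListener.wrap(
        (SearchResponse response) -> System.out.println(response.getHits().getTotalHits()),
        e -> e.printStackTrace()                                 // failures arrive via onFailure
    ));
}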
The architecture emphasizes scalability, fault tolerance, and extensibility through plugins and modular design.
Architecture & Core Components
Relevant Files
- server/src/main/java/org/elasticsearch/node/Node.java
- server/src/main/java/org/elasticsearch/node/NodeConstruction.java
- server/src/main/java/org/elasticsearch/cluster/ClusterState.java
- server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java
- server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
- server/src/main/java/org/elasticsearch/cluster/service/ClusterService.java
Elasticsearch is built on a distributed architecture where multiple nodes coordinate to form a cluster. The core components work together to maintain cluster state, elect a master node, and apply state changes consistently across all nodes.
Node Lifecycle
The Node class represents a single Elasticsearch instance. During startup, it:
- Initializes plugins and services via dependency injection
- Creates core components: ClusterService, Coordinator, TransportService
- Starts lifecycle components in a specific order to ensure proper initialization
- Begins cluster discovery and joins the cluster
// Node startup sequence
Node node = new Node(environment, pluginsLoader);
node.start(); // Starts all lifecycle components
Cluster State Management
ClusterState is the immutable, distributed state held on all nodes. It contains:
- Metadata: Index definitions, settings, templates (persisted to disk)
- RoutingTable: Shard allocation across nodes
- DiscoveryNodes: Cluster membership information
- ClusterBlocks: Blocks preventing operations (e.g., during recovery)
- Custom: Plugin-specific state extensions
The metadata portion persists across full-cluster restarts, while routing and node information is reset on a full restart but survives master elections.
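For orientation, the snippet below reads those pieces off a cluster state snapshot. It assumes the accessor names found on ClusterState in recent releases (metadata(), routingTable(), nodes(), blocks()).

import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.block.ClusterBlockLevel;
import org.elasticsearch.cluster.metadata.Metadata;
import org.elasticsearch.cluster.node.DiscoveryNodes;
import org.elasticsearch.cluster.routing.RoutingTable;
import org.elasticsearch.cluster.service.ClusterService;

void inspect(ClusterService clusterService) {
    ClusterState state = clusterService.state();        // immutable snapshot held on this node
    Metadata metadata = state.metadata();                // index definitions, settings, templates
    RoutingTable routing = state.routingTable();         // shard-to-node assignments
    DiscoveryNodes nodes = state.nodes();                // cluster membership
    boolean writesBlocked =
        state.blocks().hasGlobalBlockWithLevel(ClusterBlockLevel.METADATA_WRITE);
}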
Master Election & Coordination
The Coordinator implements a Raft-like consensus protocol for master election:
- Manages voting configurations and term numbers
- Handles peer discovery and cluster bootstrap
- Publishes new cluster states to all nodes
- Ensures only one master leads at a time
Master Service & State Updates
The MasterService processes cluster state updates on the master node:
- Receives tasks from clients and internal services
- Batches tasks by priority (URGENT, HIGH, NORMAL, LOW)
- Executes tasks sequentially to compute new cluster state
- Publishes the new state via Coordinator
Tasks are submitted via ClusterService#createTaskQueue() and processed in priority order. Higher-priority tasks can starve lower-priority ones, so use Priority.NORMAL by default.
ClusterService Integration
ClusterService is the main entry point for cluster operations:
- Wraps MasterService (state updates) and ClusterApplierService (state application)
- Provides a state() method to read the current cluster state
- Manages listeners for state changes
- Routes operations to appropriate nodes
// Submit a cluster state update
clusterService.submitStateUpdateTask("task-name", new ClusterStateUpdateTask() {
    @Override
    public ClusterState execute(ClusterState currentState) {
        // Compute and return the new state
        return newState;
    }

    @Override
    public void onFailure(Exception e) {
        // Called if the task could not be executed or published
    }
});
State Application Pipeline
When a new cluster state is published:
- Coordinator receives acknowledgments from a quorum of nodes
- State is committed and applied via ClusterApplierService
- ClusterStateApplier callbacks execute (in no particular order)
- ClusterStateListener callbacks execute after the state is updated
- Services react to changes (e.g., IndicesClusterStateService allocates shards)
This ensures all nodes eventually converge to the same state while allowing services to react appropriately.
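Services typically hook into this pipeline by registering a listener with ClusterService; a minimal sketch:

import org.elasticsearch.cluster.ClusterChangedEvent;
import org.elasticsearch.cluster.service.ClusterService;

void watchClusterState(ClusterService clusterService) {
    clusterService.addListener((ClusterChangedEvent event) -> {
        // runs after the new state has been applied on this node
        if (event.routingTableChanged()) {
            // react to shard relocations, new indices, etc.
        }
    });
}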
Cluster Coordination & State Management
Relevant Files
- server/src/main/java/org/elasticsearch/cluster/coordination/CoordinationState.java
- server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java
- server/src/main/java/org/elasticsearch/cluster/coordination/Publication.java
- server/src/main/java/org/elasticsearch/cluster/coordination/MasterHistory.java
- server/src/main/java/org/elasticsearch/cluster/coordination/ClusterFormationFailureHelper.java
- server/src/main/java/org/elasticsearch/indices/cluster/IndicesClusterStateService.java
Elasticsearch uses a formal consensus algorithm to coordinate cluster state across all nodes. The coordination layer ensures that all nodes agree on the current cluster state and that changes are applied consistently.
Core Coordination Algorithm
CoordinationState is the heart of the coordination system, directly implementing a formal model based on Raft-like consensus. It maintains two types of state:
- Persisted State: Current term and last-accepted cluster state (written to disk)
- Transient State: Join votes, election status, and publish votes (in-memory only)
The algorithm uses terms (monotonically increasing numbers) to detect stale leaders and voting configurations to determine quorum requirements. A node wins an election when it receives votes from a quorum of master-eligible nodes.
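A simplified, self-contained model of the quorum rule (not the real CoordinationState, which also tracks terms and persisted state):

import java.util.Set;

record VotingConfiguration(Set<String> nodeIds) {
    // a candidate is elected once a strict majority of the voting configuration has joined it
    boolean hasQuorum(Set<String> joinVotes) {
        long granted = joinVotes.stream().filter(nodeIds::contains).count();
        return granted * 2 > nodeIds.size();
    }
}

// Example: with three master-eligible voters, two join votes are enough.
// new VotingConfiguration(Set.of("a", "b", "c")).hasQuorum(Set.of("a", "c")) == true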
Master Election & Joining
The Coordinator orchestrates the election process:
- Nodes start as CANDIDATE and request joins from peers
- A candidate becomes LEADER when it wins an election (receives quorum of join votes)
- Followers periodically check that the leader is still healthy; if the leader stops responding, they become candidates again
VotingConfiguration defines which nodes can vote. It supports dynamic reconfiguration via the Reconfigurator, which automatically shrinks the voting set when nodes leave (if enabled).
Cluster State Publication
When the master publishes a new cluster state:
Master → All Nodes: PublishRequest (with cluster state)
↓
All Nodes: Accept & persist state locally
↓
Master: Wait for quorum of PublishResponse
↓
Master → All Nodes: ApplyCommitRequest
↓
All Nodes: Apply state to local data structures
Publication manages this multi-phase process. The master waits for a quorum of nodes to acknowledge acceptance before committing. Once committed, ClusterStateApplier implementations (like IndicesClusterStateService) apply the state to indices, shards, and routing tables.
State Application
IndicesClusterStateService applies cluster state changes to local indices:
- Creates/deletes indices and shards based on routing assignments
- Initiates shard recovery when replicas are assigned
- Closes shards no longer assigned to this node
- Handles shard failures and recovery state transitions
Monitoring & Diagnostics
- MasterHistory tracks the last 30 minutes of master node changes (in-memory)
- ClusterFormationFailureHelper logs debug information when cluster formation fails
- LeaderChecker and FollowersChecker monitor leader-follower connectivity
Indexing & Data Storage
Relevant Files
- server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
- server/src/main/java/org/elasticsearch/index/engine/Engine.java
- server/src/main/java/org/elasticsearch/index/store/Store.java
- server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
- server/src/main/java/org/elasticsearch/gateway/PersistedClusterStateService.java
Elasticsearch organizes data storage around shards, which are the fundamental unit of data distribution and persistence. Each shard contains a Lucene index backed by a Store that manages file-level access to the underlying file system.
Core Storage Architecture
The storage hierarchy flows from top to bottom:
- IndexShard — Manages a single shard replica, coordinating indexing, searching, and recovery
- Engine — Handles document indexing, deletion, and search operations using Lucene
- Store — Provides low-level file access to Lucene's Directory abstraction
- Directory — Lucene's file abstraction layer (NIOFSDirectory, MMapDirectory, etc.)
Engine & Translog
The Engine is the core write path. It processes index and delete operations, maintaining:
- In-memory buffer — Accumulates document changes before flushing
- Translog — Write-ahead log ensuring durability; survives crashes
- Lucene segments — Immutable index structures created during flush/commit
When you index a document, the Engine applies the change to the in-memory buffer and appends it to the translog, so the operation survives a crash before the request is acknowledged. Periodically, the buffer is flushed to disk as Lucene segments and the translog is rolled over to a new generation.
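A toy model of this write path (deliberately not the real Engine or Translog classes): each operation lands in a write-ahead log and an in-memory buffer, and a flush turns the buffer into an immutable segment and starts a new translog generation.

import java.util.ArrayList;
import java.util.List;

final class ToyEngine {
    private final List<String> translog = new ArrayList<>();       // stand-in for the write-ahead log
    private final List<String> buffer = new ArrayList<>();         // in-memory, awaiting the next flush
    private final List<List<String>> segments = new ArrayList<>(); // immutable once written

    void index(String doc) {
        translog.add(doc);   // durable record of the operation
        buffer.add(doc);     // buffered change
    }

    void flush() {
        segments.add(List.copyOf(buffer)); // write the buffer out as an immutable "segment"
        buffer.clear();
        translog.clear();                  // roll to a new translog generation
    }
}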
Store & File Management
The Store wraps Lucene's Directory and provides:
- MetadataSnapshot — Tracks committed files with checksums and metadata
- File versioning — Manages segment files, translog files, and state files
- Checksum validation — Ensures data integrity during recovery and replication
Each shard has a dedicated directory on disk containing:
indices/
{index-uuid}/
{shard-id}/
index/ # Lucene segments
translog/ # Write-ahead logs
state/ # Shard state metadata
DataStreams & Time-Series Indexing
DataStreams provide a higher-level abstraction for time-series data. They automatically manage backing indices and handle rollover:
- Routes writes to the current write index (newest backing index)
- Automatically creates new backing indices based on time or size
- Supports failure stores for capturing indexing errors
- Optimizes search by sorting leaf readers by timestamp descending
Cluster State Persistence
The PersistedClusterStateService stores cluster metadata in a bare Lucene index (one per data path) with a specialized schema:
- Global metadata — Cluster-wide settings and coordination state
- Index metadata — Per-index settings and mappings
- Mapping metadata — Cached mapping hashes for deduplication
Large documents are split across pages to respect size limits. Each commit records the current term, last-accepted version, and node identity in user data.
Recovery & Consistency
During shard recovery, Elasticsearch:
- Reads the latest Lucene commit from the Store
- Replays translog entries to recover uncommitted operations
- Validates checksums and segment integrity
- Synchronizes global checkpoint across replicas
This ensures that even after crashes, no committed data is lost and replicas converge to a consistent state.
Search Query Execution Pipeline
Relevant Files
- server/src/main/java/org/elasticsearch/action/search/TransportSearchAction.java
- server/src/main/java/org/elasticsearch/search/SearchService.java
- server/src/main/java/org/elasticsearch/search/query/QueryPhase.java
- server/src/main/java/org/elasticsearch/search/dfs/DfsPhase.java
- server/src/main/java/org/elasticsearch/action/search/AbstractSearchAsyncAction.java
A search request flows through multiple coordinated phases on the coordinating node and shard nodes. The coordinating node fans out requests to all relevant shards, collects results, and merges them into a final response.
High-Level Flow
At a high level, the coordinating node resolves the target shards, fans the request out (optionally a DFS round first, then the query phase, then the fetch phase), and merges the per-shard results into the final response.
Phase Execution
DFS Phase (optional, for accuracy): Collects distributed term frequencies from all shards. This ensures scoring is consistent across shards by gathering statistics about term distribution. Executed by DfsPhase.execute() on each shard, results aggregated on coordinator.
Query Phase: Each shard executes the query using Lucene and returns top matching documents (IDs and scores). QueryPhase.execute() runs on each shard, using ContextIndexSearcher to search the index. Results are TopDocs objects containing ScoreDoc arrays with document IDs and relevance scores.
Fetch Phase: The coordinator selects which documents to fetch based on merged query results, then requests full document content from shards. FetchPhase.execute() retrieves stored fields and reconstructs SearchHit objects.
Key Components
TransportSearchAction: Entry point for search requests. Resolves indices, determines shard targets, and delegates to appropriate search phase implementation (SearchQueryThenFetchAsyncAction or SearchDfsQueryThenFetchAsyncAction).
AbstractSearchAsyncAction: Base class for fan-out logic. Manages concurrent requests to all shards, handles failures with retry logic, and collects results using SearchPhaseResults.
SearchService: Executes individual phases on shard nodes. Methods like executeQueryPhase() and executeFetchPhase() create search contexts, run phase logic, and return results.
QueryPhaseCollectorManager: Creates Lucene collectors for the query phase. Manages top docs collection, aggregations, and post-filtering. Reduces results from multiple segments into final TopDocs.
Result Aggregation
Query results from all shards are merged by SearchPhaseController. TopDocs from each shard are combined, sorted, and truncated to the requested page size. Aggregations are reduced across shards. The fetch phase then retrieves full documents for the final result set.
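As an illustration of the merge step, Lucene's own TopDocs.merge combines per-shard top hits into one globally ordered list; SearchPhaseController layers pagination, sorting, and aggregation reduction on top of the same idea. The sketch assumes each ScoreDoc already has its shard index set.

import org.apache.lucene.search.TopDocs;

TopDocs mergeShardHits(TopDocs[] perShardTopDocs) {
    // keep the 10 best hits across all shards, ordered by score
    return TopDocs.merge(10, perShardTopDocs);
}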
Error Handling
Shard failures are tracked separately. If a shard fails, the action retries on replica shards. Partial results are allowed if allow_partial_search_results is true. Timeouts are handled gracefully, returning results collected so far.
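Partial-result behavior can also be set per request; the flag below mirrors the allow_partial_search_results query parameter (the index pattern is made up):

import org.elasticsearch.action.search.SearchRequest;

SearchRequest request = new SearchRequest("logs-*");   // hypothetical index pattern
request.allowPartialSearchResults(false);              // fail the request if any shard fails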
REST API & Request Handling
Relevant Files
- server/src/main/java/org/elasticsearch/rest/RestHandler.java
- server/src/main/java/org/elasticsearch/rest/BaseRestHandler.java
- server/src/main/java/org/elasticsearch/rest/RestController.java
- server/src/main/java/org/elasticsearch/rest/RestChannel.java
- server/src/main/java/org/elasticsearch/rest/RestResponse.java
- server/src/main/java/org/elasticsearch/rest/action/search/RestSearchAction.java
- server/src/main/java/org/elasticsearch/action/ActionModule.java
Overview
Elasticsearch's REST API layer provides the HTTP interface for all client interactions. The architecture follows a handler-based pattern where HTTP requests are routed to specialized handlers that process them and return responses. This system supports versioning, deprecation tracking, parameter validation, and content negotiation.
Request Flow
An incoming HTTP request is handed to the RestController, which matches the path against registered routes, validates parameters and content type, and dispatches to the owning handler; the handler's RestChannelConsumer executes the action and the result is written back through the RestChannel.
Core Components
RestHandler Interface
The RestHandler interface is the foundation for all REST endpoints. Implementations define:
- handleRequest() - Main entry point for processing requests
- routes() - List of HTTP method + path combinations this handler supports
- supportedQueryParameters() - Set of allowed query parameters
- getServerlessScope() - Visibility in serverless environments
BaseRestHandler Abstract Class
Most handlers extend BaseRestHandler, which provides:
- Parameter validation (checks for unsupported parameters)
- Content-type validation
- Circuit breaker integration for memory protection
- System index access control
- Automatic usage tracking
Implementations override prepareRequest() to parse the request and return a RestChannelConsumer for async execution.
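A minimal handler following that contract might look like the sketch below. Class and constructor names match recent releases (RestResponse with a String body; older releases use BytesRestResponse), and the /_hello endpoint is invented for illustration.

import java.util.List;
import org.elasticsearch.client.internal.node.NodeClient;
import org.elasticsearch.rest.BaseRestHandler;
import org.elasticsearch.rest.RestRequest;
import org.elasticsearch.rest.RestResponse;
import org.elasticsearch.rest.RestStatus;

public class RestHelloAction extends BaseRestHandler {

    @Override
    public String getName() {
        return "hello_action";                          // used for usage tracking
    }

    @Override
    public List<Route> routes() {
        return List.of(new Route(RestRequest.Method.GET, "/_hello"));
    }

    @Override
    protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) {
        String name = request.param("name", "world");   // read (and consume) a query parameter
        return channel -> channel.sendResponse(new RestResponse(RestStatus.OK, "hello " + name));
    }
}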
RestController
The RestController is the central dispatcher that:
- Registers all REST handlers via registerHandler()
- Routes incoming requests using a PathTrie for efficient path matching
- Handles content aggregation for non-streaming requests
- Manages error responses and fallback behavior
- Tracks HTTP route statistics
RestChannel & RestResponse
RestChannel provides methods to build responses:
- newBuilder() - Creates an XContentBuilder for structured responses
- sendResponse() - Sends the response back to the client
RestResponse encapsulates the HTTP response with status code, content, headers, and optional chunked streaming support.
Route Definition & Deprecation
Routes are defined using the fluent Route.builder() API:
Route.builder(GET, "/_search")
.replaces(GET, "/_old_search", RestApiVersion.V_7)
.build()
Supported patterns:
- New routes - new Route(GET, "/_search")
- Deprecated for removal - .deprecatedForRemoval(message, lastVersion)
- Replaced routes - .replaces(oldMethod, oldPath, lastVersion)
- Kept but deprecated - .deprecateAndKeep(message)
Request Handling Lifecycle
- Registration - Handlers register routes during module initialization
- Routing - RestController.dispatchRequest() matches the request path
- Validation - Parameter and content-type checks occur
- Preparation - prepareRequest() parses the request body and parameters
- Execution - The returned RestChannelConsumer executes the action
- Response - Results are serialized and sent via RestChannel.sendResponse()
Content Negotiation
Response format is determined by:
- format query parameter (e.g., ?format=json)
- HTTP Accept header
- Request Content-Type (fallback)
- Default to JSON
The XContentBuilder automatically handles serialization to the negotiated format (JSON, YAML, CBOR, etc.).
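A hedged sketch of building a negotiated response (constructor and package names follow recent releases and may differ in older ones):

import java.io.IOException;
import org.elasticsearch.rest.RestChannel;
import org.elasticsearch.rest.RestResponse;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.xcontent.XContentBuilder;

void sendAck(RestChannel channel) throws IOException {
    XContentBuilder builder = channel.newBuilder();      // format chosen by content negotiation
    builder.startObject().field("acknowledged", true).endObject();
    channel.sendResponse(new RestResponse(RestStatus.OK, builder));
}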
Error Handling
Errors are caught at the RestController level and converted to RestResponse objects with appropriate HTTP status codes. The RestChannel.newErrorBuilder() method creates error responses with optional stack traces (controlled by error_trace parameter).
Actions & Transport Layer
Relevant Files
- server/src/main/java/org/elasticsearch/action/package-info.java
- server/src/main/java/org/elasticsearch/action/ActionRequest.java
- server/src/main/java/org/elasticsearch/action/ActionResponse.java
- server/src/main/java/org/elasticsearch/action/ActionType.java
- server/src/main/java/org/elasticsearch/action/ActionListener.java
- server/src/main/java/org/elasticsearch/action/support/TransportAction.java
- server/src/main/java/org/elasticsearch/action/support/HandledTransportAction.java
- server/src/main/java/org/elasticsearch/action/search/SearchTransportService.java
- server/src/main/java/org/elasticsearch/tasks/package-info.java
Elasticsearch's action and transport layer forms the backbone of inter-node communication and request handling. It provides a unified framework for executing operations across the cluster while maintaining security boundaries and enabling asynchronous, non-blocking execution.
Core Concepts
ActionType is the unique identifier for an action: a small class wrapping the action's name string. Each action (search, index, bulk, etc.) has a corresponding ActionType instance that serves as its name and type marker. Actions are registered in ActionModule#setupActions and can be contributed by plugins via ActionPlugin#getActions.
ActionRequest and ActionResponse are the serializable request and response objects. All requests extend ActionRequest and must implement validate() to check preconditions. Responses extend ActionResponse and are serialized for transmission over the network. Both inherit from transport-level classes and support binary serialization via StreamInput/StreamOutput.
ActionListener is a callback interface implementing continuation-passing style (CPS). It has two methods: onResponse(Response) for successful completion and onFailure(Exception) for errors. Listeners can be composed, mapped, and wrapped to build complex asynchronous workflows without blocking threads.
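A small sketch of that composition style, using ActionListener.wrap and the instance map method found in recent releases:

import org.elasticsearch.action.ActionListener;

ActionListener<Long> lengthListener = ActionListener.wrap(
    length -> System.out.println("length = " + length),   // onResponse
    e -> e.printStackTrace()                               // onFailure
);

// Adapt it into a listener that accepts a String and forwards the mapped value.
ActionListener<String> stringListener = lengthListener.map(s -> (long) s.length());
stringListener.onResponse("hello");   // prints "length = 5"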
Transport Action Execution
TransportAction is the base class for all action implementations. It handles:
- Request validation before execution
- Filtering via the ActionFilter chain (security, logging, etc.)
- Task registration and tracking
- Executor selection (direct or forked execution)
- Result storage if requested
The execute() method validates the request, applies filters, and delegates to doExecute() (implemented by subclasses). Execution can run directly on the calling thread or fork to a thread pool based on the configured executor.
HandledTransportAction extends TransportAction and automatically registers itself with TransportService during construction. This registration enables remote nodes to send requests to this action by name. The handler wraps incoming requests in a ChannelActionListener that sends responses back over the network.
Request/Response Flow
A request is validated, passed through the ActionFilter chain, and executed by the transport action; for remote execution the request is serialized, sent to the target node by action name, handled there, and the response travels back to the caller's ActionListener over the same channel.
Search Transport Service Example
SearchTransportService demonstrates specialized transport handling for search operations. It registers handlers for multiple search phases (DFS, Query, Fetch, Can-Match) with the transport layer. Each handler:
- Receives a ShardSearchRequest or similar
- Executes the corresponding search phase via SearchService
- Wraps the listener in ConnectionCountingHandler to track pending requests per node
- Sends results back through the channel
The service also manages search context cleanup via sendFreeContext() and scroll operations, showing how transport actions coordinate complex multi-phase workflows.
Task Integration
Actions integrate with Elasticsearch's task management system. When a request has getShouldStoreResult() == true, the action wraps the listener in TaskResultStoringActionListener, which persists task results to an index. Tasks are registered with TaskManager and can be monitored, cancelled, or queried via dedicated admin APIs.
Key Patterns
- Serialization: All requests/responses must be serializable via writeTo(StreamOutput) and have a constructor accepting StreamInput (see the sketch below)
- Validation: Request validation happens early in the execution chain and short-circuits on errors
- Async Composition: Listeners enable chaining operations without blocking; use RefCountingListener for parallel work
- Security Boundary: TransportAction enforces authorization via the filter chain before executing business logic
- Executor Control: Actions specify their execution context (direct, search, generic, etc.) to optimize thread pool usage
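The serialization contract mentioned above, sketched with a hypothetical request class (EchoRequest is invented for illustration): fields are written in writeTo and read back, in the same order, by the StreamInput constructor.

import java.io.IOException;
import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.action.ValidateActions;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;

public class EchoRequest extends ActionRequest {     // hypothetical example request

    private final String message;

    public EchoRequest(String message) {
        this.message = message;
    }

    // Deserializing constructor: read fields in exactly the order writeTo wrote them.
    public EchoRequest(StreamInput in) throws IOException {
        super(in);
        this.message = in.readString();
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        super.writeTo(out);
        out.writeString(message);
    }

    @Override
    public ActionRequestValidationException validate() {
        return message == null || message.isEmpty()
            ? ValidateActions.addValidationError("message is required", null)
            : null;
    }
}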
Plugins & Modules System
Relevant Files
- server/src/main/java/org/elasticsearch/plugins/Plugin.java
- server/src/main/java/org/elasticsearch/plugins/PluginsService.java
- server/src/main/java/org/elasticsearch/plugins/PluginDescriptor.java
- server/src/main/java/org/elasticsearch/plugins/PluginsLoader.java
- modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/IngestCommonPlugin.java
- modules/lang-painless/src/main/java/org/elasticsearch/painless/PainlessPlugin.java
Elasticsearch provides a powerful plugin and module system that allows developers to extend core functionality without modifying the server codebase. Plugins are loaded dynamically at startup and can hook into various subsystems like search, ingest, scripting, and cluster management.
Plugin Architecture
The base Plugin class serves as the entry point for all extensions. Plugins can implement specialized interfaces to customize specific areas:
- ActionPlugin - Register custom actions and REST handlers
- AnalysisPlugin - Add analyzers, tokenizers, and token filters
- SearchPlugin - Extend search with custom queries, aggregations, and score functions
- IngestPlugin - Define ingest pipeline processors
- ScriptPlugin - Register script engines and contexts
- MapperPlugin - Add custom field types and mappers
- RepositoryPlugin - Implement snapshot repository backends
- DiscoveryPlugin - Customize node discovery mechanisms
- ClusterPlugin - Modify shard allocation and cluster behavior
- NetworkPlugin - Customize transport and HTTP layers
Plugins can implement multiple interfaces to provide functionality across different subsystems.
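For example, a single plugin class can opt into several subsystems at once; the skeleton below is illustrative (the extension methods it would override all have empty defaults):

import org.elasticsearch.plugins.ActionPlugin;
import org.elasticsearch.plugins.IngestPlugin;
import org.elasticsearch.plugins.Plugin;

public class MyPlugin extends Plugin implements ActionPlugin, IngestPlugin {
    // Override ActionPlugin methods (getActions, getRestHandlers, ...) to add
    // actions and REST endpoints, and IngestPlugin#getProcessors to add ingest
    // pipeline processors.
}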
Plugin Loading and Lifecycle
The PluginsService manages plugin discovery, loading, and initialization:
- Discovery - Scans plugin directories for plugin-descriptor.properties files
- Validation - Verifies version compatibility and dependencies
- Instantiation - Creates plugin instances using reflection
- Initialization - Calls plugin lifecycle methods in dependency order
- Registration - Registers components (actions, processors, etc.) with core services
Each plugin is loaded in its own ClassLoader to provide isolation and prevent classpath conflicts.
Modules vs. Plugins
Modules are built-in extensions bundled with Elasticsearch (e.g., ingest-common, lang-painless, aggregations). They follow the same plugin architecture but are:
- Compiled into the distribution
- Loaded automatically without configuration
- Implemented as Java modules with module-info.java descriptors
- Organized under the modules/ directory
Plugins are external extensions installed separately:
- Distributed as ZIP files
- Placed in the plugins/ directory
- Loaded dynamically at node startup
- Can be enabled/disabled via configuration
Module System Integration
Modern modules use Java's module system for encapsulation:
module org.elasticsearch.painless {
    requires org.elasticsearch.server;
    requires org.elasticsearch.painless.spi;

    exports org.elasticsearch.painless;
    opens org.elasticsearch.painless to org.elasticsearch.painless.spi;

    // Consumes PainlessExtension implementations provided by other modules,
    // e.g. ingest-common's ProcessorsWhitelistExtension
    uses org.elasticsearch.painless.spi.PainlessExtension;
}
Modules declare dependencies, export public APIs, and use Java SPI (Service Provider Interface) for extensibility.
Extensible Plugins
The ExtensiblePlugin interface allows plugins to be extended by other plugins:
public interface ExtensiblePlugin {
default void loadExtensions(ExtensionLoader loader) {}
}
This enables plugin composition patterns where one plugin provides extension points that other plugins can implement, creating a plugin ecosystem.
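A sketch of the consuming side, assuming the ExtensionLoader contract shown above; the SearchExtension interface is invented for illustration:

import java.util.List;
import org.elasticsearch.plugins.ExtensiblePlugin;
import org.elasticsearch.plugins.Plugin;

public class SearchExtensionsPlugin extends Plugin implements ExtensiblePlugin {

    // Hypothetical SPI that other plugins can implement against this plugin.
    public interface SearchExtension {}

    private List<SearchExtension> extensions = List.of();

    @Override
    public void loadExtensions(ExtensionLoader loader) {
        // collects every implementation contributed by plugins that declare
        // this plugin in their extended.plugins descriptor entry
        extensions = loader.loadExtensions(SearchExtension.class);
    }
}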
Plugin Descriptor
Each plugin requires a plugin-descriptor.properties file containing metadata:
- name - Plugin identifier
- description - Human-readable description
- version - Plugin version
- elasticsearch.version - Required Elasticsearch version
- java.version - Required Java version
- classname - Main plugin class
- extended.plugins - List of plugins this extends
This metadata is parsed into a PluginDescriptor object used for validation and dependency resolution.
Monitoring & Observability
Relevant Files
- server/src/main/java/org/elasticsearch/monitor/
- server/src/main/java/org/elasticsearch/health/
- TRACING.md
- x-pack/plugin/monitoring/
Elasticsearch provides a comprehensive monitoring and observability system that tracks system health, collects metrics, and enables distributed tracing. This system operates at multiple levels: node-level monitoring, cluster-level health indicators, and application-level tracing.
Core Monitoring Architecture
The MonitorService orchestrates five key monitoring subsystems:
- JVM Monitoring (JvmService, JvmGcMonitorService) - Tracks heap usage, garbage collection events, and thread statistics
- OS Monitoring (OsService) - Collects CPU, memory, and load average metrics from the operating system
- Process Monitoring (ProcessService) - Captures process-level statistics like file descriptors and memory usage
- Filesystem Monitoring (FsService) - Monitors disk space, I/O operations, and storage health
- Metrics Collection (NodeMetrics) - Exposes operational metrics through a MeterRegistry for indices operations, memory, and transport statistics
Each service implements caching with configurable refresh intervals to balance accuracy with performance overhead.
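A simplified model of that caching pattern (a stand-in class, not the server's internal single-object cache): stats are recomputed at most once per refresh interval, otherwise the cached value is returned.

import java.util.function.Supplier;

final class CachedStats<T> {
    private final Supplier<T> refresh;
    private final long intervalMillis;
    private T cached;
    private long lastRefresh;

    CachedStats(Supplier<T> refresh, long intervalMillis) {
        this.refresh = refresh;
        this.intervalMillis = intervalMillis;
    }

    synchronized T get() {
        long now = System.currentTimeMillis();
        if (cached == null || now - lastRefresh >= intervalMillis) {
            cached = refresh.get();   // recompute at most once per interval
            lastRefresh = now;
        }
        return cached;
    }
}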
Health Indicators System
The health indicators framework provides a pluggable system for reporting cluster and node health status. Each indicator returns a HealthStatus (GREEN, YELLOW, RED, or UNKNOWN) with detailed diagnostics.
Key built-in indicators include:
- Disk Health - Monitors disk usage against watermarks (high watermark triggers YELLOW, flood stage triggers RED)
- Shard Availability - Reports RED if primary shards are unavailable, YELLOW if replicas are missing
- Shard Capacity - Tracks available shard slots per node against configured thresholds
- Master Stability - Monitors master node changes and availability
- Repository Integrity - Validates snapshot repository health
Plugins can extend health monitoring by implementing the HealthPlugin interface and registering custom HealthIndicatorService implementations. Preflight indicators run first to determine if the cluster is stable enough for other health checks.
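A sketch of the plugin side, assuming the HealthPlugin extension point takes roughly this shape (method name and package locations are assumptions; the indicator implementation itself is omitted):

import java.util.Collection;
import java.util.List;
import org.elasticsearch.health.HealthIndicatorService;
import org.elasticsearch.plugins.HealthPlugin;
import org.elasticsearch.plugins.Plugin;

public class MyHealthPlugin extends Plugin implements HealthPlugin {
    @Override
    public Collection<HealthIndicatorService> getHealthIndicatorServices() {
        // return custom HealthIndicatorService implementations here
        return List.of();
    }
}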
Distributed Tracing
Elasticsearch uses the OpenTelemetry API for distributed tracing, abstracted through a custom Tracer interface. The system primarily traces task execution, which allows work to be tracked across thread boundaries and cluster nodes.
# Tracing is configured via elasticsearch.yml
telemetry.tracing.enabled: true
telemetry.agent.server_url: https://apm-server:443
Traces are sent to an APM server (typically Elastic Cloud APM). The system supports W3C Trace Context headers for cross-system tracing. Thread contexts manage span relationships, with ThreadContext#newTraceContext() creating child spans and clearTraceContext() detaching background tasks from parent requests.
Metrics Exposure
The NodeMetrics component registers async metrics including:
- Indices operations (get, search, indexing counts and latencies)
- Memory usage and GC statistics
- Transport layer metrics
- Shard allocation and recovery progress
Metrics are exposed through a MeterRegistry and can be scraped by monitoring systems. The X-Pack monitoring plugin extends this with collectors for cluster-wide statistics, job metrics, and enrichment stats, exporting data to monitoring indices for visualization in Kibana.
Configuration
Monitoring behavior is controlled through settings:
- monitor.jvm.refresh_interval - JVM stats cache refresh rate (default: 1s)
- monitor.fs.refresh_interval - Filesystem stats cache refresh rate (default: 1s)
- health.shard_capacity.unhealthy_threshold.yellow - Shard capacity threshold (default: 10)
- health.shard_capacity.unhealthy_threshold.red - Critical shard capacity threshold (default: 5)
Dynamic cluster settings allow runtime adjustment of APM sampling rates and other telemetry parameters without restart.