Overview
Relevant Files
- README.md
- server/src/main/java/org/opensearch/node/Node.java
- server/src/main/java/org/opensearch/bootstrap/Bootstrap.java
- server/src/main/java/org/opensearch/bootstrap/OpenSearch.java
OpenSearch is an open-source, enterprise-grade search and observability suite designed to bring order to unstructured data at scale. It is a distributed search and analytics engine built on top of Apache Lucene, providing powerful full-text search, real-time analytics, and observability capabilities for large-scale data workloads.
Core Purpose
OpenSearch enables organizations to index, search, and analyze massive volumes of structured and unstructured data in real-time. It powers use cases including log and event data analysis, application performance monitoring, security analytics, and full-text search across diverse data sources.
Startup Process
The system initializes through a well-defined sequence:
- Entry Point (OpenSearch.java): The main entry point parses command-line arguments and configures logging.
- Bootstrap Phase (Bootstrap.java): Performs critical initialization, including:
  - Loading secure settings and keystores
  - Validating the Java environment and jar dependencies
  - Installing security managers and FIPS compliance checks
  - Initializing system probes (JVM, OS, filesystem)
- Node Creation (Node.java): Constructs the core node instance, including:
  - Plugin loading and initialization
  - Service instantiation (cluster, search, indexing, transport)
  - Thread pool configuration
  - Module binding (action, cluster, search, script)
- Node Startup: Activates all services in dependency order, starting discovery and cluster coordination before accepting incoming requests.
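The sequence can be pictured with a minimal, hypothetical sketch; class and method names here are illustrative, not the actual OpenSearch.java/Bootstrap.java/Node.java signatures:
// Hypothetical sketch of the startup sequence described above; names are
// illustrative, not the real OpenSearch signatures.
public final class StartupSketch {

    public static void main(String[] args) {
        java.util.Properties settings = parseArgsAndConfigureLogging(args); // OpenSearch.java role
        runBootstrapChecks(settings);                                       // Bootstrap.java role
        NodeSketch node = new NodeSketch(settings);                         // Node.java role
        node.start();                                                       // services start in dependency order
        Runtime.getRuntime().addShutdownHook(new Thread(node::stop));
    }

    static java.util.Properties parseArgsAndConfigureLogging(String[] args) {
        java.util.Properties props = new java.util.Properties();
        for (String arg : args) {                          // e.g. -Ecluster.name=demo
            if (arg.startsWith("-E") && arg.contains("=")) {
                String[] kv = arg.substring(2).split("=", 2);
                props.setProperty(kv[0], kv[1]);
            }
        }
        return props;
    }

    static void runBootstrapChecks(java.util.Properties settings) {
        // Stand-in for keystore loading, jar-hell checks, security manager install, probes
        if (Runtime.version().feature() < 11) {
            throw new IllegalStateException("unsupported Java version");
        }
    }

    static final class NodeSketch {
        private final java.util.Properties settings;
        NodeSketch(java.util.Properties settings) { this.settings = settings; } // load plugins, bind modules
        void start() { System.out.println("node started: " + settings); }       // discovery, coordination, listeners
        void stop()  { System.out.println("node stopped"); }
    }
}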
Key Components
- Cluster Service: Manages cluster state, node discovery, and coordination
- Search Service: Handles query execution and result aggregation
- Indices Service: Manages index lifecycle, shards, and segment storage
- Transport Service: Enables inter-node communication and request routing
- Action Module: Registers and dispatches all cluster actions
- Plugin System: Extensible architecture for custom functionality
- Thread Pool: Manages concurrent task execution across multiple executor types
Module Organization
The codebase is organized into logical modules under server/src/main/java/org/opensearch/:
- action/ - Request handlers and transport actions
- cluster/ - Cluster coordination and state management
- index/ - Index operations and shard management
- search/ - Query parsing, execution, and aggregations
- transport/ - Network communication layer
- plugins/ - Plugin interfaces and lifecycle management
- node/ - Core node lifecycle and service orchestration
Architecture
Relevant Files
- server/src/main/java/org/opensearch/cluster/ClusterModule.java
- server/src/main/java/org/opensearch/index/IndexService.java
- server/src/main/java/org/opensearch/indices/IndicesService.java
- server/src/main/java/org/opensearch/action/ActionModule.java
OpenSearch follows a modular, layered architecture designed for distributed search and analytics. The system is organized into distinct layers that handle cluster coordination, index management, request processing, and data storage.
Core Layers
Cluster Coordination Layer manages distributed consensus and state propagation. The ClusterModule configures cluster-wide services including AllocationService, which orchestrates shard placement decisions. Allocation deciders (disk threshold, awareness, concurrent recovery, etc.) enforce constraints on where shards can be placed. The BalancedShardsAllocator makes final placement decisions based on cluster balance metrics.
Index Management Layer operates through IndicesService and IndexService. IndicesService manages the lifecycle of all indices on a node, handling index creation, deletion, and shard allocation. IndexService manages individual index operations including shard creation, field data caching, and query execution context. Both services coordinate with the cluster state to maintain consistency.
Request Processing Layer routes incoming requests through REST handlers and transport actions. The ActionModule registers all available actions and their corresponding transport implementations. REST handlers parse HTTP requests and delegate to transport actions via the NodeClient. Transport actions execute on the local node or coordinate across the cluster.
Storage Layer manages Lucene indices, translog operations, and segment storage. Each shard maintains an Engine (typically InternalEngine) that handles indexing, searching, and recovery operations.
Shard Allocation Pipeline
When cluster state changes, AllocationService executes a multi-phase allocation process:
- Existing Shards: GatewayAllocator recovers primary shards from disk
- Unassigned Shards: Allocation deciders filter candidate nodes
- Balanced Allocation: ShardsAllocator assigns shards to minimize imbalance
- Rebalancing: Shards relocate if doing so improves cluster balance
Each decision passes through AllocationDeciders, a composite that chains individual deciders. Early rejection (NO decision) short-circuits evaluation for performance.
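A simplified sketch of that short-circuiting chain, using illustrative types rather than the real AllocationDeciders API:
import java.util.List;

// Illustrative sketch of a decider chain that short-circuits on the first NO,
// mirroring the behaviour described above; not the real AllocationDeciders API.
public class DeciderChainSketch {

    enum Decision { YES, THROTTLE, NO }

    interface Decider {
        Decision canAllocate(String shardId, String nodeId);
    }

    static Decision decide(List<Decider> deciders, String shardId, String nodeId) {
        Decision result = Decision.YES;
        for (Decider decider : deciders) {
            Decision d = decider.canAllocate(shardId, nodeId);
            if (d == Decision.NO) {
                return Decision.NO;          // early rejection: skip remaining deciders
            }
            if (d == Decision.THROTTLE) {
                result = Decision.THROTTLE;  // weakest non-NO answer seen so far
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Decider> deciders = List.of(
            (shard, node) -> Decision.YES,                                          // e.g. awareness decider
            (shard, node) -> node.equals("full-node") ? Decision.NO : Decision.YES, // e.g. disk threshold decider
            (shard, node) -> Decision.THROTTLE                                      // e.g. concurrent recoveries decider
        );
        System.out.println(decide(deciders, "index[0]", "full-node")); // NO
        System.out.println(decide(deciders, "index[0]", "node-a"));    // THROTTLE
    }
}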
Cluster State Management
ClusterState is an immutable snapshot containing:
- Metadata: Index settings, mappings, templates, aliases
- RoutingTable: Shard assignments to nodes
- DiscoveryNodes: Active cluster members
- Customs: Plugin-specific state (snapshots, ingest pipelines, etc.)
State updates are published through a quorum-based consensus protocol. All nodes apply updates in the same order, ensuring consistency.
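A toy model of such an immutable, versioned snapshot (illustrative fields only; the real ClusterState carries the components listed above):
import java.util.Map;
import java.util.Set;

// Toy model of an immutable, versioned cluster-state snapshot; the real
// ClusterState carries Metadata, RoutingTable, DiscoveryNodes, and customs.
public final class ClusterStateSketch {
    final long term;                       // election epoch of the publishing leader
    final long version;                    // increases with every published update
    final Set<String> nodes;               // stand-in for DiscoveryNodes
    final Map<String, String> metadata;    // stand-in for index settings/mappings

    ClusterStateSketch(long term, long version, Set<String> nodes, Map<String, String> metadata) {
        this.term = term;
        this.version = version;
        this.nodes = Set.copyOf(nodes);
        this.metadata = Map.copyOf(metadata);
    }

    // Every change produces a new snapshot with a higher version; the old snapshot is untouched.
    ClusterStateSketch withMetadata(String key, String value) {
        Map<String, String> updated = new java.util.HashMap<>(metadata);
        updated.put(key, value);
        return new ClusterStateSketch(term, version + 1, nodes, updated);
    }

    public static void main(String[] args) {
        ClusterStateSketch v1 = new ClusterStateSketch(3, 10, Set.of("node-a", "node-b"), Map.of());
        ClusterStateSketch v2 = v1.withMetadata("logs.number_of_shards", "3");
        System.out.println(v1.version + " -> " + v2.version);  // 10 -> 11
    }
}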
Module Composition
The architecture uses dependency injection (Guice) to wire components. ClusterModule, ActionModule, and IndicesModule configure their respective services as eager singletons. This enables loose coupling and testability while maintaining strict initialization order.
Search & Query Execution
Relevant Files
- server/src/main/java/org/opensearch/search/SearchService.java
- server/src/main/java/org/opensearch/search/SearchModule.java
- server/src/main/java/org/opensearch/action/search/SearchTransportService.java
- server/src/main/java/org/opensearch/search/query/QueryPhase.java
- server/src/main/java/org/opensearch/search/fetch/FetchPhase.java
Overview
Search and query execution in OpenSearch follows a distributed, multi-phase architecture. When a search request arrives, it is decomposed into phases that execute across shards and then aggregate results. The main components are:
- SearchService: Core orchestrator managing shard-level search execution
- SearchTransportService: Handles inter-node communication for search operations
- SearchModule: Registers query types, aggregations, and search plugins at startup
Search Phases
OpenSearch executes searches through distinct phases:
- DFS Phase (optional): Collects term statistics across all shards so documents are scored against global rather than per-shard frequencies
- Query Phase: Executes the query on each shard, returning top-scoring document IDs and scores
- Fetch Phase: Retrieves full document content for the top results identified in the query phase
For single-shard searches, query and fetch can be combined into a single phase for efficiency.
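The division of labour between coordinator and shards can be sketched with simplified in-memory stand-ins (not the real SearchService or transport code):
import java.util.*;
import java.util.stream.Collectors;

// Rough sketch of query-then-fetch: each shard returns (docId, score) pairs,
// the coordinator reduces them to a global top-k, then fetches only those docs.
// Simplified in-memory stand-in, not the real SearchService/transport code.
public class QueryThenFetchSketch {

    record Hit(String shard, int docId, float score) {}

    static List<Hit> queryPhase(String shard, Map<Integer, String> docs, String term, int size) {
        return docs.entrySet().stream()
            .filter(e -> e.getValue().contains(term))
            .map(e -> new Hit(shard, e.getKey(), 1.0f))            // trivial scoring for the sketch
            .limit(size)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, String>> shards = Map.of(
            "shard0", Map.of(1, "lucene search", 2, "analytics"),
            "shard1", Map.of(7, "full text search", 9, "logs"));
        int size = 2;

        // Query phase on every shard, then reduce to the global top hits.
        List<Hit> topHits = shards.entrySet().stream()
            .flatMap(e -> queryPhase(e.getKey(), e.getValue(), "search", size).stream())
            .sorted(Comparator.comparingDouble((Hit h) -> h.score()).reversed())
            .limit(size)
            .collect(Collectors.toList());

        // Fetch phase: only the winning doc IDs go back to their shards for the full source.
        for (Hit hit : topHits) {
            System.out.println(hit.shard() + "/" + hit.docId() + " -> " + shards.get(hit.shard()).get(hit.docId()));
        }
    }
}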
Query Phase Execution
The query phase is initiated by SearchService.executeQueryPhase(), which:
- Retrieves the target shard and rewrites the request
- Checks whether the shard can match any documents using canMatch() to short-circuit empty results
- Forks execution to the search thread pool
- Calls QueryPhase.execute() to run the actual query against the Lucene index
// Query phase can use cached results if available
private void loadOrExecuteQueryPhase(ShardSearchRequest request, SearchContext context) {
boolean canCache = indicesService.canCache(request, context);
if (canCache) {
indicesService.loadIntoContext(request, context, queryPhase);
} else {
queryPhase.execute(context);
}
}
The query phase processes aggregations, applies scoring, and handles rescoring if configured.
Fetch Phase Execution
After query results are reduced at the coordinator level, the fetch phase retrieves actual documents:
- SearchService.executeFetchPhase() loads the stored fields and document content
- FetchPhase.execute() iterates through document IDs and applies fetch sub-phases
- Sub-phases include highlighting, field retrieval, and script fields
private QueryFetchSearchResult executeFetchPhase(ReaderContext reader, SearchContext context, long afterQueryTime) {
shortcutDocIdsToLoad(context);
fetchPhase.execute(context);
return new QueryFetchSearchResult(context.queryResult(), context.fetchResult());
}
Transport Layer
SearchTransportService registers request handlers for each phase:
- QUERY_ACTION_NAME: Query phase execution on shards
- FETCH_ID_ACTION_NAME: Fetch phase execution on shards
- DFS_ACTION_NAME: DFS phase for term statistics
- FREE_CONTEXT_ACTION_NAME: Cleanup of reader contexts
These handlers are registered via registerRequestHandler() and delegate to corresponding SearchService methods.
Reader Context Management
Search contexts are maintained in the activeReaders map to support pagination and point-in-time (PIT) searches. Contexts are:
- Created per shard when a search begins
- Kept alive via keep-alive timeouts (configurable via search.default_keep_alive and bounded by search.max_keep_alive)
- Freed explicitly or when they expire
- Reused across multiple fetch requests for the same search
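A stripped-down sketch of that keep-alive bookkeeping, with hypothetical types standing in for the real SearchService logic:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stripped-down sketch of keep-alive bookkeeping for reader contexts:
// each access extends the expiry, and a reaper frees expired contexts.
// Hypothetical types; the real logic lives in SearchService.
public class ReaderContextSketch {

    static final class ReaderContext {
        final long id;
        volatile long expiresAtMillis;
        ReaderContext(long id, long keepAliveMillis) {
            this.id = id;
            this.expiresAtMillis = System.currentTimeMillis() + keepAliveMillis;
        }
        void extend(long keepAliveMillis) {
            expiresAtMillis = Math.max(expiresAtMillis, System.currentTimeMillis() + keepAliveMillis);
        }
    }

    private final Map<Long, ReaderContext> activeReaders = new ConcurrentHashMap<>();

    ReaderContext create(long id, long keepAliveMillis) {
        ReaderContext ctx = new ReaderContext(id, keepAliveMillis);
        activeReaders.put(id, ctx);
        return ctx;
    }

    ReaderContext access(long id, long keepAliveMillis) {   // e.g. a follow-up fetch or scroll request
        ReaderContext ctx = activeReaders.get(id);
        if (ctx != null) ctx.extend(keepAliveMillis);
        return ctx;
    }

    void reapExpired() {                                     // run periodically
        long now = System.currentTimeMillis();
        activeReaders.values().removeIf(ctx -> ctx.expiresAtMillis < now);
    }
}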
Caching and Optimization
OpenSearch caches query results at the shard level when:
- The same query is executed multiple times
- The index has not been modified
- Cache settings permit caching
The loadOrExecuteQueryPhase() method checks cache eligibility before executing queries, reducing redundant computation.
Indexing & Storage
Relevant Files
- server/src/main/java/org/opensearch/index/shard/IndexShard.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/IndexModule.java
- server/src/main/java/org/opensearch/index/store/Store.java
- server/src/main/java/org/opensearch/index/translog/Translog.java
OpenSearch uses a multi-layered architecture for indexing and storage, combining Lucene's segment-based indexing with a transaction log for durability. Each shard maintains its own engine, store, and translog to ensure data consistency and recovery capabilities.
Core Components
IndexShard is the primary entry point for all indexing operations. It coordinates document indexing, manages the engine lifecycle, and handles shard-level operations like flushing and recovery. The shard delegates actual indexing to the Engine, which manages the Lucene index writer and maintains version tracking.
InternalEngine is the default engine implementation that orchestrates document indexing into Lucene. It uses Lucene's IndexWriter to add, update, or delete documents. The engine maintains a version map to track document versions and sequence numbers, enabling conflict detection and optimistic concurrency control.
Store provides low-level access to Lucene's Directory abstraction, which represents the file system interface for reading and writing index files. The Store manages metadata snapshots of committed segments, handles file checksums, and supports recovery operations. It uses reference counting to ensure safe concurrent access.
Indexing Pipeline
Document Request
↓
IndexShard.index()
↓
Engine.prepareIndex() [Parse & Map]
↓
Engine.index() [Version Check]
↓
indexIntoLucene() [Add/Update in Lucene]
↓
Translog.add() [Durability]
↓
IndexResult [Return to Client]
When a document arrives, the shard prepares an Engine.Index operation that includes parsing, mapping, and version resolution. The engine then executes the indexing strategy—either optimized append-only for new documents or update-based for existing ones. Documents are written to Lucene's index writer and simultaneously logged to the translog for durability.
Transaction Log (Translog)
The Translog records all non-committed index operations in a durable manner. Each shard has one translog instance per engine. Operations are written sequentially to a current translog file, and the translog is synced to disk based on the durability setting: ASYNC (time-based) or REQUEST (per operation).
Translog files are organized by generation ID. When a file reaches capacity or during periodic maintenance, a new generation is rolled. Old generations are retained based on the TranslogDeletionPolicy, which ensures operations needed for recovery or replication are preserved. The translog UUID is stored with each Lucene commit to prevent accidental recovery from mismatched logs.
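The generation and sync behaviour can be illustrated with a conceptual sketch; the file layout and method names here are invented stand-ins, not the real Translog implementation:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

// Conceptual sketch of a translog: sequential appends to the current generation,
// per-request fsync when durability is REQUEST, and rollover to a new generation
// once the file grows past a threshold. Not the real Translog implementation.
public class TranslogSketch {

    enum Durability { ASYNC, REQUEST }

    private final Path dir;
    private final Durability durability;
    private final long rollSizeBytes;
    private long generation = 1;

    TranslogSketch(Path dir, Durability durability, long rollSizeBytes) {
        this.dir = dir;
        this.durability = durability;
        this.rollSizeBytes = rollSizeBytes;
    }

    private Path current() {
        return dir.resolve("translog-" + generation + ".tlog");
    }

    void add(long seqNo, String operation) throws IOException {
        String line = seqNo + "\t" + operation + System.lineSeparator();
        Files.writeString(current(), line, StandardCharsets.UTF_8,
            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        if (durability == Durability.REQUEST) {
            sync();                                   // fsync before acknowledging the request
        }
        if (Files.size(current()) > rollSizeBytes) {
            generation++;                             // roll to a new generation file
        }
    }

    void sync() throws IOException {
        // Stand-in for forcing the current generation file to disk
        try (var ch = java.nio.channels.FileChannel.open(current(), StandardOpenOption.WRITE)) {
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("translog-sketch");
        TranslogSketch log = new TranslogSketch(dir, Durability.REQUEST, 1024);
        log.add(0, "index {\"user\":\"alice\"}");
        log.add(1, "delete doc#42");
        System.out.println("wrote to " + dir);
    }
}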
Storage Layer
The Store wraps Lucene's Directory and provides:
- MetadataSnapshot: A snapshot of committed segment files with checksums and metadata
- Reference Counting: Prevents segment deletion while readers hold references (for point-in-time queries)
- Corruption Detection: Marks and detects corrupted indices
- File Verification: Validates checksums during recovery and replication
The Store uses a StoreDirectory wrapper that caches directory size information and logs delete operations for debugging.
Durability & Recovery
OpenSearch ensures durability through:
- Lucene Commits: Periodic flushes write segments to disk and create a commit point
- Translog Syncing: Operations are synced to disk based on durability settings
- Sequence Numbers: Track operation order for consistent recovery
- Local Checkpoint Tracker: Maintains the highest sequence number of persisted operations
During recovery, the engine replays translog operations up to the global checkpoint, restoring any uncommitted writes. Segment replication can skip translog replay by copying committed segments directly from the primary.
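The checkpoint bookkeeping can be illustrated with a small sketch (not the real LocalCheckpointTracker): operations may complete out of order, and the checkpoint only advances across a contiguous prefix of sequence numbers:
import java.util.BitSet;

// Small sketch of local-checkpoint tracking: operations may complete out of
// order, and the checkpoint only advances over a contiguous prefix of seqNos.
// Not the real LocalCheckpointTracker.
public class CheckpointSketch {

    private final BitSet processed = new BitSet();
    private long checkpoint = -1;                 // highest seqNo with no gaps below it

    synchronized void markProcessed(long seqNo) {
        processed.set((int) seqNo);
        while (processed.get((int) (checkpoint + 1))) {
            checkpoint++;                         // advance across the contiguous prefix
        }
    }

    synchronized long getCheckpoint() {
        return checkpoint;
    }

    public static void main(String[] args) {
        CheckpointSketch tracker = new CheckpointSketch();
        tracker.markProcessed(0);
        tracker.markProcessed(2);                      // gap at 1: checkpoint stays at 0
        System.out.println(tracker.getCheckpoint());   // 0
        tracker.markProcessed(1);                      // gap filled: checkpoint jumps to 2
        System.out.println(tracker.getCheckpoint());   // 2
    }
}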
Key Settings
- index.translog.durability: Controls translog sync frequency (ASYNC or REQUEST)
- index.store.type: Selects the directory implementation (niofs, mmapfs, etc.)
- index.store.preload: Pre-loads specific file extensions into memory
- index.store.hybrid.nio.extensions: Files to load with NIO instead of mmap
Cluster Coordination & Discovery
Relevant Files
- server/src/main/java/org/opensearch/cluster/coordination/Coordinator.java
- server/src/main/java/org/opensearch/discovery/DiscoveryModule.java
- server/src/main/java/org/opensearch/gateway/GatewayService.java
- server/src/main/java/org/opensearch/cluster/coordination/CoordinationState.java
- server/src/main/java/org/opensearch/cluster/coordination/JoinHelper.java
- server/src/main/java/org/opensearch/discovery/PeerFinder.java
- server/src/main/java/org/opensearch/cluster/coordination/LeaderChecker.java
- server/src/main/java/org/opensearch/cluster/coordination/FollowersChecker.java
OpenSearch uses a Raft-inspired consensus algorithm called Zen2 to coordinate cluster state and elect a cluster manager. The coordination system ensures all nodes agree on cluster state changes and maintains a single leader.
Core Architecture
The Coordinator class is the central orchestrator that manages cluster formation, leader election, and state publication. It operates in three modes:
- CANDIDATE - Node is seeking election or waiting for a leader
- LEADER - Node has won election and publishes cluster state
- FOLLOWER - Node has joined a leader and applies published state
The DiscoveryModule bootstraps the coordination system by instantiating the Coordinator with seed hosts providers, election strategies, and join validators. For single-node clusters, it uses simplified discovery logic.
Discovery & Peer Finding
The PeerFinder discovers other cluster-manager-eligible nodes by:
- Resolving configured seed hosts (from settings or files)
- Probing addresses to establish connections
- Exchanging peer information via PeersRequest/PeersResponse
- Tracking the current leader and discovered peers
When a candidate node discovers enough peers to form a quorum, it triggers the election scheduler.
Leader Election
Election follows a two-phase process:
Pre-voting Phase: The PreVoteCollector gathers pre-votes from peers without incrementing the term. This prevents unnecessary term bumps when a node cannot win.
Voting Phase: Once pre-voting quorum is reached, the node sends StartJoinRequest to all discovered nodes, incrementing the term. Nodes respond with Join votes. The CoordinationState tracks join votes and determines if an election quorum is achieved using the ElectionStrategy.
The JoinHelper manages join requests and responses. When a candidate accumulates enough votes, it transitions to LEADER mode and begins publishing cluster state.
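A stripped-down sketch of term handling and join-vote counting, illustrative only (the real logic lives in CoordinationState and ElectionStrategy):
import java.util.HashSet;
import java.util.Set;

// Stripped-down sketch of join-vote counting: a candidate bumps its term,
// collects joins from voting nodes, and becomes leader once a majority of the
// voting configuration has joined. Not the real CoordinationState logic.
public class ElectionSketch {

    private final Set<String> votingConfiguration;      // nodes whose votes count toward quorum
    private final Set<String> joinsThisTerm = new HashSet<>();
    private long currentTerm;

    ElectionSketch(Set<String> votingConfiguration, long currentTerm) {
        this.votingConfiguration = votingConfiguration;
        this.currentTerm = currentTerm;
    }

    long startElection() {
        currentTerm++;                                   // StartJoinRequest carries the new term
        joinsThisTerm.clear();
        return currentTerm;
    }

    boolean handleJoin(String nodeId, long term) {
        if (term != currentTerm || !votingConfiguration.contains(nodeId)) {
            return false;                                // stale term or non-voting node
        }
        joinsThisTerm.add(nodeId);
        return joinsThisTerm.size() * 2 > votingConfiguration.size();   // majority reached -> become leader
    }

    public static void main(String[] args) {
        ElectionSketch node = new ElectionSketch(Set.of("a", "b", "c"), 4);
        long term = node.startElection();                // term 5
        System.out.println(node.handleJoin("a", term));  // false (1 of 3)
        System.out.println(node.handleJoin("b", term));  // true  (2 of 3: quorum)
    }
}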
State Publication & Consistency
Once elected, the leader publishes cluster state updates via PublicationTransportHandler. The CoordinationState ensures:
- Only the leader can publish new state
- State updates include term and version numbers
- Followers acknowledge publication before applying state
- A publish quorum (majority of voting nodes) must acknowledge before committing
Health Monitoring
The LeaderChecker allows followers to detect leader failures by periodically sending health checks. If checks fail, the follower becomes a candidate and restarts election.
The FollowersChecker allows the leader to detect unhealthy followers and remove them from the cluster.
Gateway & State Recovery
The GatewayService manages cluster state recovery on startup. When the cluster manager is elected, it waits for a configurable number of data nodes to join before removing the STATE_NOT_RECOVERED_BLOCK, allowing normal operations to begin.
// Example: Coordinator mode transitions
if (mode == Mode.CANDIDATE) {
startElection(); // Send StartJoinRequest to peers
} else if (mode == Mode.LEADER) {
publishClusterState(newState); // Publish to followers
}
Key Concepts
- Term - Monotonically increasing epoch number; higher term always wins
- Voting Configuration - Set of nodes whose votes count toward quorum
- Last Accepted State - Latest state persisted locally
- Committed State - State acknowledged by a publish quorum
REST API & Actions
Relevant Files
- server/src/main/java/org/opensearch/rest/RestController.java
- server/src/main/java/org/opensearch/rest/BaseRestHandler.java
- server/src/main/java/org/opensearch/rest/RestHandler.java
- server/src/main/java/org/opensearch/action/ActionType.java
- server/src/main/java/org/opensearch/action/support/TransportAction.java
OpenSearch exposes functionality through two complementary layers: the REST API for HTTP clients and the Action framework for internal node-to-node communication. These layers work together to handle user requests and coordinate cluster operations.
REST API Layer
The REST API is the primary interface for external clients. The RestController acts as the central dispatcher, routing HTTP requests to appropriate handlers based on method and path.
Request Flow:
- The HTTP request arrives at RestController.dispatchRequest()
- The controller matches the request path and method against registered handlers using a PathTrie data structure
- The matched RestHandler processes the request and sends a response through the RestChannel
Handler Registration:
Handlers are registered via RestController.registerHandler() with route definitions:
public void registerHandler(final RestHandler restHandler) {
restHandler.routes().forEach(route ->
registerHandler(route.getMethod(), route.getPath(), restHandler));
restHandler.deprecatedRoutes().forEach(route ->
registerAsDeprecatedHandler(route.getMethod(), route.getPath(),
restHandler, route.getDeprecationMessage()));
}
Routes support three types: active routes, deprecated routes, and replaced routes (deprecated with a new replacement).
BaseRestHandler:
Most REST handlers extend BaseRestHandler, which provides:
- Parameter validation and consumption tracking
- Automatic usage counting for monitoring
- Request preparation via prepareRequest(), returning a RestChannelConsumer
- Response parameter filtering to exclude format-control parameters from strict validation
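A minimal custom handler built on that contract might look roughly like this; package locations and exact signatures vary between OpenSearch versions, so treat it as a sketch rather than copy-paste code:
import java.io.IOException;
import java.util.List;

import org.opensearch.client.node.NodeClient;
import org.opensearch.core.rest.RestStatus;
import org.opensearch.rest.BaseRestHandler;
import org.opensearch.rest.BytesRestResponse;
import org.opensearch.rest.RestRequest;

// Sketch of a custom handler using the BaseRestHandler contract described
// above (getName(), routes(), prepareRequest()). Import packages and
// constructors differ between OpenSearch versions.
public class HelloRestHandlerSketch extends BaseRestHandler {

    @Override
    public String getName() {
        return "hello_handler";                            // used for usage counting
    }

    @Override
    public List<Route> routes() {
        return List.of(new Route(RestRequest.Method.GET, "/_hello"));
    }

    @Override
    protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) throws IOException {
        String name = request.param("name", "world");      // consume the parameter so validation passes
        return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.OK, "hello " + name));
    }
}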
Action Framework
Actions represent internal operations that can be executed via transport (node-to-node) or REST layers. Each action is defined by an ActionType with a unique name and response reader.
ActionType Structure:
public class ActionType<Response extends ActionResponse> {
private final String name;
private final Writeable.Reader<Response> responseReader;
}
Execution Pipeline:
- NodeClient.execute(ActionType, Request, ActionListener) initiates action execution
- TransportAction validates the request and registers a task
- doExecute() implements the actual operation logic
- The response or exception is delivered via the ActionListener callback
Transport Registration:
Actions are registered in ActionModule and mapped to TransportAction implementations. The transport layer handles serialization, network communication, and thread pool execution.
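Conceptually, that registration is a map from action name to a handler that turns a request into a response; a toy in-process version (not the real ActionModule/TransportAction code):
import java.util.HashMap;
import java.util.Map;

// Toy model of the action registry: each action name maps to a handler that
// turns a request into a response. Not the real ActionModule/TransportAction code.
public class ActionRegistrySketch {

    interface Handler<Req, Resp> {
        Resp execute(Req request);
    }

    private final Map<String, Handler<?, ?>> handlers = new HashMap<>();

    <Req, Resp> void register(String actionName, Handler<Req, Resp> handler) {
        handlers.put(actionName, handler);
    }

    @SuppressWarnings("unchecked")
    <Req, Resp> Resp execute(String actionName, Req request) {
        Handler<Req, Resp> handler = (Handler<Req, Resp>) handlers.get(actionName);
        if (handler == null) {
            throw new IllegalArgumentException("no handler registered for " + actionName);
        }
        return handler.execute(request);            // validation, task registration, threading omitted
    }

    public static void main(String[] args) {
        ActionRegistrySketch registry = new ActionRegistrySketch();
        registry.register("indices:data/read/search", (String query) -> "hits for: " + query);
        System.out.println(registry.<String, String>execute("indices:data/read/search", "status:200"));
    }
}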
Circuit Breaker Integration
The REST layer integrates with circuit breakers to prevent resource exhaustion. Handlers declare via canTripCircuitBreaker() whether they should count against in-flight request limits. The controller tracks request content length and releases resources when responses are sent.
Error Handling
The controller provides standardized error responses:
- 400 Bad Request: No matching handler or invalid parameters
- 405 Method Not Allowed: Valid path but unsupported HTTP method
- 406 Not Acceptable: Invalid Content-Type for streaming handlers
Errors include detailed messages with suggestions for similar parameter names using Levenshtein distance matching.
Ingest Pipeline & Processing
Relevant Files
- server/src/main/java/org/opensearch/ingest/IngestService.java
- server/src/main/java/org/opensearch/ingest/Pipeline.java
- server/src/main/java/org/opensearch/ingest/CompoundProcessor.java
- server/src/main/java/org/opensearch/ingest/Processor.java
The ingest pipeline is OpenSearch's data transformation framework that processes documents before indexing. It enables enrichment, filtering, and modification of data through a composable chain of processors.
Core Components
Pipeline is the top-level container that holds a sequence of processors and optional on-failure handlers. Each pipeline has a unique ID and tracks execution metrics (latency, success/failure counts). When a document is processed, the pipeline delegates to its CompoundProcessor and measures the total time spent in ingest.
CompoundProcessor orchestrates sequential execution of multiple processors. It maintains two lists: regular processors and on-failure processors. If any processor throws an exception and ignoreFailure is false, the compound processor invokes the on-failure chain. On-failure processors receive the original document plus metadata about the error (message, processor type, tag).
Processor is the base interface for all transformation logic. Implementations can override either the synchronous execute(IngestDocument) method or the asynchronous execute(IngestDocument, BiConsumer) method for operations requiring async calls (e.g., HTTP enrichment). Processors return null to drop a document or the modified document to continue.
IngestService manages the lifecycle of all pipelines. It stores pipeline definitions in cluster state, validates processor configurations, and coordinates execution across nodes. It also handles pipeline resolution (default, final, and system pipelines) and batch processing for bulk operations.
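A condensed, synchronous sketch of that execution model (the real CompoundProcessor is async-capable and also tracks metrics):
import java.util.List;
import java.util.Map;

// Condensed, synchronous sketch of the execution model described above: run
// processors in order; on failure, hand the document plus error metadata to
// the on-failure chain. Field names for the error metadata are invented here.
public class CompoundProcessorSketch {

    interface Processor {
        String getType();
        // Returns the (possibly modified) document, or null to drop it.
        Map<String, Object> execute(Map<String, Object> doc) throws Exception;
    }

    private final List<Processor> processors;
    private final List<Processor> onFailureProcessors;
    private final boolean ignoreFailure;

    CompoundProcessorSketch(List<Processor> processors, List<Processor> onFailureProcessors, boolean ignoreFailure) {
        this.processors = processors;
        this.onFailureProcessors = onFailureProcessors;
        this.ignoreFailure = ignoreFailure;
    }

    Map<String, Object> execute(Map<String, Object> doc) throws Exception {
        for (Processor processor : processors) {
            try {
                doc = processor.execute(doc);
                if (doc == null) {
                    return null;                                  // processor dropped the document
                }
            } catch (Exception e) {
                if (ignoreFailure) {
                    continue;                                     // skip this processor, keep going
                }
                if (onFailureProcessors.isEmpty()) {
                    throw e;
                }
                doc.put("_error_message", e.getMessage());        // failure metadata for the handlers
                doc.put("_error_processor_type", processor.getType());
                for (Processor onFailure : onFailureProcessors) {
                    doc = onFailure.execute(doc);
                }
                return doc;
            }
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Processor lowercase = new Processor() {
            public String getType() { return "lowercase"; }
            public Map<String, Object> execute(Map<String, Object> doc) {
                doc.put("message", doc.get("message").toString().toLowerCase());
                return doc;
            }
        };
        CompoundProcessorSketch pipeline = new CompoundProcessorSketch(List.of(lowercase), List.of(), false);
        Map<String, Object> doc = new java.util.HashMap<>(Map.of("message", "Hello INGEST"));
        System.out.println(pipeline.execute(doc));
    }
}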
Execution Flow
- Resolution: IngestService resolves which pipeline(s) to apply based on request parameters, index settings, and cluster defaults.
- Sequential Processing: CompoundProcessor executes each processor in order, passing the document through the chain.
- Error Handling: If a processor fails and on-failure handlers exist, they execute with failure metadata injected into the document.
- Metrics: Pipeline tracks execution time and failure counts for monitoring and debugging.
- Batch Optimization: For bulk operations, processors can implement batchExecute() to process multiple documents efficiently.
Key Features
- Composability: Pipelines can invoke other pipelines via the pipeline processor, enabling reusable pipeline templates.
- Conditional Processing: Processors support conditional execution based on document state.
- Error Recovery: On-failure handlers allow graceful degradation or alternative processing paths.
- Metrics & Observability: Per-pipeline and per-processor metrics track latency, throughput, and failure rates.
- Batch Processing: Bulk indexing benefits from batch-aware processors that optimize for multiple documents.
- Processor Limits: Configurable maximum processor count per pipeline prevents runaway complexity.
Plugin System & Extensions
Relevant Files
- server/src/main/java/org/opensearch/plugins/Plugin.java
- server/src/main/java/org/opensearch/plugins/SearchPlugin.java
- server/src/main/java/org/opensearch/plugins/ActionPlugin.java
- server/src/main/java/org/opensearch/plugins/PluginsService.java
- server/src/main/java/org/opensearch/plugins/IngestPlugin.java
- server/src/main/java/org/opensearch/plugins/AnalysisPlugin.java
OpenSearch provides a comprehensive plugin system that allows developers to extend core functionality without modifying the main codebase. Plugins are loaded at startup and can hook into multiple subsystems including search, actions, ingest pipelines, and analysis.
Plugin Architecture
The plugin system is built on a base Plugin class that serves as the foundation for all extensions. Plugins can implement specialized interfaces to extend specific functionality areas. The PluginsService is responsible for discovering, loading, and managing all plugins and modules at runtime.
Key lifecycle methods in the base Plugin class include:
- createComponents() - Register custom services and components with dependency injection
- createGuiceModules() - Define Guice modules for dependency management
- getSettings() - Add custom configuration settings
- onIndexModule() - Hook into index creation to register index-level extensions
- getBootstrapChecks() - Enforce startup validation rules
Extension Points
Plugins can implement multiple specialized interfaces to extend different subsystems:
Search Extensions (SearchPlugin): Register custom queries, aggregations, suggesters, score functions, and rescore implementations. Plugins can also provide custom fetch sub-phases and highlighters.
Action Extensions (ActionPlugin): Add custom REST handlers, transport actions, and action filters. Plugins can wrap REST requests for authentication, logging, or request validation.
Ingest Extensions (IngestPlugin): Define custom ingest processors for data transformation during indexing. Supports both regular and system-level processors.
Analysis Extensions (AnalysisPlugin): Contribute custom tokenizers, token filters, and character filters for text analysis.
Other Extensions: Additional interfaces support cluster coordination, discovery, repositories, scripting, mappers, and network transport customization.
Registration Pattern
Plugins follow a consistent registration pattern using specification classes. For example, SearchPlugin.QuerySpec wraps a custom query builder with its parser and serialization reader:
@Override
public List<QuerySpec<?>> getQueries() {
    // Each spec pairs the query name with its wire reader and its content parser
    return Arrays.asList(
        new QuerySpec<>(
            "my_custom_query",
            MyCustomQueryBuilder::new,   // reads the builder from the transport wire format
            MyCustomQueryParser::parse   // parses the builder from JSON/YAML request content
        )
    );
}
Each spec includes:
- A name (used for serialization and REST parsing)
- A reader (deserializes from wire protocol)
- A parser (converts from JSON/YAML to builder objects)
Component Lifecycle
Plugins can register components that participate in the node lifecycle. Components implementing LifecycleComponent are automatically started when the node starts and stopped during shutdown. This enables plugins to manage resources like thread pools, connections, or caches.
public Collection<Object> createComponents(
Client client,
ClusterService clusterService,
ThreadPool threadPool,
// ... other services
) {
return Arrays.asList(new MyCustomService(client, threadPool));
}
Best Practices
- Use ParseField for query names to support deprecated aliases
- Register result readers for aggregations to enable serialization
- Implement proper error handling in REST handlers
- Use bootstrap checks to validate required configurations
- Avoid blocking operations in hot paths
- Document custom settings with descriptions and defaults
Transport & Networking
Relevant Files
- server/src/main/java/org/opensearch/transport/TransportService.java
- server/src/main/java/org/opensearch/transport/Transport.java
- server/src/main/java/org/opensearch/transport/TcpTransport.java
- server/src/main/java/org/opensearch/transport/ConnectionManager.java
- server/src/main/java/org/opensearch/http/AbstractHttpServerTransport.java
- server/src/main/java/org/opensearch/http/HttpServerTransport.java
- server/src/main/java/org/opensearch/common/network/NetworkService.java
OpenSearch uses a layered networking architecture to handle both inter-node communication (transport) and client-facing HTTP requests. Understanding these layers is essential for debugging network issues, implementing custom transports, or optimizing cluster communication.
Transport Layer (Inter-Node Communication)
The transport layer enables nodes to communicate with each other using a binary protocol. The architecture follows a clear hierarchy:
Core Components:
- Transport - Interface defining the contract for sending/receiving messages between nodes
- TcpTransport - Abstract implementation using TCP sockets for inter-node communication
- TransportService - High-level service wrapping Transport, managing connections, request routing, and response handling
- ConnectionManager - Manages persistent connections to remote nodes with connection pooling and lifecycle management
Request/Response Flow:
TransportService.sendRequest()
→ ConnectionManager.getConnection(node)
→ Transport.Connection.sendRequest()
→ OutboundHandler (serializes & sends bytes)
→ TcpChannel (low-level socket)
→ Remote Node receives via InboundHandler
→ ProtocolMessageHandler (deserializes)
→ RequestHandlerRegistry (routes to handler)
→ Handler executes & sends response back
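A toy in-process model of that flow, showing how pending request IDs are correlated with response handlers (not the real TransportService/InboundHandler code):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

// Toy in-process model of the request/response flow above: the sender keeps a
// map of pending request IDs to response handlers; the receiver looks up the
// action in a handler registry and sends the result back tagged with the same
// ID. Not the real TransportService/InboundHandler code.
public class TransportFlowSketch {

    private final Map<Long, Consumer<String>> pendingResponses = new ConcurrentHashMap<>();
    private final Map<String, Function<String, String>> requestHandlers = new ConcurrentHashMap<>();
    private long nextRequestId = 0;

    void registerRequestHandler(String action, Function<String, String> handler) {
        requestHandlers.put(action, handler);
    }

    synchronized void sendRequest(String action, String payload, Consumer<String> responseHandler) {
        long requestId = nextRequestId++;
        pendingResponses.put(requestId, responseHandler);   // OutboundHandler would serialize and write bytes here
        receive(requestId, action, payload);                 // stand-in for the remote node's InboundHandler
    }

    private void receive(long requestId, String action, String payload) {
        String response = requestHandlers.get(action).apply(payload);   // RequestHandlerRegistry lookup
        Consumer<String> handler = pendingResponses.remove(requestId);  // correlate response to request
        handler.accept(response);
    }

    public static void main(String[] args) {
        TransportFlowSketch transport = new TransportFlowSketch();
        transport.registerRequestHandler("internal:sketch/echo", payload -> "echo: " + payload);
        transport.sendRequest("internal:sketch/echo", "ping", System.out::println);
    }
}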
HTTP Layer (Client Communication)
The HTTP layer handles REST API requests from clients. It's built on top of Netty for high-performance async I/O.
Request Processing Pipeline:
- HttpServerTransport - Binds to HTTP port, manages server channels
- Netty4HttpServerTransport - Netty-based implementation with HTTP/1.1 and HTTP/2 support
- HttpRequestDecoder - Parses raw HTTP into structured requests
- HttpPipeliningHandler - Manages request pipelining for multiple concurrent requests
- RestController - Routes HTTP requests to appropriate REST handlers
- RestHandler - Executes business logic and sends responses
HTTP Client Request
→ Netty4HttpServerTransport (receives on port 9200)
→ HttpRequestDecoder (parses HTTP)
→ Netty4HttpRequestHandler (creates HttpRequest)
→ AbstractHttpServerTransport.incomingRequest()
→ RestController.dispatchRequest()
→ RestHandler.handleRequest()
→ Response sent back via HttpChannel
Network Configuration
The NetworkService provides centralized network configuration:
- Bind hosts - Addresses the server listens on (default: localhost)
- Publish hosts - Addresses advertised to other nodes (for multi-NIC setups)
- TCP settings - TCP_NO_DELAY, TCP_KEEP_ALIVE, buffer sizes, etc.
Key settings:
network.host # Bind and publish address
network.bind_host # Override bind address
network.publish_host # Override publish address
network.tcp.no_delay # Disable Nagle's algorithm (default: true)
network.tcp.keep_alive # Enable TCP keep-alive (default: true)
Connection Profiles
ConnectionProfile defines how many connections to maintain per node and their purposes:
- LIGHT - Single connection for cluster coordination
- BULK - Multiple connections for bulk indexing
- RECOVERY - Dedicated connections for shard recovery
- HANDSHAKE - Initial connection for version negotiation
Each profile specifies connection count, timeouts, and request types.
Key Design Patterns
Local Node Optimization - Requests to the local node bypass the network stack entirely, using direct method calls for performance.
Connection Pooling - ConnectionManager maintains a pool of connections per node, reusing them across requests to reduce overhead.
Async I/O - Both transport and HTTP layers use non-blocking I/O with Netty, allowing thousands of concurrent connections.
Protocol Abstraction - The Transport interface allows pluggable implementations (native protocol, gRPC, etc.) without changing upper layers.
Build System & Testing
Relevant Files
- build.gradle
- DEVELOPER_GUIDE.md
- TESTING.md
- gradle/ (build configuration)
- buildSrc/ (custom Gradle plugins)
- test/framework/ (test framework)
OpenSearch uses Gradle as its build system with custom plugins for compilation, testing, and distribution. The build orchestrates unit tests, integration tests, REST API tests, and backwards compatibility (BWC) tests across multiple modules and plugins.
Build System Overview
The root build.gradle applies modular Gradle scripts from the gradle/ directory, including formatting, code coverage, FIPS compliance, and local distribution builds. Custom plugins in buildSrc/ extend Gradle to handle OpenSearch-specific concerns like test clustering, REST API specifications, and plugin packaging.
Key build tasks:
- ./gradlew assemble - Build all distributions (tar, zip, RPM, DEB)
- ./gradlew localDistro - Build a platform-specific distribution
- ./gradlew check - Run all verification tasks (precommit, unit tests, integration tests)
- ./gradlew precommit - Run static checks and formatting validation
Test Framework Architecture
OpenSearch uses JUnit with randomized testing via the randomizedtesting-runner library. The test framework (test/framework/) provides base classes for different test scopes:
- OpenSearchTestCase - Base for unit tests
- OpenSearchSingleNodeTestCase - Single-node cluster tests
- OpenSearchIntegTestCase - Multi-node integration tests
- OpenSearchRestTestCase - REST API tests against external clusters
- OpenSearchClientYamlSuiteTestCase - YAML-based REST tests
Test Types and Execution
# Unit tests (default test task)
./gradlew :module:test
# Integration tests (in-memory cluster)
./gradlew :module:internalClusterTest
# REST tests (YAML-based)
./gradlew :rest-api-spec:yamlRestTest
# Java REST tests
./gradlew :module:javaRestTest
# Backwards compatibility tests
./gradlew bwcTest
Test filtering and customization:
# Run specific test
./gradlew :server:test --tests "*.ClassName.testMethod"
# Run with custom seed for reproducibility
./gradlew test -Dtests.seed=DEADBEEF
# Repeat test N times
./gradlew test -Dtests.iters=5 -Dtests.class=*.ClassName
# Configure heap size and JVM args
./gradlew test -Dtests.heap.size=4G -Dtests.jvm.argline="-XX:+UseG1GC"
Test Clustering and REST Testing
The TestClustersPlugin manages ephemeral test clusters for integration tests. REST tests can run against:
- Embedded clusters - Created per test via OpenSearchIntegTestCase
- External clusters - Via OpenSearchRestTestCase, using the properties -Dtests.cluster=localhost:9200 and -Dtests.rest.cluster=localhost:9200
YAML REST tests support filtering and denylisting:
./gradlew :rest-api-spec:yamlRestTest \
-Dtests.rest.suite=index,get \
-Dtests.rest.denylist="index/**/Index document"
Code Quality and Coverage
- Formatting - Spotless plugin enforces Eclipse JDT formatter (4-space indent, 140-char lines)
- Code Coverage - JaCoCo generates reports via ./gradlew jacocoTestReport
- Forbidden APIs - Prevents use of unsafe JDK internals
- Javadoc - Enforced with custom tags (@opensearch.api, @opensearch.internal)
Test Reliability and Retries
Flaky tests are mitigated via the test-retry plugin. In CI, tests retry up to 3 times with max 10 failures before stopping. Known flaky tests are explicitly listed in build.gradle and tracked with the flaky-test GitHub label.
Backwards Compatibility Testing
BWC tests verify upgrades from previous versions. Tests can run against released versions (downloaded) or unreleased versions (built from source). Environment variables JAVA11_HOME, JAVA17_HOME, etc., are required for multi-version testing.
# Test specific version
./gradlew v5.3.2#bwcTest
# Test with custom distribution
./gradlew bwcTest -PcustomDistributionDownloadType=bundle
Running OpenSearch Locally
# Start OpenSearch from source
./gradlew run
# With custom settings
./gradlew run -Dtests.opensearch.http.host=0.0.0.0 -Dtests.heap.size=4G
# Debug mode
./gradlew run --debug-jvm