Overview
Relevant Files
- README.md
- doc.go
- cmd/jaeger/main.go
- cmd/jaeger/internal/command.go
- cmd/jaeger/internal/components.go
- internal/storage/
Jaeger is a distributed tracing platform created by Uber Technologies and donated to the Cloud Native Computing Foundation (CNCF). It is used to monitor and troubleshoot microservices-based distributed systems by collecting, storing, and visualizing trace data from applications.
What is Jaeger?
Jaeger helps developers understand the behavior of complex distributed systems by tracking requests as they flow through multiple services. Each request generates a trace composed of spans (individual operations), allowing teams to identify performance bottlenecks, debug issues, and understand service dependencies.
Architecture Overview
Core Components
Jaeger v2 is built on the OpenTelemetry Collector framework and consists of:
- Collector - Receives traces from applications via OTLP, Jaeger, Kafka, or Zipkin protocols. Processes traces through configurable pipelines (receivers, processors, exporters).
- Storage Backend - Persists trace data. Supports multiple backends: memory, Elasticsearch, Cassandra, ClickHouse, ScyllaDB, and others via plugins.
- Query Service - Retrieves traces from storage and serves the Jaeger UI. Provides gRPC and HTTP APIs for querying trace data.
- Jaeger UI - React-based web interface for searching, visualizing, and analyzing traces.
Key Features
- Multiple Protocol Support: OTLP, Jaeger (Thrift), Zipkin, Kafka
- Flexible Processing: Batch processing, tail sampling, adaptive sampling, filtering, and metrics generation
- Pluggable Storage: Extensible architecture supporting various backends
- All-in-One Mode: Single binary with embedded memory storage for quick evaluation
- Metrics Integration: Span-to-metrics conversion via connectors
- Multi-Tenancy: Built-in support for tenant isolation
Project Structure
- cmd/jaeger/ - Main Jaeger v2 binary and CLI
- cmd/query/ - Legacy query service
- internal/storage/ - Storage backend implementations
- internal/ - Core libraries (config, telemetry, sampling, etc.)
- jaeger-ui/ - Frontend (React, submodule)
- idl/ - Data models (Protobuf, Thrift, submodule)
Architecture & Core Components
Relevant Files
- cmd/jaeger/internal/components.go
- cmd/jaeger/internal/exporters/storageexporter
- cmd/jaeger/internal/extension/jaegerstorage
- cmd/jaeger/internal/extension/jaegerquery
- internal/storage/v2/api/tracestore
- internal/storage/v1/factory.go
Jaeger v2 is built on the OpenTelemetry Collector framework, combining a modular pipeline architecture with pluggable storage backends. The system separates concerns into three main layers: the collector pipeline, storage abstraction, and query interface.
Collector Pipeline Architecture
Jaeger v2 uses the OpenTelemetry Collector's standard pipeline model with custom Jaeger components:
- Receivers accept traces in multiple formats (OTLP, Jaeger, Zipkin, Kafka)
- Processors transform and filter traces (batch, tail sampling, adaptive sampling)
- Exporters write processed traces to storage backends
- Connectors bridge pipelines (e.g., span metrics generation)
- Extensions provide auxiliary services (storage, query UI, health checks)
The components.go file registers all available factories. Jaeger adds custom receivers (Jaeger, Kafka), processors (adaptive sampling), and exporters (storage exporter) to the standard OTEL Collector components.
Storage Layer Design
The storage layer uses a factory pattern with two API versions:
V2 API (internal/storage/v2/api/tracestore) defines the modern interface:
type Factory interface {
CreateTraceReader() (Reader, error)
CreateTraceWriter() (Writer, error)
}
V1 API (internal/storage/v1) provides legacy span store interfaces. V1 implementations are wrapped by adapters to present the V2 interface, ensuring backward compatibility.
Supported Backends
Jaeger supports multiple storage backends through pluggable factories:
- Memory – In-memory store for testing and all-in-one deployments
- Badger – Embedded key-value store for single-node deployments
- Cassandra – Distributed NoSQL database
- Elasticsearch/OpenSearch – Full-text search and analytics
- ClickHouse – Columnar database for high-volume tracing
- gRPC – Remote storage backend via gRPC protocol
Each backend implements the factory pattern, allowing the system to instantiate readers and writers on demand.
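The factory pattern described above can be sketched in a much simplified form. This is an illustrative reduction, not Jaeger's actual API: the real interfaces live in internal/storage/v2/api/tracestore, and newFactory, memoryFactory, and the Name method are hypothetical names invented for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Reader and Writer stand in for the real tracestore interfaces.
type Reader interface{ Name() string }
type Writer interface{ Name() string }

// Factory mirrors the V2 API shape shown earlier in this page.
type Factory interface {
	CreateTraceReader() (Reader, error)
	CreateTraceWriter() (Writer, error)
}

// memoryStore is a toy backend used only for illustration.
type memoryStore struct{}

func (memoryStore) Name() string { return "memory" }

type memoryFactory struct{}

func (memoryFactory) CreateTraceReader() (Reader, error) { return memoryStore{}, nil }
func (memoryFactory) CreateTraceWriter() (Writer, error) { return memoryStore{}, nil }

// newFactory selects a backend by name, mirroring the
// configuration-driven selection done at startup.
func newFactory(backend string) (Factory, error) {
	switch backend {
	case "memory":
		return memoryFactory{}, nil
	default:
		return nil, errors.New("unknown backend: " + backend)
	}
}

func main() {
	f, err := newFactory("memory")
	if err != nil {
		panic(err)
	}
	r, _ := f.CreateTraceReader()
	fmt.Println(r.Name())
}
```

The key property this illustrates: callers depend only on the Factory interface, so a different backend is a configuration change, not a code change.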
Extension Integration
The jaegerstorage extension manages all configured storage backends. It:
- Initializes storage factories during startup
- Provides lookup by name for exporters and query services
- Handles authentication and metrics collection per backend
The jaeger_storage_exporter connects the pipeline to storage by obtaining a trace writer from the extension and writing sanitized OTEL traces to the backend.
Query Service
The jaegerquery extension implements the traditional Jaeger UI and query API. It depends on the storage extension to access trace readers and provides HTTP endpoints for trace retrieval and UI serving.
Configuration
Storage backends are configured declaratively in YAML. Each backend gets a name and type-specific settings:
extensions:
jaeger_storage:
backends:
primary:
elasticsearch:
endpoints: [http://localhost:9200]
archive:
memory:
max_traces: 100000
The exporter and query service reference backends by name, enabling flexible multi-backend deployments (e.g., hot storage for recent traces, archive storage for older data).
Storage Backends & Abstraction
Relevant Files
- internal/storage/v2/api/tracestore – V2 trace storage interfaces
- internal/storage/v2/api/depstore – V2 dependency storage interfaces
- internal/storage/v1/api/spanstore – V1 span storage interfaces (legacy)
- internal/storage/v2/memory – In-memory backend
- internal/storage/v2/elasticsearch – Elasticsearch/OpenSearch backend
- internal/storage/v2/clickhouse – ClickHouse backend
- internal/storage/v2/cassandra – Cassandra backend
- internal/storage/v2/badger – Badger embedded store backend
- internal/storage/v2/grpc – Remote gRPC backend
- internal/storage/v2/v1adapter – Adapter layer for V1 → V2 migration
- cmd/internal/storageconfig – Storage factory configuration
Jaeger uses a pluggable storage abstraction to support multiple backends. The architecture separates storage concerns into two API versions: V2 (current) and V1 (legacy), with an adapter layer enabling gradual migration.
Storage API Architecture
The V2 API defines three core interfaces:
- tracestore.Writer – Writes spans to storage via WriteTraces(ctx, ptrace.Traces)
- tracestore.Reader – Queries traces using iterators for efficient streaming
- depstore.Reader – Retrieves service dependency graphs
Each backend implements a Factory interface that creates reader/writer instances. This factory pattern allows configuration-driven backend selection at startup.
Supported Backends
- Memory – In-process store for testing and all-in-one deployments
- Badger – Embedded key-value store for single-node production use
- Cassandra – Distributed NoSQL for high-availability clusters
- Elasticsearch/OpenSearch – Full-text search with tag-based filtering
- ClickHouse – Columnar database optimized for high-volume analytics
- gRPC – Remote storage backend via gRPC protocol
V1 → V2 Migration Pattern
The v1adapter package bridges legacy V1 implementations to the V2 API:
// V1 factories (Cassandra, Badger, Elasticsearch) wrap via adapters
v1Reader, _ := v1Factory.CreateSpanReader()
v2Reader := v1adapter.NewTraceReader(v1Reader)
This allows existing V1 backends to function as V2 implementations without rewriting them. New backends like ClickHouse implement V2 directly.
Factory Configuration
The storageconfig package provides unified configuration:
type TraceBackend struct {
Memory *memory.Configuration
Badger *badger.Config
Cassandra *cassandra.Options
Elasticsearch *escfg.Configuration
ClickHouse *clickhouse.Configuration
GRPC *grpc.Config
}
At startup, exactly one backend is selected and its factory is instantiated. The factory creates reader/writer instances on demand, enabling lazy initialization and resource pooling.
Key Design Patterns
Iterator-Based Streaming – V2 readers return Go iterators (iter.Seq2, available since Go 1.23) for memory-efficient pagination of large result sets.
Metrics Decoration – Readers are wrapped with tracestoremetrics.ReadMetricsDecorator to collect latency and error metrics per operation.
Idempotent Writes – Writers support idempotent span ingestion; partial failures return errors without atomic guarantees.
Dependency Tracking – Separate depstore interface handles service dependency graphs, enabling independent optimization.
Trace Processing & Conversion
Relevant Files
- internal/jptrace - Core trace utilities and aggregation
- internal/jptrace/sanitizer - Trace data sanitization and normalization
- internal/converter/thrift/jaeger - Thrift format conversions
- internal/storage/v2/v1adapter - OTLP to Jaeger model translation
- internal/uimodel/converter/v1/json - Domain model to UI JSON conversion
- cmd/query/app/otlp_translator.go - Query API OTLP translation
Jaeger processes traces through multiple conversion layers, transforming data from wire formats (OTLP, Thrift) into internal domain models and finally into UI-consumable JSON. This pipeline ensures data consistency, validates integrity, and enriches traces with metadata.
Conversion Pipeline
Traces flow through three primary representations:
- Wire Format (OTLP/Thrift) - Raw incoming data from instrumented applications
- Domain Model (model.Trace, model.Span) - Jaeger's internal canonical representation
- UI Model (uimodel.Trace, uimodel.Span) - JSON-serializable format for frontend consumption
The v1adapter package bridges OTLP and domain models using OpenTelemetry's translator, while sanitizers ensure data quality at each stage.
OTLP to Domain Conversion
The V1BatchesFromTraces() function converts OpenTelemetry Protocol traces to Jaeger batches:
// Converts ptrace.Traces to []*model.Batch
batches := v1adapter.V1BatchesFromTraces(otlpTraces)
// Reverse conversion
traces := v1adapter.V1BatchesToTraces(batches)
This leverages the OpenTelemetry Collector's jaegertranslator package, then applies warning transfer to preserve metadata about transformations.
Trace Aggregation
The AggregateTraces() function in jptrace combines trace chunks into complete traces:
// Aggregates iter.Seq2[[]ptrace.Traces, error] into individual traces
aggregated := jptrace.AggregateTraces(tracesSeq)
Storage backends may return traces in chunks; this function merges spans by trace ID, yielding complete traces for processing.
Sanitization Pipeline
Three standard sanitizers clean and normalize trace data:
- Empty Service Name - Replaces missing or empty service names with placeholders
- UTF-8 Validation - Fixes invalid UTF-8 in span names and attributes
- Negative Duration - Corrects spans where end time < start time
// Apply all standard sanitizers
sanitized := sanitizer.Sanitize(traces)
Sanitizers are composable; custom chains can be created via NewChainedSanitizer().
Domain to UI Conversion
The FromDomain() function transforms domain spans into UI format:
// Converts *model.Trace to *uimodel.Trace
uiTrace := json.FromDomain(domainTrace)
This handles:
- Deduplicating processes and assigning process IDs
- Converting timestamps to microseconds since epoch
- Normalizing span references
- Preserving warnings and validation errors
Warning System
Warnings track data quality issues during transformations. They're stored as span attributes using the @jaeger@warnings key:
// Add warnings to a span
jptrace.AddWarnings(span, "invalid-utf8-detected")
// Retrieve warnings
warnings := jptrace.GetWarnings(span)
Warnings propagate through conversions, allowing the UI to display data integrity notes.
Thrift Format Support
Legacy Thrift spans convert via converter/thrift/jaeger:
// Thrift to domain model
spans := jaeger.ToDomain(thriftSpans, thriftProcess)
// Single span conversion
span := jaeger.ToDomainSpan(thriftSpan, thriftProcess)
Errors during conversion are embedded as span tags rather than failing the entire batch, ensuring partial data is preserved.
Query API Translation
The Query API accepts OTLP JSON and converts it to domain traces:
// Unmarshal OTLP JSON, convert to batches, aggregate by trace ID
traces, err := otlp2traces(otlpJsonBytes)
This enables the Query API to accept traces in OpenTelemetry format while maintaining internal consistency.
Data Flow Diagram
Key Design Patterns
- Composable Sanitizers - Chain multiple validation functions without nesting
- Warning Preservation - Metadata about transformations survives conversions
- Graceful Degradation - Errors become tags/warnings rather than failures
- Lazy Aggregation - Traces aggregate on-demand via iterators, reducing memory overhead
Query Service & API
Relevant Files
- cmd/query/app/server.go
- cmd/query/app/querysvc/query_service.go
- cmd/query/app/grpc_handler.go
- cmd/query/app/http_handler.go
- cmd/query/app/apiv3/grpc_handler.go
- cmd/query/app/apiv3/http_gateway.go
The Query Service is the core component that exposes Jaeger's trace data through multiple APIs. It handles trace retrieval, service discovery, and dependency analysis, supporting both gRPC and HTTP protocols with multiple API versions.
Architecture Overview
Server Setup
The Server struct in server.go manages both HTTP and gRPC listeners. It initializes separate servers for each protocol, with support for TLS and multi-port configurations. The server requires separate ports when TLS is enabled to avoid conflicts.
Key initialization steps:
- Create gRPC server with interceptors for authentication and tenancy
- Register gRPC handlers for API v2 and v3
- Create HTTP server with router and middleware
- Register HTTP routes for all API versions
Query Service Core
The QueryService (in querysvc/query_service.go) is the business logic layer that:
- Retrieves traces by ID or search criteria
- Fetches service and operation metadata
- Applies trace adjustments (clock skew correction)
- Falls back to archive storage when traces aren't found in primary storage
Main methods:
- GetTrace() - Fetch a single trace by ID
- FindTraces() - Search traces by service, operation, tags, duration
- GetServices() - List all services
- GetOperations() - List operations for a service
- GetDependencies() - Retrieve service dependency graph
API Versions
API v2 (Legacy): Implemented via GRPCHandler and APIHandler. Uses Jaeger's native span model. Supports streaming responses for large traces.
API v3 (Current): Implemented via apiv3.Handler and HTTPGateway. Uses OpenTelemetry Protocol (OTLP) format internally. Provides both gRPC and HTTP endpoints with consistent interfaces.
HTTP Endpoints
The HTTP API exposes RESTful endpoints under /api/ prefix:
- GET /api/traces/{traceID} - Retrieve trace by ID
- GET /api/traces?service=...&operation=... - Search traces
- GET /api/services - List services
- GET /api/operations?service=... - List operations
- GET /api/dependencies - Service dependencies
- POST /api/archive/{traceID} - Archive a trace
- GET /api/metrics/latencies - Latency metrics
- GET /api/metrics/calls - Call rate metrics
- GET /api/metrics/errors - Error rate metrics
API v3 endpoints are available under /api/v3/ with similar functionality but using OTLP format.
Request Flow
- Request arrives at HTTP or gRPC server
- Handler validates request parameters and converts to internal format
- QueryService executes the query against primary storage
- Archive fallback triggered if trace not found (when configured)
- Adjustments applied (clock skew, span ordering) unless raw traces requested
- Response streamed back to client in chunks (for large traces)
Storage Integration
The query service abstracts storage through two interfaces:
- Primary storage: Fast, recent trace data (e.g., Elasticsearch, Cassandra)
- Archive storage: Long-term retention (optional, separate backend)
When a trace isn't found in primary storage, the service automatically queries archive storage. This enables cost-effective retention policies where hot data stays in fast storage and cold data moves to cheaper backends.
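The fallback logic can be sketched roughly as below. Names here are hypothetical; the real QueryService operates on trace readers and iterators rather than plain maps:

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("trace not found")

// store is a toy stand-in for a storage backend's trace reader.
type store map[string]string

func (s store) GetTrace(id string) (string, error) {
	if t, ok := s[id]; ok {
		return t, nil
	}
	return "", errNotFound
}

// getTraceWithFallback queries primary storage first and falls back to
// archive storage only when the trace is missing (and archive is configured).
func getTraceWithFallback(primary, archive store, id string) (string, error) {
	t, err := primary.GetTrace(id)
	if errors.Is(err, errNotFound) && archive != nil {
		return archive.GetTrace(id)
	}
	return t, err
}

func main() {
	primary := store{"abc": "recent-trace"}
	archive := store{"old": "archived-trace"}
	t, _ := getTraceWithFallback(primary, archive, "old")
	fmt.Println(t)
}
```

Note that only a not-found result triggers the fallback; a genuine storage error from primary is returned directly rather than masked by an archive lookup.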
Error Handling
- Trace not found: Returns codes.NotFound (gRPC) or 404 (HTTP)
- Invalid parameters: Returns codes.InvalidArgument (gRPC) or 400 (HTTP)
- Storage errors: Returns codes.Internal (gRPC) or 500 (HTTP)
- Nil requests: Rejected with validation errors
Performance Considerations
- Streaming responses: Large traces are sent in chunks (max 10 spans per chunk) to avoid memory overhead
- Raw traces option: Skips adjustment processing for faster responses when clock skew correction isn't needed
- Concurrent requests: Both HTTP and gRPC servers run concurrently with independent listeners
- Tenancy support: Multi-tenant deployments use interceptors to enforce data isolation
Sampling & Adaptive Sampling
Relevant Files
- internal/sampling/samplingstrategy
- internal/sampling/samplingstrategy/adaptive
- internal/sampling/http/handler.go
- internal/sampling/grpc/grpc_handler.go
- cmd/jaeger/internal/processors/adaptivesampling
- cmd/jaeger/internal/extension/remotesampling
Jaeger supports two sampling strategies: file-based (static) and adaptive (dynamic). Both are served via HTTP and gRPC endpoints that SDKs query to determine which traces to sample.
File-Based Sampling
File-based sampling uses a static JSON configuration file that defines sampling probabilities for services and operations. The Provider periodically reloads the file at a configurable interval, allowing updates without restarting the collector.
Configuration example:
{
"default_strategy": {
"type": "probabilistic",
"param": 0.5
},
"service_strategies": [
{
"service": "my-service",
"type": "probabilistic",
"param": 0.8,
"operation_strategies": [
{
"operation": "/health",
"type": "probabilistic",
"param": 0.0
}
]
}
]
}
Supported strategy types: probabilistic (sampling rate 0-1) and ratelimiting (max traces per second).
Adaptive Sampling
Adaptive sampling dynamically adjusts sampling probabilities based on observed traffic patterns. It aims to maintain a target number of samples per second across all services and operations.
Three main components:
- Aggregator — Runs in the trace processing pipeline. Observes root spans, counts traces per service/operation, and periodically flushes throughput metrics to storage.
- Post-Aggregator — Loads throughput data from storage (aggregated across all collector instances), calculates optimal sampling probabilities to meet the target QPS, and writes probabilities back to storage. Uses leader-follower election to ensure only one instance performs calculations.
- Provider — Periodically reads computed probabilities from storage and converts them into SamplingStrategyResponse objects served to SDKs. Followers refresh probabilities at a shorter interval than leaders.
Key Configuration
Adaptive sampling options:
- target_samples_per_second — Global target QPS (e.g., 100 traces/sec)
- initial_sampling_probability — Probability for new services/operations (default 0.001)
- min_samples_per_second — Minimum QPS per operation
- aggregation_interval — How often the aggregator flushes throughput (default 1 second)
- calculation_interval — How often the post-aggregator recalculates probabilities (default 1 minute)
Important: Adaptive sampling does not perform sampling itself. The Jaeger backend calculates probabilities and exposes them via the Remote Sampling protocol. OpenTelemetry SDKs query this endpoint and perform the actual sampling based on returned probabilities.
Remote Sampling Extension
The remotesampling extension manages both file-based and adaptive providers. It exposes HTTP and gRPC endpoints for SDKs to query sampling strategies. Configuration specifies either file or adaptive (not both):
extensions:
remotesampling:
http:
endpoint: 0.0.0.0:5778
adaptive:
sampling_store: badger
target_samples_per_second: 100
The extension integrates with the jaegerstorage extension to access the sampling store backend (memory, Cassandra, Badger, Elasticsearch, OpenSearch).
Authentication & Multi-Tenancy
Relevant Files
- internal/auth/transport.go
- internal/auth/bearertoken/http.go
- internal/auth/bearertoken/grpc.go
- internal/auth/bearertoken/context.go
- internal/auth/apikey/apikey-context.go
- internal/tenancy/manager.go
- internal/tenancy/grpc.go
- internal/tenancy/http.go
- internal/tenancy/context.go
- internal/tenancy/flags.go
Overview
Jaeger implements a flexible authentication and multi-tenancy system that supports bearer tokens, API keys, and tenant isolation. Authentication mechanisms propagate credentials across HTTP and gRPC transports, while the tenancy system enables data isolation in multi-tenant deployments.
Authentication Architecture
The authentication system uses a pluggable design with two primary mechanisms:
Bearer Tokens extract credentials from HTTP Authorization headers or gRPC metadata. The PropagationHandler middleware parses bearer tokens from incoming requests and stores them in the request context for downstream propagation. It supports both standard Authorization: Bearer <token> format and fallback to X-Forwarded-Access-Token headers.
API Keys provide an alternative authentication method stored directly in context. The GetAPIKey() and ContextWithAPIKey() functions manage API key lifecycle within request contexts.
The RoundTripper wrapper injects authentication headers into outbound HTTP requests by extracting tokens from the request context or using fallback token functions. This enables seamless credential propagation across service boundaries.
Bearer Token Propagation
Bearer tokens propagate through both HTTP and gRPC layers:
- HTTP: PropagationHandler extracts tokens from request headers and injects them into context
- gRPC Server: NewUnaryServerInterceptor() and NewStreamServerInterceptor() extract tokens from gRPC metadata
- gRPC Client: NewUnaryClientInterceptor() and NewStreamClientInterceptor() inject tokens into outgoing request metadata
// Extract from HTTP request
authHeaderValue := r.Header.Get("Authorization")
parts := strings.SplitN(authHeaderValue, " ", 2) // "Bearer <token>"
if len(parts) == 2 {
    token := parts[1]
    ctx = ContextWithBearerToken(ctx, token)
    // Propagate to gRPC metadata
    ctx = metadata.AppendToOutgoingContext(ctx, "bearer.token", token)
}
Multi-Tenancy System
The tenancy system isolates data by tenant when enabled. The Manager validates tenant headers against a configured list of allowed tenants.
Configuration uses command-line flags:
- --multi-tenancy.enabled: Enable tenant isolation
- --multi-tenancy.header: HTTP header name (default: x-tenant)
- --multi-tenancy.tenants: Comma-separated list of allowed tenant values
Tenant Extraction follows a priority order:
- Check if tenant is already in context (via GetTenant())
- Check OpenTelemetry client metadata
- Extract from gRPC incoming metadata or HTTP headers
Validation ensures exactly one tenant value per request. Multiple or missing tenant headers return PermissionDenied or Unauthenticated gRPC errors.
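A simplified sketch of that validation rule (illustrative only; the real logic lives in internal/tenancy/manager.go and returns gRPC status codes rather than plain errors):

```go
package main

import (
	"errors"
	"fmt"
)

// Manager validates tenant header values against an allow-list.
type Manager struct {
	allowed map[string]bool
}

func NewManager(tenants []string) *Manager {
	m := &Manager{allowed: map[string]bool{}}
	for _, t := range tenants {
		m.allowed[t] = true
	}
	return m
}

// Validate enforces "exactly one tenant value per request":
// missing maps to Unauthenticated, extra or unknown to PermissionDenied.
func (m *Manager) Validate(headerValues []string) (string, error) {
	if len(headerValues) == 0 {
		return "", errors.New("missing tenant header") // Unauthenticated
	}
	if len(headerValues) > 1 {
		return "", errors.New("extra tenant header") // PermissionDenied
	}
	tenant := headerValues[0]
	if !m.allowed[tenant] {
		return "", errors.New("unknown tenant") // PermissionDenied
	}
	return tenant, nil
}

func main() {
	m := NewManager([]string{"acme", "globex"})
	tenant, err := m.Validate([]string{"acme"})
	fmt.Println(tenant, err)
}
```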
Request Flow Diagram
Context Propagation Pattern
Both authentication and tenancy use Go's context.Context for request-scoped storage:
// Bearer token context
ctx = ContextWithBearerToken(ctx, token)
token, ok := GetBearerToken(ctx)
// Tenant context
ctx = WithTenant(ctx, "tenant-id")
tenant := GetTenant(ctx)
// API key context
ctx = ContextWithAPIKey(ctx, apiKey)
apiKey, ok := GetAPIKey(ctx)
Integration Points
- HTTP Handlers: Wrap with PropagationHandler and ExtractTenantHTTPHandler
- gRPC Servers: Register interceptors from bearertoken and tenancy packages
- gRPC Clients: Use client interceptors to inject credentials into outbound calls
- Storage Backends: Access credentials via context for per-tenant data isolation
Utilities & Tools
Relevant Files
- cmd/tracegen
- cmd/anonymizer
- cmd/remote-storage
- cmd/es-index-cleaner
- cmd/es-rollover
- cmd/esmapping-generator
- internal/metrics
Jaeger provides several command-line utilities and tools to support operational tasks, testing, and observability. These tools extend the core platform with specialized functionality for trace generation, data management, and metrics collection.
Trace Generation & Testing
tracegen generates a steady flow of synthetic traces for performance testing and tuning. It creates traces concurrently from multiple worker goroutines, allowing you to simulate realistic trace patterns without requiring actual application instrumentation.
Key features:
- Configurable number of workers and trace duration
- Support for multiple services and child spans
- Integration with OpenTelemetry exporters (OTLP, stdout)
- Adaptive sampling support via remote sampling endpoints
- Available as Docker image: jaegertracing/jaeger-tracegen
Example usage:
docker run jaegertracing/jaeger-tracegen -service myapp -traces 1000 -workers 4
Data Anonymization
anonymizer queries Jaeger for a specific trace and anonymizes sensitive fields using hashing. This utility is useful for sharing trace data for debugging without exposing production information.
The tool:
- Fetches traces from Jaeger query service
- Hashes standard tags, custom tags, logs, and process information
- Outputs original, anonymized, and UI-compatible JSON files
- Generates mapping files to track anonymization transformations
Remote Storage
remote-storage exposes single-node storage backends (memory, Badger) over gRPC, implementing the Jaeger Remote Storage API. This enables distributed deployments where multiple Jaeger components share a centralized storage backend.
Configuration supports:
- Memory storage with configurable trace limits
- Badger persistent storage with TTL policies
- Multi-tenancy with tenant-specific storage backends
- gRPC endpoint configuration and TLS
Elasticsearch Index Management
es-index-cleaner removes old Jaeger indices from Elasticsearch to manage storage costs and retention policies. It calculates a cutoff date and deletes indices older than the specified number of days.
es-rollover manages Elasticsearch index lifecycle through three operations:
- init: Creates initial indices and aliases
- rollover: Transitions to new write indices when size or age thresholds are met
- lookback: Removes old indices from read aliases
Both tools support:
- Index prefix customization
- Archive and dependency index handling
- Elasticsearch authentication and TLS
- Index Lifecycle Management (ILM) policies
esmapping-generator generates Elasticsearch mappings for Jaeger indices, ensuring proper field types and analysis configurations.
Metrics Infrastructure
The internal/metrics package provides an abstraction layer for metrics collection with multiple backend implementations:
Core Interfaces:
- Counter: Tracks event occurrences
- Gauge: Records instantaneous measurements
- Timer: Measures operation duration
- Histogram: Tracks value distributions
Implementations:
- Prometheus: Default backend using Prometheus client library
- OpenTelemetry: OTEL metrics integration
- Local: In-memory metrics for testing via metricstest
- Null: No-op implementation for disabling metrics
Metrics are initialized via reflection using struct tags:
type MyMetrics struct {
RequestCount metrics.Counter `metric:"requests.count"`
Duration metrics.Timer `metric:"request.duration"`
}
metrics.Init(&m, factory, globalTags)
The metricsbuilder package provides CLI flag support for selecting backends and configuring HTTP scrape endpoints.
Configuration & Deployment
Relevant Files
- cmd/jaeger/config.yaml
- cmd/jaeger/internal/all-in-one.yaml
- cmd/jaeger/internal/command.go
- cmd/internal/storageconfig/config.go
- cmd/jaeger/Dockerfile
- docker-compose/
Jaeger v2 uses YAML-based configuration built on the OpenTelemetry Collector framework. The system supports multiple deployment modes, from all-in-one development setups to distributed production architectures with various storage backends.
Configuration System
Jaeger v2 configuration is managed through YAML files that define the complete pipeline: receivers, processors, exporters, extensions, and telemetry settings. The configuration system uses Viper for loading and environment variable substitution.
Configuration Loading:
- Configurations are loaded via the --config flag pointing to a YAML file
- Environment variables can override values using ${env:VAR_NAME:-default} syntax
- If no config is provided, Jaeger defaults to an embedded all-in-one configuration with memory storage
- Multiple configuration providers are supported: file, HTTP, HTTPS, environment, and YAML
Core Configuration Structure:
service:
extensions: [jaeger_storage, jaeger_query]
pipelines:
traces:
receivers: [otlp, jaeger, zipkin]
processors: [batch]
exporters: [jaeger_storage_exporter]
extensions:
jaeger_storage:
backends:
primary-store:
memory:
max_traces: 100000
jaeger_query:
storage:
traces: primary-store
Storage Backends
Jaeger v2 supports multiple storage backends configured under jaeger_storage.backends. Each backend can be independently configured and referenced by name.
Supported Backends:
- Memory - In-process storage, ideal for development and testing
- Badger - Embedded key-value store with TTL support for single-node deployments
- Cassandra - Distributed NoSQL database for high-scale deployments
- Elasticsearch/OpenSearch - Full-text search and analytics capabilities
- ClickHouse - Columnar database optimized for trace analytics
- gRPC - Remote storage via the Jaeger Remote Storage API
Example: Elasticsearch Configuration
jaeger_storage:
backends:
main-storage:
elasticsearch:
server_urls:
- http://localhost:9200
indices:
index_prefix: "jaeger-main"
spans:
date_layout: "2006-01-02"
rollover_frequency: "day"
Deployment Modes
All-in-One (Default): Runs without a config file using embedded all-in-one.yaml. Includes memory storage, query service, and sampling endpoints. Suitable for development and demos.
Distributed: Multiple Jaeger instances with separate collector and query components, each configured independently. Enables horizontal scaling and high availability.
Remote Storage: Uses the jaeger-remote-storage binary to expose single-node backends (Memory, Badger) over gRPC, allowing multiple Jaeger instances to share storage.
Docker Deployment
The Dockerfile exposes standard Jaeger ports:
- 4317 - OTLP gRPC receiver
- 4318 - OTLP HTTP receiver
- 14250 - Jaeger gRPC receiver
- 14268 - Jaeger HTTP receiver
- 9411 - Zipkin receiver
- 16686 - Web UI
- 5778 - Sampling config HTTP
- 5779 - Sampling config gRPC
- 13133 - Health check HTTP
Docker Compose Example:
services:
jaeger:
image: jaegertracing/jaeger:latest
volumes:
- "./config.yaml:/etc/jaeger/config.yml"
command: ["--config", "/etc/jaeger/config.yml"]
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
Configuration Validation
All configuration components implement validation through the Validate() method. Storage backends, query settings, and exporters are validated at startup to catch configuration errors early. The system uses govalidator for struct validation with required field checks.
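As an illustration of that pattern, a hypothetical backend config might implement Validate() like this (the type and field names are invented for the sketch, not Jaeger's actual config structs):

```go
package main

import (
	"errors"
	"fmt"
)

// MemoryConfig is an illustrative config struct for a memory backend.
type MemoryConfig struct {
	MaxTraces int
}

// Validate runs at startup so misconfiguration fails fast,
// before any pipeline components are wired up.
func (c *MemoryConfig) Validate() error {
	if c.MaxTraces <= 0 {
		return errors.New("memory storage: max_traces must be positive")
	}
	return nil
}

func main() {
	cfg := &MemoryConfig{MaxTraces: 0}
	if err := cfg.Validate(); err != nil {
		fmt.Println("startup aborted:", err)
	}
}
```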
Environment Variable Substitution
Configuration values support environment variable expansion:
jaeger_storage:
backends:
main:
elasticsearch:
server_urls:
- "${env:ELASTICSEARCH_URL:-http://localhost:9200}"
This enables flexible deployment across different environments without modifying configuration files.
Testing & Integration
Relevant Files
- internal/storage/integration – Unit-mode storage integration tests
- cmd/jaeger/internal/integration – End-to-end integration tests for Jaeger v2
- internal/testutils – Shared testing utilities (leak detection, logging)
- internal/metricstest – Metrics testing helpers
Jaeger uses a two-tier testing strategy: unit-mode tests that directly exercise storage APIs, and end-to-end (E2E) tests that validate the full Jaeger v2 collector pipeline.
Unit-Mode Storage Integration Tests
Located in internal/storage/integration, these tests write and read span data directly to storage backends without going through the network layer. The StorageIntegration struct provides a reusable framework:
type StorageIntegration struct {
TraceWriter tracestore.Writer
TraceReader tracestore.Reader
DependencyWriter depstore.Writer
DependencyReader depstore.Reader
SamplingStore samplingstore.Store
CleanUp func(t *testing.T)
}
Each storage backend (Elasticsearch, Cassandra, Badger, etc.) implements its own test file that instantiates this struct and calls RunAll() or RunSpanStoreTests(). Tests are conditionally skipped using SkipUnlessEnv() based on environment variables like STORAGE=elasticsearch.
End-to-End Integration Tests
The cmd/jaeger/internal/integration package tests the complete Jaeger v2 OtelCol binary. These tests:
- Start the Jaeger v2 binary with a specific storage backend configuration
- Write spans via OTLP RPC client to the collector’s receiver
- Read spans via RPC client to the jaeger_query extension
- Verify results match expected data
The E2EStorageIntegration struct extends StorageIntegration and manages binary lifecycle:
type E2EStorageIntegration struct {
integration.StorageIntegration
ConfigFile string
BinaryName string
HealthCheckPort int
EnvVarOverrides map[string]string
}
Storage Cleaner Extension
Integration tests require clean state between test runs. The storagecleaner extension auto-injects into collector configs and exposes an HTTP endpoint (POST /purge) that calls the storage backend’s Purge() method.
Test Utilities
Leak Detection (internal/testutils/leakcheck.go):
- VerifyGoLeaks() detects goroutine leaks in TestMain
- Ignores expected leaks from glog, go-metrics, and HTTP transports
- Call via defer testutils.VerifyGoLeaksOnce(t) for specific tests
Logging (internal/testutils/logger.go):
- NewLogger() creates a zap logger backed by a test buffer
- Useful for capturing and asserting log output in tests
Metrics Testing (internal/metricstest):
- AssertCounterMetrics() and AssertGaugeMetrics() verify metric values
- Snapshot metrics and compare against expected values
Running Tests
# Unit tests with memory storage
make test
# Integration tests for a specific backend
STORAGE=elasticsearch make jaeger-v2-storage-integration-test
# Coverage report
STORAGE=memory make cover
# Format and lint
make fmt
make lint
Tests use go test with build tags like memory_storage_integration to conditionally compile storage implementations. The Makefile orchestrates test execution with proper environment setup and colorized output.