prometheus/prometheus

Prometheus Monitoring System

Last updated on Dec 18, 2025 (Commit: 4c7377f)

Overview

Relevant Files
  • README.md
  • cmd/prometheus/main.go
  • docs/getting_started.md
  • documentation/internal_architecture.md

Prometheus is a Cloud Native Computing Foundation project that provides a comprehensive systems and service monitoring platform. It collects metrics from configured targets at regular intervals, evaluates rule expressions, displays results, and triggers alerts when specified conditions are met.

Key Characteristics

Prometheus distinguishes itself through several core features:

  • Multi-dimensional data model: Time series are defined by metric name and key/value label pairs, enabling flexible querying and aggregation
  • PromQL query language: A powerful, flexible language designed to leverage the multi-dimensional data model
  • Autonomous single-server architecture: No dependency on distributed storage; each Prometheus instance operates independently
  • Pull-based metrics collection: Prometheus pulls metrics from targets via HTTP, with optional push support via an intermediary gateway
  • Service discovery: Targets are discovered dynamically or configured statically
  • Flexible visualization: Multiple modes for graphing and dashboarding
  • Federation support: Hierarchical and horizontal federation capabilities for scaling

Architecture Overview


Core Components

Scrape Manager: Orchestrates metric collection from discovered targets. It manages scrape pools for each job configuration and applies target relabeling to produce final scrape targets.

Storage Layer: Abstracts local and remote storage through a fanout mechanism. The local TSDB stores time series on disk, while remote storage endpoints receive copies of all samples for long-term retention or external systems.

PromQL Engine: Evaluates PromQL queries against the time series database. It parses expressions into abstract syntax trees and executes them lazily over time series iterators.

Rule Manager: Periodically evaluates recording and alerting rules using PromQL. For alerting rules, it manages alert lifecycle, tracks state transitions, and sends alerts to the notifier.

Notifier: Decouples alert generation from dispatch, queuing alerts and forwarding them to configured Alertmanager instances.

Running Prometheus

Prometheus can run in two modes: Server mode (default) for full monitoring capabilities, or Agent mode for lightweight metric collection and remote write only.

Basic startup requires a configuration file:

./prometheus --config.file=prometheus.yml

The web interface is available at http://localhost:9090 by default, providing query execution, alert inspection, and server status monitoring.
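The command above expects a configuration file; a minimal example covering one scrape job (the interval and target are illustrative defaults):

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  # Prometheus scraping its own metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
```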

Architecture & Data Flow

Relevant Files
  • cmd/prometheus/main.go
  • storage/fanout.go
  • storage/interface.go
  • documentation/internal_architecture.md

Prometheus follows a modular actor-based architecture where independent components coordinate through channels and shared interfaces. The main function orchestrates startup and shutdown using the oklog/run group pattern, ensuring proper initialization order and graceful termination.

High-Level System Overview


Core Components

Scrape Discovery Manager discovers and continuously updates monitoring targets using service discovery mechanisms (Kubernetes, DNS, Consul, etc.). It runs one discovery instance per scrape config and sends target group updates over a synchronization channel to the scrape manager.

Scrape Manager maintains a hierarchy of scrape pools and scrape loops. Each scrape pool corresponds to a scrape config, and each loop handles one target. The manager applies target relabeling, spreads scrapes deterministically across the interval, and forwards scraped samples to storage.

Fanout Storage abstracts local and remote storage behind a unified interface. It differentiates between primary (local TSDB) and secondary (remote) storages. Reads merge results from all sources; writes duplicate to all destinations. This allows Prometheus to simultaneously store locally and forward to remote endpoints.
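The duplicated write path can be sketched as follows; `Appender`, `memStore`, and `fanoutAppender` here are simplified stand-ins for Prometheus's real storage interfaces, not its actual API:

```go
package main

import "fmt"

// Appender is a simplified stand-in for the storage append interface.
type Appender interface {
	Append(series string, ts int64, v float64) error
}

// memStore counts appended samples; it stands in for the local TSDB
// or a remote write queue.
type memStore struct{ samples int }

func (m *memStore) Append(series string, ts int64, v float64) error {
	m.samples++
	return nil
}

// fanoutAppender duplicates every write to the primary and all
// secondaries, mirroring local TSDB plus remote endpoints.
type fanoutAppender struct {
	primary     Appender
	secondaries []Appender
}

func (f *fanoutAppender) Append(series string, ts int64, v float64) error {
	// A primary (local TSDB) failure fails the write; the real
	// implementation treats secondary errors more leniently.
	if err := f.primary.Append(series, ts, v); err != nil {
		return err
	}
	for _, s := range f.secondaries {
		s.Append(series, ts, v)
	}
	return nil
}

// fanoutDemo appends one sample and reports how many samples each store saw.
func fanoutDemo() (int, int) {
	local, remote := &memStore{}, &memStore{}
	f := &fanoutAppender{primary: local, secondaries: []Appender{remote}}
	f.Append(`up{job="prometheus"}`, 1700000000000, 1)
	return local.samples, remote.samples
}

func main() {
	l, r := fanoutDemo()
	fmt.Println(l, r) // the write reached both destinations
}
```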

Local Storage (TSDB) persists time series data on disk in blocks. The TSDB manages the head chunk (in-memory), WAL (write-ahead log), and compacted blocks. It supports out-of-order samples, exemplars, and native histograms.

Remote Storage manages remote write and read endpoints. For each remote_write config, it creates a queue manager that parallelizes writes using dynamic sharding. Remote read results are merged with local data.

PromQL Engine evaluates queries lazily against the storage. It parses expressions into ASTs, creates iterators for matching series, and evaluates operators. The engine is used by both the web API and rule manager.

Rule Manager evaluates recording and alerting rules periodically. It writes recording rule results back to storage and sends firing alerts to the notifier, tracking alert state transitions based on the configured for duration.

Notifier decouples alert generation from dispatch. It queues alerts from the rule manager and forwards them to configured Alertmanager instances, with configurable batch sizes and queue capacity.

Data Flow

Metrics flow from targets through scrape loops into the fanout storage, which duplicates writes to local TSDB and remote endpoints. Queries originate from the web API or rule manager, hitting the fanout storage which merges local and remote results. Rules generate new time series (recording rules) or alerts (alerting rules) that feed back into storage or the notifier.

Configuration Reload

The configuration reload handler listens for SIGHUP signals or web requests. On reload, it re-reads the config file and applies it to all components via their ApplyConfig() methods. Discovery managers restart with new configs, scrape managers update pools, and remote storage recreates queue managers.

Metric Collection & Service Discovery

Relevant Files
  • scrape/manager.go
  • scrape/scrape.go
  • scrape/metrics.go
  • discovery/manager.go
  • discovery/discovery.go
  • discovery/metrics.go
  • discovery/targetgroup/targetgroup.go

Overview

Prometheus uses a two-stage pipeline to collect metrics: Service Discovery identifies targets to scrape, and Scrape Management collects metrics from those targets. These components work together through a channel-based architecture that decouples target discovery from metric collection.


Service Discovery

The Discovery Manager maintains a set of Discoverer implementations (Kubernetes, Consul, AWS EC2, DNS, file-based, etc.) that identify available targets. Each discoverer implements the Discoverer interface:

type Discoverer interface {
    Run(ctx context.Context, up chan<- []*targetgroup.Group)
}

Discoverers send TargetGroup objects through a channel, where each group contains:

  • Targets: List of endpoints with labels (e.g., __address__, __meta_* labels)
  • Labels: Common labels applied to all targets in the group
  • Source: Unique identifier for the group

The Discovery Manager aggregates updates from all providers and sends consolidated target updates to the Scrape Manager via SyncCh(). It batches updates with a configurable delay (default 5 seconds) to avoid excessive reloads.
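The Discoverer interface can be exercised with a toy static discoverer; the `Group` struct below is a simplified stand-in for `targetgroup.Group`, and the channel-based hand-off mirrors how providers feed the Discovery Manager:

```go
package main

import (
	"context"
	"fmt"
)

// Group is a simplified stand-in for targetgroup.Group.
type Group struct {
	Targets []string
	Labels  map[string]string
	Source  string
}

// Discoverer mirrors the interface from discovery/discovery.go.
type Discoverer interface {
	Run(ctx context.Context, up chan<- []*Group)
}

// staticDiscoverer sends one fixed target group, then waits for cancellation.
type staticDiscoverer struct{ targets []string }

func (d *staticDiscoverer) Run(ctx context.Context, up chan<- []*Group) {
	select {
	case up <- []*Group{{
		Targets: d.targets,
		Labels:  map[string]string{"job": "demo"},
		Source:  "static/0",
	}}:
	case <-ctx.Done():
		return
	}
	<-ctx.Done()
}

// collectOnce runs a discoverer and returns its first batch of target groups.
func collectOnce(d Discoverer) []*Group {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	up := make(chan []*Group)
	go d.Run(ctx, up)
	return <-up
}

func main() {
	groups := collectOnce(&staticDiscoverer{targets: []string{"localhost:9100"}})
	fmt.Println(groups[0].Source, groups[0].Targets)
}
```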

Scrape Management

The Scrape Manager receives target groups from the Discovery Manager and organizes them into scrape pools, one per scrape configuration. Each pool manages:

  • Target lifecycle: Adding, updating, and removing targets
  • HTTP client configuration: TLS, authentication, headers
  • Scrape loops: Concurrent goroutines that periodically fetch metrics from targets
  • Relabeling: Applying label transformations before storage

When targets change, the manager calls Sync() on the affected pool, which reconciles the current targets with the new set. Scrape loops are created or destroyed as needed.
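The reconciliation step can be modeled as a set difference; this sketch keys targets by address only, whereas the real `Sync()` works on full label sets:

```go
package main

import "fmt"

// syncTargets reconciles a pool's current targets with a newly discovered
// set, returning which scrape loops to start and which to stop.
func syncTargets(current, discovered []string) (start, stop []string) {
	cur := map[string]bool{}
	for _, t := range current {
		cur[t] = true
	}
	next := map[string]bool{}
	for _, t := range discovered {
		next[t] = true
		if !cur[t] {
			start = append(start, t) // new target: create a scrape loop
		}
	}
	for _, t := range current {
		if !next[t] {
			stop = append(stop, t) // vanished target: destroy its loop
		}
	}
	return start, stop
}

func main() {
	start, stop := syncTargets(
		[]string{"a:9090", "b:9090"},
		[]string{"b:9090", "c:9090"},
	)
	fmt.Println(start, stop)
}
```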

Metrics Collection

Both managers expose Prometheus metrics for observability:

Discovery Metrics (prometheus_sd_*):

  • prometheus_sd_discovered_targets: Current target count per config
  • prometheus_sd_failed_configs: Failed SD configurations
  • prometheus_sd_received_updates_total: Total updates from providers
  • prometheus_sd_updates_total: Updates sent to consumers

Scrape Metrics (prometheus_target_*):

  • prometheus_target_scrape_pools: Active scrape pools
  • prometheus_target_scrape_pool_reloads_total: Pool reload count
  • prometheus_target_scrape_pool_targets_added: Targets per pool
  • prometheus_target_interval_length_seconds: Scrape interval distribution
  • prometheus_target_scrapes_exceeded_sample_limit_total: Scrapes rejected for exceeding the sample limit

Configuration Flow

  1. Config Load: Prometheus loads scrape configs with discovery sections
  2. Provider Registration: Discovery Manager starts providers for each config
  3. Initial Discovery: Providers send initial target groups
  4. Pool Creation: Scrape Manager creates pools and scrape loops
  5. Continuous Updates: Providers send updates on target changes
  6. Reload: Scrape Manager syncs pools with new target groups

Time Series Database (TSDB)

Relevant Files
  • tsdb/db.go
  • tsdb/head.go
  • tsdb/block.go
  • tsdb/compact.go
  • tsdb/wlog/
  • tsdb/chunks/

Prometheus TSDB is a time series storage engine optimized for high-cardinality metrics at scale. It uses a hybrid architecture combining in-memory and on-disk storage with automatic compaction.

Architecture Overview


Core Components

Head Block manages recent time series data in memory. It holds all active series in a stripeSeries structure, which shards series by ID and label hash to reduce lock contention. The Head maintains:

  • memSeries: In-memory representation of each series with chunks and metadata
  • MemPostings: Reverse index mapping label-value pairs to series references
  • WAL (Write Ahead Log): Persists writes for crash recovery
  • Chunk Disk Mapper: Memory-maps head chunks to disk for efficient storage

Persistent Blocks store immutable time series data on disk. Each block covers a continuous time range and contains:

  • Index: Maps label-value pairs to series and chunks
  • Chunks: Compressed time series data (delta-of-delta timestamps with XOR-encoded float values; native histograms use dedicated chunk encodings)
  • Tombstones: Marks deleted data without rewriting chunks
  • Meta: Block metadata including ULID, time range, and compaction history

Data Flow: Write Path

  1. Append: Call db.Appender() to get an appender for writing samples
  2. Series Lookup: Appender finds or creates memSeries in stripeSeries by label hash
  3. Chunk Management: Samples are appended to the current chunk; new chunks are cut when full
  4. WAL Write: Samples are written to the Write Ahead Log for durability
  5. Commit: Appender commits changes; if Head is compactable, triggers compaction

Compaction Strategy

The LeveledCompactor uses exponential block ranges (default: 2h, 4h, 8h, etc.) to organize data:

  • Level 0: Head block compacted into first persistent block
  • Level N: Blocks of similar size are merged upward
  • Overlapping Compaction: Handles out-of-order data by merging overlapping blocks
  • Retention: Old blocks are deleted based on retention duration or max bytes

Compaction merges series from multiple blocks, deduplicates samples, applies tombstones, and re-indexes data. The process is automatic and runs in the background.
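The range planning can be sketched as a simple geometric progression; the step count and multiplier below are illustrative (following the 2h, 4h, 8h example above), not the compactor's exact defaults:

```go
package main

import "fmt"

// exponentialBlockRanges models the leveled compactor's range planning:
// each level's block range is the previous one multiplied by a step size.
func exponentialBlockRanges(minSize int64, steps int, stepSize int64) []int64 {
	ranges := make([]int64, 0, steps)
	cur := minSize
	for i := 0; i < steps; i++ {
		ranges = append(ranges, cur)
		cur *= stepSize
	}
	return ranges
}

func main() {
	// 2h base range in milliseconds, 3 levels, multiplier 2 -> 2h, 4h, 8h.
	fmt.Println(exponentialBlockRanges(2*60*60*1000, 3, 2))
}
```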

Query Path

Queries merge data from the Head and all persistent blocks:

  1. Series Selection: Use MemPostings (Head) and block indices to find matching series
  2. Chunk Iteration: Iterate chunks in time order across all blocks
  3. Sample Decoding: Decompress and decode samples on-the-fly
  4. Merging: Combine results from multiple blocks with proper ordering

Key Optimizations

  • Striped Locking: Series sharded by ID/hash to reduce contention
  • Chunk Encoding: XOR compression reduces storage by ~90%
  • Memory Mapping: Head chunks mapped to disk, reducing memory pressure
  • Isolation: MVCC-style isolation prevents readers from seeing uncommitted writes
  • Out-of-Order Support: Separate OOO head and chunks for late-arriving data
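The striped-locking idea behind stripeSeries can be sketched with a hash-sharded map; shard count, hash function, and value type here are illustrative:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const stripes = 16 // a power of two so the mask below works

// shard is one lock-protected partition of the map.
type shard struct {
	mu sync.Mutex
	m  map[string]float64
}

// stripedMap reduces lock contention by sharding entries across
// independent locks, keyed by a hash of the series labels.
type stripedMap struct {
	shards [stripes]*shard
}

func newStripedMap() *stripedMap {
	s := &stripedMap{}
	for i := range s.shards {
		s.shards[i] = &shard{m: map[string]float64{}}
	}
	return s
}

func (s *stripedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return s.shards[h.Sum32()&(stripes-1)]
}

func (s *stripedMap) set(key string, v float64) {
	sh := s.shardFor(key)
	sh.mu.Lock()
	sh.m[key] = v
	sh.mu.Unlock()
}

func (s *stripedMap) get(key string) float64 {
	sh := s.shardFor(key)
	sh.mu.Lock()
	defer sh.mu.Unlock()
	return sh.m[key]
}

func main() {
	m := newStripedMap()
	m.set(`up{job="prometheus"}`, 1)
	fmt.Println(m.get(`up{job="prometheus"}`))
}
```

Writers touching different series usually land on different shards, so they do not serialize on one global lock.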

PromQL Query Engine

Relevant Files
  • promql/engine.go
  • promql/functions.go
  • promql/parser/parse.go
  • promql/parser/ast.go
  • promql/value.go

The PromQL Query Engine is the core execution layer that transforms PromQL query strings into evaluated results. It orchestrates parsing, validation, and evaluation across time ranges while managing resource constraints and cancellation.

Architecture Overview


Core Components

Engine (Engine struct) manages the query lifecycle. It holds configuration like timeout, max samples per query, lookback delta, and feature flags. The engine creates queries and coordinates their execution through an evaluator.

Query represents a single query execution. It wraps the query string, parsed AST (EvalStmt), storage queryable, and statistics. The Exec() method triggers evaluation and returns a Result containing the value, warnings, and errors.

Evaluator performs the actual expression evaluation. It traverses the AST recursively, handling different expression types: vector selectors, matrix selectors, binary operations, aggregations, and function calls. It maintains state like current timestamp, sample count, and the storage querier.

Query Execution Flow

  1. Creation: NewInstantQuery() or NewRangeQuery() creates a query object and parses the expression string into an AST.

  2. Validation: The engine validates the AST structure and checks for unsupported features based on configuration flags.

  3. Execution: Exec() calls the engine's internal exec() method, which:

    • Acquires a querier from storage for the required time range
    • Populates series metadata at query preparation time
    • Creates an evaluator with the parsed expression
    • Evaluates the expression recursively
  4. Evaluation: The evaluator processes each expression node:

    • VectorSelector: Fetches matching series from storage
    • MatrixSelector: Extends vector selection with a time range
    • BinaryExpr: Applies operations with vector matching logic
    • AggregateExpr: Groups and reduces vectors
    • Call: Invokes built-in functions
  5. Result Assembly: Results are sorted and returned as a Matrix, Vector, Scalar, or String.
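The recursive dispatch in step 4 can be illustrated with a toy AST covering only scalars and binary operators; the real parser produces far richer node types (selectors, aggregations, calls):

```go
package main

import "fmt"

// Expr is a toy expression node.
type Expr interface{}

// Scalar is a constant leaf node.
type Scalar struct{ V float64 }

// BinaryExpr applies an operator to two sub-expressions.
type BinaryExpr struct {
	Op       string
	LHS, RHS Expr
}

// eval recursively dispatches on the node type, as the PromQL evaluator does.
func eval(e Expr) float64 {
	switch n := e.(type) {
	case Scalar:
		return n.V
	case BinaryExpr:
		l, r := eval(n.LHS), eval(n.RHS)
		switch n.Op {
		case "+":
			return l + r
		case "*":
			return l * r
		}
	}
	panic("unsupported expression")
}

func main() {
	// (2 + 3) * 4
	ast := BinaryExpr{
		Op:  "*",
		LHS: BinaryExpr{Op: "+", LHS: Scalar{2}, RHS: Scalar{3}},
		RHS: Scalar{4},
	}
	fmt.Println(eval(ast)) // 20
}
```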

Value Types

PromQL expressions evaluate to four types:

  • Scalar: A single float64 value with a timestamp
  • Vector: Multiple samples (instant vector), each with labels and a value
  • Matrix: Multiple series, each containing timestamped samples (range vector)
  • String: A string value with a timestamp

Resource Management

The engine enforces limits to prevent runaway queries:

  • Sample Limit: Tracks cumulative samples loaded; queries exceeding the limit are aborted with an error
  • Timeout: Context cancellation after configured duration
  • Active Query Tracking: Limits concurrent queries
  • Query Logger: Optional logging of all queries for debugging

Key Functions

evalSeries() generates a Matrix by iterating through storage series in step-sized intervals, collecting samples at each timestamp.

evalSubquery() evaluates nested subqueries, converting them to equivalent matrix selectors for composition.

eval() is the recursive dispatcher that handles all expression types, delegating to specialized methods for each node kind.

Built-in Functions (in functions.go) implement aggregations (sum, avg, max), range functions (rate, increase), and transformations (label_replace, label_join).

Error Handling

Errors during evaluation are caught via panic recovery in the evaluator. This allows early termination without explicit error checking at each step. Context cancellation and timeouts are converted to specific error types (ErrQueryCanceled, ErrQueryTimeout) for client handling.

Rules & Alerting

Relevant Files
  • rules/manager.go
  • rules/group.go
  • rules/alerting.go
  • rules/recording.go
  • notifier/manager.go

Prometheus evaluates rules in groups at regular intervals, generating alerts and recording metrics. The rules engine manages both alerting and recording rules, coordinating their evaluation and dispatching alerts to Alertmanager.

Rule Types

Prometheus supports two types of rules:

  • Alerting Rules - Generate alerts when a PromQL expression evaluates to a non-empty result. Alerts transition through states (Pending → Firing) based on a hold duration.
  • Recording Rules - Pre-compute and store the results of expensive PromQL expressions as new time series, improving query performance.

Both rule types are organized into groups, each with a name, file path, and evaluation interval.

Rule Groups and Evaluation

A Group contains multiple rules evaluated sequentially at a fixed interval. Groups are loaded from YAML files and managed by the Manager. Key aspects:

  • Interval-based Evaluation - Groups evaluate at consistent time slots determined by their hash and interval duration.
  • State Preservation - Alert state (Pending/Firing) persists across evaluations, enabling the hold duration mechanism.
  • Dependency Tracking - Rules can depend on outputs of other rules within the same group, enabling ordered evaluation.

// Each group runs its own evaluation loop at a fixed interval
tick := time.NewTicker(g.interval)
defer tick.Stop()
for range tick.C {
    g.evalIterationFunc(ctx, g, evalTimestamp)
}

Alert State Machine

Alerts follow a state progression:

  • StatePending - Alert condition met but hold duration not elapsed
  • StateFiring - Hold duration exceeded; alert is active
  • StateInactive - Alert no longer matches; kept briefly for Alertmanager resilience
  • KeepFiringFor - Optional duration to keep firing after resolution (useful for maintenance windows)

Alert Dispatch

The notifier.Manager handles sending alerts to Alertmanager:

  • Alerts are queued and batched (default 256 per request)
  • Pending alerts are not sent; only Firing and recently Resolved alerts
  • Resend delay ensures periodic re-notification of active alerts
  • External labels and relabeling rules are applied before dispatch

Recording Rules

Recording rules pre-compute expensive queries and store results as new metrics:

groups:
  - name: cpu_rules
    interval: 30s
    rules:
      - record: job:cpu:avg_5m
        expr: avg(rate(cpu_seconds_total[5m])) by (job)

Results are written to storage with the specified metric name and labels, enabling faster queries and reducing load on the query engine.

Rule Manager Lifecycle

The Manager orchestrates rule evaluation:

  1. Loads rule groups from files via GroupLoader
  2. Starts goroutines for each group to run evaluation loops
  3. Handles configuration reloads without losing alert state
  4. Coordinates with the notifier to dispatch alerts
  5. Tracks metrics: evaluation duration, failures, and alert counts

Remote Storage & Federation

Relevant Files
  • storage/remote/storage.go
  • storage/remote/queue_manager.go
  • storage/remote/write.go
  • storage/remote/read.go
  • prompb/remote.proto
  • web/federate.go

Prometheus supports sending metrics to remote storage systems and reading from them, enabling long-term storage, cross-instance queries, and federation patterns. This section covers the architecture and implementation of remote storage operations.

Remote Write Architecture

Remote write allows Prometheus to push samples to external storage backends. The system uses a queue-based architecture with sharding for parallel writes:

  1. WriteStorage manages all remote write endpoints configured in remote_write config sections
  2. QueueManager handles each endpoint independently, managing a queue of samples to send
  3. Shards parallelize writes across multiple goroutines, with dynamic resharding based on throughput

The flow is: Samples → WAL Watcher → QueueManager → Shards → HTTP Client → Remote Endpoint

Queue Manager & Sharding

The QueueManager maintains:

  • Series tracking: Maps series references to labels and metadata for relabeling
  • Batch queues: Each shard has a queue that batches samples before sending
  • EWMA rate tracking: Exponentially weighted moving average of samples in/out to calculate desired shards
  • Resharding: Automatically adjusts shard count every 10 seconds based on throughput

// Desired shards calculated from:
// - Incoming sample rate
// - Outgoing sample rate and latency
// - Pending backlog
// - Tolerance threshold (30% before scaling)
desiredShards = timePerSample * (dataInRate * dataKeptRatio + backlogCatchup)
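The formula plus tolerance band can be sketched as a function; the inputs and the exact shape are illustrative, not the queue manager's precise implementation:

```go
package main

import (
	"fmt"
	"math"
)

// desiredShards computes the shard count needed to keep up with the
// incoming rate and clear backlog, with a 30% tolerance band around the
// current count to avoid oscillating reshards.
func desiredShards(current int, timePerSample, dataInRate, dataKeptRatio, backlogCatchup float64) int {
	d := timePerSample * (dataInRate*dataKeptRatio + backlogCatchup)
	lower, upper := float64(current)*0.7, float64(current)*1.3
	if d >= lower && d <= upper {
		return current // within tolerance: don't reshard
	}
	return int(math.Ceil(d))
}

func main() {
	// 2ms per sample, 10k samples/s all kept, 5k samples/s of backlog
	// to catch up on -> far outside the band around 10 shards.
	fmt.Println(desiredShards(10, 0.002, 10000, 1.0, 5000))
}
```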

Data Types Supported

Remote write supports multiple data types:

  • Samples: Float64 values with timestamps
  • Exemplars: Trace references attached to samples
  • Native Histograms: Distribution data (v2 protocol)
  • Metadata: Metric type, unit, and help text

Remote Read

Remote read enables querying external storage. The Storage struct combines:

  • Read clients: One per remote_read config endpoint
  • Merge querier: Combines results from multiple remote sources
  • External label filtering: Optionally filters by external labels
  • Required matchers: Enforces label constraints on queries

// Querier merges results from all configured remote read endpoints
queriers := make([]storage.Querier, 0, len(queryables))
for _, queryable := range queryables {
    q, err := queryable.Querier(mint, maxt)
    if err != nil {
        return nil, err
    }
    queriers = append(queriers, q)
}
return storage.NewMergeQuerier(nil, queriers, storage.ChainedSeriesMerge), nil

Federation

Federation allows one Prometheus instance to scrape metrics from another via the /federate endpoint. The handler:

  1. Parses metric selectors from match[] query parameters
  2. Queries local storage for matching series
  3. Returns the most recent sample for each series in exposition format
  4. Supports both classic and native histogram formats

Federation is useful for aggregating metrics across multiple Prometheus instances without requiring remote storage.
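A federating instance is configured as an ordinary scrape job pointed at the /federate endpoint; the selector and target address below are illustrative:

```yaml
scrape_configs:
  - job_name: federate
    honor_labels: true        # keep the source instance's labels
    metrics_path: /federate
    params:
      "match[]":
        - '{job="node"}'      # which series to pull from the source
    static_configs:
      - targets: ["source-prometheus:9090"]
```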

Protocol Versions

Prometheus supports multiple remote write protocols:

  • v1: Original protocol with samples, exemplars, and metadata
  • v2: Improved protocol with better compression and metadata handling
  • v2.1+: Enhanced features and optimizations

The protocol version is negotiated during client initialization and affects how data is serialized and sent.

Error Handling & Retries

The queue manager implements sophisticated error handling:

  • Recoverable errors: Network timeouts, 5xx responses → retry with backoff
  • Non-recoverable errors: 4xx responses, invalid data → drop and log
  • Backoff strategy: Exponential backoff with configurable min/max
  • Flush deadline: Graceful shutdown waits for pending sends before force-closing

Web API & User Interface

Relevant Files
  • web/web.go
  • web/api/v1/api.go
  • web/ui/ui.go
  • web/ui/README.md

Prometheus exposes both a REST API and a web user interface for querying metrics, managing configuration, and monitoring system health. The web layer is split into two main components: the HTTP API server and the React-based UI.

HTTP Server Architecture

The Handler struct in web/web.go manages all HTTP endpoints and serves as the central router for incoming requests. It integrates with the query engine, storage layer, scrape manager, and rule manager to provide comprehensive monitoring capabilities. The handler uses Go's http.Server with OpenTelemetry instrumentation for request tracing and Prometheus metrics collection.

Key responsibilities include:

  • Request routing via the route.Router with middleware for compression, CORS, and metrics instrumentation
  • Lifecycle management through readiness checks (/-/ready, /-/healthy) and graceful shutdown
  • Static asset serving for the React UI (both new Mantine-UI and legacy React-App versions)
  • Console template rendering for custom dashboards using Go templating

REST API v1 Endpoints

The API v1 implementation in web/api/v1/api.go provides comprehensive query and management endpoints:

Query Endpoints:

  • /api/v1/query – Instant queries at a specific timestamp
  • /api/v1/query_range – Range queries over time intervals
  • /api/v1/query_exemplars – Exemplar data for traces
  • /api/v1/labels – Available label names
  • /api/v1/label/:name/values – Values for a specific label
  • /api/v1/series – Series matching label matchers

Metadata Endpoints:

  • /api/v1/metadata – Metric metadata (type, help text)
  • /api/v1/targets – Active and dropped scrape targets
  • /api/v1/targets/metadata – Target-specific metadata
  • /api/v1/alertmanagers – Configured AlertManager instances

Status Endpoints:

  • /api/v1/status/config – Current configuration
  • /api/v1/status/runtimeinfo – Runtime statistics (goroutines, memory, uptime)
  • /api/v1/status/buildinfo – Build version and revision
  • /api/v1/status/tsdb – Time-series database statistics
  • /api/v1/features – Enabled feature flags

Admin Endpoints (when enabled):

  • /api/v1/admin/tsdb/delete_series – Delete time-series data
  • /api/v1/admin/tsdb/snapshot – Create database snapshots
  • /api/v1/admin/tsdb/clean_tombstones – Clean deletion markers

User Interface

Prometheus provides two React-based UIs:

  • Mantine-UI (v3) – Modern default UI with improved UX, served from /static/mantine-ui
  • React-App (v2) – Legacy UI, available via --enable-feature=old-ui flag

Both UIs are built from source in web/ui/ and compiled into the binary using Go's embed package. The UI communicates with the backend via the REST API, proxying requests to /api/v1/ endpoints. React Router handles client-side navigation for paths like /query, /alerts, /rules, and /targets.

Request Handling Pipeline


Error responses follow a standard format with status codes, error types, and optional warnings. The API supports content negotiation for different response formats and includes per-request statistics when requested via the stats parameter.
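The standard envelope can be decoded with a small struct; the field set below covers the common instant-vector case and omits optional parts like stats:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiResponse models the standard v1 response envelope.
type apiResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  []interface{}     `json:"value"` // [unix_ts, "value_string"]
		} `json:"result"`
	} `json:"data"`
	ErrorType string `json:"errorType,omitempty"`
	Error     string `json:"error,omitempty"`
}

// body is a typical /api/v1/query response for an instant vector.
const body = `{"status":"success","data":{"resultType":"vector","result":[
  {"metric":{"__name__":"up","job":"prometheus"},"value":[1700000000,"1"]}]}}`

func parse(b string) (apiResponse, error) {
	var r apiResponse
	err := json.Unmarshal([]byte(b), &r)
	return r, err
}

func main() {
	r, err := parse(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Status, r.Data.ResultType, r.Data.Result[0].Metric["job"])
}
```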

Configuration & Lifecycle Management

Relevant Files
  • config/config.go
  • config/reload.go
  • model/labels/labels_common.go
  • model/relabel/relabel.go

Prometheus configuration is hierarchical and validated at load time. The system supports both static configuration files and dynamic reloading via SIGHUP signals or HTTP endpoints.

Configuration Loading

The Load() and LoadFile() functions parse YAML configuration into a Config struct. Key steps include:

  1. YAML Parsing: Strict unmarshaling ensures no unknown fields are present
  2. Environment Variable Expansion: External labels support $VAR substitution from environment variables
  3. Validation: Global config, scrape configs, and relabel rules are validated against their schemas
  4. Directory Resolution: Relative file paths are resolved relative to the config file's directory

// Load parses YAML into a Config
cfg, err := config.Load(yamlString, logger)

// LoadFile reads and validates a config file
cfg, err := config.LoadFile(filename, agentMode, logger)

Configuration Structure

The top-level Config struct contains:

  • GlobalConfig: Default scrape interval, timeout, evaluation interval, external labels
  • ScrapeConfigs: List of scrape job configurations
  • ScrapeConfigFiles: Glob patterns for dynamic scrape config files
  • RemoteWriteConfigs: Remote storage write endpoints
  • RemoteReadConfigs: Remote storage read endpoints
  • AlertingConfig: Alertmanager endpoints and relabel rules
  • RuleFiles: Glob patterns for alert and recording rules

Dynamic Configuration Reload

Configuration changes are detected and applied through three mechanisms:

  1. SIGHUP Signal: Manual reload via kill -HUP <pid>
  2. HTTP Endpoint: POST /-/reload triggers reload
  3. Auto-Reload: Periodic checksum comparison (if enabled) detects file changes

The reload process:

1. Generate SHA256 checksum of config file + referenced files
2. Load new configuration
3. Apply to each subsystem (scrape manager, rules manager, etc.)
4. Update checksum on success
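Step 1 can be sketched as hashing all inputs together, so a change in any referenced file alters the checksum; real Prometheus reads files from disk, byte slices stand in here:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// configChecksum hashes the main config file together with every
// referenced file; any change in any of them yields a new checksum
// and therefore triggers a reload.
func configChecksum(files ...[]byte) string {
	h := sha256.New()
	for _, f := range files {
		h.Write(f)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	before := configChecksum([]byte("scrape_interval: 15s"), []byte("rules v1"))
	after := configChecksum([]byte("scrape_interval: 15s"), []byte("rules v2"))
	fmt.Println(before != after) // a referenced-file change alters the checksum
}
```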

Relabeling

Relabel rules transform metric labels before ingestion or remote write. Each rule specifies:

  • source_labels: Labels to concatenate as input
  • regex: Pattern to match against concatenated values
  • action: Operation to perform (replace, keep, drop, labelmap, etc.)
  • target_label: Destination label for the result
  • replacement: Template for replacement values

relabel_configs:
  - source_labels: [__meta_kubernetes_pod_name]
    regex: "(.+)"
    target_label: pod
    replacement: "$1"
    action: replace

Supported actions include replace, keep, drop, keepequal, dropequal, hashmod, labelmap, labeldrop, labelkeep, lowercase, and uppercase.
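The semantics of the replace action can be sketched in a few lines; this models only that one action, using a plain map where Prometheus uses its Labels type:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// relabelReplace models the `replace` action: concatenate the source
// label values with ";" (the default separator), match the regex against
// the result, and write the expanded replacement into the target label.
func relabelReplace(lset map[string]string, sourceLabels []string, re, targetLabel, replacement string) {
	vals := make([]string, 0, len(sourceLabels))
	for _, l := range sourceLabels {
		vals = append(vals, lset[l])
	}
	rx := regexp.MustCompile("^(?:" + re + ")$") // relabel regexes are fully anchored
	src := strings.Join(vals, ";")
	m := rx.FindStringSubmatchIndex(src)
	if m == nil {
		return // no match: leave labels unchanged
	}
	lset[targetLabel] = string(rx.ExpandString(nil, replacement, src, m))
}

func main() {
	lset := map[string]string{"__meta_kubernetes_pod_name": "api-7d4f"}
	relabelReplace(lset, []string{"__meta_kubernetes_pod_name"}, "(.+)", "pod", "$1")
	fmt.Println(lset["pod"]) // api-7d4f
}
```

This mirrors the YAML rule shown above: the pod name captured by `(.+)` ends up in the `pod` label.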

Labels and Validation

Labels are key-value pairs with validation schemes:

  • UTF8Validation: Allows UTF-8 characters (default)
  • LegacyValidation: Restricts to [a-zA-Z_][a-zA-Z0-9_]* pattern

The Labels interface provides efficient storage and iteration. Metric names are stored as the special __name__ label. Label operations are optimized for performance with minimal allocations.

Configuration Validation

Each configuration section validates:

  • ScrapeConfig: Scrape timeout < scrape interval, job name uniqueness, relabel rule validity
  • RelabelConfig: Action-specific requirements (e.g., hashmod requires non-zero modulus)
  • GlobalConfig: Valid scrape protocols, metric name validation scheme consistency

Validation occurs at load time and during reload to catch configuration errors early.