Overview
Relevant Files
- `README.md`
- `programs/main.cpp`
- `programs/server/Server.h`
- `src/Client/ClientApplicationBase.h`
- `src/Server/IServer.h`
- `docs/en/development/architecture.md`
ClickHouse is an open-source, column-oriented database management system (DBMS) designed for online analytical processing (OLAP). It enables users to generate analytical reports with SQL queries in real time, scanning hundreds of millions to billions of rows per second.
Core Architecture
ClickHouse uses a column-oriented storage model where data is stored by columns rather than rows. This design enables vectorized query execution—operations are dispatched on arrays of values rather than individual rows, significantly reducing computational costs. The architecture is built on proven concepts from array programming languages and modern analytical databases.
The system implements a multi-protocol server architecture:
- HTTP Interface - Simple, stateless protocol for external clients and applications
- TCP Interface - Native protocol for ClickHouse clients and inter-server communication during distributed queries
- Interserver HTTP - Dedicated protocol for replication between cluster nodes
Entry Points and Applications
The main executable (programs/main.cpp) acts as a universal launcher for multiple ClickHouse applications:
- clickhouse-server - The main database server
- clickhouse-client - Interactive SQL client
- clickhouse-local - Local query execution without a server
- clickhouse-keeper - Standalone coordination service (ZooKeeper replacement)
- Utility tools - Benchmark, format, compressor, obfuscator, and more
The dispatcher automatically selects the appropriate application based on the binary name, command-line arguments, or symbolic links.
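As an illustration, argv[0]-based dispatch can be sketched like this (a hypothetical sketch: `chooseApp` and the two stub entry points are invented for the example; the real dispatch table lives in `programs/main.cpp` and differs in detail):

```cpp
#include <cassert>
#include <string>

// Hypothetical multi-call dispatch sketch, in the style of busybox.
using MainFunc = int (*)(int, char **);

static int serverMain(int, char **) { return 0; } // stand-in for clickhouse-server
static int clientMain(int, char **) { return 0; } // stand-in for clickhouse-client

struct AppEntry { const char * name; MainFunc main; };
static const AppEntry apps[] = {{"server", serverMain}, {"client", clientMain}};

static bool endsWith(const std::string & s, const std::string & suffix)
{
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pick an application from argv[0] (e.g. a "clickhouse-server" symlink)
// or from the first command-line argument ("clickhouse server ...").
static MainFunc chooseApp(const std::string & argv0, const char * first_arg)
{
    for (const auto & app : apps)
    {
        if (endsWith(argv0, std::string("-") + app.name))
            return app.main;
        if (first_arg && app.name == std::string(first_arg))
            return app.main;
    }
    return nullptr; // the real launcher falls back to printing usage here
}
```

The same binary therefore behaves differently depending on how it was invoked or linked, which is why a single `clickhouse` executable can serve as server, client, and utility tools.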
Server Components
The Server class (programs/server/Server.h) manages:
- Configuration - Loads and reloads settings from configuration files
- Protocol Servers - HTTP, TCP, and specialized handlers (MySQL, PostgreSQL, gRPC)
- Query Execution - Registers interpreters, functions, aggregate functions, and storage engines
- Replication - Manages data synchronization across cluster nodes
- Coordination - Integrates with Keeper for distributed consensus
Client Architecture
The ClientApplicationBase class provides a foundation for client applications, handling:
- Command-line argument parsing
- Signal handlers for graceful shutdown (Ctrl+C)
- Configuration management
- Interactive query execution with suggestions
Key Design Principles
- Backward Compatibility - Full backward and forward compatibility for TCP protocol (maintained for ~1 year)
- Vectorized Execution - Operations on data chunks rather than individual rows
- Physical Replication - Only compressed parts transferred between nodes, not queries
- Distributed Query Execution - Queries can span multiple nodes with automatic coordination
- No Dynamic Loading - Deliberately disables `dlopen` for security and stability
Architecture & Query Execution Pipeline
Relevant Files
- `src/Interpreters/executeQuery.cpp`
- `src/Interpreters/executeQuery.h`
- `src/Planner/Planner.cpp`
- `src/Planner/Planner.h`
- `src/QueryPipeline/QueryPipeline.cpp`
- `src/QueryPipeline/QueryPipelineBuilder.cpp`
- `src/Processors/Executors/PipelineExecutor.cpp`
- `src/Core/QueryProcessingStage.h`
ClickHouse's query execution follows a multi-stage pipeline architecture that transforms SQL text into executable processor graphs. The system is designed for high-performance distributed query processing with fine-grained control over execution stages.
Query Execution Pipeline
The execution flow consists of five main stages:
- Parsing – SQL text is parsed into an Abstract Syntax Tree (AST)
- Analysis – The AST is analyzed and converted to a Query Tree (when `enable_analyzer` is enabled)
- Planning – The Query Tree is converted to a `QueryPlan` (a DAG of logical steps)
- Pipeline Building – The `QueryPlan` is compiled into a `QueryPipeline` (a graph of processors)
- Execution – The `PipelineExecutor` runs processors in parallel, pulling or pushing data through ports
Key Components
executeQuery (src/Interpreters/executeQuery.cpp) is the main entry point. It orchestrates the entire pipeline: parsing the query, creating an interpreter, building the execution plan, and managing logging and error handling. It supports multiple overloads for different use cases (streaming input, string queries, low-level server-to-server interaction).
Planner (src/Planner/Planner.cpp) converts analyzed Query Trees into QueryPlan objects. It handles:
- Join tree planning (table sources, joins, filters)
- Aggregation and window function planning
- Sorting and limiting
- Distributed query planning for parallel replicas
The planner builds a tree of QueryPlanStep objects, each representing a logical operation.
QueryPlan is a directed acyclic graph (DAG) of logical execution steps. Each step has input and output ports with typed headers. The plan can be optimized before compilation into a pipeline.
QueryPipeline and QueryPipelineBuilder convert the logical QueryPlan into a concrete QueryPipeline – a graph of IProcessor objects connected via input/output ports. Processors are the actual execution units that transform data blocks.
PipelineExecutor (src/Processors/Executors/PipelineExecutor.cpp) executes the pipeline using a work-stealing scheduler. It:
- Manages thread pools and CPU slot allocation
- Builds an `ExecutingGraph` from processors
- Schedules processor tasks across threads
- Handles exceptions and profiling
Query Processing Stages
The QueryProcessingStage enum controls how far a query is executed:
- FetchColumns – Read only requested columns
- WithMergeableState – Execute until results can be merged across servers
- Complete – Full execution on a single node
- WithMergeableStateAfterAggregation – Execute aggregations but keep mergeable state
- QueryPlan – Use the new query plan infrastructure
Distributed queries use intermediate stages to push computation to remote nodes, then merge results locally.
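The mergeable-state idea can be sketched with a toy avg(): each shard runs up to WithMergeableState and returns a partial state rather than a final value, and the initiator merges the states. The types below are illustrative only, not ClickHouse's aggregate-function classes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of two-phase aggregation for avg().
struct AvgState { double sum = 0; size_t count = 0; };

// Executed remotely, once per shard: fold local rows into a mergeable state.
AvgState partialAvg(const std::vector<double> & rows)
{
    AvgState s;
    for (double v : rows) { s.sum += v; ++s.count; }
    return s;
}

// Executed on the initiator: merge per-shard states, then finalize.
double mergeAndFinalize(const std::vector<AvgState> & states)
{
    AvgState total;
    for (const auto & s : states) { total.sum += s.sum; total.count += s.count; }
    return total.count ? total.sum / total.count : 0.0;
}
```

Note that a naive "average of per-shard averages" would be wrong for unequal shard sizes; shipping the (sum, count) state instead is what makes the result mergeable.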
Data Flow Architecture
Processor Model
Processors are the fundamental execution units. Each processor has:
- Input ports – Receive data blocks from upstream processors
- Output ports – Send data blocks to downstream processors
- Work method – Called repeatedly until the processor is finished
The executor uses a work-stealing queue to schedule processor tasks across threads, enabling efficient parallel execution with minimal synchronization overhead.
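A minimal, mutex-based sketch of the work-stealing idea follows; `StealingPool` is invented for illustration, and the real executor uses more sophisticated lock-free structures:

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// Each worker pops from its own queue (LIFO) and steals from others (FIFO)
// when its own queue is empty.
class StealingPool
{
    struct Queue { std::deque<std::function<void()>> tasks; std::mutex mtx; };
    std::vector<Queue> queues;
public:
    explicit StealingPool(size_t n) : queues(n) {}

    void push(size_t worker, std::function<void()> task)
    {
        std::lock_guard lock(queues[worker].mtx);
        queues[worker].tasks.push_back(std::move(task));
    }

    // Run one task if any is available; returns false when everything is drained.
    bool tryRun(size_t worker)
    {
        for (size_t i = 0; i < queues.size(); ++i)
        {
            auto & q = queues[(worker + i) % queues.size()];
            std::function<void()> task;
            {
                std::lock_guard lock(q.mtx);
                if (q.tasks.empty())
                    continue;
                if (i == 0) { task = std::move(q.tasks.back());  q.tasks.pop_back(); }
                else        { task = std::move(q.tasks.front()); q.tasks.pop_front(); }
            }
            task(); // executed outside the lock, so workers overlap
            return true;
        }
        return false;
    }
};
```

Popping LIFO locally keeps caches warm, while stealing FIFO takes the oldest (likely largest) pending work, which is the usual rationale for this scheme.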
Storage Engines & MergeTree Family
Relevant Files
- `src/Storages/MergeTree/MergeTreeData.h`
- `src/Storages/MergeTree/MergeTreeData.cpp`
- `src/Storages/StorageMergeTree.h`
- `src/Storages/StorageMergeTree.cpp`
- `src/Storages/StorageReplicatedMergeTree.h`
- `src/Storages/StorageReplicatedMergeTree.cpp`
MergeTree is ClickHouse's primary storage engine family, designed for high-performance analytical queries on large datasets. It organizes data into immutable parts that are merged in the background, enabling efficient compression, indexing, and incremental data loading.
Core Architecture
MergeTreeData is the foundational class containing the data structure parameters, part management, and core logic. StorageMergeTree extends it for standalone tables, while StorageReplicatedMergeTree adds distributed replication via ZooKeeper. Both inherit from IStorage and implement the storage interface for reads, writes, merges, and mutations.
Data Organization
Data is organized into parts, each containing sorted rows within a partition. Parts are immutable after creation and are merged in the background. Each part has a directory structure with:
- `columns.txt` – Column names and types
- `checksums.txt` – File checksums for integrity verification
- `primary.idx` – Primary key index for range queries
- `partition.dat` – Partition key values
- `count.txt` – Row count
- Column data files (`.bin`) and mark files (`.mrk`)
Part Types
MergeTree supports two part formats optimized for different scenarios:
Wide Format: Each column stored in separate files with individual mark files. Efficient for large parts and selective column reads. Default for parts exceeding min_bytes_for_wide_part or min_rows_for_wide_part thresholds.
Compact Format: All columns stored in a single data.bin file with marks in data.mrk3. Optimized for small parts (<10MB), reducing file count and I/O overhead.
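The threshold logic can be sketched as a small decision function (a simplification; the real choice is made inside MergeTreeData using merge-tree settings of the same names):

```cpp
#include <cassert>
#include <cstdint>

// Hedged sketch of the Wide-vs-Compact decision described above.
enum class PartFormat { Compact, Wide };

struct PartStats { uint64_t bytes; uint64_t rows; };

PartFormat choosePartFormat(PartStats part,
                            uint64_t min_bytes_for_wide_part,
                            uint64_t min_rows_for_wide_part)
{
    // A part is written in Wide format as soon as either threshold is reached.
    if (part.bytes >= min_bytes_for_wide_part || part.rows >= min_rows_for_wide_part)
        return PartFormat::Wide;
    return PartFormat::Compact;
}
```

Small inserts thus start life as Compact parts, and background merges naturally promote the merged result to Wide once it grows past the thresholds.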
Merging Modes
MergeTree supports specialized merging behaviors for different use cases:
- Ordinary – No special merge logic; simple concatenation
- Replacing – Keeps only the latest row per primary key (or highest version)
- Summing – Sums numeric columns for rows with identical primary keys
- Aggregating – Merges aggregate function states for the same primary key
- Collapsing – Pairs rows with opposite sign values to collapse them
- VersionedCollapsing – Collapsing with version-based conflict resolution
- Coalescing – Column-level upserts using NULL values for unchanged columns
- Graphite – Specialized for time-series thinning and coarsening
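For example, Replacing-style semantics can be sketched as follows; this is a toy model of the merge *outcome* (latest version per primary key), not the actual streaming merge implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Illustrative Replacing-style merge: keep the row with the highest version
// per primary key (ties resolved in favour of the later row).
struct Row { std::string key; uint64_t version; std::string value; };

std::vector<Row> replacingMerge(const std::vector<Row> & rows)
{
    std::map<std::string, Row> latest; // ordered by primary key
    for (const auto & row : rows)
    {
        auto it = latest.find(row.key);
        if (it == latest.end() || it->second.version <= row.version)
            latest[row.key] = row;
    }
    std::vector<Row> result;
    for (const auto & kv : latest)
        result.push_back(kv.second);
    return result;
}
```

Note that deduplication only happens when parts are actually merged, which is why queries against Replacing tables often need `FINAL` (or explicit aggregation) to see fully deduplicated data.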
Background Operations
Merges combine multiple parts into one, triggered by heuristic algorithms. Mutations (ALTER DELETE/UPDATE) create new parts with modified data. Moves relocate parts between disks based on TTL rules. All background tasks are managed by MergeTreeBackgroundExecutor and coordinated through task queues.
Replication (StorageReplicatedMergeTree)
Replicated tables use ZooKeeper for coordination:
- Metadata and schema stored in `/metadata` and `/columns`
- Action log at `/log/log-...` with entries for inserts, merges, and mutations
- Replica registry at `/replicas` with activity status and addresses
- Leader election for assigning merges and mutations
- Deduplication blocks stored at `/blocks` with checksums
- Quorum writes coordinated at `/quorum`
Each replica maintains a queue of operations to execute, enabling eventual consistency across the cluster.
Query Parsing & Analysis
Relevant Files
- `src/Parsers/parseQuery.cpp` – Query parsing entry points
- `src/Parsers/ParserQuery.h` – Main query parser interface
- `src/Parsers/IParser.h` – Parser base class and token handling
- `src/Analyzer/QueryTreeBuilder.cpp` – AST to query tree conversion
- `src/Analyzer/QueryTreeBuilder.h` – Query tree builder interface
- `src/Analyzer/IQueryTreeNode.h` – Query tree node types and base class
- `src/Analyzer/Resolve/QueryAnalyzer.h` – Identifier resolution and type inference
- `src/Analyzer/README.md` – Analyzer architecture documentation
ClickHouse processes SQL queries through a two-stage pipeline: parsing (syntax analysis) and analysis (semantic resolution). This section explains how queries are transformed from raw text into an executable form.
Parsing Stage: Text to AST
The parsing stage converts raw SQL text into an Abstract Syntax Tree (AST). The entry point is parseQuery() in src/Parsers/parseQuery.cpp, which uses a recursive descent parser:
- Lexical Analysis: The `Lexer` tokenizes the input string into tokens (keywords, identifiers, literals, operators).
- Syntax Analysis: `ParserQuery` and specialized parsers (e.g., `ParserSelectQuery`, `ParserInsertQuery`) recursively parse tokens and build AST nodes.
- Error Handling: The `Expected` struct tracks parsing alternatives and provides detailed error messages with line/column information.
Key parser classes inherit from IParserBase and implement parseImpl() to handle specific SQL constructs. For example, ParserSelectQuery parses SELECT statements, ParserCreateQuery handles CREATE TABLE, etc.
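The recursive-descent style can be illustrated with a deliberately tiny grammar (integer sums): as with `IParser`, each parse function consumes input on success and reports failure otherwise. All names below are invented for the example:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Toy recursive-descent parser: sums like "1+2+30", folded to a value.
struct Pos { std::string s; size_t i = 0; };

bool parseNumber(Pos & pos, long & out)
{
    size_t start = pos.i;
    while (pos.i < pos.s.size() && std::isdigit(static_cast<unsigned char>(pos.s[pos.i])))
        ++pos.i;
    if (pos.i == start)
        return false; // no digits; a real parser records an Expected-style hint here
    out = std::stol(pos.s.substr(start, pos.i - start));
    return true;
}

bool parseSum(Pos & pos, long & out)
{
    if (!parseNumber(pos, out))
        return false;
    while (pos.i < pos.s.size() && pos.s[pos.i] == '+')
    {
        ++pos.i; // consume '+'
        long rhs;
        if (!parseNumber(pos, rhs))
            return false;
        out += rhs;
    }
    return true;
}
```

Real ClickHouse parsers work on a pre-lexed token stream and build AST nodes instead of folding values, but the success/rollback control flow is the same shape.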
Query Tree: Semantic Representation
After parsing, the AST is converted to a Query Tree by QueryTreeBuilder. This intermediate representation is more semantic than syntactic:
- AST nodes are generic (e.g., `ASTIdentifier` for any identifier)
- Query tree nodes are semantic (e.g., `ColumnNode`, `TableNode`, `FunctionNode`)
The query tree contains 17 node types: IDENTIFIER, COLUMN, FUNCTION, QUERY, JOIN, UNION, LAMBDA, WINDOW, and others. Each node type has specific properties and semantics.
Analysis Stage: Identifier Resolution
The QueryAnalyzer in src/Analyzer/Resolve/ performs semantic analysis:
- Identifier Resolution: Replaces `IdentifierNode` with concrete nodes (`ColumnNode`, `TableNode`, `FunctionNode`).
- Type Inference: Determines data types for all expressions.
- Scope Management: Handles lexical scoping for CTEs, subqueries, and lambda expressions.
- Alias Substitution: Resolves column and table aliases.
The analyzer uses IdentifierResolveScope to track resolution context and IdentifierResolver to perform lookups. Resolution follows a priority order: lambda arguments > aliases > table columns.
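That priority order can be sketched directly (a hypothetical model; the real `IdentifierResolver` handles many more lookup kinds and nested scopes):

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy scope: lambda arguments shadow aliases, which shadow table columns.
struct Scope
{
    std::map<std::string, std::string> lambda_args;   // name -> resolved node
    std::map<std::string, std::string> aliases;
    std::map<std::string, std::string> table_columns;
};

std::optional<std::string> resolveIdentifier(const Scope & scope, const std::string & name)
{
    // Check sources in priority order; first hit wins.
    for (const auto * source : {&scope.lambda_args, &scope.aliases, &scope.table_columns})
        if (auto it = source->find(name); it != source->end())
            return it->second;
    return std::nullopt; // unresolved: the analyzer reports an unknown identifier
}
```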
Query Tree Passes: Optimization
After analysis, the QueryTreePassManager applies optimization passes (in src/Analyzer/Passes/). These implement the IQueryTreePass interface and transform the query tree:
- Logical optimizations: Convert OR chains to IN, eliminate redundant conditions
- Function optimizations: Fuse functions, convert array operations
- Join optimizations: Convert CROSS JOIN to INNER JOIN when possible
Key Concepts
Weak Pointers in Query Tree: Nodes can hold weak pointers to other nodes (e.g., a ColumnNode references its source table). This preserves semantic information without creating circular references.
Unresolved State: After analysis, some parts may remain unresolved (e.g., table function arguments). The analyzer tracks which nodes are resolved and which require later processing.
Convertibility to AST: The query tree must be convertible back to AST without information loss, enabling query rewriting and logging.
Processors & Execution Model
Relevant Files
- `src/Processors/IProcessor.h`
- `src/Processors/ISource.h`
- `src/Processors/ISink.h`
- `src/Processors/Port.h`
- `src/Processors/Chunk.h`
- `src/Processors/QueryPlan/QueryPlan.h`
- `src/Processors/Executors/PipelineExecutor.h`
- `src/Processors/Executors/ExecutingGraph.h`
Overview
ClickHouse's query execution is built on a processor-based pipeline architecture. Processors are lightweight building blocks that form a directed acyclic graph (DAG), where data flows through connected ports. This design enables efficient, composable query execution with support for both synchronous and asynchronous operations.
Core Concepts
Processors are the fundamental execution units. Each processor has zero or more input ports and zero or more output ports. Data flows between processors via ports in the form of Chunks—lightweight, move-only data structures containing columns and row counts.
Ports are the connection points between processors. An OutputPort produces data; an InputPort consumes it. Ports maintain state (finished, needed, data available) and enable backpressure signaling between processors.
Chunks are the data units transferred over ports. Unlike blocks, chunks are move-only, don't store column names/types, and can represent empty columns or rows—useful for signaling without data transfer.
Processor Lifecycle
Processors follow a state machine with the prepare() method as the core:
```cpp
enum class Status
{
    NeedData,       // Waiting for input
    PortFull,       // Output port full, can't proceed
    Finished,       // All work done
    Ready,          // Can call work()
    Async,          // Can call schedule()
    ExpandPipeline  // Can call expandPipeline()
};
```
The execution flow is: prepare() → work() (or schedule() for async) → repeat until Finished.
Key invariant: prepare() is never called in parallel for connected processors, but work() can execute in parallel across different processors.
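A toy version of that contract, with a single-threaded driver loop, looks like this (the real `IProcessor` interface has ports, more statuses, and a multithreaded executor):

```cpp
#include <cassert>
#include <vector>

// Toy processor driven by the prepare()/work() contract.
enum class Status { Ready, Finished };

struct CountingSource
{
    int remaining;
    std::vector<int> produced;

    // prepare() is cheap and only inspects state; it never computes.
    Status prepare() const { return remaining > 0 ? Status::Ready : Status::Finished; }

    // work() does the actual (potentially expensive) computation.
    void work() { produced.push_back(remaining--); } // "generate one chunk"
};

// Minimal executor loop: ask prepare() what to do, call work() until done.
void run(CountingSource & p)
{
    while (p.prepare() == Status::Ready)
        p.work();
}
```

Splitting decision (`prepare()`) from computation (`work()`) is what lets the executor serialize `prepare()` calls on connected processors while running `work()` calls in parallel.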
Processor Types
- Sources (`ISource`): No inputs, one output. Generate data and report read progress.
- Sinks (`ISink`): One input, no outputs. Consume data (e.g., INSERT, output formatting).
- Transforms: One input, one output. Process data (e.g., filtering, expressions).
- Merging transforms: Multiple inputs, one output (e.g., merge sorted streams).
- Resizing processors: Arbitrary inputs/outputs (e.g., UNION, splits).
Query Plan to Pipeline
Execution Model
ExecutingGraph transforms the processor graph into an execution-ready structure with nodes and edges. Each node tracks processor status, ports, and exceptions. The executor uses a task queue and thread pool to drive processors through their lifecycle.
Backpressure is handled via port states: if an output port is full, the processor returns PortFull; if input is needed, it returns NeedData. This prevents memory bloat and enables efficient resource usage.
Async operations (e.g., remote fetches) use schedule() to return a file descriptor. The executor polls it; when ready, work() resumes processing.
Key Design Patterns
- Lazy evaluation: Processors only compute when `work()` is called, enabling pipeline-level optimizations.
- Port-driven execution: Executor schedules processors based on port state changes, not explicit dependencies.
- Composability: New processor types can be added without modifying the executor.
- Cancellation support: Processors check the `isCancelled()` flag for graceful shutdown.
- Profiling integration: Each processor tracks elapsed time, input/output wait time, and data statistics for `system.processors_profile_log`.
Distributed Query Execution & Clustering
Relevant Files
- `src/Interpreters/ClusterProxy/executeQuery.cpp`
- `src/Interpreters/ClusterProxy/executeQuery.h`
- `src/Processors/QueryPlan/ReadFromRemote.h`
- `src/Interpreters/ClusterProxy/SelectStreamFactory.h`
- `src/Interpreters/Cluster.cpp`
- `src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp`
Overview
Distributed query execution in ClickHouse enables queries to span multiple shards and replicas across a cluster. The system decomposes a single query into per-shard sub-queries, executes them in parallel, and merges results. This architecture supports both horizontal scaling and high availability through replica coordination.
Core Architecture
Query Routing & Shard Selection
The ClusterProxy::executeQuery() function orchestrates distributed execution. It receives a cluster definition and iterates over each shard, creating a query plan for that shard. Key steps:
- Cluster Preparation: Validates cluster configuration and applies shard-level optimizations (e.g., skipping unused shards based on WHERE conditions).
- Per-Shard Query Generation: For each shard, the query is cloned and potentially rewritten (e.g., sharding key filters for custom key distribution).
- Parallel Replicas Decision: Determines if parallel replica reading should be enabled based on replica count and settings.
- Stream Factory: `SelectStreamFactory::createForShard()` generates execution plans for each shard.
Query Plan Construction
The system builds a QueryPlan with multiple branches:
- Local Plans: If a shard is local, a local query plan is created.
- Remote Plans: For remote shards, a `ReadFromRemote` step is added, which creates remote connections.
- Union Step: Multiple shard plans are combined using `UnionStep` to merge results.
Parallel Replicas Coordination
When a shard has multiple replicas, the ParallelReplicasReadingCoordinator distributes work across them:
- Replica Pool Management: Connection pools are shuffled based on load balancing strategy (RANDOM, ROUND_ROBIN, NEAREST_HOSTNAME, etc.).
- Work Distribution: The coordinator assigns mark ranges to replicas using consistent hashing or work-stealing algorithms.
- Replica Failure Handling: Unavailable replicas are marked, and their work is redistributed to healthy replicas.
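One way to picture deterministic assignment with failover is a hash-based sketch (illustrative only; `pickReplica` is invented and the coordinator's actual algorithms differ):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sketch: assign each mark range to a replica by hashing its start mark.
struct MarkRange { size_t begin; size_t end; };

// If the hashed-to replica is down, fall back to the next healthy one, so
// its work is redistributed deterministically across survivors.
size_t pickReplica(const MarkRange & range, const std::vector<bool> & replica_alive)
{
    size_t n = replica_alive.size();
    size_t start = std::hash<size_t>{}(range.begin) % n;
    for (size_t i = 0; i < n; ++i)
        if (replica_alive[(start + i) % n])
            return (start + i) % n;
    return start; // no healthy replica: the caller reports the query as failed
}
```

Determinism matters here: every participant can compute the same assignment without extra coordination round-trips.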
Settings & Optimization
Key Settings:
- `max_parallel_replicas`: Limits replicas used per shard.
- `parallel_replicas_mode`: Selects coordination strategy (default, custom_key).
- `load_balancing`: Determines replica selection (ROUND_ROBIN, NEAREST_HOSTNAME, etc.).
- `optimize_skip_unused_shards`: Skips shards that don't match WHERE conditions.
- `force_optimize_skip_unused_shards`: Enforces shard skipping even in nested queries.
Context Updates: updateSettingsForCluster() adjusts per-user limits and nesting depth for remote execution, ensuring queries don't exceed resource constraints on leaf nodes.
Remote Execution
The ReadFromRemote step manages remote query execution:
- Connection Pooling: Maintains connection pools per shard with failover support.
- Query Serialization: Queries are serialized and sent to remote nodes.
- Throttling: Network bandwidth limits are applied via `Throttler`.
- Scalars & External Tables: Query context (scalars, external tables) is propagated to remote nodes.
Custom Key Parallel Replicas
For tables with custom sharding keys, executeQueryWithParallelReplicasCustomKey() enables replica-level parallelism:
- Key-Based Filtering: Each replica receives a filtered subset based on custom key ranges.
- Shard Filter Generator: Generates per-replica WHERE clauses to partition work.
- Local Plan Support: Optionally builds a local plan for the initiator's replica to avoid network overhead.
DDL on Cluster
Distributed DDL execution (executeDDLQueryOnCluster()) follows a similar pattern but coordinates schema changes across all nodes using ZooKeeper, ensuring consistency.
Coordination & ClickHouse Keeper
Relevant Files
- `src/Coordination/KeeperDispatcher.h`
- `src/Coordination/KeeperDispatcher.cpp`
- `src/Coordination/Changelog.h`
- `src/Coordination/KeeperServer.h`
- `src/Coordination/KeeperStateMachine.h`
- `programs/keeper/Keeper.h`
- `programs/keeper/Keeper.cpp`
ClickHouse Keeper is a ZooKeeper-compatible coordination service built on the NuRaft consensus library. It manages distributed state and ensures consistency across a cluster through a Raft-based replication protocol.
Architecture Overview
The Keeper system consists of three main layers:
KeeperDispatcher is the high-level request handler that manages the request/response lifecycle. It maintains multiple worker threads: a request thread that batches incoming requests, a response thread that delivers results to clients, a session cleaner that removes expired sessions, and a snapshot thread for persistence.
KeeperServer wraps the NuRaft consensus engine and coordinates between the state machine and the Raft protocol. It manages cluster membership, leader election, and log replication across nodes.
KeeperStateMachine applies committed log entries to the in-memory state (KeeperStorage) and handles snapshots. It implements the state_machine interface from NuRaft, processing both write operations and read-only queries.
Request Processing Pipeline
Requests flow through a batching mechanism to maximize throughput. The request thread collects multiple requests into a batch (limited by size and count), then submits them to the Raft log via putRequestBatch(). Read requests are handled separately: if quorum reads are disabled, they bypass Raft and execute directly on the state machine after the preceding write batch commits.
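The batching idea can be sketched as follows (illustrative only; the real dispatcher also respects byte limits, timeouts, and the read/write separation described above):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Group incoming requests into batches of at most max_batch_size entries;
// each batch would be submitted to the Raft log in one putRequestBatch()-style call.
std::vector<std::vector<std::string>> batchRequests(
    const std::vector<std::string> & incoming, size_t max_batch_size)
{
    std::vector<std::vector<std::string>> batches;
    std::vector<std::string> current;
    for (const auto & req : incoming)
    {
        current.push_back(req);
        if (current.size() >= max_batch_size) // flush a full batch
        {
            batches.push_back(std::move(current));
            current.clear();
        }
    }
    if (!current.empty())
        batches.push_back(std::move(current)); // flush the partial tail
    return batches;
}
```

Batching amortizes the per-entry cost of consensus: one Raft round-trip can commit many client requests at once.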
The dispatcher maintains two callback maps: one for normal session responses and another for new session ID requests (which don't yet have a session ID). When responses are generated by the state machine, they're queued and delivered by the response thread using these callbacks.
Changelog and Log Storage
The Changelog class manages persistent storage of Raft log entries on disk. It rotates log files based on a configurable interval and supports compression. Log entries are stored with headers containing version, index, term, and blob size information.
The LogEntryStorage structure provides an intelligent two-level caching system: a latest logs cache for recent entries and a commit logs cache for entries being applied. This design avoids expensive disk reads during sequential log processing, with prefetching support for S3-based storage.
Snapshot Management
Snapshots capture the entire state machine state at a specific log index. The snapshot thread processes snapshot creation tasks asynchronously, allowing the state machine to continue accepting requests. Leaders can upload snapshots to S3 for faster recovery of lagging followers.
Consensus and Replication
NuRaft handles the core Raft protocol: leader election, log replication, and commit index advancement. The KeeperStateManager persists cluster configuration and server state. When a log entry reaches the commit index, the state machine's commit() method is invoked to apply it to the storage.
Configuration changes (reconfiguration) are handled specially: they're processed as regular log entries but trigger cluster membership updates through the commit_config() callback.
Data Types & Functions
Relevant Files
- `src/DataTypes/IDataType.h`
- `src/DataTypes/DataTypeFactory.h`
- `src/Functions/IFunction.h`
- `src/Functions/FunctionFactory.h`
- `src/AggregateFunctions/AggregateFunctionFactory.h`
Overview
ClickHouse's type and function systems form the core of query execution. Data types define how values are stored and serialized, while functions transform data. Both systems use factory patterns for extensibility and registration-based discovery.
Data Types Architecture
Data types in ClickHouse inherit from IDataType, an immutable, shared interface that defines:
- Type Identity: Family name (e.g., `String`, `Array`), type ID, and parametric flags
- Column Creation: Methods to create columns matching the type
- Serialization: Pluggable serialization strategies for different formats
- Subcolumns: Support for nested type access (e.g., `Nullable` unwrapping)
Type Categories
- Primitive Types: `UInt8`, `Int64`, `Float64`, `String`, `Date`, `DateTime`
- Parametric Types: `Array(T)`, `Tuple(T1, T2)`, `Map(K, V)`, `FixedString(N)`
- Special Types: `Nullable(T)`, `LowCardinality(T)`, `Enum`, `UUID`, `Decimal`
- Complex Types: `AggregateFunction`, `Function`, `Dynamic`, `Variant`
The DataTypeFactory singleton manages type registration and instantiation. Types are registered during initialization via functions like registerDataTypeNumbers(), registerDataTypeArray(), etc.
Function Architecture
Functions follow a three-level hierarchy:
```
IFunctionOverloadResolver   (handles overload resolution)
          ↓
IFunctionBase               (resolved function with known types)
          ↓
IExecutableFunction         (prepared for execution)
```
Function Properties
Functions declare capabilities through virtual methods:
- Type Resolution: `getReturnTypeImpl()` determines output type from input types
- Execution: `executeImpl()` processes columns and returns results
- Optimization: `isInjective()`, `isDeterministic()`, `isSuitableForConstantFolding()`
- Special Handling: `useDefaultImplementationForNulls()`, `useDefaultImplementationForConstants()`
Function Registration
The FunctionFactory singleton registers functions with:
- Creator Function: Produces `FunctionOverloadResolver` instances
- Documentation: Syntax, arguments, examples, and return types
- Case Sensitivity: Optional case-insensitive aliases
Aggregate Functions
Aggregate functions are specialized via AggregateFunctionFactory:
- State-based: Maintain internal state across rows
- Parametric: Accept parameters (e.g., `quantiles(0.5, 0.9)`)
- Combinable: Support `-State` and `-Merge` combinators for distributed execution
- Properties: Flags for null handling, determinism, and optimization hints
Type-Function Integration
Functions interact with types through:
- Type Checking: Validate argument types in `getReturnTypeImpl()`
- Column Operations: Create result columns via `type->createColumn()`
- Serialization: Use type's serialization for format conversion
- Type Coercion: Apply `getLeastSupertype()` for mixed-type operations
Extension Points
- Custom Data Types: Inherit `IDataType`, register via `DataTypeFactory::registerDataType()`
- Custom Functions: Inherit `IFunction`, register via `FunctionFactory::registerFunction()`
- Custom Aggregate Functions: Implement `IAggregateFunction`, register via `AggregateFunctionFactory::registerFunction()`
Server & Protocol Handlers
Relevant Files
- `src/Server/TCPHandler.cpp` – Native ClickHouse protocol handler
- `src/Server/HTTPHandler.cpp` – HTTP/HTTPS request handler
- `src/Server/GRPCServer.cpp` – gRPC server implementation
- `src/Server/MySQLHandler.cpp` – MySQL wire protocol handler
- `src/Server/PostgreSQLHandler.cpp` – PostgreSQL wire protocol handler
- `src/Server/TCPServer.h` – Base TCP server infrastructure
- `src/Server/TCPServerConnectionFactory.h` – Factory pattern for handler creation
ClickHouse supports multiple network protocols through a unified handler architecture. Each protocol has a dedicated handler class that processes incoming connections and translates protocol-specific messages into ClickHouse queries.
Architecture Overview
Handler Factory Pattern
Each protocol uses a factory class implementing TCPServerConnectionFactory to create handler instances:
- `TCPHandlerFactory` – Creates `TCPHandler` instances for the native ClickHouse protocol
- `HTTPServerConnectionFactory` – Creates HTTP handlers for REST API requests
- `MySQLHandlerFactory` – Creates `MySQLHandler` instances with SSL support and connection IDs
- `PostgreSQLHandlerFactory` – Creates `PostgreSQLHandler` instances with authentication methods
- `GRPCServer` – Manages gRPC connections directly (not factory-based)
Factories are registered during server startup in programs/server/Server.cpp and instantiate handlers for each incoming connection.
Native Protocol (TCPHandler)
The native ClickHouse protocol uses a binary format with chunked compression. TCPHandler manages:
- Query state – Tracks query ID, context, compression settings, and execution stage
- Packet processing – Handles client packets (Query, Data, Ping, Cancel, TablesStatusRequest)
- Result streaming – Sends blocks, totals, extremes, and profile information
- Asynchronous inserts – Queues INSERT operations for batch processing
The handler maintains a QueryState struct containing parsed queries, execution blocks, and internal log queues.
HTTP Handler (HTTPHandler)
HTTP requests are processed through a layered architecture:
- DynamicQueryHandler – Extracts query from URL parameters or request body
- PredefinedQueryHandler – Matches URL patterns to predefined queries
- Output buffering – Cascades through optional compression and delayed write buffers
HTTP handlers support custom authentication, response headers, and format negotiation.
MySQL & PostgreSQL Compatibility
ClickHouse implements wire-protocol compatibility for MySQL and PostgreSQL clients:
- MySQLHandler – Supports MySQL 5.7+ protocol with prepared statements, SSL, and authentication plugins
- PostgreSQLHandler – Implements PostgreSQL protocol with startup messages, authentication, and prepared statements
Both handlers translate protocol-specific commands into ClickHouse queries and return results in the expected format.
gRPC Server
The GRPCServer provides asynchronous gRPC support with:
- Async service – Uses gRPC completion queues for non-blocking request handling
- Configurable compression – Supports transport-level compression algorithms
- Message size limits – Configurable max send/receive message sizes
Connection Lifecycle
- Accept – TCPServer accepts incoming socket connection
- Factory – Appropriate factory creates protocol-specific handler
- Handshake – Handler performs protocol-specific initialization (auth, capabilities)
- Query Loop – Handler receives and processes queries until client disconnects
- Cleanup – Session and resources are released
Protocol Stack Support
For advanced scenarios, TCPProtocolStackHandler chains multiple handlers:
- TLSHandler – Upgrades connection to TLS/SSL
- ProxyV1Handler – Parses PROXY protocol headers
- Protocol handlers – Native, MySQL, or PostgreSQL handlers
This enables secure protocol upgrades and proxy support on the same port.
Client Tools & Utilities
Relevant Files
- `programs/client/Client.cpp` & `Client.h`
- `programs/local/LocalServer.cpp` & `LocalServer.h`
- `programs/keeper-client/KeeperClient.cpp` & `KeeperClient.h`
- `programs/benchmark/Benchmark.cpp`
- `src/Client/ClientApplicationBase.h` & `ClientBase.h`
ClickHouse provides multiple client tools for different use cases, all built on a shared foundation of client infrastructure. These tools handle query execution, connection management, and interactive sessions.
Architecture Overview
Core Components
ClientBase is the foundation layer providing query parsing, execution, and result handling. It manages interactive and non-interactive modes, handles signal interruption (Ctrl+C), and processes query fuzzing for testing.
ClientApplicationBase extends ClientBase with application-level concerns: command-line argument parsing using Boost.program_options, configuration file loading, signal handlers, and logging setup. It serves as the base for standalone client applications.
Client Tools
clickhouse-client (Client class) is the primary interactive and batch query tool. It connects to a remote ClickHouse server, supports multiple hosts for failover, handles JWT authentication, and provides interactive features like syntax highlighting, command history, and auto-completion. It includes AST fuzzing capabilities for query testing.
clickhouse-local (LocalServer class) executes queries without a server. It creates an in-process execution context, supports reading from files or stdin, and is optimized for lightweight operations. No networking, configuration files, or logging overhead—ideal for data processing scripts and one-off queries.
clickhouse-keeper-client (KeeperClient class) manages ZooKeeper-compatible coordination. It provides an interactive shell with commands like ls, cd, create, set, and delete for manipulating the keeper tree. Supports both interactive and batch modes with four-letter-word commands for diagnostics.
clickhouse-benchmark measures query performance under concurrent load. It spawns multiple threads executing the same query, collects latency statistics (min, max, mean, percentiles), and supports confidence intervals. Useful for performance testing and regression detection.
Key Features
Connection Management: Client tools support multiple connection strategies—remote TCP connections with automatic failover, local in-process execution, and keeper coordination. Connection parameters include host, port, user, password, and database selection.
Query Processing: All tools parse SQL queries, handle multi-statement execution, and manage result formatting. They support external tables, query parameters, and various output formats (JSON, CSV, Pretty, TSV, etc.).
Interactive Features: The client and keeper-client provide REPL-style interaction with history, auto-completion, and syntax highlighting. Signal handlers enable graceful query cancellation via Ctrl+C.
Configuration: Tools load settings from configuration files, environment variables, and command-line arguments. Settings control output format, progress display, logging, and server-side behavior.
Execution Flow
- Initialization: Parse command-line arguments, load configuration, establish connection
- Setup: Register formats, functions, and aggregate functions; initialize context
- Execution: Process queries (interactive loop or batch mode), handle results
- Cleanup: Close connections, flush output, exit with appropriate status code
The modular design allows code reuse across tools while supporting tool-specific customizations through virtual methods and configuration options.