Overview
Relevant Files
- README.md
- core/trino-server-main/src/main/java/io/trino/server/TrinoServer.java
- docs/src/main/sphinx/overview/concepts.md
- core/trino-main/src/main/java/io/trino/server/Server.java
Trino is a fast distributed SQL query engine designed for big data analytics. It enables users to query data across multiple sources using standard SQL, without requiring data movement or transformation into a centralized data warehouse.
Key Characteristics
Trino excels at:
- Distributed query execution across multiple servers in parallel
- Multi-source querying – query data lakes, relational databases, and other systems in a single SQL statement
- Interactive analytics with sub-second to minute-level query latencies
- Extensibility through a plugin architecture supporting 50+ data connectors
Architecture Overview
Core Components
Coordinator – The "brain" of Trino. Parses SQL statements, creates distributed query plans, manages worker nodes, and returns final results to clients. Every cluster requires exactly one coordinator.
Workers – Execute tasks and process data. Workers fetch data from connectors, exchange intermediate results with other workers, and return processed data to the coordinator. A cluster can have zero or more workers.
Connectors – Adapters that enable Trino to interact with data sources. Each connector implements the Service Provider Interface (SPI) to translate Trino's query model into source-specific operations. Examples include Hive, PostgreSQL, Iceberg, and 50+ others.
Catalogs – Named configurations that define how to access a specific data source. Each catalog specifies a connector and connection details. Multiple catalogs can use the same or different connectors.
Query Execution Model
When you submit a SQL statement, Trino transforms it through several stages:
- Parsing & Analysis – Coordinator parses SQL and validates against catalog metadata
- Logical Planning – Creates an abstract query plan with optimization
- Physical Planning – Fragments the plan into stages and tasks for distributed execution
- Task Scheduling – Coordinator assigns tasks to workers based on data locality
- Execution – Workers execute tasks in parallel using drivers and operators
- Result Aggregation – Coordinator collects results and returns to client
Parallelism Hierarchy
Trino achieves parallelism at multiple levels:
- Cluster-level – Multiple workers process different data partitions simultaneously
- Node-level – Each node runs a single JVM with multiple threads
- Task-level – Stages are implemented as multiple tasks distributed across workers
- Driver-level – Tasks contain multiple drivers executing operators in parallel
This hierarchical approach enables Trino to efficiently process massive datasets while maintaining interactive query performance.
Architecture
Relevant Files
- core/trino-main/src/main/java/io/trino/execution/SqlQueryExecution.java
- core/trino-main/src/main/java/io/trino/execution/QueryExecution.java
- core/trino-main/src/main/java/io/trino/execution/QueryStateMachine.java
- core/trino-main/src/main/java/io/trino/sql/planner/LogicalPlanner.java
- core/trino-main/src/main/java/io/trino/sql/planner/QueryPlanner.java
- core/trino-main/src/main/java/io/trino/sql/planner/PlanFragmenter.java
Trino's query execution architecture follows a well-defined pipeline that transforms SQL text into distributed execution tasks. The system is organized around three core phases: analysis, planning, and execution.
Query Lifecycle
Every query progresses through distinct states managed by QueryStateMachine:
- QUEUED → DISPATCHING → PLANNING → STARTING → RUNNING → FINISHING → FINISHED (or FAILED)
The SqlQueryExecution class orchestrates this flow. When start() is called, it transitions the query to PLANNING state and invokes the analysis and planning phases. Once planning completes, it transitions to STARTING and initializes the query scheduler.
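As a simplified illustration of this lifecycle (not the actual QueryStateMachine code), a guarded state machine over the states listed above could be modeled like this:

import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Simplified sketch: forward-only transitions plus a jump to FAILED; the real
// QueryStateMachine also tracks session, timing, resource group, and failure info.
public final class SimpleQueryStateMachine
{
    public enum State { QUEUED, DISPATCHING, PLANNING, STARTING, RUNNING, FINISHING, FINISHED, FAILED }

    private static final Set<State> TERMINAL = EnumSet.of(State.FINISHED, State.FAILED);

    private final AtomicReference<State> state = new AtomicReference<>(State.QUEUED);

    public boolean transitionTo(State target)
    {
        State current = state.get();
        if (TERMINAL.contains(current)) {
            return false; // FINISHED and FAILED are terminal
        }
        // Allow the next state in the pipeline, or FAILED from any non-terminal state
        if (target == State.FAILED || target.ordinal() == current.ordinal() + 1) {
            return state.compareAndSet(current, target);
        }
        return false;
    }

    public State currentState()
    {
        return state.get();
    }
}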
Analysis Phase
The Analyzer processes the SQL statement and produces an Analysis object containing semantic information:
- Validates table and column references
- Resolves function calls and type information
- Extracts table lineage and column lineage
- Determines query type (SELECT, INSERT, UPDATE, DELETE, etc.)
This phase runs early in the query lifecycle and captures all metadata needed for planning.
Planning Phase
The LogicalPlanner transforms the Analysis into an optimized Plan through multiple stages:
- Initial Plan Generation → QueryPlanner builds the logical plan tree from the analysis
- Validation → PlanSanityChecker validates intermediate plan correctness
- Optimization → Multiple PlanOptimizer instances apply transformation rules
- Final Validation → Ensures the optimized plan is valid
Plan optimization includes projection pushdown, predicate pushdown, join reordering, limit pushdown, and many other cost-based transformations.
Plan Fragmentation
After optimization, PlanFragmenter breaks the logical plan into distributed PlanFragment objects:
- Each fragment represents work that can execute on a single node
- Fragments are connected via exchange operators for data movement
- The root fragment collects final results on the coordinator
SubPlan fragmentedPlan = planFragmenter.createSubPlans(
session, plan, false, warningCollector);
Execution Phase
The QueryScheduler (either PipelinedQueryScheduler or EventDrivenFaultTolerantQueryScheduler) converts fragments into tasks:
- Creates SqlTask instances for each fragment on worker nodes
- Manages task scheduling, split assignment, and execution
- Monitors task progress and handles failures
- Collects results and transitions query to terminal states
Key Components
QueryStateMachine tracks query state and coordinates transitions. It maintains metadata like query ID, session, resource group, and timing information.
PlanOptimizers apply iterative rule-based transformations. The system runs multiple optimizer passes, each applying specific rules (e.g., projection pruning, filter pushdown, join elimination).
AdaptivePlanner (for fault-tolerant execution) re-optimizes plans during execution based on runtime statistics, enabling dynamic query adaptation.
The architecture enables Trino to handle complex distributed queries efficiently through careful separation of concerns: analysis validates correctness, planning optimizes for performance, and execution manages distributed coordination.
Connectors & Plugin System
Relevant Files
- core/trino-spi/src/main/java/io/trino/spi/Plugin.java
- core/trino-spi/src/main/java/io/trino/spi/connector/Connector.java
- core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorFactory.java
- core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
- core/trino-main/src/main/java/io/trino/server/PluginManager.java
- core/trino-main/src/main/java/io/trino/connector/DefaultCatalogFactory.java
Trino's extensibility is built on a plugin architecture that allows data sources, functions, security providers, and other components to be plugged into the system. Connectors are the primary mechanism for integrating new data sources.
Plugin System Architecture
The plugin system uses Java's ServiceLoader mechanism for discovery. Each plugin must provide an implementation of the Plugin interface with a service descriptor at META-INF/services/io.trino.spi.Plugin. When Trino starts, the PluginManager loads plugins from designated directories, creates isolated class loaders for each plugin, and registers their components.
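A minimal plugin entry point might look like the following sketch (class names are hypothetical; the fully qualified class name also goes into the META-INF/services/io.trino.spi.Plugin descriptor so the ServiceLoader can discover it):

import java.util.List;

import io.trino.spi.Plugin;
import io.trino.spi.connector.ConnectorFactory;

public class ExamplePlugin
        implements Plugin
{
    @Override
    public Iterable<ConnectorFactory> getConnectorFactories()
    {
        // ExampleConnectorFactory is a hypothetical factory; its getName() value is
        // what catalog property files reference via connector.name
        return List.of(new ExampleConnectorFactory());
    }
}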
Connector Lifecycle
A connector's lifecycle begins when a catalog is created. The ConnectorFactory.create() method receives the catalog name, configuration properties, and a ConnectorContext. Most connectors use Guice for dependency injection and configuration management. The factory instantiates the Connector object, which provides access to metadata, split management, and data reading/writing capabilities.
@Override
public Connector create(String catalogName, Map<String, String> config, ConnectorContext context)
{
Bootstrap app = new Bootstrap(
"io.trino.bootstrap.catalog." + catalogName,
new ConnectorModule());
Injector injector = app
.doNotInitializeLogging()
.setRequiredConfigurationProperties(config)
.initialize();
return injector.getInstance(MyConnector.class);
}
Core Connector Interfaces
The Connector interface defines the contract for data sources. Key methods include:
- beginTransaction() – Start a transaction with isolation level and read/write mode
- getMetadata() – Retrieve metadata for schema, table, and column information
- getSplitManager() – Partition data into splits for parallel processing
- getPageSourceProvider() – Read data in page format (columnar batches)
- getPageSinkProvider() – Write data in page format
- shutdown() – Release resources when the connector is unloaded
The ConnectorMetadata interface handles schema operations: listing schemas and tables, retrieving table handles, column metadata, statistics, and supporting DDL operations like CREATE TABLE and DROP TABLE.
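A minimal, read-only metadata implementation might start out like the following sketch (hypothetical names, assuming the SPI's default methods cover everything that is not overridden):

import java.util.List;
import java.util.Optional;

import io.trino.spi.connector.ConnectorMetadata;
import io.trino.spi.connector.ConnectorSession;
import io.trino.spi.connector.SchemaTableName;

public class ExampleMetadata
        implements ConnectorMetadata
{
    @Override
    public List<String> listSchemaNames(ConnectorSession session)
    {
        // A single fixed schema; real connectors discover schemas from the source system
        return List.of("default");
    }

    @Override
    public List<SchemaTableName> listTables(ConnectorSession session, Optional<String> schemaName)
    {
        // Expose one hypothetical table in the "default" schema
        return List.of(new SchemaTableName("default", "example_table"));
    }
}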
Plugin Extensibility
Beyond connectors, plugins can provide:
- Types – Custom data types registered in the type registry
- Functions – SQL functions and aggregations
- System Access Control – Fine-grained authorization policies
- Event Listeners – Query execution hooks for monitoring and auditing
- Authenticators – Password, certificate, and header-based authentication
- Resource Group Managers – Query resource allocation policies
Configuration and Isolation
Connector configuration is defined in catalog properties files. Trino provides robust configuration through Airlift's @Config annotations, which support type conversion, validation, and secret handling. Each plugin runs in an isolated class loader, allowing plugins to use different library versions without conflicts. The SPI packages (io.trino.spi.*, Jackson annotations, Slice) are shared across all plugins to ensure compatibility.
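A configuration class bound through this mechanism typically looks like the following sketch (the property names are hypothetical):

import io.airlift.configuration.Config;
import io.airlift.configuration.ConfigSecuritySensitive;

// Hypothetical configuration class; Airlift binds catalog properties such as
// connection-url to the matching @Config setter during Bootstrap initialization.
public class ExampleConfig
{
    private String connectionUrl;
    private String password;

    public String getConnectionUrl()
    {
        return connectionUrl;
    }

    @Config("connection-url")
    public ExampleConfig setConnectionUrl(String connectionUrl)
    {
        this.connectionUrl = connectionUrl;
        return this;
    }

    public String getPassword()
    {
        return password;
    }

    @Config("connection-password")
    @ConfigSecuritySensitive // value is treated as a secret and not logged
    public ExampleConfig setPassword(String password)
    {
        this.password = password;
        return this;
    }
}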
Task Scheduling & Execution
Relevant Files
- core/trino-main/src/main/java/io/trino/execution/scheduler/PipelinedQueryScheduler.java
- core/trino-main/src/main/java/io/trino/execution/scheduler/SourcePartitionedScheduler.java
- core/trino-main/src/main/java/io/trino/execution/SqlTaskManager.java
- core/trino-main/src/main/java/io/trino/execution/executor/TaskExecutor.java
- core/trino-main/src/main/java/io/trino/execution/RemoteTask.java
Trino's task scheduling and execution system orchestrates distributed query processing across a cluster. The system operates in two main phases: scheduling (deciding what work to do and where) and execution (actually running the work).
Scheduling Architecture
The scheduling pipeline follows a hierarchical structure:
- QueryScheduler (interface) - Top-level orchestrator for a query
- PipelinedQueryScheduler - Main implementation that manages distributed stages
- StageScheduler (interface) - Schedules tasks for individual stages
- SourcePartitionedScheduler - Handles data source partitioning and split assignment
The PipelinedQueryScheduler uses an ExecutionSchedule (typically PhasedExecutionSchedule) to determine which stages to schedule in what order. This ensures optimal resource utilization by scheduling source stages before dependent stages that consume their output.
Stage and Task Creation
// Scheduling loop in PipelinedQueryScheduler
while (!executionSchedule.isFinished()) {
StagesScheduleResult stagesScheduleResult = executionSchedule.getStagesToSchedule();
for (StageExecution stageExecution : stagesScheduleResult.getStagesToSchedule()) {
ScheduleResult result = stageSchedulers.get(stageExecution.getStageId()).schedule();
// Handle blocked stages and retry
}
}
Each StageExecution represents a stage instance and manages task creation via scheduleTask(). Tasks are created on specific nodes based on data locality and cluster topology. The ScheduleResult indicates whether scheduling is complete, how many splits were scheduled, and if the stage is blocked waiting for resources.
Split Assignment and Distribution
SourcePartitionedScheduler manages split discovery and assignment:
- Fetches splits from connectors in batches
- Places splits on nodes using SplitPlacementPolicy (considers data locality)
- Creates tasks on nodes as needed
- Handles dynamic filter collection to unblock dependent stages
Splits are assigned to tasks via RemoteTask.addSplits(), which sends them to worker nodes for execution.
Task Execution
On each worker node, SqlTaskManager manages local task execution:
// Task lifecycle
TaskExecutor taskExecutor = new TimeSharingTaskExecutor(...);
TaskHandle handle = taskExecutor.addTask(taskId, ...);
taskExecutor.enqueueSplits(handle, intermediate, splits);
The TaskExecutor interface abstracts split execution. Implementations like TimeSharingTaskExecutor manage:
- Split queuing - Splits are queued per task
- Driver scheduling - Splits are executed by drivers (query execution pipelines)
- Concurrency control - Limits concurrent drivers per task for fair resource sharing
- Stuck split detection - Identifies and fails splits that exceed processing thresholds
Split Execution Flow
Split → SplitRunner → Driver → Operators → Results
Each split becomes a SplitRunner that processes data through a Driver (pipeline of operators). The executor schedules splits across available threads, adjusting concurrency dynamically based on task utilization.
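The time-sharing idea can be illustrated with a simplified sketch; this is not the actual SplitRunner SPI, only the quantum-based scheduling pattern it follows:

import java.time.Duration;

// Simplified illustration of time-shared split execution: each runner gets a fixed
// quantum of work and is requeued until its split is fully processed.
public final class TimeSharingSketch
{
    public interface SimpleSplitRunner
    {
        boolean isFinished();

        void processFor(Duration quantum);
    }

    private static final Duration QUANTUM = Duration.ofSeconds(1);

    // Runs a single split to completion in fixed slices; a real executor interleaves
    // many splits across a thread pool and adapts concurrency to task utilization.
    public static void runToCompletion(SimpleSplitRunner runner)
    {
        while (!runner.isFinished()) {
            runner.processFor(QUANTUM);
        }
    }

    private TimeSharingSketch() {}
}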
Blocking and Backpressure
The scheduler handles blocking scenarios:
- Split queues full - Tasks can't accept more splits; scheduler waits
- Waiting for source - Splits not yet available from connector
- Output buffers full - Downstream tasks can't consume results
When blocked, the scheduler waits on futures and resumes when conditions change, enabling efficient resource utilization without busy-waiting.
Monitoring and Cleanup
SqlTaskManager runs periodic maintenance tasks:
- Removes old completed tasks (200ms interval)
- Fails abandoned tasks (tasks not updated within timeout)
- Detects and fails stuck splits (configurable threshold)
- Updates task statistics (1s interval)
This ensures resources are reclaimed and long-running splits are identified for debugging.
Type System & Data Structures
Relevant Files
- core/trino-spi/src/main/java/io/trino/spi/type/Type.java
- core/trino-spi/src/main/java/io/trino/spi/block/Block.java
- core/trino-spi/src/main/java/io/trino/spi/block/ValueBlock.java
- core/trino-spi/src/main/java/io/trino/spi/block/BlockBuilder.java
- core/trino-spi/src/main/java/io/trino/spi/Page.java
- docs/src/main/sphinx/develop/types.md
Trino's type system and data structures form the foundation for efficient columnar data processing. The system separates type metadata (what values represent) from block storage (how values are physically stored), enabling flexible encoding strategies and optimizations.
Type Interface
The Type interface defines how Trino understands SQL data types. Each type implementation specifies:
- Type signature – Globally unique name (e.g., VARCHAR, BIGINT, ARRAY(INT))
- Java representation – Stack-based type for expression execution (boolean, long, double, or an object)
- Block storage – Preferred ValueBlock implementation for storing values
- Comparability & orderability – Whether the type supports equality, hashing, and ordering operations
- Value accessors – Methods to extract values as boolean, long, double, Slice, or a generic Object
Types are either fixed-width (all values same size, e.g., BIGINT) or variable-width (values differ in size, e.g., VARCHAR). The FixedWidthType interface optimizes storage for fixed-size types.
Block Abstraction
Blocks are the core data structure for columnar storage. The Block interface is sealed with three implementations:
ValueBlock – Direct storage of values. Implementations include:
- LongArrayBlock – Fixed-width storage for 64-bit values
- IntArrayBlock – Fixed-width storage for 32-bit values
- VariableWidthBlock – Variable-length storage with an offset array
- Int128ArrayBlock – Fixed-width storage for 128-bit values
DictionaryBlock – Compression via dictionary encoding. Stores an array of indices into a shared dictionary, reducing memory when values repeat frequently. Includes metadata for compactness tracking and dictionary identity.
RunLengthEncodedBlock – Compression for repeated values. Stores a single value repeated N times, ideal for constants or filtered results. Minimal memory overhead.
BlockBuilder Pattern
BlockBuilder is the mutable interface for constructing blocks. Key operations:
- append(ValueBlock, position) – Append a single value
- appendRange(ValueBlock, offset, length) – Append a range of values
- appendPositions(ValueBlock, positions[], offset, length) – Append specific positions
- appendNull() – Append a null value
- build() – Finalize and return a Block (may optimize to RLE or another encoding)
- buildValueBlock() – Finalize and return a ValueBlock (no compression)
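For example, building a single BIGINT column and wrapping it in a page might look like this minimal sketch using the built-in BIGINT type:

import io.trino.spi.Page;
import io.trino.spi.block.Block;
import io.trino.spi.block.BlockBuilder;

import static io.trino.spi.type.BigintType.BIGINT;

public final class BlockExample
{
    public static Page buildSingleColumnPage()
    {
        // Builder sized for three entries; a null status skips memory accounting
        BlockBuilder builder = BIGINT.createBlockBuilder(null, 3);
        BIGINT.writeLong(builder, 1);
        BIGINT.writeLong(builder, 2);
        builder.appendNull();

        Block block = builder.build();
        // One block per column; every block in a page shares the same position count (3)
        return new Page(block);
    }

    private BlockExample() {}
}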
Page Structure
A Page represents a batch of columnar data for a single query stage. It contains:
- Blocks array – One block per column
- Position count – Number of rows (same for all blocks)
- Size tracking – Cached size and retained size for memory management
Pages are the unit of data exchange between operators. They enable efficient vectorized processing while maintaining memory awareness.
Data Flow Example
Type Hierarchy
Trino provides built-in types organized by category:
- Primitive – BOOLEAN, BIGINT, INTEGER, SMALLINT, TINYINT, DOUBLE, REAL
- String – VARCHAR, CHAR, VARBINARY
- Temporal – DATE, TIME, TIMESTAMP, INTERVAL
- Complex – ARRAY, MAP, ROW
- Specialized – JSON, UUID, IPADDRESS, TDIGEST
Plugins can register custom types via Plugin.getTypes() and Plugin.getParametricTypes() for parameterized types like VARCHAR(10) or DECIMAL(22, 5).
Memory Management
Blocks track memory usage through:
- getSizeInBytes() – Estimated size if fully expanded (for cost estimation)
- getRetainedSizeInBytes() – Actual memory held, including over-allocations
- retainedBytesForEachPart() – Detailed breakdown of memory components
This enables accurate memory accounting for spill decisions and resource management during query execution.
Functions & Operators
Relevant Files
- core/trino-main/src/main/java/io/trino/metadata/FunctionManager.java
- core/trino-main/src/main/java/io/trino/metadata/GlobalFunctionCatalog.java
- core/trino-main/src/main/java/io/trino/metadata/SystemFunctionBundle.java
- core/trino-main/src/main/java/io/trino/sql/InterpretedFunctionInvoker.java
- core/trino-spi/src/main/java/io/trino/spi/function/OperatorType.java
Trino's function and operator system provides a comprehensive framework for executing SQL functions and operators. Functions are registered in a global catalog, resolved at query time, and invoked through specialized implementations. Operators are a special category of functions that perform arithmetic, comparison, and type conversion operations.
Function Registration & Catalog
Functions are organized into function bundles that are registered with the GlobalFunctionCatalog during server startup. The SystemFunctionBundle provides all built-in functions including scalar functions, aggregations, and window functions. Each function has a unique FunctionId and can have multiple aliases.
// Functions are registered via bundles
globalFunctionCatalog.addFunctions(functionBundle);
// Built-in functions are stored in the "builtin" schema
GlobalFunctionCatalog.BUILTIN_SCHEMA = "builtin"
The catalog maintains a FunctionMap that indexes functions by name and ID, enabling fast lookup during query execution. Functions are validated at registration time to prevent duplicates and naming conflicts.
Function Resolution & Binding
When a query references a function, the FunctionResolver binds the function name to a concrete implementation based on argument types. This process:
- Looks up candidate functions by name
- Matches argument types to function signatures
- Resolves type coercions if needed
- Returns a ResolvedFunction with bound type information
ResolvedFunction resolveFunction(
Session session,
QualifiedName name,
List<TypeSignatureProvider> parameterTypes,
AccessControl accessControl)
Function Invocation
The FunctionManager caches specialized implementations for scalar, aggregation, and window functions. The InterpretedFunctionInvoker executes resolved functions by:
- Retrieving the ScalarFunctionImplementation (a MethodHandle)
- Preparing arguments (handling nullability, lambda conversions)
- Invoking the method handle with proper calling conventions
Object invoke(ResolvedFunction function, ConnectorSession session, List<Object> arguments)
Operators
Operators are functions with special semantics. The OperatorType enum defines all supported operators:
- Arithmetic: ADD, SUBTRACT, MULTIPLY, DIVIDE, MODULUS, NEGATION
- Comparison: EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, COMPARISON_UNORDERED_LAST, COMPARISON_UNORDERED_FIRST
- Type Operations: CAST, SATURATED_FLOOR_CAST, SUBSCRIPT
- Internal: HASH_CODE, XX_HASH_64, READ_VALUE, IDENTICAL, INDETERMINATE
Operators are registered as functions with mangled names (e.g., $operator$ADD). The OperatorNameUtil handles name mangling and unmangling. Operators are resolved through the same mechanism as functions but with type-specific implementations.
Function Types
Trino supports multiple function categories:
- Scalar Functions: Operate on individual rows, producing one output value per row
- Aggregation Functions: Combine multiple rows (e.g., SUM, COUNT)
- Window Functions: Operate over ordered row sets (e.g., ROW_NUMBER, RANK)
- Table Functions: Return multiple rows/columns (e.g., UNNEST)
- Language Functions: User-defined functions in SQL or other languages
Each type has specialized metadata, implementation providers, and invocation conventions.
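For instance, a scalar function contributed by a plugin is typically declared with SPI annotations; the sketch below uses a hypothetical function name:

import io.trino.spi.function.Description;
import io.trino.spi.function.ScalarFunction;
import io.trino.spi.function.SqlType;
import io.trino.spi.type.StandardTypes;

public final class ExampleFunctions
{
    private ExampleFunctions() {}

    // Registered under the name "increment"; @SqlType declares the SQL argument and
    // return types so the engine can bind and invoke the underlying method handle.
    @ScalarFunction("increment")
    @Description("Adds one to the argument")
    @SqlType(StandardTypes.BIGINT)
    public static long increment(@SqlType(StandardTypes.BIGINT) long value)
    {
        return value + 1;
    }
}

Plugins expose such classes through Plugin.getFunctions(), after which they are registered and resolved like built-in functions.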
Client Interfaces
Relevant Files
- client/trino-cli/src/main/java/io/trino/cli/Console.java
- client/trino-cli/src/main/java/io/trino/cli/Trino.java
- client/trino-cli/src/main/java/io/trino/cli/QueryRunner.java
- client/trino-jdbc/src/main/java/io/trino/jdbc/TrinoDriver.java
- client/trino-jdbc/src/main/java/io/trino/jdbc/NonRegisteringTrinoDriver.java
- client/trino-jdbc/src/main/java/io/trino/jdbc/TrinoConnection.java
- client/trino-client/src/main/java/io/trino/client/StatementClient.java
- client/trino-client/src/main/java/io/trino/client/StatementClientFactory.java
Trino provides two primary client interfaces for connecting to a Trino cluster: the Command Line Interface (CLI) for interactive query execution and the JDBC driver for Java-based applications. Both clients communicate with the Trino coordinator using the client protocol over HTTP/HTTPS.
CLI Architecture
The Trino CLI is a self-executing JAR file that provides an interactive terminal shell for running SQL queries. The entry point is the Trino class, which uses PicoCLI for command-line argument parsing and creates a Console instance that implements the interactive shell.
The Console class manages the interactive session lifecycle:
- Parses command-line options (connection details, output format, session properties)
- Creates a QueryRunner that handles HTTP communication with the coordinator
- Implements an interactive REPL loop with command history, auto-completion, and special commands (QUIT, EXPLAIN, DESCRIBE, SHOW)
- Supports both interactive and non-interactive modes (batch execution with --execute)
The QueryRunner class wraps HTTP client communication:
- Maintains an OkHttpClient for authenticated requests to the coordinator
- Maintains a separate OkHttpClient for unauthenticated segment data downloads
- Creates StatementClient instances for each query execution
- Manages session state (catalog, schema, session properties)
JDBC Driver Architecture
The Trino JDBC driver follows the standard JDBC driver pattern with two main classes:
TrinoDriver extends NonRegisteringTrinoDriver and registers itself with Java's DriverManager during class initialization. This allows applications to load the driver via Class.forName("io.trino.jdbc.TrinoDriver").
NonRegisteringTrinoDriver implements the core JDBC Driver interface:
- Parses JDBC URLs (format: jdbc:trino://host:port/catalog/schema)
- Creates TrinoConnection instances with configured HTTP clients
- Manages shared connection pools and dispatcher threads for efficiency
TrinoConnection represents a single JDBC connection:
- Maintains connection state (catalog, schema, session properties, transaction isolation level)
- Implements JDBC connection lifecycle (auto-commit, transaction management)
- Creates TrinoStatement instances for query execution
- Manages session properties and prepared statements
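A typical JDBC usage pattern looks like this (host, port, catalog, schema, and credentials below are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public final class JdbcExample
{
    public static void main(String[] args)
            throws Exception
    {
        Properties properties = new Properties();
        properties.setProperty("user", "test");

        // URL format: jdbc:trino://host:port/catalog/schema
        String url = "jdbc:trino://localhost:8080/tpch/tiny";

        try (Connection connection = DriverManager.getConnection(url, properties);
                Statement statement = connection.createStatement();
                ResultSet resultSet = statement.executeQuery("SELECT name FROM nation LIMIT 5")) {
            while (resultSet.next()) {
                System.out.println(resultSet.getString("name"));
            }
        }
    }

    private JdbcExample() {}
}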
Query Execution Flow
Both CLI and JDBC use the same underlying StatementClient interface for query execution:
Client Request
↓
StatementClientV1 (HTTP communication)
↓
Coordinator (query execution)
↓
Results streamed back via HTTP
↓
Client processes rows
The StatementClient interface provides methods to:
- Check query status (isRunning(), isFinished())
- Retrieve current results (currentStatusInfo(), currentRows())
- Handle errors and warnings
- Support both direct protocol (all data through coordinator) and spooling protocol (data from workers)
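Conceptually, both clients drive the protocol with a polling loop along these lines; the interface here is a simplified stand-in, not the real StatementClient API:

import java.util.List;

public final class PollingSketch
{
    // Simplified stand-in for the client protocol: poll the current batch of rows,
    // process it, then advance to the next server response until the query finishes.
    public interface SimpleStatementClient
    {
        boolean isRunning();

        Iterable<List<Object>> currentRows();

        boolean advance();
    }

    public static void consume(SimpleStatementClient client)
    {
        while (client.isRunning()) {
            for (List<Object> row : client.currentRows()) {
                System.out.println(row);
            }
            client.advance();
        }
    }

    private PollingSketch() {}
}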
Key Design Patterns
Resource Management: Both QueryRunner and TrinoConnection implement Closeable to properly shut down HTTP clients and connection pools.
Session Isolation: Each client maintains its own session state, allowing concurrent queries with different catalogs, schemas, and session properties.
HTTP Client Pooling: Shared connection pools and dispatchers reduce overhead for multiple queries within the same client instance.
Error Handling: Query failures are propagated through QueryStatusInfo with detailed error messages and stack traces available in debug mode.
Testing Infrastructure
Relevant Files
- testing/trino-testing/src/main/java/io/trino/testing/BaseConnectorTest.java
- testing/trino-testing/src/main/java/io/trino/testing/BaseConnectorSmokeTest.java
- testing/trino-testing/src/main/java/io/trino/testing/AbstractTestQueryFramework.java
- testing/trino-product-tests/README.md
- docs/src/main/sphinx/develop/tests.md
Trino uses a comprehensive, multi-layered testing strategy combining unit tests, integration tests, and product tests. The framework prioritizes fast execution, reliability, and practical coverage while minimizing hardware requirements.
Test Hierarchy
The testing infrastructure is organized in a clear inheritance hierarchy:
- AbstractTestQueryFramework – Base class for all query-based tests. Manages the QueryRunner lifecycle, the H2 reference database, and query assertions. Uses JUnit 5 with @TestInstance(PER_CLASS) and concurrent execution.
- AbstractTestQueries – Extends AbstractTestQueryFramework with standard SQL query tests (aggregations, joins, window functions, etc.). Provides reusable test methods for common query patterns.
- BaseConnectorTest – Comprehensive connector test suite extending AbstractTestQueries. Tests connector-specific functionality including DDL, DML, metadata operations, and query pushdown. Requires a distributed query runner with >= 3 nodes.
- BaseConnectorSmokeTest – Lightweight smoke tests for connector configuration variants. Tests basic functionality without deep coverage. Useful for testing multiple connector configurations.
Query Runners
Tests execute against different QueryRunner implementations:
- DistributedQueryRunner – Multi-node Trino cluster for testing distributed execution, query planning, and connector integration. Used by connector tests.
- H2QueryRunner – In-memory H2 database for reference query execution and result comparison.
- Custom Runners – Plugins implement specialized runners (e.g., ElasticsearchQueryRunner, JdbcQueryRunner) for connector-specific setup.
Assertion Framework
Tests use AssertJ for assertions and custom query assertion helpers:
assertQuery("SELECT * FROM nation");
assertUpdate("INSERT INTO table VALUES (1, 2)", 1);
query(session, "SELECT * FROM orders")
.matches(expectedValues("(1, 'a'), (2, 'b')"));
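A connector test typically wires these pieces together as in the following sketch (the query runner factory is hypothetical):

import io.trino.testing.AbstractTestQueryFramework;
import io.trino.testing.QueryRunner;
import org.junit.jupiter.api.Test;

final class TestExampleQueries
        extends AbstractTestQueryFramework
{
    @Override
    protected QueryRunner createQueryRunner()
            throws Exception
    {
        // Hypothetical helper; real tests build a DistributedQueryRunner (or a
        // connector-specific runner) configured with the catalogs they need
        return ExampleQueryRunner.create();
    }

    @Test
    void testCount()
    {
        // Compared against the H2 reference database managed by the framework
        assertQuery("SELECT count(*) FROM nation");
    }
}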
Product Tests
Product tests exercise end-to-end functionality using user-visible interfaces (CLI, JDBC). They run in Docker containers with full Trino deployments and external systems (Hadoop, databases, etc.).
- Environments: multinode, multinode-tls, singlenode-ldap, two-kerberos-hives, etc.
- Framework: Tempto harness for Java-based and convention-based SQL tests.
- Execution: testing/bin/ptl test run --environment <env>
- Groups: Tests tagged with groups (e.g., string_functions, authorization) for selective execution.
Testing Conventions
All tests must follow these standards:
- Use JUnit 5 with statically imported AssertJ assertions
- Class names start with Test (e.g., TestExample); method names start with test
- Classes are package-private and final
- Prefer unit tests over product tests; use TestContainers for infrastructure abstraction
- Avoid combinatorial tests; test items in isolation with minimal combinations
- Connector tests must use DistributedQueryRunner with >= 3 nodes
Behavior-Driven Testing
TestingConnectorBehavior enum allows tests to adapt based on connector capabilities:
protected boolean hasBehavior(TestingConnectorBehavior behavior) {
return behavior.hasBehaviorByDefault(this::hasBehavior);
}
@Test
public void testInsert() {
if (!hasBehavior(SUPPORTS_INSERT)) {
assertQueryFails("INSERT...", "not supported");
return;
}
// Test insert functionality
}
This pattern enables reusable test suites across connectors with different feature sets.