temporalio/temporal

Temporal Server

Last updated on Dec 18, 2025 (Commit: 8563567)

Overview

Relevant Files
  • README.md
  • docs/architecture/README.md
  • docs/architecture/workflow-lifecycle.md
  • go.mod

Temporal is a durable execution platform that enables developers to build scalable, resilient applications. It automatically handles transient failures, retries, and state management for long-running business processes called Workflows.

What is Temporal?

Temporal executes application logic in a fault-tolerant manner using event sourcing. Every workflow execution maintains an append-only history of events, allowing the system to recover from failures and replay execution at any point. This design separates user code (Workflows and Activities) from the Temporal cluster, enabling independent scaling and deployment.

Core Architecture

The system consists of three main components:

  1. User-Hosted Processes: Applications and Workers that define and execute Workflows and Activities using Temporal SDKs.
  2. Temporal Cluster: Distributed services that orchestrate workflow execution, manage task queues, and persist state.
  3. Persistence Layer: Durable storage for workflow history, mutable state, and task queues (Cassandra, MySQL, PostgreSQL, or SQLite for the primary store; Elasticsearch is supported for visibility data).

Key Services

  • Frontend Service: Handles client requests (start workflow, query, signal, etc.)
  • History Service: Manages individual workflow executions, drives them to completion, and stores all state
  • Matching Service: Manages task queues that workers poll for Workflow and Activity tasks
  • Worker Service: Internal service for system-level operations

How It Works

Workflows are deterministic code that orchestrates Activities (operations that may perform side effects). Workers continuously poll task queues for Workflow Tasks and Activity Tasks, execute them, and report the results back to the cluster. The History Service processes these results, updates mutable state, and enqueues new tasks as needed.
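
To make this concrete, here is a minimal sketch using the separately published Go SDK (go.temporal.io/sdk); the workflow, activity, and "orders" task queue names are hypothetical.

package main

import (
  "context"
  "log"
  "time"

  "go.temporal.io/sdk/client"
  "go.temporal.io/sdk/worker"
  "go.temporal.io/sdk/workflow"
)

// SendReceipt is an Activity: it may perform side effects and is retried by the cluster on failure.
func SendReceipt(ctx context.Context, orderID string) (string, error) {
  return "receipt-for-" + orderID, nil
}

// OrderWorkflow is deterministic orchestration code; all side effects go through Activities.
func OrderWorkflow(ctx workflow.Context, orderID string) (string, error) {
  ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
    StartToCloseTimeout: time.Minute, // each Activity attempt must finish within a minute
  })
  var receipt string
  // The cluster records ActivityTaskScheduled/Started/Completed events in the workflow history.
  err := workflow.ExecuteActivity(ctx, SendReceipt, orderID).Get(ctx, &receipt)
  return receipt, err
}

func main() {
  c, err := client.Dial(client.Options{}) // connects to the Frontend Service (localhost:7233 by default)
  if err != nil {
    log.Fatalln(err)
  }
  defer c.Close()

  // The Worker long-polls the "orders" task queue for Workflow Tasks and Activity Tasks.
  w := worker.New(c, "orders", worker.Options{})
  w.RegisterWorkflow(OrderWorkflow)
  w.RegisterActivity(SendReceipt)
  if err := w.Run(worker.InterruptCh()); err != nil {
    log.Fatalln(err)
  }
}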

Repository Structure

  • /api: Protocol buffer definitions and generated code for inter-service communication
  • /service: Core services (frontend, history, matching, worker)
  • /common: Shared libraries (persistence, metrics, configuration, caching, etc.)
  • /chasm: State machine library for workflow execution
  • /client: Client libraries for service-to-service communication
  • /cmd: CLI tools and server entry points
  • /schema: Database schema definitions for all supported backends
  • /tests: Integration and end-to-end tests
  • /docs: Architecture documentation and development guides

Key Design Principles

  • Deterministic Workflows: Workflow code must be deterministic and side-effect free (except for specific SDK operations); see the sketch after this list
  • Idempotent Activities: Activities must be idempotent or explicitly non-retryable
  • Durable Execution: Workflows survive process crashes and network failures through event sourcing
  • Scalability: Horizontal scaling through sharding and distributed task queues
  • Separation of Concerns: User code runs in user-owned processes; cluster handles orchestration and durability
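
To make the determinism principle concrete, the following sketch (hypothetical workflow, public Go SDK assumed) shows the replay-safe substitutes the SDK provides for common non-deterministic calls.

package example

import (
  "math/rand"
  "time"

  "go.temporal.io/sdk/workflow"
)

// DeterministicWorkflow illustrates replay-safe replacements for non-deterministic operations.
func DeterministicWorkflow(ctx workflow.Context) error {
  // Forbidden in workflow code: time.Now(), direct rand calls, goroutines, and network or disk I/O.
  // On replay after a crash these would produce different values and diverge from recorded history.

  startedAt := workflow.Now(ctx) // replay-safe time, taken from the current workflow task
  workflow.GetLogger(ctx).Info("workflow started", "at", startedAt)

  // A random value is computed once, recorded in history, and read back from history on replay.
  var lucky int
  if err := workflow.SideEffect(ctx, func(ctx workflow.Context) interface{} {
    return rand.Intn(100)
  }).Get(&lucky); err != nil {
    return err
  }

  // Durable timer instead of time.Sleep: it survives process restarts.
  return workflow.Sleep(ctx, 10*time.Minute)
}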

Architecture & Core Services

Relevant Files
  • temporal/server.go - Server initialization and service orchestration
  • temporal/server_impl.go - Service startup and shutdown logic
  • service/frontend/service.go - Frontend service implementation
  • service/history/service.go - History service implementation
  • service/matching/service.go - Matching service implementation
  • service/worker/service.go - Worker service implementation
  • docs/architecture/README.md - High-level architecture overview

Temporal's architecture is built on four core services that work together to execute workflows durably and at scale. Each service has distinct responsibilities, and they communicate via gRPC to coordinate workflow execution.

Core Services

Frontend Service handles all client-facing requests: starting workflows, querying execution state, signaling workflows, and polling for tasks. It acts as the entry point for both user applications and worker processes. The Frontend routes task polling requests to the Matching Service and forwards workflow commands to the History Service.

History Service manages individual workflow executions and maintains the event-sourced history for each workflow. It processes requests from both the user application and workers, appends events to the workflow history, and creates tasks (Transfer Tasks for immediate work, Timer Tasks for delayed work) that are enqueued in the Matching Service. History is sharded across multiple instances to support millions of concurrent workflows.

Matching Service manages task queues that workers poll for Workflow and Activity tasks. It receives tasks from the History Service and dispatches them to waiting workers. Task queues are partitioned for higher throughput, with a hierarchical forwarding mechanism to route tasks between partitions when needed.

Worker Service handles background processing: replication from remote clusters, workflow archival, and internal system tasks. It runs continuously and coordinates with other services to maintain cluster consistency.

Service Dependencies

The Frontend Service depends on the History and Matching services, the History Service depends on Matching, and the Worker Service depends on the Frontend, as reflected in the startup order below.

Startup Order

Services start in a specific order to ensure dependencies are ready:

  1. Matching Service - Started first; provides task queue infrastructure
  2. History Service - Depends on Matching; manages workflow state
  3. Frontend Service - Depends on History and Matching; routes client requests
  4. Worker Service - Depends on Frontend; handles background processing

Shutdown occurs in reverse order to gracefully drain traffic and complete in-flight operations.
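
A minimal sketch of this ordering (illustrative code, not the actual server_impl.go logic): services are started in dependency order and stopped in reverse.

package main

import "fmt"

type service struct{ name string }

func (s *service) Start() { fmt.Println("starting", s.name) }
func (s *service) Stop()  { fmt.Println("stopping", s.name) }

func main() {
  // Dependency order from the list above: Matching first, Worker last.
  startOrder := []*service{
    {name: "matching"},
    {name: "history"},
    {name: "frontend"},
    {name: "worker"},
  }
  for _, s := range startOrder {
    s.Start()
  }
  // Shutdown walks the same list in reverse so dependents drain before their dependencies stop.
  for i := len(startOrder) - 1; i >= 0; i-- {
    startOrder[i].Stop()
  }
}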

Key Interactions

Task Flow: When a workflow command is executed, the History Service creates a task and enqueues it in Matching. Workers continuously poll Matching for tasks, process them, and report results back through Frontend to History.

Consistency: The History Service ensures that tasks in Matching are consistent with workflow state through careful ordering of state transitions and task creation. Transfer Tasks represent immediate work, while Timer Tasks represent delayed work that fires when timers expire.

Scaling: History shards partition workflows by ID, allowing horizontal scaling. Matching partitions task queues to increase throughput. Both services use membership management to coordinate ownership and handle failures.

History Service & Event Sourcing

Relevant Files
  • service/history/handler.go
  • service/history/shard/context_impl.go
  • service/history/history_engine.go
  • service/history/historybuilder/event_store.go
  • common/persistence/execution_manager.go
  • docs/architecture/history-service.md

The History Service implements event sourcing as its core architectural pattern. Every workflow execution is represented as an immutable sequence of history events, and the current state can be reconstructed by replaying these events.

Event Sourcing Model

Workflow Execution History is a sequence of HistoryEvent objects; physically, histories are stored in a branching tree structure so that workflow resets and conflict resolution can create alternate branches. Each event is immutable and contains:

  • Event type (e.g., WorkflowExecutionStarted, ActivityTaskScheduled, TimerFired)
  • Event ID (monotonically increasing within a workflow run)
  • Version (for multi-cluster replication)
  • Associated data (payloads, timeouts, etc.)

The complete history is sufficient to recover all workflow state—this is the defining property of event sourcing.
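
A toy sketch of this property (illustrative event and state types, not the server's protobuf HistoryEvent or real mutable state): the current state is a pure fold over the ordered event list.

package main

import "fmt"

type event struct {
  ID   int64
  Type string // e.g. "WorkflowExecutionStarted", "ActivityTaskScheduled"
}

type state struct {
  LastEventID       int64
  PendingActivities int
}

// replay rebuilds the state from scratch by applying every event in order.
func replay(history []event) state {
  var s state
  for _, e := range history {
    switch e.Type {
    case "ActivityTaskScheduled":
      s.PendingActivities++
    case "ActivityTaskCompleted":
      s.PendingActivities--
    }
    s.LastEventID = e.ID // mutable state records the latest event it reflects
  }
  return s
}

func main() {
  h := []event{{1, "WorkflowExecutionStarted"}, {2, "ActivityTaskScheduled"}, {3, "ActivityTaskCompleted"}}
  fmt.Printf("%+v\n", replay(h)) // same history in, same state out
}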

Mutable State & Caching

While history events are the source of truth, maintaining a complete event replay on every request would be slow. Instead, the History Service maintains Mutable State: a cached summary of the workflow's current state including:

  • In-progress activities, timers, child workflows
  • Workflow execution metadata
  • Latest event ID reflected in this state

Mutable State is persisted as a single row (optimized for Cassandra) and cached in memory for recently accessed executions.

State Transitions

When the History Service handles a request (from user app, worker, or timer), it performs an atomic state transition:

The process:

  1. In-memory: Generate new history events, update mutable state, create internal tasks
  2. Append events: Write new events to history storage (via AppendHistoryNodes)
  3. Atomic transaction: Update mutable state and add history tasks in a single database transaction

This ensures consistency: if the transaction fails, the workflow reloads from persistence and retries.
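
The loop can be sketched roughly as follows (function and type names are illustrative, not the actual history-engine code):

package example

import "errors"

// errConditionFailed stands in for the persistence layer's condition-check failure.
var errConditionFailed = errors.New("mutable state was updated concurrently")

// Illustrative placeholders for history events, mutable state, and history tasks.
type (
  historyEvent struct{}
  mutableState struct{}
  historyTask  struct{}
)

// applyRequest drives one atomic state transition for a workflow execution.
func applyRequest(
  load func() (*mutableState, error),
  decide func(*mutableState) ([]historyEvent, []historyTask),
  appendEvents func([]historyEvent) error,
  commit func(*mutableState, []historyTask) error,
) error {
  for {
    ms, err := load() // load (or read cached) mutable state
    if err != nil {
      return err
    }
    events, tasks := decide(ms) // step 1: new events, updated state, and tasks, all in memory
    if err := appendEvents(events); err != nil { // step 2: AppendHistoryNodes
      return err
    }
    err = commit(ms, tasks) // step 3: mutable state + tasks in a single transaction
    if errors.Is(err, errConditionFailed) {
      continue // another writer won the race: reload from persistence and retry
    }
    return err
  }
}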

History Storage

Events are stored in a history tree structure:

  • History Nodes: Batches of events grouped by transaction
  • History Branches: Support for workflow resets (creates new branches)
  • Branch Tokens: Identify which branch a workflow is on

The EventStore (service/history/historybuilder/event_store.go) manages in-memory event buffering before persistence, tracking:

  • Database-buffered events
  • In-memory event batches
  • Scheduled-to-started event ID mappings

Task Generation

State transitions generate History Tasks (internal to History Service):

  • Transfer Tasks: Execute immediately (e.g., enqueue workflow/activity tasks to Matching Service)
  • Timer Tasks: Execute at a scheduled time (e.g., timeouts, user-defined timers)
  • Visibility Tasks: Update searchable workflow metadata
  • Replication Tasks: Sync state to other clusters

These tasks are persisted alongside mutable state and processed asynchronously by queue processors.

Consistency Guarantees

Event sourcing enables strong consistency:

  • Mutable State ↔ Tasks: Atomic database transactions
  • Events ↔ Mutable State: Mutable state tracks the latest event ID it reflects; invalid events are rejected
  • History Service ↔ Matching Service: Transactional outbox pattern; transfer tasks are persisted alongside mutable state before the call returns to the client, ensuring eventual consistency with Matching

Matching Service & Task Queues

Relevant Files
  • service/matching/matching_engine.go
  • service/matching/physical_task_queue_manager.go
  • service/matching/task_queue_partition_manager.go
  • service/matching/handler.go
  • service/matching/backlog_manager.go
  • service/matching/task_reader.go
  • service/matching/task_writer.go
  • service/matching/matcher.go

The Matching Service is responsible for managing task queues and delivering workflow and activity tasks to workers. It acts as a broker between the Frontend Service (which receives long-poll requests from workers) and the History Service (which produces tasks).

Core Components

Matching Engine (matching_engine.go) is the central orchestrator. It manages all task queues and their partitions, handles task routing, and coordinates between workers and the history service. The engine maintains in-memory state for active task queues and loads/unloads them based on activity.

Task Queue Partitions split a logical task queue into multiple physical partitions to increase throughput. Each partition is managed by a taskQueuePartitionManager. Partitions form a tree hierarchy where child partitions can forward tasks and polls to parent partitions when empty, converging at a root partition. This forwarding mechanism ensures tasks find available workers even when distributed across partitions.

Physical Task Queue Manager (physical_task_queue_manager.go) manages a single partition's state. It coordinates the backlog manager, task matcher, and handles task dispatch. It tracks active pollers, manages task forwarding, and enforces rate limits.

Task Flow

Tasks flow through the system in two paths:

  1. Sync Match: When a task arrives and a poller is immediately available, the task is matched synchronously and returned directly without persistence.

  2. Async Match: If no poller is available, the task is written to the backlog (persistence layer) via the taskWriter. The taskReader then loads tasks from persistence and attempts to match them with waiting pollers.

Backlog Management

The backlog manager (backlog_manager.go) handles task persistence and in-memory buffering. It uses three key components:

  • Task Writer (task_writer.go): Writes tasks sequentially to persistence, allocating task IDs and managing range leases to prevent concurrent ownership conflicts.
  • Task Reader (task_reader.go): Reads tasks from persistence in batches, maintains an in-memory buffer, and dispatches tasks to the matcher.
  • Task GC: Periodically deletes acknowledged tasks from persistence to prevent unbounded growth.

Task Matching

The TaskMatcher (matcher.go) synchronously matches task producers with consumers (pollers). It uses channels to coordinate between task sources (task reader, direct task additions) and waiting pollers. The matcher respects rate limits and can forward tasks to parent partitions if local pollers are unavailable.
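
A simplified sketch of the dual dispatch path (illustrative types, not the real matcher): a task is offered to a waiting poller over a channel, and falls back to the backlog when no poller is ready.

package example

// task is an illustrative stand-in for the real matching task types.
type task struct{ id int64 }

type matcher struct {
  taskC chan task // pollers block receiving on this channel while waiting for work
}

// offer attempts a sync match; it returns false when the task must be written to the backlog.
func (m *matcher) offer(t task) bool {
  select {
  case m.taskC <- t:
    return true // a poller was waiting: matched synchronously, never persisted
  default:
    return false // no poller available: the caller persists the task for the task reader
  }
}

// poll is what a worker's long poll reduces to: wait until a task is offered or forwarded.
func (m *matcher) poll() task {
  return <-m.taskC
}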

Fairness and Prioritization

The matching service supports fair task distribution across multiple fairness keys (e.g., different workflow types). Tasks are assigned a <pass, id> level pair that determines dispatch order. The fair task reader ensures tasks are dispatched in a fair manner by reading and buffering tasks ordered by their level, preventing starvation of lower-priority work.

Worker Versioning

The service integrates with worker versioning to route tasks to compatible worker deployments. Each physical queue can have versioned sub-queues for different build IDs. The matcher respects versioning rules and can redirect tasks between versions based on deployment configuration.

Key Design Patterns

  • Partition Hierarchy: Child partitions forward to parents, reducing memory overhead for low-traffic queues.
  • Dual-Path Dispatch: Sync matching for immediate availability, async for backlog processing.
  • Lease-Based Ownership: Range IDs prevent concurrent partition ownership and task loss during failover (see the sketch after this list).
  • Rate Limiting: Per-queue and per-partition rate limits prevent overwhelming workers and the persistence layer.
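
Roughly, the lease is a conditional write keyed on a rangeID (names and layout are illustrative, not the real task store schema):

package example

import "errors"

var errLostLease = errors.New("rangeID changed: another node owns this partition")

// queueRecord is an illustrative row describing one task queue partition.
type queueRecord struct {
  RangeID int64 // incremented on every lease acquisition
}

type store struct{ rec queueRecord }

// acquireLease bumps the rangeID, invalidating any writes from the previous owner.
func (s *store) acquireLease() int64 {
  s.rec.RangeID++
  return s.rec.RangeID
}

// writeTask succeeds only while the caller still holds the lease it acquired,
// preventing two managers from writing conflicting task IDs to the same partition.
func (s *store) writeTask(rangeID, taskID int64) error {
  if rangeID != s.rec.RangeID {
    return errLostLease
  }
  // ...persist the task using taskID allocated from the leased range...
  return nil
}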

Persistence & Data Storage

Relevant Files
  • common/persistence/persistence_interface.go
  • common/persistence/execution_manager.go
  • common/persistence/history_manager.go
  • common/persistence/sql/factory.go
  • common/persistence/cassandra/factory.go
  • schema/embed.go

The persistence layer is the abstraction that decouples Temporal from specific database implementations. It provides a unified interface for storing and retrieving workflow execution state, history, tasks, and metadata.

Core Interfaces

The persistence layer defines several key interfaces in persistence_interface.go:

DataStoreFactory – Creates store instances for different data types. Implementations exist for SQL (MySQL, PostgreSQL, SQLite) and Cassandra backends.

ExecutionStore – Manages workflow execution state including mutable state snapshots, mutations, and history branches. Handles create, update, and conflict resolution operations.

TaskStore – Manages task queues and individual tasks. Supports both standard and fair-weighted task distribution.

MetadataStore – Manages namespace definitions and cluster metadata.

HistoryStore – Manages workflow history events organized as branching trees (for multi-DC scenarios).

QueueV2 – Generic FIFO queue interface supporting dynamic queue names (replaces legacy Queue interface).

Execution Manager

The ExecutionManager wraps the low-level ExecutionStore and handles serialization/deserialization of workflow state. Key responsibilities:

  • Serialization: Converts protobuf objects to binary blobs for storage
  • Event batching: Groups history events into transactions
  • Mutation handling: Applies incremental state changes or full snapshots
  • Conflict resolution: Handles concurrent workflow updates via condition checks
  • XDC caching: Maintains cross-datacenter event cache for replication

History Manager

The HistoryManager manages workflow history as a tree structure supporting branching:

  • Branch forking: Creates new branches for continue-as-new and reset operations
  • History reading: Supports forward, reverse, and batch reading with pagination
  • Branch deletion: Cleans up unused branches while preserving referenced segments
  • Transaction validation: Ensures event continuity and version consistency

Backend Implementations

SQL Backend (common/persistence/sql/) – Supports MySQL, PostgreSQL, and SQLite. Uses connection pooling and reference counting. Schema is embedded in the binary via schema/embed.go.

Cassandra Backend (common/persistence/cassandra/) – Optimized for distributed deployments. Handles eventual consistency and multi-region replication.

Both backends implement the same interfaces, allowing runtime selection via configuration.

Data Serialization

The persistence layer uses protobuf serialization with pluggable codecs. Data is stored as DataBlob objects containing:

  • Raw binary data
  • Encoding type (e.g., proto3, json)
  • Optional compression

This allows schema evolution and codec flexibility without database migrations.
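
A minimal sketch of the shape (illustrative types; the real DataBlob is a protobuf message and the default encoding is proto3, not JSON):

package example

import "encoding/json"

// dataBlob mirrors the stored value: opaque bytes plus a record of how they were encoded.
type dataBlob struct {
  Encoding string // e.g. "proto3" or "json"
  Data     []byte
}

// encode wraps a value into a blob, tagging it with its encoding so that a future
// codec can still decode rows written today.
func encode(v any) (dataBlob, error) {
  b, err := json.Marshal(v) // the server normally serializes protobuf here
  return dataBlob{Encoding: "json", Data: b}, err
}

// decode would switch on blob.Encoding in a real implementation.
func decode(blob dataBlob, out any) error {
  return json.Unmarshal(blob.Data, out)
}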

Key Design Patterns

Snapshot vs. Mutation – Workflow state can be stored as a complete snapshot or as incremental mutations. Snapshots are used for new executions; mutations for updates.

Condition-based updates – Concurrent updates use version conditions to detect conflicts and trigger conflict resolution workflows.

Lazy deserialization – Blobs are deserialized only when needed, reducing memory overhead for large histories.

Shard-based partitioning – Data is partitioned by shard ID for horizontal scalability and load distribution.
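
For example, a shard can be derived by hashing the workflow identifier (stdlib FNV is used here for illustration; the server's actual hash function may differ):

package example

import "hash/fnv"

// shardFor maps a workflow to one of numShards history shards. All records and
// tasks for that workflow then live on, and are owned by, the same shard.
func shardFor(namespaceID, workflowID string, numShards int32) int32 {
  h := fnv.New32a()
  h.Write([]byte(namespaceID))
  h.Write([]byte(workflowID))
  return int32(h.Sum32()%uint32(numShards)) + 1 // shard IDs start at 1
}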

CHASM State Machine Framework

Relevant Files
  • chasm/engine.go
  • chasm/statemachine.go
  • chasm/component.go
  • chasm/task.go
  • chasm/context.go
  • chasm/library.go
  • service/history/chasm_engine.go
  • docs/architecture/chasm.md

CHASM (Coordinated Heterogeneous Application State Machines) is a framework for building distributed, event-driven state machines that power Temporal's core abstractions like Workflows, Activities, and Schedulers. It provides a composable, type-safe way to define and manage complex execution hierarchies.

Core Concepts

Executions and Components: An Execution is an instance of an Archetype (e.g., a workflow execution). Each execution has a tree structure where each subtree is a Component. A Component is the fundamental unit of state and behavior in CHASM. Components can be nested, forming a hierarchy that represents the logical structure of an execution.

ComponentRef: A ComponentRef uniquely identifies a component within an execution across all clusters. It contains an ExecutionKey, ComponentPath, and transition history information. This allows precise targeting of specific components for operations.

State Machines: CHASM uses a generic state machine pattern. The Transition type represents a state change from source states to a destination state, triggered by an event. Transitions are validated before application and can schedule tasks as side effects.
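
The shape of such a transition can be sketched with Go generics (illustrative only, not the chasm package's actual types or signatures):

package example

import (
  "errors"
  "slices"
)

var errInvalidTransition = errors.New("component is not in a valid source state")

// transition describes one edge of a state machine over state type S and event type E.
type transition[S comparable, E any] struct {
  Sources     []S
  Destination S
  Apply       func(E) error // may also schedule tasks as a side effect of the transition
}

// fire validates the current state, applies the event, and returns the new state.
func (t transition[S, E]) fire(current S, ev E) (S, error) {
  if !slices.Contains(t.Sources, current) {
    return current, errInvalidTransition
  }
  if err := t.Apply(ev); err != nil {
    return current, err
  }
  return t.Destination, nil
}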

Engine Interface

The Engine interface is the primary API for interacting with CHASM:

type Engine interface {
  NewExecution(ctx, ref, newFn, opts) (ExecutionKey, []byte, error)
  UpdateComponent(ctx, ref, updateFn, opts) ([]byte, error)
  ReadComponent(ctx, ref, readFn, opts) error
  PollComponent(ctx, ref, predicateFn, operationFn, opts) ([]byte, error)
}

NewExecution creates a new execution with an initial component. UpdateComponent modifies an existing component and persists changes. ReadComponent provides read-only access without mutations. PollComponent enables polling-based workflows (not yet implemented).

Tasks and Contexts

Tasks are units of work scheduled during component transitions. CHASM supports two task types:

  • Pure Tasks: Execute within the transaction context via PureTaskExecutor, allowing state mutations.
  • Side-Effect Tasks: Execute outside the transaction via SideEffectTaskExecutor, for external operations.

The MutableContext provides access to the execution state and allows scheduling tasks via AddTask(). The immutable Context enables read-only operations with access to component references and current time.

Libraries and Registration

Components and tasks are organized into Library implementations. Each library registers its components and tasks with a central Registry. This enables:

  • Type-safe component discovery
  • Task routing and execution
  • Search attribute mapping for visibility
  • gRPC service registration

Libraries can define ephemeral components (not persisted) or single-cluster components (not replicated).

Practical Usage

When implementing a component, embed UnimplementedComponent for forward compatibility and implement LifecycleState() and Terminate(). Use transitions to define valid state changes and schedule tasks. The engine handles persistence, locking, and transaction semantics automatically.

Worker Service & Background Processing

Relevant Files
  • service/worker/service.go
  • service/worker/worker.go
  • service/worker/replicator/replicator.go
  • service/worker/scheduler/workflow.go
  • service/worker/batcher/workflow.go
  • service/worker/scanner/scanner.go
  • service/worker/pernamespaceworker.go

The Worker Service is Temporal's background processing engine, responsible for all asynchronous cluster operations. It runs system workflows and activities that handle replication, scheduling, batch operations, and database maintenance.

Core Components

Worker Manager orchestrates SDK workers that execute system workflows. It maintains a default worker for most workflows and creates dedicated workers for components requiring isolated task queues. Each worker registers workflows and activities, then starts processing tasks from their assigned task queues.

Per-Namespace Workers handle namespace-specific background jobs. The per-namespace worker manager dynamically creates and manages workers for each namespace, scaling based on configuration and membership changes. Components like the Scheduler and Batcher register per-namespace to isolate their work.

Replicator consumes replication tasks from remote clusters and applies them locally. It listens to cluster metadata changes, creates replication message processors for each remote cluster, and maintains a cleanup queue to prevent task accumulation.

Scanner performs full database scans for resource cleanup and monitoring. It runs multiple specialized scavengers:

  • Executions Scavenger: Validates open workflow executions and cleans up orphaned data
  • History Scavenger: Deletes old history branches after retention periods
  • Task Queue Scavenger: Cleans up stale task queue metadata
  • Build ID Scavenger: Removes unused worker build IDs

Scheduler Workflow

The Scheduler manages scheduled workflow executions. It maintains schedule state, processes time-based triggers, handles overlapping executions via configurable policies (skip, buffer, cancel), and supports updates and patches. The workflow uses caching for time calculations and continues-as-new to keep history clean.

Batcher Workflow

The Batcher processes bulk operations on workflow executions: terminate, cancel, signal, delete, reset, and update options. It pages through matching executions, applies operations with configurable concurrency and retry logic, and tracks progress via heartbeats for resumption on failure.
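
The heartbeat-and-resume pattern can be sketched with the Go SDK's activity package (the listPage and apply callbacks are hypothetical stand-ins for the real visibility query and per-execution operation):

package example

import (
  "context"

  "go.temporal.io/sdk/activity"
)

// batchProgress is the heartbeat payload: enough to resume where the previous attempt stopped.
type batchProgress struct {
  NextPageToken []byte
  Processed     int
}

// RunBatch pages through matching executions and applies one operation per execution.
func RunBatch(ctx context.Context,
  listPage func(pageToken []byte) (ids []string, next []byte, err error),
  apply func(id string) error) error {

  var progress batchProgress
  if activity.HasHeartbeatDetails(ctx) {
    // A previous attempt failed part-way through: pick up from its last recorded heartbeat.
    if err := activity.GetHeartbeatDetails(ctx, &progress); err != nil {
      return err
    }
  }
  for {
    ids, next, err := listPage(progress.NextPageToken)
    if err != nil {
      return err
    }
    for _, id := range ids {
      if err := apply(id); err != nil {
        return err
      }
      progress.Processed++
    }
    progress.NextPageToken = next
    activity.RecordHeartbeat(ctx, progress) // resumption point if this attempt dies
    if len(next) == 0 {
      return nil
    }
  }
}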

Lifecycle

The service starts by initializing cluster metadata and namespace registry, then launches the scanner for database maintenance. If global namespaces are enabled, it starts the replicator. The worker manager and per-namespace worker manager then start, registering all system workflows and activities. Graceful shutdown stops all workers and cleanup processes.

Replication & Multi-Cluster Support

Relevant Files
  • service/history/replication/stream_sender.go
  • service/history/replication/stream_receiver.go
  • service/history/replication/task_processor.go
  • service/history/replication/task_executor.go
  • service/history/ndc/conflict_resolver.go
  • service/history/ndc/workflow_resetter.go
  • common/persistence/namespace_replication_queue.go
  • service/worker/replicator/replicator.go

Overview

Temporal supports multi-cluster deployments where workflows can be replicated across geographically distributed clusters. The replication system ensures consistency and enables failover by continuously synchronizing workflow state, history events, and namespace metadata between clusters.

Architecture

The replication system uses a bidirectional streaming model where each cluster acts as both a sender and receiver:

  • Stream Sender (active cluster): Monitors local workflow changes and sends replication tasks to passive clusters
  • Stream Receiver (passive cluster): Receives and applies replication tasks from active clusters
  • Task Processor: Executes replication tasks and handles conflicts
  • Namespace Replication Queue: Manages namespace-level replication tasks for global namespace updates

Replication Task Types

Replication tasks carry different types of state changes:

  • History Tasks: Workflow execution events and state transitions
  • Sync Activity Tasks: Activity state synchronization
  • Sync Workflow State Tasks: Complete workflow mutable state snapshots
  • Sync HSM Tasks: Hierarchical State Machine state synchronization
  • Namespace Tasks: Global namespace configuration updates
  • Task Queue User Data: Task queue metadata replication
  • Backfill History Tasks: Historical event recovery for new replicas

Conflict Resolution (NDC)

When replication tasks arrive at a passive cluster, conflicts may occur if the local cluster has processed events independently. The Conflict Resolver uses version histories to determine the correct branch (a simplified comparison is sketched after the following list):

  • Each workflow maintains a version history tracking which cluster made changes at each event
  • Incoming tasks with higher version numbers become the new current branch
  • Lower version tasks are applied to alternate branches for potential recovery
  • The Workflow Resetter rebuilds mutable state when branch changes occur
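
A rough sketch of the comparison (illustrative types; the real resolver in ndc/conflict_resolver.go handles many more cases, including histories that diverge mid-branch):

package example

// versionHistoryItem records the highest event ID written at a given failover version;
// it is an illustrative stand-in for the real version-history protos.
type versionHistoryItem struct {
  EventID int64
  Version int64
}

// isNewerBranch reports whether an incoming branch should become the current branch.
// Both histories are assumed non-empty; the higher failover version wins, and with
// equal versions the longer history wins.
func isNewerBranch(incoming, local []versionHistoryItem) bool {
  in, lo := incoming[len(incoming)-1], local[len(local)-1]
  if in.Version != lo.Version {
    return in.Version > lo.Version
  }
  return in.EventID > lo.EventID
}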

Flow Control

Replication uses bidirectional flow control to prevent overwhelming receivers:

  • Sender Flow Controller: Pauses sending when receiver queue depth exceeds thresholds
  • Receiver Flow Controller: Signals pause/resume based on task processing capacity
  • Tiered Processing: High-priority and low-priority tasks are tracked separately for better resource utilization

Dead Letter Queue (DLQ)

Failed replication tasks are written to a DLQ for manual intervention:

  • Tasks that fail repeatedly are moved to the DLQ
  • Operators can inspect, fix, and replay DLQ messages
  • DLQ supports range deletion and selective message replay

Namespace Replication

Global namespaces are replicated through a dedicated queue:

  • Namespace updates (create, update, failover) are published to the replication queue
  • The Worker Replicator consumes these tasks and applies them to all clusters
  • Ensures consistent namespace configuration across the cluster group

Frontend Service & Public API

Relevant Files
  • service/frontend/service.go
  • service/frontend/workflow_handler.go
  • service/frontend/admin_handler.go
  • service/frontend/operator_handler.go
  • service/frontend/http_api_server.go
  • service/frontend/nexus_handler.go

The Frontend Service is the primary entry point for all client interactions with Temporal. It exposes three main gRPC services and optional HTTP/REST APIs, handling request validation, rate limiting, authorization, and routing to backend services.

Core Services

WorkflowService handles all workflow execution APIs: starting workflows, polling for tasks, completing tasks, signaling, querying, and managing workflow state. It validates requests, enforces quotas, and routes operations to History and Matching services.

AdminService provides internal administrative operations: namespace management, cluster metadata, replication, search attributes, and deep health checks. It manages cluster-wide state and coordinates with History Service.

OperatorService exposes operator-level APIs for managing search attributes, listing workers, and cluster operations. It provides visibility into system state without requiring admin privileges.

HTTP API Layer

The optional HTTP API Server (enabled via HTTPPort configuration) provides REST access to WorkflowService and OperatorService using gRPC-Gateway. It:

  • Translates HTTP requests to gRPC calls via an inline client connection
  • Applies the same interceptors and validation as gRPC
  • Supports OpenAPI specification endpoints (/swagger.json, /openapi.yaml)
  • Enforces allowed hosts via dynamic configuration

Nexus Integration

Nexus operations are dispatched through dedicated HTTP handlers that route requests to the Matching Service as Nexus tasks. The frontend:

  • Validates Nexus endpoints and namespaces
  • Applies authorization and rate limiting
  • Supports both endpoint-based and namespace/task-queue-based dispatch
  • Handles cross-cluster forwarding when enabled

Request Processing Pipeline

All requests flow through a chain of interceptors (a minimal chaining sketch follows the list):

  1. Namespace Validation – Verifies namespace exists and is active
  2. Authorization – Checks permissions via configured authorizer
  3. Rate Limiting – Enforces per-namespace and global quotas
  4. Telemetry – Records metrics and traces
  5. Handler Logic – Executes the actual RPC method
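
As an illustration of how such a chain composes (generic gRPC code, not the frontend's actual interceptors):

package example

import (
  "context"

  "google.golang.org/grpc"
  "google.golang.org/grpc/codes"
  "google.golang.org/grpc/status"
)

// rateLimit is one illustrative interceptor; namespace validation, authorization, and
// telemetry follow the same shape and are chained ahead of the handler.
func rateLimit(allow func() bool) grpc.UnaryServerInterceptor {
  return func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler) (interface{}, error) {
    if !allow() {
      return nil, status.Error(codes.ResourceExhausted, "namespace rate limit exceeded")
    }
    return handler(ctx, req) // pass the request further down the chain
  }
}

// newServer wires the interceptors in order; the RPC handler itself runs last.
func newServer(interceptors ...grpc.UnaryServerInterceptor) *grpc.Server {
  return grpc.NewServer(grpc.ChainUnaryInterceptor(interceptors...))
}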

Configuration & Quotas

The Config struct contains 100+ dynamic configuration properties controlling:

  • Rate limits (RPS per namespace, global, visibility operations)
  • Size limits (blobs, memos, search attributes)
  • Feature flags (schedules, deployments, worker versioning, Nexus APIs)
  • Timeout defaults and retry policies
  • gRPC keep-alive settings

Quotas are enforced at multiple levels: global, per-namespace, and per-task-queue, with burst ratio support for traffic spikes.

Health & Lifecycle

The Service manages startup and shutdown:

  • Registers all three gRPC services with the server
  • Starts version checker, handlers, and HTTP server
  • Provides health check endpoints for load balancers
  • Graceful shutdown with configurable drain duration

Development, Testing & Tools

Relevant Files
  • CONTRIBUTING.md
  • Makefile
  • temporaltest/server.go
  • tests/testcore/functional_test_base.go
  • tools/tdbg/app.go
  • docs/development/testing.md

Build & Compilation

The project uses a comprehensive Makefile for building and code generation. Key targets include:

  • make or make install: Install tools and build binaries
  • make bins: Rebuild binaries without running tests
  • make proto: Recompile proto files and regenerate code
  • make lint-code: Run linters and type checks
  • make fmt-imports: Format imports using gci

Build tags control compilation behavior:

  • test_dep: Enables test hooks (required for some tests)
  • TEMPORAL_DEBUG: Extends timeouts for debugging sessions
  • disable_grpc_modules: Reduces binary size by excluding unused gRPC dependencies

Test Categories

Tests are organized into three categories:

  1. Unit Tests (make unit-test): Fast, isolated tests with no external dependencies
  2. Integration Tests (make integration-test): Test server integration with databases (Cassandra, SQL, Elasticsearch)
  3. Functional Tests (make functional-test): End-to-end tests covering Temporal server functionality

Run all tests with make test. Individual tests can be run using:

go test -v <path> -run <TestSuite> -testify.m <TestSpecificTaskName>

Test Infrastructure

FunctionalTestBase (tests/testcore/functional_test_base.go) provides the foundation for functional tests. It includes:

  • Test cluster setup and teardown
  • Namespace management
  • Task poller for workflow task handling
  • Assertion helpers (ProtoAssertions, HistoryRequire, UpdateUtils)

temporaltest.TestServer (temporaltest/server.go) offers a lightweight testing API for end-to-end tests (see the example after this list):

  • Embedded Temporal server on a system-chosen port
  • Client and worker management
  • Namespace creation and configuration
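
A sketch of a test built on it (the Greet workflow and "test" task queue are hypothetical; the temporaltest method names follow the package's published examples and may have changed):

package example

import (
  "context"
  "testing"
  "time"

  "go.temporal.io/sdk/client"
  "go.temporal.io/sdk/worker"
  "go.temporal.io/sdk/workflow"
  "go.temporal.io/server/temporaltest"
)

// Greet is a trivial workflow used only for the test.
func Greet(ctx workflow.Context, name string) (string, error) {
  return "Hello " + name, nil
}

func TestGreet(t *testing.T) {
  // Start an embedded server on a system-chosen port; cleanup is tied to t.
  ts := temporaltest.NewServer(temporaltest.WithT(t))

  // Register a worker on the hypothetical "test" task queue.
  ts.NewWorker("test", func(registry worker.Registry) {
    registry.RegisterWorkflow(Greet)
  })

  ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()

  run, err := ts.GetDefaultClient().ExecuteWorkflow(ctx,
    client.StartWorkflowOptions{TaskQueue: "test"}, Greet, "Temporal")
  if err != nil {
    t.Fatal(err)
  }
  var out string
  if err := run.Get(ctx, &out); err != nil {
    t.Fatal(err)
  }
  if out != "Hello Temporal" {
    t.Fatalf("unexpected result: %q", out)
  }
}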

Test Helpers

The common/testing package provides utilities:

  • testvars: Generate consistent test identifiers (workflow IDs, namespaces, task queues)
  • taskpoller: Handle workflow tasks with full control over worker behavior
  • softassert: Log invariant violations without stopping test execution
  • testlogger: Configurable logging for tests

Debugging & Tools

tdbg (tools/tdbg/) is a command-line debugging tool for Temporal servers:

  • Inspect workflow executions and task queues
  • Decode and analyze task blobs
  • Manage dead letter queues (DLQ)
  • Supports custom task serialization

Environment Variables for Testing:

  • TEMPORAL_TEST_LOG_FORMAT: json or console
  • TEMPORAL_TEST_LOG_LEVEL: debug, info, warn, error, fatal
  • TEMPORAL_TEST_OTEL_OUTPUT: Path for OpenTelemetry trace output on test failures
  • CGO_ENABLED=0: Speeds up compilation

IDE Debugging (GoLand): Add build tags to "Go tool arguments":

-tags disable_grpc_modules,test_dep

Code Generation

Proto changes require regeneration:

make proto

This runs:

  • protoc: Compiles proto files
  • proto-codegen: Generates service clients and interceptors
  • go-generate: Processes //go:generate directives

Use make update-go-api to pull latest API changes from the api-go repository.