Overview
Relevant Files
- README.md
- libs/core/README.md
- libs/core/langchain_core/__init__.py
- libs/README.md
LangChain is a framework for building agents and LLM-powered applications. It provides a standard interface for models, embeddings, vector stores, and more, enabling developers to chain together interoperable components and third-party integrations while future-proofing decisions as underlying technology evolves.
What is LangChain?
LangChain helps developers build applications powered by large language models (LLMs) through composable, modular abstractions. The framework supports real-time data augmentation, model interoperability, rapid prototyping, and production-ready features like monitoring and evaluation.
Key capabilities include:
- Real-time data augmentation – Connect LLMs to diverse data sources and external systems
- Model interoperability – Swap models seamlessly as your needs evolve
- Rapid prototyping – Build and iterate quickly with modular components
- Production-ready features – Deploy with built-in monitoring and debugging support
- Vibrant ecosystem – Leverage integrations, templates, and community contributions
Monorepo Architecture
This is a Python monorepo with multiple independently versioned packages using uv for dependency management. The structure follows a layered architecture:
Core Layer (libs/core/) – Base abstractions, interfaces, and protocols. Defines chat models, LLMs, vector stores, retrievers, Runnables, and LangChain Expression Language (LCEL). Kept lightweight with minimal dependencies.
Implementation Layer (libs/langchain_v1/) – Actively maintained package with concrete implementations and high-level utilities. Built on top of langchain-core.
Legacy Layer (libs/langchain/) – langchain-classic package containing deprecated functionality and legacy chains. Not recommended for new projects.
Integration Layer (libs/partners/) – Third-party provider integrations maintained by the LangChain team, including OpenAI, Anthropic, Ollama, and others. Many integrations are maintained in separate repositories like langchain-google and langchain-aws.
Testing Layer (libs/standard-tests/) – Standardized test suite for integration packages.
Utilities (libs/text-splitters/, libs/model-profiles/, libs/cli/) – Document chunking, model configuration, and command-line tools.
Core Concepts
Runnables – The universal invocation protocol enabling synchronous, asynchronous, batch, and streaming operations on any component.
LangChain Expression Language (LCEL) – A declarative syntax for composing Runnables into production-grade programs with built-in support for async, batch, and streaming.
Chat Models vs. LLMs – Chat models use message sequences with distinct roles (user, assistant, system), while legacy LLMs use plain text input/output.
Getting Started
Install the main package:
pip install langchain
Or install just the core abstractions:
pip install langchain-core
For advanced agent orchestration, pair LangChain with LangGraph, the low-level agent framework trusted in production by companies like LinkedIn, Uber, and Klarna.
Architecture & Monorepo Structure
Relevant Files
- libs/core/langchain_core – Base abstractions and interfaces
- libs/langchain_v1/langchain – Main implementation package
- libs/partners/ – Third-party integrations (OpenAI, Anthropic, etc.)
- libs/text-splitters/langchain_text_splitters – Document chunking utilities
- libs/standard-tests/langchain_tests – Shared test suite for integrations
- libs/model-profiles/langchain_model_profiles – Model configuration profiles
- libs/cli/langchain_cli – Command-line interface tools
LangChain is organized as a Python monorepo using uv for dependency management. This structure enables independent versioning, focused development, and clear separation of concerns across the ecosystem.
Three-Layer Architecture
The monorepo follows a layered design pattern:
- Core Layer (langchain-core): Defines base abstractions, interfaces, and protocols. Includes the Runnable protocol, message types, embeddings, language models, and output parsers. Kept lightweight with minimal dependencies.
- Implementation Layer (langchain): Provides concrete implementations, high-level utilities, and pre-built agent architectures. Depends on langchain-core and integrates partner packages.
- Integration Layer (partners/): Third-party service integrations maintained by the LangChain team. Each integration is independently versioned and can be installed separately.
Package Organization
libs/
├── core/ # langchain-core (base abstractions)
├── langchain_v1/ # langchain (main package)
├── partners/ # Integrations (OpenAI, Anthropic, Ollama, etc.)
├── text-splitters/ # Document chunking utilities
├── standard-tests/ # Shared test suite for integrations
├── model-profiles/ # Model configuration profiles
└── cli/ # Command-line tools
Key Packages
langchain-core: Provides the foundational abstractions that all other packages build upon. Includes Runnable (universal invocation protocol), message types, embeddings interfaces, and LLM base classes.
langchain: The main user-facing package. Provides agents, chains, and high-level utilities for building LLM applications. Built on top of langchain-core.
partners/: Contains integrations with model providers and services. Examples include langchain-openai, langchain-anthropic, langchain-ollama, and vector store integrations like langchain-chroma and langchain-qdrant.
text-splitters: Specialized package for document chunking with support for multiple formats (Markdown, JSON, HTML, Python, LaTeX, etc.).
standard-tests: Provides standardized test suites that integration packages can use to ensure compliance with LangChain interfaces.
model-profiles: Configuration profiles for different models, enabling consistent behavior across providers.
cli: Command-line tools for project scaffolding, integration templates, and development utilities.
Development Workflow
Each package has its own pyproject.toml and uv.lock file for independent dependency management. Local development uses editable installs via [tool.uv.sources]. Run tests with make test, lint with make lint, and format with make format from any package directory.
Core Abstractions & Runnables
Relevant Files
- libs/core/langchain_core/runnables/base.py
- libs/core/langchain_core/runnables/config.py
- libs/core/langchain_core/language_models/base.py
- libs/core/langchain_core/language_models/chat_models.py
- libs/core/langchain_core/embeddings/embeddings.py
LangChain's core abstractions are built around the Runnable interface, which provides a unified way to invoke, batch, stream, and compose components. These abstractions form the foundation of the LangChain Expression Language (LCEL).
The Runnable Interface
A Runnable is a unit of work that transforms input to output. All Runnables expose a consistent API:
- invoke(input) – Transform a single input synchronously
- ainvoke(input) – Transform a single input asynchronously
- batch(inputs) – Process multiple inputs in parallel (default uses a thread pool)
- abatch(inputs) – Process multiple inputs asynchronously
- stream(input) – Stream output as it's produced
- astream(input) – Stream output asynchronously
All methods accept an optional config parameter for controlling execution behavior, adding tags/metadata for tracing, and managing callbacks.
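A minimal sketch of the synchronous methods using RunnableLambda, which wraps a plain function and needs no model or API key:
from langchain_core.runnables import RunnableLambda

double = RunnableLambda(lambda x: x * 2)

double.invoke(3)          # 6
double.batch([1, 2, 3])   # [2, 4, 6], processed in parallel
for chunk in double.stream(3):
    print(chunk)          # 6 (one chunk here; model-backed runnables yield many)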
Composition Primitives
Runnables compose declaratively using two main patterns:
RunnableSequence (| operator): Chains runnables sequentially, passing one's output as the next's input.
from langchain_core.runnables import RunnableLambda
sequence = RunnableLambda(lambda x: x + 1) | RunnableLambda(lambda x: x * 2)
sequence.invoke(1) # Returns 4
RunnableParallel (dict literal): Invokes runnables concurrently with the same input, returning a dict of outputs.
sequence = RunnableLambda(lambda x: x + 1) | {
"mul_2": RunnableLambda(lambda x: x * 2),
"mul_5": RunnableLambda(lambda x: x * 5),
}
sequence.invoke(1) # Returns {'mul_2': 4, 'mul_5': 10}
Language Models & Embeddings
BaseLanguageModel and BaseChatModel are Runnables that wrap LLM APIs. They accept LanguageModelInput (prompts, strings, or message sequences) and return structured outputs.
Embeddings is a simpler interface for text-to-vector models, with methods embed_documents() and embed_query(). Async variants use run_in_executor() by default.
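A minimal sketch of a custom embedding model; only the two sync methods are required, and the toy vectors below are purely illustrative:
from langchain_core.embeddings import Embeddings

class ToyEmbeddings(Embeddings):
    """Illustrative only: maps text to a tiny deterministic vector."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> list[float]:
        # Stand-in for a real embedding model.
        return [float(ord(ch) % 7) for ch in text[:8].ljust(8)]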
Configuration & Execution
RunnableConfig is a TypedDict controlling execution:
- tags – Filter and trace calls
- metadata – JSON-serializable context
- callbacks – Lifecycle hooks for monitoring
- run_name – Tracer run identifier
- max_concurrency – Limit parallel calls
- recursion_limit – Prevent infinite loops (default: 25)
- configurable – Runtime values for configurable fields
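Several of these fields can be set in one config dict; a short sketch (the metadata values are illustrative):
from langchain_core.runnables import RunnableLambda

increment = RunnableLambda(lambda x: x + 1)

increment.batch(
    [1, 2, 3],
    config={
        "run_name": "increment",
        "tags": ["demo"],
        "metadata": {"request_id": "abc-123"},
        "max_concurrency": 2,  # at most two inputs in flight at once
    },
)  # [2, 3, 4]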
Standard Methods
All Runnables support modifiers that return new Runnables:
- .bind(**kwargs) – Bind fixed arguments
- .with_config(config) – Bind configuration
- .with_retry(stop_after_attempt=N) – Add retry logic
- .with_listeners(on_start, on_end, on_error) – Add lifecycle hooks
- .with_fallbacks([fallback_runnable]) – Add fallback chains
These methods compose seamlessly, enabling declarative error handling, tracing, and customization without modifying the underlying logic.
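A sketch combining two of these modifiers; the primary runnable always fails so the fallback is exercised:
from langchain_core.runnables import RunnableLambda

def flaky(x: int) -> int:
    raise ValueError("transient failure")  # always fails, for illustration

primary = RunnableLambda(flaky).with_retry(stop_after_attempt=2)
backup = RunnableLambda(lambda x: x * 10)

safe = primary.with_fallbacks([backup])
safe.invoke(3)  # retries the primary twice, then the fallback returns 30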
Messages & Prompts
Relevant Files
- libs/core/langchain_core/messages/base.py
- libs/core/langchain_core/messages/ai.py
- libs/core/langchain_core/messages/human.py
- libs/core/langchain_core/messages/system.py
- libs/core/langchain_core/prompts/base.py
- libs/core/langchain_core/prompts/chat.py
- libs/core/langchain_core/prompts/prompt.py
Message Types
Messages are the fundamental building blocks for chat-based interactions in LangChain. The BaseMessage class serves as the abstract base for all message types, with each message containing content (text or structured blocks), optional additional_kwargs for provider-specific data, and response_metadata for tracking information like token counts.
The core message types are:
- HumanMessage – Input from the user to the model
- AIMessage – Output from the model, optionally containing tool calls
- SystemMessage – Instructions that prime the model's behavior, typically placed first
- ChatMessage – Generic messages with custom roles
- ToolMessage – Results from tool/function calls
- FunctionMessage – Legacy function call results
Each message type has a corresponding *Chunk variant for streaming scenarios.
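A short example constructing a conversation from these types; the AIMessage is hard-coded here rather than produced by a model:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

conversation = [
    SystemMessage(content="You are a terse assistant."),
    HumanMessage(content="What is LangChain?"),
    AIMessage(content="A framework for building LLM applications."),
]
# Any chat model accepts such a list: chat_model.invoke(conversation)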
Content Blocks
Messages support multimodal content through standardized content blocks—a provider-agnostic format that unifies different LLM APIs. Instead of provider-specific schemas, content is represented as a list of typed blocks:
- TextContentBlock – Plain text
- ImageContentBlock – Images (URL or base64)
- AudioContentBlock – Audio data
- VideoContentBlock – Video data
- ReasoningContentBlock – Model reasoning/thinking
- ToolCall – Function invocations with arguments
Adapters translate these standard blocks into provider-specific formats (OpenAI, Anthropic, etc.).
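A sketch of a multimodal HumanMessage whose content is a list of blocks; the URL and exact block keys are illustrative, and providers differ in which block types they accept:
from langchain_core.messages import HumanMessage

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ]
)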
Prompt Templates
Prompt templates are reusable patterns for constructing inputs to language models. The BasePromptTemplate class defines the interface, with two main categories:
String Prompts (PromptTemplate) use template syntax (f-string, Jinja2, or Mustache) to format text:
from langchain_core.prompts import PromptTemplate
prompt = PromptTemplate.from_template("Say {foo}")
prompt.format(foo="bar") # "Say bar"
Chat Prompts (ChatPromptTemplate) compose multiple message templates into a conversation:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{question}"),
])
Key Concepts
Partial Variables – Pre-fill template placeholders to reduce required inputs:
prompt = PromptTemplate.from_template(
    "Say {greeting} to {name}",
    partial_variables={"greeting": "Hello"},
)
prompt.format(name="Alice")  # "Say Hello to Alice"
MessagesPlaceholder – Insert dynamic message lists (e.g., chat history) into prompts without formatting.
Input Variables – Declared template parameters that must be provided at runtime. Optional variables are auto-inferred and don't require input.
Output Parsers – Optional post-processing to structure model outputs (attached to prompts via output_parser field).
Workflow
A typical flow chains messages and prompts: format a prompt template with user input → invoke a chat model with the resulting messages → parse the model's response message. This pattern enables flexible, composable LLM applications.
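A sketch of that flow as an LCEL chain, assuming langchain-openai is installed and OPENAI_API_KEY is set; any chat model can stand in for ChatOpenAI, and the model name is illustrative:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

# prompt -> chat model -> parse the AIMessage into a plain string
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()
chain.invoke({"question": "What is a Runnable?"})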
Tools & Agent Interfaces
Relevant Files
- libs/core/langchain_core/tools/base.py
- libs/core/langchain_core/tools/structured.py
- libs/core/langchain_core/tools/simple.py
- libs/core/langchain_core/tools/convert.py
- libs/core/langchain_core/agents.py
LangChain tools are reusable components that agents can invoke to perform specific actions. They bridge language models with external systems, APIs, and computations. The tool system is built on a hierarchy of abstractions that support both simple and complex use cases.
Tool Architecture
Core Tool Classes
BaseTool is the abstract base class for all tools. It extends RunnableSerializable and defines the interface that agents use to invoke tools. Key properties include:
- name: Unique identifier for the tool
- description: Tells the model when and how to use the tool
- args_schema: Pydantic model defining input arguments
- return_direct: Whether to stop the agent loop after execution
- handle_tool_error: Error handling strategy for tool failures
StructuredTool supports multiple typed inputs through a Pydantic schema. It accepts both sync (func) and async (coroutine) implementations. Use this when your tool needs structured, validated inputs.
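A sketch of StructuredTool.from_function with an explicit Pydantic schema (the tool and field names are illustrative):
from pydantic import BaseModel, Field
from langchain_core.tools import StructuredTool

class AddInput(BaseModel):
    a: int = Field(description="First number")
    b: int = Field(description="Second number")

def add(a: int, b: int) -> int:
    return a + b

add_tool = StructuredTool.from_function(
    func=add,
    name="add",
    description="Add two integers.",
    args_schema=AddInput,
)
add_tool.invoke({"a": 2, "b": 3})  # 5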
Tool is a simpler variant that accepts a single string input. It's useful for basic operations that don't require complex argument validation.
Creating Tools
The @tool decorator is the recommended way to create tools from functions:
from langchain_core.tools import tool
@tool
def search(query: str) -> str:
    """Search for information about a topic.

    Args:
        query: The search query.
    """
    return f"Results for {query}"

@tool(return_direct=True)
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression."""
    return str(eval(expression))
The decorator automatically infers the schema from type hints. You can customize behavior with parameters like description, return_direct, args_schema, and parse_docstring.
Agent Integration
Agents use tools through the AgentAction schema, which specifies:
- tool: Name of the tool to execute
- tool_input: Arguments to pass (string or dict)
- log: Additional context about the decision
When an agent decides to use a tool, it emits an AgentAction. The agent executor then (see the sketch after this list):
- Looks up the tool by name
- Validates and parses the input
- Executes the tool
- Returns the result as an observation
- Feeds the observation back to the agent
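A hand-rolled sketch of one iteration of that loop; real agent executors add error handling, streaming, and message bookkeeping:
from langchain_core.tools import tool

@tool
def get_length(text: str) -> int:
    """Return the length of the given text."""
    return len(text)

tools_by_name = {get_length.name: get_length}

# Pretend the model emitted this action (fields mirror AgentAction).
action = {"tool": "get_length", "tool_input": {"text": "hello"}}

selected = tools_by_name[action["tool"]]             # look up the tool by name
observation = selected.invoke(action["tool_input"])  # validate, parse, execute
# The observation (5) is then fed back to the agent, e.g. as a ToolMessage.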
Tool Rendering
Tools can be rendered for display to language models using utility functions:
- render_text_description(): Shows tool name and description
- render_text_description_and_args(): Includes argument schemas
These renderers help the model understand available tools and their expected inputs.
Error Handling
Tools support graceful error handling through ToolException and validation error handlers. Set handle_tool_error to a boolean, string, or callable to control how errors are reported back to the agent, allowing recovery without stopping execution.
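A sketch of a tool that reports failures back to the agent instead of raising, using ToolException with a callable handler (the error messages are illustrative):
from langchain_core.tools import StructuredTool, ToolException

def get_weather(city: str) -> str:
    """Look up the weather for a city."""
    raise ToolException(f"No data for {city}")  # simulated failure

def handle(error: ToolException) -> str:
    return f"Lookup failed: {error.args[0]}. Try a different city."

weather_tool = StructuredTool.from_function(
    func=get_weather,
    handle_tool_error=handle,  # True or a plain string also works
)
weather_tool.invoke({"city": "Atlantis"})
# "Lookup failed: No data for Atlantis. Try a different city."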
Data Retrieval & Vector Stores
Relevant Files
- libs/core/langchain_core/vectorstores/base.py
- libs/core/langchain_core/retrievers.py
- libs/core/langchain_core/document_loaders/base.py
- libs/text-splitters/langchain_text_splitters/base.py
- libs/core/langchain_core/indexing/api.py
Overview
Data retrieval is the process of loading, processing, and searching through documents to find relevant information. LangChain provides a modular pipeline: Document Loaders fetch raw data, Text Splitters chunk it into manageable pieces, Vector Stores embed and index documents, and Retrievers query them efficiently.
Document Loaders
Document loaders implement the BaseLoader interface to fetch data from various sources (files, APIs, databases). They support both eager and lazy loading:
- load() – Returns all documents at once
- lazy_load() – Yields documents one at a time (memory-efficient)
- load_and_split() – Loads and chunks documents in one step
from langchain_core.document_loaders import BaseLoader
from langchain_core.documents import Document

class CustomLoader(BaseLoader):
    def lazy_load(self):
        # Yield Document objects one at a time.
        yield Document(page_content="...", metadata={})
Text Splitters
Text splitters break large documents into chunks with optional overlap. The TextSplitter base class provides:
- chunk_size – Maximum characters per chunk
- chunk_overlap – Overlap between consecutive chunks
- split_text() – Core splitting logic (implemented by subclasses)
- split_documents() – Splits a list of documents while preserving metadata
Overlap helps maintain context across chunk boundaries, preventing information loss at split points.
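A sketch with RecursiveCharacterTextSplitter from langchain-text-splitters; the chunk sizes are intentionally small so the overlap is visible:
from langchain_text_splitters import RecursiveCharacterTextSplitter

long_text = "LangChain splits documents into overlapping chunks for retrieval. " * 10

splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,    # max characters per chunk
    chunk_overlap=20,  # characters shared between neighbouring chunks
)
chunks = splitter.split_text(long_text)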
Vector Stores
Vector stores embed documents and enable semantic search. The VectorStore interface provides:
- add_documents() – Embed and store documents
- similarity_search() – Find documents similar to a query
- max_marginal_relevance_search() – Diverse results (MMR algorithm)
- delete() – Remove documents by ID
Vector stores can be used as retrievers via as_retriever(), converting them to the BaseRetriever interface.
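A sketch using the in-memory vector store from langchain-core with a deterministic fake embedding model so it runs offline; in practice you would swap in a real embeddings provider:
from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore

store = InMemoryVectorStore(embedding=DeterministicFakeEmbedding(size=256))
store.add_documents([
    Document(page_content="LangChain composes Runnables.", metadata={"source": "notes"}),
    Document(page_content="Vector stores enable semantic search."),
])

store.similarity_search("How does semantic search work?", k=1)
retriever = store.as_retriever(search_kwargs={"k": 1})  # BaseRetriever interface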
Retrievers
Retrievers are more general than vector stores—they only need to return documents, not store them. The BaseRetriever abstract class requires:
- _get_relevant_documents(query: str) – Sync retrieval logic
- _aget_relevant_documents(query: str) – Async retrieval logic (optional)
Retrievers follow the standard Runnable interface, supporting invoke(), ainvoke(), batch(), and abatch() methods.
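A minimal sketch of a custom retriever; it returns a canned document regardless of the query, but it shows the required method and the Runnable-style invocation:
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class CannedRetriever(BaseRetriever):
    """Illustrative only: always returns the same document."""

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> list[Document]:
        return [Document(page_content=f"Canned answer for: {query}")]

CannedRetriever().invoke("what is LCEL?")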
Indexing API
The index() function manages document lifecycle in vector stores using a RecordManager to track changes:
- Deduplication – Hashes documents to detect duplicates
- Incremental updates – Only re-indexes changed documents
- Cleanup modes – incremental, full, or scoped_full to remove stale documents
from langchain_core.indexing import index
result = index(
    docs_source=loader,
    record_manager=manager,
    vector_store=store,
    cleanup="incremental",
    source_id_key="source",
)
The indexing result reports num_added, num_updated, num_deleted, and num_skipped documents.
Partner Integrations & Model Providers
Relevant Files
- libs/partners/openai/langchain_openai
- libs/partners/anthropic/langchain_anthropic
- libs/partners/ollama/langchain_ollama
- libs/partners/groq/langchain_groq
- libs/model-profiles/langchain_model_profiles
- libs/core/langchain_core/language_models
Partner integrations provide LangChain with access to third-party model providers and services. Each integration is independently versioned and maintained as a separate package in libs/partners/, allowing teams to use only the providers they need.
Architecture Overview
Core Components
Chat Models inherit from BaseChatModel in langchain-core. Each integration implements:
- invoke() – Synchronous message processing
- stream() – Token-by-token streaming
- batch() – Parallel request handling
- bind_tools() – Tool/function calling support
- with_structured_output() – Structured response generation
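A sketch of the last two methods with ChatOpenAI; the model name and schema are illustrative, and any provider with tool-calling support behaves similarly:
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class Answer(BaseModel):
    summary: str
    confidence: float

model = ChatOpenAI(model="gpt-4o-mini")

# Structured output: the reply is parsed into an Answer instance.
structured = model.with_structured_output(Answer)

def get_time(timezone: str) -> str:
    """Return the current time in the given timezone."""
    return "12:00"  # stub for illustration

# Tool calling: responses may include tool_calls targeting the bound function.
with_tools = model.bind_tools([get_time])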
Embeddings implement the Embeddings interface with:
- embed_documents() – Embed multiple texts
- embed_query() – Embed a single query
- Async variants (aembed_documents, aembed_query)
Major Integrations
OpenAI (langchain-openai): ChatOpenAI, OpenAIEmbeddings, Azure variants. Supports vision, tool calling, structured outputs, and reasoning models.
Anthropic (langchain-anthropic): ChatAnthropic with Claude models. Features include prompt caching, file search, and specialized middleware for tools and memory.
Ollama (langchain-ollama): Local model support via Ollama service. ChatOllama, OllamaLLM, OllamaEmbeddings with model validation on init.
Groq (langchain-groq): ChatGroq for fast inference. Optimized for low-latency completions with standard LangChain interfaces.
Model Profiles System
Model profiles expose programmatic access to model capabilities via the .profile field on chat models. Profiles include:
- Context window sizes (max_input_tokens, max_output_tokens)
- Modality support (image, audio, video, PDF inputs/outputs)
- Feature flags (tool calling, structured output, reasoning)
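A sketch of reading a profile at runtime, assuming langchain-anthropic is installed; the model name is illustrative and the field names follow the list above:
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5")
profile = model.profile       # capability data for this model
profile["max_input_tokens"]   # context window size, per the fields above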
The langchain-model-profiles CLI tool refreshes profile data from models.dev:
langchain-profiles refresh --provider anthropic --data-dir ./langchain_anthropic/data
Profiles are stored in _profiles.py and can be augmented via profile_augmentations.toml for provider-specific overrides.
Integration Pattern
Each partner package follows a consistent structure:
libs/partners/{provider}/
├── langchain_{provider}/
│ ├── __init__.py # Public exports
│ ├── chat_models.py # Chat model implementation
│ ├── embeddings.py # Embedding implementation
│ └── data/
│ ├── _profiles.py # Generated model profiles
│ └── profile_augmentations.toml
├── tests/
│ ├── unit_tests/
│ └── integration_tests/
└── pyproject.toml
All integrations use uv for dependency management and follow LangChain's code quality standards: full type hints, comprehensive docstrings, and standardized test coverage.
Adding New Integrations
New partner integrations should:
- Inherit from BaseChatModel or Embeddings base classes
- Implement required abstract methods with full type hints
- Include model profiles via profile_augmentations.toml
- Add unit and integration tests in tests/
- Export public classes in __init__.py
See libs/cli/integration_template/ for a starter template.
Observability, Callbacks & Testing
Relevant Files
- libs/core/langchain_core/callbacks/base.py
- libs/core/langchain_core/callbacks/manager.py
- libs/core/langchain_core/tracers/__init__.py
- libs/standard-tests/langchain_tests/base.py
- libs/cli/langchain_cli/cli.py
Callback System Architecture
LangChain's callback system enables observability by allowing handlers to listen to lifecycle events across LLMs, chains, tools, and retrievers. The system uses a mixin-based design where different component types expose specific callback hooks.
Core Components:
- BaseCallbackHandler – Base class combining all mixin interfaces (LLMManagerMixin, ChainManagerMixin, ToolManagerMixin, RetrieverManagerMixin, RunManagerMixin)
- AsyncCallbackHandler – Async variant supporting coroutine-based event handlers
- CallbackManager – Orchestrates handler invocation and manages run hierarchies
- Run Managers – Specialized managers for LLM, Chain, Tool, and Retriever runs
Event Lifecycle & Run Hierarchy
Every component execution creates a run with a unique UUID. Runs form a tree structure via parent_run_id, enabling hierarchical tracing. The callback system fires events at key points:
# LLM lifecycle
on_llm_start(serialized, prompts, run_id, parent_run_id, tags, metadata)
on_llm_new_token(token, chunk, run_id) # Streaming only
on_llm_end(response, run_id)
on_llm_error(error, run_id)
# Chain lifecycle
on_chain_start(serialized, inputs, run_id, parent_run_id, tags, metadata)
on_chain_end(outputs, run_id)
on_chain_error(error, run_id)
# Tool & Retriever follow similar patterns
Tags and metadata propagate through the hierarchy, allowing filtering and enrichment at any level.
Tracers & LangSmith Integration
Tracers are specialized callback handlers that persist run data. Key implementations:
- LangChainTracer – Sends runs to LangSmith for visualization and debugging
- ConsoleCallbackHandler – Prints run events to stdout
- LogStreamCallbackHandler – Streams run logs for real-time monitoring
- EvaluatorCallbackHandler – Evaluates run outputs against criteria
Enable LangSmith tracing via environment variables:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_key
LANGCHAIN_PROJECT=project_name
Standard Test Framework
The langchain_tests package provides standardized test suites for integrations. Tests are organized into unit and integration categories:
# Integration test example
from langchain_tests.integration_tests import ChatModelIntegrationTests
class TestMyChatModel(ChatModelIntegrationTests):
    @property
    def chat_model_class(self):
        return MyChatModel

    @property
    def chat_model_params(self):
        return {"model": "my-model", "temperature": 0}
Key Features:
- BaseStandardTests – Enforces test consistency; prevents test deletion or override without @pytest.mark.xfail
- Parametrized Tests – Cover happy paths, edge cases, and error conditions
- Async Support – Tests both sync and async implementations
- Fixture-Based – Uses pytest fixtures for model instantiation and configuration
Custom Callbacks & Event Dispatch
Create custom handlers by subclassing BaseCallbackHandler or AsyncCallbackHandler:
from langchain_core.callbacks import BaseCallbackHandler
class MyHandler(BaseCallbackHandler):
    def on_llm_end(self, response, *, run_id, **kwargs):
        print(f"LLM finished: {response}")
Pass handlers via the callbacks config parameter:
chain.invoke(input, config={"callbacks": [MyHandler()]})
Custom Events – Dispatch application-specific events:
from langchain_core.callbacks import dispatch_custom_event
dispatch_custom_event("my_event", {"data": "value"}, run_id=run_id)
Error Handling & Async Execution
Callbacks execute in a thread pool by default to avoid blocking the event loop. Set run_inline=True for synchronous execution. The raise_error flag controls whether callback exceptions propagate.
class StrictHandler(BaseCallbackHandler):
    raise_error = True  # Propagate exceptions
    run_inline = True   # Execute synchronously
The handle_event function gracefully handles NotImplementedError (e.g., falling back from on_chat_model_start to on_llm_start) and logs warnings for other exceptions.