Overview
Relevant Files
- README.md
- hphp/hack/README.md
- hphp/hhvm/main.cpp
- hphp/doc/hackers-guide/README.md
HHVM (HipHop Virtual Machine) is an open-source virtual machine designed to execute programs written in Hack, a statically-typed language that interoperates seamlessly with PHP. The system uses just-in-time (JIT) compilation to achieve superior performance while maintaining the development flexibility of dynamic languages.
What is Hack?
Hack is a modern programming language that combines the fast development cycle of PHP with static typing and features found in contemporary languages. The Hack typechecker provides instantaneous type checking via a local server that watches the filesystem, typically completing in under 200 milliseconds. This enables developers to integrate static analysis into their workflow without noticeable delays.
System Architecture
HHVM operates as a multi-stage compilation pipeline:
The pipeline consists of:
- Frontend (OCaml): Lexer, parser, and bytecode emitter convert source code to HHBC bytecode
- HHBBC Optimizer: Performs whole-program analysis and optimization on bytecode
- JIT Compiler (C++): Translates HHBC to machine code through HHIR and VASM intermediate representations
- Runtime (C++): Executes bytecode or compiled machine code with sophisticated memory management
Key Components
- Hack Typechecker: Static analysis engine providing fast, incremental type checking
- Bytecode Compiler (hphpc): Translates source code to HHBC format
- HHBBC: Whole-program bytecode optimizer using type inference and data flow analysis
- JIT Compiler: Translates hot code paths to optimized machine code with runtime type specialization
- Runtime VM: Executes bytecode with garbage collection, exception handling, and built-in functions
- Hack Standard Library (HSL): Modern standard library with type-safe APIs
- Server Infrastructure: Built-in Proxygen web server or FastCGI support for web hosting
Execution Modes
HHVM supports multiple execution modes:
- Standalone scripts: Run Hack/PHP files directly with hhvm script.hack
- Web server: Host applications via built-in Proxygen or FastCGI with nginx/Apache
- Ahead-of-time compilation: Pre-compile to bytecode repo for deployment
- JIT-enabled execution: Dynamically compile hot code paths to machine code
Development Workflow
The codebase is organized into distinct layers enabling clear separation of concerns. The compiler frontend handles parsing and type checking, the bytecode layer provides a portable intermediate format, the optimizer performs whole-program analysis, and the JIT compiler applies sophisticated runtime optimizations. This architecture enables HHVM to deliver both the flexibility of dynamic languages and the performance of statically-compiled systems.
Architecture & Compilation Pipeline
Relevant Files
- hphp/compiler/compiler.h
- hphp/hhbbc/README
- hphp/doc/hackers-guide/jit-core.md
- hphp/runtime/vm/jit/translate-region.h
- hphp/runtime/vm/jit/irlower.cpp
- hphp/doc/bytecode.specification
HHVM's architecture is organized into distinct compilation and execution layers, each optimized for specific tasks. The system transforms source code through multiple intermediate representations before reaching machine code execution.
Compilation Pipeline Overview
The compilation pipeline consists of four major stages:
- Frontend (Hack Compiler): Parses Hack/PHP source code and emits HHBC bytecode
- HHBBC Optimizer: Performs whole-program bytecode analysis and optimization
- JIT Compiler: Translates hot code paths to machine code via HHIR and VASM
- Runtime Execution: Executes bytecode or compiled machine code
HHBC: Bytecode Intermediate Format
HHBC (HipHop Bytecode) is a stack-based intermediate representation designed for both interpretation and JIT compilation. Each source file compiles to a separate unit containing bytecode instructions and metadata. The bytecode specification defines over 200 opcodes organized into categories: basic operations, literals, arithmetic, control flow, member operations, and function calls.
Key characteristics:
- Stack-based execution model with explicit stack operations
- Metadata tables for functions, classes, and exception handlers
- Bytecode offsets for precise source location tracking
- Support for dynamic typing with runtime type information
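To make the stack-based model concrete, here is a minimal sketch of a stack interpreter in Python. The opcode names (Int, Add, Mul, RetC) are borrowed loosely from HHBC for readability. The tuple encoding and the run function are assumptions made for illustration and do not reflect HHVM's actual instruction encoding or interpreter loop.

```python
def run(program):
    """Execute a toy stack-based program and return the value popped by RetC."""
    stack = []
    for op, *args in program:
        if op == "Int":          # push an integer literal
            stack.append(args[0])
        elif op == "Add":        # pop two operands, push their sum
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "Mul":        # pop two operands, push their product
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        elif op == "RetC":       # return the value on top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("program ended without RetC")


# Toy encoding of (2 + 3) * 4
program = [("Int", 2), ("Int", 3), ("Add",), ("Int", 4), ("Mul",), ("RetC",)]
result = run(program)
```

Every instruction communicates only through the stack, which is what lets both an interpreter and a JIT consume the same instruction stream.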
HHBBC: Whole-Program Optimizer
HHBBC performs sophisticated analysis on complete programs before JIT compilation. It uses a fixed-point iteration algorithm that refines type information across multiple passes until convergence.
The optimizer:
- Builds an Index structure containing resolved function and class information
- Performs type inference via abstract interpretation on control flow graphs
- Tracks only-growing and only-shrinking type sets to enable safe optimizations
- Runs analysis passes in parallel on work units (functions/classes)
- Applies optimizations like constant propagation, dead code elimination, and function inlining
JIT Compilation: Region Selection to Machine Code
The JIT compiler translates selected bytecode regions to machine code through multiple lowering stages:
Region Selection: Identifies which bytecode to compile. The tracelet selector chooses regions based on current VM state and type information. Profile-guided optimization (PGO) enables selecting larger regions after profiling runs.
HHIR Generation: Bytecode regions are lowered to HHIR, an SSA-form intermediate representation with strong typing. Each HHIR value has a precise type (e.g., Int, Obj<=Class, Str). The irgen module contains emitters for each bytecode instruction, producing HHIR instructions that the IRBuilder optimizes during construction.
VASM Lowering: HHIR is lowered to VASM (virtual assembly), an architecture-neutral instruction set. This stage performs register allocation, instruction selection, and architecture-specific optimizations. Separate lowering passes handle x86-64 and ARM64 specifics.
Code Emission: VASM is finally emitted to native machine code, with relocation and metadata generation for debugging and profiling.
Type System
HHIR's type system represents sets of values with precise distinctions:
- Primitive types: Int, Bool, Dbl, Str, Arr, Obj, Cls, Func
- Specialized types: Obj<=Class (object of specific class or subclass), Arr(kind) (array of specific kind)
- Constant types: Int<5>, Str("hello")
- Reference-counted variants: CountedStr, PersistentStr, BoxedT, PtrToT
Type comparisons use set semantics: S <= T means S is a subtype of T. This enables precise type-based optimizations while maintaining soundness.
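One common way to get set semantics cheaply is to encode each primitive type as one bit, so a type is a bitmask and the subtype test is a subset test. The sketch below assumes that encoding. The bit assignments and the NUM and CELL unions are illustrative names, not HHIR's actual definitions.

```python
# Each primitive type occupies one bit; a type is a set of primitives.
INT, BOOL, DBL, STR = 1, 2, 4, 8
BOTTOM = 0                        # the empty set: no value has this type
NUM = INT | DBL                   # illustrative union type
CELL = INT | BOOL | DBL | STR     # illustrative "any value" type


def subtype_of(s, t):
    """S <= T holds when every primitive in S is also in T."""
    return (s & t) == s


def union(s, t):
    return s | t


def intersect(s, t):
    return s & t
```

With this encoding, a guard on NUM followed by a check for INT can be reasoned about by plain bit arithmetic: intersect(NUM, STR) is BOTTOM, so a string check after the guard can never succeed.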
Execution Modes
HHVM supports multiple execution strategies:
- Bytecode Interpretation: Direct HHBC execution for rarely-used code
- JIT Compilation: Translates hot code to optimized machine code
- Repo Authoritative Mode: Pre-compiles entire programs for deployment
Hack Typechecker & Static Analysis
Relevant Files
- hphp/hack/src/hh_server.ml - Main typechecker daemon
- hphp/hack/src/hh_client.ml - Client interface to typechecker
- hphp/hack/src/hh_single_type_check.ml - Single-file type checking
- hphp/hack/src/parser/ - Full-fidelity parser (Rust & OCaml)
- hphp/hack/src/naming/ - Naming phase and NAST
- hphp/hack/src/decl/ - Declaration phase
- hphp/hack/src/typing/ - Type checking and TAST
- hphp/hack/hhi/ - Hack header interface files
The Hack typechecker is a sophisticated static analysis system that enforces Hack's type system with sub-200ms latency. It operates as a daemon (hh_server) that watches the filesystem and performs incremental type checking, making it practical for real-time IDE integration.
Architecture Overview
The typechecker pipeline consists of four main phases:
- Parsing: Full-fidelity parser (written in Rust) converts source code to an Abstract Syntax Tree (AST)
- Naming: Elaboration phase resolves all names and produces a Named AST (NAST)
- Declaration: Extracts type signatures for classes, functions, and constants
- Typing: Performs type inference and checking, producing a Typed AST (TAST)
Key Components
hh_server is the daemon process that maintains an in-memory representation of the codebase. It uses Watchman to monitor file changes and performs incremental rechecking of affected files. The server maintains a dependency graph to determine which files need rechecking when a file changes.
hh_client is the user-facing interface. It communicates with hh_server via socket, sending commands like check, hover, find-refs, and autocomplete. For batch operations, hh_single_type_check performs standalone type checking without a daemon.
Parser is a full-fidelity recursive descent parser that preserves all source information including whitespace and comments. It's implemented in Rust for performance and can operate in declaration mode (fast, extracts only signatures) or full mode (complete AST).
Naming resolves all identifiers to their definitions. It handles namespace resolution, validates that all referenced symbols exist, and produces the NAST which is the input to type checking.
Declaration Phase extracts type information from class and function definitions without analyzing their bodies. This allows the typechecker to understand the public API of all files before type checking any function bodies.
Typing is the core type checking engine. It performs bidirectional type inference, constraint solving, and error reporting. The TAST (Typed AST) annotates every expression with its inferred type.
Data Structures
- AST: Raw syntax tree from parser, preserves all source structure
- NAST: Named AST with resolved identifiers, input to type checking
- TAST: Typed AST with inferred types on every expression, enables IDE features
- Shallow Decls: Fast type signatures extracted during declaration phase
- Folded Decls: Complete class information including inherited members
Incremental Checking
The typechecker uses a dependency graph to track which files depend on which declarations. When a file changes, only files that transitively depend on its declarations are rechecked. This enables fast incremental updates even in large codebases.
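The recheck set described above is a transitive closure over reverse dependencies. The sketch below is a simplification: it tracks dependencies between whole files, whereas the text describes dependencies on declarations, and the file names and the files_to_recheck helper are hypothetical.

```python
from collections import deque


def files_to_recheck(dependents, changed):
    """Return the changed files plus everything that transitively depends on them.

    `dependents` maps a file to the files that directly depend on it.
    """
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical project: controller uses model and util, model uses base.
dependents = {
    "base.hack": ["model.hack"],
    "model.hack": ["controller.hack"],
    "util.hack": ["controller.hack"],
    "controller.hack": [],
    "unrelated.hack": [],
}
affected = files_to_recheck(dependents, ["base.hack"])
```

Editing unrelated.hack in this example rechecks only that one file, which is why the cost of an edit tracks the size of its dependent set and not the size of the codebase.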
Configuration
Type checking behavior is controlled via .hhconfig files in the project root. Options include strict mode enforcement, experimental features, and various type system strictness levels. The typechecker supports multiple modes: strict (most restrictive), partial (mixed typed/untyped), and decl (declaration-only).
Bytecode Compiler (hphpc)
Relevant Files
- hphp/compiler/compiler.cpp
- hphp/compiler/package.h
- hphp/compiler/option.h
- hphp/hhbbc/hhbbc.h
- hphp/runtime/vm/unit-emitter.h
- hphp/runtime/vm/hackc-translator.cpp
The bytecode compiler (hphpc) is the core component that transforms PHP/Hack source code into HHBC (HipHop Bytecode), an intermediate representation that HHVM executes. It orchestrates parsing, code generation, optimization, and serialization.
Compilation Pipeline
The compilation process follows these stages:
- File Discovery & Packaging - The Package class scans directories and collects source files based on include patterns, static files, and exclusion rules. It manages file metadata and symbol references for on-demand parsing.
- Parsing & Bytecode Generation - HackC (the Rust-based Hack compiler) parses source code and generates HHBC. The hackc_compile function invokes HackC, which produces a hhbc::Unit containing functions, classes, constants, and type definitions.
- Translation to UnitEmitter - hackc-translator.cpp converts HackC's internal representation into UnitEmitter objects. UnitEmitter is the pre-runtime representation that holds bytecode, metadata, and symbol information before runtime instantiation.
- Whole-Program Optimization (HHBBC) - If enabled, the HHBBC optimizer performs whole-program analysis and optimization on all UnitEmitter objects. It analyzes type information, eliminates dead code, and optimizes bytecode across unit boundaries.
- Emission & Serialization - Optimized UnitEmitter objects are either serialized to a bytecode repository or used to create runtime Unit objects.
Key Components
UnitEmitter - Represents a single compilation unit (typically one PHP file). It contains:
- Bytecode for functions and methods
- Class and type definitions
- Constants and literals
- Symbol references and dependencies
- SHA1 hashes for source and bytecode
Package - Manages file discovery and compilation configuration:
- Scans directories for source files
- Applies include/exclude patterns
- Tracks symbol references for parse-on-demand
- Coordinates with extern_worker for distributed parsing
HHBBC (HipHop Bytecode to Bytecode Compiler) - Performs whole-program optimization:
- Analyzes type information across all units
- Eliminates unreachable code
- Optimizes function calls and property access
- Refines type inference based on global analysis
Compilation Modes
The compiler supports different output modes controlled by Option flags:
- GenerateBinaryHHBC - Produces a serialized bytecode repository
- GenerateTextHHBC - Outputs human-readable bytecode dumps
- GenerateHhasHHBC - Generates HHAS (HipHop Assembly) text format
- NoOutputHHBC - Performs compilation without output (for validation)
Distributed Compilation
For large codebases, hphpc uses extern_worker for distributed parsing and indexing:
- Files are grouped and sent to worker processes
- Each worker parses independently and returns UnitEmitter objects
- Results are aggregated and passed to HHBBC for optimization
- Configurable thread count via ParserThreadCount option
HHBBC: Bytecode Optimizer
Relevant Files
- hphp/hhbbc/main.cpp
- hphp/hhbbc/analyze.cpp
- hphp/hhbbc/optimize.cpp
- hphp/hhbbc/dce.cpp
- hphp/hhbbc/index.h
- hphp/hhbbc/whole-program.cpp
HHBBC (HipHop Bytecode to Bytecode Compiler) is a whole-program bytecode optimizer that runs after the Hack compiler emits HHBC. It performs sophisticated type inference and optimization passes to improve runtime performance.
Architecture Overview
HHBBC operates in three main phases:
- Parse Phase: Converts UnitEmitters into an internal representation (php::Func, php::Class, php::Block structures)
- Analysis Phase: Performs iterative type inference and dependency tracking
- Optimization Phase: Applies bytecode transformations based on inferred types
Whole-Program Analysis
The core algorithm uses fixed-point iteration to refine type information:
- Initial Pass: Analyze all functions and classes in parallel, recording dependencies on Index queries
- Update Step: Single-threaded update of Index with newly inferred types
- Dependency Scheduling: Re-analyze any function that queried information that changed
- Repeat: Continue until Index reaches a fixed point (no new information discovered)
This approach ensures type information is never incorrect—only progressively refined. Functions that only grow types (like return types) are analyzed in context, while types that shrink (like property types) are stored in the Index.
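The analyze/update/reschedule loop can be sketched as a small worklist algorithm. This is a toy model under stated assumptions, not HHBBC's implementation: a function body is reduced to a list of return expressions, types are plain sets of primitive names, every return type starts at a conservative TOP, and the names analyze, whole_program and readers are hypothetical.

```python
TOP = frozenset({"Int", "Str", "Bool", "Dbl"})  # conservative starting point


def analyze(body, index):
    """Return (inferred return type, functions whose Index entry was queried)."""
    result, queried = frozenset(), set()
    for kind, arg in body:
        if kind == "lit":          # returns a literal of a known type
            result |= {arg}
        else:                      # "call": returns whatever the callee returns
            queried.add(arg)
            result |= index[arg]
    return result, queried


def whole_program(funcs):
    index = {name: TOP for name in funcs}
    readers = {name: set() for name in funcs}  # callee -> functions that queried it
    work = set(funcs)
    while work:
        # Analysis step: only reads the index, so it could run in parallel.
        results = {name: analyze(funcs[name], index) for name in sorted(work)}
        # Record dependencies before applying any update.
        for name, (_, queried) in results.items():
            for callee in queried:
                readers[callee].add(name)
        # Update step: sequential; reschedule anyone who read a changed entry.
        work = set()
        for name, (ty, _) in results.items():
            if ty != index[name]:
                assert ty < index[name]  # entries only ever get more precise
                index[name] = ty
                work |= readers[name]
    return index


funcs = {
    "one": [("lit", "Int")],
    "two": [("call", "one")],
    "either": [("call", "two"), ("lit", "Str")],
    "loop": [("call", "loop"), ("lit", "Int")],  # self-recursive
}
index = whole_program(funcs)
```

Because every entry starts conservative and only shrinks, stopping at any round still leaves a sound (if imprecise) index. In this sketch the self-recursive function never improves on TOP, which is the price of starting from the conservative answer.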
Optimization Passes
After reaching a fixed point, the optimizer applies per-function transformations:
Iterator Optimization: Converts normal iterators to local iterators (liters) when the iterator base is stored in a local that isn't modified across the iteration loop.
Local DCE (Dead Code Elimination): Within each block, removes instructions whose results aren't used, assuming all variables are live at block exit.
Global DCE: Across all blocks, removes dead code using liveness analysis. Can change local types, triggering re-analysis.
Control Flow Optimization: Simplifies the CFG by merging blocks, removing unreachable code, and converting conditional jumps to unconditional ones when branches are identical.
Constant Propagation: Replaces instructions with constant values when types are narrowed to specific constants.
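As an illustration of the local DCE pass, the sketch below scans one block backwards while maintaining a live set that starts with every local, matching the assumption that all variables are live at block exit. It works on a hypothetical three-address form (dest, op, sources) and assumes instructions have no side effects. HHBBC itself operates on stack-based bytecode, so this shows the technique only.

```python
def local_dce(block, all_locals):
    """Drop instructions whose destination is overwritten before any use."""
    live = set(all_locals)        # everything is assumed live at block exit
    kept = []
    for dest, op, srcs in reversed(block):
        if dest not in live:
            continue              # value is never read: the instruction is dead
        live.discard(dest)        # this definition satisfies later uses
        live.update(srcs)         # its operands must be live before it
        kept.append((dest, op, srcs))
    kept.reverse()
    return kept


block = [
    ("a", "const", []),           # dead: 'a' is redefined below before any use
    ("b", "const", []),
    ("a", "add", ["b", "b"]),
    ("c", "mul", ["a", "b"]),
]
optimized = local_dce(block, all_locals={"a", "b", "c"})
```

A global pass can do better than this because it replaces the "everything is live at exit" assumption with liveness computed across blocks.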
Type System
The type inference engine uses a forward dataflow analysis on an abstract interpreter:
- Tracks types of locals and eval stack values through each block
- Handles exceptional control flow by propagating pre-instruction state to throw edges
- Supports constant values in types, enabling constant propagation
- Distinguishes reference-counted vs. non-reference-counted values (important for copy-on-write optimization)
Key Data Structures
Index: Central repository of whole-program information. Stores inferred return types, class constants, property types, and dependency metadata. Thread-safe for concurrent reads during analysis.
FuncAnalysis: Per-function analysis result containing inferred types for each block's entry state, bytecode updates, and dependency information.
BlockData: Per-block state tracking, including reverse-post-order numbering and input/output type states.
Parallelization
Analysis passes run in parallel across all functions and classes—no thread synchronization needed since analysis only reads the immutable php representation and queries the thread-safe Index. The update step is single-threaded to safely modify the Index without locks.
JIT Compiler & Code Generation
Relevant Files
- hphp/runtime/vm/jit/mcgen-translate.cpp
- hphp/runtime/vm/jit/vasm-internal-inl.h
- hphp/runtime/vm/jit/region-tracelet.cpp
- hphp/runtime/vm/jit/vasm-x64.cpp
- hphp/runtime/vm/jit/vasm-xls.cpp
The JIT compiler transforms selected bytecode regions into optimized machine code through a multi-stage pipeline. This process balances compilation speed with runtime performance through careful region selection, intermediate representation optimization, and architecture-specific code generation.
Region Selection & Tracelet Formation
The tracelet selector identifies which bytecode sequences to compile. It uses selectTracelet() to form regions based on current VM state, live type information, and execution frequency. Regions are bounded by configuration limits (MaxRegionInstrs, MaxLiveRegionInstrs) to control compilation time. The selector performs eager or lazy type guarding depending on the number of live locations, always eagerly guarding MBase (the member base used by member operations) to catch type mismatches early.
HHIR Generation Pipeline
Bytecode regions are lowered to HHIR (HipHop Intermediate Representation), a strongly-typed SSA-form IR. The irgen module contains emitters for each bytecode instruction, producing HHIR instructions that the IRBuilder optimizes during construction. Key structures include:
- IRUnit: Container for all IR blocks and instructions for a compilation unit
- IRGS: IR generation state tracking the current block, stack state, and type information
- IRBuilder: Constructs IR incrementally, performing parse-time optimizations and type analysis
The IR represents all operations with precise types (e.g., Int, Obj<=Class, Str), enabling type-safe optimizations and eliminating runtime type checks where possible.
Vasm: Virtual Assembly
HHIR is lowered to Vasm, a virtual assembly layer abstracting architecture differences. Vasm uses virtual registers (Vreg) and architecture-neutral instructions (Vinstr). A Vunit contains blocks of Vasm instructions organized into three code areas: main, cold, and frozen. This separation enables profile-guided code layout optimization.
Register Allocation with XLS
The Extended Linear Scan (XLS) algorithm allocates physical registers to virtual registers. The process:
- Liveness Analysis: Computes which values are live at each instruction
- Interval Building: Creates lifetime intervals for each Vreg with use positions
- Register Assignment: Greedily assigns physical registers, splitting intervals when necessary
- Spill Resolution: Inserts spill/reload instructions for values exceeding available registers
XLS handles register constraints, hints from copy instructions, and SIMD values. Spill slots are allocated in multiples of 16 bytes to maintain alignment.
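For orientation, the sketch below implements the classic linear scan algorithm that XLS extends. It is deliberately simpler than what the text describes: when registers run out it spills a whole interval and does not split intervals, and it ignores register constraints and hints. The interval data and function name are hypothetical.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to live intervals given as name -> (start, end), inclusive.

    Returns (reg_of, spilled). Assumes num_regs >= 1.
    """
    if num_regs < 1:
        raise ValueError("need at least one register")
    free = list(range(num_regs))
    active = []                   # (end, name) pairs, kept sorted by end point
    reg_of, spilled = {}, set()
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg_of[old])
        if free:
            reg_of[name] = free.pop()
            active.append((end, name))
            active.sort()
            continue
        # No free register: spill whichever interval ends last.
        last_end, victim = active[-1]
        if last_end > end:
            reg_of[name] = reg_of.pop(victim)   # take over the victim's register
            spilled.add(victim)
            active[-1] = (end, name)
            active.sort()
        else:
            spilled.add(name)
    return reg_of, spilled


intervals = {"a": (0, 10), "b": (1, 3), "c": (2, 8), "d": (4, 6)}
reg_of, spilled = linear_scan(intervals, num_regs=2)
```

Spilling the interval that ends last frees a register for the longest time. Splitting, as the text describes, refines this by keeping the value in a register for part of its lifetime.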
Architecture-Specific Lowering & Emission
For x64, lowerForX64() transforms abstract Vasm instructions into concrete x64 operations. The Vgen template emits actual machine code, handling calling conventions, memory addressing modes, and instruction selection. Block layout is optimized using profile-guided ordering (pgoLayout) for Optimize translations, or RPO ordering otherwise.
Optimization Phases
The pipeline includes multiple optimization passes: IR-level optimizations during generation, Vasm-level copy optimization and dead code elimination, and post-register-allocation peephole optimization. Profile-guided retranslation (retranslateAll) recompiles hot functions with larger regions and aggressive optimizations after profiling data is collected.
Runtime & Virtual Machine
Relevant Files
- hphp/runtime/vm/bytecode.h
- hphp/runtime/vm/class.h
- hphp/runtime/vm/func.h
- hphp/runtime/base/execution-context.h
- hphp/runtime/vm/act-rec.h
- hphp/runtime/vm/unit.h
The HHVM runtime executes Hack/PHP code through a sophisticated virtual machine that combines bytecode interpretation with JIT compilation. The system is organized around three core concepts: Units (compilation units), Functions (Func), and Classes, all coordinated by the ExecutionContext.
Bytecode Execution Model
HHVM uses HHBC (HipHop Bytecode), a stack-based instruction set. The interpreter processes bytecode sequentially, with each opcode manipulating a value stack. The VmStack class manages this stack, providing methods like push(), pop(), and allocTV() for TypedValue manipulation. Bytecode execution can be triggered via enterVMAtCurPC(), which transitions control to the JIT compiler when available, or falls back to the interpreter.
Activation Records (ActRec)
Function calls are managed through ActRec structures, which represent call frames on the VM stack. Each ActRec contains:
- m_sfp: Previous frame pointer (for RBP chaining)
- m_savedRip: Return address (native code pointer)
- m_funcId: Identifier of the executing function
- m_callOffAndFlags: Bytecode offset and flags (LocalsDecRefd, IsInlined, AsyncEagerRet)
- m_thisUnsafe/m_clsUnsafe: Instance or late-bound class context
The ActRec lifecycle has three states: pre-live (during FCall setup), live (executing), and post-live (after cleanup).
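The saved frame pointer is what makes the call stack walkable: each frame links to its caller, so a backtrace is a linked-list traversal. The sketch below models only that link. The Frame class, its field names, and the function names are hypothetical stand-ins for m_sfp and m_funcId, not the real ActRec layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    func_name: str               # stands in for the function identifier
    sfp: Optional["Frame"]       # caller's frame, or None for the outermost frame


def backtrace(fp):
    """Walk the saved-frame-pointer chain from the current frame outward."""
    names = []
    while fp is not None:
        names.append(fp.func_name)
        fp = fp.sfp
    return names


main_frame = Frame("main", None)
handler_frame = Frame("handleRequest", main_frame)
helper_frame = Frame("formatOutput", handler_frame)
```

Calling backtrace(helper_frame) visits the innermost frame first and ends at the frame whose saved pointer is empty.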
Units, Functions, and Classes
A Unit represents a compiled PHP file or eval block, containing:
- m_funcs: Vector of global functions
- m_preClasses: Vector of class definitions
- m_litstrs, m_arrays: Literal string and array pools
- Metadata for source locations and type information
Func objects describe function signatures, including parameter types, return types, exception handlers, and native function pointers. Class objects define class structure, methods, properties, and inheritance relationships.
ExecutionContext and VMState
The ExecutionContext (per-request) maintains the current VMState:
- pc: Program counter (bytecode offset)
- fp: Frame pointer (current ActRec)
- sp: Stack pointer
- jitReturnAddr: Return address from JIT code
This state is thread-local and synchronized between the interpreter and JIT compiler.
JIT Integration
The JIT compiler translates hot bytecode paths to native machine code, and the runtime enters that code through jit::enterTC(). When the interpreter encounters a translation request, it calls handleTranslate() to compile the bytecode at the current offset. The JIT maintains a translation cache (TC) and uses service requests to handle runtime events like cache misses or type guard failures.
Function Calls
When executing an FCall opcode, doFCall() performs:
- Argument arity and type checking
- Generics validation
- Coeffect verification
- ActRec initialization with function metadata
- Local variable initialization
The function prologue then executes, either as JIT-compiled code or interpreted bytecode.
Server Infrastructure & Web Hosting
Relevant Files
- hphp/runtime/server/http-server.h
- hphp/runtime/server/server.h
- hphp/runtime/server/http-request-handler.h
- hphp/runtime/server/proxygen/proxygen-server.h
- hphp/runtime/server/fastcgi/fastcgi-server.h
- hphp/runtime/server/transport.h
HHVM's server infrastructure provides a pluggable, high-performance HTTP hosting layer that supports multiple server backends and protocols. The architecture separates concerns between protocol handling, request processing, and application execution.
Core Architecture
The server infrastructure is built on a factory pattern that allows different server implementations to be registered and instantiated dynamically. The ServerFactoryRegistry maintains a mapping of server type names to factory objects, enabling new server types to be plugged in without modifying core code.
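The registry idea can be sketched in a few lines: a map from a type name to a callable that builds a server. The method names register and create, the toy server classes, and the port option are all hypothetical. The source only establishes that a name-to-factory mapping exists.

```python
class ServerFactoryRegistry:
    """Maps server type names to factories (illustrative, not HHVM's API)."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def create(self, name, **options):
        try:
            factory = self._factories[name]
        except KeyError:
            raise ValueError(f"no factory registered for {name!r}") from None
        return factory(**options)


class ToyProxygenServer:
    def __init__(self, port):
        self.port = port


class ToyFastCGIServer:
    def __init__(self, port):
        self.port = port


registry = ServerFactoryRegistry()
registry.register("proxygen", ToyProxygenServer)
registry.register("fastcgi", ToyFastCGIServer)
server = registry.create("fastcgi", port=9000)
```

Adding a new backend means one more register call, and the code that asks for a server by name does not change.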
Request Handling Pipeline
The request handling flow follows a consistent pattern across all server implementations:
- Connection Acceptance: A server backend (Proxygen or FastCGI) accepts incoming connections
- Transport Creation: Each connection gets a Transport object that abstracts protocol details
- Handler Instantiation: A RequestHandler is created via the registered factory
- Request Execution: The handler processes the request through setup, execution, and teardown phases
- Response Delivery: The transport sends the response back to the client
The RequestHandler base class defines the minimal interface: setupRequest(), handleRequest(), abortRequest(), and teardownRequest(). The HttpRequestHandler implementation handles PHP request execution, including URI resolution, static file serving, and proxy routing.
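The four-method interface suggests a driver of roughly the following shape. This is a guess at the call ordering for illustration only: in particular, invoking the abort hook when handling raises is an assumption, and the transport is modelled as a plain list that records events.

```python
class RequestHandler:
    """Python rendering of the four lifecycle hooks named above."""

    def setup_request(self, transport):
        transport.append("setup")

    def handle_request(self, transport):
        raise NotImplementedError

    def abort_request(self, transport):
        transport.append("abort")

    def teardown_request(self, transport):
        transport.append("teardown")


class EchoHandler(RequestHandler):
    def handle_request(self, transport):
        transport.append("handle")


class FailingHandler(RequestHandler):
    def handle_request(self, transport):
        raise RuntimeError("simulated failure")


def serve(handler, transport):
    """Drive one request: setup, then handle (or abort), then always teardown."""
    handler.setup_request(transport)
    try:
        handler.handle_request(transport)
    except Exception:
        handler.abort_request(transport)   # assumed behaviour, see lead-in
    finally:
        handler.teardown_request(transport)
    return transport


ok_log = serve(EchoHandler(), [])
failed_log = serve(FailingHandler(), [])
```

Keeping teardown in a finally clause mirrors the idea that per-request state must be released whichever path the request takes.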
Server Backends
Proxygen Server is the modern, high-performance HTTP server built on Facebook's Proxygen library. It uses:
- Multiple worker threads with event-driven I/O (libevent)
- HPHPSessionAcceptor to handle HTTP/1.1 and HTTP/2 connections
- ProxygenTransport for protocol abstraction
- Connection pooling and efficient memory management
FastCGI Server enables HHVM to run as a FastCGI application server behind a web server (nginx, Apache):
- FastCGIAcceptor listens for FastCGI protocol connections
- FastCGISession manages individual FastCGI connections
- FastCGITransport implements the FastCGI protocol
- Supports multiplexing multiple requests over a single connection
Configuration & Lifecycle
The HttpServer class orchestrates the overall server lifecycle:
- Initializes primary and optional secondary page servers
- Manages graceful shutdown with PrepareToStop() and StopOldServer()
- Tracks server statistics and shutdown events
- Handles SSL/TLS configuration and certificate reloading
Server options include thread counts, queue sizes, socket configuration, SSL settings, and request timeouts. The warmup phase can throttle requests during startup to allow JIT compilation before full load.
Virtual Hosts & Routing
The VirtualHost system allows configuration of multiple virtual hosts with different document roots, path translations, and URL routing rules. Request URI resolution determines which virtual host handles a request and translates logical paths to filesystem paths.
Static content can be served directly from disk when enabled, bypassing PHP execution. Proxy routing allows unmatched URLs to be forwarded to upstream servers, enabling hybrid deployments.
Extensions & Standard Library
Relevant Files
- hphp/runtime/ext/extension.h
- hphp/runtime/ext/extension-registry.h
- hphp/runtime/ext/extension-registry.cpp
- hphp/hsl/src
- hphp/runtime/ext/hsl
- hphp/runtime/ext/std
Extension System Architecture
HHVM's extension system provides a modular way to add native functionality to the runtime. Extensions are C++ modules that register native functions, classes, and constants with the VM. The base Extension class in extension.h defines the lifecycle and interface for all extensions.
Each extension implements virtual methods for different initialization phases: moduleLoad() (configuration), moduleInit() (startup), moduleRegisterNative() (function registration), and moduleShutdown() (cleanup). Extensions can also define thread-local and request-local initialization via threadInit() and requestInit().
The ExtensionRegistry manages all loaded extensions, handling dependency ordering, initialization sequencing, and providing lookup functions like extension_loaded() and get_loaded_extensions().
Hack Standard Library (HSL)
The Hack Standard Library is a comprehensive, well-typed standard library built into HHVM since version 4.108. It provides consistent APIs organized into namespaces for common operations.
Core HSL Namespaces:
- HH\Lib\C – Container queries (count, contains, first, reduce)
- HH\Lib\Vec, HH\Lib\Dict, HH\Lib\Keyset – Hack array transformations
- HH\Lib\Str – String manipulation with locale support
- HH\Lib\Math – Mathematical functions and constants
- HH\Lib\Async – Async utilities (Poll, Semaphore, LowPri)
- HH\Lib\IO – File and stream I/O abstractions
- HH\Lib\OS – Low-level POSIX operations (sockets, file descriptors)
- HH\Lib\Network – TCP and Unix socket abstractions
- HH\Lib\Regex – Regular expression utilities
- HH\Lib\Random – Cryptographically secure random number generation
- HH\Lib\Locale – Locale-aware string operations
HSL Implementation
HSL modules are implemented as extensions in hphp/runtime/ext/hsl/. Each namespace typically has a corresponding extension (e.g., hsl_io, hsl_os, hsl_random) that provides native implementations for performance-critical operations.
HSL code is embedded in the binary via systemlib files (.php and .hack files compiled into the executable). The hsl_systemlib extension loads the pure Hack implementations, while specialized extensions like hsl_os provide C++ bindings to system calls.
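// Example: Using HSL for array operations
use namespace HH\Lib\Vec; // makes the Vec\ prefix resolve to HH\Lib\Vec
$numbers = vec[1, 2, 3, 4, 5];
$squared = Vec\map($numbers, $x ==> $x * $x); // vec[1, 4, 9, 16, 25]
$evens = Vec\filter($squared, $x ==> $x % 2 == 0); // vec[4, 16]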
Standard Extension
The standard extension provides core PHP-compatible functions for strings, arrays, files, math, and process control. It registers hundreds of native functions through registerNativeStandard() and related methods, maintaining backward compatibility with PHP while leveraging Hack's type system.
Extension Registration & Lifecycle
Extensions are registered globally via ExtensionRegistry::registerExtension(). The registry respects dependency ordering through getDeps(), ensuring extensions initialize in the correct sequence. Native functions are registered in a function table and resolved at runtime through the native function dispatch mechanism.
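Dependency-ordered initialization is a topological sort. The sketch below uses Python's standard graphlib (3.9+) to compute one valid order. The extension names come from this page, but the dependency edges between them are invented for the example, and init_order is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def init_order(deps):
    """Return an initialization order in which dependencies come first.

    `deps` maps an extension name to the set of extensions it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


# Assumed edges, for illustration only.
deps = {
    "hsl_io": {"hsl_os"},
    "hsl_os": set(),
    "hsl_random": set(),
    "standard": set(),
}
order = init_order(deps)
```

Any order that places hsl_os before hsl_io is acceptable here, so the sketch checks relative positions and does not rely on one fixed sequence.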
Testing, Debugging & Tools
Relevant Files
- hphp/test/run.php - Main test runner orchestrating test execution
- hphp/runtime/debugger/debugger.h - Debugger core infrastructure
- hphp/tools/hhvm_wrapper.php - Development wrapper with debugging flags
- hphp/test/README.md - Test suite organization and conventions
- hphp/runtime/vm/debugger-hook.cpp - VM integration for debugging
Test Infrastructure
HHVM uses a comprehensive test suite organized into multiple suites: quick (high-quality, high-signal tests), slow (full-featured tests), and zend (PHP compatibility tests). Tests follow a file-based format where each test consists of a source file (.php, .hack, or .hhas) paired with expected output files (.expect or .expectf).
The test runner (hphp/test/run.php) orchestrates execution across multiple configurations including JIT mode, interpreter mode, and RepoAuthoritative mode. Tests can be executed with various options:
test/run test/quick # Quick tests with JIT
test/run -m interp -r test/slow # Slow tests in interpreter + repo mode
test/run --list-tests test/quick # List tests without running
Debugging Capabilities
HHVM provides multiple debugging interfaces. The hphpd debugger enables local and remote debugging with breakpoint support:
hhvm -m debug myscript.php # Local debugging
hhvm -m debug -h mymachine.com # Remote debugging
Within the debugger, you can set breakpoints, step through code, inspect variables, and evaluate expressions. The VSDebug extension provides IDE integration via the Debug Adapter Protocol, enabling debugging in VS Code and other compatible editors.
Diagnostic & Profiling Tools
The hhvm_wrapper.php script provides convenient shortcuts for common development tasks:
hhvm_wrapper.php -i test.php # Run with interpreter (JIT disabled)
hhvm_wrapper.php -v test.php # Dump bytecode (HHBC)
hhvm_wrapper.php --dump-hhas test.php # Dump HHAS (human-readable assembly)
hhvm_wrapper.php -g --compile test.php # Run under GDB with repo compilation
TRACE=hhir:2 hhvm_wrapper.php test.php # Trace IR generation
Tracing & IR Inspection
The TRACE environment variable enables detailed logging of internal operations. Key modules include printir (JIT IR output), hhir (high-level IR), and bcinterp (bytecode interpreter):
TRACE=printir:1,hhir:2 hhvm script.php # Trace multiple modules
HPHP_TRACE_FILE=/tmp/trace.log hhvm script.php # Write to file
Bytecode and IR can be dumped using runtime options:
hhvm -vEval.DumpBytecode=1 script.php # Dump HHBC
hhvm -vEval.DumpHhas=1 script.php # Dump HHAS
hhvm -vEval.DumpIR=2 script.php # Dump JIT IR
Performance Analysis
HHVM integrates with jemalloc profiling for memory analysis and perf for CPU profiling. The admin server provides profiling endpoints when enabled:
hhvm -m server -vHHProf.Enabled=true -vHHProf.Active=true
jeprof 'localhost:8088/pprof/heap' > profile.raw
Test execution supports JIT serialization for profiling-guided optimization and retranslation testing to validate optimization correctness.