Overview
Relevant Files
- README.md
- CONTRIBUTING.md
- doc/building.md
- src/hotspot
- src/java.base
- make/Main.gmk
- test/
OpenJDK is the open-source implementation of the Java Platform, Standard Edition (Java SE). This repository contains the complete source code for the Java Development Kit (JDK), including the HotSpot virtual machine, Java class libraries, development tools, and comprehensive test suites.
Project Structure
The JDK is organized into several major components:
- HotSpot Virtual Machine (src/hotspot/) - The core execution engine that interprets and compiles Java bytecode into native machine code. Includes the runtime, interpreter, JIT compiler, garbage collector, and memory management systems.
- Java Standard Library (src/java.*, src/jdk.*) - Modular Java class libraries providing standard APIs. Key modules include java.base (core classes), java.desktop (GUI), java.net.http (HTTP client), and many others.
- Build System (make/, configure, Makefile) - GNU Make and autoconf-based build infrastructure that compiles the entire project across multiple platforms (Linux, Windows, macOS).
- Testing Infrastructure (test/) - Comprehensive test suites including unit tests, integration tests, and performance benchmarks organized by component.
Getting Started
To build the JDK from source:
git clone https://git.openjdk.org/jdk
cd jdk
bash configure
make images
./build/*/images/jdk/bin/java -version
The build process requires a boot JDK, native compiler toolchain, and various build tools. The configure script will identify missing dependencies and suggest installation steps.
Key Concepts
Modules - The JDK uses Java's module system to organize code into logical units with explicit dependencies. Each module in src/ corresponds to a Java module.
Platform Support - The codebase supports multiple operating systems (Linux, Windows, macOS) and CPU architectures (x86, AArch64, and others). Platform-specific code lives in separate directories, such as src/hotspot/os and src/hotspot/cpu in HotSpot, alongside the shared code in share/.
Build Artifacts - The build process generates a complete JDK image in build/*/images/jdk/ containing the runtime, libraries, and tools needed to run Java applications.
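The module organization described above is also visible at run time. The following is a minimal sketch, assuming it runs on a JDK 9 or later; the class name BootModules is only illustrative. It lists the modules resolved into the boot layer of the running JDK image.

```java
import java.util.List;
import java.util.stream.Collectors;

public class BootModules {
    /** Names of all modules in the boot layer, sorted alphabetically. */
    static List<String> bootModuleNames() {
        return ModuleLayer.boot().modules().stream()
                .map(Module::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints one module name per line, for example java.base.
        bootModuleNames().forEach(System.out::println);
    }
}
```

Running it with the java launcher from build/*/images/jdk/bin should list the modules that were linked into that image.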
Contributing
The OpenJDK project welcomes contributions. See the OpenJDK Developers' Guide for detailed contribution guidelines, code review processes, and development workflows.
Architecture & Core Components
Relevant Files
- src/hotspot/share/runtime - VM initialization, threading, safepoints, synchronization
- src/hotspot/share/interpreter - Bytecode interpretation and execution
- src/hotspot/share/compiler - JIT compilation infrastructure (C1, C2, JVMCI)
- src/hotspot/share/memory - Memory management, metaspace, heap allocation
- src/hotspot/share/gc - Garbage collection implementations (G1, ZGC, Shenandoah, etc.)
- src/hotspot/share/oops - Object representation and metadata structures
- src/hotspot/share/classfile - Class loading, linking, and verification
HotSpot is a sophisticated multi-threaded virtual machine with layered execution strategies. The architecture centers on three core subsystems: the Runtime (thread management and VM operations), the Execution Engine (interpreter and JIT compilers), and Memory Management (heap, metaspace, and garbage collection).
Runtime & Thread Management
The runtime layer manages VM initialization, thread lifecycle, and synchronization. The VMThread is a special non-Java thread that executes VM operations one at a time, which keeps VM-level state changes consistent. JavaThreads represent user-level threads and execute Java code. The Safepoint Mechanism is critical: it brings all Java threads to a safe point where the VM can perform operations such as garbage collection or deoptimization. Each thread checks a thread-local poll at well-defined locations (for example method returns and loop back-edges) to detect that a safepoint has been requested.
Synchronization uses ObjectMonitor for Java-level locks. Modern HotSpot employs a LockStack (thread-local) for lightweight locking before inflating to full monitors. The Handshake Protocol allows efficient per-thread operations without global safepoints.
Execution Engine
The execution engine has two paths:
- Interpreter (src/hotspot/share/interpreter) - Executes bytecode directly. The template interpreter generates platform-specific code for each bytecode, while the zero interpreter is a portable C++ implementation.
- JIT Compilers - Compile hot methods to native code:
  - C1 (Client Compiler) - Fast, low-overhead compilation for quick startup
  - C2 (Server Compiler) - Aggressive optimizations (escape analysis, inlining, speculative optimization)
  - JVMCI - Java-based compiler interface enabling external compilers like GraalVM
The CompileBroker manages compilation tasks, using invocation counters and profiling data to decide when to compile.
Memory Management
HotSpot divides memory into:
- Java Heap - Managed by garbage collectors, stores all Java objects
- Metaspace - Stores class metadata such as Klass structures, method metadata, and bytecode
- Code Cache - Holds compiled methods and stubs
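The memory areas listed above can be observed from Java through the standard java.lang.management API. This is a small sketch, not part of the HotSpot sources; the class name MemoryPools is illustrative, and the exact pool names it prints depend on the VM and the selected collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class MemoryPools {
    /** Names of the memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools are managed by the garbage collector.
        System.out.println("heap:     " + poolNames(MemoryType.HEAP));
        // Non-heap pools typically include Metaspace and the code cache.
        System.out.println("non-heap: " + poolNames(MemoryType.NON_HEAP));
    }
}
```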
Class Loading (SystemDictionary) resolves and caches loaded classes. The ClassLoaderDataGraph tracks metadata for each class loader.
Garbage Collection
Multiple GC algorithms are available:
- G1GC - Generational, region-based, low-latency
- ZGC - Very low pause times via concurrent marking and concurrent relocation
- Shenandoah - Concurrent compaction
- Serial/Parallel - Traditional stop-the-world collectors
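All GCs coordinate with the runtime through safepoints and rely on GC barriers: generational collectors use write barriers to track cross-generational references, while concurrent collectors add load or marking barriers.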
Initialization Flow
VM startup follows a strict sequence: basic types initialization → code cache setup → universe initialization (heap, stubs) → interpreter setup → compiler initialization → class loading → final system initialization. This ordering ensures dependencies are satisfied.
HotSpot Virtual Machine
Relevant Files
- src/hotspot/share/runtime/thread.cpp
- src/hotspot/share/interpreter/interpreter.cpp
- src/hotspot/share/compiler/compileBroker.cpp
- src/hotspot/share/oops/instanceKlass.cpp
HotSpot is the core Java Virtual Machine that executes Java bytecode. It combines two execution paths, interpretation and JIT compilation, with a class loading pipeline that feeds them. The VM manages threads, memory, and code generation to achieve high performance.
Thread Management
HotSpot maintains multiple thread types. JavaThreads execute user Java code and can be stopped at safepoints. The VMThread is a special non-Java thread that executes VM operations one at a time. When a safepoint is requested, each Java thread notices it by checking a thread-local poll at well-defined locations, then pauses so the VM can perform operations such as garbage collection or deoptimization.
Bytecode Interpretation
The interpreter executes Java bytecode through a dispatch table mechanism. Each bytecode instruction has a corresponding handler. The interpreter maintains a bytecode pointer (BCP) and operand stack. For each instruction, it loads the next bytecode, dispatches to the appropriate handler, and updates the stack and local variables. The dispatch is highly optimized using architecture-specific assembly code that uses table-based jumps to minimize overhead.
// Bytecode dispatch pattern (simplified)
load_next_bytecode(rbx); // Load bytecode into register
advance_bcp(); // Move bytecode pointer forward
jmp dispatch_table[rbx]; // Jump to handler via table
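The dispatch pattern above can be mimicked in plain Java to make the control flow concrete. This is a toy sketch only: the four opcodes, the Handler interface, and the class name ToyInterpreter are invented for illustration and have nothing to do with real JVM bytecodes or HotSpot's generated templates.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ToyInterpreter {
    // Hypothetical opcodes for this sketch (not JVM bytecodes).
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    /** A handler executes one instruction and returns the next bcp, or -1 to stop. */
    interface Handler {
        int execute(int[] code, int bcp, Deque<Integer> stack);
    }

    // Dispatch table: one handler per opcode, indexed by the opcode value.
    static final Handler[] DISPATCH = new Handler[4];
    static {
        DISPATCH[PUSH] = (code, bcp, stack) -> { stack.push(code[bcp + 1]); return bcp + 2; };
        DISPATCH[ADD]  = (code, bcp, stack) -> { stack.push(stack.pop() + stack.pop()); return bcp + 1; };
        DISPATCH[MUL]  = (code, bcp, stack) -> { stack.push(stack.pop() * stack.pop()); return bcp + 1; };
        DISPATCH[HALT] = (code, bcp, stack) -> -1;
    }

    /** Runs the program and returns the value left on top of the operand stack. */
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int bcp = 0; // bytecode pointer
        while (bcp >= 0) {
            // Load the opcode at bcp and jump to its handler via the table.
            bcp = DISPATCH[code[bcp]].execute(code, bcp, stack);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // (2 + 3) * 4 = 20
    }
}
```

The real template interpreter avoids the loop entirely: each handler ends by dispatching directly to the next one, which is what the table-based jump in the assembly sketch expresses.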
JIT Compilation
HotSpot uses tiered compilation with two JIT compilers:
- C1 (Client Compiler): Fast compilation with basic optimizations, used for quick startup
- C2 (Server Compiler): Aggressive optimizations including escape analysis, loop unrolling, and inlining
Methods are selected for compilation based on invocation and backedge counters: when a counter exceeds its threshold, a compilation request is queued. The CompileBroker manages the compilation queues and compiler threads and coordinates the transition between interpreted and compiled code. Compiled code is stored in the code cache, from which cold or invalidated methods are eventually unloaded.
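JIT activity can be observed from Java without touching VM internals. The sketch below uses the standard CompilationMXBean; the class name JitStats and the hotLoop workload are illustrative assumptions, and the reported compile time varies from run to run.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitStats {
    /** A small loop that becomes hot when called repeatedly. */
    static long hotLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    /** Name of the JIT compiler, or "none" when the VM has no compilation system. */
    static String jitName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none" : jit.getName();
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 20_000; i++) {
            total += hotLoop(1_000); // drives invocation and backedge counters
        }
        System.out.println("result: " + total + ", JIT: " + jitName());
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("compile time (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```

Running the same program with -Xint disables the compilers, which gives a rough feel for how much work tiered compilation removes from the interpreter.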
Class Loading and Linking
The class loading pipeline has three stages:
- Parsing: ClassFileParser reads the .class file, validates the magic number and version, and parses the constant pool, fields, and methods
- Linking: Verifies bytecode correctness, resolves symbolic references, and prepares static fields
- Initialization: Executes static initializers (<clinit>) in a thread-safe manner
The SystemDictionary maintains a global registry of loaded classes. ClassLoaderData tracks classes loaded by each classloader, enabling efficient unloading when classloaders become unreachable.
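The separation between loading and initialization is observable from Java code. In this sketch (the class names InitOrder and Lazy are illustrative), Class.forName with initialize set to false loads the nested class without running its static initializer, and a second call with initialize set to true triggers it.

```java
public class InitOrder {
    static int initCount = 0;

    static class Lazy {
        // Runs once, when Lazy is initialized (not when it is merely loaded).
        static { InitOrder.initCount++; }
    }

    /** Returns the initializer count after loading only, then after initialization. */
    static int[] loadThenInitialize() {
        String name = InitOrder.class.getName() + "$Lazy";
        ClassLoader loader = InitOrder.class.getClassLoader();
        try {
            Class.forName(name, false, loader); // load, but do not initialize
            int afterLoad = initCount;
            Class.forName(name, true, loader);  // now run <clinit>
            int afterInit = initCount;
            return new int[] { afterLoad, afterInit };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = loadThenInitialize();
        System.out.println("after load: " + counts[0] + ", after init: " + counts[1]);
    }
}
```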
Code Cache Management
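The code cache stores compiled methods and stubs. With the segmented code cache it is split into separate heaps for non-method code, profiled methods, and non-profiled methods. When space runs low, cold compiled methods are identified and unloaded; in older releases a dedicated sweeper thread did this, while recent releases (JDK 20 onward) rely on the garbage collector's code cache unloading instead. Reclamation becomes more aggressive as free space drops below a threshold, which helps prevent cache exhaustion.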
Performance Optimization
HotSpot achieves high performance through:
- Adaptive optimization: Methods are profiled during interpretation, then compiled with profile-guided optimizations
- Inlining: Frequently called methods are inlined to reduce call overhead
- Escape analysis: Objects that don't escape a method can be scalar-replaced, with their fields kept in registers or stack slots, so no heap allocation is needed
- Speculative optimization: Assumptions are made and checked; if violated, code is deoptimized back to the interpreter
Garbage Collection & Memory Management
Relevant Files
- src/hotspot/share/gc/shared - Shared GC infrastructure
- src/hotspot/share/gc/g1 - G1 (Garbage First) collector
- src/hotspot/share/gc/parallel - Parallel GC collector
- src/hotspot/share/gc/serial - Serial GC collector
- src/hotspot/share/memory/metaspace - Metaspace memory management
- src/hotspot/share/gc/shared/cardTable.hpp - Write barrier tracking
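HotSpot implements multiple garbage collection algorithms optimized for different workloads. All collectors coordinate with the runtime through safepoints, and each relies on GC barriers: generational collectors use write barriers to track cross-generational references, while concurrent collectors add load or marking barriers.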
GC Algorithms
G1GC (Garbage First) divides the heap into fixed-size regions and uses concurrent marking with incremental compaction. It prioritizes collecting regions with the most garbage first, enabling predictable pause times. G1 maintains a remembered set per region to track cross-region references via card tables.
Parallel GC uses parallel stop-the-world collection with generational heap layout (young and old generations). Young collections use scavenging; full collections use parallel mark-compact. It scales well on multi-core systems through worker threads.
Serial GC is a single-threaded stop-the-world collector with generational heap organization. It uses mark-sweep-compact for full collections and is suitable for small heaps or single-core systems.
ZGC and Shenandoah are low-latency collectors using concurrent marking and evacuation with load barriers or snapshot-at-the-beginning write barriers to enable concurrent object movement.
Write Barriers & Card Tables
Write barriers intercept object reference assignments to track cross-generational pointers. The CardTable divides the heap into fixed-size cards (typically 512 bytes). When a reference is written, the corresponding card is marked dirty. During young GC, only dirty cards in the old generation are scanned, avoiding full heap traversal.
// Card table entry: 1 byte per card
// dirty_card = 0, clean_card = -1
CardTable::CardValue* card_table_base = ...;
uintptr_t card_index = address >> CardTable::card_shift();
card_table_base[card_index] = CardTable::dirty_card_val();
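The same card-marking idea can be modeled in a few lines of Java. This is a toy model under stated assumptions: addresses are plain long values, the heap is a single contiguous range, and the class name ToyCardTable and its methods are invented for illustration. Only the 512-byte card size and the dirty and clean values follow the text above.

```java
import java.util.Arrays;

public class ToyCardTable {
    static final int CARD_SHIFT = 9; // 2^9 = 512-byte cards
    static final byte DIRTY = 0;
    static final byte CLEAN = -1;

    private final long heapBase;
    private final byte[] cards; // one byte per card

    ToyCardTable(long heapBase, long heapSize) {
        this.heapBase = heapBase;
        long cardSize = 1L << CARD_SHIFT;
        // Round up so a partially covered last card still gets an entry.
        this.cards = new byte[(int) ((heapSize + cardSize - 1) >>> CARD_SHIFT)];
        Arrays.fill(cards, CLEAN);
    }

    /** Post-write barrier: mark the card covering the written field as dirty. */
    void recordWrite(long fieldAddress) {
        cards[(int) ((fieldAddress - heapBase) >>> CARD_SHIFT)] = DIRTY;
    }

    /** Number of cards a young collection would need to scan. */
    int dirtyCards() {
        int count = 0;
        for (byte card : cards) {
            if (card == DIRTY) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ToyCardTable table = new ToyCardTable(0x1000L, 4096L);
        table.recordWrite(0x1000L); // card 0
        table.recordWrite(0x11FFL); // still card 0
        table.recordWrite(0x1200L); // card 1
        System.out.println(table.dirtyCards()); // 2
    }
}
```

Two writes into the same 512-byte window dirty only one card, which is why the table stays cheap to update and cheap to scan.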
Metaspace Memory Management
Metaspace stores class metadata (Klass objects, method data, bytecode) outside the Java heap. It uses arena-based allocation with hierarchical chunk management:
- MetaspaceArena - Per-ClassLoader allocation arena with pointer-bump allocation
- Metachunk - Chunks in power-of-two sizes, organized in a linked list
- ChunkManager - Manages free chunk freelists by size
- VirtualSpaceNode - Manages contiguous virtual memory regions
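Allocation is attempted in this order: (1) free block list, (2) current chunk, (3) enlarge the current chunk, (4) new chunk. Metaspace is reclaimed when classes are unloaded: once a class loader becomes unreachable, its arena's chunks are returned to the ChunkManager.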
GC Phases
Most collectors follow: Mark (identify live objects) → Sweep/Compact (reclaim/move objects) → Update References (fix pointers). Concurrent collectors interleave marking with application threads using barriers. Parallel collectors use worker threads to parallelize marking and compaction phases.
Allocation & Promotion
Objects are normally allocated in Eden space. Objects that survive enough young collections are promoted to the Old generation. A TLAB (Thread-Local Allocation Buffer) reduces contention by giving each thread a private allocation buffer in Eden. When a TLAB is exhausted, the thread either requests a new TLAB or allocates directly in the shared space.
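TLAB allocation is essentially pointer bumping inside a private range. The following toy model illustrates that fast path; the class name ToyTlab, the 8-byte alignment, and the use of -1 as the "exhausted" signal are assumptions made for this sketch and do not mirror HotSpot's actual data structures.

```java
public class ToyTlab {
    private long top;       // next free address
    private final long end; // first address past the buffer

    ToyTlab(long start, long size) {
        this.top = start;
        this.end = start + size;
    }

    /**
     * Bump-pointer allocation: returns the start address of the new object,
     * or -1 when the buffer cannot satisfy the request (the slow path would
     * then fetch a new TLAB or allocate in the shared space).
     */
    long allocate(long size) {
        long aligned = (size + 7) & ~7L; // round up to 8 bytes
        if (end - top < aligned) {
            return -1;
        }
        long result = top;
        top += aligned;
        return result;
    }

    public static void main(String[] args) {
        ToyTlab tlab = new ToyTlab(1000, 32);
        System.out.println(tlab.allocate(10)); // 1000 (uses 16 bytes)
        System.out.println(tlab.allocate(16)); // 1016
        System.out.println(tlab.allocate(1));  // -1, buffer exhausted
    }
}
```

Because only the owning thread touches top, no atomic operation is needed on this path, which is where the contention savings come from.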
Java Standard Library Modules
Relevant Files
- src/java.base/share/classes
- src/java.desktop/share/classes
- src/java.net.http/share/classes
- src/jdk.compiler/share/classes
- make/modules/
Since Java 9, Java SE has been organized into modules, each providing specific functionality and declaring explicit dependencies. The key modules are located in the src/java.* and src/jdk.* directories.
Core Module: java.base
The java.base module is the foundation of the Java platform. It exports core packages including:
- java.lang - Object, String, Thread, ClassLoader, reflection APIs
- java.io - File I/O, streams, serialization
- java.nio - Non-blocking I/O, buffers, channels, file systems
- java.util - Collections framework, utilities, service loaders
- java.net - Networking (sockets, URLs, addresses)
- java.security - Cryptography, authentication, permissions
- java.time - Date/time APIs (LocalDate, ZonedDateTime, etc.)
- java.math - BigInteger, BigDecimal
All other modules depend on java.base. It also provides internal APIs (via jdk.internal.* packages) to other modules through qualified exports.
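The difference between public and qualified exports can be checked through the module descriptor of java.base. This is a small sketch; the class name JavaBaseExports is illustrative, and jdk.internal.misc is used only as an example of an internal package that java.base exports to selected modules.

```java
import java.lang.module.ModuleDescriptor;

public class JavaBaseExports {
    /**
     * True if java.base exports the package, either to everyone
     * (qualified == false) or only to named modules (qualified == true).
     */
    static boolean isExported(String pkg, boolean qualified) {
        ModuleDescriptor base = Object.class.getModule().getDescriptor();
        return base.exports().stream()
                .anyMatch(e -> e.source().equals(pkg) && e.isQualified() == qualified);
    }

    public static void main(String[] args) {
        System.out.println("java.lang public:            " + isExported("java.lang", false));
        System.out.println("jdk.internal.misc qualified: " + isExported("jdk.internal.misc", true));
        System.out.println("jdk.internal.misc public:    " + isExported("jdk.internal.misc", false));
    }
}
```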
Desktop Module: java.desktop
The java.desktop module provides graphical user interface and multimedia capabilities:
- java.awt - Abstract Window Toolkit (AWT) for basic GUI components
- javax.swing - Swing framework for rich desktop applications
- java.beans - JavaBeans component model
- javax.imageio - Image I/O and processing
- javax.sound.midi and javax.sound.sampled - Audio and MIDI support
- javax.print - Printing APIs
- javax.accessibility - Accessibility support
This module depends on java.datatransfer and java.xml, and includes extensive native code for platform-specific rendering and windowing.
HTTP Client Module: java.net.http
The java.net.http module provides modern HTTP communication APIs:
- java.net.http.HttpClient - High-level HTTP client supporting HTTP/1.1, HTTP/2, and HTTP/3
- java.net.http.HttpRequest - Immutable HTTP request builder
- java.net.http.HttpResponse - HTTP response with various body handlers
- java.net.http.WebSocket - WebSocket protocol support
The module includes extensive configuration options via system properties (e.g., jdk.httpclient.bufsize, jdk.httpclient.connectionPoolSize) and supports both synchronous and asynchronous request handling.
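A short sketch of the builder-style API follows. The class name HttpSketch, the example.org URL, and the timeout values are placeholders chosen for illustration; the code only builds the client and request, so it runs without network access.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class HttpSketch {
    /** Builds an immutable GET request for the given URL. */
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "text/plain")
                .GET()
                .build();
    }

    /** Builds a client that prefers HTTP/2 and falls back to HTTP/1.1. */
    static HttpClient buildClient() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(5))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://example.org/");
        HttpClient client = buildClient();
        System.out.println(request.method() + " " + request.uri() + " via " + client.version());
    }
}
```

To actually send the request, client.send(request, HttpResponse.BodyHandlers.ofString()) blocks until the response arrives, while client.sendAsync(...) with the same arguments returns a CompletableFuture.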
Compiler Module: jdk.compiler
The jdk.compiler module provides Java source code compilation capabilities:
- com.sun.source.tree - Abstract Syntax Tree (AST) API for representing Java source
- com.sun.source.util - Utilities for AST traversal and analysis
- com.sun.tools.javac - The javac compiler implementation
- javax.tools - Standard compiler APIs (JavaCompiler, CompilationTask), defined in the java.compiler module, which jdk.compiler requires transitively
This module enables programmatic compilation, code analysis, and annotation processing. It's used by javadoc, jshell, and other tools that need to parse or compile Java code.
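Programmatic use of the compiler can be sketched with the Compiler Tree API. The example below only parses an in-memory source string and collects method names; the class name ParseDemo and the "string:///" URI scheme are illustrative choices, and the code assumes it runs on a full JDK where the jdk.compiler module is present.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class ParseDemo {
    /** Parses the source (no class files are written) and returns declared method names. */
    static List<String> methodNames(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system compiler; is jdk.compiler available?");
        }
        // Wrap the string so the compiler can read it like a source file.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        List<String> names = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree node, Void unused) {
                        names.add(node.getName().toString());
                        return super.visitMethod(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(methodNames("A", "class A { void f() {} int g() { return 1; } }"));
    }
}
```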
Build Organization
Each module is compiled separately via the build system (make/modules/). The build process:
- Generates source code - Tools generate character data, buffer classes, and module metadata
- Compiles Java classes - Javac compiles all .java files in the module
- Compiles native code - C/C++ code for platform-specific functionality (especially in java.desktop)
- Packages resources - Includes configuration files, data files, and native libraries
- Creates JMOD files - Packages the compiled module for linking into runtime images
Each module declares its public API via module-info.java, which specifies exports (public packages) and requires (dependencies on other modules).
Build System & Configuration
Relevant Files
- Makefile - Root makefile wrapper with sanity checks
- configure - Bash wrapper for autoconf configuration
- make/PreInit.gmk - Bootstrap phase, locates SPEC file
- make/Init.gmk - Initialization phase, launches Main.gmk
- make/Main.gmk - Main build orchestration with all targets
- make/autoconf/configure.ac - Autoconf configuration template
- make/common/MakeBase.gmk - Core makefile utilities and functions
Overview
The OpenJDK build system is a sophisticated multi-stage process using GNU Make and Autoconf. It orchestrates compilation of Java modules, native code, documentation, and test infrastructure across multiple platforms. The build is highly modular, with separate phases for different compilation stages and careful dependency management.
Build Bootstrap Process
The build follows a four-stage bootstrap:
- Makefile - Validates GNU Make version (>= 3.81) and includes make/PreInit.gmk
- PreInit.gmk - Determines build configuration by locating or creating a SPEC file
- Init.gmk - Sets up reproducible build environment and delegates to Main.gmk
- Main.gmk - Executes actual build targets with proper dependencies
This design allows the build to work on diverse platforms while maintaining consistency through configuration files.
Configuration via Autoconf
The configure script wraps autoconf-generated configuration:
bash configure [options]
Key configuration steps:
- Detects platform, toolchain, and available libraries
- Validates boot JDK and build tools
- Generates spec.gmk containing platform-specific variables
- Supports cross-compilation and custom tool paths
Common options: --with-boot-jdk=PATH, --with-jvm-features=FEATURES, --with-debug-level=LEVEL, --with-native-debug-symbols=METHOD
Build Phases
Main.gmk orchestrates these sequential phases:
- gensrc - Generate source files from templates
- java - Compile Java modules
- copy - Copy resources and data files
- libs - Compile native libraries
- launchers - Build executable launchers
- gendata - Generate runtime data files
- images - Package final JDK/JRE images
Each phase depends on previous phases. Use make <phase> to build up to that phase.
Common Build Targets
make images # Build complete JDK image
make all # Build all images: product, test, docs
make test-tier1 # Run basic test suite
make docs # Generate all documentation
make clean # Remove build artifacts
make dist-clean # Remove all generated files including config
Append -only to skip dependencies: make java-only builds only Java compilation without gensrc.
Build Output Structure
Build artifacts are organized by configuration:
build/
  linux-x64/      # Platform-specific output
    spec.gmk      # Configuration variables
    support/      # Intermediate build files
    images/       # Final JDK/JRE images
      jdk/
      test/
      docs/
The output location is fixed when configure runs and is recorded as OUTPUTDIR in spec.gmk. When several configurations exist, select one with make CONF=<name>.
Makefile Infrastructure
The build provides reusable makefile functions in make/common/:
- MakeBase.gmk - Core utilities: named parameters, variable dependencies, tool execution
- JavaCompilation.gmk - Java module compilation templates
- NativeCompilation.gmk - C/C++ compilation with proper flags
- JarArchive.gmk - JAR creation and packaging
- Modules.gmk - Module dependency resolution
These functions use named parameters for clarity: $(eval $(call SetupJavaCompilation, BUILD_NAME, SRC := ..., ...))
Reproducible Builds
The build supports reproducible builds via:
- Fixed timestamps for all generated files
- Deterministic ordering of inputs
- SOURCE_DATE_EPOCH environment variable support
- Consistent compiler flags across builds
Reproducibility is normally set up at configure time, for example with --with-source-date=<value> to pin the build timestamp.
Testing & Quality Assurance
Relevant Files
- test/jdk - JDK regression tests
- test/hotspot - HotSpot VM tests (jtreg and gtest)
- test/langtools - Compiler and language tools tests
- test/lib - Shared test utilities and helpers
- make/RunTests.gmk - Test execution framework
- test/failure_handler - Diagnostic data collection on failures
The JDK uses a comprehensive, multi-framework testing infrastructure designed to validate the entire platform across different components and configurations.
Test Frameworks
The repository employs three primary test frameworks, each suited to different testing needs:
JTReg is the primary framework for regression testing, used for the vast majority of JDK tests. It runs Java-based tests with support for both same-VM and separate-VM execution modes. Tests are organized into groups (e.g., jdk_lang, jdk_util, hotspot_gc) and tiered levels (tier1, tier2, tier3) for progressive validation.
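A jtreg test is an ordinary Java class whose leading comment carries test tags. The following is a hypothetical, minimal example (the class ReverseTest does not exist in the repository): the test passes when main returns normally and fails when it throws.

```java
/*
 * @test
 * @summary Hypothetical example: StringBuilder.reverse on a short input
 * @run main ReverseTest
 */
public class ReverseTest {
    static String reversed(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        String actual = reversed("abc");
        if (!actual.equals("cba")) {
            // An uncaught exception is how a jtreg main test reports failure.
            throw new RuntimeException("expected cba but got " + actual);
        }
    }
}
```

Real tests in the repository usually also carry a @bug tag and often use the helpers in test/lib instead of hand-written checks.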
Google Test (GTest) provides unit testing for HotSpot VM internals written in C++. Located in test/hotspot/gtest/, these tests cover memory management, garbage collection, compiler infrastructure, and runtime systems. GTest tests are compiled into a native executable (gtestLauncher) and executed separately from Java tests.
Microbenchmarks use the Java Microbenchmark Harness (JMH) framework to measure performance characteristics. Located in test/micro/, these benchmarks help track performance regressions and validate optimization changes.
Test Organization
Tests are organized hierarchically by component and functionality:
- JDK tests (test/jdk/) cover standard library APIs, security, networking, and tools
- HotSpot tests (test/hotspot/jtreg/) validate VM behavior, garbage collection, and compilation
- Language tools tests (test/langtools/) test javac, javadoc, and related tools
- Test libraries (test/lib/) provide shared utilities like Asserts, Platform, Utils, and JDKToolLauncher
Running Tests
The make-based test framework simplifies test execution:
make test-tier1 # Run tier1 tests
make test TEST=jdk_lang # Run specific test group
make test-only TEST=gtest:LogTagSet # Run without rebuilding
make test TEST="micro:java.lang.reflect" MICRO="FORK=1"
Problem Lists & Quarantine
ProblemList.txt files in each test directory list tests that should be skipped due to known issues. Entries include platform-specific labels (generic-all, linux-x64, windows-aarch64) and bug references. This allows tests to remain in the codebase while being excluded from CI runs.
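To illustrate how such entries can be read, here is a simplified parser sketch. It assumes a three-column layout of test path, comma-separated bug IDs, and comma-separated platform labels; the class name ProblemListLine, the sample path, and the bug number are made up, and real files may use additional label patterns that this sketch does not handle.

```java
import java.util.Arrays;
import java.util.List;

public class ProblemListLine {
    final String test;
    final List<String> bugs;
    final List<String> platforms;

    private ProblemListLine(String test, List<String> bugs, List<String> platforms) {
        this.test = test;
        this.bugs = bugs;
        this.platforms = platforms;
    }

    /** Parses one line; returns null for blank lines and '#' comments. */
    static ProblemListLine parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null;
        }
        String[] parts = trimmed.split("\\s+");
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected <test> <bugs> <platforms>: " + line);
        }
        return new ProblemListLine(parts[0],
                Arrays.asList(parts[1].split(",")),
                Arrays.asList(parts[2].split(",")));
    }

    /** True if the entry excludes the test on the given platform label. */
    boolean appliesTo(String platform) {
        return platforms.contains("generic-all") || platforms.contains(platform);
    }

    public static void main(String[] args) {
        // Hypothetical entry, not taken from a real ProblemList.txt.
        ProblemListLine entry = parse("java/foo/Bar.java 1234567 linux-x64,windows-aarch64");
        System.out.println(entry.test + " skipped on linux-x64: " + entry.appliesTo("linux-x64"));
    }
}
```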
Failure Handling
The failure handler (test/failure_handler/) automatically gathers diagnostic information when tests fail or time out. It collects system state, process information, core dumps, and environment details, generating HTML reports alongside test results for post-mortem analysis.
Test Utilities
The test/lib/jdk/test/lib/ package provides essential testing utilities:
- Asserts - Custom assertion methods for test validation
- Platform - Platform detection (OS, architecture, VM mode)
- Utils - General utilities (random seeds, port finding, condition waiting)
- JDKToolLauncher - Simplified JDK tool invocation
- ProcessTools - Process execution and monitoring
Tiered Testing Strategy
Tests are organized into tiers for efficient validation:
- Tier1 - Fast, essential tests run on every commit
- Tier2 - Broader coverage, run regularly
- Tier3 - Comprehensive tests, run before releases
This tiered approach balances thoroughness with feedback speed during development.
JVM Interfaces & Native Integration
Relevant Files
- src/hotspot/share/prims/jni.cpp
- src/hotspot/share/prims/jvmti.xml
- src/hotspot/share/include/jvm.h
- src/hotspot/share/jvmci/jvmciRuntime.cpp
- src/hotspot/share/prims/nativeLookup.cpp
- src/java.base/share/native/include/jni.h
The JVM exposes three primary native interfaces that allow external code to interact with the runtime: JNI (Java Native Interface), JVMTI (JVM Tool Interface), and JVMCI (JVM Compiler Interface). These interfaces form the bridge between Java code and native C/C++ implementations.
Java Native Interface (JNI)
JNI enables Java code to call native C/C++ functions and vice versa. The interface is defined in jni.h and implemented in jni.cpp. Key components include:
- JNIEnv: A thread-local environment pointer providing access to JNI functions for object manipulation, method invocation, and exception handling.
- JavaVM: A global VM reference with invocation interface functions (DestroyJavaVM, AttachCurrentThread, DetachCurrentThread, GetEnv, AttachCurrentThreadAsDaemon).
- Type Mapping: JNI defines primitive types (jint, jlong, jboolean) and object references (jobject, jclass, jstring, jarray).
Native method lookup follows the JNI specification: the fully qualified class name and the method name of a native method are mapped to a C symbol using a standardized escaping scheme. For example, the native method java.lang.System.currentTimeMillis resolves to the symbol Java_java_lang_System_currentTimeMillis. The NativeLookup class handles this resolution, supporting both functions built into the JDK's own libraries and dynamically loaded native libraries.
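The escaping scheme for the short (non-overloaded) symbol name can be sketched in Java as follows. This is an illustration of the rules in the JNI specification, not the NativeLookup code; the class name JniNames is invented, and the long form used for overloaded native methods (which appends an escaped argument signature) is left out.

```java
public class JniNames {
    /** Applies the JNI escapes to a class or method name. */
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == '/') {
                sb.append('_');                 // package separator
            } else if (c == '_') {
                sb.append("_1");
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                // Any other character becomes _0 plus four lower-case hex digits.
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    /** Short JNI symbol name for a native method. */
    static String shortName(String className, String methodName) {
        return "Java_" + escape(className) + "_" + escape(methodName);
    }

    public static void main(String[] args) {
        System.out.println(shortName("java.lang.System", "currentTimeMillis"));
        System.out.println(shortName("pkg.My_Class", "do_it"));
    }
}
```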
JVM Tool Interface (JVMTI)
JVMTI provides a comprehensive API for monitoring and controlling JVM execution, primarily used by debuggers, profilers, and monitoring agents. The interface is defined declaratively in jvmti.xml and generated into C headers via XSL transformations.
Key capabilities include:
- Event Callbacks: Agents register callbacks for VM events (class loading, thread creation, method entry/exit, garbage collection).
- Capabilities Model: Agents declare required capabilities at load time; the VM enables only requested functionality to optimize performance.
- Thread Control: Suspend/resume threads, inspect stack traces, and manage thread state.
- Heap Inspection: Iterate over heap objects, tag objects for tracking, and analyze object references.
- Class Instrumentation: Retransform class bytecode at runtime for dynamic instrumentation.
Agents are loaded via the Agent_OnLoad entry point and can attach to running VMs via Agent_OnAttach. The JvmtiEnv structure provides the agent-facing API, while internal JvmtiEnvBase classes manage state and event delivery.
JVM Compiler Interface (JVMCI)
JVMCI enables external compilers (like GraalVM) to integrate with HotSpot's compilation pipeline. It provides bidirectional communication: the VM can request compilation from external compilers, and compilers can query VM metadata and install compiled code.
Core features:
- Compiler Integration: External compilers implement the JVMCI compiler interface and are invoked by the compilation broker.
- Metadata Access: Compilers query class hierarchies, method signatures, and field layouts via JVMCIEnv.
- Code Installation: Compiled code is installed as nmethods with associated metadata (JVMCINMethodData).
- Shared Library Support: The JVMCI compiler can run from a separate shared library loaded into the VM process, with its own runtime state, which isolates it from the application's Java heap.
The JVMCIRuntime class manages the JVMCI environment, handles thread attachment to the JVMCI VM, and coordinates initialization. The JNIJVMCI class provides JNI-based access to Java classes and methods from the compiler.
Native Method Resolution
The VM resolves native methods through a multi-stage lookup process:
- Check for special built-in methods (e.g., JVM_GetJVMCIRuntime, JVM_RegisterNatives).
- Compute the JNI-compliant C symbol name from the Java method signature.
- Search loaded native libraries for the symbol.
- If not found, throw UnsatisfiedLinkError.
This mechanism supports both static linking (built-in functions) and dynamic loading (user libraries), enabling flexible native integration.
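The final step of this lookup is easy to observe from Java. In the sketch below (the class MissingNative and its answer method are hypothetical), a native method is declared but no library providing it is ever loaded, so the first call fails with UnsatisfiedLinkError.

```java
public class MissingNative {
    // Declared native, but no library defines the matching C symbol.
    static native int answer();

    /** True if calling the unbound native method fails at link time. */
    static boolean linkFails() {
        try {
            answer();
            return false;
        } catch (UnsatisfiedLinkError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("link fails: " + linkFails());
    }
}
```

Note that the error is raised lazily, on the first invocation, rather than when the class is loaded.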