opencv/opencv

OpenCV: Open Source Computer Vision Library

Last updated on Dec 18, 2025 (Commit: b157553)

Overview

Relevant Files
  • README.md
  • modules/core/doc/intro.markdown
  • include/opencv2/opencv.hpp
  • doc/root.markdown.in

OpenCV (Open Source Computer Vision Library) is a comprehensive, modular C++ library containing hundreds of computer vision algorithms. It provides a modern C++ API (OpenCV 2.x+) with automatic memory management, efficient data structures, and cross-platform support for image processing, video analysis, machine learning, and deep neural networks.

Core Architecture

OpenCV is organized as a collection of specialized modules, each addressing specific computer vision domains:

Foundation & Data Structures

  • core - Dense multi-dimensional arrays (Mat), basic functions, and automatic memory management
  • imgproc - Image filtering, transformations, color space conversion, and histograms

Input/Output & Visualization

  • imgcodecs - Reading and writing image files in multiple formats
  • videoio - Video capture and codec interfaces
  • highgui - Simple UI capabilities and window management

Analysis & Detection

  • video - Motion estimation, background subtraction, object tracking
  • features2d - Feature detection, descriptors, and matching
  • objdetect - Object and face detection using cascades and deep learning
  • calib3d - Camera calibration, 3D reconstruction, stereo vision

Advanced Processing

  • dnn - Deep Neural Network inference (models are trained in other frameworks and imported for inference)
  • ml - Machine learning (classification, regression, clustering)
  • photo - Advanced photo processing (denoising, inpainting)
  • stitching - Image stitching and panorama creation
  • gapi - Graph-based image processing pipeline

Key Design Principles

Automatic Memory Management - All data structures use reference counting. Copying a Mat is O(1); use .clone() for deep copies. Memory is automatically deallocated when reference count reaches zero.

Fixed Pixel Types - Arrays support a small fixed set of primitive types (8-bit unsigned/signed, 16-bit unsigned/signed, 32-bit signed integer, and 16/32/64-bit floating point) plus multi-channel variants. This design balances flexibility with performance and language-binding support.

Saturation Arithmetic - Operations on 8-bit and 16-bit images automatically clamp results to valid ranges, preventing overflow artifacts.

Flexible Input/Output - Functions accept InputArray and OutputArray proxies, allowing seamless use of Mat, std::vector, Matx, or Scalar without API duplication.

Language Support

OpenCV provides bindings for multiple languages:

  • C++ - Primary implementation with full feature access
  • Python - Popular for research and rapid prototyping
  • Java - Android and desktop applications
  • JavaScript - Web-based computer vision
  • Objective-C - iOS development

Build & Deployment

The library uses CMake for cross-platform builds with optional support for GPU acceleration (CUDA, OpenCL), hardware optimization (IPP, TBB), and specialized backends (OpenVX, Vulkan). Modular design allows selective compilation of only needed components.

Architecture & Module System

Relevant Files
  • CMakeLists.txt
  • cmake/OpenCVModule.cmake
  • modules/core/include/opencv2/core.hpp
  • modules/core/include/opencv2/core/hal/interface.h
  • modules/core/src/hal_replacement.hpp

OpenCV is organized as a modular architecture where functionality is divided into independent, reusable components. Each module encapsulates specific computer vision capabilities and manages its own dependencies, sources, and tests.

Module Organization

The repository contains 20+ core modules located in modules/, each with a consistent structure:

  • include/ – Public API headers (installed with the library)
  • src/ – Implementation files (C++, CUDA, OpenCL kernels)
  • test/ – Accuracy tests
  • perf/ – Performance benchmarks
  • CMakeLists.txt – Module build configuration

Key modules include core (fundamental data structures), imgproc (image processing), dnn (deep learning), video (video analysis), and highgui (GUI/display).

CMake Module System

The build system uses a two-pass approach defined in cmake/OpenCVModule.cmake:

First Pass (Information Collection): CMake scans all module directories, collects metadata (dependencies, descriptions, class type), and builds a dependency graph. Modules are classified as PUBLIC (exported), INTERNAL (private), or BINDINGS (language wrappers).

Second Pass (Target Creation): After dependency resolution, CMake creates actual build targets. The system automatically disables modules with unresolved dependencies and propagates transitive dependencies.

ocv_add_module(modulename [INTERNAL|BINDINGS] [REQUIRED deps] [OPTIONAL deps])
ocv_glob_module_sources()
ocv_module_include_directories()
ocv_create_module()
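Put together, a module's own CMakeLists.txt typically calls these in sequence. The sketch below uses a hypothetical module name and dependency list; real modules pass their actual dependencies and wrapper targets:

```cmake
# Hypothetical modules/mymodule/CMakeLists.txt (illustrative only)
set(the_description "Example module (hypothetical)")
ocv_add_module(mymodule opencv_core OPTIONAL opencv_imgproc)
ocv_glob_module_sources()
ocv_module_include_directories()
ocv_create_module()
```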

Dependency Resolution

The system performs sophisticated dependency management:

  • Required dependencies must be available or the module is disabled
  • Optional dependencies are included if available
  • Transitive dependencies are automatically propagated
  • Circular dependencies are detected and reported as errors
  • World build mode combines all modules into a single library for efficiency

Hardware Abstraction Layer (HAL)

The HAL provides a pluggable interface for platform-specific optimizations. It sits between high-level algorithms and low-level implementations:

Algorithm Layer (e.g., cv::add)
        ↓
HAL Interface (cv_hal_add8u)
        ↓
HAL Implementations (IPP, OpenVX, FastCV, KleidiCV, etc.)
        ↓
CPU/GPU Hardware

The HAL interface is defined in modules/core/include/opencv2/core/hal/interface.h with return codes (CV_HAL_ERROR_OK, CV_HAL_ERROR_NOT_IMPLEMENTED). Implementations can be swapped at build time via CMake options like WITH_IPP, WITH_OPENVX, or WITH_FASTCV.


Build Configuration

The root CMakeLists.txt orchestrates the entire build:

  1. Detects compiler, platform, and optional dependencies
  2. Configures HAL backends (IPP, OpenVX, FastCV, etc.)
  3. Calls ocv_register_modules() to process all modules
  4. Generates configuration headers and CMake config files
  5. Optionally builds documentation, tests, samples, and language bindings

Key options include BUILD_LIST (whitelist specific modules), BUILD_SHARED_LIBS (shared vs. static), and BUILD_opencv_world (monolithic library).

Core Module & Data Structures

Relevant Files
  • modules/core/include/opencv2/core/mat.hpp
  • modules/core/include/opencv2/core/types.hpp
  • modules/core/src/precomp.hpp
  • modules/core/src/matrix.cpp

OpenCV's core module provides the fundamental data structures and algorithms for image processing. The module is built around dense n-dimensional arrays and geometric primitives that enable efficient computation across CPU, GPU, and specialized hardware backends.

Mat: The Central Data Structure

The Mat class is the cornerstone of OpenCV, representing an n-dimensional dense numerical array. It can store single or multi-channel data (grayscale images, color images, tensors, etc.) with automatic memory management through reference counting.

Key characteristics:

  • Flexible dimensionality: Supports 2D matrices, 3D volumes, and higher-dimensional tensors
  • Reference-counted memory: Shallow copies share data; use clone() for deep copies
  • Continuous memory layout: Uses step arrays for efficient element access across dimensions
  • Type system: Supports all primitive types (uint8, int32, float32, float64, etc.) with channel counts

Core member variables:

int flags;           // Magic signature, continuity flag, depth, channels
int dims;            // Number of dimensions (>= 2)
int rows, cols;      // 2D dimensions (-1 if >2D)
uchar* data;         // Pointer to pixel data
MatSize size;        // Dimensional sizes
MatStep step;        // Byte offsets between rows/planes
UMatData* u;         // Unified memory data (GPU/CPU)

Element access patterns:

M.at<double>(i, j)           // Type-safe element access
M.ptr<uchar>(row)            // Raw pointer to row
M(Range(0, 10), Range::all()) // ROI slicing

Geometric Primitives

OpenCV provides template classes for common geometric types, all supporting arithmetic operations and type conversions:

Point_<T>: 2D coordinates with dot product, cross product, and norm operations. Aliases: Point2i, Point2f, Point2d.

Point3_<T>: 3D coordinates with 3D-specific operations like the cross product.

Size_<T>: Width and height for image/rectangle dimensions. Aliases: Size2i, Size2f.

Rect_<T>: Axis-aligned rectangles defined by top-left corner and dimensions. Supports intersection, union, and containment checks. Aliases: Rect2i, Rect2f.

Range: Half-open interval [start, end) for slicing matrices. Range::all() represents the entire dimension.

Scalar_<T>: 4-element vector for pixel values. Derived from Vec<T, 4>. Default type is Scalar (double precision).

Vector Types

The Vec<T, n> template represents fixed-size vectors allocated on the stack. Common aliases include Vec2f, Vec3b, Vec4d. Vectors support element-wise operations, norms, and conversions to/from Points and Scalars.

Input/Output Array Abstraction

_InputArray and _OutputArray are proxy classes enabling polymorphic function signatures. They accept Mat, std::vector<T>, Matx, UMat, GpuMat, and other array types without explicit conversion.

void processImage(InputArray src, OutputArray dst);
// Can be called with Mat, vector, UMat, etc.

Unified Memory (UMat)

UMat provides a unified interface for CPU and GPU memory. The UMatData structure manages reference counting, memory flags, and synchronization between host and device copies. This enables transparent GPU acceleration without changing algorithm code.

Memory Management

OpenCV uses custom allocators (MatAllocator) for flexible memory strategies. The standard allocator handles CPU memory; specialized allocators support GPU, OpenCL, and other backends. Reference counting prevents memory leaks while enabling efficient shallow copies.


Type System

OpenCV encodes data types as integers: CV_MAKETYPE(depth, channels). Depths include CV_8U, CV_32F, CV_64F, etc. This enables runtime type checking and automatic format conversions across functions and backends.

Image Processing & Filtering

Relevant Files
  • modules/imgproc/include/opencv2/imgproc.hpp
  • modules/imgproc/CMakeLists.txt
  • modules/imgproc/src/filter.dispatch.cpp
  • modules/imgproc/src/morph.dispatch.cpp
  • modules/imgproc/src/smooth.dispatch.cpp

The imgproc module provides a comprehensive suite of image filtering and processing functions. Each pixel's output is computed from its neighborhood using kernels, with support for multi-channel images processed independently.

Linear Filtering

Linear filters apply weighted sums of pixel values. The module supports several approaches:

  • filter2D() - Applies arbitrary 2D convolution kernels (correlation, not true convolution)
  • sepFilter2D() - Applies separable filters for efficiency (row then column filtering)
  • blur() - Normalized box filter (averaging)
  • boxFilter() - Unnormalized box filter for computing integral characteristics
  • GaussianBlur() - Gaussian smoothing with configurable kernel size and sigma

Smoothing & Noise Reduction

Specialized filters for noise removal while preserving edges:

  • medianBlur() - Median filter (non-linear, effective for salt-and-pepper noise)
  • bilateralFilter() - Edge-preserving filter using color and spatial distance
  • stackBlur() - Fast approximation of Gaussian blur whose running time does not grow with kernel size

Morphological Operations

Morphological operations use structuring elements to process binary or grayscale images:

  • erode() - Applies minimum filter; shrinks white regions
  • dilate() - Applies maximum filter; expands white regions
  • morphologyEx() - Advanced operations combining erosion and dilation:
    • MORPH_OPEN - Erosion followed by dilation (removes small objects)
    • MORPH_CLOSE - Dilation followed by erosion (fills small holes)
    • MORPH_GRADIENT - Difference between dilation and erosion (edge detection)
    • MORPH_TOPHAT - Original minus opening (extracts small objects)
    • MORPH_BLACKHAT - Closing minus original (extracts small holes)

Derivative & Edge Detection

Compute image gradients and detect edges:

  • Sobel() - First/second derivatives with Gaussian smoothing
  • Scharr() - More accurate 3×3 derivative operator
  • Laplacian() - Second-derivative operator (sum of the second derivatives in x and y)
  • Canny() - Multi-stage edge detector with hysteresis thresholding

Key Concepts

Border Handling: Functions support multiple extrapolation methods (BORDER_REPLICATE, BORDER_CONSTANT, BORDER_REFLECT, etc.) for pixels outside image boundaries.

Depth Combinations: Output depth can differ from input. For example, 8-bit input can produce 16-bit or 32-bit output for derivatives to preserve precision.

Structuring Elements: Created via getStructuringElement() with shapes: MORPH_RECT, MORPH_CROSS, MORPH_ELLIPSE, MORPH_DIAMOND.

Performance: The module uses CPU dispatch (SSE2, AVX2, AVX512) and OpenCL acceleration. Separable filters and small kernels are optimized for speed.

// Example: Gaussian blur with edge detection
Mat src = imread("image.jpg", IMREAD_GRAYSCALE);
Mat blurred, edges;
GaussianBlur(src, blurred, Size(5, 5), 1.0);
Canny(blurred, edges, 50, 150);

Deep Neural Networks & Model Inference

Relevant Files
  • modules/dnn/include/opencv2/dnn/dnn.hpp
  • modules/dnn/src/dnn_read.cpp
  • modules/dnn/src/net.cpp
  • modules/dnn/src/layer.cpp
  • samples/dnn/README.md

The DNN module provides a comprehensive framework for loading and executing pre-trained deep neural networks. It supports multiple model formats and backends, enabling efficient inference across different hardware targets.

Core Architecture

The module is built around three main components:

  1. Net Class - Represents a complete neural network graph. It manages layer connections, data flow, and forward pass execution.
  2. Layer Class - Base class for all layer implementations. Each layer encapsulates computation logic and learned parameters (weights/biases).
  3. Blob/Mat - Data containers for network inputs, outputs, and intermediate activations. OpenCV uses cv::Mat as the primary data structure.

Model Loading

OpenCV supports loading models from multiple frameworks through dedicated reader functions:

// Auto-detect framework from file extensions
Net net = cv::dnn::readNet(model, config);

// Framework-specific loaders
Net net = cv::dnn::readNetFromCaffe(prototxt, caffemodel);
Net net = cv::dnn::readNetFromTensorflow(pb, pbtxt);
Net net = cv::dnn::readNetFromONNX(onnxfile);
Net net = cv::dnn::readNetFromDarknet(cfg, weights);
Net net = cv::dnn::readNetFromTFLite(tflitefile);
Net net = cv::dnn::readNetFromTorch(t7file);

All readers support both file paths and in-memory buffers, enabling flexible deployment scenarios.

Inference Pipeline

A typical pipeline: load the model with readNet(), preprocess the input with blobFromImage(), bind it with setInput(), then call forward() to obtain the output blobs.

Backend & Target Configuration

The module supports multiple computation backends and target devices:

Backends: DNN_BACKEND_OPENCV (CPU), DNN_BACKEND_CUDA (NVIDIA GPU), DNN_BACKEND_INFERENCE_ENGINE (Intel OpenVINO), DNN_BACKEND_HALIDE, DNN_BACKEND_VULKAN, DNN_BACKEND_WEBNN, DNN_BACKEND_TIMVX, DNN_BACKEND_CANN

Targets: DNN_TARGET_CPU, DNN_TARGET_OPENCL, DNN_TARGET_CUDA, DNN_TARGET_MYRIAD, DNN_TARGET_FPGA, DNN_TARGET_HDDL, DNN_TARGET_NPU

Input Preprocessing

Before inference, inputs must be preprocessed to match model expectations:

Mat blob = cv::dnn::blobFromImage(image, scalefactor, size, mean, swapRB);
net.setInput(blob);

The module handles normalization, resizing, channel reordering, and mean subtraction automatically.

Quantization & Optimization

The module supports model quantization for reduced memory footprint and faster inference:

Net quantizedNet = net.quantize(calibrationData, CV_8S, CV_8S);

Layer fusion and Winograd optimization can be enabled to further accelerate computation on compatible hardware.

Performance Analysis

Profiling tools help identify bottlenecks:

std::vector<double> timings;
int64 totalTime = net.getPerfProfile(timings);

This returns per-layer execution times, useful for optimization and debugging.

Hardware Abstraction Layer & Optimization

Relevant Files
  • modules/core/include/opencv2/core/hal/interface.h
  • modules/core/src/hal_replacement.hpp
  • hal/openvx/hal/openvx_hal.hpp
  • hal/fastcv/CMakeLists.txt
  • hal/ipp/src/warp_ipp.cpp
  • hal/carotene/CMakeLists.txt
  • hal/kleidicv/CMakeLists.txt

Overview

The Hardware Abstraction Layer (HAL) is OpenCV's pluggable optimization framework that decouples high-level algorithms from platform-specific implementations. It enables OpenCV to leverage specialized hardware capabilities (CPUs, GPUs, DSPs) without modifying core algorithm code.

Architecture

The HAL sits between algorithm layers and hardware:

Algorithm Layer (cv::add, cv::filter2D, etc.)
        ↓
HAL Interface (cv_hal_add8u, cv_hal_filter, etc.)
        ↓
HAL Implementations (IPP, OpenVX, FastCV, KleidiCV, Carotene, RISC-V RVV)
        ↓
CPU/GPU/DSP Hardware

Core Design Principles

Pluggable Implementations: Each HAL backend implements a standardized C interface defined in interface.h. Implementations return CV_HAL_ERROR_OK on success, CV_HAL_ERROR_NOT_IMPLEMENTED to fall back to default code, or CV_HAL_ERROR_UNKNOWN on failure.

Macro-Based Dispatch: The hal_replacement.hpp file defines default no-op implementations using macros like cv_hal_add8u, cv_hal_filter, etc. Backend headers override these macros with optimized versions using #undef and #define.

Type-Safe Generics: Many operations use C++ templates to handle multiple data types (8-bit, 16-bit, 32-bit, float, double) with a single implementation.

Available HAL Backends

| Backend | Purpose | Target Hardware |
|---|---|---|
| IPP | Intel Performance Primitives | x86/x64 CPUs |
| OpenVX | Khronos standard compute API | GPUs, DSPs, heterogeneous systems |
| FastCV | Qualcomm mobile optimization | ARM processors, mobile devices |
| Carotene | ARM NEON optimization | ARM NEON-capable CPUs |
| KleidiCV | Arm compute library integration | Arm processors |
| RISC-V RVV | RISC-V Vector Extension | RISC-V processors |
| NDSRVP | Andes DSP extension | Andes processors with DSP |

Optimization Strategies

Conditional Fallback: HAL functions check image dimensions and data types. For small images or unsupported formats, they return CV_HAL_ERROR_NOT_IMPLEMENTED to use the default OpenCV implementation, avoiding overhead.

SIMD Intrinsics: The intrin.hpp header provides platform-specific SIMD wrappers (SSE, AVX, NEON, RVV, etc.) for vectorized operations on supported architectures.

Lazy Initialization: Complex operations like DFT and filtering use context structures (cvhalDFT, cvhalFilter2D) to cache precomputed data across multiple calls, reducing setup overhead.

Integration Pattern

// In algorithm code (e.g., cv::add)
int status = cv_hal_add8u(src1, step1, src2, step2, dst, dst_step, width, height);
if (status == CV_HAL_ERROR_OK)
    return;  // HAL handled it
// Fall back to default implementation

Build Configuration

HAL backends are conditionally compiled based on CMake flags:

cmake -DWITH_IPP=ON -DWITH_OPENVX=ON -DWITH_FASTCV=ON ..

Multiple backends can coexist; the first available implementation is used at runtime. This allows OpenCV to gracefully degrade on systems lacking specialized libraries while maximizing performance on well-equipped platforms.

Language Bindings & API Generation

Relevant Files
  • modules/python/common.cmake
  • modules/python/src2/cv2.cpp
  • modules/python/src2/gen2.py
  • modules/java/generator/gen_java.py
  • modules/js/src/core_bindings.cpp
  • modules/js/generator/embindgen.py
  • modules/objc/generator/gen_objc.py

OpenCV provides language bindings for Python, Java, JavaScript, and Objective-C through a unified code generation system. This architecture enables developers to use OpenCV's C++ API from multiple programming languages while maintaining consistency and reducing manual maintenance.

Architecture Overview

The binding generation system follows a three-stage pipeline:

  1. Header Parsing - C++ headers are parsed to extract classes, functions, enums, and constants
  2. Code Generation - Language-specific generators create wrapper code for each target language
  3. Compilation & Linking - Generated code is compiled and linked with OpenCV libraries

Python Bindings

Python bindings are generated via gen2.py and compiled into the cv2 module. The system:

  • Parses C++ headers using hdr_parser.CppHeaderParser
  • Generates C++ wrapper code that interfaces with Python's C API
  • Creates type mappings between C++ and Python types (Mat <-> numpy arrays)
  • Generates typing stubs for IDE support and type checking
  • Compiles to a native extension module (.so on Linux, .pyd on Windows)

Key files: cv2.cpp (module initialization), cv2_numpy.cpp (NumPy integration), cv2_convert.cpp (type conversions)

Java Bindings

Java bindings use JNI (Java Native Interface) to bridge Java and C++. The generator:

  • Creates Java wrapper classes that mirror C++ class hierarchies
  • Generates JNI glue code for method calls and data marshaling
  • Handles type conversions (Java primitives <-> C++ types)
  • Supports method overloading through suffix numbering
  • Generates both .java source files and .cpp JNI implementations

The system maintains a type dictionary mapping C++ types to Java equivalents (e.g., cv::Mat <-> long nativeObj reference).

JavaScript Bindings

JavaScript bindings compile to WebAssembly using Emscripten. The embindgen.py generator:

  • Parses headers and extracts public API surface
  • Generates Emscripten binding code using emscripten::bind
  • Creates JavaScript wrapper classes for C++ objects
  • Handles memory management through Emscripten's garbage collection
  • Produces WASM modules loadable in browsers and Node.js

The core_bindings.cpp file demonstrates binding patterns for Mat, Point, Rect, and other core types.

Objective-C Bindings

Objective-C bindings target iOS and macOS platforms. The generator:

  • Creates Objective-C wrapper classes with Swift extensions
  • Generates both .h headers and .mm implementations
  • Handles C++ object lifetime through cv::Ptr<T> smart pointers
  • Supports Swift interoperability with type-safe extensions
  • Generates framework bundles for distribution

Configuration files (gen_dict.json) control which classes and functions are exposed per module.

Configuration & Customization

Each binding generator reads a JSON configuration file specifying:

  • Which modules to bind
  • Header file locations
  • Type mappings and conversions
  • Classes and functions to skip or rename
  • Platform-specific settings

This allows fine-grained control over the API surface exposed to each language while sharing the same C++ implementation.

Build System & Configuration

Relevant Files
  • CMakeLists.txt - Root CMake configuration
  • cmake/OpenCVUtils.cmake - Utility macros and hooks system
  • cmake/OpenCVCompilerOptions.cmake - Compiler flags and optimization settings
  • cmake/templates/OpenCVConfig.cmake.in - Package configuration template
  • cmake/OpenCVModule.cmake - Module definition and registration
  • cmake/OpenCVInstallLayout.cmake - Installation directory structure

OpenCV uses CMake as its primary build system, supporting cross-platform compilation from Windows to embedded ARM systems. The build configuration is highly modular, allowing fine-grained control over features, dependencies, and optimization levels.

Core Build Architecture

The root CMakeLists.txt orchestrates the entire build process through several phases:

  1. Initialization - Enforces out-of-source builds, sets CMake policies for compatibility
  2. Detection - Identifies compiler, platform, and available dependencies
  3. Configuration - Processes 100+ build options (WITH_CUDA, BUILD_SHARED_LIBS, etc.)
  4. Module Registration - Discovers and registers OpenCV modules
  5. Finalization - Generates configuration files and installation rules

Build Options & Features

OpenCV provides extensive customization through CMake options:

Core Options:

  • BUILD_SHARED_LIBS - Build dynamic libraries instead of static
  • CMAKE_BUILD_TYPE - Release (default) or Debug
  • ENABLE_PIC - Position-independent code for shared libraries

Acceleration & Hardware:

  • WITH_CUDA - NVIDIA GPU support (requires CUDA toolkit)
  • WITH_OPENCL - GPU compute via OpenCL
  • WITH_IPP - Intel Performance Primitives
  • WITH_OPENVX - OpenVX acceleration framework
  • WITH_HALIDE - Halide compiler backend

3rd-Party Libraries:

  • BUILD_ZLIB, BUILD_PNG, BUILD_JPEG - Build from source or use system libraries
  • WITH_FFMPEG - Video codec support
  • WITH_PROTOBUF - Protocol buffers for DNN module

Platform-Specific:

  • ANDROID_ABI - ARM, ARM64, x86, x86_64
  • ENABLE_NEON - ARM NEON intrinsics
  • WITH_CAROTENE - ARM acceleration library

Compiler Configuration

The OpenCVCompilerOptions.cmake file manages compiler-specific settings:

# Example: Enable ccache for faster rebuilds
cmake -DENABLE_CCACHE=ON ..

# Example: Custom compiler flags
cmake -DOPENCV_EXTRA_CXX_FLAGS="-march=native" ..

Key features include:

  • ccache Integration - Automatic caching of compilation results
  • Precompiled Headers - Faster builds on MSVC (disabled with Clang)
  • Link-Time Optimization - ENABLE_LTO for smaller, faster binaries
  • Sanitizers - Memory and address sanitizer support via OPENCV_ENABLE_MEMORY_SANITIZER

Module System

Modules are discovered via ocv_register_modules() and can be selectively built:

# Build only specific modules
cmake -DBUILD_LIST="core,imgproc,dnn" ..

Each module has its own CMakeLists.txt defining sources, dependencies, and tests. The module system supports:

  • Conditional compilation based on available dependencies
  • Inter-module dependency resolution
  • Automatic test discovery and registration

Installation & Package Configuration

The build generates OpenCVConfig.cmake for downstream projects:

# In your project
find_package(OpenCV REQUIRED core videoio)
target_link_libraries(my_app ${OpenCV_LIBS})

Installation layout is controlled by OpenCVInstallLayout.cmake, supporting:

  • Standard Unix paths (/usr/local)
  • Windows package layouts
  • Android NDK integration
  • Framework bundles on macOS/iOS

CMake Hooks System

OpenCV provides an extensibility mechanism via CMake hooks. Custom scripts can be registered at specific build phases:

# Register hooks in OPENCV_CMAKE_HOOKS_DIR
# Hooks are called at: CMAKE_INIT, PRE_CMAKE_BOOTSTRAP, POST_COMPILER_OPTIONS, etc.

This allows third-party integrations without modifying core CMake files.

Cross-Compilation

For embedded targets, platform-specific files in cmake/platforms/ configure:

  • System detection (Android, iOS, Linux, Windows)
  • Toolchain settings
  • Architecture-specific optimizations
  • Semihosting support for ARM bare-metal

Example cross-compilation:

cmake -DCMAKE_TOOLCHAIN_FILE=android.toolchain.cmake \
      -DANDROID_ABI=arm64-v8a \
      -DANDROID_NATIVE_API_LEVEL=21 ..

Video I/O & Capture

Relevant Files
  • modules/videoio/include/opencv2/videoio.hpp
  • modules/videoio/src/backend.hpp
  • modules/videoio/src/cap_interface.hpp
  • modules/videoio/doc/videoio_overview.markdown

The Video I/O module provides a unified C++ API for capturing video from cameras and files, and writing video output. It abstracts multiple backend implementations, allowing developers to work with a consistent interface regardless of the underlying platform or hardware.

Core Classes

VideoCapture is the primary class for reading video data. It supports three input modes:

  1. Camera capture by index (e.g., VideoCapture(0) for the default camera)
  2. File/stream reading from video files or image sequences
  3. Stream-based input using custom IStreamReader implementations

VideoWriter handles video output, encoding frames to disk or streaming. Both classes support backend selection via apiPreference parameter and property configuration through VideoCaptureProperties and VideoWriterProperties enums.

Backend Architecture

OpenCV uses a pluggable backend system with two types:

  • Built-in backends: Compiled directly into OpenCV (FFmpeg, GStreamer, MSMF, V4L, etc.)
  • Plugin backends: Dynamically loaded at runtime (GStreamer, FFmpeg on Linux; MediaSDK on Windows/Linux)

The backend selection follows this priority order:

  1. Modern multi-platform libraries (FFmpeg, GStreamer, MediaSDK)
  2. Platform-specific SDKs (WINRT, AVFoundation, MSMF, V4L)
  3. RGB-D sensors (OpenNI2, RealSense, OBSensor)
  4. File-based backends (image sequences, Motion JPEG)
  5. Specialized camera SDKs (DC1394, XIMEA, Aravis, uEye)

Common Usage Patterns

// Capture from default camera
cv::VideoCapture cap(0);
cv::Mat frame;
while (cap.read(frame)) {
    // Process frame
}

// Specify backend explicitly
cv::VideoCapture cap(filename, cv::CAP_FFMPEG);

// Write video with codec
cv::VideoWriter writer("output.mp4", cv::VideoWriter::fourcc('m','p','4','v'),
                       30.0, cv::Size(640, 480));
writer.write(frame);

Property Management

Both capture and writer support querying and setting properties:

  • Capture properties: Frame dimensions, FPS, position, codec, frame count
  • Writer properties: Encoder parameters, quality settings, hardware acceleration

Properties are backend-dependent; not all backends support all properties. Use get() and set() methods to access them.

Backend Configuration

Enable backends at build time using CMake options:

cmake -DWITH_GSTREAMER=ON -DWITH_FFMPEG=ON ..

For plugin backends, add to the plugin list:

cmake -DWITH_GSTREAMER=ON -DVIDEOIO_PLUGIN_LIST=gstreamer ..

Query available backends at runtime using cv::videoio_registry::getBackends(), hasBackend(), and getBackendName().

Hardware Acceleration

Some backends (MSMF, Intel MediaSDK) support hardware-accelerated encoding/decoding. The MSMF backend attempts hardware transforms by default; disable via the OPENCV_VIDEOIO_MSMF_ENABLE_HW_TRANSFORMS=0 environment variable if needed.

Advanced Features

  • Multi-stream capture: Use VideoCapture::waitAny() to efficiently wait for frames from multiple sources
  • Custom streams: Implement IStreamReader for non-file sources
  • Audio support: Some backends (FFmpeg, GStreamer) support audio alongside video
  • Image sequences: Use CAP_IMAGES backend to read/write numbered image files (e.g., img_%02d.jpg)

Graph API & Execution Framework

Relevant Files
  • modules/gapi/doc/00-root.markdown
  • modules/gapi/doc/10-hld-overview.md
  • modules/gapi/include/opencv2/gapi.hpp
  • modules/gapi/src/compiler/gcompiler.hpp
  • modules/gapi/src/executor/gabstractexecutor.hpp

G-API (Graph API) is OpenCV's graph-based execution framework designed for fast, portable image processing pipelines. Unlike traditional function-by-function OpenCV calls, G-API captures entire computation sequences as directed acyclic graphs (DAGs), enabling pipeline-level optimizations and seamless backend portability.

Core Architecture

G-API follows a three-layer architecture:

  1. API Layer — User-facing interface with G-API data types (GMat, GScalar, GArray<T>, GFrame, GOpaque<T>) and operations. Graphs are built implicitly through expressions; no actual computation occurs during graph construction.

  2. Graph Compiler Layer — Built on the ADE Framework, this layer unrolls user expressions into a bipartite graph (Data and Operation nodes), applies optimization passes, and organizes operations into execution "Islands" based on backend affinity.

  3. Backends Layer — Platform-specific implementations (CPU, Fluid, OpenCL, etc.) that execute compiled graphs optimally for their target device.

Graph Compilation Pipeline


Compilation happens in two ways:

  • Implicit compilation: GComputation::apply() compiles and executes in one call (useful when input formats vary)
  • Explicit compilation: GComputation::compile() returns a reusable GCompiled object (recommended for production)

Data Types & Kernels

G-API provides five dynamic data types for graph construction:

  • GMat — Image matrices (maps to cv::Mat, cv::UMat, cv::RMat at runtime)
  • GScalar — Scalar values (maps to cv::Scalar)
  • GArray<T> — Dynamic lists (maps to std::vector<T>)
  • GFrame — Media frames in various formats (NV12, I420, BGR)
  • GOpaque<T> — Arbitrary user types

Kernels define operation interfaces using the G_TYPED_KERNEL() macro. Each kernel specifies a signature, metadata function, and unique identifier. Multiple implementations can exist for the same kernel interface across different backends, enabling the same graph to run on CPU, GPU, or specialized hardware without modification.

Execution Model

Compiled graphs execute as stateless functions — identical inputs always produce identical outputs. The executor manages data dependencies, triggers backend-specific executables when inputs are ready, and handles cross-Island data exchange via host buffers. G-API supports both single-threaded and threaded execution modes.

Key Benefits

  • Optimization — Tiling and data locality improvements applied automatically
  • Portability — Write once, deploy anywhere via backend selection
  • Heterogeneous Processing — Mix multiple backends in a single graph
  • Streaming — Native support for continuous frame processing pipelines