Mini-ImagePipe
High-performance DAG-based GPU Image Processing Pipeline
Mini-ImagePipe is a high-performance GPU image processing framework with DAG task scheduling, multi-stream execution, and CUDA-accelerated operators. It is designed for real-time video and batch image processing workflows.
Why Mini-ImagePipe?
| Feature | Mini-ImagePipe | OpenCV GPU | Custom CUDA |
|---|---|---|---|
| DAG Scheduling | ✅ | ❌ | Manual |
| Memory Pool | ✅ | ⚠️ Limited | Manual |
| Multi-Stream | ✅ | ⚠️ Limited | Manual |
| Zero-copy Pipeline | ✅ | ❌ | Manual |
| Easy API | ✅ | ✅ | ❌ |
| Error Propagation | ✅ | ❌ | Manual |
Features
🚀 GPU Accelerated
All operators are implemented in CUDA and launched asynchronously, keeping the GPU busy for real-time image processing.
🕸️ DAG Scheduling
Task dependencies are expressed as a directed acyclic graph (DAG). The scheduler derives the execution order from the graph and runs independent tasks in parallel.
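To illustrate how an execution order can be derived from a DAG, here is a minimal CPU-side sketch based on Kahn's algorithm. It is not the library's scheduler: the function name `buildLevels` and the edge-list representation are assumptions made for this example. Tasks are grouped into levels, and tasks within one level have no dependencies on each other.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Group the nodes of a DAG into levels: every node in a level depends only
// on nodes in earlier levels, so nodes that share a level can run in parallel.
// Returns an empty result if the graph contains a cycle.
// (Hypothetical helper for illustration, not part of Mini-ImagePipe.)
std::vector<std::vector<int>> buildLevels(
    int numNodes, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> children(numNodes);
    std::vector<int> pendingParents(numNodes, 0);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < numNodes);
        assert(e.second >= 0 && e.second < numNodes);
        children[e.first].push_back(e.second);
        ++pendingParents[e.second];
    }

    // Level 0: nodes without any dependency.
    std::vector<int> current;
    for (int n = 0; n < numNodes; ++n)
        if (pendingParents[n] == 0) current.push_back(n);

    std::vector<std::vector<int>> levels;
    int visited = 0;
    while (!current.empty()) {
        std::vector<int> next;
        for (int n : current) {
            ++visited;
            // A child becomes ready once its last parent has been scheduled.
            for (int c : children[n])
                if (--pendingParents[c] == 0) next.push_back(c);
        }
        levels.push_back(current);
        current = next;
    }
    if (visited != numNodes) levels.clear();  // a cycle left nodes unscheduled
    return levels;
}
```

For a diamond-shaped graph (0 → 1, 0 → 2, 1 → 3, 2 → 3) this yields three levels, with nodes 1 and 2 sharing the middle level.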
⚡ Multi-Stream Execution
Independent tasks run concurrently on separate CUDA streams. The scheduler assigns streams so that tasks without mutual dependencies can overlap on the GPU.
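One simple way to assign streams is to spread each group of mutually independent tasks round-robin over the available streams. The sketch below shows only that bookkeeping in plain C++. The name `assignStreams` and the level-based input are assumptions for this example, and the actual assignment policy of Mini-ImagePipe may differ.

```cpp
#include <cassert>
#include <vector>

// Assign a stream index to every node. `levels` holds groups of node ids whose
// members do not depend on each other, so each group is spread round-robin
// over the available streams.
// (Hypothetical helper for illustration, not part of Mini-ImagePipe.)
std::vector<int> assignStreams(const std::vector<std::vector<int>>& levels,
                               int numNodes, int numStreams) {
    assert(numStreams > 0);
    std::vector<int> streamOf(numNodes, -1);  // -1 marks "not assigned"
    for (const auto& level : levels) {
        int next = 0;
        for (int node : level) {
            assert(node >= 0 && node < numNodes);
            streamOf[node] = next;
            next = (next + 1) % numStreams;
        }
    }
    return streamOf;
}
```

In a CUDA implementation, a dependency that crosses two streams would typically be enforced by recording an event on the producer's stream with `cudaEventRecord` and waiting on it with `cudaStreamWaitEvent` on the consumer's stream.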
🧠 Memory Efficient
Pinned host memory and device memory are served from pools with a best-fit allocation strategy, so buffers are reused across pipeline runs instead of being reallocated.
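Best-fit allocation means picking the smallest idle block that is still large enough for the request. The following sketch shows that lookup with an ordered map. The class `BestFitIndex` is hypothetical and tracks block sizes only, not real memory. It is meant to show the strategy rather than the library's pool.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Index of idle blocks in a pool: block size -> number of idle blocks.
// (Hypothetical type for illustration; it tracks sizes, not real memory.)
class BestFitIndex {
public:
    // Return a block of the given size to the pool.
    void release(std::size_t blockSize) { ++free_[blockSize]; }

    // Take the smallest idle block that can hold `requested` bytes.
    // Returns the size of the chosen block, or 0 if no idle block is large
    // enough (the caller would then allocate a fresh block).
    std::size_t acquire(std::size_t requested) {
        assert(requested > 0);
        auto it = free_.lower_bound(requested);  // first size >= requested
        if (it == free_.end()) return 0;
        const std::size_t chosen = it->first;
        if (--(it->second) == 0) free_.erase(it);
        return chosen;
    }

private:
    std::map<std::size_t, int> free_;
};
```

In a real pool each entry would also carry the pointers obtained from `cudaMalloc` (device memory) or `cudaMallocHost` (pinned host memory), so a released buffer can be handed out again without a new allocation call.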
🔧 Separable Filtering
Gaussian blur runs as separate horizontal and vertical passes. For a k×k kernel this cuts the per-pixel work from k² to 2k multiply-adds (25 → 10 for 5×5, 49 → 14 for 7×7), so the gain grows with kernel size.
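The two-pass idea can be sketched on the CPU as follows. This is a reference-style illustration, not the CUDA kernel used by the library: the function names are hypothetical, the image is a single-channel `float` buffer, and the border rule (mirroring without repeating the edge pixel) is an assumption, since the exact reflection variant is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirror an out-of-range index back into [0, n) without repeating the edge
// pixel (-1 -> 1, n -> n - 2). Valid while the offset is smaller than n.
int reflect(int i, int n) {
    if (i < 0) return -i;
    if (i >= n) return 2 * n - 2 - i;
    return i;
}

// One 1D convolution pass along x (horizontal) or along y (vertical).
std::vector<float> blurPass(const std::vector<float>& src, int width, int height,
                            const std::vector<float>& kernel, bool horizontal) {
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int radius = static_cast<int>(kernel.size()) / 2;
    assert(radius < (horizontal ? width : height));
    std::vector<float> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int sx = horizontal ? reflect(x + k, width) : x;
                const int sy = horizontal ? y : reflect(y + k, height);
                sum += kernel[k + radius] * src[sy * width + sx];
            }
            dst[y * width + x] = sum;
        }
    }
    return dst;
}

// Separable blur: a horizontal pass followed by a vertical pass.
std::vector<float> separableBlur(const std::vector<float>& src, int width,
                                 int height, const std::vector<float>& kernel) {
    const std::vector<float> rows = blurPass(src, width, height, kernel, true);
    return blurPass(rows, width, height, kernel, false);
}
```

A 5-tap binomial kernel such as `{1, 4, 6, 4, 1} / 16` is a common approximation of a Gaussian and works as the `kernel` argument. Because the weights sum to 1, a constant image passes through unchanged.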
🛡️ Error Propagation
Task failures automatically propagate downstream along the DAG, giving dependent tasks a consistent basis for error handling and recovery.
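Downstream propagation can be pictured as a graph walk starting at the failed task. The sketch below marks every task reachable from the failure as skipped. The `TaskState` enum, the function name, and the choice to mark dependents as "skipped" are assumptions for illustration, since the library's own state model is not described here.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

enum class TaskState { Ok, Failed, Skipped };

// Mark the failed task as Failed and every task reachable from it as Skipped
// (breadth-first walk along the DAG edges). Unrelated tasks stay Ok.
// (Hypothetical helper for illustration, not part of Mini-ImagePipe.)
std::vector<TaskState> propagateFailure(
    int numNodes, const std::vector<std::pair<int, int>>& edges, int failedNode) {
    assert(failedNode >= 0 && failedNode < numNodes);
    std::vector<std::vector<int>> children(numNodes);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < numNodes);
        assert(e.second >= 0 && e.second < numNodes);
        children[e.first].push_back(e.second);
    }

    std::vector<TaskState> state(numNodes, TaskState::Ok);
    state[failedNode] = TaskState::Failed;
    std::queue<int> frontier;
    frontier.push(failedNode);
    while (!frontier.empty()) {
        const int node = frontier.front();
        frontier.pop();
        for (int child : children[node]) {
            if (state[child] != TaskState::Ok) continue;  // already marked
            state[child] = TaskState::Skipped;
            frontier.push(child);
        }
    }
    return state;
}
```

With edges 0 → 1, 1 → 2, and 0 → 3, a failure in task 1 marks task 2 as skipped while tasks 0 and 3 remain unaffected.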
Performance Benchmarks
| Operator | Image Size | Throughput | Latency |
|---|---|---|---|
| GaussianBlur 5×5 | 1920×1080 | 850+ FPS | ~1.2ms |
| Sobel Edge | 1920×1080 | 1200+ FPS | ~0.8ms |
| Resize (2× down) | 1920×1080 | 1500+ FPS | ~0.7ms |
| ColorConvert | 1920×1080 | 2000+ FPS | ~0.5ms |
| Pipeline (4 ops) | 1920×1080 | 400+ FPS | ~2.5ms |
Benchmarked on NVIDIA RTX 3090. Your results may vary based on GPU model and configuration.
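The Throughput and Latency columns are consistent with each other if frames are processed back to back, one at a time: latency in milliseconds is then roughly 1000 divided by the frame rate. That reading is an assumption about how the numbers were measured, and the helper name below is hypothetical.

```cpp
#include <cassert>

// Per-frame latency in milliseconds implied by a frame rate, assuming frames
// are processed back to back with no overlap between consecutive frames.
double latencyMsFromFps(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

For example, 400 FPS corresponds to 2.5 ms per frame and 2000 FPS to 0.5 ms, matching the Pipeline and ColorConvert rows.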
Pipeline Architecture
graph LR
Input[📥 Input Image] --> Resize[🔄 Resize]
Resize --> Blur[🌀 Gaussian Blur]
Blur --> Sobel[📐 Sobel Edge]
Sobel --> Output[📤 Output]
style Input fill:#76B900,stroke:#5A8F00,color:#1a1a1a
style Output fill:#76B900,stroke:#5A8F00,color:#1a1a1a
style Resize fill:#2d2d2d,stroke:#76B900
style Blur fill:#2d2d2d,stroke:#76B900
style Sobel fill:#2d2d2d,stroke:#76B900
Use Cases
📹 Real-time Video Processing
Live video filters, effects, and streaming applications
🚗 Autonomous Driving
Perception pipeline preprocessing for sensor fusion
🏥 Medical Imaging
DICOM image processing and analysis workflows
🤖 Embedded AI
Jetson platform deployment for edge computing
Quick Start
# Clone the repository
git clone https://github.com/LessUp/mini-image-pipe.git
cd mini-image-pipe
# Build with CMake presets (Release)
cmake --preset release
cmake --build --preset release
# Run tests
ctest --preset release
# Run demo
./build/demo_pipeline
Usage Example
#include <memory>  // std::make_shared

#include "pipeline.h"
#include "operators/resize.h"
#include "operators/gaussian_blur.h"
#include "operators/sobel.h"

using namespace mini_image_pipe;

int main() {
    // d_input (device pointer to the input image), width, height, and
    // channels must be set up before this point; that code is omitted here.

    // Configure pipeline with 4 CUDA streams
    PipelineConfig config;
    config.numStreams = 4;
    Pipeline pipeline(config);

    // Create operators
    auto resize = std::make_shared<ResizeOperator>(320, 240);
    auto blur = std::make_shared<GaussianBlurOperator>(GaussianKernelSize::KERNEL_5x5);
    auto sobel = std::make_shared<SobelOperator>();

    // Build the DAG; addOperator returns the node id used by connect()
    int n1 = pipeline.addOperator("Resize", resize);
    int n2 = pipeline.addOperator("Blur", blur);
    int n3 = pipeline.addOperator("Sobel", sobel);
    pipeline.connect(n1, n2);  // Resize → Blur
    pipeline.connect(n2, n3);  // Blur → Sobel

    // Set input on the first node and execute the whole graph
    pipeline.setInput(n1, d_input, width, height, channels);
    pipeline.execute();

    // Get the output of the last node
    void* output = pipeline.getOutput(n3);
    return 0;
}
GPU Architecture Support
| Architecture | Compute | Example GPUs |
|---|---|---|
| Volta | sm_70 | V100 |
| Turing | sm_75 | RTX 2080, T4 |
| Ampere | sm_80, sm_86 | A100, RTX 3090 |
| Ada Lovelace | sm_89 | RTX 4090, L40 |
| Hopper | sm_90 | H100 |
Available Operators
| Operator | Function | Features |
|---|---|---|
| GaussianBlur | Gaussian blur | 3×3/5×5/7×7 separable filter, reflection boundary padding |
| Sobel | Edge detection | 3×3 Sobel kernels, gradient magnitude output |
| Resize | Image scaling | Bilinear / nearest-neighbor interpolation |
| ColorConvert | Color conversion | RGB↔Gray, BGR↔RGB, RGBA→RGB |
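As a reference for what "3×3 Sobel kernels, gradient magnitude output" computes, here is a small CPU sketch. It is not the library's CUDA kernel: the function name is hypothetical, the input is assumed to be a single-channel `float` image, and border pixels are handled by clamping coordinates, which may differ from the operator's actual border rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gradient magnitude sqrt(gx^2 + gy^2) using the 3x3 Sobel kernels.
// (Hypothetical CPU reference for illustration, not part of Mini-ImagePipe.)
std::vector<float> sobelMagnitude(const std::vector<float>& src, int width,
                                  int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    // Read a pixel, clamping coordinates to the image border.
    auto at = [&](int x, int y) -> float {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return src[static_cast<std::size_t>(y) * width + x];
    };

    std::vector<float> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Horizontal gradient: right column minus left column, weights 1-2-1.
            const float gx =
                (at(x + 1, y - 1) + 2.0f * at(x + 1, y) + at(x + 1, y + 1)) -
                (at(x - 1, y - 1) + 2.0f * at(x - 1, y) + at(x - 1, y + 1));
            // Vertical gradient: bottom row minus top row, weights 1-2-1.
            const float gy =
                (at(x - 1, y + 1) + 2.0f * at(x, y + 1) + at(x + 1, y + 1)) -
                (at(x - 1, y - 1) + 2.0f * at(x, y - 1) + at(x + 1, y - 1));
            dst[static_cast<std::size_t>(y) * width + x] =
                std::sqrt(gx * gx + gy * gy);
        }
    }
    return dst;
}
```

On an image with a vertical step from 0 to 1, pixels next to the step get a magnitude of 4, while flat regions yield 0.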
Requirements
- CMake >= 3.18
- CUDA Toolkit >= 11.0 (targeting sm_86 requires 11.1 or newer; sm_89 and sm_90 require 11.8 or newer)
- C++17 compatible compiler
- GTest v1.14.0 (auto-fetched via FetchContent)
License
This project is licensed under the MIT License.
Contributing
We welcome contributions! Please see our Contributing Guide for details.