📋 Changelog

All notable changes to the GPU SpMV project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[1.0.0] - 2025-04-16

🎉 First Stable Release

✨ Added

Core Features

Full CSR (Compressed Sparse Row) sparse matrix format support
Full ELL (ELLPACK) sparse matrix format support
Four CUDA kernels: Scalar CSR, Vector CSR, Merge Path, ELL
Automatic kernel selection based on matrix statistics
Texture cache support with execution context reuse
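
For readers new to the CSR layout listed above, the following host-side C++ sketch shows how the three CSR arrays drive a matrix-vector product. It is a reference illustration only, not the project's GPU kernel; the names `CsrMatrix` and `reference_spmv_csr` are hypothetical and the project's own types may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side CSR container; the project's own type may differ.
struct CsrMatrix {
    std::size_t num_rows = 0;
    std::size_t num_cols = 0;
    std::vector<int> row_ptr;    // size num_rows + 1; row r owns [row_ptr[r], row_ptr[r + 1])
    std::vector<int> col_idx;    // size nnz; column of each stored value
    std::vector<float> values;   // size nnz; the stored non-zeros
};

// Computes y = A * x one row at a time. A scalar CSR GPU kernel typically
// gives this per-row loop to a single thread.
std::vector<float> reference_spmv_csr(const CsrMatrix& a, const std::vector<float>& x) {
    assert(a.row_ptr.size() == a.num_rows + 1);
    assert(x.size() == a.num_cols);
    std::vector<float> y(a.num_rows, 0.0f);
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        float sum = 0.0f;
        for (int k = a.row_ptr[row]; k < a.row_ptr[row + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[row] = sum;
    }
    return y;
}
```

A host reference like this is also a convenient baseline when checking GPU results in tests.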

Performance & Testing

Bandwidth metrics calculation
Benchmarking framework
GPU vs CPU performance comparison
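
As an illustration of what a bandwidth metric for SpMV can look like, the sketch below estimates the bytes moved by one CSR SpMV and converts a measured time into GB/s. The byte accounting (each array touched exactly once, 32-bit indices, single-precision values) is an assumption made here, and the function names are hypothetical; the project's own metric may count traffic differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes moved by one CSR SpMV with float values and
// 32-bit indices, assuming every array is read or written exactly once
// (cache reuse of x is ignored).
double csr_spmv_bytes(std::size_t rows, std::size_t cols, std::size_t nnz) {
    const double value_bytes = sizeof(float);
    const double index_bytes = sizeof(std::int32_t);
    return nnz * (value_bytes + index_bytes)   // values + col_idx
         + (rows + 1) * index_bytes            // row_ptr
         + cols * value_bytes                  // input vector x
         + rows * value_bytes;                 // output vector y
}

// Converts bytes and elapsed milliseconds to GB/s (1 GB = 1e9 bytes).
// bytes per millisecond divided by 1e6 equals gigabytes per second.
double effective_bandwidth_gbps(double bytes, double elapsed_ms) {
    assert(elapsed_ms > 0.0);
    return bytes / elapsed_ms / 1.0e6;
}
```

Comparing this figure against the device's peak memory bandwidth gives a rough sense of how close a kernel is to being memory-bound.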

Applications

PageRank graph algorithm implementation
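
PageRank fits an SpMV library because each iteration is one sparse matrix-vector product plus a damping term. The host-side sketch below shows that structure under stated assumptions: the transition matrix is column-stochastic, stored in CSR, and the graph has no dangling nodes. The function name and parameters are hypothetical and do not describe the project's actual PageRank interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: PageRank as repeated SpMV on the host.
// The CSR arrays hold the column-stochastic transition matrix:
// entry (i, j) = 1 / out_degree(j) when page j links to page i.
// Assumes every page has at least one outgoing link (no dangling nodes).
std::vector<double> pagerank_sketch(const std::vector<int>& row_ptr,
                                    const std::vector<int>& col_idx,
                                    const std::vector<double>& values,
                                    double damping = 0.85,
                                    double tolerance = 1e-10,
                                    int max_iterations = 100) {
    assert(!row_ptr.empty());
    const std::size_t n = row_ptr.size() - 1;
    std::vector<double> rank(n, 1.0 / n), next(n);
    for (int it = 0; it < max_iterations; ++it) {
        double change = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double sum = 0.0;  // row i of the SpMV: sum over pages linking to i
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
                sum += values[k] * rank[col_idx[k]];
            }
            next[i] = damping * sum + (1.0 - damping) / n;
            change += std::fabs(next[i] - rank[i]);
        }
        rank.swap(next);
        if (change < tolerance) break;  // L1 change between iterations is small enough
    }
    return rank;
}
```

In a GPU version, the inner double loop is the part an SpMV kernel would replace; the damping update is a simple element-wise operation.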

Engineering Quality

RAII resource management (CudaBuffer, CudaTimer, SpMVExecutionContext)
Semantic error code system
Comprehensive Google Test suite
CMake Presets build configuration
GitHub Actions CI/CD

🔒 Security

Integer overflow protection
Memory safety checks
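
One common form of integer overflow protection in sparse-matrix code is validating sizes before they reach an allocator or a 32-bit index. The sketch below illustrates the idea; `checked_byte_count` and `fits_in_int32` are hypothetical helpers written for this example, not functions confirmed to exist in the project.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: writes count * element_size to *out_bytes and returns
// true, or returns false (leaving *out_bytes untouched) if the product would
// overflow std::size_t.
bool checked_byte_count(std::size_t count, std::size_t element_size, std::size_t* out_bytes) {
    if (element_size != 0 &&
        count > std::numeric_limits<std::size_t>::max() / element_size) {
        return false;
    }
    *out_bytes = count * element_size;
    return true;
}

// Hypothetical helper: true if a non-zero count or dimension can be stored
// in a signed 32-bit index array without wrapping.
bool fits_in_int32(std::size_t n) {
    return n <= static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
}
```

Checking before multiplying, rather than inspecting the result afterwards, avoids relying on wrapped values at all.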

🚀 Performance

Column-major ELL storage for coalesced memory access
Warp-level shuffle reduction
Load balancing optimization (Merge Path)
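
To show why column-major ELL storage helps coalescing, the host-side sketch below spells out the index arithmetic. With one GPU thread per row, all threads reading slot `k` touch consecutive addresses. The `EllMatrix` layout, the `-1` padding marker, and the function names are assumptions made for this example and may not match the project's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side ELL container with column-major storage.
struct EllMatrix {
    std::size_t num_rows = 0;
    std::size_t max_row_nnz = 0;   // padded width shared by every row
    std::vector<int> col_idx;      // size num_rows * max_row_nnz; -1 marks padding
    std::vector<float> values;     // same size; 0.0f in padding slots
};

// Column-major: slot k of row r lives at k * num_rows + r, so for a fixed k
// neighbouring rows (neighbouring GPU threads) touch neighbouring addresses.
inline std::size_t ell_index(const EllMatrix& a, std::size_t row, std::size_t k) {
    return k * a.num_rows + row;
}

// Host reference for y = A * x over the padded ELL layout.
std::vector<float> reference_spmv_ell(const EllMatrix& a, const std::vector<float>& x) {
    assert(a.col_idx.size() == a.num_rows * a.max_row_nnz);
    assert(a.values.size() == a.col_idx.size());
    std::vector<float> y(a.num_rows, 0.0f);
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        for (std::size_t k = 0; k < a.max_row_nnz; ++k) {
            const std::size_t i = ell_index(a, row, k);
            if (a.col_idx[i] >= 0) {  // skip padding slots
                y[row] += a.values[i] * x[a.col_idx[i]];
            }
        }
    }
    return y;
}
```

A row-major layout would instead place index `row * max_row_nnz + k` under each thread, so adjacent threads would be `max_row_nnz` elements apart on every load.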

[0.1.0] - 2025-03-01

🚀 Initial Release

Basic project structure
Basic CSR matrix implementation
Simple SpMV GPU kernel
CMake build configuration

Version History

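| Version | Date       | Status   | Highlights                                  |
|---------|------------|----------|---------------------------------------------|
| 1.0.0   | 2025-04-16 | Stable   | First stable release, complete feature set  |
| 0.1.0   | 2025-03-01 | Archived | Initial prototype                           |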

Migration Guide

Upgrading to 1.0.0

No breaking changes from pre-release versions.

Recommended Updates

Use Named Constants

// Old
config.block_size = 256;
   
// New
config.block_size = spmv::DEFAULT_BLOCK_SIZE;

Use Execution Context for Reuse

// Old
for (int i = 0; i < iterations; i++) {
    spmv_csr(csr, d_x, d_y, &config, cols);
}
   
// New
SpMVExecutionContext context;
for (int i = 0; i < iterations; i++) {
    spmv_csr(csr, d_x, d_y, &config, cols, &context);
}

Future Roadmap

Planned [1.1.0]

COO format support
Hybrid CSR/ELL format
Multi-GPU support
Batched SpMV operations

Under Consideration

BFloat16 precision support
Automatic format selection tuning
Python bindings
