Release v0.1.0

Release Date: 2025-01-01

🌐 English

Overview

Mini-Inference Engine v0.1.0 is the initial release with basic GEMM functionality and project infrastructure.

🚀 Basic GEMM Kernels

Naive matrix multiplication implementation
Tiled GEMM with shared memory optimization
Roughly 10% (naive) to 20% (tiled) of cuBLAS throughput
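
For readers unfamiliar with the two kernel styles listed above, the following is a minimal sketch, not the project's actual source. The kernel names `gemm_naive` and `gemm_tiled`, the host helper `run_gemm`, the 16x16 tile size, and the row-major float layout are all assumptions made for illustration.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 16;  // assumed tile width; the release notes do not state one

// Naive GEMM: one thread per output element, all operands read from global memory.
// Computes C = A * B for row-major A (M x K), B (K x N), C (M x N).
__global__ void gemm_naive(const float* A, const float* B, float* C,
                           int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= M || col >= N) return;
    float acc = 0.0f;
    for (int k = 0; k < K; ++k) acc += A[row * K + k] * B[k * N + col];
    C[row * N + col] = acc;
}

// Tiled GEMM: each block stages TILE x TILE sub-matrices of A and B in shared
// memory, so each global value is loaded once per block instead of once per thread.
// Must be launched with blockDim = (TILE, TILE).
__global__ void gemm_tiled(const float* A, const float* B, float* C,
                           int M, int N, int K) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // Out-of-range elements are padded with zero so edge tiles stay correct.
        As[threadIdx.y][threadIdx.x] = (row < M && a_col < K) ? A[row * K + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (b_row < K && col < N) ? B[b_row * N + col] : 0.0f;
        __syncthreads();  // tile fully loaded before anyone reads it
        for (int k = 0; k < TILE; ++k) acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // everyone done reading before the next load overwrites it
    }
    if (row < M && col < N) C[row * N + col] = acc;
}

// Hypothetical host helper: copies inputs to the device, runs one kernel,
// and returns the result. Error checking is omitted here for brevity.
std::vector<float> run_gemm(bool use_tiled, const std::vector<float>& A,
                            const std::vector<float>& B, int M, int N, int K) {
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    std::size_t c_count = static_cast<std::size_t>(M) * N;
    cudaMalloc(&dA, A.size() * sizeof(float));
    cudaMalloc(&dB, B.size() * sizeof(float));
    cudaMalloc(&dC, c_count * sizeof(float));
    cudaMemcpy(dA, A.data(), A.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), B.size() * sizeof(float), cudaMemcpyHostToDevice);
    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE);
    if (use_tiled) gemm_tiled<<<grid, block>>>(dA, dB, dC, M, N, K);
    else           gemm_naive<<<grid, block>>>(dA, dB, dC, M, N, K);
    std::vector<float> C(c_count);
    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(C.data(), dC, c_count * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return C;
}
```

Both kernels should produce the same result for the same inputs. The tiled version only changes how operands reach the arithmetic units, which is where the gap between the two rows of the performance table would come from.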

🔧 Core Infrastructure

CUDA error handling utilities
Basic benchmark framework
Project structure setup
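
As an illustration of what the error handling utilities and benchmark framework might look like, here is a hedged sketch built only on documented CUDA runtime calls. The macro name `CUDA_CHECK`, the function `time_kernel_ms`, and the warmup and iteration counts are hypothetical; the release notes do not describe the real interfaces.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro: aborts with file and line on any CUDA error.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical benchmark helper: runs `launch` a few times to warm up, then
// times `iters` runs with CUDA events and returns the mean time per run in ms.
template <typename Launch>
float time_kernel_ms(Launch launch, int warmup = 3, int iters = 10) {
    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));
    for (int i = 0; i < warmup; ++i) launch();
    CUDA_CHECK(cudaEventRecord(start));
    for (int i = 0; i < iters; ++i) launch();
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));  // wait until all timed work is done
    CUDA_CHECK(cudaGetLastError());          // surface launch failures, if any
    float total_ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&total_ms, start, stop));
    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    return total_ms / iters;
}
```

Passing the launch as a callable keeps the timer independent of any particular kernel signature, so the same helper could time a custom kernel or a library call.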

📊 Performance

Kernel	Time (ms)	GFLOPS	vs cuBLAS
cuBLAS	0.31	6920	100%
Naive	3.10	694	10%
Tiled	1.55	1388	20%
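
The GFLOPS column follows from the standard GEMM operation count of 2·M·N·K floating-point operations (one multiply and one add per inner-product term). The sketch below shows that arithmetic; the helper names are hypothetical, and the release notes do not state the benchmarked matrix size. The 1024x1024x1024 size in the usage note is an assumption chosen because it lands close to the table's figures.

```cuda
// Throughput of an M x N x K GEMM that took `time_ms` milliseconds.
// A GEMM performs 2 * M * N * K floating-point operations.
double gemm_gflops(long long M, long long N, long long K, double time_ms) {
    double flops = 2.0 * static_cast<double>(M) * static_cast<double>(N) *
                   static_cast<double>(K);
    double seconds = time_ms * 1e-3;
    return flops / seconds / 1e9;
}

// Throughput expressed as a percentage of a baseline such as cuBLAS.
double percent_of_baseline(double gflops, double baseline_gflops) {
    return 100.0 * gflops / baseline_gflops;
}
```

With the assumed size, `gemm_gflops(1024, 1024, 1024, 0.31)` gives about 6927 GFLOPS, within rounding distance of the cuBLAS row, and `percent_of_baseline(694, 6920)` gives about 10%.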

Requirements

Dependency	Minimum
CUDA Toolkit	11.0
CMake	3.18
C++ Compiler	GCC 9 / Clang 10
GPU	Compute Capability 7.0+
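
The toolkit and GPU minimums above can be checked from code. The following is a minimal sketch, assuming hypothetical helper names `meets_min_capability` and `device_is_supported`; it is not taken from the project. It relies on the runtime's `CUDART_VERSION` macro (11000 for CUDA 11.0) and on `cudaGetDeviceProperties`.

```cuda
#include <cuda_runtime.h>

// Compile-time guard for the CUDA Toolkit 11.0 minimum.
static_assert(CUDART_VERSION >= 11000, "CUDA Toolkit 11.0 or newer is required");

// True when compute capability (major, minor) is at least (min_major, min_minor).
// Defaults reflect the 7.0 minimum from the requirements table.
bool meets_min_capability(int major, int minor,
                          int min_major = 7, int min_minor = 0) {
    return major > min_major || (major == min_major && minor >= min_minor);
}

// Queries the given device and applies the check above.
// Returns false if the device cannot be queried (for example, no GPU present).
bool device_is_supported(int device = 0) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return meets_min_capability(prop.major, prop.minor);
}
```

Calling `device_is_supported()` at startup lets a program fail with a clear message instead of an opaque kernel launch error on older hardware.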

🔗 Links

v1.1.0 v1.0.0 v0.2.0