Thank you for your interest in Mini-Inference Engine! This document describes how to contribute to the project.
Code of Conduct
Respect all contributors
Maintain professional and constructive discussions
Accept constructive criticism
Act in the best interest of the project
How to Contribute
Reporting Bugs
Search the existing Issues to make sure the bug has not already been reported
Create a new issue that includes:
Description: A clear description of the problem
Reproduction steps: How to reproduce it
Expected behavior: What should happen
Actual behavior: What actually happened
Environment: GPU model, CUDA version, OS
Feature Requests
Describe the feature and its use case
Explain why the feature is valuable
Suggest a possible implementation approach (optional)
Submitting Code
1. Fork the Repository
git clone https://github.com/<your-username>/mini-inference-engine.git
cd mini-inference-engine
git remote add upstream https://github.com/LessUp/mini-inference-engine.git
2. Create Branch
# Feature branch
git checkout -b feature/my-feature
# Or fix branch
git checkout -b fix/bug-description
3. Write Code
Follow the conventions in the Code Style section.
4. Test
# Debug + tests (the --preset options for build and ctest need CMake 3.20 or newer)
cmake --preset default
cmake --build --preset default
ctest --preset default
# Release build
cmake --preset release
cmake --build --preset release
5. Commit
git add .
git commit -m "feat: add new feature"
Commit messages follow the Conventional Commits format:
| Type | Description |
| --- | --- |
| feat: | New feature |
| fix: | Bug fix |
| docs: | Documentation update |
| perf: | Performance improvement |
| refactor: | Code refactoring |
| test: | Test-related |
| chore: | Build/tool changes |
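
A subject line can be checked mechanically against the commit types listed above. The following is a minimal sketch, not part of the project: `is_conventional_subject` is a hypothetical helper that accepts `type: description` and `type(scope): description`. It does not cover every part of the Conventional Commits specification (for example, the `!` breaking-change marker is not handled).

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of the project): returns true when the
// subject starts with one of the listed types, optionally followed by a
// "(scope)", then ": " and a non-empty description.
bool is_conventional_subject(std::string_view subject) {
    static constexpr std::array<std::string_view, 7> TYPES = {
        "feat", "fix", "docs", "perf", "refactor", "test", "chore"};

    for (std::string_view type : TYPES) {
        if (subject.substr(0, type.size()) != type) {
            continue;
        }
        std::string_view rest = subject.substr(type.size());

        // Optional scope such as "(gemm)".
        if (!rest.empty() && rest.front() == '(') {
            const std::size_t close = rest.find(')');
            if (close == std::string_view::npos || close == 1) {
                return false;  // unterminated or empty scope
            }
            rest.remove_prefix(close + 1);
        }

        // Require ": " followed by at least one character of description.
        return rest.size() > 2 && rest.substr(0, 2) == ": ";
    }
    return false;
}
```

With this sketch, `feat: add new feature` and `fix(gemm): handle empty input` are accepted, while `feature: add thing` and `update readme` are rejected.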
6. Push and Create PR
git push origin feature/my-feature
Then create a Pull Request on GitHub.
Development Setup
Requirements
| Dependency | Minimum | Recommended |
| --- | --- | --- |
| CUDA Toolkit | 11.0 | 12.0+ |
| CMake | 3.18 | 3.25+ |
| C++ Compiler | GCC 9 / Clang 10 | GCC 11+ |
| GPU | Compute Capability 7.0+ | Compute Capability 8.0+ |
Verify Environment
# Check CUDA
nvcc --version
nvidia-smi
# Check CMake
cmake --version
# Check Compiler
gcc --version
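
The version numbers printed by these commands have to be compared component by component, since a plain string comparison would rank `3.9` above `3.18`. The sketch below illustrates that comparison; `parse_version` and `version_at_least` are hypothetical helpers, not part of the project.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits "12.4.1" into {12, 4, 1}. Parsing stops at
// the first character that is neither a digit nor a dot.
std::vector<int> parse_version(const std::string& text) {
    std::vector<int> parts;
    int current = 0;
    bool has_digit = false;
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            current = current * 10 + (c - '0');
            has_digit = true;
        } else if (c == '.' && has_digit) {
            parts.push_back(current);
            current = 0;
            has_digit = false;
        } else {
            break;
        }
    }
    if (has_digit) {
        parts.push_back(current);
    }
    return parts;
}

// Returns true when `found` is greater than or equal to `minimum`,
// comparing component by component and treating missing components as 0.
bool version_at_least(const std::string& found, const std::string& minimum) {
    const std::vector<int> a = parse_version(found);
    const std::vector<int> b = parse_version(minimum);
    const std::size_t n = a.size() > b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        const int x = i < a.size() ? a[i] : 0;
        const int y = i < b.size() ? b[i] : 0;
        if (x != y) {
            return x > y;
        }
    }
    return true;
}
```

For example, `version_at_least("3.25.1", "3.18")` is true and `version_at_least("3.9", "3.18")` is false.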
Code Style
C++ Style
// Naming conventions
class ClassName;                  // Class: PascalCase
void function_name();             // Function: snake_case
int variable_name;                // Variable: snake_case
const int CONSTANT_NAME = 0;      // Constant: UPPER_SNAKE_CASE
int member_variable_;             // Member: snake_case + trailing underscore

// Indentation: 4 spaces
void function() {
    if (condition) {
        // code
    }
}

// Braces: same line
if (condition) {
    // code
} else {
    // code
}
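
To show the conventions working together, here is a small sketch of a class written in this style. `RunningMean` and `MAX_SAMPLES` are hypothetical names chosen for illustration and do not exist in the project.

```cpp
#include <cstddef>

// Constant: UPPER_SNAKE_CASE
constexpr std::size_t MAX_SAMPLES = 1024;

// Class: PascalCase
class RunningMean {
public:
    // Function: snake_case. Returns false once MAX_SAMPLES is reached.
    bool add_sample(float sample_value) {
        if (sample_count_ >= MAX_SAMPLES) {
            return false;
        } else {
            sum_ += sample_value;
            ++sample_count_;
            return true;
        }
    }

    // Returns 0 when no samples have been added.
    float mean() const {
        return sample_count_ == 0 ? 0.0f
                                  : sum_ / static_cast<float>(sample_count_);
    }

    std::size_t sample_count() const { return sample_count_; }

private:
    // Members: snake_case with a trailing underscore
    float sum_ = 0.0f;
    std::size_t sample_count_ = 0;
};
```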
CUDA Style
// Kernel naming: snake_case
__global__ void my_kernel(...);

// Template parameters: UPPER_SNAKE_CASE
template <int BLOCK_SIZE, bool USE_SHARED>
__global__ void templated_kernel(...);

// Shared memory: s_ prefix
__shared__ float s_data[256];

// Register variables: r_ prefix
float r_sum = 0.0f;
Use clang-format:
clang-format --style=file -i <file>
Testing
Unit Tests
Every new feature needs tests:
#include <gtest/gtest.h>
#include "feature.h"

class FeatureTest : public ::testing::Test {
protected:
    void SetUp() override {
        CUDA_CHECK(cudaSetDevice(0));
    }
};

TEST_F(FeatureTest, BasicFunctionality) {
    EXPECT_EQ(expected, actual);
}
For performance-related changes, also add a benchmark test:
TEST_F(FeatureTest, Performance) {
    GpuTimer timer;
    // Warmup
    for (int i = 0; i < 5; i++) {
        function_under_test();
    }
    // Benchmark
    timer.start();
    for (int i = 0; i < 20; i++) {
        function_under_test();
    }
    timer.stop();
    float avg_time = timer.elapsed_ms() / 20;
    printf("Average time: %.3f ms\n", avg_time);
}
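
The warmup-then-average pattern above can be factored into a reusable helper. The following is a minimal host-side sketch using `std::chrono`; `average_time_ms` is a hypothetical name, not a project API. Because it measures wall-clock time on the CPU, it only reflects GPU work if the callable waits for that work to finish (for example by calling `cudaDeviceSynchronize()`), so a timer such as the `GpuTimer` used above remains the natural choice for kernels.

```cpp
#include <chrono>

// Hypothetical helper: calls `fn` `warmup_runs` times untimed, then returns
// the average wall-clock time in milliseconds over `timed_runs` calls.
// Returns 0.0 when `timed_runs` is not positive.
template <typename Fn>
double average_time_ms(Fn&& fn, int warmup_runs, int timed_runs) {
    // Warmup: lets caches, clocks, and lazy initialization settle.
    for (int i = 0; i < warmup_runs; ++i) {
        fn();
    }
    if (timed_runs <= 0) {
        return 0.0;
    }

    // Benchmark: time the whole loop once and divide by the run count.
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timed_runs; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timed_runs;
}
```

A call such as `average_time_ms([] { function_under_test(); }, 5, 20)` mirrors the 5 warmup and 20 timed iterations used in the test above.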
Documentation
/// @brief Execute optimized GEMM operation
/// @param A Input matrix A (M x K)
/// @param B Input matrix B (K x N)
/// @param C Output matrix C (M x N)
/// @param M Rows of A
/// @param N Columns of B
/// @param K Columns of A / Rows of B
/// @param stream CUDA stream (optional)
void launch_optimized_gemm(const float* A, const float* B, float* C,
                           int M, int N, int K, cudaStream_t stream = 0);
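
The documented shapes translate directly into a CPU reference that can be used to check GPU results in tests. This is a sketch only: `reference_gemm` is a hypothetical function, and it assumes dense row-major storage, which the comment above does not specify.

```cpp
#include <cstddef>

// Hypothetical CPU reference: computes C = A * B with A (M x K),
// B (K x N), and C (M x N). Assumes dense row-major storage.
void reference_gemm(const float* A, const float* B, float* C,
                    int M, int N, int K) {
    for (int row = 0; row < M; ++row) {
        for (int col = 0; col < N; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < K; ++k) {
                // A[row][k] * B[k][col] in row-major indexing.
                sum += A[static_cast<std::size_t>(row) * K + k] *
                       B[static_cast<std::size_t>(k) * N + col];
            }
            C[static_cast<std::size_t>(row) * N + col] = sum;
        }
    }
}
```

When comparing against a GPU kernel, an element-wise tolerance is usually more appropriate than exact equality, because floating-point summation order typically differs between implementations.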
README Updates
When adding new features, update README.md:
Feature list
Usage examples
API documentation
Submission Process
Review Process
1. Automatic checks: compilation passes, tests pass, code style check
2. Manual review: code quality, design rationale, documentation completeness
3. Performance validation (if applicable): no performance regression, new optimizations are effective
Merge Requirements
All CI checks pass
At least 1 reviewer approval
No unresolved review comments
*Last Updated: 2025-04-16
Document Version: v1.1.0*