Developer Guide

Name: Tiny-LLM
Author: LessUp

Development environment setup and contribution guidelines.

Development Environment
Build System
Testing
Code Style
Contributing

Development Environment

Prerequisites

Tool	Minimum	Recommended
CUDA Toolkit	11.0	12.0+
CMake	3.18	3.25+
GCC	9.4	11+
Clang	10	14+
Python	3.8	3.10+

IDE Setup

VS Code

Recommended extensions:

ms-vscode.cpptools — C/C++ extension
llvm-vs-code-extensions.vscode-clangd — Clangd language server
NVIDIA.nsight-vscode-edition — CUDA support

class="highlight">

1
2
3
4
5
6
7
// .vscode/settings.json
{
    "cmake.configureSettings": {
        "CMAKE_EXPORT_COMPILE_COMMANDS": "ON"
    },
    "C_Cpp.default.configurationProvider": "ms-vscode.cmake-tools"
}
   CLion 
 class="highlight">1
2
Settings → Build, Execution, Deployment → Toolchains
→ Add → CUDA (set CUDA path)
   Build System 
   Debug Build 
 class="highlight">1
2
3
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON
make -j$(nproc)
   Release Build with Debug Info 
 class="highlight">1
2
cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo
make -j$(nproc)
   Sanitizer Build 
 class="highlight">1
2
3
cmake .. -DCMAKE_BUILD_TYPE=Debug \
         -DCMAKE_CXX_FLAGS="-fsanitize=address,undefined"
make -j$(nproc)
   Cross-Compilation 
 class="highlight">1
2
3
4
5
# For specific architecture
cmake .. -DCUDA_ARCH="75;80;86"

# For all common architectures
cmake .. -DCUDA_ARCH="70;75;80;86;89;90"
   Testing 
   Run All Tests 
 class="highlight">1
ctest --output-on-failure
   Run Specific Test 
 class="highlight">1
./tests/tiny_llm_test --gtest_filter="W8A16MatmulTest.*"
   Debug CUDA Kernels 
 class="highlight">1
2
3
4
5
6
7
8
# Enable CUDA launch blocking for debugging
CUDA_LAUNCH_BLOCKING=1 ./tests/tiny_llm_test

# CUDA memcheck
cuda-memcheck ./tests/tiny_llm_test

# Compute sanitizer (CUDA 11.1+)
compute-sanitizer --tool memcheck ./tests/tiny_llm_test
   Profiling 
 class="highlight">1
2
3
4
5
# Nsight Compute
ncu -o profile.ncu-rep ./test_kernel

# Nsight Systems
nsys profile -o profile.qdrep ./test_app
   Code Style 
   C++ Style Guide 
 We follow the Google C++ Style Guide with modifications:
    Rule  Convention  
 
   Naming  CamelCase for classes, snake_case for functions/variables  
  Members  trailing underscore for private members: member_  
  Constants  kCamelCase for constants, SCREAMING_SNAKE for macros  
  Indent  4 spaces (no tabs)  
  Line length  100 characters  
 
 
   Example 
 class="highlight">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class MyClass {
public:
    explicit MyClass(int size);
    
    void doSomething(const std::string& input);
    
    int getSize() const { return size_; }
    
private:
    int size_;
    std::vector<float> data_;
};

namespace tiny_llm {

constexpr int kDefaultGroupSize = 128;

Result<float> computeValue(int input) {
    if (input < 0) {
        return Result<float>::err("Negative input");
    }
    return Result<float>::ok(std::sqrt(input));
}

}  // namespace tiny_llm
   Formatting 
 Use clang-format with the project’s .clang-format file:
 class="highlight">1
2
3
4
5
6
# Format a file
clang-format -i src/myfile.cpp

# Format all source files
find src tests kernels -name "*.cpp" -o -name "*.h" -o -name "*.cu" -o -name "*.cuh" \
    | xargs clang-format -i
   Contributing 
   Workflow 
  Fork & Clone class="highlight">1
2
git clone https://github.com/your-username/tiny-llm.git
cd tiny-llm
   Create Branch class="highlight">1
git checkout -b feature/my-feature
   Make Changes  Write code
 Add tests
 Update documentation
 
 
 Commit class="highlight">1
git commit -m "feat: add new feature"
   Push & PR class="highlight">1
git push origin feature/my-feature
  Then create a Pull Request on GitHub.
     Commit Message Format 
 Follow Conventional Commits:
 class="highlight">1
2
3
4
5
<type>(<scope>): <description>

[optional body]

[optional footer]
 Types:
  feat: New feature
 fix: Bug fix
 docs: Documentation changes
 style: Code style changes (formatting, semicolons)
 refactor: Code refactoring
 perf: Performance improvements
 test: Adding or correcting tests
 ci: CI/CD changes
 chore: Maintenance tasks
 
 Examples:
 class="highlight">1
2
3
4
5
6
7
8
9
feat(attention): add multi-head attention kernel

Implement optimized attention_decode kernel with online
softmax for improved numerical stability.

fix(kvcache): correct scale dimension calculation

The scale tensor was using incorrect dimension calculation
causing memory corruption in W8A16 matmul.
   PR Checklist 
  Tests pass locally
 Code follows style guide (clang-format)
 Documentation updated
 Commit messages follow convention
 PR description explains changes
 
   Review Process 
  All PRs require at least one review
 CI must pass (format check, build, tests)
 Address review feedback
 Squash commits if requested
 
    Languages: English  中文  
 
 
    ← API Reference  Benchmarks →  
 
 
  
  Back to top
     
   Quick Links
   Documentation
  Quick Start
 Architecture
 API Reference
 
 
  Resources
  Changelog
 Releases
 Troubleshooting
 
 
  Community
  GitHub
 Contributing
 Developer Guide
 
 
 
 
    📦 New version available!

Developer Guide

Table of Contents

Development Environment

Prerequisites

IDE Setup

VS Code

CLion

Build System

Debug Build

Release Build with Debug Info

Sanitizer Build

Cross-Compilation

Testing

Run All Tests

Run Specific Test

Debug CUDA Kernels

Profiling

Code Style

C++ Style Guide

Example

Formatting

Contributing

Workflow

Commit Message Format

PR Checklist

Review Process