CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Philosophy: OpenSpec Spec-Driven Development
This project follows OpenSpec Spec-Driven Development (SDD) framework. All code implementations must use the specification documents in the /openspec/specs/ directory as the Single Source of Truth.
OpenSpec Commands
Use these slash commands for the development workflow:
| Command | Purpose |
|---|---|
/opsx:explore |
Think through ideas, investigate problems |
/opsx:propose |
Create change with proposal, design, tasks |
/opsx:apply |
Implement tasks from a change |
/opsx:verify |
Verify implementation matches specs |
/opsx:archive |
Archive completed change |
/opsx:status |
Show status of changes and specs |
Specification Documents
/openspec/specs/product/: Product feature definitions and acceptance criteriagemm-optimization-requirements.md: Core requirements R1-R9implementation-plan.md: Implementation roadmap
/openspec/specs/architecture/: Technical design documents (RFCs)0001-core-architecture.md: System architecture design0002-memory-pool.md: Memory pool design0003-quantization.md: INT8 quantization0004-stream-manager.md: CUDA stream management0005-auto-tuner.md: Auto-tuning system0006-logger-config-profiler.md: Infrastructure components0007-half-precision-gemm.md: FP16 support0008-batch-gemm.md: Batched operations
/openspec/specs/api/: API interface definitions/openspec/specs/data/: Data schemas and model definitions/openspec/specs/testing/: BDD test specifications
See AGENTS.md for complete AI workflow instructions.
Build Commands
1
2
3
4
5
6
7
8
9
10
11
# Debug build with tests (recommended for development)
cmake --preset default
cmake --build --preset default
# Release build without tests
cmake --preset release
cmake --build --preset release
# CI build (Release + tests)
cmake --preset ci
cmake --build --preset ci
Test Commands
1
2
# Run tests (requires NVIDIA GPU)
ctest --preset default
Tests require an NVIDIA GPU with CUDA support and cannot run in CI.
Code Style
- Follow
.clang-format(Google-based, 4-space indent, 100 column limit) - Use
clang-format --style=file -i <file>to format
Naming Conventions
1
2
3
4
5
6
7
8
9
10
11
class ClassName; // PascalCase
void function_name(); // snake_case
int variable_name; // snake_case
const int CONSTANT_NAME; // UPPER_SNAKE_CASE
int member_variable_; // snake_case with trailing underscore
// CUDA specifics
__global__ void my_kernel(); // snake_case
template<int BLOCK_SIZE> // UPPER_SNAKE_CASE
__shared__ float s_data[256]; // s_ prefix for shared memory
float r_sum = 0.0f; // r_ prefix for registers
Commit Style
Follow Conventional Commits:
feat:new featuresfix:bug fixesdocs:documentationperf:performance improvementsrefactor:code refactoringtest:test changes
Environment Requirements
- CUDA Toolkit 11.0+
- CMake 3.18+
- C++17 compiler
- Target GPU architectures: 75, 80, 86, 89, 90 (Turing through Blackwell)