Major Refactoring - v2.0.0
Date: 2026-03-09
Critical Bug Fixes
KVCache appendKV: layer-order-dependent write position logic
- appendKV() had fragile write position logic that depended on layer 0 being called first
- Layer 0 updated current_len, then other layers tried to compensate by subtracting num_tokens; this broke if layers were called in any other order
- Negative write_pos possible if layer 0 hadn't been called yet
- Fix: appendKV now always writes at current_len regardless of layer index (stateless per-layer)
- Added explicit advanceSeqLen() method to be called ONCE after all layers have appended
- Clean separation: append is stateless per-layer, length update is explicit per-step
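
The append/advance split described above can be sketched roughly as follows. This is a minimal host-side illustration, not the project's implementation: the class name KVCacheSketch, the buffer layout, the method signatures, the keyAt/currentLen accessors and the demoReverseLayerOrder helper are all assumptions made for the example. Only the rule itself comes from the notes above: every layer writes at current_len, and the length advances once per step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side stand-in for the real KVCache. Layout, names and
// signatures are assumptions for illustration, not the project's API.
class KVCacheSketch {
public:
    KVCacheSketch(int num_layers, int max_seq_len, int kv_dim)
        : num_layers_(num_layers), max_seq_len_(max_seq_len), kv_dim_(kv_dim),
          k_(static_cast<std::size_t>(num_layers) * max_seq_len * kv_dim, 0.0f),
          v_(static_cast<std::size_t>(num_layers) * max_seq_len * kv_dim, 0.0f) {}

    // Stateless per layer: every layer writes at current_len_, and
    // current_len_ is not modified here.
    void appendKV(int layer, const float* k, const float* v, int num_tokens) {
        if (layer < 0 || layer >= num_layers_) throw std::out_of_range("layer");
        if (num_tokens < 0 || current_len_ + num_tokens > max_seq_len_)
            throw std::length_error("KV cache overflow");
        const std::size_t count = static_cast<std::size_t>(num_tokens) * kv_dim_;
        const std::size_t dst = offset(layer, current_len_);
        std::copy(k, k + count, k_.data() + dst);
        std::copy(v, v + count, v_.data() + dst);
    }

    // Called once per step, after all layers have appended.
    void advanceSeqLen(int num_tokens) {
        if (num_tokens < 0 || current_len_ + num_tokens > max_seq_len_)
            throw std::length_error("KV cache overflow");
        current_len_ += num_tokens;
    }

    int currentLen() const { return current_len_; }
    float keyAt(int layer, int pos, int d) const {
        return k_[offset(layer, pos) + d];
    }

private:
    // Start of the (layer, pos) row in the flat [layer][pos][dim] buffer.
    std::size_t offset(int layer, int pos) const {
        return (static_cast<std::size_t>(layer) * max_seq_len_ + pos) * kv_dim_;
    }

    int num_layers_, max_seq_len_, kv_dim_;
    int current_len_ = 0;
    std::vector<float> k_, v_;
};

// Runs two single-token steps, visiting layers in reverse order, and
// returns true if every layer stored step s at position s.
bool demoReverseLayerOrder() {
    const int layers = 3, dim = 2;
    KVCacheSketch cache(layers, /*max_seq_len=*/4, dim);
    for (int step = 0; step < 2; ++step) {
        for (int layer = layers - 1; layer >= 0; --layer) {
            const std::vector<float> kv(dim, static_cast<float>(10 * layer + step));
            cache.appendKV(layer, kv.data(), kv.data(), 1);
        }
        cache.advanceSeqLen(1);  // exactly once per step
    }
    bool ok = cache.currentLen() == 2;
    for (int layer = 0; layer < layers; ++layer)
        for (int step = 0; step < 2; ++step)
            ok = ok && cache.keyAt(layer, step, 0) ==
                           static_cast<float>(10 * layer + step);
    return ok;
}
```

Because appendKV never touches the length, the layer loop can run in any order (here it runs last to first) and no write position can go negative. The trade-off is that the caller must remember to call advanceSeqLen once per step; forgetting it would make the next step overwrite the same positions.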
Build System
CMakeLists.txt modernization
- Project VERSION 2.0.0, DESCRIPTION
- CUDA arch auto-detect (native on CMake 3.24+, fallback to common archs)
- Replaced global include_directories() with proper target_include_directories() (PUBLIC/PRIVATE)
- Added tiny_llm::tiny_llm ALIAS target
- Excluded main.cpp from library sources (was being linked into both library and demo executable)
- Added compiler warnings (-Wall -Wextra / /W4)
- gtest_force_shared_crt for MSVC
- CMAKE_EXPORT_COMPILE_COMMANDS for IDE support
- Removed unused RapidCheck dependency
- Wrapped tests in BUILD_TESTS option
Files Modified
- src/kv_cache.cpp: stateless appendKV, new advanceSeqLen
- include/tiny_llm/kv_cache.h: advanceSeqLen declaration
- CMakeLists.txt: full modernization