Changelog
Tiny-LLM keeps a short, release-oriented changelog. This page is reserved for meaningful, externally visible milestones.
Releases
[2.0.2] — 2026-04-27
Added
- Quantization utilities: New quantization.h/quantization.cpp with F32→F16, Q4_0, Q8_0, W8A16 utilities
- CLI enhancements: --help, --version, --info options for better usability
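For readers unfamiliar with Q8_0, the following is a minimal sketch of block-wise 8-bit quantization. It assumes the common GGML-style layout (32 values sharing one scale); the names `BlockQ8_0`, `quantize_block_q8_0`, and `dequantize_q8_0` are hypothetical and are not taken from quantization.h, whose actual layout and API may differ.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Assumption: 32 values per block, as in the common GGML-style Q8_0 scheme.
constexpr std::size_t kBlockSize = 32;

// Hypothetical block type: one scale plus 32 signed 8-bit values.
struct BlockQ8_0 {
    float scale;                             // kept as F32 here for simplicity
    std::array<std::int8_t, kBlockSize> q;   // quantized values in [-127, 127]
};

// Quantize one block: scale = max|x| / 127, q[i] = round(x[i] / scale).
inline BlockQ8_0 quantize_block_q8_0(const float* x) {
    float amax = 0.0f;
    for (std::size_t i = 0; i < kBlockSize; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }
    BlockQ8_0 out{};
    out.scale = amax / 127.0f;
    // An all-zero block gets scale 0; avoid dividing by it.
    const float inv = out.scale > 0.0f ? 1.0f / out.scale : 0.0f;
    for (std::size_t i = 0; i < kBlockSize; ++i) {
        out.q[i] = static_cast<std::int8_t>(std::lround(x[i] * inv));
    }
    return out;
}

// Recover an approximate float from the block.
inline float dequantize_q8_0(const BlockQ8_0& b, std::size_t i) {
    return b.scale * static_cast<float>(b.q[i]);
}
```

With this scheme the round-trip error per value is bounded by roughly half the block scale, which is why a single large outlier in a block reduces precision for its 31 neighbors.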
Changed
- KVCacheManager: Factory method create() for consistent Result<T> error handling
- Code quality: Added noexcept to simple accessors, fixed clang-format-18 violations
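The factory-method change can be pictured with the sketch below. It is an illustration only: `Result<T>` here is a minimal stand-in built on `std::variant`, and `CacheManager` with its `numLayers`/`maxSeqLen` parameters is a hypothetical substitute for KVCacheManager, whose real signature is not shown in this changelog.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <variant>

// Minimal stand-in for a Result<T> type (assumption, not Tiny-LLM's definition).
template <typename T>
class Result {
    struct Error { std::string message; };
    using Storage = std::variant<T, Error>;

public:
    static Result ok(T value) {
        return Result(Storage(std::in_place_index<0>, std::move(value)));
    }
    static Result err(std::string message) {
        return Result(Storage(std::in_place_index<1>, Error{std::move(message)}));
    }
    bool isOk() const noexcept { return data_.index() == 0; }
    T& value() { return std::get<0>(data_); }
    const std::string& error() const { return std::get<1>(data_).message; }

private:
    explicit Result(Storage d) : data_(std::move(d)) {}
    Storage data_;
};

// Hypothetical manager: the constructor is private, so create() is the only
// way to obtain an instance and every failure is reported through Result.
class CacheManager {
public:
    static Result<CacheManager> create(std::size_t numLayers, std::size_t maxSeqLen);
    std::size_t numLayers() const noexcept { return numLayers_; }
    std::size_t maxSeqLen() const noexcept { return maxSeqLen_; }

private:
    CacheManager(std::size_t l, std::size_t s) : numLayers_(l), maxSeqLen_(s) {}
    std::size_t numLayers_;
    std::size_t maxSeqLen_;
};

inline Result<CacheManager> CacheManager::create(std::size_t numLayers,
                                                 std::size_t maxSeqLen) {
    // Validate before constructing, instead of throwing from a constructor.
    if (numLayers == 0) return Result<CacheManager>::err("numLayers must be > 0");
    if (maxSeqLen == 0) return Result<CacheManager>::err("maxSeqLen must be > 0");
    return Result<CacheManager>::ok(CacheManager(numLayers, maxSeqLen));
}
```

Callers check `isOk()` before touching `value()`, so construction failures follow the same path as other recoverable errors rather than relying on exceptions.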
Infrastructure
- CI simplification with Jimver/cuda-toolkit action
- Enhanced .clangd LSP configuration
[2.0.1] — 2026-04-16
Fixed
- CRITICAL: QuantizedWeight scale dimension calculation error in test utilities
- Removed unused code in attention kernel (q_reg array loading)
[2.0.0] — 2026-03-09
Changed ⚠️ BREAKING
- API Redesign: KVCache appendKV() is now stateless with explicit advanceSeqLen()
- CMake modernization with target exports and architecture auto-detection
Added
- CI workflow with automated format checking
- tiny_llm::tiny_llm CMake alias target
Migration Guide: Update any code that uses KVCache directly to call advanceSeqLen() after appendKV() has run for all layers.
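The calling pattern described in the migration guide might look like the sketch below. Only the method names appendKV() and advanceSeqLen() come from this changelog; `MockKVCache`, its constructor, the scalar key/value payload, and the `decodeStep` helper are hypothetical simplifications used to show the ordering, not the real Tiny-LLM signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cache; real entries are tensors, not scalars.
class MockKVCache {
public:
    MockKVCache(std::size_t numLayers, std::size_t maxSeqLen)
        : maxSeqLen_(maxSeqLen),
          keys_(numLayers, std::vector<float>(maxSeqLen, 0.0f)),
          values_(numLayers, std::vector<float>(maxSeqLen, 0.0f)) {}

    // Stateless append: writes at the current position but does not move it.
    void appendKV(std::size_t layer, float k, float v) {
        assert(layer < keys_.size() && seqLen_ < maxSeqLen_);
        keys_[layer][seqLen_] = k;
        values_[layer][seqLen_] = v;
    }

    // Explicit position update, called by the owner of the decoding loop.
    void advanceSeqLen(std::size_t numTokens) { seqLen_ += numTokens; }

    std::size_t seqLen() const noexcept { return seqLen_; }
    float key(std::size_t layer, std::size_t pos) const { return keys_[layer][pos]; }

private:
    std::size_t maxSeqLen_;
    std::size_t seqLen_ = 0;
    std::vector<std::vector<float>> keys_;
    std::vector<std::vector<float>> values_;
};

// One decoding step: every layer appends at the same position,
// then the sequence length advances after the last layer.
inline void decodeStep(MockKVCache& cache, std::size_t numLayers, float token) {
    for (std::size_t layer = 0; layer < numLayers; ++layer) {
        cache.appendKV(layer, token, token);
    }
    cache.advanceSeqLen(1);
}
```

Keeping the append stateless means all layers in a step see the same write position. Under this model, forgetting the advanceSeqLen() call would cause the next step to overwrite the same slot.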
Notes
- Infrastructure-only cleanup is intentionally excluded from this public changelog.
- For the current project story and onboarding flow, start at Home or Documentation.