Database/Data Model Specifications
This directory contains data model and schema definitions for the Mini-Inference Engine.
Purpose
Data model specifications define:
- Data structures and their relationships
- Serialization formats
- File formats for persistence
- Memory layouts for GPU data
Current Status
The project uses custom binary formats for weight storage. Future specifications may include:
- DBML for conceptual data modeling
- Protocol buffer schemas for serialization
- Custom binary format documentation
Data Models
Weight File Format
Weight File Layout:
═════════════════════════════════════════════════════════════
Offset    Size            Description
═════════════════════════════════════════════════════════════
0         32 bytes        Header
├─ 0      4 bytes         Magic number (0x4D494E49 = "MINI")
├─ 4      4 bytes         Version (1)
├─ 8      4 bytes         Number of layers
└─ 12     20 bytes        Reserved
═════════════════════════════════════════════════════════════
32        Variable        Layer Data (repeated for each layer;
                          offsets below are relative to the
                          start of each layer record)
├─ 0      4 bytes         Layer type (0 = Linear)
├─ 4      4 bytes         Input dimension (in_features)
├─ 8      4 bytes         Output dimension (out_features)
├─ 12     4 bytes         Has bias flag (0 or 1)
├─ 16     in×out×4 bytes  Weight data (row-major)
└─ ...    out×4 bytes     Bias data (if has_bias)
═════════════════════════════════════════════════════════════
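The layout above can be checked with a small host-side parser. The following is a minimal sketch, not the engine's actual loader: the names `WeightHeader`, `LayerHeader`, `parse_header`, and `layer_record_bytes` are hypothetical, and it assumes that all 4-byte fields are unsigned little-endian integers and that weights and biases are 32-bit floats (neither byte order nor signedness is stated in the layout).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint32_t kMagic = 0x4D494E49;     // "MINI", from the layout
constexpr std::size_t kHeaderBytes = 32;         // fixed file header size
constexpr std::size_t kLayerHeaderBytes = 16;    // four 4-byte fields per layer

// Hypothetical in-memory view of the file header (reserved bytes omitted).
struct WeightHeader {
    std::uint32_t magic;
    std::uint32_t version;
    std::uint32_t num_layers;
};

// Hypothetical in-memory view of one layer's fixed-size fields.
struct LayerHeader {
    std::uint32_t type;          // 0 = Linear
    std::uint32_t in_features;
    std::uint32_t out_features;
    std::uint32_t has_bias;      // 0 or 1
};

// Assumption: fields are stored little-endian.
inline std::uint32_t read_u32_le(const unsigned char* p) {
    assert(p != nullptr);
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Returns the header if the buffer is long enough and the magic number
// and version match the layout; otherwise returns std::nullopt.
inline std::optional<WeightHeader> parse_header(const std::vector<unsigned char>& buf) {
    if (buf.size() < kHeaderBytes) return std::nullopt;
    WeightHeader h{read_u32_le(&buf[0]), read_u32_le(&buf[4]), read_u32_le(&buf[8])};
    if (h.magic != kMagic || h.version != 1) return std::nullopt;
    return h;
}

// Total size of one layer record: fixed fields, weights, optional bias.
inline std::size_t layer_record_bytes(const LayerHeader& l) {
    const std::size_t weights = std::size_t(l.in_features) * l.out_features * sizeof(float);
    const std::size_t bias = l.has_bias ? std::size_t(l.out_features) * sizeof(float) : 0;
    return kLayerHeaderBytes + weights + bias;
}
```

A reader would call `parse_header` once, then walk the layer records starting at offset 32, advancing by `layer_record_bytes` after reading each record's four fixed fields.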
Core Data Structures
// Matrix descriptor
struct MatrixDesc {
    float* data;          // Device pointer
    int rows;             // Row count M
    int cols;             // Column count N
    int ld;               // Leading dimension
    bool is_transposed;   // Whether transposed
};

// GEMM configuration
struct GemmConfig {
    int BLOCK_M;          // Tile row size
    int BLOCK_N;          // Tile column size
    int BLOCK_K;          // K dimension block size
    int WARP_M;           // Warp-level M blocking
    int WARP_N;           // Warp-level N blocking
    bool use_double_buffer;
    bool use_vectorized_load;
};

// Fusion operation configuration
struct FusionConfig {
    bool add_bias;
    bool apply_relu;
    float* bias;
};
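To illustrate how these configuration structs might drive a kernel, here is a host-side sketch. It repeats `GemmConfig` and `FusionConfig` as declared above; the helpers `shared_mem_bytes` and `apply_epilogue` are hypothetical and not part of the engine. The shared-memory estimate assumes one `BLOCK_M × BLOCK_K` tile of A and one `BLOCK_K × BLOCK_N` tile of B in `float` per stage, with two stages when double buffering is enabled. The epilogue assumes the bias is indexed by output column and, for this reference, points to host memory rather than device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct GemmConfig {
    int BLOCK_M;
    int BLOCK_N;
    int BLOCK_K;
    int WARP_M;
    int WARP_N;
    bool use_double_buffer;
    bool use_vectorized_load;
};

struct FusionConfig {
    bool add_bias;
    bool apply_relu;
    float* bias;  // host pointer in this reference sketch (assumption)
};

// Estimated shared memory per thread block, under the tiling assumption
// described above: (A tile + B tile) * sizeof(float) * number of stages.
inline std::size_t shared_mem_bytes(const GemmConfig& c) {
    assert(c.BLOCK_M > 0 && c.BLOCK_N > 0 && c.BLOCK_K > 0);
    const std::size_t a_tile = std::size_t(c.BLOCK_M) * c.BLOCK_K;
    const std::size_t b_tile = std::size_t(c.BLOCK_K) * c.BLOCK_N;
    const std::size_t stages = c.use_double_buffer ? 2 : 1;
    return (a_tile + b_tile) * sizeof(float) * stages;
}

// Reference for a fused epilogue applied to one accumulator value that
// belongs to output column `col`: optional bias add, then optional ReLU.
inline float apply_epilogue(float acc, int col, const FusionConfig& f) {
    if (f.add_bias && f.bias != nullptr) acc += f.bias[col];
    if (f.apply_relu) acc = std::max(acc, 0.0f);
    return acc;
}
```

With 64×64×8 tiles and double buffering, the estimate comes to 8192 bytes per block, which can be compared against the device's shared-memory limit before launching a kernel.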
Network Architecture (MNIST)
Input: 784 (28x28)
↓
Linear(784, 256) + ReLU
↓
Linear(256, 128) + ReLU
↓
Linear(128, 10)
↓
Output: 10 (logits)
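The architecture above also determines how large a weight file for this network would be under the Weight File Format. The sketch below works through that arithmetic. The names `LinearShape`, `kMnistLayers`, `param_count`, and `expected_file_bytes` are hypothetical, and whether each layer stores a bias is an assumption passed in as a parameter, since the diagram does not say.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical description of one Linear layer's shape.
struct LinearShape {
    std::size_t in_features;
    std::size_t out_features;
};

// The three Linear layers from the diagram.
const std::array<LinearShape, 3> kMnistLayers{{{784, 256}, {256, 128}, {128, 10}}};

// True when each layer's output size matches the next layer's input size.
inline bool shapes_chain() {
    for (std::size_t i = 0; i + 1 < kMnistLayers.size(); ++i) {
        if (kMnistLayers[i].out_features != kMnistLayers[i + 1].in_features) return false;
    }
    return true;
}

// Number of float parameters (weights, plus biases if has_bias is set).
inline std::size_t param_count(bool has_bias) {
    std::size_t n = 0;
    for (const auto& l : kMnistLayers) {
        n += l.in_features * l.out_features;
        if (has_bias) n += l.out_features;
    }
    return n;
}

// Expected file size: 32-byte header, then a 16-byte record header plus
// float32 parameters for each layer.
inline std::size_t expected_file_bytes(bool has_bias) {
    assert(shapes_chain());
    return 32 + kMnistLayers.size() * 16 + param_count(has_bias) * sizeof(float);
}
```

If every layer stores a bias, this gives 235,146 parameters and a 940,664-byte file; without biases, 234,752 parameters and 939,088 bytes.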
See AGENTS.md for development workflow.