Test Specifications

This directory contains BDD (Behavior-Driven Development) test specifications for the Mini-Inference Engine.

Purpose

Test specifications define:

  • Test scenarios and acceptance criteria
  • Boundary conditions to test
  • Performance benchmarks
  • Correctness properties

Current Status

Tests are implemented in C++ using the Google Test framework. See the tests/ directory for the implementation.

Test Structure

Unit Tests (tests/)

Test File                  Coverage
test_gemm.cpp              All GEMM kernels
test_tensor.cpp            Tensor operations
test_inference.cpp         InferenceEngine
test_memory_pool.cpp       MemoryPool
test_stream_manager.cpp    StreamManager
test_config.cpp            Config
test_logger.cpp            Logger
test_quantization.cpp      INT8 quantization
test_fusion.cpp            Fusion kernels

Correctness Properties

P1: Matrix Multiplication Correctness

For any matrices A(M×K) and B(K×N), the GPU result C matches the CPU reference to within a maximum error of 1e-5.

Validates: R1.1, R1.2, R1.4
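
A minimal sketch of how the P1 check could be expressed on the host side. The names cpu_gemm, max_abs_error and matches_reference are hypothetical and not part of the engine's API, row-major storage is assumed, and "max error" is read here as the largest element-wise absolute difference, which is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: C(M×N) = A(M×K) × B(K×N), row-major storage.
std::vector<float> cpu_gemm(const std::vector<float>& A,
                            const std::vector<float>& B,
                            std::size_t M, std::size_t K, std::size_t N) {
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t k = 0; k < K; ++k) {
            const float a = A[i * K + k];
            for (std::size_t j = 0; j < N; ++j)
                C[i * N + j] += a * B[k * N + j];
        }
    return C;
}

// Largest element-wise absolute difference between two results.
float max_abs_error(const std::vector<float>& X, const std::vector<float>& Y) {
    assert(X.size() == Y.size());
    float err = 0.0f;
    for (std::size_t i = 0; i < X.size(); ++i)
        err = std::max(err, std::fabs(X[i] - Y[i]));
    return err;
}

// P1 check: gpu_result would be copied back from the device by the caller.
bool matches_reference(const std::vector<float>& gpu_result,
                       const std::vector<float>& A,
                       const std::vector<float>& B,
                       std::size_t M, std::size_t K, std::size_t N,
                       float tol = 1e-5f) {
    return max_abs_error(gpu_result, cpu_gemm(A, B, M, K, N)) <= tol;
}
```

In a Google Test case, the boolean returned by matches_reference would typically be wrapped in an expectation, with the matrix sizes varied across test parameters.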

P2: Optimized GEMM Equivalence

For any matrices A and B, all optimized implementations produce results equivalent to those of the Naive implementation.

Validates: R2.4, R8.4
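
The equivalence in P2 can be illustrated with a CPU-only sketch in which a tiled loop nest stands in for an optimized kernel. Everything here is an assumption for illustration: gemm_naive, gemm_tiled and equivalent are hypothetical names, and the tolerance parameter is a choice of this sketch, since the property itself does not state one.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // row-major storage

// Baseline: naive triple loop, C(M×N) = A(M×K) × B(K×N).
Matrix gemm_naive(const Matrix& A, const Matrix& B,
                  std::size_t M, std::size_t K, std::size_t N) {
    assert(A.size() == M * K && B.size() == K * N);
    Matrix C(M * N, 0.0f);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < K; ++k)
                C[i * N + j] += A[i * K + k] * B[k * N + j];
    return C;
}

// CPU stand-in for an optimized kernel: the same product, visited tile by tile.
Matrix gemm_tiled(const Matrix& A, const Matrix& B,
                  std::size_t M, std::size_t K, std::size_t N,
                  std::size_t tile = 4) {
    assert(tile > 0 && A.size() == M * K && B.size() == K * N);
    Matrix C(M * N, 0.0f);
    for (std::size_t i0 = 0; i0 < M; i0 += tile)
        for (std::size_t j0 = 0; j0 < N; j0 += tile)
            for (std::size_t k0 = 0; k0 < K; k0 += tile)
                for (std::size_t i = i0; i < std::min(i0 + tile, M); ++i)
                    for (std::size_t j = j0; j < std::min(j0 + tile, N); ++j)
                        for (std::size_t k = k0; k < std::min(k0 + tile, K); ++k)
                            C[i * N + j] += A[i * K + k] * B[k * N + j];
    return C;
}

// Element-wise comparison within a tolerance chosen by the caller.
bool equivalent(const Matrix& X, const Matrix& Y, float tol) {
    if (X.size() != Y.size()) return false;
    for (std::size_t i = 0; i < X.size(); ++i)
        if (std::fabs(X[i] - Y[i]) > tol) return false;
    return true;
}
```

Sizes that are not multiples of the tile width (for example 5×7 times 7×3 with a tile of 2 or 4) exercise the edge handling that tiled kernels most often get wrong.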

P3: Kernel Fusion Correctness

For any input X, weight W, and bias b, the fused output equals the result of sequential MatMul + Bias + ReLU.

Validates: R6.1, R6.5
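
As a sketch of what P3 compares, the following CPU code computes ReLU(X × W + b) twice: once as three separate passes and once in a single fused loop. The function names are hypothetical, row-major storage is assumed, and the bias is assumed to be broadcast across rows (one entry per output column).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // row-major storage

// Sequential path, step 1: Y(M×N) = X(M×K) × W(K×N).
Matrix matmul(const Matrix& X, const Matrix& W,
              std::size_t M, std::size_t K, std::size_t N) {
    assert(X.size() == M * K && W.size() == K * N);
    Matrix Y(M * N, 0.0f);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k)
                acc += X[i * K + k] * W[k * N + j];
            Y[i * N + j] = acc;
        }
    return Y;
}

// Sequential path, step 2: add b[j] to every element in column j.
void add_bias(Matrix& Y, const std::vector<float>& b, std::size_t N) {
    assert(b.size() == N);
    for (std::size_t i = 0; i < Y.size(); ++i) Y[i] += b[i % N];
}

// Sequential path, step 3: clamp negatives to zero.
void relu(Matrix& Y) {
    for (float& v : Y) v = std::max(v, 0.0f);
}

// Fused path: bias and ReLU are applied as each output element is produced.
Matrix fused_matmul_bias_relu(const Matrix& X, const Matrix& W,
                              const std::vector<float>& b,
                              std::size_t M, std::size_t K, std::size_t N) {
    assert(X.size() == M * K && W.size() == K * N && b.size() == N);
    Matrix Y(M * N);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k)
                acc += X[i * K + k] * W[k * N + j];
            Y[i * N + j] = std::max(acc + b[j], 0.0f);
        }
    return Y;
}
```

Both paths here perform the same floating-point operations in the same order, so their outputs can be compared directly. A GPU kernel that reorders the accumulation may need a tolerance instead.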

P4: Weight Serialization Round-Trip

For any valid weights, a save followed by a load produces bit-identical results.

Validates: R7.1
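
A round-trip check in the spirit of P4 might look like the sketch below. The on-disk layout (a 64-bit element count followed by raw float bytes) and the names save_weights, load_weights, bit_identical and round_trips are assumptions made for illustration; the engine's actual weight format is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical format: a 64-bit element count followed by the raw float bytes.
void save_weights(std::ostream& out, const std::vector<float>& w) {
    assert(out.good());  // precondition: caller passes a writable stream
    const std::uint64_t n = w.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    out.write(reinterpret_cast<const char*>(w.data()),
              static_cast<std::streamsize>(w.size() * sizeof(float)));
}

std::vector<float> load_weights(std::istream& in) {
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    if (!in) throw std::runtime_error("missing weight count");
    std::vector<float> w(static_cast<std::size_t>(n));
    in.read(reinterpret_cast<char*>(w.data()),
            static_cast<std::streamsize>(w.size() * sizeof(float)));
    if (!in) throw std::runtime_error("truncated weight data");
    return w;
}

// Bitwise comparison: unlike ==, it distinguishes -0.0f from 0.0f.
bool bit_identical(const std::vector<float>& a, const std::vector<float>& b) {
    return a.size() == b.size() &&
           (a.empty() ||
            std::memcmp(a.data(), b.data(), a.size() * sizeof(float)) == 0);
}

// P4 check on an in-memory stream; a file stream would be used the same way.
bool round_trips(const std::vector<float>& w) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    save_weights(buf, w);
    return bit_identical(w, load_weights(buf));
}
```

Comparing bytes rather than float values is what makes the check "bit-identical": operator== treats 0.0f and -0.0f as equal, whereas the byte comparison does not.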

P5: Multi-Layer Forward Pass Consistency

For any input batch, the forward pass produces results equivalent to sequential layer execution.

Validates: R7.3
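
P5 can be sketched on the CPU as follows. The Layer type and forward function are hypothetical stand-ins for the engine's own classes, and each layer is assumed to be a dense layer followed by ReLU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Hypothetical dense layer: y = ReLU(x · W + b), with W stored row-major as in×out.
struct Layer {
    std::size_t in, out;
    Vec W, b;

    Vec apply(const Vec& x) const {
        assert(x.size() == in && W.size() == in * out && b.size() == out);
        Vec y(b);  // start from the bias, then accumulate x · W
        for (std::size_t i = 0; i < in; ++i)
            for (std::size_t j = 0; j < out; ++j)
                y[j] += x[i] * W[i * out + j];
        for (float& v : y) v = std::max(v, 0.0f);
        return y;
    }
};

// Whole-network forward pass: every layer is applied to every batch item.
std::vector<Vec> forward(const std::vector<Layer>& net, std::vector<Vec> batch) {
    for (const Layer& layer : net)
        for (Vec& x : batch) x = layer.apply(x);
    return batch;
}
```

The property then amounts to comparing forward(net, batch)[n] with the result of calling apply layer by layer on batch[n], for every item n.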

P6: Dimension Mismatch Detection

For matrices where A.cols != B.rows, the engine detects and reports the error before any computation.

Validates: R9.4
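
One way to picture P6 is a shape check that runs before any arithmetic. The Matrix struct and checked_matmul below are hypothetical, and reporting the error through std::invalid_argument is an assumption of this sketch, not necessarily how the engine reports it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical matrix type carrying its own shape (row-major data).
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
};

// Validates shapes first and throws before any arithmetic is performed.
Matrix checked_matmul(const Matrix& A, const Matrix& B) {
    if (A.cols != B.rows) {
        throw std::invalid_argument(
            "dimension mismatch: A.cols=" + std::to_string(A.cols) +
            " != B.rows=" + std::to_string(B.rows));
    }
    assert(A.data.size() == A.rows * A.cols &&
           B.data.size() == B.rows * B.cols);

    Matrix C{A.rows, B.cols, std::vector<float>(A.rows * B.cols, 0.0f)};
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t k = 0; k < A.cols; ++k)
            for (std::size_t j = 0; j < B.cols; ++j)
                C.data[i * C.cols + j] +=
                    A.data[i * A.cols + k] * B.data[k * B.cols + j];
    return C;
}
```

A test for this property would pass mismatched shapes such as 2×3 and 4×5 and expect the exception, then pass compatible shapes and expect a normal result.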

BDD Test Scenarios (Future)

Feature: GEMM Optimization

  Scenario: Tiled GEMM produces correct results
    Given matrices A and B of compatible dimensions
    When I compute C = A × B using tiled GEMM
    Then C should match CPU reference within 1e-5 error

  Scenario: Tiled GEMM handles non-square matrices
    Given matrix A of size 100×200
    And matrix B of size 200×150
    When I compute C = A × B using tiled GEMM
    Then C should have dimensions 100×150
    And C should match CPU reference within 1e-5 error

  Scenario: Fused kernel matches sequential operations
    Given input matrix X
    And weight matrix W
    And bias vector b
    When I compute fused output = ReLU(X × W + b)
    Then the result should equal sequential computation

Performance Benchmarks

See the benchmarks/ directory and the Performance Tuning Guide.


See AGENTS.md for development workflow.



MIT License | A learning project for the CUDA community