API Specifications
This directory contains API interface definitions for the Mini-Inference Engine.
Purpose
API specifications serve as the contract between components. They define:
- Function signatures
- Data types and structures
- Error codes and handling
- Interface contracts
Current Status
The project’s API is currently defined in the header files under include/, with inline documentation.
Future API specifications may include:
- OpenAPI/Swagger definitions (for potential REST API)
- GraphQL schemas (for query interfaces)
- Protocol buffer definitions (for serialization)
API Documentation
For complete API documentation, see:
Key Interfaces
GEMM Operations
// Basic GEMM kernels
void launch_naive_matmul(const float* A, const float* B, float* C,
                         int M, int N, int K, cudaStream_t stream = 0);
void launch_tiled_gemm(const float* A, const float* B, float* C,
                       int M, int N, int K, cudaStream_t stream = 0);

// Optimized GEMM
void launch_optimized_gemm(const float* A, const float* B, float* C,
                           int M, int N, int K, cudaStream_t stream = 0);

// Fused operations
void launch_fused_gemm(const float* A, const float* B, float* C,
                       const float* bias, int M, int N, int K,
                       bool add_bias, bool apply_relu, cudaStream_t stream = 0);
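The signatures above do not state memory layout or the exact fused semantics, so the following CPU reference is only a sketch of what launch_fused_gemm plausibly computes. It assumes row-major storage (A is M x K, B is K x N, C is M x N), a bias vector of length N added per output column, and ReLU applied last. The names reference_fused_gemm and run_reference_example are hypothetical and not part of the project's headers. A reference like this can be useful for checking kernel output on small inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CPU reference, not part of the project's API.
// Assumptions: row-major A (M x K), B (K x N), C (M x N);
// bias has N entries, one per output column.
void reference_fused_gemm(const float* A, const float* B, float* C,
                          const float* bias, int M, int N, int K,
                          bool add_bias, bool apply_relu) {
    assert(!add_bias || bias != nullptr);  // bias is required only when used
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            if (add_bias) acc += bias[j];
            if (apply_relu) acc = std::max(acc, 0.0f);
            C[i * N + j] = acc;
        }
    }
}

// Runs the reference on a fixed 2 x 2 problem and returns C.
std::vector<float> run_reference_example(bool add_bias, bool apply_relu) {
    const std::vector<float> A = {1.0f, 2.0f, 3.0f, 4.0f};   // [[1, 2], [3, 4]]
    const std::vector<float> B = {1.0f, -1.0f, 0.0f, 1.0f};  // [[1, -1], [0, 1]]
    const std::vector<float> bias = {0.5f, -2.0f};
    std::vector<float> C(4, 0.0f);
    reference_fused_gemm(A.data(), B.data(), C.data(), bias.data(),
                         2, 2, 2, add_bias, apply_relu);
    return C;
}
```

With both flags set to false, the function reduces to a plain matrix product, so the same reference can be compared against the non-fused launchers under the same layout assumption.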
Inference Engine
class InferenceEngine {
public:
    void init(int device_id = 0);
    bool load_weights(const std::string& path);
    void forward(const float* input, float* output, int batch_size);
    void cleanup();
};
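The class interface suggests a lifecycle of init, load_weights, forward, then cleanup. The sketch below shows one way a caller might keep init and cleanup paired by using a small scope guard. StubEngine, EngineSession, and run_session_example are hypothetical names for illustration. The stub only records calls and copies its input, so the sketch compiles without CUDA. Treating batch_size as an element count is an assumption of the stub, not a statement about the real engine.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in with the same public interface as InferenceEngine.
// It records each call so the call order can be inspected.
class StubEngine {
public:
    void init(int /*device_id*/ = 0) { log.push_back("init"); }
    bool load_weights(const std::string& path) {
        log.push_back("load_weights");
        return !path.empty();  // the stub treats an empty path as a failure
    }
    void forward(const float* input, float* output, int batch_size) {
        // Assumption for the stub only: one float per batch element, copied through.
        for (int i = 0; i < batch_size; ++i) output[i] = input[i];
        log.push_back("forward");
    }
    void cleanup() { log.push_back("cleanup"); }

    std::vector<std::string> log;
};

// Scope guard: calls init() on construction and cleanup() on destruction.
template <typename Engine>
class EngineSession {
public:
    explicit EngineSession(Engine& engine, int device_id = 0) : engine_(engine) {
        engine_.init(device_id);
    }
    ~EngineSession() { engine_.cleanup(); }
    EngineSession(const EngineSession&) = delete;
    EngineSession& operator=(const EngineSession&) = delete;

private:
    Engine& engine_;
};

// Runs one session and returns the recorded call order.
std::vector<std::string> run_session_example(const std::string& weights_path) {
    StubEngine engine;
    {
        EngineSession<StubEngine> session(engine);
        if (engine.load_weights(weights_path)) {
            const std::vector<float> input = {1.0f, 2.0f};
            std::vector<float> output(input.size());
            engine.forward(input.data(), output.data(),
                           static_cast<int>(input.size()));
        }
    }  // cleanup() runs here, including when load_weights() fails
    return engine.log;
}
```

Because the guard is a template, the same pattern would apply to any type exposing init and cleanup with these signatures. Whether the real engine needs cleanup after a failed load_weights is not specified here.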
See AGENTS.md for development workflow.