# Engine API

High-level inference engine for model execution.
## InferenceEngine

Orchestrates model loading and inference execution.
### Constructor

```typescript
constructor()
```

Creates a new `InferenceEngine` instance.
### Methods
#### initialize

```typescript
async initialize(): Promise<void>
```

Initialize the inference engine and GPU context.

**Throws:**

- `WebGPUNotSupportedError`: If WebGPU is not available
- `DeviceInitializationError`: If GPU initialization fails

**Example:**

```typescript
const engine = new InferenceEngine();
await engine.initialize();
```
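WebGPU availability can also be checked up front, before constructing an engine, so the `WebGPUNotSupportedError` path is never hit. A minimal sketch (the helper name is illustrative; `navigator.gpu` is the standard WebGPU entry point, and going through `globalThis` keeps the check safe in non-browser runtimes):

```typescript
// Returns true when the current runtime exposes WebGPU.
// Reading `navigator` via globalThis avoids a ReferenceError in
// environments (e.g. Node.js tests) where it is not defined.
function isWebGPUAvailable(): boolean {
  const nav = (globalThis as any).navigator;
  return nav !== undefined && nav.gpu !== undefined;
}
```

When this returns `false`, fall back to a non-GPU code path or show a user-facing message instead of calling `initialize()`.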
#### loadModel

```typescript
async loadModel(modelDef: ModelDefinition): Promise<void>
```

Load a model definition.

**Parameters:**

```typescript
interface ModelDefinition {
  name: string;
  layers: LayerDefinition[];
  weights: Record<string, WeightTensor>;
}

interface LayerDefinition {
  name: string;
  type: string; // 'conv2d', 'maxpool', 'relu', 'softmax', 'dense', 'flatten', 'add', 'batchnorm2d'
  inputs: string[];
  params?: OperatorParams;
}

interface WeightTensor {
  data: Float32Array | number[];
  shape: number[];
}
```
`InferenceEngine` can dispatch `add` for residual connections and `batchnorm2d` for inference-time normalization when those operators are declared in `layers`.
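For instance, a residual block could be declared by giving an `add` layer two inputs. This is a sketch only; the layer names, kernel sizes, and parameter shapes are illustrative, not taken from a real model:

```typescript
// Hypothetical residual block: conv2's output is summed element-wise with
// conv1's activation, then batch-normalized.
const residualLayers = [
  { name: 'conv1', type: 'conv2d', inputs: ['input'], params: { kernelSize: [3, 3] } },
  { name: 'relu1', type: 'relu', inputs: ['conv1'], params: {} },
  { name: 'conv2', type: 'conv2d', inputs: ['relu1'], params: { kernelSize: [3, 3] } },
  { name: 'add1', type: 'add', inputs: ['conv1', 'conv2'], params: {} }, // residual sum
  { name: 'bn1', type: 'batchnorm2d', inputs: ['add1'], params: {} }
];
```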
**Example:**

```typescript
await engine.loadModel({
  name: 'mnist',
  layers: [
    { name: 'conv1', type: 'conv2d', inputs: ['input'], params: { kernelSize: [3, 3] } },
    { name: 'relu1', type: 'relu', inputs: ['conv1'], params: {} },
    { name: 'output', type: 'softmax', inputs: ['relu1'], params: {} }
  ],
  weights: {
    'conv1.weight': { data: weightData, shape: [32, 1, 3, 3] }
  }
});
```
#### tensorFromArray

```typescript
tensorFromArray(
  data: number[] | Float32Array,
  shape: TensorShape,
  options?: TensorOptions
): Tensor
```

Create an input tensor from data.

**Parameters:**

- `data`: Input data
- `shape`: Tensor shape
- `options`: Optional tensor options (layout, label)

**Example:**

```typescript
const imageData = new Float32Array(28 * 28);
const input = engine.tensorFromArray(imageData, [1, 1, 28, 28]);
```
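The example above allocates an empty `Float32Array`; real image input usually starts as RGBA canvas pixels. A hedged sketch of the conversion to a single-channel `[1, 1, h, w]` input (the luminance weights and the `[0, 1]` scaling are common conventions, not something this library mandates):

```typescript
// Convert RGBA pixel data (e.g. from CanvasRenderingContext2D.getImageData)
// to a normalized single-channel Float32Array for a [1, 1, height, width] tensor.
function rgbaToGrayscale(pixels: Uint8ClampedArray, width: number, height: number): Float32Array {
  const out = new Float32Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const r = pixels[i * 4];
    const g = pixels[i * 4 + 1];
    const b = pixels[i * 4 + 2]; // alpha at i * 4 + 3 is ignored
    // Luminance-weighted average, scaled from [0, 255] to [0, 1].
    out[i] = (0.299 * r + 0.587 * g + 0.114 * b) / 255;
  }
  return out;
}
```

The result can be passed straight to `engine.tensorFromArray(gray, [1, 1, height, width])`.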
#### infer

```typescript
async infer(input: Tensor): Promise<Tensor>
```

Run inference on the loaded model.

**Parameters:**

- `input`: Input tensor (must be from the same GPU context)

**Returns:** Output tensor

**Throws:**

- `Error`: If no model is loaded
- `Error`: If the input tensor is from a different GPU context

**Example:**

```typescript
const input = engine.tensorFromArray(data, [1, 1, 28, 28]);
const output = await engine.infer(input);
const predictions = await output.download();
```
#### destroy

```typescript
destroy(): void
```

Release all GPU resources.

**Important:** Always call this when done to prevent memory leaks.
## ModelLoader

Utility for loading models from external sources.

### Methods
#### loadFromJSON

```typescript
async loadFromJSON(url: string): Promise<ModelDefinition>
```

Load a model definition from a JSON URL.

**Example:**

```typescript
const loader = new ModelLoader();
const model = await loader.loadFromJSON('/models/mnist-model.json');
await engine.loadModel(model);
```
#### loadWeights

```typescript
async loadWeights(url: string): Promise<Record<string, WeightTensor>>
```

Load weights from a binary format.
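Separately loaded binary weights are easy to mis-shape, and a size mismatch only surfaces later inside the engine. A small sanity check before `loadModel` can catch this early. This is a hypothetical helper; the local `WeightTensor` interface simply mirrors the one documented above so the sketch is self-contained:

```typescript
interface WeightTensor {
  data: Float32Array | number[];
  shape: number[];
}

// Verify that each tensor's data length equals the product of its shape.
// Returns a list of human-readable error strings (empty when all match).
function validateWeights(weights: Record<string, WeightTensor>): string[] {
  const errors: string[] = [];
  for (const [name, tensor] of Object.entries(weights)) {
    const expected = tensor.shape.reduce((a, b) => a * b, 1);
    if (tensor.data.length !== expected) {
      errors.push(`${name}: expected ${expected} values, got ${tensor.data.length}`);
    }
  }
  return errors;
}
```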
## Complete Example

```typescript
import { InferenceEngine, ModelLoader } from 'tiny-dl-inference';

async function runInference() {
  // Initialize
  const engine = new InferenceEngine();
  await engine.initialize();

  try {
    // Load model
    const loader = new ModelLoader();
    const model = await loader.loadFromJSON('model.json');
    await engine.loadModel(model);

    // Prepare input
    const imageData = await loadImage('input.png');
    const input = engine.tensorFromArray(imageData, [1, 1, 28, 28]);

    // Run inference
    const output = await engine.infer(input);
    const predictions = await output.download();

    // Get result
    const predictedClass = argmax(predictions);
    console.log('Predicted:', predictedClass);
  } finally {
    // Cleanup
    engine.destroy();
  }
}

function argmax(arr: Float32Array): number {
  let max = arr[0];
  let idx = 0;
  for (let i = 1; i < arr.length; i++) {
    if (arr[i] > max) {
      max = arr[i];
      idx = i;
    }
  }
  return idx;
}
```