Installation Guide

This guide documents the current recommended build paths for TensorCraft-HPC.

Prerequisites

Required

Optional

1. CUDA development flow

Use this for normal development on a CUDA machine.

cmake --preset dev
cmake --build --preset dev --parallel 2
ctest --preset dev --output-on-failure

2. Python-focused / lighter CUDA flow

Use this when you mainly care about the Python extension.

cmake --preset python-dev
cmake --build --preset python-dev --parallel 2
python -m pip install -e .
python -c "import tensorcraft_ops as tc; print(tc.__version__)"

3. Heavier full build

Use this for the more complete release-style build, which also builds the benchmarks.

cmake --preset release
cmake --build --preset release --parallel
ctest --test-dir build/release --output-on-failure

4. CPU-only smoke validation

Use this on machines without CUDA when you only need to validate configure/install behavior.

cmake --preset cpu-smoke
cmake --install build/cpu-smoke --prefix /tmp/tensorcraft-install

In this mode, tests, benchmarks, and Python bindings are intentionally disabled.
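Since this path exists only to validate configure/install behavior, a quick way to confirm the install step wrote anything is to count the files under the prefix. The following is a minimal sketch; `count_installed_files` is a hypothetical helper, not part of the project, and it makes no assumption about which files the install step produces.

```shell
# Hypothetical helper: count regular files under an install prefix.
# Prints 0 if the prefix is empty or does not exist.
count_installed_files() {
  find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# /tmp/tensorcraft-install is the prefix used in the commands above.
prefix=/tmp/tensorcraft-install
n=$(count_installed_files "$prefix")
if [ "$n" -gt 0 ]; then
  echo "install tree at $prefix contains $n files"
else
  echo "install tree at $prefix is empty or missing" >&2
fi
```

A non-zero count only shows that the install step ran; it does not verify the contents of the tree.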

Presets Summary

Preset       Purpose
dev          Recommended CUDA development preset
python-dev   Lighter CUDA build focused on Python bindings
release      Heavier release build with benchmarks
debug        Debug-oriented CUDA build
cpu-smoke    CPU-only configure/install smoke path
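As a rough illustration of choosing between the CUDA and CPU-only rows of the table above, the sketch below suggests a preset depending on whether a CUDA compiler is visible. `pick_preset` is a hypothetical helper, and treating "`nvcc` is on `PATH`" as "this is a CUDA machine" is an assumption; a toolkit installed outside `PATH` would be missed.

```shell
# Hypothetical helper: suggest a preset name from the table above.
# Assumption: finding nvcc on PATH means a usable CUDA toolkit is present.
pick_preset() {
  if command -v nvcc >/dev/null 2>&1; then
    echo "dev"
  else
    echo "cpu-smoke"
  fi
}

echo "suggested preset: $(pick_preset)"
# Example use from the repository root:
#   cmake --preset "$(pick_preset)"
```

Running `cmake --list-presets` from the repository root shows the configure presets the checkout actually defines, which is the authoritative list if it differs from the table.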

Python Bindings

Install from the repository root:

python -m pip install -e .
python -c "import tensorcraft_ops as tc; print(tc.__version__)"

The import name is tensorcraft_ops.

Manual Configuration

If you prefer not to use presets, start from a single-architecture CUDA build:

cmake -B build/manual -G Ninja \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_CUDA_ARCHITECTURES=75 \
  -DTC_BUILD_TESTS=ON \
  -DTC_BUILD_BENCHMARKS=OFF \
  -DTC_BUILD_PYTHON=ON

cmake --build build/manual --parallel 2
ctest --test-dir build/manual --output-on-failure

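Adjust CMAKE_CUDA_ARCHITECTURES to match your GPU. The value is the GPU's compute capability with the dot removed, so 75 targets compute capability 7.5.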
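One way to find the right value is to ask the driver for the compute capability and strip the dot. This is a sketch under two assumptions: `nvidia-smi` is installed, and the driver is recent enough to support the `compute_cap` query field (older drivers do not have it). `cap_to_arch` is a hypothetical helper name.

```shell
# Hypothetical helper: turn a compute capability such as "7.5" into the
# form CMAKE_CUDA_ARCHITECTURES expects ("75") by dropping dots and spaces.
cap_to_arch() {
  printf '%s\n' "$1" | tr -d '. '
}

# Assumption: nvidia-smi exists and supports the compute_cap query field.
# Only the first GPU is considered.
if command -v nvidia-smi >/dev/null 2>&1; then
  cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
  echo "-DCMAKE_CUDA_ARCHITECTURES=$(cap_to_arch "$cap")"
else
  echo "nvidia-smi not found; look up your GPU's compute capability manually" >&2
fi
```

On a machine with mixed GPUs, only the first reported device is used here; pick the value for the device you intend to run on.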

Compatibility Notes

Verification

Recommended validation on a CUDA machine:

cmake --preset dev
cmake --build --preset dev --parallel 2
ctest --preset dev --output-on-failure
python -m pip install -e .
python -c "import tensorcraft_ops as tc; print(tc.__version__)"

Troubleshooting

See TROUBLESHOOTING.md for build failures, architecture issues, editable install issues, and CUDA environment problems.