fq-compressor

fq-compressor is a high-performance FASTQ compression tool for sequencing data. It combines Assembly-based Compression (ABC) strategies with production-grade engineering to deliver high compression ratios, fast parallel processing, and native random access.

Key Features

Feature	Description
Extreme Compression	Approaches theoretical limits through assembly-based reordering and consensus generation
Hybrid Quality Compression	Statistical Context Mixing (SCM) for quality scores, balancing ratio and speed
Parallel Powerhouse	Built on Intel oneTBB with a scalable producer-consumer pipeline
Random Access	Native block-based format (like BGZF) gives direct access to any part of the file
Modern Toolchain	Written in C++23, built with modern CMake and Conan 2.x, with CI/CD on GitHub Actions
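
To illustrate why a block-based layout gives random access, here is a minimal sketch. It is not the fq-compressor archive format: `zlib` stands in for the real codecs, and `BLOCK_READS`, `pack`, and `fetch` are hypothetical names chosen for this example. The point is only that a per-block index lets a reader decompress one block instead of the whole file.

```python
import zlib

BLOCK_READS = 2  # reads per block (hypothetical; tiny so the example is easy to follow)

def pack(reads):
    """Compress reads in independent blocks; return (payload, index)."""
    payload = bytearray()
    index = []  # one (offset, length) pair per block
    for start in range(0, len(reads), BLOCK_READS):
        chunk = "\n".join(reads[start:start + BLOCK_READS]).encode()
        blob = zlib.compress(chunk)  # stand-in for a specialized codec
        index.append((len(payload), len(blob)))
        payload += blob
    return bytes(payload), index

def fetch(payload, index, read_no):
    """Return one read by decompressing only the block that holds it."""
    offset, length = index[read_no // BLOCK_READS]
    block = zlib.decompress(payload[offset:offset + length])
    return block.decode().split("\n")[read_no % BLOCK_READS]

reads = ["ACGT", "TTGA", "GGCA", "CATG", "AACC"]
payload, index = pack(reads)
print(fetch(payload, index, 3))  # CATG
```

Because each block is compressed on its own, blocks can also be decompressed in parallel, and a corrupted block does not affect its neighbours.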

Why fq-compressor?

Traditional FASTQ compressors treat reads as independent strings and rely on general-purpose compression algorithms. fq-compressor takes a fundamentally different approach:

Reads are fragments of a genome — we exploit the biological redundancy by reordering and assembling reads before compression.
Each data stream gets a specialized compressor — sequences use ABC, quality scores use SCM, and identifiers use tokenization + delta encoding.
The archive format is designed for real-world use — independent blocks enable random access, parallel decompression, and fault isolation.
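
As an illustration of the identifier stream, the sketch below shows one simple way tokenization plus delta encoding can work. It is an assumption-laden toy, not fq-compressor's actual encoder: the token rules, the `=`/`+`/`!` operation codes, and the function names are all invented for this example.

```python
import re

def tokenize(read_id):
    """Split an identifier into alternating digit and non-digit tokens."""
    return re.findall(r"[0-9]+|[^0-9]+", read_id)

def is_number(token):
    """True for a digit token that survives an int round trip (no leading zeros)."""
    return token[0] in "0123456789" and str(int(token)) == token

def delta_encode(ids):
    """Encode each identifier relative to the previous one, token by token.

    ("=",) means unchanged, ("+", n) means previous number plus n,
    ("!", text) stores the token literally.
    """
    prev, rows = [], []
    for read_id in ids:
        tokens = tokenize(read_id)
        row = []
        for i, tok in enumerate(tokens):
            old = prev[i] if i < len(prev) else None
            if tok == old:
                row.append(("=",))
            elif old is not None and old[0] in "0123456789" and is_number(tok):
                row.append(("+", int(tok) - int(old)))
            else:
                row.append(("!", tok))
        rows.append(row)
        prev = tokens
    return rows

def delta_decode(rows):
    """Invert delta_encode."""
    prev, ids = [], []
    for row in rows:
        tokens = []
        for i, op in enumerate(row):
            if op[0] == "=":
                tokens.append(prev[i])
            elif op[0] == "+":
                tokens.append(str(int(prev[i]) + op[1]))
            else:
                tokens.append(op[1])
        ids.append("".join(tokens))
        prev = tokens
    return ids

ids = ["@run7.1/1", "@run7.2/1", "@run7.3/1"]
rows = delta_encode(ids)
print(rows[1])  # [('=',), ('=',), ('=',), ('+', 1), ('=',), ('=',)]
assert delta_decode(rows) == ids
```

Consecutive identifiers from the same run usually differ only in a counter, so most tokens collapse to "unchanged" or "+1", which a downstream entropy coder can store very cheaply.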

Performance at a Glance

Compiler	Compression Speed	Decompression Speed	Compression Ratio
GCC	11.30 MB/s	60.10 MB/s	3.97x
Clang	11.90 MB/s	62.30 MB/s	3.97x

Tested on an Intel Core i7-9700 @ 3.00 GHz (8 cores) with 2.27M Illumina reads (511 MB uncompressed).
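
To put these figures in practical terms, the short calculation below estimates wall-clock time and archive size for the 511 MB test file. It assumes that throughput is measured against the uncompressed size and that the ratio is uncompressed size divided by compressed size; the benchmark note does not state either, so treat the results as rough estimates. The `estimate` helper is hypothetical.

```python
INPUT_MB = 511.0  # uncompressed size from the benchmark note

# (compression MB/s, decompression MB/s, compression ratio) as reported in the table
RESULTS = {
    "GCC": (11.30, 60.10, 3.97),
    "Clang": (11.90, 62.30, 3.97),
}

def estimate(input_mb, compress_rate, decompress_rate, ratio):
    """Derive times and archive size, assuming rates refer to uncompressed MB."""
    return {
        "compress_s": input_mb / compress_rate,
        "decompress_s": input_mb / decompress_rate,
        "archive_mb": input_mb / ratio,
    }

for compiler, figures in RESULTS.items():
    est = estimate(INPUT_MB, *figures)
    print(
        f"{compiler}: compress {est['compress_s']:.1f} s, "
        f"decompress {est['decompress_s']:.1f} s, "
        f"archive {est['archive_mb']:.1f} MB"
    )
```

Under those assumptions, the GCC build would take roughly 45 s to compress and under 9 s to decompress the file, producing an archive of about 129 MB.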

Get Started

Installation — build from source
Quick Start — compress your first FASTQ file
CLI Reference — all commands and options
Architecture — how it works under the hood
