Performance Benchmarks

This document presents MICOS-2024 performance across different dataset scales, helping users plan resource requirements.

Test Environment

  • Hardware: AMD EPYC 7742 64-Core Processor
  • Memory: 256GB DDR4
  • Storage: NVMe SSD
  • OS: Ubuntu 22.04 LTS
  • Databases: Kraken2 Standard (16GB), KneadData human_genome

Performance Data

Processing Time Comparison (hours)

[Figure: bar chart of Processing Time (hours) by dataset scale: Small, Medium, Large, X-Large]

Memory Usage (GB)

[Figure: bar chart of Peak Memory (GB) by dataset scale: Small, Medium, Large, X-Large]

Detailed Benchmark Data

Dataset Scale   Samples   Processing Time   Memory Usage   Threads   Storage
Small           10        ~2 hours          16GB           16        50GB
Medium          50        ~8 hours          32GB           32        200GB
Large           100       ~15 hours         64GB           64        500GB
X-Large         500       ~72 hours         128GB          128       2TB
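
For sample counts that fall between the benchmarked scales, a rough planning figure can be obtained by interpolating the table above. The following is a minimal sketch, not part of MICOS-2024: the function name `estimate_hours` is hypothetical, the interpolation is linear by assumption, and each benchmark row used a different thread count, so the result mixes the effect of scale and parallelism.

```python
from bisect import bisect_left

# (samples, processing hours) copied from the Detailed Benchmark Data table.
BENCHMARKS = [(10, 2.0), (50, 8.0), (100, 15.0), (500, 72.0)]


def estimate_hours(samples: int) -> float:
    """Estimate processing time by linear interpolation between benchmark rows.

    Outside the benchmarked range, the nearest row is scaled proportionally.
    This is an assumption for illustration, not a measured behaviour.
    """
    counts = [n for n, _ in BENCHMARKS]
    if samples <= counts[0]:
        n, hours = BENCHMARKS[0]
        return hours * samples / n
    if samples >= counts[-1]:
        n, hours = BENCHMARKS[-1]
        return hours * samples / n
    i = bisect_left(counts, samples)
    (x0, y0), (x1, y1) = BENCHMARKS[i - 1], BENCHMARKS[i]
    return y0 + (y1 - y0) * (samples - x0) / (x1 - x0)


if __name__ == "__main__":
    for n in (25, 75, 200):
        print(f"{n} samples: ~{estimate_hours(n):.1f} hours")
```

Treat the output as an order-of-magnitude guide; actual runtime depends on read depth, thread count, and storage speed.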

Stage-by-Stage Performance

Quality Control Stage

Step             Time %   Peak Memory   Parallelizable
FastQC           5%       2GB
KneadData        35%      16GB
Quality Report   2%       1GB

Taxonomic Classification Stage

Step              Time %   Peak Memory   Parallelizable
Kraken2           25%      DB size
Krona             3%       4GB
BIOM Conversion   2%       2GB

Diversity Analysis Stage

Step            Time %   Peak Memory   Parallelizable
QIIME2          20%      8GB           Partial
Ordination      5%       4GB           Partial
Visualization   3%       2GB
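
The time shares in the three stage tables sum to 100%, so they can be turned into per-step wall-clock estimates for a given total runtime. The sketch below illustrates that arithmetic; the helper name `step_hours` is hypothetical, and it assumes the shares stay constant across dataset scales, which the benchmarks do not confirm.

```python
# Time shares (percent of total runtime) copied from the stage tables.
STEP_SHARE = {
    "FastQC": 5,
    "KneadData": 35,
    "Quality Report": 2,
    "Kraken2": 25,
    "Krona": 3,
    "BIOM Conversion": 2,
    "QIIME2": 20,
    "Ordination": 5,
    "Visualization": 3,
}


def step_hours(total_hours: float) -> dict:
    """Split a total runtime into per-step hours using the published shares."""
    return {step: total_hours * pct / 100 for step, pct in STEP_SHARE.items()}


if __name__ == "__main__":
    # Example: the Medium dataset is listed at roughly 8 hours in total.
    for step, hours in step_hours(8.0).items():
        print(f"{step:<16} ~{hours:.2f} h")
```

By these shares, KneadData and Kraken2 together account for 60% of the runtime, so they are the first places to look when tuning.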

Resource Planning Recommendations

Minimum Configuration

  • CPU: 8 cores
  • Memory: 32GB
  • Storage: 100GB SSD
  • Use Case: Teaching demos, small datasets

Recommended Configuration

  • CPU: 32 cores
  • Memory: 64GB
  • Storage: 500GB NVMe SSD
  • Use Case: Research projects, medium datasets

High-Performance Configuration

  • CPU: 64+ cores
  • Memory: 128GB+
  • Storage: 2TB NVMe SSD
  • Use Case: Production, large datasets
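
To check which of the configurations above a given machine satisfies, the thresholds can be encoded as data and compared against the available hardware. This is a minimal sketch under stated assumptions: the `Tier` type and `best_tier` function are hypothetical names, tiers are identified by their listed use case, and 2TB is taken as 2000GB.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    use_case: str
    cores: int
    memory_gb: int
    storage_gb: int


# Thresholds copied from the configuration lists, ordered smallest to largest.
TIERS = [
    Tier("Teaching demos, small datasets", 8, 32, 100),
    Tier("Research projects, medium datasets", 32, 64, 500),
    Tier("Production, large datasets", 64, 128, 2000),
]


def best_tier(cores: int, memory_gb: int, storage_gb: int) -> Optional[Tier]:
    """Return the largest tier whose CPU, memory, and storage are all met.

    Returns None when the machine is below the minimum configuration.
    """
    match = None
    for tier in TIERS:
        if (cores >= tier.cores
                and memory_gb >= tier.memory_gb
                and storage_gb >= tier.storage_gb):
            match = tier
    return match


if __name__ == "__main__":
    tier = best_tier(cores=32, memory_gb=64, storage_gb=1000)
    print(tier.use_case if tier else "Below minimum configuration")
```

A machine must meet all three thresholds to qualify for a tier, so a host with many cores but little memory still falls into a lower tier.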

Optimization Tips

  1. I/O Optimization: Use NVMe SSD for FASTQ storage
  2. Memory Optimization: Load Kraken2 database into memory
  3. Parallel Optimization: Efficiency is best when the number of samples exceeds the number of available cores
  4. Storage Optimization: Clean intermediate files to save space
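
The storage tip above can be sketched as a small cleanup helper. This is an illustration only: the suffix list is hypothetical and does not reflect the actual names of MICOS-2024 intermediate files, so adjust it to what your run produces. It defaults to a dry run that reports reclaimable space without deleting anything.

```python
from pathlib import Path

# Hypothetical suffixes for intermediate files; these are assumptions,
# not names confirmed by the pipeline.
INTERMEDIATE_SUFFIXES = (".tmp", ".sam")


def clean_intermediates(workdir, dry_run=True):
    """Return the total size in bytes of matching intermediate files.

    Files are deleted only when dry_run is False.
    """
    freed = 0
    # Materialize the listing first so deletions do not affect the traversal.
    for path in list(Path(workdir).rglob("*")):
        if path.is_file() and path.name.endswith(INTERMEDIATE_SUFFIXES):
            freed += path.stat().st_size
            if not dry_run:
                path.unlink()
    return freed


if __name__ == "__main__":
    reclaimable = clean_intermediates(".", dry_run=True)
    print(f"Reclaimable: {reclaimable / 1e9:.2f} GB")
```

Running with `dry_run=True` first makes it easy to confirm that no final results match the suffix list before anything is removed.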

MICOS-2024 whitepaper for reproducible metagenomics engineering.