
GCSA2

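A graph indexing method based on the generalized compressed suffix array. It encodes all paths of a variation graph into a searchable index structure that supports efficient k-mer search and exact matching. The method is the core indexing engine of the VG toolkit.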
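To make the idea of a searchable k-mer index over graph paths concrete, here is a minimal sketch in C++. It is not the GCSA2 data structure or its API: it stores the k-mers in a plain sorted array instead of a compressed suffix array, and `ToyGraph`, `KmerHit`, `build_kmer_index`, and `find_kmer` are hypothetical names introduced only for illustration. It assumes a tiny graph with one base per node.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy variation graph (hypothetical type): one base per node, edges to successors.
struct ToyGraph {
    std::string labels;                          // labels[v] = base stored at node v
    std::vector<std::vector<std::size_t>> next;  // next[v]   = successor nodes of v
};

// One index entry: a k-mer and the node where a path spelling it starts.
using KmerHit = std::pair<std::string, std::size_t>;

// Depth-first walk from node v, emitting an entry once the path spells k bases.
// Recursion depth is bounded by k, so cycles in the graph do not cause runaway recursion.
void extend(const ToyGraph& g, std::size_t v, std::size_t start, std::size_t k,
            std::string& prefix, std::vector<KmerHit>& out) {
    prefix.push_back(g.labels[v]);
    if (prefix.size() == k) {
        out.emplace_back(prefix, start);
    } else {
        for (std::size_t w : g.next[v]) extend(g, w, start, k, prefix, out);
    }
    prefix.pop_back();
}

// Enumerate every k-mer spelled by some path and keep the entries sorted and unique.
std::vector<KmerHit> build_kmer_index(const ToyGraph& g, std::size_t k) {
    assert(k > 0 && g.labels.size() == g.next.size());
    std::vector<KmerHit> index;
    std::string prefix;
    for (std::size_t v = 0; v < g.labels.size(); ++v) {
        extend(g, v, v, k, prefix, index);
    }
    std::sort(index.begin(), index.end());
    index.erase(std::unique(index.begin(), index.end()), index.end());
    return index;
}

// Exact-match lookup: binary search to the first entry for `kmer`, then scan its run.
std::vector<std::size_t> find_kmer(const std::vector<KmerHit>& index,
                                   const std::string& kmer) {
    std::vector<std::size_t> starts;
    auto it = std::lower_bound(index.begin(), index.end(), KmerHit{kmer, 0});
    for (; it != index.end() && it->first == kmer; ++it) {
        starts.push_back(it->second);
    }
    return starts;
}
```

For example, a single-base bubble with labels `"GATCA"` and edges 0→1, 0→2, 1→3, 2→3, 3→4 spells the two paths GACA and GTCA. With k = 3 the index holds four entries (ACA, GAC, GTC, TCA), and looking up `GTC` returns start node 0. The sorted array keeps the sketch short; a compressed index answers the same kind of query in far less memory.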

Property | Value
Purpose | Efficient k-mer indexing of variation graphs
Time Complexity | O(n)
Space Complexity | O(n)
Year | 2017
Difficulty | Advanced
Languages | C++
Category | Graph Genomics

Complexity Analysis

  • Time Complexity: O(n)
  • Space Complexity: O(n)

Performance Insight: The algorithm runs in linear time, O(n), so runtime grows in proportion to input size. This makes it suitable for TB-scale data and streaming pipelines. The linear space requirement can often be reduced by a constant factor with sliding-window techniques.

Note: Complexity analysis is based on theoretical models. Actual runtime is affected by input scale, hardware, and implementation optimizations. Benchmark for your specific workload.

Literature & Implementation

VG · Minigraph · GraphAligner

Tags

indexing k-mer compressed-suffix-array variation-graph

Released under the MIT License.