MANGO
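A reference-free genomic sequence compression method based on context modeling. It compresses genomic data efficiently by learning the local statistical features of the sequence. Because it reaches a high compression ratio without a reference genome, it suits new species and other cases where no reference genome is available.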
| Property | Value |
|---|---|
| Purpose | Genomic sequence compression without a reference genome |
| Time Complexity | O(n) |
| Space Complexity | O(n) |
| Year | 2018 |
| Difficulty | Advanced |
| Languages | C++ |
| Category | Data Compression |
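The context-modeling idea in the description can be illustrated with a small sketch. This is not MANGO's actual implementation: it is a generic adaptive order-k model, and the names `base_index` and `estimate_bits`, the Laplace (+1) smoothing, and the all-`A` starting context are assumptions made for illustration. It predicts each base from the k bases before it and sums the ideal code length, which is roughly the size an arithmetic coder driven by the same model would produce.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper: map a nucleotide to 0..3, or -1 for any other symbol.
inline int base_index(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Hypothetical helper: estimated coded size of `seq` in bits under an
// adaptive order-k context model. Not taken from the MANGO source.
double estimate_bits(const std::string& seq, int k) {
    assert(k >= 0 && k <= 31);  // a context of k bases is packed into 2k bits
    // Per-context symbol counts; new entries start at zero.
    std::unordered_map<std::uint64_t, std::array<std::uint32_t, 4>> counts;
    const std::uint64_t mask =
        (k == 0) ? 0 : ((std::uint64_t{1} << (2 * k)) - 1);
    std::uint64_t ctx = 0;  // assumption: the sequence is preceded by k 'A's
    double bits = 0.0;
    for (char c : seq) {
        const int b = base_index(c);
        if (b < 0) continue;  // assumption: non-ACGT symbols are handled elsewhere
        auto& row = counts[ctx];
        const double total = static_cast<double>(row[0]) + row[1] + row[2] + row[3];
        // Laplace-smoothed probability of the observed base in this context.
        const double p = (row[b] + 1.0) / (total + 4.0);
        bits += -std::log2(p);  // ideal code length for this base
        ++row[b];               // adapt the model after coding
        ctx = ((ctx << 2) | static_cast<std::uint64_t>(b)) & mask;
    }
    return bits;
}
```

Calling `estimate_bits(seq, 2)` and `estimate_bits(seq, 0)` on the same sequence shows how much the local context helps: on repetitive input the order-2 estimate falls well below the roughly 2 bits per base of a context-free model. The loop visits each base once, which matches the linear time listed in the table, and the context table grows with the number of distinct contexts seen.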
Complexity Analysis
- Time Complexity: O(n)
- Space Complexity: O(n)
Performance Insight: Running time is linear, O(n), so it grows in proportion to input size. That makes the method a reasonable candidate for TB-scale data and streaming pipelines. The linear space requirement can often be reduced by a constant factor with sliding-window techniques.
Note: Complexity analysis is based on theoretical models. Actual runtime is affected by input scale, hardware, and implementation optimizations. Benchmark for your specific workload.
Related Tools
Genozip · CRAM · gzip