Configuration System
MICOS-2024 ships with a broad template vocabulary. That is useful, but it also means configuration should be read in two layers:
- stable CLI-relevant configuration, which directly feeds the current command surface,
- platform ambition configuration, which captures a wider analytical vision represented across scripts and workflow assets.
Configuration files
| File | Role | Notes |
|---|---|---|
| config/analysis.yaml.template | project and analysis parameters | broadest template, includes advanced sections |
| config/databases.yaml.template | database locations | important for validation and runtime defaults |
| config/samples.tsv.template | sample metadata | used to standardize cohort input |
Precedence model
In the current CLI implementation, the practical precedence, from highest to lowest, is:
- command-line flags,
- config/analysis.yaml,
- defaults carried by the code.
validate-config also inspects config/databases.yaml when present.
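
The precedence order above can be sketched as a layered merge. This is an illustration only, not the MICOS-2024 implementation: the function names `deep_merge` and `resolve_settings` are hypothetical, and treating a `None` flag value as "not provided" is an assumption.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`.

    Nested mappings are merged key by key. A value of None in `override`
    is treated as "not provided" and never replaces an existing value
    (an assumption made for this sketch).
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        elif value is not None:
            merged[key] = value
    return merged


def resolve_settings(defaults, file_config, cli_flags):
    """Apply the three layers from lowest to highest precedence."""
    return deep_merge(deep_merge(defaults, file_config), cli_flags)


# Hypothetical values for each layer.
defaults = {"resources": {"max_threads": 4}, "paths": {"output_dir": "results"}}
file_config = {"resources": {"max_threads": 16}, "paths": {"input_dir": "data/raw_input"}}
cli_flags = {"resources": {"max_threads": 8}, "paths": {"input_dir": None}}

settings = resolve_settings(defaults, file_config, cli_flags)
print(settings)
```

In this sketch the thread count comes from the command-line layer, the input directory from the file layer, and the output directory from the code defaults.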
Minimum viable configuration for the stable CLI
The current full-run command primarily needs:
- input directory,
- results directory,
- thread count,
- KneadData database path,
- Kraken2 database path.
Everything else is valuable context, but those fields are the real operational minimum for a working run.
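
A minimal sketch of checking that operational minimum before a run is shown below. It assumes the configuration has already been loaded into plain dictionaries, and it borrows key names from the example profiles in this document. The mapping `REQUIRED_FIELDS` and the helpers `lookup` and `missing_fields` are hypothetical and are not part of the MICOS-2024 CLI.

```python
# Hypothetical mapping from each required item to (source file, key path).
# The key paths mirror the example profiles in this document and may not
# match the keys the real CLI reads.
REQUIRED_FIELDS = {
    "input directory": ("analysis", ("paths", "input_dir")),
    "results directory": ("analysis", ("paths", "output_dir")),
    "thread count": ("analysis", ("resources", "max_threads")),
    "KneadData database path": ("databases", ("quality_control", "kneaddata", "human_genome")),
    "Kraken2 database path": ("databases", ("taxonomy", "kraken2", "standard")),
}


def lookup(config, key_path):
    """Walk a nested dict along key_path; return None if any key is absent."""
    node = config
    for key in key_path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_fields(analysis, databases):
    """Return the labels of required fields that are absent or empty."""
    sources = {"analysis": analysis, "databases": databases}
    missing = []
    for label, (source, key_path) in REQUIRED_FIELDS.items():
        value = lookup(sources[source], key_path)
        if value is None or value == "":
            missing.append(label)
    return missing


# Example: the Kraken2 database entry has been left out.
analysis = {
    "paths": {"input_dir": "data/raw_input", "output_dir": "results"},
    "resources": {"max_threads": 16},
}
databases = {
    "quality_control": {"kneaddata": {"human_genome": "/db/kneaddata/human_genome"}},
}

missing = missing_fields(analysis, databases)
print(missing)
```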
Recommended setup sequence

    cp config/analysis.yaml.template config/analysis.yaml
    cp config/databases.yaml.template config/databases.yaml
    cp config/samples.tsv.template config/samples.tsv
    python -m micos.cli validate-config --config config/analysis.yaml

Example, minimal operational profile

    paths:
      input_dir: "data/raw_input"
      output_dir: "results"
    resources:
      max_threads: 16
    quality_control:
      kneaddata:
        threads: 8
    taxonomic_profiling:
      kraken2:
        threads: 16
        confidence: 0.1

Example, database template expectations

    quality_control:
      kneaddata:
        human_genome: "/db/kneaddata/human_genome"
    taxonomy:
      kraken2:
        standard: "/db/kraken2/standard"

Why the templates are broader than the CLI
The repository contains:
- stable CLI modules,
- workflow assets,
- specialist scripts for differential abundance, network analysis, phylogenetics, amplicon workflows, and metatranscriptomics.
The configuration templates reflect that wider platform horizon. The docs therefore explain them as a superset, not as proof that every section is equally stable in the main CLI path.
Validation posture
Use validation early:

    python -m micos.cli validate-config

This catches missing files, placeholder database paths, and structural issues before longer jobs begin.
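
To illustrate the kind of check involved, here is a hedged sketch that scans a loaded database configuration for placeholder values and for paths that do not exist. It does not reproduce what `validate-config` actually does. The `PLACEHOLDER_MARKERS` tuple and both function names are hypothetical, and the real templates may mark placeholders differently.

```python
import tempfile
from pathlib import Path

# Hypothetical substrings that suggest a template value was never edited.
PLACEHOLDER_MARKERS = ("/path/to", "CHANGE_ME", "<")


def iter_leaf_strings(node, prefix=()):
    """Yield (dotted_key, value) for every string leaf in a nested dict."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaf_strings(value, prefix + (str(key),))
    elif isinstance(node, str):
        yield ".".join(prefix), node


def check_database_paths(databases):
    """Return human-readable problems found among the configured paths."""
    problems = []
    for dotted_key, value in iter_leaf_strings(databases):
        if any(marker in value for marker in PLACEHOLDER_MARKERS):
            problems.append(f"{dotted_key}: placeholder value {value!r}")
        elif not Path(value).exists():
            problems.append(f"{dotted_key}: path does not exist: {value}")
    return problems


# Example: one entry points at a real directory, the other is a placeholder.
with tempfile.TemporaryDirectory() as real_dir:
    databases = {
        "quality_control": {"kneaddata": {"human_genome": real_dir}},
        "taxonomy": {"kraken2": {"standard": "/path/to/kraken2_db"}},
    }
    problems = check_database_paths(databases)

for problem in problems:
    print(problem)
```

Running a check like this before a long job reports every problem in one pass instead of failing on the first missing database partway through a run.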
Configuration advice for contributors
If you add a new config field, decide which layer it belongs to:
- stable CLI contract,
- workflow asset support,
- specialist script support.
That decision should shape where it is documented and how prominently it appears.