Code for Genomic Signatures of Pre-Resistance in Mycobacterium tuberculosis. https://www.nature.com/articles/s41467-021-27616-7
The files reproduce the analysis described in Materials and Methods:
- Assembly, variant calling and pseudosequence
Main script: pseudoseq_pipeline.sh
VCF filtering and annotation: addFT.py
Pseudosequence creation: vcf2pseudoseq.py
- Phylogenetic inference
Tree inference: raxml.sh
Tree dating: runBactDat.R, run_bactDat.sh
- Phylogenetic analysis
Ancestral sequence reconstruction: anc_seq_recons.R
Survival analysis using the phylogenetic tree: survTree_functions.R, tree_functions.R, survival_analysis.R
TB profiler DB file: tb_profiler_db.complete.good.posStrRef.tsv
Genome index file: mtb.snps.dels.wg.withDR.assembly.finalSet.snpSites.masked.idx
- Genome-wide association study
Alignment preparation: gwas_alignment.R
GWAS: gwas2_analysis.R
This code has been tested with R version 4.1.1 (2021-08-10). Required dependencies are listed within the R scripts.
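
Since the dependencies are declared inside the R scripts rather than in a separate file, a quick way to collect them is to scan the scripts for package-loading calls. The following is a minimal sketch, not part of the repository: the helper name `list_r_deps` is hypothetical, and it assumes packages are loaded with plain `library(pkg)` or `require(pkg)` calls (packages used only through `pkg::function` will not be found).

```shell
# Hypothetical helper: list the packages loaded by the R scripts in a directory.
# Assumption: packages are loaded via library(pkg) or require(pkg).
list_r_deps() {
  dir="${1:-.}"
  # Extract each "library(pkg" or "require(pkg" fragment, without file names
  grep -hoE '(library|require)\([^),]+' "$dir"/*.R \
    | sed -E 's/^(library|require)\(//' \
    | tr -d "\"' " \
    | sort -u   # one line per package, duplicates removed
}
```

Running `list_r_deps .` from the repository root prints one package name per line; the result can then be compared against the packages installed in your R library.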