Genome Assembly
This tool should be run on complete genome sequences or assemblies. Input files must be in FASTA format. If you are making your own assemblies, they can be assembled de novo with SPAdes.
Example SPAdes command:
/pathtospades/spades.py --only-assembler -1 prefix_1.fastq.gz -2 prefix_2.fastq.gz --careful -o outdir -t 8
Note that if the contigs in your assembly do not cover certain regions, deletions will not be detected as such by NucDiff and may instead be marked as relocations. It may help to use a reference-guided method with a complete genome of H37Rv (or of a lineage more closely related to your genome) to close these gaps in the assembly. One tool for this is RaGOO, which outputs a new genome file with some of the gaps closed. This may or may not suit your needs, but be aware of the difference between using incomplete assemblies and complete genomes for comparison.
Example RaGOO command:
ragoo.py scaffolds.fasta MTB-H37Rv.fna
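To get a rough idea of how fragmented an assembly is before and after scaffolding, you can count the sequences and the N (gap placeholder) bases in each FASTA file. The following is only a minimal sketch: the helper names `count_seqs` and `count_gap_bases` are hypothetical, and the output path `ragoo_output/ragoo.fasta` in the commented usage lines is an assumption about where RaGOO writes its result, so check it against your own run.

```shell
# Hypothetical helper: count sequences (FASTA header lines) in a file.
count_seqs() {
    grep -c '^>' "$1"
}

# Hypothetical helper: count N/n bases in the sequence lines of a file.
# Scaffolders usually join contigs with runs of N, so a non-zero count
# means some gaps were bridged with placeholders rather than real sequence.
count_gap_bases() {
    grep -v '^>' "$1" | tr -cd 'Nn' | wc -c | tr -d ' '
}

# Usage (paths are assumptions, adjust to your own files):
#   count_seqs scaffolds.fasta
#   count_seqs ragoo_output/ragoo.fasta
#   count_gap_bases ragoo_output/ragoo.fasta
```

A lower sequence count after scaffolding suggests contigs were joined, while the N count indicates how much of the result is still unknown sequence and therefore not comparable to a complete genome.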