- Fix issue with PyPI installation
- Added support for Bcftools and Clair3 VCFs
- Added support for getting depth info from BAM files with pysam
- Ct column no longer added to Stats/QC sheet if no Ct info provided
- Added ruff for linting, replacing flake8
- Fixed issue with not all Nextclade columns being included in Excel report
- Migrated to `pyproject.toml` from `setup.py`; removed files associated with legacy `setup.py` approach
- Migrated to Markdown from RST for README and CHANGELOG
- Removed `docs/` directory
- Added issue template forms
- Updated GitHub Actions CI workflow
Fixes sample name extraction from nf-core/viralrecon Medaka VCF filenames containing ".merged" so that the sample name matches the one extracted from the SnpSift filename.
- Added more checks for Medaka VCFs from low coverage samples which may produce ValueError and ZeroDivisionError errors
- Add support for reading annotated Medaka VCF files (`medaka_variant` VCF annotated with `medaka tools annotate`)
- Changed mutation string format to `{gene}:{AA change} ({NT change}{extra})` if there is an AA change
- Added low coverage filtering of variants for Medaka VCF
- "Variants Summary" table now sorted by nucleotide position
- Fixed shorter consensus sequences not being written to report
- Improve nf-virontus VCF compatibility
Fixes and changes from PR #15
- low coverage coordinate output off by one (`xlavir.tools.mosdepth.get_interval_coords_bed`)
- error on no Pangolin reports found (e.g. non-SARS-CoV-2 report) (`xlavir.tools.pangolin.get_info`)
- user QC thresholds not being used (`xlavir.xlavir.run`)
- not showing all QC fail comments (`xlavir.qc.create_qc_stats_dataframe`)
- consensus sequences being too long for Excel cell character limit (32,767 characters); longer sequences are chunked into 80 character segments with one segment per line in consensus sheet (`xlavir.tools.consensus.read_fasta`)
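
The chunking described in the last item can be sketched roughly as follows. This is only an illustration and not the code in `xlavir.tools.consensus.read_fasta`; the names `chunk_sequence` and `sequence_rows` are hypothetical, and it assumes a sequence is split only when it exceeds the Excel cell limit.

```python
from __future__ import annotations

EXCEL_CELL_CHAR_LIMIT = 32_767  # maximum number of characters Excel allows in one cell


def chunk_sequence(seq: str, width: int = 80) -> list[str]:
    """Split seq into consecutive segments of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def sequence_rows(seq: str, width: int = 80) -> list[str]:
    """Hypothetical helper: one row if seq fits in a cell, else one row per segment."""
    if len(seq) <= EXCEL_CELL_CHAR_LIMIT:
        return [seq]
    return chunk_sequence(seq, width)
```

Joining the returned rows with an empty string gives back the original sequence, so no bases are lost by the split.
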
- Ignore and skip unsupported VCFs instead of throwing NotImplementedError (`xlavir.tools.variants.get_info`)
- In consensus sheet, only add QC comments on FASTA header rows if necessary (`xlavir.io.xl.add_comments`)
- Fixed issue (#12) where iVar ref allele depth corresponds to depth of base before deletion. For indels, ref allele depth is taken from the total depth minus the alt allele depth.
- Fixed issue (#14) where the total number of reads from `samtools flagstat` may not be the true number of reads. The unmapped reads may be excluded from the BAM file so the `samtools flagstat` total number of reads may be equal to the number of mapped reads. There is now a search for fastp JSON files to get the true total number of reads.
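
As a rough sketch of the fastp lookup mentioned for issue #14: fastp JSON reports store read counts under `summary`, and the example below reads `before_filtering.total_reads`. Which field xlavir actually uses is an assumption here, and the function names are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def total_reads_from_report(report: dict) -> int | None:
    """Return the pre-filtering read count from a parsed fastp report, or None if absent."""
    try:
        # Assumed field: read count before fastp filtering
        return int(report["summary"]["before_filtering"]["total_reads"])
    except (KeyError, TypeError, ValueError):
        return None


def total_reads_from_fastp_json(path: Path) -> int | None:
    """Load a fastp JSON file and return its pre-filtering read count, if present."""
    with open(path) as fh:
        return total_reads_from_report(json.load(fh))
```

Returning `None` rather than raising lets a caller fall back to the `samtools flagstat` count when no usable fastp report is found.
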
- Added support for Nanopolish VCF parsing as generated by the ARTIC pipeline
- Added deduplication of VCF and SnpSift entries since the ARTIC pipeline may produce VCF files with duplicate variant calls due to overlap between amplicons.
- Added VCF and SnpSift test data for CLI test to generate Excel report.
- Fix an issue where single base positions are being reported as 0-based when all other ranges are 1-based for reporting of low/no coverage regions from Mosdepth per-base BED files (#10).
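
The coordinate fix above comes down to converting BED intervals, which are 0-based and half-open, into 1-based inclusive positions for both ranges and single bases. A minimal sketch, where the function name `format_region` and the `start-end` output format are assumptions rather than xlavir's actual code:

```python
def format_region(start: int, end: int) -> str:
    """Format a 0-based, half-open BED interval as a 1-based, inclusive region."""
    if end <= start:
        raise ValueError("BED interval must have end > start")
    first = start + 1  # shift the 0-based start to 1-based
    last = end         # the half-open end equals the 1-based inclusive end
    # A single-base interval is reported as one position, not a range
    return str(first) if first == last else f"{first}-{last}"
```

For example, the BED interval `(99, 100)` covers only the 100th base, so it is reported as `100` rather than `99`.
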
- Add support for nf-core/viralrecon version 2.0 (requires Mosdepth `bed.gz` files be output; needs custom `modules.config` like this one)
- Nextclade CLI per sample results parsed into sheet showing useful info like Nextstrain clade, # of mutations, # of PCR primer changes
- Added check that input directory exists and is a directory
- Added sheet with xlavir info
- Added Gene, Variant Effect, Variant Impact, Amino Acid Change to Variant Summary table
- Add reference sequence length to QC stats table. Get ref seq length from max mosdepth per base BED coverage value.
- Add more conditional formatting
- Fix `execution_report.html` finding
- Fix version printing; add to help
- Add epilog with usage info
- Adds "Variants Summary" sheet summarizing variant information across all samples
- Adds comments to AF values in "Variant Matrix" sheet
- Fixes width/height of cell comments to be based on length of comment text
- Adds support for adding Ct values from a Ct values table (tab-delimited, CSV, ODS, XLSX format) into an xlavir report.
- Fixes issue with SnpSift table file parsing and variable naming in variants.py (#4, #5)
- Fixes issue with SnpSift table file parsing. Adds check to see if SnpSift column is dtype object/str before using .str Series methods (#4)
- Fixes issue with SnpEff/SnpSift AA change parsing.
- Fix division by zero error due to variants with DP values of 0
- Added header comments with descriptions of field content
- Added comment to Variant Matrix sheet A1 cell describing what is shown in the matrix
- Added highlighting of samples failing QC in other sheets
- Fixed image scaling by determining image size with imageio
- Added Medaka/Longshot VCF parsing
- Collect sample results from nf-core/viralrecon or peterk87/nf-virontus into an Excel report
- iVar VCF parsing
- QA/QC of sample analysis results (basic PASS/FAIL based on minimum genome coverage and depth)
- Nextflow workflow execution information
- Prepend worksheets from other Excel documents into the report (e.g. cover page/sheet, sample sheet, lab results)
- Add custom images into worksheets with custom names and descriptions (e.g. phylogenetic tree figure PNG)