Releases: broadinstitute/gatk-protected
gatk-protected-1.0.0.0-alpha1.2.7.2
This release is the same as gatk-protected-1.0.0.0-alpha1.2.6 with commits b51299f615f06ea5263119f2a94485bd829be0c2, a315b95b4c1a4a8db4c5835dcdf99e239faff812, and 581311359c9c5e41b272b8386ae0e2796f9563c2 applied. All of these commits relate to fixes for FilterByOrientationBias.
gatk-protected-1.0.0.0-alpha1.2.6
Mutect 2:
- Using new SplitIntervals tool. WDL updated.
Other:
- New GATK being used (4.alpha.2-234-gc3a44c9-20170427.144324-1).
- New pairHMM interface support.
gatk-protected-1.0.0.0-alpha1.2.5
Mutect 2:
- First cut of WDL for running multiple and single pairs
- Tumor-only supported in WDL
- Base quality and mapping quality filters added and associated annotations
- Fixed STR annotation, so that filter (str_contraction) has been activated
- Additional new annotations added: median fragment length for reads supporting ref and alt; median distance from end of read
- Automated testing of WDL
- Optional Oncotator and Orientation Bias Filter tasks added. Note that the Oncotator task requires a Docker image. A Picard jar is required for the Orientation Bias Filter (for now).
- Fix to Orientation Bias Filter to accept empty VCFs
- Orientation Bias Filter now puts filter annotation in the INFO field and in the FORMAT field
CNV/ACNV (v1):
- First cut of improved WDL
- Automated testing of WDL upgraded
- CNLoH caller has been renamed to BalancedSegmentCaller. The loss-of-heterozygosity calls should be considered experimental and are very fragile.
Known issue:
- Docker image is bigger
gatk-protected-1.0.0.0-alpha1.2.4
New features/bugfixes
- CNV plotting tools now determine plotting regions from a sequence dictionary.
- New CLIs for HMM-based segmentation (PerformJointSegmentation, PerformCopyRatioSegmentation, PerformAlleleFractionSegmentation) are still under active development and should not be used.
gatk-protected-1.0.0.0-alpha1.2.3
New features/bugfixes
- Added WDL scripts for running somatic CNV case/PoN workflows
- Added tool for genotyping sex from coverage data (TargetCoverageSexGenotyper)
- Added ACNV tools CreateAllelicPanelOfNormals and CalculatePulldownPhasePosteriors
- Somatic CNV/ACNV tools now take TSV target files as input, rather than BED files (the latter can be converted into the former using the tool ConvertBedToTargetFile)
- Fixed various workflow and plotting issues
- Removed dependency on HDF5-Java JNI Libraries
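To illustrate the kind of coordinate handling a BED-to-TSV target conversion involves, here is a minimal sketch. BED intervals are 0-based and half-open; the assumption made here (not stated in these notes) is that the TSV target file uses 1-based inclusive coordinates with contig, start, stop, and name columns. The function name is hypothetical and this is not the ConvertBedToTargetFile implementation.

```python
def bed_line_to_target_row(bed_line: str) -> str:
    """Convert one BED line to a tab-separated target row.

    Assumption: the target format is 1-based inclusive with the columns
    contig, start, stop, name. This is an illustrative guess, not the
    documented behavior of ConvertBedToTargetFile.
    """
    fields = bed_line.rstrip("\n").split("\t")
    contig, start, end = fields[0], int(fields[1]), int(fields[2])
    # BED start is 0-based, so shift it by one; the half-open BED end
    # already equals the 1-based inclusive stop.
    one_based_start = start + 1
    # Fall back to a coordinate-derived name when the BED line has no name column.
    name = fields[3] if len(fields) > 3 else f"{contig}:{one_based_start}-{end}"
    return "\t".join([contig, str(one_based_start), str(end), name])


row = bed_line_to_target_row("1\t99\t200\ttarget_1\n")
```

In practice the ConvertBedToTargetFile tool should be used; the sketch only shows why BED and TSV target coordinates cannot be copied across unchanged under the stated assumption.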
gatk-protected-1.0.0.0-alpha1.2.2
New features/bugfixes
- Fixed a compiler warning that prevented the previous version from building.
Note: This release was tagged for use by a specific research group at the Broad Institute and so not all of the tools and workflows are in a working state (e.g., GATK ACNV). Other users should use alpha1.2.1 if they are able to build it successfully and alpha1.2 otherwise.
gatk-protected-1.0.0.0-alpha1.2.1
New features/bugfixes
- New option -numIterSimPerFit 0 to disable refitting between sim-seg iterations. See #503 for more information. When used, this speeds up ACNV, since it will do fewer iterations of MCMC. Results do not seem to be impacted in any appreciable way.
- Fixed ACS conversion bug that was raising the segment mean to the power of two, twice.
- Made ACNV Plotting a bit more robust to code changes in other tools.
- Made the x-axes in ACNV Plotting round the genomic position a bit more sensibly.
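The ACS conversion fix in this release concerns a transformation that was applied twice. As a rough sketch of why that matters, the example below assumes one possible reading of the bug: the segment mean is stored as a log2 copy ratio and is converted to linear space with 2**x. Both the reading and the function name are assumptions for illustration, not the actual conversion code.

```python
def log2_to_linear(segment_mean_log2: float) -> float:
    """Convert a log2 copy ratio to a linear copy ratio (assumed conversion)."""
    return 2.0 ** segment_mean_log2


segment_mean_log2 = 1.0  # hypothetical segment with a log2 copy ratio of 1

# Intended behavior: convert exactly once.
converted_once = log2_to_linear(segment_mean_log2)

# Buggy behavior under this reading: the already-converted value is
# converted again, inflating the result.
converted_twice = log2_to_linear(log2_to_linear(segment_mean_log2))
```

Under this assumed reading, a single conversion gives 2.0 while the double conversion gives 4.0, so amplified segments would appear more amplified than they are.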
gatk-protected-1.0.0.0-alpha1.2
New features/bugfixes
- New het pulldown tool introduced: GetBayesianHetCoverage. This can produce a het pulldown with a matched tumor-normal pair or with a tumor sample only.
- Due to the new het pulldown tool mentioned above, ACNV can be run without a matched normal sample.
- Better performance in ACNV
- More documentation
- Improved plotting for ACNV
- Plotting of each segment in ACNV
- Somatic WGS support. This has not been evaluated yet. Users should use the SparkGenomeReadCounts tool for coverage collection of WGS samples. See Performance Notes at the end of this post.
- Germline CNV toolchain. Performance will be improved soon, so users should be warned that this code will be changed in the near future.
- CLI tool for GC correction in WGS or capture data.
- CNLoH caller implemented using ACNV results
- Balanced segment calling using ACNV results is part of the CNLoH caller. In other words, it produces calls as to whether a segment has a MAF of 0.5.
- Added new conversion as part of CNLoH caller that will produce files that can be ingested by TITAN.
- Moved the conversion of ACNV to CGA AllelicCapSeg file format to the CNLoH Caller and integrated the balanced caller with this process.
- Updated WDL for tumor normal case sample workflow.
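The balanced-segment idea above (deciding whether a segment has a MAF of 0.5) can be sketched as follows. This is a minimal illustration, assuming each segment carries a credible interval for its minor-allele fraction and that a segment is called balanced when the upper bound comes within a tolerance of 0.5. The Segment fields, the is_balanced function, and the tolerance value are all hypothetical and do not reflect the actual caller's logic.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    contig: str
    start: int
    end: int
    maf_lower: float  # lower bound of the MAF credible interval (hypothetical field)
    maf_upper: float  # upper bound of the MAF credible interval (hypothetical field)


def is_balanced(segment: Segment, tolerance: float = 0.02) -> bool:
    """Call a segment balanced if its MAF interval reaches (nearly) 0.5.

    The minor-allele fraction cannot exceed 0.5, so only the upper bound
    needs to be compared. The tolerance is an illustrative choice.
    """
    return segment.maf_upper >= 0.5 - tolerance


segments = [
    Segment("1", 1, 1_000_000, maf_lower=0.46, maf_upper=0.495),
    Segment("2", 1, 1_000_000, maf_lower=0.20, maf_upper=0.30),
]
calls = [is_balanced(s) for s in segments]
```

Here the first segment would be called balanced and the second would not. A real caller would likely also account for segment length and posterior shape.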
Incomplete
- Somatic WGS has not been evaluated yet.
- Still using a development version of the GATK4
- GATK ACNV (GetHetCoverage and AllelicCNV) on WGS samples takes ~36 hours. This will be improved in the next milestone. Running GetBayesianHetCoverage is unlikely to improve this runtime.
Performance Notes (WGS)
All numbers were obtained using a bin size of 3k bases.
- SparkGenomeReadCounts on WGS data: ~5.5 hours on Broad NFS and ~5 minutes on HDFS with a 120-core Spark cluster
- CalculateTargetCoverage does not leverage Spark (yet), so no improvement can be gained there. We have seen runtimes well over 12 hours.
- PoN creation steps (not including coverage collection) took under 1.25 hrs for 142 samples on Broad NFS
- GATK CNV case workflow on WGS sample takes ~6 hrs
Notes
- The previous GetHetCoverage tool has not been removed. By default, we will not be using it in workflow releases.
- GC correction has not yet been included in the workflows.
gatk-protected-1.0.0.0-alpha1.1
Many improvements including:
- Better performance in ACNV
- More documentation
- Percentiles in ACNV output distributions
Incomplete:
- Germline CNV tools are not evaluated and should be considered incomplete
- CN LOH caller not implemented
- CN LOH plotting not implemented
- Still using a development version of the GATK4
gatk-protected-1.0.0.0-alpha1-rc1
- Introducing ACNV, which integrates heterozygous SNP (germline) information. See documentation for details.
- Fast WGS coverage collection
- Exposing many more parameters in segmentation
- Improved documentation
- Added Dockerfile
- Generic MCMC library
- Removed sample name parameter
Incomplete:
- Plotting ACNV results still unfinished
- CN LOH caller not implemented
- CN LOH plotting not implemented
- Still using a development version of the GATK4