Small review fixups
clintval committed May 10, 2024
1 parent 0090746 commit 7067311
Showing 3 changed files with 8 additions and 8 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -28,23 +28,23 @@ Documentation can be found in the [docs folder](docs/01_Introduction.md).
The `fgsv` toolkit contains tools for effective structural variant debugging but are not meant to be used as a structural variant calling toolchain in-and-of-itself.
Instead, it is better to think of `fgsv` as an effective breakpoint detection and structural variant exploration toolkit.

-When describing structural variation, we use the term breakpoint to mean a junction between two loci and the term breakend to refer to one of the loci in a breakpoint.
+When describing structural variation, we use the term *breakpoint* to mean a junction between two loci and the term *breakend* to refer to one of the loci in a breakpoint.
Importantly, all point intervals (1-length) reported by this toolkit are 1-based inclusive from the perspective of the reference sequence.

### `fgsv SvPileup`

-Collates a pileup of putative structural variant supporting reads.
+Collates pileups of reads over breakpoint events.

```console
fgsv SvPileup \
--input sample.bam \
--output sample.svpileup
```

-The tool [`fgsv SvPileup`](https://github.com/fulcrumgenomics/fgsv/blob/main/docs/tools/SvPileup.md) takes a query-grouped BAM file as input and scans through each template one at a time, where a template is the full collection of reads and alignments from a single source molecule.
+The tool [`fgsv SvPileup`](https://github.com/fulcrumgenomics/fgsv/blob/main/docs/tools/SvPileup.md) takes a queryname-grouped BAM file as input and scans each group of alignments for structural variant evidence.
For example, a paired-end read may have an alignment per read: one alignment for read 1 and another alignment for read 2.

-Primary and supplementary alignments for a template (see the [SAM Format Specification v1](https://samtools.github.io/hts-specs/SAMv1.pdf) for more information) are used to construct a “chain” of aligned sub-segments in a way that honors the logical ordering of sub-segments and their strandeness in relation to the reference sequence.
+Primary and supplementary alignments for a template (see the [SAM Format Specification v1](https://samtools.github.io/hts-specs/SAMv1.pdf) for more information) are used to construct a “chain” of aligned sub-segments in a way that honors the logical ordering of sub-segments and their strandedness in relation to the reference sequence.
These aligned sub-segments in a chain relate to each other through typical alignment mechanisms like insertions and deletions but also contain information about the relative orientation of the sub-segment to the reference sequence and importantly, jumps between reference sequences such as translocations between chromosomes or contigs.

For each chain of aligned sub-segments per template, outlier jumps are collected where the minimum inter-segment distance within a read must be 100bp (by default) or greater, and the minimum inter-read distance across reads (e.g. between reads in a paired-end read) must be 1000bp (by default) or greater.
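
Reviewer note: the outlier-jump rule described above can be sketched roughly as follows. This is a minimal Python illustration, not the `fgsv` implementation. The `Segment` type, the `reference_gap` helper, and the way distance is measured (reference bases strictly between two segments) are assumptions made for the example; only the 100bp and 1000bp defaults come from the README text. Strand changes are left out of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned sub-segment of a template (illustrative type, not fgsv's own)."""
    contig: str
    start: int        # 1-based inclusive
    end: int          # 1-based inclusive
    read_number: int  # 1 or 2 for paired-end data


def reference_gap(a: Segment, b: Segment) -> int:
    """Reference bases strictly between two segments; 0 if adjacent or overlapping."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def is_outlier_jump(
    a: Segment,
    b: Segment,
    min_intra_read_distance: int = 100,   # default quoted in the README
    min_inter_read_distance: int = 1000,  # default quoted in the README
) -> bool:
    """Return True if moving from segment a to segment b looks like an outlier jump."""
    if a.contig != b.contig:
        # A jump between reference sequences, e.g. a translocation.
        return True
    same_read = a.read_number == b.read_number
    threshold = min_intra_read_distance if same_read else min_inter_read_distance
    return reference_gap(a, b) >= threshold
```

With these assumptions, two segments of the same read separated by 49 bases are not a jump, while segments from read 1 and read 2 need at least 1000 bases between them to qualify.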
@@ -56,7 +56,7 @@ The output of this tool is a metrics file tabulating the breakpoints and a BAM f

### `fgsv AggregateSvPileup`

-Merges nearby pileups of reads supporting putative breakpoints.
+Aggregates and merges pileups that are likely to support the same breakpoint.

```console
fgsv AggregateSvPileup \
@@ -68,7 +68,7 @@ fgsv AggregateSvPileup \
Because of variability in typical short-read alignments, evidence for a single breakpoint may span a few loci near the true breakend loci. For example, if the breakpoint only has intra-read evidence, then the breakpoint could coincidentally occur within the unobserved bases between read 1 and read 2 in a pair. In other cases and due to sequence similarity or homology between each breakend locus, it is not always possible to locate the exact nucleotide point where the breakends occur, and instead a plausible region may exist that supports either breakend loci.

The tool [`fgsv AggregateSvPileup`](https://github.com/fulcrumgenomics/fgsv/blob/main/docs/tools/AggregateSvPileup.md) is used to coalesce nearby breakpoints into one event if they appear to belong to one true breakpoint.
-This polishing step preserves true positive breakpoint events and intends to reduce the number of false positive breakpoint events.
+This polishing step preserves true positive breakpoint events and is intended to reduce the number of false positive breakpoint events.

Adjacent breakpoints are only merged if their left breakends map to the same reference sequence, their right breakends map to the same reference sequence, the strandedness of the left and right aligned sub-segments is the same, and their left and right genomic breakend positions are both within a given length threshold.
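
Reviewer note: the four merge conditions in the paragraph above can be written as a single predicate. The Python sketch below is illustrative only and is not taken from the `fgsv` source. The `Breakpoint` fields and the name `can_merge` are hypothetical, and treating the length threshold as inclusive is an assumption. The README does not state a default threshold here, so `max_dist` is a required argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    """A breakpoint as a pair of breakends (illustrative type, not fgsv's own)."""
    left_contig: str
    left_pos: int          # 1-based inclusive
    left_positive: bool    # strand of the left aligned sub-segment
    right_contig: str
    right_pos: int         # 1-based inclusive
    right_positive: bool   # strand of the right aligned sub-segment


def can_merge(a: Breakpoint, b: Breakpoint, max_dist: int) -> bool:
    """Return True if two breakpoints satisfy all four merge conditions."""
    same_contigs = (
        a.left_contig == b.left_contig and a.right_contig == b.right_contig
    )
    same_strands = (
        a.left_positive == b.left_positive
        and a.right_positive == b.right_positive
    )
    # Assumption: "within a given length threshold" is treated as inclusive.
    close_enough = (
        abs(a.left_pos - b.left_pos) <= max_dist
        and abs(a.right_pos - b.right_pos) <= max_dist
    )
    return same_contigs and same_strands and close_enough
```

Note that both breakends must be within the threshold; a pair of breakpoints whose left breakends coincide but whose right breakends are far apart would stay separate under this reading.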

@@ -17,7 +17,7 @@ import scala.collection.mutable

@clp(group=ClpGroups.BreakpointAndSv, description=
"""
-|Merges nearby pileups of reads supporting putative breakpoints.
+|Aggregates and merges pileups that are likely to support the same breakpoint.
|
|Takes as input the file of pileups produced by `SvPileup`. That file contains a list of breakpoints, each
|consisting of a chromosome, position and strand for each side of the breakpoint, as well as quantified read support
2 changes: 1 addition & 1 deletion src/main/scala/com/fulcrumgenomics/sv/tools/SvPileup.scala
@@ -39,7 +39,7 @@ object TargetBedRequirement extends FgBioEnum[TargetBedRequirement] {

@clp(group=ClpGroups.BreakpointAndSv, description=
"""
-|Collates a pileup of putative structural variant supporting reads.
+|Collates pileups of reads over breakpoint events.
|
|## Outputs
|