From 921ec559d8eeb9e9d1ce2e0afbb74dad521a9715 Mon Sep 17 00:00:00 2001
From: Gabriella Senior
Date: Wed, 7 Aug 2024 13:08:31 -0400
Subject: [PATCH 1/3] Update README.md

---
 website/docs/Pipelines/snM3C/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/website/docs/Pipelines/snM3C/README.md b/website/docs/Pipelines/snM3C/README.md
index fe5e0fa2f9..5f15b3ee15 100644
--- a/website/docs/Pipelines/snM3C/README.md
+++ b/website/docs/Pipelines/snM3C/README.md
@@ -89,6 +89,7 @@ To see specific tool parameters, select the [workflow WDL link](https://github.c
 | Task name | Tool | Software | Description |
 | --- | --- | --- | --- |
 | Demultiplexing | Cutadapt | [Cutadapt](https://cutadapt.readthedocs.io/en/stable/) | Performs demultiplexing to cell-level FASTQ files based on random primer indices. |
+| Summary_PerCellOutput | --- | --- | Untar files needed at per cell level |
 | Hisat-paired-end | Cutadapt, HISAT-3N, [hisat3n_general.py](https://github.com/lhqing/cemba_data/blob/788e83cd66f3b556bdfacf3485bed9500d381f23/cemba_data/hisat3n/hisat3n_general.py), [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py) | [Cutadapt](https://cutadapt.readthedocs.io/en/stable/), [HISAT-3N](https://daehwankimlab.github.io/hisat2/hisat-3n/), python3 | Sorts, filters, and trims reads using the `r1_adapter`, `r2_adapter`, `r1_left_cut`, `r1_right_cut`, `r2_left_cut`, and `r2_right_cut` input parameters; performs paired-end read alignment; imports 2 custom python3 scripts developed by Hanqing Liu and calls the `separate_unique_and_multi_align_reads()` and `split_hisat3n_unmapped_reads()` functions to separate unmapped, uniquely aligned, multi-aligned reads from HISAT-3N BAM file, then splits the unmapped reads FASTQ file by all possible enzyme cut sites and output new R1 and R2 FASTQ files; unmapped reads are stored in unmapped FASTQ files and uniquely and multi-aligned reads are stored in separate BAM files. |
 | Hisat_single_end | HISAT-3N, [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py) | [HISAT-3N](https://daehwankimlab.github.io/hisat2/hisat-3n/), python3 | Performs single-end alignment of unmapped reads to maximize read mapping, imports a custom python3 script developed by Hanqing Liu, and calls the `remove_overlap_read_parts()` function to remove overlapping reads from the split alignment BAM file produced during single-end alignment. |
 | Merge_sort_analyze | merge, sort, MarkDuplicates, [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py), bam-to-allc, extract-allc | [samtools](https://www.htslib.org/), [Picard](https://broadinstitute.github.io/picard/), python3, [ALLCools](https://lhqing.github.io/ALLCools/intro.html) | Merges and sorts all mapped reads from the paired-end and single-end alignments; creates a position-sorted BAM file and a name-sorted BAM file; removes duplicate reads from the position-sorted, merged BAM file; imports a custom python3 script developed by Hanqing Liu and calls the `call_chromatin_contacts()` function to call chromatin contacts from the name-sorted, merged BAM file; reads are considered chromatin contacts if they are greater than 2,500 base pairs apart; creates a first ALLC file with a list of methylation points and a second ALLC file containing methylation contexts. |

From 09ad6003a47a50f242790672a19c212ce731518e Mon Sep 17 00:00:00 2001
From: GitHub Action
Date: Wed, 7 Aug 2024 17:14:29 +0000
Subject: [PATCH 2/3] Updated pipeline_versions.txt with all pipeline version information

---
 pipeline_versions.txt | 66 +++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/pipeline_versions.txt b/pipeline_versions.txt
index f21150b6f2..7e64185c39 100644
--- a/pipeline_versions.txt
+++ b/pipeline_versions.txt
@@ -1,42 +1,42 @@
 Pipeline Name Version Date of Last Commit
-MultiSampleSmartSeq2 2.2.21 2023-04-19
-SmartSeq2SingleSample 5.1.20 2023-04-19
-snm3C 4.0.0 2024-03-15
-Optimus 7.1.0 2024-05-20
+Optimus 7.4.0 2024-07-11
+Multiome 5.3.1 2024-07-18
+PairedTag 1.3.1 2024-07-18
+atac 2.2.0 2024-07-11
+SlideSeq 3.2.0 2024-07-11
+snm3C 4.0.2 2024-07-09
+MultiSampleSmartSeq2SingleNucleus 1.4.0 2024-07-11
 scATAC 1.3.2 2023-08-03
-MultiSampleSmartSeq2SingleNucleus 1.3.4 2024-04-12
-SlideSeq 3.1.6 2024-05-20
+SmartSeq2SingleSample 5.1.20 2023-04-19
 BuildIndices 3.0.0 2023-12-06
-PairedTag 0.7.0 2024-05-20
-atac 2.0.0 2024-05-20
-Multiome 5.0.0 2024-05-20
+MultiSampleSmartSeq2 2.2.21 2023-04-19
 CEMBA 1.1.6 2023-12-18
 BuildCembaReferences 1.0.0 2020-11-15
-IlluminaGenotypingArray 1.12.17 2024-03-26
-ExomeReprocessing 3.1.19 2024-03-26
-WholeGenomeReprocessing 3.1.20 2024-03-26
-ExternalExomeReprocessing 3.1.21 2024-03-26
-ExternalWholeGenomeReprocessing 2.1.21 2024-03-26
-CramToUnmappedBams 1.1.2 2022-04-14
-AnnotationFiltration 1.2.5 2023-12-18
-BroadInternalUltimaGenomics 1.0.17 2024-03-26
-BroadInternalRNAWithUMIs 1.0.29 2024-03-26
-BroadInternalImputation 1.1.10 2023-12-18
-BroadInternalArrays 1.1.7 2024-03-26
-UltimaGenomicsWholeGenomeCramOnly 1.0.16 2024-03-26
+UltimaGenomicsWholeGenomeCramOnly 1.0.19 2024-06-12
 GDCWholeGenomeSomaticSingleSample 1.3.1 2024-01-19
-VariantCalling 2.1.18 2024-03-26
+ExomeGermlineSingleSample 3.1.22 2024-06-12
+UltimaGenomicsWholeGenomeGermline 1.0.19 2024-06-12
+WholeGenomeGermlineSingleSample 3.2.1 2024-06-12
+VariantCalling 2.2.1 2024-06-12
+UltimaGenomicsJointGenotyping 1.1.7 2023-12-18
 JointGenotyping 1.6.10 2023-12-18
-JointGenotypingByChromosomePartOne 1.4.12 2023-12-18
+ReblockGVCF 2.2.1 2024-06-12
 JointGenotypingByChromosomePartTwo 1.4.11 2023-12-18
-UltimaGenomicsJointGenotyping 1.1.7 2023-12-18
-ReblockGVCF 2.1.12 2024-03-26
-ExomeGermlineSingleSample 3.1.19 2024-03-26
-UltimaGenomicsWholeGenomeGermline 1.0.16 2024-03-26
-WholeGenomeGermlineSingleSample 3.1.20 2024-03-26
-RNAWithUMIsPipeline 1.0.16 2023-12-18
-CheckFingerprint 1.0.16 2024-03-26
-ValidateChip 1.16.4 2023-12-18
-Imputation 1.1.12 2023-12-18
+JointGenotypingByChromosomePartOne 1.4.12 2023-12-18
+ExternalExomeReprocessing 3.2.1 2024-06-12
+ExternalWholeGenomeReprocessing 2.2.1 2024-06-12
+ExomeReprocessing 3.2.1 2024-06-12
+CramToUnmappedBams 1.1.2 2022-04-14
+WholeGenomeReprocessing 3.2.1 2024-06-12
+IlluminaGenotypingArray 1.12.20 2024-06-12
+Arrays 2.6.26 2024-06-12
 MultiSampleArrays 1.6.1 2022-04-14
-Arrays 2.6.23 2024-03-26
+ValidateChip 1.16.4 2023-12-18
+Imputation 1.1.13 2024-05-21
+RNAWithUMIsPipeline 1.0.16 2023-12-18
+BroadInternalUltimaGenomics 1.0.20 2024-06-12
+BroadInternalArrays 1.1.10 2024-06-12
+BroadInternalImputation 1.1.11 2024-05-21
+BroadInternalRNAWithUMIs 1.0.32 2024-06-12
+CheckFingerprint 1.0.19 2024-06-12
+AnnotationFiltration 1.2.5 2023-12-18

From 19167a1fae338d07115350a1915903d1f9cc7385 Mon Sep 17 00:00:00 2001
From: akovalsk <45073943+akovalsk@users.noreply.github.com>
Date: Thu, 8 Aug 2024 10:24:05 -0400
Subject: [PATCH 3/3] Update website/docs/Pipelines/snM3C/README.md

Co-authored-by: ekiernan <55763654+ekiernan@users.noreply.github.com>
---
 website/docs/Pipelines/snM3C/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/Pipelines/snM3C/README.md b/website/docs/Pipelines/snM3C/README.md
index 5f15b3ee15..e8b48e3355 100644
--- a/website/docs/Pipelines/snM3C/README.md
+++ b/website/docs/Pipelines/snM3C/README.md
@@ -89,7 +89,7 @@ To see specific tool parameters, select the [workflow WDL link](https://github.c
 | Task name | Tool | Software | Description |
 | --- | --- | --- | --- |
 | Demultiplexing | Cutadapt | [Cutadapt](https://cutadapt.readthedocs.io/en/stable/) | Performs demultiplexing to cell-level FASTQ files based on random primer indices. |
-| Summary_PerCellOutput | --- | --- | Untar files needed at per cell level |
+| Summary_PerCellOutput | Custom function | bash | Untar files needed at per cell level. |
 | Hisat-paired-end | Cutadapt, HISAT-3N, [hisat3n_general.py](https://github.com/lhqing/cemba_data/blob/788e83cd66f3b556bdfacf3485bed9500d381f23/cemba_data/hisat3n/hisat3n_general.py), [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py) | [Cutadapt](https://cutadapt.readthedocs.io/en/stable/), [HISAT-3N](https://daehwankimlab.github.io/hisat2/hisat-3n/), python3 | Sorts, filters, and trims reads using the `r1_adapter`, `r2_adapter`, `r1_left_cut`, `r1_right_cut`, `r2_left_cut`, and `r2_right_cut` input parameters; performs paired-end read alignment; imports 2 custom python3 scripts developed by Hanqing Liu and calls the `separate_unique_and_multi_align_reads()` and `split_hisat3n_unmapped_reads()` functions to separate unmapped, uniquely aligned, multi-aligned reads from HISAT-3N BAM file, then splits the unmapped reads FASTQ file by all possible enzyme cut sites and output new R1 and R2 FASTQ files; unmapped reads are stored in unmapped FASTQ files and uniquely and multi-aligned reads are stored in separate BAM files. |
 | Hisat_single_end | HISAT-3N, [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py) | [HISAT-3N](https://daehwankimlab.github.io/hisat2/hisat-3n/), python3 | Performs single-end alignment of unmapped reads to maximize read mapping, imports a custom python3 script developed by Hanqing Liu, and calls the `remove_overlap_read_parts()` function to remove overlapping reads from the split alignment BAM file produced during single-end alignment. |
 | Merge_sort_analyze | merge, sort, MarkDuplicates, [hisat3n_m3c.py](https://github.com/lhqing/cemba_data/blob/bf6248239074d0423d45a67d83da99250a43e50c/cemba_data/hisat3n/hisat3n_m3c.py), bam-to-allc, extract-allc | [samtools](https://www.htslib.org/), [Picard](https://broadinstitute.github.io/picard/), python3, [ALLCools](https://lhqing.github.io/ALLCools/intro.html) | Merges and sorts all mapped reads from the paired-end and single-end alignments; creates a position-sorted BAM file and a name-sorted BAM file; removes duplicate reads from the position-sorted, merged BAM file; imports a custom python3 script developed by Hanqing Liu and calls the `call_chromatin_contacts()` function to call chromatin contacts from the name-sorted, merged BAM file; reads are considered chromatin contacts if they are greater than 2,500 base pairs apart; creates a first ALLC file with a list of methylation points and a second ALLC file containing methylation contexts. |