diff --git a/deprecated/pipelines/cemba/cemba_methylcseq/README.md b/deprecated/pipelines/cemba/cemba_methylcseq/README.md
index 6631964698..a194e83f5f 100644
--- a/deprecated/pipelines/cemba/cemba_methylcseq/README.md
+++ b/deprecated/pipelines/cemba/cemba_methylcseq/README.md
@@ -1,6 +1,8 @@
 ## Announcement: CEMBA is Deprecated 9/12/2024
 
-The CEMBA workflow is deprecated and is no longer supported. However, the CEMBA documentation is still available. See [CEMBA Pipeline Overview](https://broadinstitute.github.io/warp/docs/Pipelines/CEMBA_MethylC_Seq_Pipeline/README) on the [WARP documentation site](https://broadinstitute.github.io/warp/)!
+The CEMBA workflow is deprecated and no longer supported. File paths in the JSON file are also no longer supported.
+
+However, the CEMBA documentation is still available. See [CEMBA Pipeline Overview](https://broadinstitute.github.io/warp/docs/Pipelines/CEMBA_MethylC_Seq_Pipeline/README) on the [WARP documentation site](https://broadinstitute.github.io/warp/)! The CEMBA data is also available on the NEMO data portal. The whitelist includes the following barcodes: CTCACG, CAGATC, CGATGT, ACTTGA, TTAGGC, GATCAG, TGACCA, TAGCTT, ACAGTG, GGCTAC, GCCAAT, CTTGTA.
 ### CEMBA summary
diff --git a/pipeline_versions.txt b/pipeline_versions.txt
index c6f62d8d65..0b12b2df5a 100644
--- a/pipeline_versions.txt
+++ b/pipeline_versions.txt
@@ -30,11 +30,11 @@ ExomeReprocessing 3.3.3 2024-11-04
 BuildIndices 3.0.0 2023-12-06
 scATAC 1.3.2 2023-08-03
 snm3C 4.0.4 2024-08-06
-Multiome 5.9.0 2024-10-21
-PairedTag 1.8.1 2024-11-04
+Multiome 5.9.1 2024-11-12
+PairedTag 1.8.2 2024-11-12
 MultiSampleSmartSeq2 2.2.22 2024-09-11
-MultiSampleSmartSeq2SingleNucleus 2.0.3 2024-11-04
-Optimus 7.8.1 2024-11-04
-atac 2.5.0 2024-10-23
+MultiSampleSmartSeq2SingleNucleus 2.0.4 2024-11-12
+Optimus 7.8.2 2024-11-12
+atac 2.5.2 2024-11-12
 SmartSeq2SingleSample 5.1.21 2024-09-11
-SlideSeq 3.4.4 2024-11-04
+SlideSeq 3.4.5 2024-11-12
diff --git a/pipelines/skylab/atac/atac.changelog.md b/pipelines/skylab/atac/atac.changelog.md
index 3a8c05fb03..5c2c0aea4b 100644
--- a/pipelines/skylab/atac/atac.changelog.md
+++ b/pipelines/skylab/atac/atac.changelog.md
@@ -1,7 +1,18 @@
+# 2.5.2
+2024-11-12 (Date of Last Commit)
+
+* Added memory and disk updates to Multiome JoinBarcodes; this does not impact the ATAC workflow
+
+# 2.5.1
+2024-11-12 (Date of Last Commit)
+
+* Renamed the ATAC workflow library metric percent_target to atac_percent_target for compatibility with downstream tools
+
 # 2.5.0
 2024-10-23 (Date of Last Commit)

 * Updated the tabix flag in CreateFragmentFile task to use CSI instead of TBI indexing, which supports chromosomes larger than 512 Mbp
+* Renamed the ATAC workflow library metric percent_target to atac_percent_target for compatibility with downstream tools

 # 2.4.0
 2024-10-23 (Date of Last Commit)
diff --git a/pipelines/skylab/atac/atac.wdl b/pipelines/skylab/atac/atac.wdl
index 30cb017ab8..521cb09dfd 100644
--- a/pipelines/skylab/atac/atac.wdl
+++ b/pipelines/skylab/atac/atac.wdl
@@ -49,7 +49,7 @@ workflow ATAC {
     String adapter_seq_read3 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
   }

-  String pipeline_version = "2.5.0"
+  String pipeline_version = "2.5.2"

   # Determine docker prefix based on cloud provider
   String gcr_docker_prefix = "us.gcr.io/broad-gotc-prod/"
@@ -575,7 +575,7 @@ task CreateFragmentFile {
     print("Print number of cells", number_of_cells)
     atac_percent_target = number_of_cells / expected_cells*100
     print("Setting percent target in nested dictionary")
-    data['Cells']['percent_target'] = atac_percent_target
+    data['Cells']['atac_percent_target'] = atac_percent_target

     # Flatten the dictionary
diff --git a/pipelines/skylab/multiome/Multiome.changelog.md b/pipelines/skylab/multiome/Multiome.changelog.md
index 35ebd07702..e5b7ca398e 100644
--- a/pipelines/skylab/multiome/Multiome.changelog.md
+++ b/pipelines/skylab/multiome/Multiome.changelog.md
@@ -1,7 +1,16 @@
+# 5.9.1
+2024-11-12 (Date of Last Commit)
+
+* Renamed the ATAC workflow library metric percent_target to atac_percent_target for compatibility with downstream tools
+* Added more disk and memory to the JoinBarcodes task
+
 # 5.9.0
 2024-10-21 (Date of Last Commit)
+
 * Updated the tabix flag in JoinMultiomeBarcodes task in H5adUtils.wdl to use CSI instead of TBI indexing, which supports chromosomes larger than 512 Mbp; this task changes the format for the ATAC fragment file index
 * Renamed the fragment file index from atac_fragment_tsv_tbi to atac_fragment_tsv_index
+
+
 # 5.8.0
 2024-10-23 (Date of Last Commit)
diff --git a/pipelines/skylab/multiome/Multiome.wdl b/pipelines/skylab/multiome/Multiome.wdl
index 145adbe465..0d291633de 100644
--- a/pipelines/skylab/multiome/Multiome.wdl
+++ b/pipelines/skylab/multiome/Multiome.wdl
@@ -9,7 +9,7 @@ import "../../../tasks/broad/Utilities.wdl" as utils

 workflow Multiome {

-  String pipeline_version = "5.9.0"
+  String pipeline_version = "5.9.1"

   input {
     String cloud_provider
diff --git a/pipelines/skylab/optimus/Optimus.changelog.md b/pipelines/skylab/optimus/Optimus.changelog.md
index 4918f3776a..916c4ad800 100644
--- a/pipelines/skylab/optimus/Optimus.changelog.md
+++ b/pipelines/skylab/optimus/Optimus.changelog.md
@@ -1,8 +1,14 @@
+# 7.8.2
+2024-11-12 (Date of Last Commit)
+
+* Added memory and disk updates to Multiome JoinBarcodes; this does not impact the Optimus workflow
+
 # 7.8.1
 2024-11-04 (Date of Last Commit)

 * Updated the tabix flag in JoinMultiomeBarcodes task in H5adUtils.wdl to use CSI instead of TBI indexing, which supports chromosomes larger than 512 Mbp; this task should not affect the Optimus pipeline
+
 # 7.8.0
 2024-10-23 (Date of Last Commit)
diff --git a/pipelines/skylab/optimus/Optimus.wdl b/pipelines/skylab/optimus/Optimus.wdl
index 8275ff8292..58e65d42fc 100644
--- a/pipelines/skylab/optimus/Optimus.wdl
+++ b/pipelines/skylab/optimus/Optimus.wdl
@@ -71,7 +71,7 @@ workflow Optimus {

   # version of this pipeline

-  String pipeline_version = "7.8.1"
+  String pipeline_version = "7.8.2"

   # this is used to scatter matched [r1_fastq, r2_fastq, i1_fastq] arrays
diff --git a/pipelines/skylab/paired_tag/PairedTag.changelog.md b/pipelines/skylab/paired_tag/PairedTag.changelog.md
index 34e67e5102..f2f9a7fe0a 100644
--- a/pipelines/skylab/paired_tag/PairedTag.changelog.md
+++ b/pipelines/skylab/paired_tag/PairedTag.changelog.md
@@ -1,3 +1,9 @@
+# 1.8.2
+2024-11-12 (Date of Last Commit)
+
+* Renamed the ATAC workflow library metric percent_target to atac_percent_target for compatibility with downstream tools
+* Added more disk and memory to the ParseBarcodes task
+
 # 1.8.1
 2024-11-04 (Date of Last Commit)

diff --git a/pipelines/skylab/paired_tag/PairedTag.wdl b/pipelines/skylab/paired_tag/PairedTag.wdl
index 5896f5ae60..155254056a 100644
--- a/pipelines/skylab/paired_tag/PairedTag.wdl
+++ b/pipelines/skylab/paired_tag/PairedTag.wdl
@@ -8,7 +8,7 @@ import "../../../tasks/broad/Utilities.wdl" as utils

 workflow PairedTag {

-  String pipeline_version = "1.8.1"
+  String pipeline_version = "1.8.2"

   input {
diff --git a/pipelines/skylab/slideseq/SlideSeq.changelog.md b/pipelines/skylab/slideseq/SlideSeq.changelog.md
index d74190c0c9..c0f4a9f3dc 100644
--- a/pipelines/skylab/slideseq/SlideSeq.changelog.md
+++ b/pipelines/skylab/slideseq/SlideSeq.changelog.md
@@ -1,8 +1,14 @@
+# 3.4.5
+2024-11-12 (Date of Last Commit)
+
+* Added memory and disk updates to Multiome JoinBarcodes; this does not impact the SlideSeq workflow
+
 # 3.4.4
 2024-11-04 (Date of Last Commit)

 * Updated the tabix flag in JoinMultiomeBarcodes task in H5adUtils.wdl to use CSI instead of TBI indexing, which supports chromosomes larger than 512 Mbp; this task should not affect the Slide-seq pipeline
+
 # 3.4.3
 2024-10-24 (Date of Last Commit)
diff --git a/pipelines/skylab/slideseq/SlideSeq.wdl b/pipelines/skylab/slideseq/SlideSeq.wdl
index c81d2813c7..5ec74e3e2e 100644
--- a/pipelines/skylab/slideseq/SlideSeq.wdl
+++ b/pipelines/skylab/slideseq/SlideSeq.wdl
@@ -25,7 +25,7 @@ import "../../../tasks/broad/Utilities.wdl" as utils

 workflow SlideSeq {

-  String pipeline_version = "3.4.4"
+  String pipeline_version = "3.4.5"

   input {
     Array[File] r1_fastq
diff --git a/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.changelog.md b/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.changelog.md
index 2e90a8ef5d..7f75d2c3bb 100644
--- a/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.changelog.md
+++ b/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.changelog.md
@@ -1,8 +1,14 @@
+# 2.0.4
+2024-11-12 (Date of Last Commit)
+
+* Added memory and disk updates to Multiome JoinBarcodes; this does not impact the snSS2 workflow
+
 # 2.0.3
 2024-11-04 (Date of Last Commit)

 * Updated the tabix flag in JoinMultiomeBarcodes task in H5adUtils.wdl to use CSI instead of TBI indexing, which supports chromosomes larger than 512 Mbp; this task should not affect the snSS2 pipeline
+
 # 2.0.2
 2024-10-23 (Date of Last Commit)
diff --git a/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.wdl b/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.wdl
index 3488d13d2d..e5702147d1 100644
--- a/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.wdl
+++ b/pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.wdl
@@ -57,7 +57,7 @@ workflow MultiSampleSmartSeq2SingleNucleus {
   }

   # Version of this pipeline
-  String pipeline_version = "2.0.3"
+  String pipeline_version = "2.0.4"

   if (false) {
     String? none = "None"
diff --git a/tasks/skylab/H5adUtils.wdl b/tasks/skylab/H5adUtils.wdl
index f4dd443877..af83a9e3f8 100644
--- a/tasks/skylab/H5adUtils.wdl
+++ b/tasks/skylab/H5adUtils.wdl
@@ -235,8 +235,8 @@ task JoinMultiomeBarcodes {
     Int nthreads = 1
     String cpuPlatform = "Intel Cascade Lake"

-    Int machine_mem_mb = ceil((size(atac_h5ad, "MiB") + size(gex_h5ad, "MiB") + size(atac_fragment, "MiB")) * 6) + 10000
-    Int disk = ceil((size(atac_h5ad, "GiB") + size(gex_h5ad, "GiB") + size(atac_fragment, "GiB")) * 8) + 10
+    Int machine_mem_mb = ceil((size(atac_h5ad, "MiB") + size(gex_h5ad, "MiB") + size(atac_fragment, "MiB")) * 8) + 10000
+    Int disk = ceil((size(atac_h5ad, "GiB") + size(gex_h5ad, "GiB") + size(atac_fragment, "GiB")) * 8) + 100
     String docker_path
   }
   String gex_base_name = basename(gex_h5ad, ".h5ad")
diff --git a/tasks/skylab/PairedTagUtils.wdl b/tasks/skylab/PairedTagUtils.wdl
index 0989164d3a..9b2cd6aba2 100644
--- a/tasks/skylab/PairedTagUtils.wdl
+++ b/tasks/skylab/PairedTagUtils.wdl
@@ -205,8 +205,8 @@ task ParseBarcodes {
     Int nthreads = 1
     String cpuPlatform = "Intel Cascade Lake"
     String docker_path
-    Int disk = ceil((size(atac_h5ad, "GiB") + size(atac_fragment, "GiB")) * 8) + 10
-    Int machine_mem_mb = ceil((size(atac_h5ad, "MiB") + size(atac_fragment, "MiB")) * 6) + 10000
+    Int disk = ceil((size(atac_h5ad, "GiB") + size(atac_fragment, "GiB")) * 8) + 100
+    Int machine_mem_mb = ceil((size(atac_h5ad, "MiB") + size(atac_fragment, "MiB")) * 8) + 10000
   }

   String atac_base_name = basename(atac_h5ad, ".h5ad")
diff --git a/verification/VerifyMultiome.wdl b/verification/VerifyMultiome.wdl
index fa6d3b4676..a43f5e36a8 100644
--- a/verification/VerifyMultiome.wdl
+++ b/verification/VerifyMultiome.wdl
@@ -80,7 +80,7 @@ workflow VerifyMultiome {
       test_text_file = test_library_metrics,
       truth_text_file = truth_library_metrics
   }
-  call VerifyTasks.CompareTextFiles as CompareAtacLibraryMetrics {
+  call VerifyTasks.CompareAtacLibraryMetrics as CompareAtacLibraryMetrics {
     input:
       test_text_files = select_all([test_atac_library_metrics]),
       truth_text_files = select_all([truth_atac_library_metrics])
diff --git a/verification/VerifyPairedTag.wdl b/verification/VerifyPairedTag.wdl
index a18e85a44d..564b5264be 100644
--- a/verification/VerifyPairedTag.wdl
+++ b/verification/VerifyPairedTag.wdl
@@ -29,6 +29,9 @@ workflow VerifyPairedTag {
     File test_library_metrics
     File truth_library_metrics

+    File test_atac_library_metrics
+    File truth_atac_library_metrics
+
     Boolean? done
   }

@@ -77,4 +80,10 @@ workflow VerifyPairedTag {
       test_text_file = test_library_metrics,
       truth_text_file = truth_library_metrics
   }
+
+  call VerifyTasks.CompareAtacLibraryMetrics as CompareAtacLibraryMetrics {
+    input:
+      test_text_files = select_all([test_atac_library_metrics]),
+      truth_text_files = select_all([truth_atac_library_metrics])
+  }
 }
\ No newline at end of file
diff --git a/verification/VerifyTasks.wdl b/verification/VerifyTasks.wdl
index f60ba6f3a6..b4664611cf 100644
--- a/verification/VerifyTasks.wdl
+++ b/verification/VerifyTasks.wdl
@@ -190,6 +190,129 @@ task CompareTextFiles {
 }

+task CompareAtacLibraryMetrics {
+  input {
+    Array[File] test_text_files
+    Array[File] truth_text_files
+  }
+
+  command <<<
+python3 < allowable_diff:
+                print(f"Error: Metric {metric_a} exceeds threshold. Test value: {value_a}, Truth value: {value_b}, Threshold: {threshold*100}%. The allowable difference is {allowable_diff} and the actual difference is {diff}.")
+                exit_code = 1
+            else:
+                print(f"Metric {metric_a} is within the threshold.")
+    return exit_code == 0
+
+
+# Read and compare all files
+test_files = ["~{sep=',' test_text_files}"]
+truth_files = ["~{sep=',' truth_text_files}"]
+
+if len(test_files) != len(truth_files):
+    print(f"Error: Different number of input files ({len(test_files)} vs. {len(truth_files)}). This is really not OK")
+    exit(1)
+
+for test_file, truth_file in zip(test_files, truth_files):
+    if not compare_files(test_file, truth_file):
+        exit(1)
+
+print("All files passed the comparison.")
+CODE
+
+  >>>
+
+  runtime {
+    docker: "python:3.9-slim"
+    disks: "local-disk 100 HDD"
+    memory: "50 GiB"
+    preemptible: 3
+  }
+}
+
+
+
 task CompareCrams {

   input {
@@ -496,38 +619,64 @@ task CompareH5adFilesGEX {
     truth = ad.read_h5ad(truth_h5ad)
     test = ad.read_h5ad(test_h5ad)

-    truth_obs = pd.DataFrame(truth.obs)
-    test_obs = pd.DataFrame(test.obs)
-
-    truth_var = pd.DataFrame(truth.var)
-    test_var = pd.DataFrame(test.var)
-
-    truth_sum = truth.X.sum()
-    test_sum = test.X.sum()
-
-    print("Now running equivalence check")
-
-    # Check if obs, var, and sum match
-    if truth_obs.equals(test_obs) and truth_var.equals(test_var) and truth_sum == test_sum:
-        print("pass")
+    for x in truth.obs.columns:
+        z = test.obs[x]
+        y = truth.obs[x]
+        if z.equals(y)==False:
+            print("Cell Metric Column does not match:")
+            print(x)
+            print("Sum of test: ")
+            print(z.sum())
+            print("Sum of truth: ")
+            print(y.sum())
+            if x == "doublet_score":
+                print("Doublet score is allowed to be different")
+            else:
+                exit("Cell Metric does not match")
+    print("Comparing test gene metrics to truth gene metrics using truth as ref")
+    for x in truth.var.columns:
+        z = test.var[x]
+        y = truth.var[x]
+        if z.equals(y)==False:
+            print("Gene Metric Column does not match:")
+            print(x)
+    print("Making gene_names unique")
+    test.var_names_make_unique()
+    truth.var_names_make_unique()
+    genes_correct=True
+    for x in truth.var.columns:
+        z = test.var[x]
+        y = truth.var[x]
+        if z.equals(y)==False:
+            print("Gene metric does not match after making gene names unique")
+            print(x)
+            genes_correct=False
+    print("Done")
+    print("If no warning above Done, gene metrics match now that they are unique")
+
+    print("Testing for new obs columns in test data set:")
+    for x in test.obs.columns:
+        if x not in truth.obs.columns:
+            print("Column not in truth", x)
+    print("Done")
+    print("If no warning above Done, no new obs columns in test matrix")
+
+    print("Testing for new var columns in test data set:")
+    for x in test.var.columns:
+        if x not in truth.var.columns:
+            print("Column not in truth", x)
+    print("Done")
+    print("If no warning above Done, no new var columns in test matrix")
+    print("Testing matrix count sums")
+    if test.X.sum()==truth.X.sum():
+        print("Counts match")
     else:
-        # If obs does not match, check if the only difference is in the 'doublet_score' column
-        if not truth_obs.equals(test_obs):
-            # Create a boolean DataFrame where True indicates differences
-            differences = truth_obs.ne(test_obs) # .ne() is the 'not equal' comparison for pandas
-
-            # Identify columns with any differences
-            differing_columns = differences.any(axis=0) # Check if any value in a column is True
-            differing_columns = differing_columns[differing_columns].index.tolist() # Get column names with differences
-
-            # Check if the only differing column is 'doublet_score'
-            if len(differing_columns) == 1 and 'doublet_score' in differing_columns:
-                print("Files differ in the doublet score")
-            else:
-                print(differing_columns)
-                exit("Multiple columns different")
-
-    print("Done running matrix equivalence check")
+        print("Counts do not match")
+        exit("Counts do not match")
+    if genes_correct==False:
+        exit("Gene metrics do not match")
+
+    print("Done with equivalence check")
     CODE
   >>>

diff --git a/verification/test-wdls/TestPairedTag.wdl b/verification/test-wdls/TestPairedTag.wdl
index 1838f9d9f3..9fcb2ebbd5 100644
--- a/verification/test-wdls/TestPairedTag.wdl
+++ b/verification/test-wdls/TestPairedTag.wdl
@@ -120,7 +120,8 @@ workflow TestPairedTag {
     Array[String] pipeline_metrics = flatten([
                                     [ # File outputs
                                     PairedTag.gene_metrics_gex,
-                                    PairedTag.cell_metrics_gex
+                                    PairedTag.cell_metrics_gex,
+                                    PairedTag.atac_library_final
                                     ],
                                     select_all([PairedTag.library_metrics]),
     ])
@@ -199,6 +200,13 @@ workflow TestPairedTag {
       }
     }

+    call Utilities.GetValidationInputs as GetAtacLibraryMetrics {
+      input:
+        input_file = PairedTag.atac_library_final,
+        results_path = results_path,
+        truth_path = truth_path
+    }
+
     call VerifyPairedTag.VerifyPairedTag as Verify {
       input:
         truth_optimus_h5ad = GetOptimusH5ad.truth_file,
@@ -217,6 +225,8 @@ workflow TestPairedTag {
         test_atac_h5ad = GetSnapMetrics.results_file,
         test_library_metrics = select_first([GetLibraryMetrics.results_file, ""]),
         truth_library_metrics = select_first([GetLibraryMetrics.truth_file, ""]),
+        test_atac_library_metrics = GetAtacLibraryMetrics.results_file,
+        truth_atac_library_metrics = GetAtacLibraryMetrics.truth_file,
         done = CopyToTestResults.done
     }
   }
diff --git a/website/docs/Pipelines/ATAC/README.md b/website/docs/Pipelines/ATAC/README.md
index ec70252fb1..86d3eaab3e 100644
--- a/website/docs/Pipelines/ATAC/README.md
+++ b/website/docs/Pipelines/ATAC/README.md
@@ -8,7 +8,7 @@ slug: /Pipelines/ATAC/README

 | Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
 | :----: | :---: | :----: | :--------------: |
-| [2.4.0](https://github.com/broadinstitute/warp/releases) | October, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |
+| [2.5.1](https://github.com/broadinstitute/warp/releases) | November, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |

 ## Introduction to the ATAC workflow
diff --git a/website/docs/Pipelines/ATAC/library-metrics.md b/website/docs/Pipelines/ATAC/library-metrics.md
index 3e80bc85e4..6b86005bfe 100644
--- a/website/docs/Pipelines/ATAC/library-metrics.md
+++ b/website/docs/Pipelines/ATAC/library-metrics.md
@@ -31,6 +31,6 @@ The [ATAC pipeline](README.md) uses [SnapATAC2](https://github.com/kaizhang/Snap
 | Number_of_peaks | The total number of peaks, or regions of accessible chromatin, identified in the dataset, representing potential regulatory elements. |
 | fraction_of_genome_in_peaks | The fraction of the genome that is covered by identified peaks, indicating the extent of chromatin accessibility across the genome. |
 | fraction_of_high-quality_fragments_overlapping_peaks | The fraction of high-quality fragments that overlap with identified peaks, providing an indication of the efficiency of the assay in capturing accessible regions. |
-| percent_target | Percent of cells recovered; value is calculated as estimated_cells/expected_cells. |
+| atac_percent_target | Percent of cells recovered; value is calculated as estimated_cells/expected_cells. |

diff --git a/website/docs/Pipelines/Multiome_Pipeline/README.md b/website/docs/Pipelines/Multiome_Pipeline/README.md
index 625d3320d7..131a81aca5 100644
--- a/website/docs/Pipelines/Multiome_Pipeline/README.md
+++ b/website/docs/Pipelines/Multiome_Pipeline/README.md
@@ -7,7 +7,7 @@ slug: /Pipelines/Multiome_Pipeline/README

 | Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
 | :----: | :---: | :----: | :--------------: |
-| [Multiome v5.8.0](https://github.com/broadinstitute/warp/releases) | October, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |
+| [Multiome v5.9.1](https://github.com/broadinstitute/warp/releases) | November, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |

 ![Multiome_diagram](./multiome_diagram.png)
@@ -108,9 +108,9 @@ The Multiome workflow calls two WARP subworkflows, one external subworkflow (opt
 | multiome_pipeline_version_out | N.A. | String describing the version of the Multiome pipeline used. |
 | bam_aligned_output_atac | `_atac.bam` | BAM file containing aligned reads from ATAC workflow. |
 | fragment_file_atac | `_atac.fragments.sorted.tsv.gz` | Sorted and bgzipped TSV file containing fragment start and stop coordinates per barcode. The columns are "Chromosome", "Start", "Stop", "ATAC Barcode", "Number of reads", and "GEX Barcode". |
-| fragment_file_index | `_atac.fragments.sorted.tsv.gz.tbi` | tabix index file for the fragment file. |
+| fragment_file_index | `_atac.fragments.sorted.tsv.gz.csi` | Tabix CSI index file for the fragment file. |
 | snap_metrics_atac | `_atac.metrics.h5ad` | h5ad (Anndata) file containing per-barcode metrics from SnapATAC2. Also contains the equivalent gene expression barcode for each ATAC barcode in the `gex_barcodes` column of the `h5ad.obs` property. See the [ATAC Count Matrix Overview](../ATAC/count-matrix-overview.md) for more details. |
-| atac_library_metrics | `_atac_.metrics.csv` | CSV with library-level metrics produced by SnapATAC2. See the ATAC [Library Level Metrics Overview](../ATAC/library-metrics.md) for more details. |
+| atac_library_metrics | `_atac__library_metrics.csv` | CSV with library-level metrics produced by SnapATAC2. See the ATAC [Library Level Metrics Overview](../ATAC/library-metrics.md) for more details. |
 | genomic_reference_version_gex | `.txt` | File containing the Genome build, source and GTF annotation version. |
 | bam_gex | `_gex.bam` | BAM file containing aligned reads from Optimus workflow. |
 | matrix_gex | `_gex_sparse_counts.npz` | NPZ file containing raw gene by cell counts. |
diff --git a/website/docs/Pipelines/PairedTag_Pipeline/README.md b/website/docs/Pipelines/PairedTag_Pipeline/README.md
index 323b3f33b9..394dd4c6a3 100644
--- a/website/docs/Pipelines/PairedTag_Pipeline/README.md
+++ b/website/docs/Pipelines/PairedTag_Pipeline/README.md
@@ -7,7 +7,7 @@ slug: /Pipelines/PairedTag_Pipeline/README

 | Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
 |:---:| :---: | :---: | :---: |
-| [PairedTag_v1.8.0](https://github.com/broadinstitute/warp/releases) | October, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |
+| [PairedTag_v1.8.2](https://github.com/broadinstitute/warp/releases) | November, 2024 | WARP Pipelines | Please [file an issue in WARP](https://github.com/broadinstitute/warp/issues). |

 ## Introduction to the Paired-Tag workflow
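
Reviewer note on `CompareAtacLibraryMetrics` in `verification/VerifyTasks.wdl`: the body of `compare_files` is truncated in this excerpt, so the following is only a minimal sketch of a relative-threshold comparison consistent with the messages the task prints. The names `read_metrics` and `compare_metrics`, the two-column `metric,value` layout, the exact-match rule for non-numeric values, and the 5% default threshold are all assumptions for illustration, not the task's actual code.

```python
import csv
import io


def read_metrics(text):
    """Parse 'metric,value' CSV text into (name, value) string pairs (assumed layout)."""
    reader = csv.reader(io.StringIO(text))
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def compare_metrics(test_rows, truth_rows, threshold=0.05):
    """Return True when every test value is within threshold * |truth| of the truth value.

    The 5% default is a hypothetical choice; the real task's threshold is not shown.
    """
    if len(test_rows) != len(truth_rows):
        print(f"Error: different number of metrics ({len(test_rows)} vs. {len(truth_rows)})")
        return False

    ok = True
    for (metric_a, value_a), (metric_b, value_b) in zip(test_rows, truth_rows):
        if metric_a != metric_b:
            print(f"Error: metric names differ: {metric_a} vs. {metric_b}")
            ok = False
            continue
        try:
            a, b = float(value_a), float(value_b)
        except ValueError:
            # Non-numeric values are required to match exactly (assumption).
            if value_a != value_b:
                print(f"Error: metric {metric_a} differs: {value_a} vs. {value_b}")
                ok = False
            continue
        diff = abs(a - b)
        allowable_diff = abs(b) * threshold
        if diff > allowable_diff:
            print(f"Error: metric {metric_a} exceeds threshold: diff {diff} > {allowable_diff}")
            ok = False
    return ok
```

Collecting every failure before returning, rather than exiting at the first mismatch, mirrors the `exit_code` pattern visible in the diff and makes a failed verification run easier to triage.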