Commit

Merge branch 'main' of github.com:broadinstitute/ABC-Enhancer-Gene-Prediction into main

merge message
Maya Sheth committed Jan 12, 2025
2 parents 7c7a52a + 834028c commit 8964e4c
Showing 14 changed files with 474 additions and 478 deletions.
1 change: 1 addition & 0 deletions .circleci/config.yml
@@ -21,6 +21,7 @@ jobs:
# The executor is the environment in which the steps below will be executed
docker:
- image: condaforge/mambaforge
+ resource_class: large # 8GB of memory
# CircleCI will report the results back to your VCS provider.
steps:
- checkout
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
- CircleCI [![CircleCI](https://dl.circleci.com/status-badge/img/gh/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/dev.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/dev)
+ CircleCI [![CircleCI](https://dl.circleci.com/status-badge/img/gh/broadinstitute/ABC-Enhancer-Gene-Prediction.svg?style=svg)](https://app.circleci.com/pipelines/github/broadinstitute/ABC-Enhancer-Gene-Prediction)

> :memo: **Note:** This is a revamp of the ABC codebase presented in [1]. If you wish to access that version of the ABC repo, please check out https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/master.
3 changes: 2 additions & 1 deletion docs/usage/getting_started.rst
@@ -15,6 +15,7 @@ Installation
- Make sure you're not using strict channel priorities: ``conda config --set channel_priority flexible``. Otherwise, you may encounter package conflicts later when installing abc.
- To install mamba: ``conda create -n mamba -c conda-forge mamba -y``
- We recommend mamba as using conda can take 1hr+ for setup
+ - See troubleshooting page if you run into issues


Setup Conda Environment
@@ -123,7 +124,7 @@ biosamples config is a tsv separated file with the following columns
- If you dumped hic into a directory via JuicerTools, use ``juicebox``
- If you have a bedpe file for contact, it should be a tab delimited file containing 8 columns (chr1,start1,end1,chr2,start2,end2,name,score)
#. HiC_resolution (int)
- - Recommended to use 5KB (kilobases)
+ - Currently only 5KB (kilobases) is supported
- 5KB means dna regions are bucketed into 5KB bins and we measure contact between those bins
#. alt_TSS (optional; not recommended to fill)
- Alternative TSS reference file
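To make the two data conventions in this hunk concrete, here is a minimal Python sketch of reading one 8-column bedpe record and assigning both ends to 5KB bins. It is not code from the ABC repository: `bin_start`, `parse_bedpe_line`, the `Contact` tuple, and the example record are all hypothetical names and made-up values. The only things taken from the docs are the column order and the 5000 bp resolution.

```python
from typing import NamedTuple

HIC_RESOLUTION = 5000  # bp; the only resolution the docs list as supported

# Column order as listed in the docs for a bedpe contact file.
BEDPE_COLUMNS = ("chr1", "start1", "end1", "chr2", "start2", "end2", "name", "score")


class Contact(NamedTuple):
    """Hypothetical container for one binned contact."""
    chr1: str
    bin1: int
    chr2: str
    bin2: int
    score: float


def bin_start(position: int, resolution: int = HIC_RESOLUTION) -> int:
    """Return the start coordinate of the bin that contains `position`."""
    return (position // resolution) * resolution


def parse_bedpe_line(line: str, resolution: int = HIC_RESOLUTION) -> Contact:
    """Parse one tab-delimited bedpe line and bucket both ends into bins."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(BEDPE_COLUMNS):
        raise ValueError(
            f"expected {len(BEDPE_COLUMNS)} tab-delimited columns, got {len(fields)}"
        )
    record = dict(zip(BEDPE_COLUMNS, fields))
    return Contact(
        chr1=record["chr1"],
        bin1=bin_start(int(record["start1"]), resolution),
        chr2=record["chr2"],
        bin2=bin_start(int(record["start2"]), resolution),
        score=float(record["score"]),
    )


# Made-up record, for illustration only.
example = "chr22\t17001234\t17001734\tchr22\t17012000\t17012500\t.\t3.5"
contact = parse_bedpe_line(example)
print(contact)
```

Integer floor division keeps every position inside the same 5KB window mapped to the same bin start, which is the "bucketing" the docs describe. How the pipeline itself bins coordinates may differ.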
5 changes: 5 additions & 0 deletions docs/usage/troubleshooting.rst
@@ -10,3 +10,8 @@ If you're on MacOSX, make sure to remove some of the requirements in abcenv.yml.

If there are incompatibility issues, try building off the 'release.yml' conda environment.

+
+ malloc: Heap corruption detected
+ --------------------------------
+ We've seen this happen when running on MacOSX during the prediction rule. It's an error thrown by the hicstraw library and happens the first time you use it.
+ Re-running the pipeline should fix it.
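If you would rather automate the "re-run once" advice than do it by hand, a wrapper along these lines is one option. This is a sketch under assumptions, not part of the repository: `run_with_retry` is a hypothetical helper, and the command shown is a stand-in that you would replace with your own pipeline invocation. The retry happens at the process level because a heap-corruption abort ends the whole process rather than raising a Python exception.

```python
import subprocess
import sys
from typing import Sequence


def run_with_retry(cmd: Sequence[str], attempts: int = 2) -> int:
    """Run `cmd`, re-running on a non-zero exit; return the attempt that succeeded."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    returncode = 0
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(list(cmd)).returncode
        if returncode == 0:
            return attempt
        # A crash from a signal (e.g. an abort) shows up as a non-zero code too.
        print(f"attempt {attempt} exited with code {returncode}", file=sys.stderr)
    raise RuntimeError(
        f"command failed {attempts} times (last exit code {returncode})"
    )


# Stand-in command so the sketch runs anywhere; swap in your pipeline command.
attempt_used = run_with_retry([sys.executable, "-c", "print('pipeline ok')"])
```

Capping the attempts at two mirrors the note above, which reports the error only on first use. A failure that persists after the re-run most likely has a different cause and should not be retried blindly.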
2 changes: 1 addition & 1 deletion tests/config/test_biosamples.tsv
@@ -1,3 +1,3 @@
biosample DHS ATAC H3K27ac default_accessibility_feature HiC_file HiC_type HiC_resolution alt_TSS alt_genes
- K562_chr22 example_chr/chr22/ENCFF860XAE.chr22.sorted.se.bam example_chr/chr22/ENCFF790GFL.chr22.sorted.se.bam DHS https://www.encodeproject.org/files/ENCFF621AIY/@@download/ENCFF621AIY.hic hic 5000 example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.TSS500bp.bed example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.bed
+ K562_chr22 example_chr/chr22/ENCFF860XAE.chr22.sorted.se.bam example_chr/chr22/ENCFF790GFL.chr22.sorted.se.bam DHS https://encode-public.s3.amazonaws.com/2022/05/15/0571c671-3645-4f92-beae-51dfd3f42c36/ENCFF621AIY.hic hic 5000 example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.TSS500bp.bed example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.bed
K562_chr22_tagAlign example_chr/chr22/chr22.sorted.tagAlign.gz ATAC example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.TSS500bp.bed example_chr/chr22/RefSeqCurated.170308.bed.CollapsedGeneBounds.chr22.hg38.bed
2 changes: 1 addition & 1 deletion tests/test_predictor.py
@@ -7,7 +7,7 @@
from predictor import add_hic_from_hic_file
import pandas as pd

- HIC_FILE = "https://www.encodeproject.org/files/ENCFF621AIY/@@download/ENCFF621AIY.hic"
+ HIC_FILE = "https://encode-public.s3.amazonaws.com/2022/05/15/0571c671-3645-4f92-beae-51dfd3f42c36/ENCFF621AIY.hic"

# this file has 3k rows of E-G pairs with valid contact values
# contact values were generated from the original doubly stochastic method