TADsplimer - a pipeline for analyzing TAD splitting and merging

Introduction

This pipeline is designed for automated, end-to-end analysis of TAD splitting and merging.

If you have any questions or run into problems, open an issue in the "Issues" section of this repository or send an email to [email protected] or [email protected].

Required packages for executing TADsplimer

Before running TADsplimer, install the following packages and libraries on the system:

R environment (we used version 3.4.1)
Python version 3.5.5
The R packages: bcp, cluster, MASS, essentials, spatstat, hicrep, changepoint, optparse
The Python packages: numpy, cv2 (provided by opencv-python), PIL (provided by Pillow), scipy, imagehash

Make sure the locations of the libraries and packages listed above are included in the system PATH variable.
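
Before a first run outside Docker, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch and not part of TADsplimer; the helper name `missing_modules` is hypothetical, and it checks only the Python modules, not the R packages.

```python
import importlib.util

# Import names of the Python packages listed in this section.
REQUIRED_MODULES = ["numpy", "cv2", "PIL", "scipy", "imagehash"]


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED_MODULES)
    if absent:
        print("Missing Python packages:", ", ".join(absent))
    else:
        print("All required Python packages were found.")
```

The check only locates the modules; it does not import them or verify their versions.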

Installation

The Docker image can be pulled directly from Docker Hub (https://hub.docker.com/r/guangywang/tadsplimer) with the following command:

docker pull guangywang/tadsplimer:v1.0.3

Conda setup (contributed by Jimin Tan, thanks):

conda create --prefix ./tadsplimer --file tadsplimer_pkgs.txt
conda activate ./tadsplimer

Execution

In general, TADsplimer is run with a command of the following form:

docker run -v /<path>/:/data/ -t guangywang/tadsplimer:v1.0.3 python3 /bin/TADsplimer.py  <command>  <path> [optional arguments]

Please make sure the output folder is empty or does not exist.
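
The output-folder requirement can be checked before launching a run. The sketch below is an illustration only; the helper name `output_dir_is_usable` is hypothetical, and it assumes you check the folder on the host side, that is, the directory that ends up under the path mounted at /data.

```python
from pathlib import Path


def output_dir_is_usable(path):
    """Return True if `path` does not exist or is an empty directory."""
    target = Path(path)
    if not target.exists():
        return True
    if not target.is_dir():
        # An existing regular file cannot serve as the output folder.
        return False
    # any() stops at the first entry, so large folders are not fully listed.
    return not any(target.iterdir())
```

A wrapper script could call this helper and stop early with a clear message instead of starting a run against a folder that already holds results.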

TADsplimer provides the following commands:

split_TADs:

split TAD detection using two contact maps as input files

-h, --help            show this help message and exit
-c, --contact_maps CONTACT_MAP
	paths to two contact maps. paths must be separated by
	the comma ','. (default: None)
--contact_maps_aliases
	A set of short aliases for two contact maps. Aliases
	must be separated by the comma ','. (default: None)
-u, --up_cutoff UP_CUTOFF
	paths for up cutoff of two contact maps, paths must be
	separated by the comma ','. (default: None)
-d, --down_cutoff DOWN_CUTOFF
	paths for down cutoff of two contact maps, paths must 
	be separated by the comma ','. (default: None)
-j, --adjust_quality ADJUST_QUALITY
	set as 0 to auto optimize up_cutoff and down_cutoff,
	set as 1 to disable auto optimization of up_cutoff
	and down_cutoff (default: 0)
-o, --output OUTPUT
	path to output files (default: None)
-d, --split_direction DIRECTION
	set as 0: output TADs split in both two contact maps, 
	set as 1: output TADs split in contact map1, set as 2: 
	output TADs split in contact map2 (default: 0)

The following command can be used to detect TAD split and merge events:

docker run -v /<path>/:/data/ -t guangywang/tadsplimer:v1.0.3 python3 /bin/TADsplimer.py split_TADs -c /data/simulation_merge.txt,/data/simulation_split.txt --contact_maps_aliases merge,split -o /data/output
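
When many pairs of contact maps need to be compared, the example command above can be assembled in a script. This is a sketch under stated assumptions: the function name `build_split_tads_command` and the host directory `/path/to/hic` are hypothetical, and only the flags shown in the example (`-c`, `--contact_maps_aliases`, `-o`) are used.

```python
import shlex

IMAGE = "guangywang/tadsplimer:v1.0.3"


def build_split_tads_command(host_dir, map_a, map_b, alias_a, alias_b, output_name):
    """Build the argv list for a split_TADs run inside the Docker image.

    `map_a`, `map_b` and `output_name` are names relative to `host_dir`,
    which is mounted at /data inside the container.
    """
    return [
        "docker", "run",
        "-v", f"{host_dir}:/data/",
        "-t", IMAGE,
        "python3", "/bin/TADsplimer.py", "split_TADs",
        "-c", f"/data/{map_a},/data/{map_b}",
        "--contact_maps_aliases", f"{alias_a},{alias_b}",
        "-o", f"/data/{output_name}",
    ]


if __name__ == "__main__":
    argv = build_split_tads_command(
        "/path/to/hic",  # hypothetical host folder holding the matrices
        "simulation_merge.txt", "simulation_split.txt",
        "merge", "split", "output",
    )
    # Print a copy-pasteable command; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building the command as a list avoids shell-quoting problems when paths contain spaces.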

TAD_calculator:

topological domain identification

-h, --help            show this help message and exit
-c, --contact_map CONTACT_MAP
	path to Hi-C contact map (default: None)
-u, --up UP_CUTOFF
	up cutoff for Hi-C contact map detection (default: 0)
-d, --down DOWN_CUTOFF
	down cutoff for Hi-C contact map detection (default: 0)
-o, --TAD_output OUTPUT
	path for the output file of TADs (default: None)
-p, --TAD_plot PLOT
	Set to 1 to plot the contact map and TAD, else set to 
	0 to cancel this analysis. (default: 1)
--sub_map SUB_MAP
	Set to 1 to output sub contact maps and TADs, else set 
	to 0 to cancel this analysis. (default: 1)

The following command can be used to detect TADs:

docker run -v /<path>/:/data/ -t guangywang/tadsplimer:v1.0.3 python3 /bin/TADsplimer.py TAD_calculator -c /data/simulation_merge.txt -u 1.7 -d 0.2 -o /data/output

TAD_similarity:

calculating four similarity scores for given TADs

-h, --help            show this help message and exit
-c, --contact_maps CONTACT_MAP
	paths to Hi-C contact maps in two conditions. paths must 
	be separated by the comma ','. (default: None)
-t, --TAD TAD
	input files of TADs for two compared Hi-C contact maps. 
	Paths must be separated by the comma ','. (default: None)
-o, --output OUTPUT
	path to output files (default: None)

The following command can be used to calculate four similarity scores for given TADs:

docker run -v /<path>/:/data/ -t guangywang/tadsplimer:v1.0.3 python3 /bin/TADsplimer.py TAD_similarity -c /data/simulation_merge.txt -t /data/tad.txt -o /data/output

Input

The input is a pair of Hi-C matrices (full TSV matrices) to be compared. Each Hi-C matrix should be square, with dimension N * N. The default resolution for the input matrices is 10 kb.
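
A quick shape check can catch malformed input before a long run. The sketch below is not part of TADsplimer; the function name `matrix_shape_if_square` is hypothetical, and it assumes the TSV file holds only numeric cells, with no header row or label column.

```python
import csv


def matrix_shape_if_square(path):
    """Return N if the TSV file at `path` holds an N x N numeric matrix.

    Raises ValueError otherwise. Assumes no header row or label column.
    """
    with open(path, newline="") as handle:
        # csv.reader yields [] for blank lines, which are dropped here.
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    n = len(rows)
    if n == 0:
        raise ValueError("matrix file is empty")
    for index, row in enumerate(rows, start=1):
        if len(row) != n:
            raise ValueError(f"row {index} has {len(row)} columns, expected {n}")
        for cell in row:
            float(cell)  # raises ValueError on non-numeric entries
    return n
```

Running the check on both matrices and comparing the returned N values also confirms that the two inputs cover the same number of bins.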

Output

The main outputs are a split TAD file and a merged TAD file. The columns and their explanations are as follows:

Split TAD file

Files of *.all.split.txt

Column	Explanation
1st	Row number
2nd	Start position of split TAD
3rd	End position of split TAD

Merged TAD file

Files of *.all.merge.txt

Column	Explanation
1st	P value for TAD matching
2nd	Start position of merged TAD
3rd	End position of merged TAD
4th	Laplacian matrix similarity score
5th	Corner split ratio
6th	Stratum-adjusted correlation coefficient
7th	Image hashing similarity score
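
For downstream analysis, the seven columns of a merged TAD file can be read into named records. This is a sketch under assumptions that the text above does not confirm: the file is taken to be tab-separated, positions are taken to be integers, and any line that does not parse as seven numeric fields (such as a possible header) is skipped. The names `MergedTad` and `read_merged_tads` are hypothetical.

```python
import csv
from typing import NamedTuple


class MergedTad(NamedTuple):
    p_value: float
    start: int
    end: int
    laplacian_similarity: float
    corner_split_ratio: float
    stratum_adjusted_correlation: float
    image_hash_similarity: float


def read_merged_tads(path):
    """Read a *.all.merge.txt file into a list of MergedTad records."""
    records = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != 7:
                continue  # skip blank or malformed lines
            try:
                records.append(
                    MergedTad(
                        float(row[0]),
                        int(float(row[1])),  # tolerate values such as "1000.0"
                        int(float(row[2])),
                        *(float(cell) for cell in row[3:]),
                    )
                )
            except ValueError:
                continue  # skip a header line, if the file has one
    return records
```

Check a few parsed records against the raw file before relying on the result, since the exact file layout is assumed here.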

Among the output files, "A->B.all.merge.txt" lists the coordinates of merged TADs when comparing sample A to sample B, and "A->B.all.split.txt" lists the coordinates of split TADs when comparing sample A to sample B.
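
The result files can be collected from an output folder by their suffixes. The sketch below is illustrative only; the function name `find_result_files` is hypothetical, and it assumes the result files sit directly in the output folder rather than in subfolders.

```python
from pathlib import Path


def find_result_files(output_dir):
    """Return the split and merge result files found directly in `output_dir`."""
    root = Path(output_dir)
    return {
        "split": sorted(root.glob("*.all.split.txt")),
        "merge": sorted(root.glob("*.all.merge.txt")),
    }
```

Each value is a sorted list of `Path` objects, and a list is empty when no file with that suffix is found.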

Update

08/17/2020

Removed the Python dependency rpy2
Fixed bugs affecting small input files

Simulation

In general, TADs can be simulated with a command of the following form:

Rscript simulation.R -i <path of reference TAD> -o <path of output>

Reference TADs are available in the ./reference folder.

This source code is released under an open source license compliant with the MIT license, which is approved by the Open Source Initiative (OSI) (https://opensource.org/licenses).

How to Cite?

Please cite the following publication:

Wang, G., Meng, Q., Xia, B. et al. TADsplimer reveals splits and mergers of topologically associating domains for epigenetic regulation of transcription. Genome Biol 21, 84 (2020). https://doi.org/10.1186/s13059-020-01992-7
