RNA-seq pipeline: HISAT2, StringTie, Ballgown
Ricky Woo edited this page Dec 1, 2017
·
1 revision
ENSEMBL_RELEASE=80
ENSEMBL_GRCh38_BASE=ftp://ftp.ensembl.org/pub/release-${ENSEMBL_RELEASE}/fasta/homo_sapiens/dna
ENSEMBL_GRCh38_GTF_BASE=ftp://ftp.ensembl.org/pub/release-${ENSEMBL_RELEASE}/gtf/homo_sapiens
F=Homo_sapiens.GRCh38.dna.primary_assembly.fa
GTF_FILE=Homo_sapiens.GRCh38.${ENSEMBL_RELEASE}.gtf
## download the genome file
curl -o $F.gz ${ENSEMBL_GRCh38_BASE}/$F.gz
gunzip $F.gz
mv $F genome.fa
## download the gene annotation (GTF) file
curl -o ${GTF_FILE}.gz ${ENSEMBL_GRCh38_GTF_BASE}/${GTF_FILE}.gz
gunzip ${GTF_FILE}.gz
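## extract splice sites and exons from the uncompressed GTF (the .gz is gone after gunzip)
hisat2_extract_splice_sites.py ${GTF_FILE} > genome.ss
hisat2_extract_exons.py ${GTF_FILE} > genome.exon
## build the HISAT2 index; the output directory must exist first
mkdir -p hisat2
hisat2-build -p 4 --ss genome.ss --exon genome.exon genome.fa hisat2/genome_tran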
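Index builds on a full human genome take a long time and can fail partway through, so it may be worth checking the result before starting any alignment. The sketch below assumes a standard (small) HISAT2 index, which is written as eight files named `<prefix>.1.ht2` through `<prefix>.8.ht2`; a large index would use the `.ht2l` suffix instead and is not handled here. The function `check_hisat2_index` is a hypothetical helper written for this page, not part of HISAT2.

```shell
# Hypothetical helper (not shipped with HISAT2): succeed only if all eight
# parts of a small index, <prefix>.1.ht2 ... <prefix>.8.ht2, exist and are non-empty.
check_hisat2_index() {
    local prefix="$1" missing=0 i
    for i in 1 2 3 4 5 6 7 8; do
        if [ ! -s "${prefix}.${i}.ht2" ]; then
            echo "missing or empty: ${prefix}.${i}.ht2" >&2
            missing=$((missing + 1))
        fi
    done
    # Return status 0 when nothing is missing, non-zero otherwise.
    [ "$missing" -eq 0 ]
}

# Usage, with the prefix passed to hisat2-build:
# check_hisat2_index hisat2/genome_tran && echo "index looks complete"
```

The check only confirms that the files are present and non-empty; it does not validate their contents.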
On the way to the garden of bioinformatics.
A bioinformatics wiki for the course BI462.