Failed to read fasta index (error 2) #319

Open
ramnageena11 opened this issue Dec 20, 2024 · 5 comments
Labels
troubleshooting workflow and data preparation questions

Comments

@ramnageena11

Please look into this error and suggest how to resolve it.
Here is what I did:

  1. I indexed the .bam (from dorado) and then converted it to FASTA (reads).
  2. Reference: an assembled genome (from NCBI) or an assembly from the same long reads (used for the modification analysis).

modkit pileup -t 4 --cpg --ref /home/Documents/......../genome_assemblies/bc01/ /home/Documents/demux-pod5/EXP-NBD104_barcode01.bam bc01.bed

calculated chunk size: 6, interval size 100000, processing 600000 positions concurrently
filtering to only CpG motifs
Error! Failed to read fasta index from "/home/Documents/genome_assemblies/bc01/.fai"
caused by No such file or directory (os error 2)

@ArtRand
Contributor

ArtRand commented Dec 21, 2024

Hello @ramnageena11,

Could you try passing a path to the FASTA-formatted file containing the reference sequence? Something like /home/Documents/genome_assemblies/bc01/ref.fa. There also needs to be an index next to it, such as /home/Documents/genome_assemblies/bc01/ref.fa.fai.
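
To illustrate why the error mentions a file called ".fai", here is a minimal sketch. It assumes the index is looked up by appending ".fai" to whatever is passed to --ref, which is what the error message above suggests; `expected_fai` is a hypothetical helper, not part of modkit or samtools.

```shell
# Hypothetical helper: print the index path that corresponds to a --ref value,
# assuming the index is expected at "<ref>.fai".
expected_fai() {
    printf '%s.fai\n' "$1"
}

# Passing the directory reproduces the path from the error message.
expected_fai "/home/Documents/genome_assemblies/bc01/"

# Passing the FASTA file itself gives the usual index location.
expected_fai "/home/Documents/genome_assemblies/bc01/ref.fa"
```

The first call prints a path ending in "bc01/.fai", which does not exist; the second prints the ref.fa.fai path mentioned above.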

ArtRand added the troubleshooting workflow and data preparation questions label Dec 21, 2024
@ramnageena11
Author

Hi ArtRand,
Thanks for the suggestion. Do you mean I need to create an index file for the reference genome (NCBI) or for the assembled genome (built from the same reads used for the epigenomic analysis)?

Thanks
Ram

@ArtRand
Contributor

ArtRand commented Jan 3, 2025

Hello @ramnageena11,

You need to create an index for the reference FASTA you aligned the reads to. So if this is the assembly, you should use the same reference sequence.

@ramnageena11
Author

Hi ArtRand,
I indexed ref.fa to create ref.fa.fai, but the pileup command then failed with another error (101), saying there were no such files.
Please see the errors:
modkit pileup /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam /home/dnasequencer/Documents/payal_epi/demux-pod5/pileup/pileup_01.bed --ref /home/dnasequencer/Documents/payal_epi/OA-G20_genome/ref.fa --preset traditional

Error! unable to open SAM/BAM/CRAM index for /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam; please create an index

modkit pileup /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam /home/dnasequencer/Documents/payal_epi/demux-pod5/pileup/pileup_01.bed --ref /home/dnasequencer/Documents/payal_epi/OA-G20_genome/ref.fa.fai --preset traditional

Error! unable to open SAM/BAM/CRAM index for /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam; please create an index

Let me explain what I have done:
Experiment design: 10 barcode files (5 samples with 2 replicates)

  1. QC using "dorado" with SUP (nanopore).
  2. Demuxed the reads by barcode (.fastq), and also produced demuxed .bam files.
  3. I assembled all the sequenced reads (barcodes) into a separate assembly (.fa).
  4. Downloaded the genome reference from NCBI. Will this be the reference, or will the assemblies be the reference?

I was using the pileup command and got an error.

Now I have used dorado aligner to align the reads (demuxed .fastq) to the reference (.fa) and created a .bam. Again it gives an error:
modkit pileup /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam /home/dnasequencer/Documents/payal_epi/demux-pod5/pileup/pileup_01.bed --ref /home/dnasequencer/Documents/payal_epi/OA-G20_genome/ref.fa.fai --preset traditional

Error! unable to open SAM/BAM/CRAM index for /home/dnasequencer/Documents/payal_epi/dorado_aling_bam/aligned_bc01.bam; please create an index

Which "index" is it asking me to create?

Could you please tell me all the steps for modkit (consider me a beginner), starting from raw sequence data through to visualization of the results?
I would be highly grateful.

Thanks
rgds
Ram

@ArtRand
Contributor

ArtRand commented Jan 3, 2025

Hello @ramnageena11,

A few things to check.

  1. To run pileup you need two indices: one for the aligned, sorted modBAM file (usually .bai) and one for the FASTA reference. You create the former with samtools index ${bam} and the latter with samtools faidx ${ref}.
  2. (Optional) Make sure that you haven't lost the modified base information in your reads by running modkit summary ${modbam} --threads ${threads}. If you used dorado aligner to align your sequencing reads, you should be fine.
  3. Run pileup using the commands you have posted.
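
The checks in step 1 can be sketched as a small preflight function. This is only an illustration: `check_pileup_inputs` is a hypothetical helper (not part of modkit or samtools), and it assumes the indices sit next to the files as `<bam>.bai` (or `<bam>.csi`) and `<ref>.fai`.

```shell
# Hypothetical preflight helper: report missing inputs for modkit pileup.
# The body runs in a subshell, so variables stay local and "exit" only
# leaves the function, not the calling shell.
check_pileup_inputs() (
    bam=$1   # aligned, sorted modBAM
    ref=$2   # reference FASTA (the .fa file, not the .fai)
    status=0

    if [ ! -f "$ref" ]; then
        echo "reference is not a regular file: $ref"
        status=1
    elif [ ! -f "$ref.fai" ]; then
        echo "missing FASTA index, run: samtools faidx $ref"
        status=1
    fi

    if [ ! -f "$bam" ]; then
        echo "BAM not found: $bam"
        status=1
    elif [ ! -f "$bam.bai" ] && [ ! -f "$bam.csi" ]; then
        echo "missing BAM index, run: samtools index $bam"
        status=1
    fi

    exit "$status"
)

# Example call (paths are placeholders):
#   check_pileup_inputs aligned_bc01.bam ref.fa && echo "inputs look ready"
```

If the function prints nothing and returns success, both indices were found where this sketch expects them.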

Downloaded the genome reference from NCBI. Will this be the reference, or will the assemblies be the reference?

Use the reference sequence you aligned the reads to; it sounds like this is either the assembly or the NCBI reference.
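
Whichever reference that is, the same FASTA goes to both the aligner and --ref. Two of the commands posted above pass ref.fa.fai to --ref; going by the earlier reply, --ref should point at the FASTA itself, with the .fai sitting next to it. As a rough sketch of the order of operations, the hypothetical helper below only prints the commands and runs nothing. The explicit `samtools sort` step, the ".sorted.bam" name, and the thread count of 4 are assumptions for illustration.

```shell
# Hypothetical helper: print the indexing and pileup commands in order.
# It does not execute anything; review the output before running it.
print_pileup_plan() (
    bam=$1   # aligned modBAM, e.g. from dorado aligner
    ref=$2   # the FASTA the reads were aligned to (not the .fai)
    bed=$3   # output path for the pileup
    sorted="${bam%.bam}.sorted.bam"   # assumed name for the sorted BAM

    echo "samtools faidx $ref"
    echo "samtools sort -o $sorted $bam"
    echo "samtools index $sorted"
    echo "modkit summary $sorted --threads 4"
    echo "modkit pileup $sorted $bed --ref $ref --preset traditional"
)

print_pileup_plan aligned_bc01.bam ref.fa pileup_01.bed
```

If the BAM is already coordinate-sorted, the sort line can be skipped and the index built on the original file.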
