bam to fastq one liner #13

LukeBraidwood · 2015-06-15T14:26:08Z

Hey,

Thanks very much for putting these explanations and tools up. I think the one-liner you posted for converting BAM to FASTQ is inappropriate (or should be described differently). The problem is that your awk command prints fields 1, 10, and 11 of each record exactly as they are stored in the BAM.

Field 10 is called SEQ and holds the read sequence. However, aligned sequences are always represented on the plus strand of the reference (http://chagall.med.cornell.edu/NGScourse/SAM.pdf, http://genome.sph.umich.edu/wiki/SAM), so reads aligned to the minus strand are stored reverse-complemented (with the qualities in field 11 reversed). That makes this tool inappropriate for stranded BAMs, since the output no longer matches the original reads.
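To make the point concrete, here is a minimal Python sketch of the extra step the awk approach skips: checking the reverse-strand bit (0x10) in the FLAG field and, when it is set, reverse-complementing SEQ and reversing QUAL. The function names `sam_record_to_fastq` and `convert` are my own for illustration, and the sketch assumes plain tab-separated SAM text such as the output of `samtools view`. It deliberately ignores secondary/supplementary alignments and paired-end mate handling, so treat it as an illustration rather than a replacement for a dedicated tool.

```python
# Complement table for the common bases; any other symbol is left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# SAM FLAG bit 0x10: SEQ is stored reverse-complemented.
REVERSE_FLAG = 0x10


def sam_record_to_fastq(line):
    """Convert one SAM alignment line to a 4-line FASTQ record.

    SEQ and QUAL are restored to the original read orientation
    when the reverse-strand flag is set.
    """
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]
    if flag & REVERSE_FLAG:
        seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
        qual = qual[::-1]                       # qualities are only reversed
    return f"@{name}\n{seq}\n+\n{qual}\n"


def convert(sam_lines, out):
    """Write FASTQ for every alignment line, skipping '@' header lines."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        out.write(sam_record_to_fastq(line))
```

Saved in a small script that calls `convert(sys.stdin, sys.stdout)`, this could be fed from `samtools view file.bam`. A read with FLAG 16, SEQ `AACG` and QUAL `IIH#` would come out as `CGTT` with qualities `#HII`, whereas printing fields 10 and 11 directly would leave it as `AACG` / `IIH#`.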

Thanks,

Luke

stephenturner · 2015-06-15T15:36:33Z

Thanks. Suggestion / pull request welcomed.

Stephen

LukeBraidwood · 2015-07-14T12:33:41Z

Dear Stephen,

Sorry for the slow reply, I just remembered this exchange. I'm currently
using the SamToFastq tool from Picard, which has an option to
restore the reverse complement (RC) of reads aligned to the negative strand:
http://broadinstitute.github.io/picard/command-line-overview.html#SamToFastq

Cheers,

Luke
