junction score with paired-end data - is it # of reads or fragments? #170

Open
nmanik opened this issue Apr 16, 2013 · 4 comments

Comments

@nmanik

nmanik commented Apr 16, 2013

Hi Mike,
I hope you're doing well. I'm back again after a hiatus, with a simple question this time. From the user guide/wiki, I see that the score in junctions_high-quality.bed "is the number of uniquely mapping reads crossing the junction with at least 8 bases on each side."

If I have paired-end data, is this score the number of reads or the number of fragments crossing the junction? I would like to get the fragment count, as it avoids overcounting when fragments are short (and hence both the left and right mate reads cross the junction).

If RUM only reports reads, do you have plans for reporting fragments, or any recommendations on how I can get the fragment count (possibly from some other output file, or which code I should look into)?

Thanks,
Mani

@mdelaurentis
Contributor

Mani,

I believe the counts are based on fragments, not reads. By the time we
produce the junctions files, we have already merged overlapping paired
reads. So if the forward and reverse read for the same fragment both span a
junction, they will overlap each other, and so will have been merged
together anyway and will only be counted once.

I'm copying Greg for confirmation, as he is more familiar with this part of
RUM than I am.

Thanks,

Mike
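
For readers who want to see the read-versus-fragment distinction concretely, here is a minimal sketch. It is not RUM's code: the `Read` tuple, the block representation (1-based inclusive coordinates), and the function names are all assumptions made for illustration. It also does not merge overlapping mates the way Mike describes; it simply deduplicates by fragment id, which should give the same count whenever both mates of a fragment cross the same junction.

```python
from collections import namedtuple

# Hypothetical representation: one aligned read, tagged with the id of the
# fragment it came from. blocks is a list of (start, end) aligned segments,
# 1-based inclusive, in genomic order.
Read = namedtuple("Read", ["fragment_id", "blocks"])

MIN_OVERHANG = 8  # "at least 8 bases on each side", per the user guide quote


def crosses(read, intron_start, intron_end, min_overhang=MIN_OVERHANG):
    """Return True if the read is spliced across the intron with enough
    aligned bases on both sides of it."""
    for (s1, e1), (s2, e2) in zip(read.blocks, read.blocks[1:]):
        if e1 == intron_start - 1 and s2 == intron_end + 1:
            left = sum(e - s + 1 for s, e in read.blocks if e <= e1)
            right = sum(e - s + 1 for s, e in read.blocks if s >= s2)
            return left >= min_overhang and right >= min_overhang
    return False


def junction_counts(reads, intron_start, intron_end):
    """Return (read_count, fragment_count) for one junction."""
    crossing = [r for r in reads if crosses(r, intron_start, intron_end)]
    return len(crossing), len({r.fragment_id for r in crossing})


# Made-up example: intron spans positions 101..200.
EXAMPLE_READS = [
    Read("f1", [(81, 100), (201, 230)]),  # forward mate crosses
    Read("f1", [(91, 100), (201, 240)]),  # reverse mate crosses too
    Read("f2", [(51, 100)]),              # unspliced mate
    Read("f2", [(96, 100), (201, 245)]),  # only 5 bases on the left side
    Read("f3", [(61, 100), (201, 210)]),  # crosses
]

read_count, fragment_count = junction_counts(EXAMPLE_READS, 101, 200)
print(read_count, fragment_count)
```

In this made-up data, fragment f1 contributes two crossing reads but only one fragment, which is exactly the overcounting the question is about.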

@mdelaurentis
Contributor

That's correct, they are FPKM's. Mike, we should change this in our
documentation. Thanks, Greg

@nmanik
Author

nmanik commented Apr 17, 2013

Thanks Mike & Greg for clarifying this!

One more clarification - I think Greg meant fragment counts and not FPKM (as FPKM would mean the fragment count normalized by the total number of mapped fragments and the length of the covered region -- I don't think the score in the junctions_high-quality.bed file involves any normalization).
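
To make the difference concrete, here is a small sketch of the standard FPKM definition (fragments per kilobase of feature per million mapped fragments) next to a raw count. The function name and the example numbers are made up for illustration and are not taken from RUM.

```python
def fpkm(fragment_count, feature_length_bp, total_mapped_fragments):
    """Fragments per kilobase of feature per million mapped fragments."""
    # count / (length in kb) / (total in millions)
    #   = count * 1e9 / (length_bp * total)
    return fragment_count * 1e9 / (feature_length_bp * total_mapped_fragments)


# Hypothetical numbers: 50 fragments on a 2,000 bp feature,
# out of 10 million mapped fragments in the sample.
raw_count = 50
print(raw_count, fpkm(raw_count, 2_000, 10_000_000))
```

The raw count stays 50 regardless of sequencing depth or feature length, while the FPKM value changes with both, so the two should not be labelled interchangeably.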

@greggrant
Contributor

Right, the raw counts are fragment counts.
