multi-mapper in RUM_Unique... revisited #158

Open
safisher opened this issue Dec 19, 2012 · 2 comments

@safisher

I'm running 2.0.3_04 and get the following warning:

Tue Dec 18 16:39:21 2012 50223 WARN RUM::Script::CountReadsMapped - Looks like there's a multi-mapper in the RUM_Unique file. 20933270 () seq.20933270a chr2 51817713-5 1817757 + CAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA
Tue Dec 18 16:39:21 2012 50223 WARN RUM::Script::CountReadsMapped - Looks like there's a multi-mapper in the RUM_Unique file. 20933270 () seq.20933270a chr2 51817713-5 1817757 + CAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA

I ran this alignment twice and got the same error both times. I'm saving the RUM directory for now, in case you're interested in having a look. It's on the Kim cluster in
kimclust17:/local17/fisher/e.20/RN0015/rum.trim

The source files are in
kimclust17:/local17/fisher/e.20/RN0015/trim.Ad.PolyAT

Thanks!

@mdelaurentis
Contributor

I'll take a look at it. I briefly looked at the log file, and it does appear to be the same issue as last time. I'll run a very small job that includes that read, with the --no-clean option, and see at which step of the pipeline it first shows up as a duplicate.

@mdelaurentis
Contributor

I ran a very small job that includes that read, plus the hundred reads that appear before and after it in the input file. It wasn't duplicated in those results. It just appeared once in the RUM_Unique file. Now I'm running the full job, with the same number of chunks that you used (30). I'm thinking it may have had something to do with the read's position in the input.
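
For anyone who wants to build a similar small test input, here is a minimal sketch of pulling out one read plus its neighbors. It is not part of RUM: the function name `window_around`, the `(read id, sequence)` record layout, and the toy read IDs are all assumptions made for illustration, and a real reads file would need its own parser.

```python
def window_around(records, target_id, flank=100):
    """Return the target record plus up to `flank` records on each side.

    `records` is a list of (read_id, sequence) pairs in input order.
    This layout is hypothetical; it only stands in for a parsed reads file.
    """
    ids = [read_id for read_id, _ in records]
    i = ids.index(target_id)  # raises ValueError if the read is absent
    # Clamp the left edge at 0; slicing already clamps the right edge.
    return records[max(0, i - flank): i + flank + 1]


# Toy input standing in for the real reads file.
reads = [(f"seq.{n}", "ACGT") for n in range(1, 1001)]
subset = window_around(reads, "seq.500")
print(len(subset), subset[0][0], subset[-1][0])
```

As the result above suggests, a subset like this keeps the read's sequence but not its position in the full input, so it cannot reproduce a problem that depends on where the read falls relative to a chunk or block boundary.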

mdelaurentis added a commit that referenced this issue Dec 21, 2012
This was a weird edge case where merge_GU_and_TU would (rarely)
add a duplicate copy of a mapping. It looks like it has to do
with how the script handles blocks of records at a time in order
to avoid using too much memory. It was inadvertently retaining
the last record from one block when starting another block.
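
The kind of bug described in that commit message can be illustrated with a small sketch. This is not the merge_GU_and_TU code; `emit_in_blocks`, its `clear_buffer` flag, and the toy read IDs are hypothetical, and the toy version duplicates a record at every block boundary, whereas the real bug was triggered only rarely.

```python
from collections import Counter


def emit_in_blocks(records, block_size, clear_buffer=True):
    """Copy records to the output one block at a time to bound memory use.

    With clear_buffer=False the last record of each flushed block is
    left in the buffer, so it gets written again with the next block.
    """
    out = []
    block = []
    for rec in records:
        block.append(rec)
        if len(block) == block_size:
            out.extend(block)  # flush the full block
            # Correct: start the next block empty.
            # Buggy: carry the already-flushed last record forward.
            block = [] if clear_buffer else [block[-1]]
    out.extend(block)  # flush whatever remains
    return out


records = [f"seq.{n}" for n in range(1, 8)]
fixed = emit_in_blocks(records, block_size=3)
buggy = emit_in_blocks(records, block_size=3, clear_buffer=False)

# Read IDs that appear more than once in the buggy output.
dupes = [rec for rec, count in Counter(buggy).items() if count > 1]
print(fixed == records)
print(dupes)
```

In this sketch only records that sit at the end of a block are duplicated, which would be consistent with a read appearing once in a small job and twice in the full job.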