Can I use ncls to calculate the intersection/union number between ranges? #2

Open
Runsheng opened this issue Aug 18, 2018 · 4 comments

Comments

@Runsheng

Is there any method in ncls that returns the intersection and union of two ranges?
For instance, range(1,10) and range(5,15) would return (5,10) and (1,15).

Or one that simply returns the lengths of the intersection and union, like bedtools jaccard? [https://bedtools.readthedocs.io/en/latest/content/tools/jaccard.html]
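
To make the request concrete, here is a minimal pure-Python sketch of the two operations for a single pair of ranges. It does not use ncls; the function names `intersect` and `span_union` are hypothetical, and the ranges are assumed to be half-open `(start, end)` tuples, as with Python's `range`.

```python
def intersect(a, b):
    """Return the overlap of two half-open (start, end) ranges, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    # An empty or negative-width overlap means the ranges do not intersect.
    return (start, end) if start < end else None


def span_union(a, b):
    """Return the union of two overlapping ranges as one (start, end) range."""
    if intersect(a, b) is None:
        # Disjoint ranges cannot be described by a single (start, end) pair.
        raise ValueError("ranges do not overlap")
    return (min(a[0], b[0]), max(a[1], b[1]))


print(intersect((1, 10), (5, 15)))   # (5, 10)
print(span_union((1, 10), (5, 15)))  # (1, 15)
```

The lengths follow directly as `end - start`, here 5 and 14, so the ratio of intersection length to union length for this pair is 5/14.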

@endrebak
Collaborator

I’ll reply in more depth on Monday, when I’m back at work :) Pyranges should be able to do this. The repo is on my GitHub :)

@Runsheng
Author

Thank you very much! I am currently using pybedtools to calculate the intersection between ranges. However, computing the intersection matrix between 300 mRNA tracks (each containing around 15 ranges) takes about 400 seconds on a 32-core server. I will try Pyranges first and give you some feedback.
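
For tracks this small, the matrix described above can also be sketched in plain Python, with no external tools. This is only an illustration under stated assumptions, not the pybedtools or Pyranges approach: each track is taken to be a list of half-open `(start, end)` tuples on the same chromosome and strand, and the helper names (`merge`, `covered`, `intersection_length`, `jaccard_matrix`) are hypothetical.

```python
def merge(intervals):
    """Sort intervals and merge overlapping or touching ones."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered(merged):
    """Total number of positions covered by a merged interval list."""
    return sum(end - start for start, end in merged)


def intersection_length(a, b):
    """Overlap length of two merged, sorted interval lists (two-pointer sweep)."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard_matrix(tracks):
    """Symmetric matrix of intersection length divided by union length."""
    merged = [merge(t) for t in tracks]
    sizes = [covered(m) for m in merged]
    n = len(tracks)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            inter = intersection_length(merged[i], merged[j])
            union = sizes[i] + sizes[j] - inter
            matrix[i][j] = matrix[j][i] = inter / union if union else 0.0
    return matrix


tracks = [[(1, 10)], [(5, 15)], [(20, 30)]]
result = jaccard_matrix(tracks)
```

Each track is sorted and merged once up front, so every pairwise comparison is a single linear sweep over two short lists. Real data would need to be grouped by chromosome (and strand, if relevant) before applying this.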

@endrebak
Collaborator

endrebak commented Aug 20, 2018

pyranges is still largely unused. The unit tests pass, but it might still have bugs or not work.

I would also look into this potential error in bedtools jaccard: arq5x/bedtools2#645. Whether it is a bug, and whether it matters, I don't know :)

@endrebak
Collaborator

Also, if you use pybedtools, it is advisable to presort the data first. It is much faster on sorted input.
