[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4) #9

Open
ekg opened this issue Oct 6, 2020 · 17 comments

Comments

@ekg
Contributor

ekg commented Oct 6, 2020

I finally found a small reproducible example of an alignment problem.

To reproduce on this input FASTA, fail_smoothxg_block_3055.fa.txt:

abpoa -s -r 3 fail_smoothxg_block_3055.fa
[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4)
@yangao07
Owner

yangao07 commented Oct 6, 2020

Thank you @ekg for providing this example.
This is a bug related to the banding.
So disabling banded DP (setting b to -1) is the easiest way to get rid of it.
I am also trying to figure out how to fix it.

The banded DP is more fragile when the sequence lengths differ too much, as in this data: 264 vs 443.
This happened previously; I thought I had fixed it.
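
To illustrate why a large length difference stresses a band, here is a minimal sketch of a plain edit-distance DP restricted to a fixed-width band around the diagonal. It is not abPOA's adaptive banding or its SIMD code; the function name `banded_edit_distance` and the fixed width `w` are assumptions made only for this illustration. When the length difference exceeds the band width, the final cell is never filled and there is nothing to trace back from.

```python
def banded_edit_distance(a, b, w):
    """Edit distance using only cells with |i - j| <= w.

    Returns None when the band never reaches the final cell (len(a), len(b)).
    Illustrative only: a fixed band, not abPOA's adaptive band.
    """
    INF = float("inf")
    n, m = len(a), len(b)
    # Row 0: only the first w + 1 columns lie inside the band.
    prev = [j if j <= w else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(0, i - w), min(m, i + w)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i  # first column: i deletions
                continue
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or mismatch
                prev[j] + 1,                           # gap in b
                cur[j - 1] + 1,                        # gap in a
            )
        prev = cur
    return None if prev[m] == INF else prev[m]


# Lengths taken from the comment above: 264 vs 443, a difference of 179.
short, long_ = "A" * 264, "A" * 443
print(banded_edit_distance(short, long_, 10))   # band too narrow
print(banded_edit_distance(short, long_, 179))  # band just wide enough
```

With a width of 10 the end cell is unreachable, while a width equal to the length difference is the smallest that can work at all in this simplified model.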

@ekg
Contributor Author

ekg commented Oct 7, 2020

Any workaround (even dropping into non-banded mode when this happens) would be helpful! What do you suggest?

Running everything non-banded to avoid this issue would be expensive.
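
One way to approximate the "drop into non-banded mode only when this happens" idea from outside the tool is a wrapper that reruns the failing input without banding. This is a rough sketch under assumptions: that the failure shows up as a non-zero exit code or as the `Error in cg_backtrack` message on stderr, and that `-b -1` is how "set b as -1" is spelled on the command line. The function name and the injectable `run` parameter are hypothetical.

```python
import subprocess


def run_with_unbanded_fallback(fasta, base_cmd=("abpoa", "-s", "-r", "3"),
                               run=subprocess.run):
    """Try the banded run first; rerun without banding only if it fails.

    Returns (stdout, mode) where mode is "banded" or "unbanded".
    """
    first = run([*base_cmd, fasta], capture_output=True, text=True)
    if first.returncode == 0 and "Error in cg_backtrack" not in first.stderr:
        return first.stdout, "banded"

    # Assumption: "-b -1" disables banded DP, as suggested earlier in the thread.
    second = run([*base_cmd, "-b", "-1", fasta], capture_output=True, text=True)
    if second.returncode != 0:
        raise RuntimeError(second.stderr)
    return second.stdout, "unbanded"
```

Only the blocks that actually fail pay the cost of the non-banded run, which keeps the common case fast.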

@rvolden

rvolden commented Oct 13, 2020

I'm also running into this issue when using the python API. Instead of being able to handle the error, the thread that gets the error just hangs since this error kicks you out of python. Is there a way for me to be able to handle this error through the python API? Disabling adaptive banding takes too long.
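
Until the error can be caught from Python, one possible stopgap is to run each alignment in a child interpreter, so that a hard exit in the native code ends only the child. The sketch below assumes the error terminates the process, as described above; the helper name `run_isolated` is hypothetical, and the snippet passed to it stands in for whatever alignment code you would actually run.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a separate interpreter.

    Returns (ok, stdout, stderr). A crash, a non-zero exit, or a timeout in
    the child is reported to the caller instead of taking down the parent.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "", "timed out"
    return proc.returncode == 0, proc.stdout, proc.stderr


# Stand-in for an alignment that exits the interpreter with an error.
ok, out, err = run_isolated("import sys; sys.exit('Error in cg_backtrack. (2)')")
if not ok:
    print("alignment failed in child:", err.strip())
```

The parent can then decide to retry with different parameters or skip the input, at the cost of one interpreter start-up per alignment.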

@ekg
Contributor Author

ekg commented Oct 13, 2020 via email

@yangao07
Owner

@rvolden As mentioned by Erik, this new flag has not been added to pyabpoa yet; I will get it done soon.
However, I haven't implemented the ambiguous strand mode in Python yet.
So I guess what you ran into is different from what Erik posted here.
Can you share the sequences that cause the error? That would be very helpful.
Thanks.

@yangao07
Owner

Anyway, the ultimate goal is to fix this bug rather than just breaking out of the loop without providing any alignment result.
I am working on that.

@rvolden

rvolden commented Oct 14, 2020

You're right, the traceback error is 2, not 4. It's for a pairwise alignment where one has a long polyA but the other doesn't.
I'm including the initialization as well as the sequences here:

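import pyabpoa as poa  # assumed import; the alias matches the calls below

poa_aligner = poa.msa_aligner(match=5, extra_b=16)  # anything lower for extra_b throws the traceback error
res = poa_aligner.msa(subreads, out_cons=False, out_msa=True)  # subreads presumably holds the two sequences listed below
# errors out here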
>0
CTGACATTTCGGTGGAGAATTTTTTTATATTTGTATTCTCAGCGTAAAGTCTCCCCTGGATATATTTGTGTTTATGCTGATATTGGCATCCATGTTTGACGGAGGATTATCAGGTAGGTAAATTACTTCATTTGGAGATGAGGTGGTTGTACATTAACTTCCCTCCTCC
TATATTGACTAGCCTTCAACTGGTTCTAAGCAGTGGTATCAACGCAGAGTACATGGGGATTCCTGAAGCTGACAGCATTCGGGCCGAATGTCTCGCTCCGTGGCCTAGCTGTGCTCGCGCTTCTCTCTCTTTCTGGCCTGGAGGCTATCAGCGTACTCCAAAGATTCAGGT
TTACTCACGTCATCACAGAGAATGGAAAGTCAAATTTCCTGAATTGCTATGTGTCAGGTTTTCATCCATCCGACATTGAAGTTGACTTACTGAAGAATGGAGAAGAATTGAAAAGTGGAGCATTCAGACTTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTAC
ACTGAATTCACCCCCACTGAAAAGATAGGTATACTGCCATGTAGAACCATGTGACTTTGTCACAGCCCAAGATAGTTAAGTGGGATCCGAGACATGTAAGCAGCATCATGGAGGTTTGAAGATGCCGCATTTGGATTGGATGAATTCAAATTCTGCTTGCTTGCTTTTTAA
TATTGATATGCTTATACACTTACACTTTATGCACAAAATGTAGGGTTATAATAATGTTAACATGGACATGATCTTCTTTATAATTCTACTTTGAGTGCTGTCTCCATGTTTGATGTATCTGAGCAGGTTGCTCCACAGGTAGCTCTAGGAGGGCTGGCAACAGAGGTGGGA
GCAGAGATTCTCTTATCCAACATCAACATCTTGGTCAGATTTGAACTCTTCAATCTCTTGCACTCAAAGCTTGTTAAGATAGTTAAGCGTGCATAAGTTAACTTCCAATTTACATACTCTGCTTAGAATTTGGGGGAAAATTTAAATATAGTTGAACCCAGGATTATTGGA
AATTTGTTATAATGAATGAAACATTTTGTCATATAAGATTCATATTTACTTCTTATACATTTGATAAAGTAAGGCATGGTTGTGGTTAATCTGGTTTTATTTTTGTTCCACAAGTTAAATAAATCATAAAACTTGAAAAAAAAAAAAAAAAAAAAAAAAAAAATAAAAAAA
AAAAAAAAAAAAAGTATTCCATAAGACTCTGCGTTGATACCACTGCTT
>1
CTGACATTTCGGTGGAGAATCTTATTATATCGTGCTTCTCAACTGTAAAGTCTCCCTGGATATATTTGTGTTTATGCTGATATTGGCATCCATGTTTGACAGAGGATTATCAGGTAGGT
AAATTACTTCATTTGGAGATGAGGTGGTTGTACATTAACTTCCCTAAATATATATCTTCAAGCCTTCAACTGAAAGTTCTAAGCAGTGGTATCAACGCGAGTCTTTTTATGGAATACTTATTGAACAGGTAATTCACTGTAATATTTATTAAGTGATGACTAGAGGGATAT
TGATAGATGTAAAAATTTTCACTCACAGTGAACATGAAACCTTTACACATGTAAGGTTTAGATTCTTTTTTTTTAATCTGCCCCTTTCAGATTATATCATGGTATATGAAGCACTGGTGAGGTCTATGTCACCAGAAATTCCCCAGTTTGCTGATTTGTTAGGTTTTTTAA
CCCGATGATTGTACTGCAACAAGTGAGCATCATTCACTGCAACCTTGAAGTGGTCAGGTTCAACCAGTACTTGTATTTTGAATGGTTTCCCACTTTCAAATGGGAAAACCGACTGTCTTTCTTCCCTTCCCCAGTTATTATCCAGCTTTGTATTGCCAAACAATGACTCTC
CTGTTGTTCTCATTGAAGCGTGGGTTAAAGTGGAAGGCAACATCATTCCCTCTTTGGAAATCTAAAGCAATTCTGTTTGCATTGGGCTTCACCGTGCCCAGAATTGTTATCAGCATGCGAGGCACCACTCCCCGGTAAAGAGAGCAGGTTATAAGGCACAATCAGTGGCCC
AGCAGGGGCGCCATAGGGGCCAGTGGCGGGAGTAGGCTCCGGTGGCACTTGGCTGTCCAGAAGATGGGTAGGCCCCAGGGCCGCTGGGTGGCCCTGGTGGGCTCCAGGTGCAGGTGCCGGGATAAGCTCCAGGTGCTCCAGGGTAGGCGCCTGGAGGTGCCTGGTCAGGAT
AGCCCCCTGGGGTGCCTGCCCGGGGTAGGCCCAGGATGGGGCCCTGGGTGGCCCCTGCCCCAGCAGGCTGGTTCCCCCATGCGCCAGGCTCGCCAGGGTTTGGGTTTCCAGACCCAGATAACGCATCATGGAGCGCTCGTTGGCTGGCTCCGGACGGCTGCTGGCGAGGAG
GTGCTGCGGGCCCCCCATGTACTCTGCGTTGATACCACTGCTTCT

@yangao07
Owner

I modified the traceback code in the latest commit. Hopefully, this resolves these bugs.
I also removed the trackback_ok flag, since it is not needed if the traceback step can always finish.

@ekg @rvolden This works on the two sequence sets you provided here; please try it out on some other data.

Yan

@ekg
Contributor Author

ekg commented Oct 14, 2020

Unfortunately, I still find cases that cause this error.

fail_smoothxg_block_9338.fa.txt

-> % abpoa -s -r 3 fail_smoothxg_block_9338.fa.txt
[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4)

@rvolden

rvolden commented Oct 14, 2020

I don't get the error for python2, but I get it in python3. The only modification I made to the makefile for python3 was to change the command to python3 instead of python:

install_py: python/cabpoa.pxd python/pyabpoa.pyx python/README.md
	${py_SIMD_FLAG} python3 setup.py install

sdist: install_py
	${py_SIMD_FLAG} python3 setup.py sdist #bdist_wheel

To clarify, this is python 3.6.9, and the error I get with the same sequences I provided is [simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (2)

@yangao07
Owner

yangao07 commented Oct 15, 2020

> Unfortunately, I still find cases that cause this error.

@ekg This is a different case and different type of bug. Working on that.
Before I fix it, you probably want to roll back to the version where you added the traceback flag.

@yangao07
Owner

> I don't get the error for python2, but I get it in python3.

Nothing was changed related to the python side. Did you re-install pyabpoa in python3?

@rvolden

rvolden commented Oct 15, 2020

Yeah, I reinstalled using pip3 for python3

@yangao07
Owner

yangao07 commented Oct 15, 2020

> Yeah, I reinstalled using pip3 for python3

These changes haven't been pushed to PyPI yet, so pip3 install will give you the old version.
To install locally from source, try make install_py or python3 setup.py install.
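
When several installs may coexist (PyPI, a local build, different interpreters), it can help to check which copy a given interpreter actually imports. The helper below is a small sketch using only the standard library; `describe_install` is a hypothetical name, it needs Python 3.8 or later for `importlib.metadata`, and treating `pyabpoa` as both the module and distribution name is an assumption.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name=None):
    """Return (file location, installed version) for a module.

    Either value is None when the module or its distribution metadata
    cannot be found by the running interpreter.
    """
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec else None
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return location, version


# Run this with the same interpreter that shows the error, e.g. python3.
print(describe_install("pyabpoa"))
```

If the reported location or version is not the freshly built one, an older copy is shadowing it and should be removed before reinstalling.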

@rvolden

rvolden commented Oct 15, 2020

I should've been a bit clearer. I ran make install_py after modifying the Makefile. When that didn't work, I tried reinstalling with pip, and I also tried python3 setup.py install, which also throws the traceback error.

@yangao07
Owner

This really sounds weird to me. Also, it works on my pc when I install with python3.
Maybe you can try to remove everything and reinstall it.

@rvolden

rvolden commented Oct 15, 2020

Removed everything previously installed. It's working now. Thank you!


3 participants