Consensus coverage in diploid mode #29

Open
ldenti opened this issue Sep 13, 2021 · 8 comments

Comments

@ldenti

ldenti commented Sep 13, 2021

Hi all,
first of all: great library!

I was playing around with diploid mode and I ran into strange segmentation faults (when freeing cons_cov). From what I could see, in diploid mode, cons_cov is not used at all:

abPOA/src/abpoa_graph.c

Lines 817 to 818 in 637c79f

if (abpt->is_diploid) {
abpoa_diploid_heaviest_column(abg, ABPOA_SRC_NODE_ID, ABPOA_SINK_NODE_ID, n_seq, abpt->min_freq, out_fp, cons_seq, cons_l, cons_n);

So, I'm assuming that it's not possible to get consensus coverage in diploid mode. Is this true?
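For readers who hit the same crash: one plausible cause (an assumption, not something confirmed in this thread) is calling free() on an output pointer that the library never assigned. A minimal sketch of a defensive pattern follows; `maybe_fill` and `run_and_release` are hypothetical names used only for illustration and are not part of abPOA.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a library call that may or may not set *cov,
   e.g. a mode that skips coverage computation entirely. */
static void maybe_fill(int **cov, int n, int fill) {
    if (!fill) return;                        /* output pointer left untouched */
    *cov = malloc((size_t)n * sizeof **cov);
    if (*cov) {
        for (int i = 0; i < n; ++i) (*cov)[i] = 1;
    }
}

/* Returns 1 if coverage was produced, 0 otherwise. Safe in both cases. */
int run_and_release(int n, int fill) {
    int *cov = NULL;           /* NULL-initialize so free() is safe if untouched */
    maybe_fill(&cov, n, fill);
    int produced = (cov != NULL);
    free(cov);                 /* free(NULL) is a no-op per the C standard */
    return produced;
}
```

Initializing the pointer to NULL before the call means the later free() is harmless whether or not the callee wrote to it.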

Thanks,
Luca

@yangao07
Owner

You are right.
The diploid mode is not working as well as I expected, so I did not include it in the abPOA release.
Right now it can only produce a two-consensus result; no further functionality has been implemented for it.

By the way, what is your scenario for using diploid mode? Do the results look good to you?
I may continue to update this function at some point, though probably not soon.

Yan

@ldenti
Author

ldenti commented Sep 14, 2021

Ok, thanks!

I'm working on genotyping SVs and, at least from the few examples I opened in IGV, it seems to work quite well: when I clearly see two alleles in the reads, I get two consensus sequences.

Since the weight of each consensus is quite important to me, is it possible (using the currently available functions) to get the heaviest weighted consensus (in "haploid" mode) and then get the second heaviest consensus? Do you see any easy way to do this?

Luca

@yangao07
Owner

I will try to work on that and will get back to you when I have any updates.

Yan

@ldenti
Author

ldenti commented Sep 14, 2021

Oh, great! Thanks!

Best,
Luca

@yangao07
Owner

Hi Luca,

Sorry for not updating this for such a long time.
Do you have any diploid example/test data?
I am currently updating the diploid mode of abPOA, so any data would be very helpful!

Thanks,
Yan

@ldenti
Author

ldenti commented Feb 15, 2022

Hi Yan,
I just extracted these two small FASTA files from some reads I'm working on (hete-examples.zip): they contain portions (substrings) of the reads covering potential heterozygous events (a deletion and an insertion).

In the zip you can also find the IGV screenshots of the two regions I considered (the alignments clearly show some differences between the haplotypes).

Let me know if they are a good starting point or if you need anything else.

Best,
Luca

@yangao07
Owner

yangao07 commented Mar 15, 2022

Hi,

I just pushed the latest version to GitHub. Please try out the multiple consensus sequences mode: set -d/--max-n-cons to the desired value.
Also, all the relevant variables including cons_cov are properly set in this mode. See abpoa.h for more details.

abPOA/include/abpoa.h

Lines 101 to 111 in bfe4ac0

typedef struct {
    int n_cons, n_seq, msa_len; // # cons, # of total seq, length of row-column MSA (including gaps)
    int *clu_n_seq;             // # of reads in each read cluster/group, size: n_cons
    int **clu_read_ids;         // read ids for each cluster/group, size: n_cons * clu_n_seq[i]
    int *cons_len;              // length of each consensus sequence, size: n_cons
    int **cons_node_ids;        // node id of each consensus, size: n_cons * cons_len[i]
    uint8_t **cons_base;        // sequence base of each consensus, size: n_cons * cons_len[i]
    uint8_t **msa_base;         // sequence base of RC-MSA, size: (n_seq + n_cons) * msa_len
    int **cons_cov;             // coverage of each consensus base, size: n_cons * cons_len[i]
    int **cons_phred_score;     // phred score for each consensus base, size: n_cons * cons_len[i]
} abpoa_cons_t;
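
As a rough illustration of how the sizes documented in this struct fit together, the sketch below computes the mean coverage of one consensus from cons_len and cons_cov. It uses a local mirror type, `cons_view_t`, holding only the three fields it needs (copied from the excerpt above); that type and the `mean_cons_cov` helper are hypothetical and not part of abPOA. Real code would include abpoa.h and read the fields from abpoa_cons_t instead.

```c
#include <stddef.h>

/* Hypothetical local mirror of the fields used here; names and meanings
   follow the abpoa_cons_t excerpt (n_cons, cons_len, cons_cov). */
typedef struct {
    int n_cons;       /* number of consensus sequences */
    int *cons_len;    /* length of each consensus, size: n_cons */
    int **cons_cov;   /* per-base coverage, size: n_cons * cons_len[i] */
} cons_view_t;

/* Mean per-base coverage of consensus i, or -1.0 if it is unavailable
   (bad index, missing arrays, or an empty consensus). */
double mean_cons_cov(const cons_view_t *c, int i) {
    if (c == NULL || i < 0 || i >= c->n_cons) return -1.0;
    if (c->cons_len == NULL || c->cons_cov == NULL) return -1.0;
    if (c->cons_cov[i] == NULL || c->cons_len[i] <= 0) return -1.0;

    long sum = 0;
    for (int j = 0; j < c->cons_len[i]; ++j) {
        sum += c->cons_cov[i][j];
    }
    return (double)sum / c->cons_len[i];
}
```

With two consensus sequences, calling the helper for i = 0 and i = 1 gives one summary number per haplotype, which could be compared alongside clu_n_seq[i] when judging how well each consensus is supported.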

Yan

@ldenti
Author

ldenti commented Mar 18, 2022

Thanks Yan!! I'll check it out.

Best,
Luca
