FEATURE REQUEST: Extract/Calls reports number of CpG's covered in the sample #332
Comments
Hello @ethan-mcq, something like that is possible. By "covered", do you mean that the CpG has at least 1 passing base modification call? This feature feels more like something that would be part of |
Hey @ArtRand! I think that at least 1 passing base modification call, canonical or methylated, would count as "covered". Generally, we use a CpG motif to speed up the extract/calls function, and we are often dealing with many Gb of throughput per sample. We don't often run pileup, as it doesn't give us as much information as the extract function does; we prefer the single-molecule, "wider" format. Something like the samtools coverage function or the like might be the most versatile option for this. The use case is that we are trying to get an idea of the X coverage of CpGs in samples, even down to the single-CpG level. An average % coverage would be a great start, and I think it could be reported easily in the modkit pipeline flow. While the final decision is of course up to the developers, I think it would be most helpful to add a metadata statistic to functions that use a motif file, reporting the total number of CpGs (or other included motifs) that are covered by at least 1 passing base modification call, and/or a percent coverage. The summary function could be argued for, but it involves additional repeat computation, as summary defaults to a subset of reads to produce the summary file. Hope this makes sense! |
Hello @ethan-mcq, I see. Perhaps a very lightweight version of |
FEATURE REQUEST: Extract/Calls reports number of CpG's covered in the sample
When using a motif file, it would be awesome to have the percent of CpGs covered in the sample reported after creating the extract and calls files, and/or the number of unique CpG locations present in the extract/calls file after analysis.
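Until something like this exists in modkit itself, the statistic can be approximated from the calls table after the fact. The following is a minimal sketch, not modkit functionality: it assumes a tab-separated calls file with a header row and columns named `chrom`, `ref_position`, and `fail` (these names are assumptions and are exposed as parameters so they can be adjusted), and it treats a site as "covered" when it has at least 1 call that is not marked as failed, as discussed in the comments. The function names and the total number of motif sites are hypothetical; the total would have to come from the motif file or reference separately.

```python
import csv
import io


def covered_sites(rows, chrom_col="chrom", pos_col="ref_position", fail_col="fail"):
    """Return the set of unique (chrom, position) pairs with >= 1 passing call.

    Column names are assumptions about the calls table, not a documented API.
    """
    covered = set()
    for row in rows:
        # Skip calls flagged as failing the pass threshold.
        if row.get(fail_col, "false").strip().lower() == "true":
            continue
        pos = row[pos_col].strip()
        # Skip calls with no reference position (assumed to be empty or -1).
        if pos in ("", "-1"):
            continue
        covered.add((row[chrom_col], int(pos)))
    return covered


def covered_sites_from_tsv(handle, **column_names):
    """Read a tab-separated calls table with a header row from an open handle."""
    return covered_sites(csv.DictReader(handle, delimiter="\t"), **column_names)


def percent_covered(n_covered, n_motif_sites):
    """Percent of motif sites covered; n_motif_sites must be supplied by the caller."""
    if n_motif_sites <= 0:
        raise ValueError("n_motif_sites must be positive")
    return 100.0 * n_covered / n_motif_sites


# Tiny made-up table standing in for a real calls file.
demo = (
    "chrom\tref_position\tfail\n"
    "chr1\t100\tfalse\n"
    "chr1\t100\tfalse\n"   # second read at the same site, counted once
    "chr1\t200\ttrue\n"    # failed call, not counted
    "chr2\t50\tfalse\n"
)
sites = covered_sites_from_tsv(io.StringIO(demo))
print(f"{len(sites)} sites covered, {percent_covered(len(sites), 4):.1f}%")
```

Because sites are keyed by reference position only, the cytosines on the two strands of a CpG are counted as separate sites, so the total passed to `percent_covered` should be counted the same way. For many Gb of throughput per sample, the set of positions can become large; streaming the file per chromosome would keep memory down.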