single_to_multi_fast5 can be used to reduce the number of files per sequencing run (e.g., from hundreds of thousands down to just thousands by selecting an appropriate --batch_size). However, if one later wants to change the number of sequences per fast5 (e.g., to further reduce the total number of files), one cannot run single_to_multi_fast5 again on the multi-fast5 files with a larger --batch_size.
It would be helpful to add a script (e.g., multi_to_multi_fast5) that could alter the number of sequences per fast5 file: either by combining sequences or splitting them, depending on the total number of fast5 files that the user wants.
Hi @nick-youngblut -- you can do this with fast5_subset, though it will require you to give it a list containing all the read_ids you currently have (which I believe you should be able to easily generate from your call to single_to_multi_fast5). We can certainly look into allowing the read_id list from fast5_subset to be optional, at which point it will do exactly what you're after.
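As a rough illustration of that workaround, the sketch below gathers every read_id from an existing set of multi-fast5 files and writes them to a text file that could then be handed to fast5_subset. It assumes ont_fast5_api is installed and that a plain one-ID-per-line file is accepted as the read_id list; the helper names (`collect_read_ids`, `write_read_id_list`, `expected_file_count`) are hypothetical and not part of the package.

```python
from pathlib import Path
from typing import Iterable, List


def collect_read_ids(fast5_dir: str) -> List[str]:
    """Gather the read IDs from every .fast5 file found under fast5_dir."""
    # Imported lazily so the other helpers work without ont_fast5_api installed.
    from ont_fast5_api.fast5_interface import get_fast5_file

    read_ids: List[str] = []
    for path in sorted(Path(fast5_dir).rglob("*.fast5")):
        with get_fast5_file(str(path), mode="r") as f5:
            read_ids.extend(f5.get_read_ids())
    return read_ids


def write_read_id_list(read_ids: Iterable[str], out_path: str) -> int:
    """Write one read ID per line and return how many were written.

    Assumption: fast5_subset accepts a headerless, single-column file.
    """
    count = 0
    with open(out_path, "w") as handle:
        for read_id in read_ids:
            handle.write(f"{read_id}\n")
            count += 1
    return count


def expected_file_count(n_reads: int, batch_size: int) -> int:
    """Number of output files when n_reads are packed batch_size per file."""
    return -(-n_reads // batch_size)  # ceiling division
```

The resulting file would then be passed to fast5_subset along with the desired --batch_size (the exact flag names are best confirmed with `fast5_subset --help`). `expected_file_count` only estimates how many files a given batch size would produce, which can help when picking a value.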
Thanks for pointing out that option. I was looking for a computationally efficient and straightforward way of changing the number of sequences per fast5 (more or fewer seqs per file) -- a split/aggregate script. I'm guessing that most users just keep the now-default 4k sequences per fast5 and never want to change it, so maybe 4k-per-file is optimal for most/all situations.