alternative to BwaMemOptions #3

Merged
merged 5 commits into from
Jan 13, 2025
26 changes: 14 additions & 12 deletions docs/api.rst
@@ -27,6 +27,9 @@ or
from pybwa import BwaMem
mem = BwaMem(prefix="/path/to/genome.fasta")

The :class:`~pybwa.BwaIndex` object is useful when re-using the same index, such that it only needs to be loaded into
memory once. Both constructors for the :class:`~pybwa.BwaAln` and :class:`~pybwa.BwaMem` objects accept an index.
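
The benefit of sharing an index can be illustrated with a minimal, pure-Python sketch. The ``ToyIndex`` and ``ToyAligner`` classes below are hypothetical stand-ins, not part of pybwa; they only mimic the "one of ``prefix`` or ``index``" constructor pattern and count how often the (pretend) expensive load happens. Raising ``ValueError`` when neither argument is given is an assumption of this sketch.

```python
from pathlib import Path
from typing import Optional, Union


class ToyIndex:
    """Hypothetical stand-in for a loaded index; counts how often loading happens."""

    loads = 0  # class-level counter of (pretend) expensive loads

    def __init__(self, prefix: Union[str, Path]) -> None:
        ToyIndex.loads += 1  # stands in for reading the index into memory
        self.prefix = Path(prefix)


class ToyAligner:
    """Hypothetical aligner accepting either a path prefix or a pre-loaded index."""

    def __init__(
        self,
        prefix: Optional[Union[str, Path]] = None,
        index: Optional[ToyIndex] = None,
    ) -> None:
        if prefix is not None:
            # A prefix was given: load a fresh index for this aligner only.
            self.index = ToyIndex(prefix)
        elif index is not None:
            # An index was given: re-use it without loading anything.
            self.index = index
        else:
            raise ValueError("one of `prefix` or `index` must be specified")


# Load once, then share the same index between two aligners.
shared = ToyIndex("/path/to/genome.fasta")
aln = ToyAligner(index=shared)
mem = ToyAligner(index=shared)
print(ToyIndex.loads)  # 1
```

With pybwa itself, the corresponding pattern would presumably be to build one ``BwaIndex(prefix=...)`` and pass it as ``index=`` to both ``BwaAln`` and ``BwaMem``, following the constructor signatures shown in this change.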

The :meth:`pybwa.BwaAln.align` method accepts a list of reads (as either strings or :class:`pysam.FastxRecord` s) to
align and returns a *single* :class:`pysam.AlignedSegment` per input read:

@@ -66,20 +69,22 @@ It is constructed directly and options set on the object:
recs = aln.align(queries=["GATTACA"], opt=opt)


The :meth:`pybwa.BwaMem.align` method accepts custom options provided as a :class:`~pybwa.BwaMemOptions` object.
It is constructed via the :class:`~pybwa.BwaMemOptionsBuilder` class, to support scaling gap open and extend penalties
when using a custom match score, or the specification of presets (via `mode`).
Similarly, the :meth:`pybwa.BwaMem.align` method accepts custom options provided as a :class:`~pybwa.BwaMemOptions` object.
It is constructed directly and options set on the object:

.. code-block:: python

builder = BwaMemOptionsBuilder()
builder.min_seed_len = 32
opt: BwaMemOptions = builder.build()
opt = BwaMemOptions()
opt.min_seed_len = 32
recs = mem.align(queries=["GATTACA"], opt=opt)

The :class:`~pybwa.BwaIndex` object is useful when re-using the same index, such that it only needs to be loaded into memory
once.
Both constructors for the :class:`~pybwa.BwaAln` and :class:`~pybwa.BwaMem` objects accept an index.
Note: the :meth:`~pybwa.BwaMemOptions.finalize` method both applies the presets specified by the
:attr:`~pybwa.BwaMemOptions.mode` option and scales various other options (:code:`-TdBOELU`) based on the
:attr:`~pybwa.BwaMemOptions.match_score`. Presets and scaling are applied only to options that have not
been modified from their defaults. After :meth:`~pybwa.BwaMemOptions.finalize` is called, the options are
immutable, unless :code:`copy=True` is passed to :meth:`~pybwa.BwaMemOptions.finalize`, in which case the method
returns a copy of the options. Regardless, :meth:`~pybwa.BwaMemOptions.finalize` *does not*
need to be called before :meth:`pybwa.BwaMem.align` is invoked, as the latter will do so (making a local copy).
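
The finalize semantics described in the note can be sketched with a small pure-Python model. This is not pybwa's implementation: ``ToyMemOptions`` and the ``_option`` helper are hypothetical, the three options and their default values are illustrative, presets are left out entirely, and raising ``AttributeError`` on writes after finalization is an assumption of this sketch. It shows only the three ideas from the note: scaling touches just the options left at their defaults, a finalized object rejects further changes, and ``copy=True`` returns a finalized copy.

```python
def _option(name):
    """Build a property that records user modifications and blocks writes once finalized."""

    def getter(self):
        return self._values[name]

    def setter(self, value):
        if self._finalized:
            raise AttributeError("options are finalized and can no longer be modified")
        self._values[name] = value
        self._modified.add(name)  # remember that the user set this option explicitly

    return property(getter, setter)


class ToyMemOptions:
    """Hypothetical model of options that are scaled when finalized."""

    # Illustrative defaults for a handful of options.
    DEFAULTS = {"match_score": 1, "mismatch_penalty": 4, "minimum_score": 30}
    # Options multiplied by match_score unless the user modified them.
    SCALED = ("mismatch_penalty", "minimum_score")

    match_score = _option("match_score")
    mismatch_penalty = _option("mismatch_penalty")
    minimum_score = _option("minimum_score")

    def __init__(self):
        self._values = dict(self.DEFAULTS)
        self._modified = set()
        self._finalized = False

    @property
    def finalized(self):
        return self._finalized

    def _clone(self):
        other = ToyMemOptions()
        other._values = dict(self._values)
        other._modified = set(self._modified)
        other._finalized = self._finalized
        return other

    def finalize(self, copy=False):
        # Work on a copy when requested, otherwise finalize in place.
        target = self._clone() if copy else self
        if target._finalized:
            return target  # never scale twice
        for name in self.SCALED:
            if name not in target._modified:
                target._values[name] *= target._values["match_score"]
        target._finalized = True
        return target


opt = ToyMemOptions()
opt.match_score = 2
opt.minimum_score = 50  # explicitly set, so it is not scaled
final = opt.finalize(copy=True)
print(final.mismatch_penalty)  # 8 (default 4 scaled by match_score 2)
print(final.minimum_score)  # 50
print(opt.finalized, final.finalized)  # False True
```

Tracking modifications inside property setters is one way to tell "left at the default" apart from "explicitly set to the default value", which a plain attribute cannot distinguish.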

API versus Command-line Differences
===================================
@@ -109,9 +114,6 @@ Bwa Aln
Bwa Mem
=======

.. autoclass:: pybwa.BwaMemOptionsBuilder
:members:

.. autoclass:: pybwa.BwaMemOptions
:members:

13 changes: 11 additions & 2 deletions pybwa/libbwaaln.pyx
@@ -33,6 +33,7 @@ cdef class BwaAlnOptions:
free(self._delegate)

cdef gap_opt_t* gap_opt(self):
"""Returns the options struct to use with the bwa C library methods"""
return self._delegate

property max_mismatches:
@@ -138,6 +139,14 @@ cdef class BwaAln:
cdef BwaIndex _index

def __init__(self, prefix: str | Path | None = None, index: BwaIndex | None = None):
"""Constructs the :code:`bwa aln` aligner.

One of `prefix` or `index` must be specified.

Args:
prefix: the path prefix for the BWA index (typically a FASTA)
index: the index to use
"""
if prefix is not None:
assert Path(prefix).exists()
self._index = BwaIndex(prefix=prefix)
@@ -148,7 +157,6 @@ cdef class BwaAln:

bwase_initialize()

# TODO: a list of records...
def align(self, queries: List[FastxRecord] | List[str], opt: BwaAlnOptions | None = None) -> List[AlignedSegment]:
"""Align one or more queries with `bwa aln`.

@@ -157,7 +165,8 @@ cdef class BwaAln:
opt: the alignment options, or None to use the default options

Returns:
one alignment per query
one alignment (:class:`~pysam.AlignedSegment`) per query, as a
:code:`List[AlignedSegment]`.
"""
if len(queries) == 0:
return []
2 changes: 1 addition & 1 deletion pybwa/libbwaindex.pyx
@@ -41,7 +41,7 @@ cdef class BwaIndex:
with :code:`samtools dict <fasta>`).

Args:
prefix (str | Path): the path prefix for teh BWA index
prefix (str | Path): the path prefix for the BWA index (typically a FASTA)
bwt (bool): load the BWT (FM-index)
bns (bool): load the BNS (reference sequence metadata)
pac (bool): load the PAC (the actual 2-bit encoded reference sequences with 'N' converted to a
164 changes: 129 additions & 35 deletions pybwa/libbwamem.pyi
@@ -18,42 +18,136 @@ class BwaMemMode(enum.Enum):

class BwaMemOptions:
def __init__(self, finalize: bool = False) -> None: ...
_finalized: bool
_ignore_alt: bool
min_seed_len: int
mode: BwaMemMode
band_width: int
match_score: int
mismatch_penalty: int
minimum_score: int
unpaired_penalty: int
n_threads: int
skip_pairing: bool
output_all_for_fragments: bool
interleaved_paired_end: bool
short_split_as_secondary: bool
skip_mate_rescue: bool
soft_clip_supplementary: bool
with_xr_tag: bool
query_coord_as_primary: bool
keep_mapq_for_supplementary: bool
with_xb_tag: bool
max_occurrences: int
off_diagonal_x_dropoff: float
ignore_alternate_contigs: bool
internal_seed_split_factor: float
drop_chain_fraction: float
max_mate_rescue_rounds: int
min_seeded_bases_in_chain: int
seed_occurrence_in_3rd_round: int
xa_max_hits: int | tuple[int, int]
xa_drop_ratio: float
gap_open_penalty: int | tuple[int, int]
gap_extension_penalty: int | tuple[int, int]
clipping_penalty: int | tuple[int, int]

class BwaMemOptionsBuilder(BwaMemOptions):
def __init__(self, options: BwaMemOptions | None = None) -> None: ...
def build(self) -> BwaMemOptions: ...
_mode: BwaMemMode | None
@property
def finalized(self) -> bool: ...
@property
def min_seed_len(self) -> int: ...
@min_seed_len.setter
def min_seed_len(self, value: int) -> None: ...
@property
def mode(self) -> BwaMemMode: ...
@mode.setter
def mode(self, value: BwaMemMode) -> None: ...
@property
def band_width(self) -> int: ...
@band_width.setter
def band_width(self, value: int) -> None: ...
@property
def match_score(self) -> int: ...
@match_score.setter
def match_score(self, value: int) -> None: ...
@property
def mismatch_penalty(self) -> int: ...
@mismatch_penalty.setter
def mismatch_penalty(self, value: int) -> None: ...
@property
def minimum_score(self) -> int: ...
@minimum_score.setter
def minimum_score(self, value: int) -> None: ...
@property
def unpaired_penalty(self) -> int: ...
@unpaired_penalty.setter
def unpaired_penalty(self, value: int) -> None: ...
@property
def n_threads(self) -> int: ...
@n_threads.setter
def n_threads(self, value: int) -> None: ...
@property
def skip_pairing(self) -> bool: ...
@skip_pairing.setter
def skip_pairing(self, value: bool) -> None: ...
@property
def output_all_for_fragments(self) -> bool: ...
@output_all_for_fragments.setter
def output_all_for_fragments(self, value: bool) -> None: ...
@property
def interleaved_paired_end(self) -> bool: ...
@interleaved_paired_end.setter
def interleaved_paired_end(self, value: bool) -> None: ...
@property
def short_split_as_secondary(self) -> bool: ...
@short_split_as_secondary.setter
def short_split_as_secondary(self, value: bool) -> None: ...
@property
def skip_mate_rescue(self) -> bool: ...
@skip_mate_rescue.setter
def skip_mate_rescue(self, value: bool) -> None: ...
@property
def soft_clip_supplementary(self) -> bool: ...
@soft_clip_supplementary.setter
def soft_clip_supplementary(self, value: bool) -> None: ...
@property
def with_xr_tag(self) -> bool: ...
@with_xr_tag.setter
def with_xr_tag(self, value: bool) -> None: ...
@property
def query_coord_as_primary(self) -> bool: ...
@query_coord_as_primary.setter
def query_coord_as_primary(self, value: bool) -> None: ...
@property
def keep_mapq_for_supplementary(self) -> bool: ...
@keep_mapq_for_supplementary.setter
def keep_mapq_for_supplementary(self, value: bool) -> None: ...
@property
def with_xb_tag(self) -> bool: ...
@with_xb_tag.setter
def with_xb_tag(self, value: bool) -> None: ...
@property
def max_occurrences(self) -> int: ...
@max_occurrences.setter
def max_occurrences(self, value: int) -> None: ...
@property
def off_diagonal_x_dropoff(self) -> float: ...
@off_diagonal_x_dropoff.setter
def off_diagonal_x_dropoff(self, value: float) -> None: ...
@property
def ignore_alternate_contigs(self) -> bool: ...
@ignore_alternate_contigs.setter
def ignore_alternate_contigs(self, value: bool) -> None: ...
@property
def internal_seed_split_factor(self) -> float: ...
@internal_seed_split_factor.setter
def internal_seed_split_factor(self, value: float) -> None: ...
@property
def drop_chain_fraction(self) -> float: ...
@drop_chain_fraction.setter
def drop_chain_fraction(self, value: float) -> None: ...
@property
def max_mate_rescue_rounds(self) -> int: ...
@max_mate_rescue_rounds.setter
def max_mate_rescue_rounds(self, value: int) -> None: ...
@property
def min_seeded_bases_in_chain(self) -> int: ...
@min_seeded_bases_in_chain.setter
def min_seeded_bases_in_chain(self, value: int) -> None: ...
@property
def seed_occurrence_in_3rd_round(self) -> int: ...
@seed_occurrence_in_3rd_round.setter
def seed_occurrence_in_3rd_round(self, value: int) -> None: ...
@property
def xa_max_hits(self) -> int | tuple[int, int]: ...
@xa_max_hits.setter
def xa_max_hits(self, value: int | tuple[int, int]) -> None: ...
@property
def xa_drop_ratio(self) -> float: ...
@xa_drop_ratio.setter
def xa_drop_ratio(self, value: float) -> None: ...
@property
def gap_open_penalty(self) -> int | tuple[int, int]: ...
@gap_open_penalty.setter
def gap_open_penalty(self, value: int | tuple[int, int]) -> None: ...
@property
def gap_extension_penalty(self) -> int | tuple[int, int]: ...
@gap_extension_penalty.setter
def gap_extension_penalty(self, value: int | tuple[int, int]) -> None: ...
@property
def clipping_penalty(self) -> int | tuple[int, int]: ...
@clipping_penalty.setter
def clipping_penalty(self, value: int | tuple[int, int]) -> None: ...
def finalize(self, copy: bool = False) -> BwaMemOptions: ...

class BwaMem:
_index: BwaIndex