From 26d89dd6289dc63c2c6ab96592ae97f5191b4fb0 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Wed, 13 Sep 2023 17:07:37 +0000 Subject: [PATCH] build based on b97687f --- dev/alt_features/index.html | 2 +- dev/alt_performance/index.html | 4 ++-- dev/api/index.html | 6 +++--- dev/background/index.html | 2 +- dev/debugging/index.html | 2 +- dev/formulas/index.html | 2 +- dev/index.html | 2 +- dev/roadmap/index.html | 2 +- dev/search/index.html | 2 +- dev/search_index.js | 2 +- dev/tuto_builtin/index.html | 2 +- dev/tuto_custom/index.html | 2 +- 12 files changed, 15 insertions(+), 15 deletions(-) diff --git a/dev/alt_features/index.html b/dev/alt_features/index.html index 93d65f7c..832c5863 100644 --- a/dev/alt_features/index.html +++ b/dev/alt_features/index.html @@ -1,2 +1,2 @@ -Features · HiddenMarkovModels.jl

Alternatives - features

We compare features among the following Julia packages:

We discard MarkovModels.jl because its focus is GPU computation. There are also more generic packages for probabilistic programming, which are able to perform MCMC or variational inference (eg. Turing.jl) but we leave those aside.

HMMs.jlHMMBase.jlHMMGradients.jl
AlgorithmsSim, FB, Vit, BWSim, FB, Vit, BWFB
Observation typesanythingNumber / Vectoranything
Observation distributionsDensityInterface.jlDistributions.jlmanual
Number typesanythingFloat64AbstractFloat
Priors / structurespossiblenopossible
Automatic differentiationyesnoyes
Multiple sequencesyesnoyes
Linear algebrayesyesno
Logarithmic probabilitieshalfwayhalfwayyes

Sim = Simulation, FB = Forward-Backward, Vit = Viterbi, BW = Baum-Welch

+Features · HiddenMarkovModels.jl

Alternatives - features

We compare features among the following Julia packages:

We discard MarkovModels.jl because its focus is GPU computation. There are also more generic packages for probabilistic programming, which are able to perform MCMC or variational inference (eg. Turing.jl) but we leave those aside.

HMMs.jlHMMBase.jlHMMGradients.jl
AlgorithmsSim, FB, Vit, BWSim, FB, Vit, BWFB
Observation typesanythingNumber / Vectoranything
Observation distributionsDensityInterface.jlDistributions.jlmanual
Number typesanythingFloat64AbstractFloat
Priors / structurespossiblenopossible
Automatic differentiationyesnoyes
Multiple sequencesyesnoyes
Linear algebrayesyesno
Logarithmic probabilitieshalfwayhalfwayyes

Sim = Simulation, FB = Forward-Backward, Vit = Viterbi, BW = Baum-Welch

diff --git a/dev/alt_performance/index.html b/dev/alt_performance/index.html index f2bbe53b..ff17e0ba 100644 --- a/dev/alt_performance/index.html +++ b/dev/alt_performance/index.html @@ -1,3 +1,3 @@ -Performance · HiddenMarkovModels.jl

Alternatives - performance

We compare performance among the following packages:

Numerical results

The test case is an HMM with diagonal multivariate normal observations.

  • N: number of states
  • D: dimension of the observations
  • T: trajectory length
  • K: number of trajectories
  • I: number of Baum-Welch iterations
Why is this empty?

The benchmark suite is computationally expensive, and we only run it once for each new release. If the following section contains no plots and the links are broken, you're probably reading the development documentation or a local build of the website. Check out the stable documentation instead.

Single sequence

Full benchmark logs: results_single_sequence.csv.

Plot - Logdensity single sequence benchmark

Plot - Viterbi single sequence benchmark

Plot - Forward-backward single sequence benchmark

Plot - Baum-Welch single sequence benchmark

Here, pomegranate is not included because it is much slower on very small inputs.

Multiple sequences

Full benchmark logs: results_multiple_sequences.csv.

Plot - Logdensity single sequence benchmark

Plot - Baum-Welch single sequence benchmark

Here, HMMBase.jl is not included because it does not support multiple sequences.

Reproducibility

These benchmarks were generated in the following environment: setup.txt.

If you want to run them on your machine:

  1. Clone the HiddenMarkovModels.jl repository

  2. Open a Julia REPL at the root

  3. Run the following commands

    include("benchmark/run_benchmarks.jl")
    -include("benchmark/process_benchmarks.jl")

Remarks

Julia-to-Python overhead

Since the Python packages are called from Julia with PythonCall.jl, we pay a small overhead that is hard to quantify. On the plots, we compensate it by subtracting the runtime of the same algorithm for the smallest instance (N=1, D=1, T=2, K=1, I=1) from all Python-generated curves. This estimate for the overhead is put in parentheses in the legend. It is probably over-pessimistic, which is fair because it means that the comparison is now biased against Julia.

Parallelism

The packages we include have different approaches to parallelism, which can bias the evaluation in complex ways:

PackageStates NObservations DSequences K
HMMs.jlLinearAlgebra[2]depends[2]Threads[1]
HMMBase.jl-depends[2]-
hmmlearnNumPy[2]NumPy[2]NumPy[2]
pomegranatePyTorch[3]PyTorch[3]PyTorch[3]

For a fairer comparison, we set JULIA_NUM_THREADS=1, even though it robs HMMs.jl of its parallel speedup on multiple sequences. In addition, OpenBLAS threads have negative interactions with Julia threads. To overcome this obstacle, we run the Julia benchmarks (and only those) with OPENBLAS_NUM_THREADS=1.

  • 1possibly affected by JULIA_NUM_THREADS
  • 2possibly affected by OPENBLAS_NUM_THREADS
  • 3possibly affected by MKL_NUM_THREADS
+Performance · HiddenMarkovModels.jl

Alternatives - performance

We compare performance among the following packages:

The test case is an HMM with diagonal multivariate normal observations.

  • N: number of states
  • D: dimension of the observations
  • T: trajectory length
  • K: number of trajectories
  • I: number of Baum-Welch iterations
Why is this empty?

The benchmark suite is computationally expensive, and we only run it once for each new release. If the following section contains no plots and the links are broken, you're probably reading the development documentation or a local build of the website. Check out the stable documentation instead.

Reproducibility

These benchmarks were generated in the following environment: setup.txt.

If you want to run them on your machine:

  1. Clone the HiddenMarkovModels.jl repository

  2. Open a Julia REPL at the root

  3. Run the following commands

    include("benchmark/run_benchmarks.jl")
    +include("benchmark/process_benchmarks.jl")

Remarks

Julia-to-Python overhead

Since the Python packages are called from Julia with PythonCall.jl, we pay a small overhead that is hard to quantify. On the plots, we compensate it by subtracting the runtime of the same algorithm for the smallest instance (N=1, D=1, T=2, K=1, I=1) from all Python-generated curves.

Allocations

A major bottleneck of performance in Julia is memory allocations. The benchmarks for HMMs.jl thus employ a custom implementation of diagonal multivariate normals, which is entirely allocation-free.

This partly explains the performance gap with HMMBase.jl as the dimension D grows beyond 1. Such a trick is also possible with HMMBase.jl, but more demanding since it requires subtyping Distribution from Distributions.jl, instead of just implementing DensityInterface.jl.

Parallelism

The packages we include have different approaches to parallelism, which can bias the evaluation in complex ways:

PackageStates NObservations DSequences K
HMMs.jlLinearAlgebra[2]depends[2]Threads[1]
HMMBase.jl-depends[2]-
hmmlearnNumPy[2]NumPy[2]NumPy[2]
pomegranatePyTorch[3]PyTorch[3]PyTorch[3]

We report each number of threads in setup.txt. Since OpenBLAS threads have negative interactions with Julia threads, we run the Julia benchmarks with a single OpenBLAS thread.

Numerical results

Low dimension

Full benchmark logs: low_dim.csv.

Here, pomegranate is not included because it is much slower on very small inputs.

High dimension

Full benchmark logs: high_dim.csv.

Here, HMMBase.jl is not included because it does not support multiple sequences.

  • 1affected by number of Julia threads
  • 2affected by number of OpenBLAS threads
  • 3affected by number of Pytorch threads
diff --git a/dev/api/index.html b/dev/api/index.html index 5916ea81..edf6fc1b 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,8 +1,8 @@ -API reference · HiddenMarkovModels.jl

API reference

HiddenMarkovModelsModule
HiddenMarkovModels

A Julia package for HMM modeling, simulation, inference and learning.

source

Types

Markov chains

HiddenMarkovModels.AbstractMarkovChainType
AbstractMarkovChain

Abstract supertype for a Markov chain amenable to simulation, inference and learning.

Required interface

  • initial_distribution(mc)
  • transition_matrix(mc)
  • fit!(mc, init_count, trans_count) (optional)

Applicable methods

  • rand([rng,] mc, T)
  • logdensityof(mc, state_seq)
  • fit(mc, state_seq_or_seqs) (if fit! is implemented)
source
HiddenMarkovModels.MarkovChainType
MarkovChain <: AbstractMarkovChain

Basic implementation of a discrete-state Markov chain.

Fields

  • init::AbstractVector: initial state probabilities
  • trans::AbstractMatrix: state transition matrix
source

Hidden Markov Models

HiddenMarkovModels.AbstractHiddenMarkovModelType
AbstractHiddenMarkovModel <: AbstractMarkovChain

Abstract supertype for an HMM amenable to simulation, inference and learning.

Required interface

  • initial_distribution(hmm)
  • transition_matrix(hmm)
  • obs_distribution(hmm, i)
  • fit!(hmm, init_count, trans_count, obs_seq, state_marginals) (optional)

Applicable methods

  • rand([rng,] hmm, T)
  • logdensityof(hmm, obs_seq) / logdensityof(hmm, obs_seqs, nb_seqs)
  • forward(hmm, obs_seq) / forward(hmm, obs_seqs, nb_seqs)
  • viterbi(hmm, obs_seq) / viterbi(hmm, obs_seqs, nb_seqs)
  • forward_backward(hmm, obs_seq) / forward_backward(hmm, obs_seqs, nb_seqs)
  • baum_welch(hmm, obs_seq) / baum_welch(hmm, obs_seqs, nb_seqs) if fit! is implemented
source
HiddenMarkovModels.HiddenMarkovModelType
HiddenMarkovModel{D} <: AbstractHiddenMarkovModel

Basic implementation of an HMM.

Fields

  • init::AbstractVector: initial state probabilities
  • trans::AbstractMatrix: state transition matrix
  • dists::AbstractVector{D}: observation distributions
source
HiddenMarkovModels.PermutedHMMType
PermutedHMM{H<:AbstractHMM}

Wrapper around an AbstractHMM that permutes its states.

This is computationally inefficient and mostly useful for evaluation.

Fields

  • hmm:H: the old HMM
  • perm::Vector{Int}: a permutation such that state i in the new HMM corresponds to state perm[i] in the old.
source

Basics

Base.randFunction
rand([rng=default_rng(),] mc::AbstractMarkovChain, T)

Simulate mc for T time steps with a specified rng.

source
Base.lengthFunction
length(mc::AbstractMarkovChain)

Return the number of states of model.

source
HiddenMarkovModels.obs_distributionFunction
obs_distribution(hmm::AbstractHMM, i)

Return the observation distribution of hmm associated with state i.

The returned object dist must implement

  • rand(rng, dist)
  • DensityInterface.logdensityof(dist, x)
source

Inference

DensityInterface.logdensityofFunction
logdensityof(mc, state_seq)

Compute the loglikelihood of a single state sequence for a Markov chain.

source
logdensityof(hmm, obs_seq)

Apply the forward algorithm to compute the loglikelihood of a single observation sequence for an HMM.

Return a number.

source
logdensityof(hmm, obs_seqs, nb_seqs)

Apply the forward algorithm to compute the total loglikelihood of multiple observation sequences for an HMM.

Return a number.

Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.forwardFunction
forward(hmm, obs_seq)

Apply the forward algorithm to an HMM.

Return a tuple (α, logL) where

  • logL is the loglikelihood of the sequence
  • α[i] is the posterior probability of state i at the end of the sequence.
source
forward(hmm, obs_seqs, nb_seqs)

Apply the forward algorithm to an HMM, based on multiple observation sequences.

Return a vector of tuples (αₖ, logLₖ), where

  • logLₖ is the loglikelihood of sequence k
  • αₖ[i] is the posterior probability of state i at the end of sequence k
Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.viterbiFunction
viterbi(hmm, obs_seq)

Apply the Viterbi algorithm to compute the most likely state sequence of an HMM.

Return a vector of integers.

source
viterbi(hmm, obs_seqs, nb_seqs)

Apply the Viterbi algorithm to compute the most likely state sequences of an HMM, based on multiple observation sequences.

Return a vector of vectors of integers.

Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.forward_backwardFunction
forward_backward(hmm, obs_seq)

Apply the forward-backward algorithm to estimate the posterior state marginals of an HMM.

Return a ForwardBackwardStorage.

source
forward_backward(hmm, obs_seqs, nb_seqs)

Apply the forward-backward algorithm to estimate the posterior state marginals of an HMM, based on multiple observation sequences.

Return a vector of ForwardBackwardStorage objects.

Multithreading

This function is parallelized across sequences.

source

Learning

StatsAPI.fit!Function
fit!(mc::MC, init_count, trans_count)

Update mc in-place based on information generated from a state sequence.

source
fit!(hmm::HMM, init_count, trans_count, obs_seq, state_marginals)

Update hmm in-place based on information generated during forward-backward.

source
StatsAPI.fitFunction
fit(mc, state_seq_or_seqs)

Fit a Markov chain of the same type as mc to one or several state sequence(s).

Beware that mc must be an actual object of type MarkovChain, and not the type itself as is usually done eg. in Distributions.jl.

source
HiddenMarkovModels.baum_welchFunction
baum_welch(
+API reference · HiddenMarkovModels.jl

API reference

HiddenMarkovModelsModule
HiddenMarkovModels

A Julia package for HMM modeling, simulation, inference and learning.

source

Types

Markov chains

HiddenMarkovModels.AbstractMarkovChainType
AbstractMarkovChain

Abstract supertype for a Markov chain amenable to simulation, inference and learning.

Required interface

  • initial_distribution(mc)
  • transition_matrix(mc)
  • fit!(mc, init_count, trans_count) (optional)

Applicable methods

  • rand([rng,] mc, T)
  • logdensityof(mc, state_seq)
  • fit(mc, state_seq_or_seqs) (if fit! is implemented)
source
HiddenMarkovModels.MarkovChainType
MarkovChain <: AbstractMarkovChain

Basic implementation of a discrete-state Markov chain.

Fields

  • init::AbstractVector: initial state probabilities
  • trans::AbstractMatrix: state transition matrix
source

Hidden Markov Models

HiddenMarkovModels.AbstractHiddenMarkovModelType
AbstractHiddenMarkovModel <: AbstractMarkovChain

Abstract supertype for an HMM amenable to simulation, inference and learning.

Required interface

  • initial_distribution(hmm)
  • transition_matrix(hmm)
  • obs_distribution(hmm, i)
  • fit!(hmm, init_count, trans_count, obs_seq, state_marginals) (optional)

Applicable methods

  • rand([rng,] hmm, T)
  • logdensityof(hmm, obs_seq) / logdensityof(hmm, obs_seqs, nb_seqs)
  • forward(hmm, obs_seq) / forward(hmm, obs_seqs, nb_seqs)
  • viterbi(hmm, obs_seq) / viterbi(hmm, obs_seqs, nb_seqs)
  • forward_backward(hmm, obs_seq) / forward_backward(hmm, obs_seqs, nb_seqs)
  • baum_welch(hmm, obs_seq) / baum_welch(hmm, obs_seqs, nb_seqs) if fit! is implemented
source
HiddenMarkovModels.HiddenMarkovModelType
HiddenMarkovModel{D} <: AbstractHiddenMarkovModel

Basic implementation of an HMM.

Fields

  • init::AbstractVector: initial state probabilities
  • trans::AbstractMatrix: state transition matrix
  • dists::AbstractVector{D}: observation distributions
source
HiddenMarkovModels.PermutedHMMType
PermutedHMM{H<:AbstractHMM}

Wrapper around an AbstractHMM that permutes its states.

This is computationally inefficient and mostly useful for evaluation.

Fields

  • hmm:H: the old HMM
  • perm::Vector{Int}: a permutation such that state i in the new HMM corresponds to state perm[i] in the old.
source

Basics

Base.randFunction
rand([rng=default_rng(),] mc::AbstractMarkovChain, T)

Simulate mc for T time steps with a specified rng.

source
Base.lengthFunction
length(mc::AbstractMarkovChain)

Return the number of states of model.

source
HiddenMarkovModels.obs_distributionFunction
obs_distribution(hmm::AbstractHMM, i)

Return the observation distribution of hmm associated with state i.

The returned object dist must implement

  • rand(rng, dist)
  • DensityInterface.logdensityof(dist, x)
source

Inference

DensityInterface.logdensityofFunction
logdensityof(mc, state_seq)

Compute the loglikelihood of a single state sequence for a Markov chain.

source
logdensityof(hmm, obs_seq)

Apply the forward algorithm to compute the loglikelihood of a single observation sequence for an HMM.

Return a number.

source
logdensityof(hmm, obs_seqs, nb_seqs)

Apply the forward algorithm to compute the total loglikelihood of multiple observation sequences for an HMM.

Return a number.

Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.forwardFunction
forward(hmm, obs_seq)

Apply the forward algorithm to an HMM.

Return a tuple (α, logL) where

  • logL is the loglikelihood of the sequence
  • α[i] is the posterior probability of state i at the end of the sequence.
source
forward(hmm, obs_seqs, nb_seqs)

Apply the forward algorithm to an HMM, based on multiple observation sequences.

Return a vector of tuples (αₖ, logLₖ), where

  • logLₖ is the loglikelihood of sequence k
  • αₖ[i] is the posterior probability of state i at the end of sequence k
Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.viterbiFunction
viterbi(hmm, obs_seq)

Apply the Viterbi algorithm to compute the most likely state sequence of an HMM.

Return a vector of integers.

source
viterbi(hmm, obs_seqs, nb_seqs)

Apply the Viterbi algorithm to compute the most likely state sequences of an HMM, based on multiple observation sequences.

Return a vector of vectors of integers.

Multithreading

This function is parallelized across sequences.

source
HiddenMarkovModels.forward_backwardFunction
forward_backward(hmm, obs_seq)

Apply the forward-backward algorithm to estimate the posterior state marginals of an HMM.

Return a ForwardBackwardStorage.

source
forward_backward(hmm, obs_seqs, nb_seqs)

Apply the forward-backward algorithm to estimate the posterior state marginals of an HMM, based on multiple observation sequences.

Return a vector of ForwardBackwardStorage objects.

Multithreading

This function is parallelized across sequences.

source

Learning

StatsAPI.fit!Function
fit!(mc::MC, init_count, trans_count)

Update mc in-place based on information generated from a state sequence.

source
fit!(hmm::HMM, init_count, trans_count, obs_seq, state_marginals)

Update hmm in-place based on information generated during forward-backward.

source
StatsAPI.fitFunction
fit(mc, state_seq_or_seqs)

Fit a Markov chain of the same type as mc to one or several state sequence(s).

Beware that mc must be an actual object of type MarkovChain, and not the type itself as is usually done eg. in Distributions.jl.

source
HiddenMarkovModels.baum_welchFunction
baum_welch(
     hmm_init, obs_seq;
     atol, max_iterations, check_loglikelihood_increasing
-)

Apply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init.

Return a tuple (hmm_est, logL_evolution).

Keyword arguments

  • atol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)
  • max_iterations: Maximum number of iterations of the algorithm
  • check_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases
source
baum_welch(
+)

Apply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init.

Return a tuple (hmm_est, logL_evolution).

Keyword arguments

  • atol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)
  • max_iterations: Maximum number of iterations of the algorithm
  • check_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases
source
baum_welch(
     hmm_init, obs_seqs, nb_seqs;
     atol, max_iterations, check_loglikelihood_increasing
-)

Apply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init, based on nb_seqs observation sequences.

Return a tuple (hmm_est, logL_evolution).

Multithreading

This function is parallelized across sequences.

Keyword arguments

  • atol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)
  • max_iterations: Maximum number of iterations of the algorithm
  • check_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases
source

Internals

HiddenMarkovModels.ForwardBackwardStorageType
ForwardBackwardStorage{R}

Store forward-backward quantities with element type R.

Fields

Let X denote the vector of hidden states and Y denote the vector of observations. The following fields are part of the API:

  • γ::Matrix{R}: posterior one-state marginals γ[i,t] = ℙ(X[t]=i | Y[1:T])
  • ξ::Array{R,3}: posterior two-state marginals ξ[i,j,t] = ℙ(X[t:t+1]=(i,j) | Y[1:T])

The following fields are internals and subject to change:

  • α::Matrix{R}: scaled forward variables α[i,t] proportional to ℙ(Y[1:t], X[t]=i) (up to a function of t)
  • β::Matrix{R}: scaled backward variables β[i,t] proportional to ℙ(Y[t+1:T] | X[t]=i) (up to a function of t)
  • c::Vector{R}: forward variable inverse normalizations c[t] = 1 / sum(α[:,t])
  • logm::Vector{R}: maximum of the observation loglikelihoods logm[t] = maximum(logB[:, t])
  • Bscaled::Matrix{R}: numerically stabilized observation likelihoods Bscaled[i,t] = exp.(logB[i,t] - logm[t])
  • Bβscaled::Matrix{R}: numerically stabilized product Bβscaled[i,t] = Bscaled[i,t] * β[i,t]
source
HiddenMarkovModels.fit_element_from_sequence!Function
fit_element_from_sequence!(dists, i, x, w)

Modify the i-th element of dists by fitting it to an observation sequence x with associated weight sequence w.

The default behavior is a fallback on StatsAPI.fit!, which users are encouraged to implement if their observation distributions are mutable. If this is not possible, please override fit_element_from_sequence! directly.

source
HiddenMarkovModels.LightDiagNormalType
LightDiagNormal

An HMMs-compatible implementation of a multivariate normal distribution with diagonal covariance, enabling allocation-free estimation.

This is not part of the public API and is expected to change.

source

Notations

Integers

  • N: number of states
  • D: dimension of the observations
  • T: trajectory length
  • K: number of trajectories

Models and simulations

  • p or init: initial_distribution (vector of state probabilities)
  • A or trans: transition_matrix (matrix of transition probabilities)
  • dists: observation distribution (vector of rand-able and logdensityof-able objects)
  • state_seq: a sequence of states (vector of integers)
  • obs_seq: a sequence of observations (vector of individual observations)
  • obs_seqs: several sequences of observations

Forward backward

  • (log)b: vector of observation (log)likelihoods by state for an individual observation
  • (log)B: matrix of observation (log)likelihoods by state for a sequence of observations
  • α: scaled forward variables
  • β: scaled backward variables
  • γ: one-state marginals
  • ξ: two-state marginals
  • logL: loglikelihood of a sequence of observations

Index

+)

Apply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init, based on nb_seqs observation sequences.

Return a tuple (hmm_est, logL_evolution).

Multithreading

This function is parallelized across sequences.

Keyword arguments

  • atol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)
  • max_iterations: Maximum number of iterations of the algorithm
  • check_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases
source

Internals

HiddenMarkovModels.ForwardBackwardStorageType
ForwardBackwardStorage{R}

Store forward-backward quantities with element type R.

Fields

Let X denote the vector of hidden states and Y denote the vector of observations. The following fields are part of the API:

  • γ::Matrix{R}: posterior one-state marginals γ[i,t] = ℙ(X[t]=i | Y[1:T])
  • ξ::Array{R,3}: posterior two-state marginals ξ[i,j,t] = ℙ(X[t:t+1]=(i,j) | Y[1:T])

The following fields are internals and subject to change:

  • α::Matrix{R}: scaled forward variables α[i,t] proportional to ℙ(Y[1:t], X[t]=i) (up to a function of t)
  • β::Matrix{R}: scaled backward variables β[i,t] proportional to ℙ(Y[t+1:T] | X[t]=i) (up to a function of t)
  • c::Vector{R}: forward variable inverse normalizations c[t] = 1 / sum(α[:,t])
  • logm::Vector{R}: maximum of the observation loglikelihoods logm[t] = maximum(logB[:, t])
  • Bscaled::Matrix{R}: numerically stabilized observation likelihoods Bscaled[i,t] = exp.(logB[i,t] - logm[t])
  • Bβscaled::Matrix{R}: numerically stabilized product Bβscaled[i,t] = Bscaled[i,t] * β[i,t]
source
HiddenMarkovModels.fit_element_from_sequence!Function
fit_element_from_sequence!(dists, i, x, w)

Modify the i-th element of dists by fitting it to an observation sequence x with associated weight sequence w.

The default behavior is a fallback on StatsAPI.fit!, which users are encouraged to implement if their observation distributions are mutable. If this is not possible, please override fit_element_from_sequence! directly.

source
HiddenMarkovModels.LightDiagNormalType
LightDiagNormal

An HMMs-compatible implementation of a multivariate normal distribution with diagonal covariance, enabling allocation-free estimation.

This is not part of the public API and is expected to change.

source

Notations

Integers

  • N: number of states
  • D: dimension of the observations
  • T: trajectory length
  • K: number of trajectories

Models and simulations

  • p or init: initial_distribution (vector of state probabilities)
  • A or trans: transition_matrix (matrix of transition probabilities)
  • dists: observation distribution (vector of rand-able and logdensityof-able objects)
  • state_seq: a sequence of states (vector of integers)
  • obs_seq: a sequence of observations (vector of individual observations)
  • obs_seqs: several sequences of observations

Forward backward

  • (log)b: vector of observation (log)likelihoods by state for an individual observation
  • (log)B: matrix of observation (log)likelihoods by state for a sequence of observations
  • α: scaled forward variables
  • β: scaled backward variables
  • γ: one-state marginals
  • ξ: two-state marginals
  • logL: loglikelihood of a sequence of observations

Index

diff --git a/dev/background/index.html b/dev/background/index.html index 0e944578..8ace1811 100644 --- a/dev/background/index.html +++ b/dev/background/index.html @@ -4,4 +4,4 @@
  • Rabiner, L. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.
  • - + diff --git a/dev/debugging/index.html b/dev/debugging/index.html index 067894b2..7e94a084 100644 --- a/dev/debugging/index.html +++ b/dev/debugging/index.html @@ -1,2 +1,2 @@ -Debugging · HiddenMarkovModels.jl

    Debugging

    Numerical overflow

    The most frequent error you will encounter is an OverflowError during forward-backward, telling you that "some values are infinite / NaN". This can happen for a variety of reasons, so here are a few leads worth investigating:

    • Increase the duration of the sequence / the number of sequences (to get more data)
    • Reduce the number of states (to make every one of them useful)
    • Add a prior to your transition matrix / observation distributions (to avoid degenerate behavior like zero variance in a Gaussian)
    • Pick a better initialization (to start closer to the supposed ground truth)
    • Use LogarithmicNumbers.jl in strategic places (to guarantee numerical stability). Note that these numbers don't play nicely with Distributions.jl, so you may have to roll out your own observation distribution.
    +Debugging · HiddenMarkovModels.jl

    Debugging

    Numerical overflow

    The most frequent error you will encounter is an OverflowError during forward-backward, telling you that "some values are infinite / NaN". This can happen for a variety of reasons, so here are a few leads worth investigating:

    • Increase the duration of the sequence / the number of sequences (to get more data)
    • Reduce the number of states (to make every one of them useful)
    • Add a prior to your transition matrix / observation distributions (to avoid degenerate behavior like zero variance in a Gaussian)
    • Pick a better initialization (to start closer to the supposed ground truth)
    • Use LogarithmicNumbers.jl in strategic places (to guarantee numerical stability). Note that these numbers don't play nicely with Distributions.jl, so you may have to roll out your own observation distribution.
    diff --git a/dev/formulas/index.html b/dev/formulas/index.html index 128d0618..ba4b4309 100644 --- a/dev/formulas/index.html +++ b/dev/formulas/index.html @@ -56,4 +56,4 @@ \frac{\partial \log \mathcal{L}}{\partial a_{i,j}} &= \sum_{t=1}^{T-1} \bar{\alpha}_{i,t} \frac{b_{j,t+1}}{m_{t+1}} \bar{\beta}_{j,t+1} \\ \frac{\partial \log \mathcal{L}}{\partial \log b_{j,1}} &= \pi_j \frac{b_{j,1}}{m_1} \bar{\beta}_{j,1} = \frac{\bar{\alpha}_{j,1} \bar{\beta}_{j,1}}{c_1} = \gamma_{j,1} \\ \frac{\partial \log \mathcal{L}}{\partial \log b_{j,t}} &= \sum_{i=1}^N \bar{\alpha}_{i,t-1} a_{i,j} \frac{b_{j,t}}{m_t} \bar{\beta}_{j,t} = \frac{\bar{\alpha}_{j,t} \bar{\beta}_{j,t}}{c_t} = \gamma_{j,t} -\end{align*}\]

    +\end{align*}\]

    diff --git a/dev/index.html b/dev/index.html index f65a8227..71bfdf70 100644 --- a/dev/index.html +++ b/dev/index.html @@ -3,4 +3,4 @@ init = [0.2, 0.8] trans = [0.1 0.9; 0.7 0.3] dists = [Normal(-1), Normal(1)] -hmm = HMM(init, trans, dists)

    Take a look at the documentation to know what to do next!

    Main features

    This package is generic. Observations can be arbitrary Julia objects, not just scalars or arrays. Their distributions only need to implement rand(rng, dist) and logdensityof(dist, x) from DensityInterface.jl. Number types are not restricted to floating point, and automatic differentiation is supported in forward mode (ForwardDiff.jl) and reverse mode (ChainRules.jl).

    This package is fast. All the inference functions have allocation-free versions, which leverage efficient linear algebra subroutines. Multithreading is used to parallelize computations across sequences, and compatibility with various array types is ensured. We include extensive benchmarks against Julia and Python competitors thanks to BenchmarkTools.jl and PythonCall.jl.

    This package is reliable. It gives the same results as the previous reference package HMMBase.jl up to numerical accuracy. The test suite incorporates quality checks with Aqua.jl, as well as linting and type stability checks with JET.jl. A detailed documentation will help you find the functions you need.

    But this package is limited in scope. It is designed for HMMs with a small number of states, because memory and runtime scale quadratically (even if the transitions are sparse). It is also meant to perform best on a CPU, and not tested at all on GPUs.

    Contributing

    If you spot a bug or want to ask about a new feature, please open an issue on the GitHub repository. Once the issue receives positive feedback, feel free to try and fix it with a pull request that follows the BlueStyle guidelines.

    Acknowledgements

    A big thank you to Maxime Mouchet and Jacob Schreiber, the respective lead devs of HMMBase.jl and pomegranate, for their help and advice. Logo by Clément Mantoux based on a portrait of Andrey Markov.

    +hmm = HMM(init, trans, dists)

    Take a look at the documentation to know what to do next!

    Main features

    This package is generic. Observations can be arbitrary Julia objects, not just scalars or arrays. Their distributions only need to implement rand(rng, dist) and logdensityof(dist, x) from DensityInterface.jl. Number types are not restricted to floating point, and automatic differentiation is supported in forward mode (ForwardDiff.jl) and reverse mode (ChainRules.jl).

    This package is fast. All the inference functions have allocation-free versions, which leverage efficient linear algebra subroutines. Multithreading is used to parallelize computations across sequences, and compatibility with various array types is ensured. We include extensive benchmarks against Julia and Python competitors thanks to BenchmarkTools.jl and PythonCall.jl.

    This package is reliable. It gives the same results as the previous reference package HMMBase.jl up to numerical accuracy. The test suite incorporates quality checks with Aqua.jl, as well as linting and type stability checks with JET.jl. A detailed documentation will help you find the functions you need.

    But this package is limited in scope. It is designed for HMMs with a small number of states, because memory and runtime scale quadratically (even if the transitions are sparse). It is also meant to perform best on a CPU, and not tested at all on GPUs.

    Contributing

    If you spot a bug or want to ask about a new feature, please open an issue on the GitHub repository. Once the issue receives positive feedback, feel free to try and fix it with a pull request that follows the BlueStyle guidelines.

    Acknowledgements

    A big thank you to Maxime Mouchet and Jacob Schreiber, the respective lead devs of HMMBase.jl and pomegranate, for their help and advice. Logo by Clément Mantoux based on a portrait of Andrey Markov.

    diff --git a/dev/roadmap/index.html b/dev/roadmap/index.html index c338dfd9..85d22479 100644 --- a/dev/roadmap/index.html +++ b/dev/roadmap/index.html @@ -1,2 +1,2 @@ -Roadmap · HiddenMarkovModels.jl
    +Roadmap · HiddenMarkovModels.jl
    diff --git a/dev/search/index.html b/dev/search/index.html index 58a70e3f..5ddb1f23 100644 --- a/dev/search/index.html +++ b/dev/search/index.html @@ -1,2 +1,2 @@ -Search · HiddenMarkovModels.jl

    Loading search...

      +Search · HiddenMarkovModels.jl

      Loading search...

        diff --git a/dev/search_index.js b/dev/search_index.js index 2da5b065..a6fdf20c 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"debugging/#Debugging","page":"Debugging","title":"Debugging","text":"","category":"section"},{"location":"debugging/#Numerical-overflow","page":"Debugging","title":"Numerical overflow","text":"","category":"section"},{"location":"debugging/","page":"Debugging","title":"Debugging","text":"The most frequent error you will encounter is an OverflowError during forward-backward, telling you that \"some values are infinite / NaN\". This can happen for a variety of reasons, so here are a few leads worth investigating:","category":"page"},{"location":"debugging/","page":"Debugging","title":"Debugging","text":"Increase the duration of the sequence / the number of sequences (to get more data)\nReduce the number of states (to make every one of them useful)\nAdd a prior to your transition matrix / observation distributions (to avoid degenerate behavior like zero variance in a Gaussian)\nPick a better initialization (to start closer to the supposed ground truth)\nUse LogarithmicNumbers.jl in strategic places (to guarantee numerical stability). Note that these numbers don't play nicely with Distributions.jl, so you may have to roll out your own observation distribution.","category":"page"},{"location":"api/#API-reference","page":"API reference","title":"API reference","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"HiddenMarkovModels\nHMMs","category":"page"},{"location":"api/#HiddenMarkovModels","page":"API reference","title":"HiddenMarkovModels","text":"HiddenMarkovModels\n\nA Julia package for HMM modeling, simulation, inference and learning.\n\n\n\n\n\n","category":"module"},{"location":"api/#HiddenMarkovModels.HMMs","page":"API reference","title":"HiddenMarkovModels.HMMs","text":"HMMs\n\nAlias for the module HiddenMarkovModels.\n\n\n\n\n\n","category":"module"},{"location":"api/#Types","page":"API reference","title":"Types","text":"","category":"section"},{"location":"api/#Markov-chains","page":"API reference","title":"Markov chains","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"AbstractMarkovChain\nMarkovChain\nAbstractMC\nMC","category":"page"},{"location":"api/#HiddenMarkovModels.AbstractMarkovChain","page":"API reference","title":"HiddenMarkovModels.AbstractMarkovChain","text":"AbstractMarkovChain\n\nAbstract supertype for a Markov chain amenable to simulation, inference and learning.\n\nRequired interface\n\ninitial_distribution(mc)\ntransition_matrix(mc)\nfit!(mc, init_count, trans_count) (optional)\n\nApplicable methods\n\nrand([rng,] mc, T)\nlogdensityof(mc, state_seq)\nfit(mc, state_seq_or_seqs) (if fit! is implemented)\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.MarkovChain","page":"API reference","title":"HiddenMarkovModels.MarkovChain","text":"MarkovChain <: AbstractMarkovChain\n\nBasic implementation of a discrete-state Markov chain.\n\nFields\n\ninit::AbstractVector: initial state probabilities\ntrans::AbstractMatrix: state transition matrix\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.AbstractMC","page":"API reference","title":"HiddenMarkovModels.AbstractMC","text":"AbstractMC\n\nAlias for the type AbstractMarkovChain.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.MC","page":"API reference","title":"HiddenMarkovModels.MC","text":"MC\n\nAlias for the type MarkovChain.\n\n\n\n\n\n","category":"type"},{"location":"api/#Hidden-Markov-Models","page":"API reference","title":"Hidden Markov Models","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"AbstractHiddenMarkovModel\nHiddenMarkovModel\nAbstractHMM\nPermutedHMM\nHMM","category":"page"},{"location":"api/#HiddenMarkovModels.AbstractHiddenMarkovModel","page":"API reference","title":"HiddenMarkovModels.AbstractHiddenMarkovModel","text":"AbstractHiddenMarkovModel <: AbstractMarkovChain\n\nAbstract supertype for an HMM amenable to simulation, inference and learning.\n\nRequired interface\n\ninitial_distribution(hmm)\ntransition_matrix(hmm)\nobs_distribution(hmm, i)\nfit!(hmm, init_count, trans_count, obs_seq, state_marginals) (optional)\n\nApplicable methods\n\nrand([rng,] hmm, T)\nlogdensityof(hmm, obs_seq) / logdensityof(hmm, obs_seqs, nb_seqs)\nforward(hmm, obs_seq) / forward(hmm, obs_seqs, nb_seqs)\nviterbi(hmm, obs_seq) / viterbi(hmm, obs_seqs, nb_seqs)\nforward_backward(hmm, obs_seq) / forward_backward(hmm, obs_seqs, nb_seqs)\nbaum_welch(hmm, obs_seq) / baum_welch(hmm, obs_seqs, nb_seqs) if fit! is implemented\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.HiddenMarkovModel","page":"API reference","title":"HiddenMarkovModels.HiddenMarkovModel","text":"HiddenMarkovModel{D} <: AbstractHiddenMarkovModel\n\nBasic implementation of an HMM.\n\nFields\n\ninit::AbstractVector: initial state probabilities\ntrans::AbstractMatrix: state transition matrix\ndists::AbstractVector{D}: observation distributions\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.AbstractHMM","page":"API reference","title":"HiddenMarkovModels.AbstractHMM","text":"AbstractHMM\n\nAlias for the type AbstractHiddenMarkovModel.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.PermutedHMM","page":"API reference","title":"HiddenMarkovModels.PermutedHMM","text":"PermutedHMM{H<:AbstractHMM}\n\nWrapper around an AbstractHMM that permutes its states.\n\nThis is computationally inefficient and mostly useful for evaluation.\n\nFields\n\nhmm:H: the old HMM\nperm::Vector{Int}: a permutation such that state i in the new HMM corresponds to state perm[i] in the old.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.HMM","page":"API reference","title":"HiddenMarkovModels.HMM","text":"HMM\n\nAlias for the type HiddenMarkovModel.\n\n\n\n\n\n","category":"type"},{"location":"api/#Basics","page":"API reference","title":"Basics","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"rand\nlength\ninitial_distribution\ntransition_matrix\nobs_distribution","category":"page"},{"location":"api/#Base.rand","page":"API reference","title":"Base.rand","text":"rand([rng=default_rng(),] mc::AbstractMarkovChain, T)\n\nSimulate mc for T time steps with a specified rng.\n\n\n\n\n\n","category":"function"},{"location":"api/#Base.length","page":"API reference","title":"Base.length","text":"length(mc::AbstractMarkovChain)\n\nReturn the number of states of model.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.initial_distribution","page":"API reference","title":"HiddenMarkovModels.initial_distribution","text":"initial_distribution(mc::AbstractMarkovChain)\n\nReturn the initial state probabilities of mc.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.transition_matrix","page":"API reference","title":"HiddenMarkovModels.transition_matrix","text":"transition_matrix(mc::AbstractMarkovChain)\n\nReturn the state transition probabilities of mc.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.obs_distribution","page":"API reference","title":"HiddenMarkovModels.obs_distribution","text":"obs_distribution(hmm::AbstractHMM, i)\n\nReturn the observation distribution of hmm associated with state i.\n\nThe returned object dist must implement\n\nrand(rng, dist)\nDensityInterface.logdensityof(dist, x)\n\n\n\n\n\n","category":"function"},{"location":"api/#Inference","page":"API reference","title":"Inference","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"logdensityof\nforward\nviterbi\nforward_backward","category":"page"},{"location":"api/#DensityInterface.logdensityof","page":"API reference","title":"DensityInterface.logdensityof","text":"logdensityof(mc, state_seq)\n\nCompute the loglikelihood of a single state sequence for a Markov chain.\n\n\n\n\n\nlogdensityof(hmm, obs_seq)\n\nApply the forward algorithm to compute the loglikelihood of a single observation sequence for an HMM.\n\nReturn a number.\n\n\n\n\n\nlogdensityof(hmm, obs_seqs, nb_seqs)\n\nApply the forward algorithm to compute the total loglikelihood of multiple observation sequences for an HMM.\n\nReturn a number.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.forward","page":"API reference","title":"HiddenMarkovModels.forward","text":"forward(hmm, obs_seq)\n\nApply the forward algorithm to an HMM.\n\nReturn a tuple (α, logL) where\n\nlogL is the loglikelihood of the sequence\nα[i] is the posterior probability of state i at the end of the sequence.\n\n\n\n\n\nforward(hmm, obs_seqs, nb_seqs)\n\nApply the forward algorithm to an HMM, based on multiple observation sequences.\n\nReturn a vector of tuples (αₖ, logLₖ), where\n\nlogLₖ is the loglikelihood of sequence k\nαₖ[i] is the posterior probability of state i at the end of sequence k\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.viterbi","page":"API reference","title":"HiddenMarkovModels.viterbi","text":"viterbi(hmm, obs_seq)\n\nApply the Viterbi algorithm to compute the most likely state sequence of an HMM.\n\nReturn a vector of integers.\n\n\n\n\n\nviterbi(hmm, obs_seqs, nb_seqs)\n\nApply the Viterbi algorithm to compute the most likely state sequences of an HMM, based on multiple observation sequences.\n\nReturn a vector of vectors of integers.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.forward_backward","page":"API reference","title":"HiddenMarkovModels.forward_backward","text":"forward_backward(hmm, obs_seq)\n\nApply the forward-backward algorithm to estimate the posterior state marginals of an HMM.\n\nReturn a ForwardBackwardStorage.\n\n\n\n\n\nforward_backward(hmm, obs_seqs, nb_seqs)\n\nApply the forward-backward algorithm to estimate the posterior state marginals of an HMM, based on multiple observation sequences.\n\nReturn a vector of ForwardBackwardStorage objects.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#Learning","page":"API reference","title":"Learning","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"fit!\nfit\nbaum_welch","category":"page"},{"location":"api/#StatsAPI.fit!","page":"API reference","title":"StatsAPI.fit!","text":"fit!(mc::MC, init_count, trans_count)\n\nUpdate mc in-place based on information generated from a state sequence.\n\n\n\n\n\nfit!(hmm::HMM, init_count, trans_count, obs_seq, state_marginals)\n\nUpdate hmm in-place based on information generated during forward-backward.\n\n\n\n\n\n","category":"function"},{"location":"api/#StatsAPI.fit","page":"API reference","title":"StatsAPI.fit","text":"fit(mc, state_seq_or_seqs)\n\nFit a Markov chain of the same type as mc to one or several state sequence(s).\n\nBeware that mc must be an actual object of type MarkovChain, and not the type itself as is usually done eg. in Distributions.jl.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.baum_welch","page":"API reference","title":"HiddenMarkovModels.baum_welch","text":"baum_welch(\n hmm_init, obs_seq;\n atol, max_iterations, check_loglikelihood_increasing\n)\n\nApply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init.\n\nReturn a tuple (hmm_est, logL_evolution).\n\nKeyword arguments\n\natol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)\nmax_iterations: Maximum number of iterations of the algorithm\ncheck_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases\n\n\n\n\n\nbaum_welch(\n hmm_init, obs_seqs, nb_seqs;\n atol, max_iterations, check_loglikelihood_increasing\n)\n\nApply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init, based on nb_seqs observation sequences.\n\nReturn a tuple (hmm_est, logL_evolution).\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\nKeyword arguments\n\natol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)\nmax_iterations: Maximum number of iterations of the algorithm\ncheck_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases\n\n\n\n\n\n","category":"function"},{"location":"api/#Internals","page":"API reference","title":"Internals","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"HMMs.ForwardBackwardStorage\nHMMs.fit_element_from_sequence!\nHMMs.LightDiagNormal","category":"page"},{"location":"api/#HiddenMarkovModels.ForwardBackwardStorage","page":"API reference","title":"HiddenMarkovModels.ForwardBackwardStorage","text":"ForwardBackwardStorage{R}\n\nStore forward-backward quantities with element type R.\n\nFields\n\nLet X denote the vector of hidden states and Y denote the vector of observations. The following fields are part of the API:\n\nγ::Matrix{R}: posterior one-state marginals γ[i,t] = ℙ(X[t]=i | Y[1:T])\nξ::Array{R,3}: posterior two-state marginals ξ[i,j,t] = ℙ(X[t:t+1]=(i,j) | Y[1:T])\n\nThe following fields are internals and subject to change:\n\nα::Matrix{R}: scaled forward variables α[i,t] proportional to ℙ(Y[1:t], X[t]=i) (up to a function of t)\nβ::Matrix{R}: scaled backward variables β[i,t] proportional to ℙ(Y[t+1:T] | X[t]=i) (up to a function of t)\nc::Vector{R}: forward variable inverse normalizations c[t] = 1 / sum(α[:,t])\nlogm::Vector{R}: maximum of the observation loglikelihoods logm[t] = maximum(logB[:, t])\nBscaled::Matrix{R}: numerically stabilized observation likelihoods Bscaled[i,t] = exp.(logB[i,t] - logm[t])\nBβscaled::Matrix{R}: numerically stabilized product Bβscaled[i,t] = Bscaled[i,t] * β[i,t]\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.fit_element_from_sequence!","page":"API reference","title":"HiddenMarkovModels.fit_element_from_sequence!","text":"fit_element_from_sequence!(dists, i, x, w)\n\nModify the i-th element of dists by fitting it to an observation sequence x with associated weight sequence w.\n\nThe default behavior is a fallback on StatsAPI.fit!, which users are encouraged to implement if their observation distributions are mutable. If this is not possible, please override fit_element_from_sequence! directly.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.LightDiagNormal","page":"API reference","title":"HiddenMarkovModels.LightDiagNormal","text":"LightDiagNormal\n\nAn HMMs-compatible implementation of a multivariate normal distribution with diagonal covariance, enabling allocation-free estimation.\n\nThis is not part of the public API and is expected to change.\n\n\n\n\n\n","category":"type"},{"location":"api/#Notations","page":"API reference","title":"Notations","text":"","category":"section"},{"location":"api/#Integers","page":"API reference","title":"Integers","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"N: number of states\nD: dimension of the observations\nT: trajectory length\nK: number of trajectories","category":"page"},{"location":"api/#Models-and-simulations","page":"API reference","title":"Models and simulations","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"p or init: initial_distribution (vector of state probabilities)\nA or trans: transition_matrix (matrix of transition probabilities)\ndists: observation distribution (vector of rand-able and logdensityof-able objects)\nstate_seq: a sequence of states (vector of integers)\nobs_seq: a sequence of observations (vector of individual observations)\nobs_seqs: several sequences of observations","category":"page"},{"location":"api/#Forward-backward","page":"API reference","title":"Forward backward","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"(log)b: vector of observation (log)likelihoods by state for an individual observation\n(log)B: matrix of observation (log)likelihoods by state for a sequence of observations\nα: scaled forward variables\nβ: scaled backward variables\nγ: one-state marginals\nξ: two-state marginals\nlogL: loglikelihood of a sequence of observations","category":"page"},{"location":"api/#Index","page":"API reference","title":"Index","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"","category":"page"},{"location":"alt_performance/#Alternatives-performance","page":"Performance","title":"Alternatives - performance","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"We compare performance among the following packages:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"HiddenMarkovModels.jl (abbreviated to HMMs.jl)\nHMMBase.jl\nhmmlearn\npomegranate","category":"page"},{"location":"alt_performance/#Numerical-results","page":"Performance","title":"Numerical results","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"The test case is an HMM with diagonal multivariate normal observations.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"N: number of states\nD: dimension of the observations\nT: trajectory length\nK: number of trajectories\nI: number of Baum-Welch iterations","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"danger: Why is this empty?\nThe benchmark suite is computationally expensive, and we only run it once for each new release. If the following section contains no plots and the links are broken, you're probably reading the development documentation or a local build of the website. Check out the stable documentation instead.","category":"page"},{"location":"alt_performance/#Single-sequence","page":"Performance","title":"Single sequence","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Full benchmark logs: results_single_sequence.csv.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Logdensity single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Viterbi single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Forward-backward single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Baum-Welch single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Here, pomegranate is not included because it is much slower on very small inputs.","category":"page"},{"location":"alt_performance/#Multiple-sequences","page":"Performance","title":"Multiple sequences","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Full benchmark logs: results_multiple_sequences.csv.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Logdensity single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: Plot - Baum-Welch single sequence benchmark)","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Here, HMMBase.jl is not included because it does not support multiple sequences.","category":"page"},{"location":"alt_performance/#Reproducibility","page":"Performance","title":"Reproducibility","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"These benchmarks were generated in the following environment: setup.txt.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"If you want to run them on your machine:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Clone the HiddenMarkovModels.jl repository\nOpen a Julia REPL at the root\nRun the following commands\ninclude(\"benchmark/run_benchmarks.jl\")\ninclude(\"benchmark/process_benchmarks.jl\")","category":"page"},{"location":"alt_performance/#Remarks","page":"Performance","title":"Remarks","text":"","category":"section"},{"location":"alt_performance/#Julia-to-Python-overhead","page":"Performance","title":"Julia-to-Python overhead","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Since the Python packages are called from Julia with PythonCall.jl, we pay a small overhead that is hard to quantify. On the plots, we compensate it by subtracting the runtime of the same algorithm for the smallest instance (N=1, D=1, T=2, K=1, I=1) from all Python-generated curves. This estimate for the overhead is put in parentheses in the legend. It is probably over-pessimistic, which is fair because it means that the comparison is now biased against Julia.","category":"page"},{"location":"alt_performance/#Parallelism","page":"Performance","title":"Parallelism","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"The packages we include have different approaches to parallelism, which can bias the evaluation in complex ways:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Package States N Observations D Sequences K\nHMMs.jl LinearAlgebra[2] depends[2] Threads[1]\nHMMBase.jl - depends[2] -\nhmmlearn NumPy[2] NumPy[2] NumPy[2]\npomegranate PyTorch[3] PyTorch[3] PyTorch[3]","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[1]: possibly affected by JULIA_NUM_THREADS","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[2]: possibly affected by OPENBLAS_NUM_THREADS","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[3]: possibly affected by MKL_NUM_THREADS","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"For a fairer comparison, we set JULIA_NUM_THREADS=1, even though it robs HMMs.jl of its parallel speedup on multiple sequences. In addition, OpenBLAS threads have negative interactions with Julia threads. To overcome this obstacle, we run the Julia benchmarks (and only those) with OPENBLAS_NUM_THREADS=1.","category":"page"},{"location":"tuto_builtin/#Tutorial-built-in-HMM","page":"Built-in HMM","title":"Tutorial - built-in HMM","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"using Distributions\nusing HiddenMarkovModels\n\nusing Random; Random.seed!(63)","category":"page"},{"location":"tuto_builtin/#Construction","page":"Built-in HMM","title":"Construction","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Creating a model:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"function gaussian_hmm(N; noise=0)\n p = ones(N) / N # initial distribution\n A = rand_trans_mat(N) # transition matrix\n dists = [Normal(i + noise * randn(), 0.5) for i in 1:N] # observation distributions\n return HMM(p, A, dists)\nend","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Checking its contents:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"N = 3\nhmm = gaussian_hmm(N)\ntransition_matrix(hmm)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"[obs_distribution(hmm, i) for i in 1:N]","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Simulating a sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"T = 1000\nstate_seq, obs_seq = rand(hmm, T);\nfirst(state_seq, 10)'","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"first(obs_seq, 10)'","category":"page"},{"location":"tuto_builtin/#Inference","page":"Built-in HMM","title":"Inference","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Computing the loglikelihood of an observation sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"logdensityof(hmm, obs_seq)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Inferring the most likely state sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"most_likely_state_seq = viterbi(hmm, obs_seq);\nfirst(most_likely_state_seq, 10)'","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Learning the parameters based on an observation sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"hmm_init = gaussian_hmm(N, noise=1)\nhmm_est, logL_evolution = baum_welch(hmm_init, obs_seq);\nfirst(logL_evolution), last(logL_evolution)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Correcting state order because we know observation means are increasing in the true model:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"[obs_distribution(hmm_est, i) for i in 1:N]","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"perm = sortperm(1:3, by=i->obs_distribution(hmm_est, i).μ)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"hmm_est = PermutedHMM(hmm_est, perm)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Evaluating errors:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"cat(transition_matrix(hmm_est), transition_matrix(hmm), dims=3)","category":"page"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Here are some of the things that I would like to work on soon-ish:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"numerical stability in large-dimensional settings with sparse transitions\nSIMD optimization with LoopVectorization.jl or Tullio.jl\nspectral estimation methods\ninput-output HMMs in my other package ControlledMarkovModels.jl","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Contributors are welcome!","category":"page"},{"location":"alt_features/#Alternatives-features","page":"Features","title":"Alternatives - features","text":"","category":"section"},{"location":"alt_features/","page":"Features","title":"Features","text":"We compare features among the following Julia packages:","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"HiddenMarkovModels.jl (abbreviated to HMMs.jl)\nHMMBase.jl\nHMMGradients.jl","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"We discard MarkovModels.jl because its focus is GPU computation. There are also more generic packages for probabilistic programming, which are able to perform MCMC or variational inference (eg. Turing.jl) but we leave those aside.","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":" HMMs.jl HMMBase.jl HMMGradients.jl\nAlgorithms Sim, FB, Vit, BW Sim, FB, Vit, BW FB\nObservation types anything Number / Vector anything\nObservation distributions DensityInterface.jl Distributions.jl manual\nNumber types anything Float64 AbstractFloat\nPriors / structures possible no possible\nAutomatic differentiation yes no yes\nMultiple sequences yes no yes\nLinear algebra yes yes no\nLogarithmic probabilities halfway halfway yes","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"Sim = Simulation, FB = Forward-Backward, Vit = Viterbi, BW = Baum-Welch","category":"page"},{"location":"background/#Background","page":"Background","title":"Background","text":"","category":"section"},{"location":"background/#What-are-HMMs?","page":"Background","title":"What are HMMs?","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"Hidden Markov Models (HMMs for short) are a statistical modeling framework that is ubiquitous in signal processing, bioinformatics and plenty of other fields. They capture the distribution of an observation sequence (Y_t) by assuming the existence of a latent state sequence (X_t) such that:","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"the sequence (X_t) follows a (discrete time, discrete space) Markov chain\nfor each t, the distribution of Y_t is entirely determined by the value of X_t","category":"page"},{"location":"background/#What-can-we-do-with-them?","page":"Background","title":"What can we do with them?","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"Imagine we are given an observation sequence (Y_t) and a parametric family of HMMs mathbbP_theta theta in Theta. We can list several fundamental problems, each of which has a solution that relies on dynamic programming:","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"Problem Goal Algorithm\nEvaluation Likelihood of the observation sequence mathbbP_theta(Y_1T) Forward\nInference State marginals mathbbP_theta(X_t vert Y_1T) Forward-backward\nDecoding Most likely state sequence undersetX_1TmathrmargmaxmathbbP_theta(X_1T vert Y_1T) Viterbi\nLearning Best parameter undersetthetamathrmargmaxmathbbP_theta(Y_1T) Baum-Welch","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"Our whole package is based on the tutorial by (Rabiner, 1989), you can refer to it for more details.","category":"page"},{"location":"background/#Bibliography","page":"Background","title":"Bibliography","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"","category":"page"},{"location":"tuto_custom/#Tutorial-custom-HMM","page":"Custom HMM","title":"Tutorial - custom HMM","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The built-in HMM is perfect when the initial state distribution, transition matrix and emission distributions can be updated independently with Maximum Likelihood Estimation. But some of these parameters might be correlated, or fixed. Or they might come with a prior, which forces you to use Maximum A Posteriori instead.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"In such cases, it is necessary to implement a new subtype of AbstractHMM with all its required methods.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"using Distributions\nusing HiddenMarkovModels\nusing LinearAlgebra\nusing RequiredInterfaces\nusing StatsAPI\n\nusing Random; Random.seed!(63)","category":"page"},{"location":"tuto_custom/#Interface","page":"Custom HMM","title":"Interface","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"To ascertain that a type indeed satisfies the interface, you can use RequiredInterfaces.jl.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"RequiredInterfaces.check_interface_implemented(AbstractHMM, HMM)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"If your implementation is insufficient, the test will list missing methods.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"struct EmptyHMM end\nRequiredInterfaces.check_interface_implemented(AbstractHMM, EmptyHMM)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Note that this test does not check the StatsAPI.fit! method. Since it is only used in the Baum-Welch algorithm, it is an optional part of the AbstractHMM interface.","category":"page"},{"location":"tuto_custom/#Example","page":"Custom HMM","title":"Example","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We show how to implement an HMM whose initial distribution is always the equilibrium distribution of the underlying Markov chain. The code that follows is not efficient (it leads to a lot of allocations), but it would be fairly easy to optimize if needed.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The equilibrium distribution of a Markov chain is the (only) left eigenvector associated with the left eigenvalue 1.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function markov_equilibrium(A)\n p = real.(eigvecs(A')[:, end])\n return p ./ sum(p)\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We now define our custom HMM by taking inspiration from src/types/hmm.jl and making a few modifications:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"struct EquilibriumHMM{R,D} <: AbstractHMM\n trans::Matrix{R}\n dists::Vector{D}\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The interface is only different as far as the initialization is concerned.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Base.length(hmm::EquilibriumHMM) = length(hmm.dists)\nHMMs.initial_distribution(hmm::EquilibriumHMM) = markov_equilibrium(hmm.trans) # this is new\nHMMs.transition_matrix(hmm::EquilibriumHMM) = hmm.trans\nHMMs.obs_distribution(hmm::EquilibriumHMM, i::Integer) = hmm.dists[i]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"As for fitting, we simply ignore the initialization count and copy the rest of the original code (with a few simplifications):","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function StatsAPI.fit!(\n hmm::EquilibriumHMM{R,D}, init_count, trans_count, obs_seq, state_marginals\n) where {R,D}\n hmm.trans .= trans_count ./ sum(trans_count, dims=2)\n for i in 1:N\n hmm.dists[i] = fit(D, obs_seq, state_marginals[i, :])\n end\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Let's take our new model for a spin:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function gaussian_equilibrium_hmm(N; noise=0)\n A = rand_trans_mat(N)\n dists = [Normal(i + noise * randn(), 0.5) for i in 1:N]\n return EquilibriumHMM(A, dists)\nend;","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"N = 3\nhmm = gaussian_equilibrium_hmm(N);\ntransition_matrix(hmm)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"[obs_distribution(hmm, i) for i in 1:N]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We can estimate parameters based on several observation sequences. Note that as soon as we tamper with the re-estimation procedure, the loglikelihood is no longer guaranteed to increase during Baum-Welch, which is why we turn off the corresponding check.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"T = 1000\nnb_seqs = 10\nobs_seqs = [rand(hmm, T).obs_seq for _ in 1:nb_seqs]\n\nhmm_init = gaussian_equilibrium_hmm(N; noise=1)\nhmm_est, logL_evolution = baum_welch(\n hmm_init, obs_seqs, nb_seqs; check_loglikelihood_increasing=false\n);\nfirst(logL_evolution), last(logL_evolution)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Let's correct the state order:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"[obs_distribution(hmm_est, i) for i in 1:N]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"perm = sortperm(1:3, by=i->obs_distribution(hmm_est, i).μ)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"hmm_est = PermutedHMM(hmm_est, perm)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"And finally evaluate the errors:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"cat(transition_matrix(hmm_est), transition_matrix(hmm), dims=3)","category":"page"},{"location":"formulas/#Formulas","page":"Formulas","title":"Formulas","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Suppose we are given observations Y_1 Y_T, with hidden states X_1 X_T. Following (Rabiner, 1989), we use the following notations:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"let pi in mathbbR^N be the initial state distribution pi_i = mathbbP(X_1 = i)\nlet A in mathbbR^N times N be the transition matrix a_ij = mathbbP(X_t+1=j X_t = i)\nlet B in mathbbR^N times T be the matrix of statewise observation likelihoods b_it = mathbbP(Y_t X_t = i)","category":"page"},{"location":"formulas/#Vanilla-forward-backward","page":"Formulas","title":"Vanilla forward-backward","text":"","category":"section"},{"location":"formulas/#Recursion","page":"Formulas","title":"Recursion","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The forward and backward variables are defined by","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_it = mathbbP(Y_1t X_t=i) \nbeta_it = mathbbP(Y_t+1T X_t=i)\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"They are initialized with","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_i1 = pi_i b_i1 \nbeta_iT = 1\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"and satisfy the dynamic programming equations","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_jt+1 = left(sum_i=1^N alpha_it a_ijright) b_jt+1 \nbeta_it = sum_j=1^N a_ij b_jt+1 beta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Likelihood","page":"Formulas","title":"Likelihood","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The likelihood of the whole sequence of observations is given by","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"mathcalL = mathbbP(Y_1T) = sum_i=1^N alpha_iT","category":"page"},{"location":"formulas/#Marginals","page":"Formulas","title":"Marginals","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"We notice that","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_it beta_it = mathbbP(Y_1T X_t=i) \nalpha_it a_ij b_jt+1 beta_jt+1 = mathbbP(Y_1T X_t=i X_t+1=j)\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Thus we deduce the one-state and two-state marginals","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\ngamma_it = mathbbP(X_t=i Y_1T) = frac1mathcalL alpha_it beta_it \nxi_ijt = mathbbP(X_t=i X_t+1=j Y_1T) = frac1mathcalL alpha_it a_ij b_jt+1 beta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Derivatives","page":"Formulas","title":"Derivatives","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"According to (Qin et al., 2000), derivatives of the likelihood can be obtained as follows:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial pi_i = beta_i1 b_i1 \nfracpartial mathcalLpartial a_ij = sum_t=1^T-1 alpha_it b_jt+1 beta_jt+1 \nfracpartial mathcalLpartial b_j1 = pi_j beta_j1 \nfracpartial mathcalLpartial b_jt = left(sum_i=1^N alpha_it-1 a_ijright) beta_jt \nendalign*","category":"page"},{"location":"formulas/#Scaled-forward-backward","page":"Formulas","title":"Scaled forward-backward","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"In this package, we use a slightly different version of the algorithm, including both the traditional scaling of (Rabiner, 1989) and a normalization of B using m_t = max_i b_it.","category":"page"},{"location":"formulas/#Recursion-2","page":"Formulas","title":"Recursion","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The variables are initialized with","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nhatalpha_i1 = pi_i fracb_i1m_1 c_1 = frac1sum_i hatalpha_i1 baralpha_i1 = c_1 hatalpha_i1 \nhatbeta_iT = 1 barbeta_1T = c_T hatbeta_1T\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"and satisfy the dynamic programming equations","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nhatalpha_jt+1 = left(sum_i=1^N baralpha_it a_ijright) fracb_jt+1m_t+1 c_t+1 = frac1sum_j hatalpha_jt+1 baralpha_jt+1 = c_t+1 hatalpha_jt+1 \nhatbeta_it = sum_j=1^N a_ij fracb_jt+1m_t+1 barbeta_jt+1 barbeta_jt = c_t hatbeta_jt\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"In terms of the original variables, we find","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nbaralpha_it = alpha_it left(prod_s=1^t fracc_sm_sright) \nbarbeta_it = beta_it left(c_t prod_s=t+1^T fracc_sm_sright)\nendalign*","category":"page"},{"location":"formulas/#Likelihood-2","page":"Formulas","title":"Likelihood","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Since we normalized baralpha at each time step,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"1 = sum_i=1^N baralpha_iT = left(sum_i=1^N alpha_iTright) left(prod_s=1^T fracc_sm_sright) ","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"which means","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"mathcalL = sum_i=1^N alpha_iT = prod_s=1^T fracm_sc_s","category":"page"},{"location":"formulas/#Marginals-2","page":"Formulas","title":"Marginals","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"We can now express the marginals using scaled variables:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\ngamma_it = frac1mathcalL alpha_it beta_it = frac1mathcalL left(baralpha_it prod_s=1^t fracm_sc_sright) left(barbeta_it frac1c_t prod_s=t+1^T fracm_sc_sright) \n= frac1mathcalL fracbaralpha_it barbeta_itc_t left(prod_s=1^T fracm_sc_sright) = fracbaralpha_it barbeta_itc_t\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nxi_ijt = frac1mathcalL alpha_it a_ij b_jt+1 beta_jt+1 \n= frac1mathcalL left(baralpha_it prod_s=1^t fracm_sc_sright) a_ij b_jt+1 left(barbeta_jt+1 frac1c_t+1 prod_s=t+2^T fracm_sc_sright) \n= frac1mathcalL baralpha_it a_ij fracb_jt+1m_t+1 barbeta_jt+1 left(prod_s=1^T fracm_sc_sright) \n= baralpha_it a_ij fracb_jt+1m_t+1 barbeta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Derivatives-2","page":"Formulas","title":"Derivatives","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"And we also need to adapt the derivatives. For the initial distribution,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial pi_i = beta_i1 b_i1 = left(barbeta_i1 frac1c_1 prod_s=2^T fracm_sc_s right) b_i1 \n= left(prod_s=1^T fracm_sc_sright) barbeta_i1 fracb_i1m_1 = mathcalL barbeta_i1 fracb_i1m_1 \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"For the transition matrix,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial a_ij = sum_t=1^T-1 alpha_it b_jt+1 beta_jt+1 \n= sum_t=1^T-1 left(baralpha_it prod_s=1^t fracm_sc_s right) b_jt+1 left(barbeta_jt+1 frac1c_t+1 prod_s=t+2^T fracm_sc_s right) \n= sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 left(prod_s=1^T fracm_sc_s right) \n= mathcalL sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"And for the statewise observation likelihoods,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial b_j1 = pi_j beta_j1 = pi_j barbeta_j1 frac1c_1 prod_s=2^T fracm_sc_s = mathcalL pi_j barbeta_j1 frac1m_1\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial b_jt = left(sum_i=1^N alpha_it-1 a_ijright) beta_jt \n= sum_i=1^N left(baralpha_it-1 prod_s=1^t-1 fracm_sc_sright) a_ij left(barbeta_jt frac1c_t prod_s=t+1^T fracm_sc_s right) \n= sum_i=1^N baralpha_it-1 a_ij barbeta_jt frac1m_t left(prod_s=1^T fracm_sc_sright) \n= mathcalL sum_i=1^N baralpha_it-1 a_ij barbeta_jt frac1m_t \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Finally, we note that","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"fracpartial log mathcalLpartial log b_jt = fracpartial log mathcalLpartial b_jt b_jt","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"To sum up,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial log mathcalLpartial pi_i = fracb_i1m_1 barbeta_i1 \nfracpartial log mathcalLpartial a_ij = sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 \nfracpartial log mathcalLpartial log b_j1 = pi_j fracb_j1m_1 barbeta_j1 = fracbaralpha_j1 barbeta_j1c_1 = gamma_j1 \nfracpartial log mathcalLpartial log b_jt = sum_i=1^N baralpha_it-1 a_ij fracb_jtm_t barbeta_jt = fracbaralpha_jt barbeta_jtc_t = gamma_jt\nendalign*","category":"page"},{"location":"","page":"Home","title":"Home","text":"EditURL = \"https://github.com/gdalle/HiddenMarkovModels.jl/blob/main/README.md\"","category":"page"},{"location":"#HiddenMarkovModels.jl","page":"Home","title":"HiddenMarkovModels.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"(Image: Stable) (Image: Dev) (Image: Build Status) (Image: Coverage) (Image: Code Style: Blue)","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: DOI) (Image: Aqua QA) (Image: JET)","category":"page"},{"location":"","page":"Home","title":"Home","text":"A Julia package for HMM modeling, simulation, inference and learning.","category":"page"},{"location":"#Getting-started","page":"Home","title":"Getting started","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This package can be installed using Julia's package manager:","category":"page"},{"location":"","page":"Home","title":"Home","text":"pkg> add HiddenMarkovModels","category":"page"},{"location":"","page":"Home","title":"Home","text":"Then, you can create your first HMM as follows:","category":"page"},{"location":"","page":"Home","title":"Home","text":"using Distributions, HiddenMarkovModels\ninit = [0.2, 0.8]\ntrans = [0.1 0.9; 0.7 0.3]\ndists = [Normal(-1), Normal(1)]\nhmm = HMM(init, trans, dists)","category":"page"},{"location":"","page":"Home","title":"Home","text":"Take a look at the documentation to know what to do next!","category":"page"},{"location":"#Main-features","page":"Home","title":"Main features","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This package is generic. Observations can be arbitrary Julia objects, not just scalars or arrays. Their distributions only need to implement rand(rng, dist) and logdensityof(dist, x) from DensityInterface.jl. Number types are not restricted to floating point, and automatic differentiation is supported in forward mode (ForwardDiff.jl) and reverse mode (ChainRules.jl).","category":"page"},{"location":"","page":"Home","title":"Home","text":"This package is fast. All the inference functions have allocation-free versions, which leverage efficient linear algebra subroutines. Multithreading is used to parallelize computations across sequences, and compatibility with various array types is ensured. We include extensive benchmarks against Julia and Python competitors thanks to BenchmarkTools.jl and PythonCall.jl.","category":"page"},{"location":"","page":"Home","title":"Home","text":"This package is reliable. It gives the same results as the previous reference package HMMBase.jl up to numerical accuracy. The test suite incorporates quality checks with Aqua.jl, as well as linting and type stability checks with JET.jl. A detailed documentation will help you find the functions you need.","category":"page"},{"location":"","page":"Home","title":"Home","text":"But this package is limited in scope. It is designed for HMMs with a small number of states, because memory and runtime scale quadratically (even if the transitions are sparse). It is also meant to perform best on a CPU, and not tested at all on GPUs.","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you spot a bug or want to ask about a new feature, please open an issue on the GitHub repository. Once the issue receives positive feedback, feel free to try and fix it with a pull request that follows the BlueStyle guidelines.","category":"page"},{"location":"#Acknowledgements","page":"Home","title":"Acknowledgements","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"A big thank you to Maxime Mouchet and Jacob Schreiber, the respective lead devs of HMMBase.jl and pomegranate, for their help and advice. Logo by Clément Mantoux based on a portrait of Andrey Markov.","category":"page"}] +[{"location":"debugging/#Debugging","page":"Debugging","title":"Debugging","text":"","category":"section"},{"location":"debugging/#Numerical-overflow","page":"Debugging","title":"Numerical overflow","text":"","category":"section"},{"location":"debugging/","page":"Debugging","title":"Debugging","text":"The most frequent error you will encounter is an OverflowError during forward-backward, telling you that \"some values are infinite / NaN\". This can happen for a variety of reasons, so here are a few leads worth investigating:","category":"page"},{"location":"debugging/","page":"Debugging","title":"Debugging","text":"Increase the duration of the sequence / the number of sequences (to get more data)\nReduce the number of states (to make every one of them useful)\nAdd a prior to your transition matrix / observation distributions (to avoid degenerate behavior like zero variance in a Gaussian)\nPick a better initialization (to start closer to the supposed ground truth)\nUse LogarithmicNumbers.jl in strategic places (to guarantee numerical stability). Note that these numbers don't play nicely with Distributions.jl, so you may have to roll out your own observation distribution.","category":"page"},{"location":"api/#API-reference","page":"API reference","title":"API reference","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"HiddenMarkovModels\nHMMs","category":"page"},{"location":"api/#HiddenMarkovModels","page":"API reference","title":"HiddenMarkovModels","text":"HiddenMarkovModels\n\nA Julia package for HMM modeling, simulation, inference and learning.\n\n\n\n\n\n","category":"module"},{"location":"api/#HiddenMarkovModels.HMMs","page":"API reference","title":"HiddenMarkovModels.HMMs","text":"HMMs\n\nAlias for the module HiddenMarkovModels.\n\n\n\n\n\n","category":"module"},{"location":"api/#Types","page":"API reference","title":"Types","text":"","category":"section"},{"location":"api/#Markov-chains","page":"API reference","title":"Markov chains","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"AbstractMarkovChain\nMarkovChain\nAbstractMC\nMC","category":"page"},{"location":"api/#HiddenMarkovModels.AbstractMarkovChain","page":"API reference","title":"HiddenMarkovModels.AbstractMarkovChain","text":"AbstractMarkovChain\n\nAbstract supertype for a Markov chain amenable to simulation, inference and learning.\n\nRequired interface\n\ninitial_distribution(mc)\ntransition_matrix(mc)\nfit!(mc, init_count, trans_count) (optional)\n\nApplicable methods\n\nrand([rng,] mc, T)\nlogdensityof(mc, state_seq)\nfit(mc, state_seq_or_seqs) (if fit! is implemented)\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.MarkovChain","page":"API reference","title":"HiddenMarkovModels.MarkovChain","text":"MarkovChain <: AbstractMarkovChain\n\nBasic implementation of a discrete-state Markov chain.\n\nFields\n\ninit::AbstractVector: initial state probabilities\ntrans::AbstractMatrix: state transition matrix\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.AbstractMC","page":"API reference","title":"HiddenMarkovModels.AbstractMC","text":"AbstractMC\n\nAlias for the type AbstractMarkovChain.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.MC","page":"API reference","title":"HiddenMarkovModels.MC","text":"MC\n\nAlias for the type MarkovChain.\n\n\n\n\n\n","category":"type"},{"location":"api/#Hidden-Markov-Models","page":"API reference","title":"Hidden Markov Models","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"AbstractHiddenMarkovModel\nHiddenMarkovModel\nAbstractHMM\nPermutedHMM\nHMM","category":"page"},{"location":"api/#HiddenMarkovModels.AbstractHiddenMarkovModel","page":"API reference","title":"HiddenMarkovModels.AbstractHiddenMarkovModel","text":"AbstractHiddenMarkovModel <: AbstractMarkovChain\n\nAbstract supertype for an HMM amenable to simulation, inference and learning.\n\nRequired interface\n\ninitial_distribution(hmm)\ntransition_matrix(hmm)\nobs_distribution(hmm, i)\nfit!(hmm, init_count, trans_count, obs_seq, state_marginals) (optional)\n\nApplicable methods\n\nrand([rng,] hmm, T)\nlogdensityof(hmm, obs_seq) / logdensityof(hmm, obs_seqs, nb_seqs)\nforward(hmm, obs_seq) / forward(hmm, obs_seqs, nb_seqs)\nviterbi(hmm, obs_seq) / viterbi(hmm, obs_seqs, nb_seqs)\nforward_backward(hmm, obs_seq) / forward_backward(hmm, obs_seqs, nb_seqs)\nbaum_welch(hmm, obs_seq) / baum_welch(hmm, obs_seqs, nb_seqs) if fit! is implemented\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.HiddenMarkovModel","page":"API reference","title":"HiddenMarkovModels.HiddenMarkovModel","text":"HiddenMarkovModel{D} <: AbstractHiddenMarkovModel\n\nBasic implementation of an HMM.\n\nFields\n\ninit::AbstractVector: initial state probabilities\ntrans::AbstractMatrix: state transition matrix\ndists::AbstractVector{D}: observation distributions\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.AbstractHMM","page":"API reference","title":"HiddenMarkovModels.AbstractHMM","text":"AbstractHMM\n\nAlias for the type AbstractHiddenMarkovModel.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.PermutedHMM","page":"API reference","title":"HiddenMarkovModels.PermutedHMM","text":"PermutedHMM{H<:AbstractHMM}\n\nWrapper around an AbstractHMM that permutes its states.\n\nThis is computationally inefficient and mostly useful for evaluation.\n\nFields\n\nhmm:H: the old HMM\nperm::Vector{Int}: a permutation such that state i in the new HMM corresponds to state perm[i] in the old.\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.HMM","page":"API reference","title":"HiddenMarkovModels.HMM","text":"HMM\n\nAlias for the type HiddenMarkovModel.\n\n\n\n\n\n","category":"type"},{"location":"api/#Basics","page":"API reference","title":"Basics","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"rand\nlength\ninitial_distribution\ntransition_matrix\nobs_distribution","category":"page"},{"location":"api/#Base.rand","page":"API reference","title":"Base.rand","text":"rand([rng=default_rng(),] mc::AbstractMarkovChain, T)\n\nSimulate mc for T time steps with a specified rng.\n\n\n\n\n\n","category":"function"},{"location":"api/#Base.length","page":"API reference","title":"Base.length","text":"length(mc::AbstractMarkovChain)\n\nReturn the number of states of model.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.initial_distribution","page":"API reference","title":"HiddenMarkovModels.initial_distribution","text":"initial_distribution(mc::AbstractMarkovChain)\n\nReturn the initial state probabilities of mc.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.transition_matrix","page":"API reference","title":"HiddenMarkovModels.transition_matrix","text":"transition_matrix(mc::AbstractMarkovChain)\n\nReturn the state transition probabilities of mc.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.obs_distribution","page":"API reference","title":"HiddenMarkovModels.obs_distribution","text":"obs_distribution(hmm::AbstractHMM, i)\n\nReturn the observation distribution of hmm associated with state i.\n\nThe returned object dist must implement\n\nrand(rng, dist)\nDensityInterface.logdensityof(dist, x)\n\n\n\n\n\n","category":"function"},{"location":"api/#Inference","page":"API reference","title":"Inference","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"logdensityof\nforward\nviterbi\nforward_backward","category":"page"},{"location":"api/#DensityInterface.logdensityof","page":"API reference","title":"DensityInterface.logdensityof","text":"logdensityof(mc, state_seq)\n\nCompute the loglikelihood of a single state sequence for a Markov chain.\n\n\n\n\n\nlogdensityof(hmm, obs_seq)\n\nApply the forward algorithm to compute the loglikelihood of a single observation sequence for an HMM.\n\nReturn a number.\n\n\n\n\n\nlogdensityof(hmm, obs_seqs, nb_seqs)\n\nApply the forward algorithm to compute the total loglikelihood of multiple observation sequences for an HMM.\n\nReturn a number.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.forward","page":"API reference","title":"HiddenMarkovModels.forward","text":"forward(hmm, obs_seq)\n\nApply the forward algorithm to an HMM.\n\nReturn a tuple (α, logL) where\n\nlogL is the loglikelihood of the sequence\nα[i] is the posterior probability of state i at the end of the sequence.\n\n\n\n\n\nforward(hmm, obs_seqs, nb_seqs)\n\nApply the forward algorithm to an HMM, based on multiple observation sequences.\n\nReturn a vector of tuples (αₖ, logLₖ), where\n\nlogLₖ is the loglikelihood of sequence k\nαₖ[i] is the posterior probability of state i at the end of sequence k\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.viterbi","page":"API reference","title":"HiddenMarkovModels.viterbi","text":"viterbi(hmm, obs_seq)\n\nApply the Viterbi algorithm to compute the most likely state sequence of an HMM.\n\nReturn a vector of integers.\n\n\n\n\n\nviterbi(hmm, obs_seqs, nb_seqs)\n\nApply the Viterbi algorithm to compute the most likely state sequences of an HMM, based on multiple observation sequences.\n\nReturn a vector of vectors of integers.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.forward_backward","page":"API reference","title":"HiddenMarkovModels.forward_backward","text":"forward_backward(hmm, obs_seq)\n\nApply the forward-backward algorithm to estimate the posterior state marginals of an HMM.\n\nReturn a ForwardBackwardStorage.\n\n\n\n\n\nforward_backward(hmm, obs_seqs, nb_seqs)\n\nApply the forward-backward algorithm to estimate the posterior state marginals of an HMM, based on multiple observation sequences.\n\nReturn a vector of ForwardBackwardStorage objects.\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\n\n\n\n\n","category":"function"},{"location":"api/#Learning","page":"API reference","title":"Learning","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"fit!\nfit\nbaum_welch","category":"page"},{"location":"api/#StatsAPI.fit!","page":"API reference","title":"StatsAPI.fit!","text":"fit!(mc::MC, init_count, trans_count)\n\nUpdate mc in-place based on information generated from a state sequence.\n\n\n\n\n\nfit!(hmm::HMM, init_count, trans_count, obs_seq, state_marginals)\n\nUpdate hmm in-place based on information generated during forward-backward.\n\n\n\n\n\n","category":"function"},{"location":"api/#StatsAPI.fit","page":"API reference","title":"StatsAPI.fit","text":"fit(mc, state_seq_or_seqs)\n\nFit a Markov chain of the same type as mc to one or several state sequence(s).\n\nBeware that mc must be an actual object of type MarkovChain, and not the type itself as is usually done eg. in Distributions.jl.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.baum_welch","page":"API reference","title":"HiddenMarkovModels.baum_welch","text":"baum_welch(\n hmm_init, obs_seq;\n atol, max_iterations, check_loglikelihood_increasing\n)\n\nApply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init.\n\nReturn a tuple (hmm_est, logL_evolution).\n\nKeyword arguments\n\natol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)\nmax_iterations: Maximum number of iterations of the algorithm\ncheck_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases\n\n\n\n\n\nbaum_welch(\n hmm_init, obs_seqs, nb_seqs;\n atol, max_iterations, check_loglikelihood_increasing\n)\n\nApply the Baum-Welch algorithm to estimate the parameters of an HMM starting from hmm_init, based on nb_seqs observation sequences.\n\nReturn a tuple (hmm_est, logL_evolution).\n\nwarning: Multithreading\nThis function is parallelized across sequences.\n\nKeyword arguments\n\natol: Minimum loglikelihood increase at an iteration of the algorithm (otherwise the algorithm is deemed to have converged)\nmax_iterations: Maximum number of iterations of the algorithm\ncheck_loglikelihood_increasing: Whether to throw an error if the loglikelihood decreases\n\n\n\n\n\n","category":"function"},{"location":"api/#Internals","page":"API reference","title":"Internals","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"HMMs.ForwardBackwardStorage\nHMMs.fit_element_from_sequence!\nHMMs.LightDiagNormal","category":"page"},{"location":"api/#HiddenMarkovModels.ForwardBackwardStorage","page":"API reference","title":"HiddenMarkovModels.ForwardBackwardStorage","text":"ForwardBackwardStorage{R}\n\nStore forward-backward quantities with element type R.\n\nFields\n\nLet X denote the vector of hidden states and Y denote the vector of observations. The following fields are part of the API:\n\nγ::Matrix{R}: posterior one-state marginals γ[i,t] = ℙ(X[t]=i | Y[1:T])\nξ::Array{R,3}: posterior two-state marginals ξ[i,j,t] = ℙ(X[t:t+1]=(i,j) | Y[1:T])\n\nThe following fields are internals and subject to change:\n\nα::Matrix{R}: scaled forward variables α[i,t] proportional to ℙ(Y[1:t], X[t]=i) (up to a function of t)\nβ::Matrix{R}: scaled backward variables β[i,t] proportional to ℙ(Y[t+1:T] | X[t]=i) (up to a function of t)\nc::Vector{R}: forward variable inverse normalizations c[t] = 1 / sum(α[:,t])\nlogm::Vector{R}: maximum of the observation loglikelihoods logm[t] = maximum(logB[:, t])\nBscaled::Matrix{R}: numerically stabilized observation likelihoods Bscaled[i,t] = exp.(logB[i,t] - logm[t])\nBβscaled::Matrix{R}: numerically stabilized product Bβscaled[i,t] = Bscaled[i,t] * β[i,t]\n\n\n\n\n\n","category":"type"},{"location":"api/#HiddenMarkovModels.fit_element_from_sequence!","page":"API reference","title":"HiddenMarkovModels.fit_element_from_sequence!","text":"fit_element_from_sequence!(dists, i, x, w)\n\nModify the i-th element of dists by fitting it to an observation sequence x with associated weight sequence w.\n\nThe default behavior is a fallback on StatsAPI.fit!, which users are encouraged to implement if their observation distributions are mutable. If this is not possible, please override fit_element_from_sequence! directly.\n\n\n\n\n\n","category":"function"},{"location":"api/#HiddenMarkovModels.LightDiagNormal","page":"API reference","title":"HiddenMarkovModels.LightDiagNormal","text":"LightDiagNormal\n\nAn HMMs-compatible implementation of a multivariate normal distribution with diagonal covariance, enabling allocation-free estimation.\n\nThis is not part of the public API and is expected to change.\n\n\n\n\n\n","category":"type"},{"location":"api/#Notations","page":"API reference","title":"Notations","text":"","category":"section"},{"location":"api/#Integers","page":"API reference","title":"Integers","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"N: number of states\nD: dimension of the observations\nT: trajectory length\nK: number of trajectories","category":"page"},{"location":"api/#Models-and-simulations","page":"API reference","title":"Models and simulations","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"p or init: initial_distribution (vector of state probabilities)\nA or trans: transition_matrix (matrix of transition probabilities)\ndists: observation distribution (vector of rand-able and logdensityof-able objects)\nstate_seq: a sequence of states (vector of integers)\nobs_seq: a sequence of observations (vector of individual observations)\nobs_seqs: several sequences of observations","category":"page"},{"location":"api/#Forward-backward","page":"API reference","title":"Forward backward","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"(log)b: vector of observation (log)likelihoods by state for an individual observation\n(log)B: matrix of observation (log)likelihoods by state for a sequence of observations\nα: scaled forward variables\nβ: scaled backward variables\nγ: one-state marginals\nξ: two-state marginals\nlogL: loglikelihood of a sequence of observations","category":"page"},{"location":"api/#Index","page":"API reference","title":"Index","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"","category":"page"},{"location":"alt_performance/#Alternatives-performance","page":"Performance","title":"Alternatives - performance","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"We compare performance among the following packages:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"HiddenMarkovModels.jl (abbreviated to HMMs.jl)\nHMMBase.jl\nhmmlearn\npomegranate","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"The test case is an HMM with diagonal multivariate normal observations.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"N: number of states\nD: dimension of the observations\nT: trajectory length\nK: number of trajectories\nI: number of Baum-Welch iterations","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"danger: Why is this empty?\nThe benchmark suite is computationally expensive, and we only run it once for each new release. If the following section contains no plots and the links are broken, you're probably reading the development documentation or a local build of the website. Check out the stable documentation instead.","category":"page"},{"location":"alt_performance/#Reproducibility","page":"Performance","title":"Reproducibility","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"These benchmarks were generated in the following environment: setup.txt.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"If you want to run them on your machine:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Clone the HiddenMarkovModels.jl repository\nOpen a Julia REPL at the root\nRun the following commands\ninclude(\"benchmark/run_benchmarks.jl\")\ninclude(\"benchmark/process_benchmarks.jl\")","category":"page"},{"location":"alt_performance/#Remarks","page":"Performance","title":"Remarks","text":"","category":"section"},{"location":"alt_performance/#Julia-to-Python-overhead","page":"Performance","title":"Julia-to-Python overhead","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Since the Python packages are called from Julia with PythonCall.jl, we pay a small overhead that is hard to quantify. On the plots, we compensate it by subtracting the runtime of the same algorithm for the smallest instance (N=1, D=1, T=2, K=1, I=1) from all Python-generated curves.","category":"page"},{"location":"alt_performance/#Allocations","page":"Performance","title":"Allocations","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"A major bottleneck of performance in Julia is memory allocations. The benchmarks for HMMs.jl thus employ a custom implementation of diagonal multivariate normals, which is entirely allocation-free.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"This partly explains the performance gap with HMMBase.jl as the dimension D grows beyond 1. Such a trick is also possible with HMMBase.jl, but more demanding since it requires subtyping Distribution from Distributions.jl, instead of just implementing DensityInterface.jl.","category":"page"},{"location":"alt_performance/#Parallelism","page":"Performance","title":"Parallelism","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"The packages we include have different approaches to parallelism, which can bias the evaluation in complex ways:","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Package States N Observations D Sequences K\nHMMs.jl LinearAlgebra[2] depends[2] Threads[1]\nHMMBase.jl - depends[2] -\nhmmlearn NumPy[2] NumPy[2] NumPy[2]\npomegranate PyTorch[3] PyTorch[3] PyTorch[3]","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[1]: affected by number of Julia threads","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[2]: affected by number of OpenBLAS threads","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"[3]: affected by number of Pytorch threads","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"We report each number of threads in setup.txt. Since OpenBLAS threads have negative interactions with Julia threads, we run the Julia benchmarks with a single OpenBLAS thread.","category":"page"},{"location":"alt_performance/#Numerical-results","page":"Performance","title":"Numerical results","text":"","category":"section"},{"location":"alt_performance/#Low-dimension","page":"Performance","title":"Low dimension","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Full benchmark logs: low_dim.csv.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: ) (Image: ) (Image: ) (Image: )","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Here, pomegranate is not included because it is much slower on very small inputs.","category":"page"},{"location":"alt_performance/#High-dimension","page":"Performance","title":"High dimension","text":"","category":"section"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Full benchmark logs: high_dim.csv.","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"(Image: ) (Image: ) (Image: ) (Image: )","category":"page"},{"location":"alt_performance/","page":"Performance","title":"Performance","text":"Here, HMMBase.jl is not included because it does not support multiple sequences.","category":"page"},{"location":"tuto_builtin/#Tutorial-built-in-HMM","page":"Built-in HMM","title":"Tutorial - built-in HMM","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"using Distributions\nusing HiddenMarkovModels\n\nusing Random; Random.seed!(63)","category":"page"},{"location":"tuto_builtin/#Construction","page":"Built-in HMM","title":"Construction","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Creating a model:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"function gaussian_hmm(N; noise=0)\n p = ones(N) / N # initial distribution\n A = rand_trans_mat(N) # transition matrix\n dists = [Normal(i + noise * randn(), 0.5) for i in 1:N] # observation distributions\n return HMM(p, A, dists)\nend","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Checking its contents:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"N = 3\nhmm = gaussian_hmm(N)\ntransition_matrix(hmm)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"[obs_distribution(hmm, i) for i in 1:N]","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Simulating a sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"T = 1000\nstate_seq, obs_seq = rand(hmm, T);\nfirst(state_seq, 10)'","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"first(obs_seq, 10)'","category":"page"},{"location":"tuto_builtin/#Inference","page":"Built-in HMM","title":"Inference","text":"","category":"section"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Computing the loglikelihood of an observation sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"logdensityof(hmm, obs_seq)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Inferring the most likely state sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"most_likely_state_seq = viterbi(hmm, obs_seq);\nfirst(most_likely_state_seq, 10)'","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Learning the parameters based on an observation sequence:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"hmm_init = gaussian_hmm(N, noise=1)\nhmm_est, logL_evolution = baum_welch(hmm_init, obs_seq);\nfirst(logL_evolution), last(logL_evolution)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Correcting state order because we know observation means are increasing in the true model:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"[obs_distribution(hmm_est, i) for i in 1:N]","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"perm = sortperm(1:3, by=i->obs_distribution(hmm_est, i).μ)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"hmm_est = PermutedHMM(hmm_est, perm)","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"Evaluating errors:","category":"page"},{"location":"tuto_builtin/","page":"Built-in HMM","title":"Built-in HMM","text":"cat(transition_matrix(hmm_est), transition_matrix(hmm), dims=3)","category":"page"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Here are some of the things that I would like to work on soon-ish:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"numerical stability in large-dimensional settings with sparse transitions\nSIMD optimization with LoopVectorization.jl or Tullio.jl\nspectral estimation methods\ninput-output HMMs in my other package ControlledMarkovModels.jl","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Contributors are welcome!","category":"page"},{"location":"alt_features/#Alternatives-features","page":"Features","title":"Alternatives - features","text":"","category":"section"},{"location":"alt_features/","page":"Features","title":"Features","text":"We compare features among the following Julia packages:","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"HiddenMarkovModels.jl (abbreviated to HMMs.jl)\nHMMBase.jl\nHMMGradients.jl","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"We discard MarkovModels.jl because its focus is GPU computation. There are also more generic packages for probabilistic programming, which are able to perform MCMC or variational inference (eg. Turing.jl) but we leave those aside.","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":" HMMs.jl HMMBase.jl HMMGradients.jl\nAlgorithms Sim, FB, Vit, BW Sim, FB, Vit, BW FB\nObservation types anything Number / Vector anything\nObservation distributions DensityInterface.jl Distributions.jl manual\nNumber types anything Float64 AbstractFloat\nPriors / structures possible no possible\nAutomatic differentiation yes no yes\nMultiple sequences yes no yes\nLinear algebra yes yes no\nLogarithmic probabilities halfway halfway yes","category":"page"},{"location":"alt_features/","page":"Features","title":"Features","text":"Sim = Simulation, FB = Forward-Backward, Vit = Viterbi, BW = Baum-Welch","category":"page"},{"location":"background/#Background","page":"Background","title":"Background","text":"","category":"section"},{"location":"background/#What-are-HMMs?","page":"Background","title":"What are HMMs?","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"Hidden Markov Models (HMMs for short) are a statistical modeling framework that is ubiquitous in signal processing, bioinformatics and plenty of other fields. They capture the distribution of an observation sequence (Y_t) by assuming the existence of a latent state sequence (X_t) such that:","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"the sequence (X_t) follows a (discrete time, discrete space) Markov chain\nfor each t, the distribution of Y_t is entirely determined by the value of X_t","category":"page"},{"location":"background/#What-can-we-do-with-them?","page":"Background","title":"What can we do with them?","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"Imagine we are given an observation sequence (Y_t) and a parametric family of HMMs mathbbP_theta theta in Theta. We can list several fundamental problems, each of which has a solution that relies on dynamic programming:","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"Problem Goal Algorithm\nEvaluation Likelihood of the observation sequence mathbbP_theta(Y_1T) Forward\nInference State marginals mathbbP_theta(X_t vert Y_1T) Forward-backward\nDecoding Most likely state sequence undersetX_1TmathrmargmaxmathbbP_theta(X_1T vert Y_1T) Viterbi\nLearning Best parameter undersetthetamathrmargmaxmathbbP_theta(Y_1T) Baum-Welch","category":"page"},{"location":"background/","page":"Background","title":"Background","text":"Our whole package is based on the tutorial by (Rabiner, 1989), you can refer to it for more details.","category":"page"},{"location":"background/#Bibliography","page":"Background","title":"Bibliography","text":"","category":"section"},{"location":"background/","page":"Background","title":"Background","text":"","category":"page"},{"location":"tuto_custom/#Tutorial-custom-HMM","page":"Custom HMM","title":"Tutorial - custom HMM","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The built-in HMM is perfect when the initial state distribution, transition matrix and emission distributions can be updated independently with Maximum Likelihood Estimation. But some of these parameters might be correlated, or fixed. Or they might come with a prior, which forces you to use Maximum A Posteriori instead.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"In such cases, it is necessary to implement a new subtype of AbstractHMM with all its required methods.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"using Distributions\nusing HiddenMarkovModels\nusing LinearAlgebra\nusing RequiredInterfaces\nusing StatsAPI\n\nusing Random; Random.seed!(63)","category":"page"},{"location":"tuto_custom/#Interface","page":"Custom HMM","title":"Interface","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"To ascertain that a type indeed satisfies the interface, you can use RequiredInterfaces.jl.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"RequiredInterfaces.check_interface_implemented(AbstractHMM, HMM)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"If your implementation is insufficient, the test will list missing methods.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"struct EmptyHMM end\nRequiredInterfaces.check_interface_implemented(AbstractHMM, EmptyHMM)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Note that this test does not check the StatsAPI.fit! method. Since it is only used in the Baum-Welch algorithm, it is an optional part of the AbstractHMM interface.","category":"page"},{"location":"tuto_custom/#Example","page":"Custom HMM","title":"Example","text":"","category":"section"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We show how to implement an HMM whose initial distribution is always the equilibrium distribution of the underlying Markov chain. The code that follows is not efficient (it leads to a lot of allocations), but it would be fairly easy to optimize if needed.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The equilibrium distribution of a Markov chain is the (only) left eigenvector associated with the left eigenvalue 1.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function markov_equilibrium(A)\n p = real.(eigvecs(A')[:, end])\n return p ./ sum(p)\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We now define our custom HMM by taking inspiration from src/types/hmm.jl and making a few modifications:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"struct EquilibriumHMM{R,D} <: AbstractHMM\n trans::Matrix{R}\n dists::Vector{D}\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"The interface is only different as far as the initialization is concerned.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Base.length(hmm::EquilibriumHMM) = length(hmm.dists)\nHMMs.initial_distribution(hmm::EquilibriumHMM) = markov_equilibrium(hmm.trans) # this is new\nHMMs.transition_matrix(hmm::EquilibriumHMM) = hmm.trans\nHMMs.obs_distribution(hmm::EquilibriumHMM, i::Integer) = hmm.dists[i]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"As for fitting, we simply ignore the initialization count and copy the rest of the original code (with a few simplifications):","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function StatsAPI.fit!(\n hmm::EquilibriumHMM{R,D}, init_count, trans_count, obs_seq, state_marginals\n) where {R,D}\n hmm.trans .= trans_count ./ sum(trans_count, dims=2)\n for i in 1:N\n hmm.dists[i] = fit(D, obs_seq, state_marginals[i, :])\n end\nend","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Let's take our new model for a spin:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"function gaussian_equilibrium_hmm(N; noise=0)\n A = rand_trans_mat(N)\n dists = [Normal(i + noise * randn(), 0.5) for i in 1:N]\n return EquilibriumHMM(A, dists)\nend;","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"N = 3\nhmm = gaussian_equilibrium_hmm(N);\ntransition_matrix(hmm)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"[obs_distribution(hmm, i) for i in 1:N]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"We can estimate parameters based on several observation sequences. Note that as soon as we tamper with the re-estimation procedure, the loglikelihood is no longer guaranteed to increase during Baum-Welch, which is why we turn off the corresponding check.","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"T = 1000\nnb_seqs = 10\nobs_seqs = [rand(hmm, T).obs_seq for _ in 1:nb_seqs]\n\nhmm_init = gaussian_equilibrium_hmm(N; noise=1)\nhmm_est, logL_evolution = baum_welch(\n hmm_init, obs_seqs, nb_seqs; check_loglikelihood_increasing=false\n);\nfirst(logL_evolution), last(logL_evolution)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"Let's correct the state order:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"[obs_distribution(hmm_est, i) for i in 1:N]","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"perm = sortperm(1:3, by=i->obs_distribution(hmm_est, i).μ)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"hmm_est = PermutedHMM(hmm_est, perm)","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"And finally evaluate the errors:","category":"page"},{"location":"tuto_custom/","page":"Custom HMM","title":"Custom HMM","text":"cat(transition_matrix(hmm_est), transition_matrix(hmm), dims=3)","category":"page"},{"location":"formulas/#Formulas","page":"Formulas","title":"Formulas","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Suppose we are given observations Y_1 Y_T, with hidden states X_1 X_T. Following (Rabiner, 1989), we use the following notations:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"let pi in mathbbR^N be the initial state distribution pi_i = mathbbP(X_1 = i)\nlet A in mathbbR^N times N be the transition matrix a_ij = mathbbP(X_t+1=j X_t = i)\nlet B in mathbbR^N times T be the matrix of statewise observation likelihoods b_it = mathbbP(Y_t X_t = i)","category":"page"},{"location":"formulas/#Vanilla-forward-backward","page":"Formulas","title":"Vanilla forward-backward","text":"","category":"section"},{"location":"formulas/#Recursion","page":"Formulas","title":"Recursion","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The forward and backward variables are defined by","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_it = mathbbP(Y_1t X_t=i) \nbeta_it = mathbbP(Y_t+1T X_t=i)\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"They are initialized with","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_i1 = pi_i b_i1 \nbeta_iT = 1\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"and satisfy the dynamic programming equations","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_jt+1 = left(sum_i=1^N alpha_it a_ijright) b_jt+1 \nbeta_it = sum_j=1^N a_ij b_jt+1 beta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Likelihood","page":"Formulas","title":"Likelihood","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The likelihood of the whole sequence of observations is given by","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"mathcalL = mathbbP(Y_1T) = sum_i=1^N alpha_iT","category":"page"},{"location":"formulas/#Marginals","page":"Formulas","title":"Marginals","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"We notice that","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nalpha_it beta_it = mathbbP(Y_1T X_t=i) \nalpha_it a_ij b_jt+1 beta_jt+1 = mathbbP(Y_1T X_t=i X_t+1=j)\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Thus we deduce the one-state and two-state marginals","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\ngamma_it = mathbbP(X_t=i Y_1T) = frac1mathcalL alpha_it beta_it \nxi_ijt = mathbbP(X_t=i X_t+1=j Y_1T) = frac1mathcalL alpha_it a_ij b_jt+1 beta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Derivatives","page":"Formulas","title":"Derivatives","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"According to (Qin et al., 2000), derivatives of the likelihood can be obtained as follows:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial pi_i = beta_i1 b_i1 \nfracpartial mathcalLpartial a_ij = sum_t=1^T-1 alpha_it b_jt+1 beta_jt+1 \nfracpartial mathcalLpartial b_j1 = pi_j beta_j1 \nfracpartial mathcalLpartial b_jt = left(sum_i=1^N alpha_it-1 a_ijright) beta_jt \nendalign*","category":"page"},{"location":"formulas/#Scaled-forward-backward","page":"Formulas","title":"Scaled forward-backward","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"In this package, we use a slightly different version of the algorithm, including both the traditional scaling of (Rabiner, 1989) and a normalization of B using m_t = max_i b_it.","category":"page"},{"location":"formulas/#Recursion-2","page":"Formulas","title":"Recursion","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"The variables are initialized with","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nhatalpha_i1 = pi_i fracb_i1m_1 c_1 = frac1sum_i hatalpha_i1 baralpha_i1 = c_1 hatalpha_i1 \nhatbeta_iT = 1 barbeta_1T = c_T hatbeta_1T\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"and satisfy the dynamic programming equations","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nhatalpha_jt+1 = left(sum_i=1^N baralpha_it a_ijright) fracb_jt+1m_t+1 c_t+1 = frac1sum_j hatalpha_jt+1 baralpha_jt+1 = c_t+1 hatalpha_jt+1 \nhatbeta_it = sum_j=1^N a_ij fracb_jt+1m_t+1 barbeta_jt+1 barbeta_jt = c_t hatbeta_jt\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"In terms of the original variables, we find","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nbaralpha_it = alpha_it left(prod_s=1^t fracc_sm_sright) \nbarbeta_it = beta_it left(c_t prod_s=t+1^T fracc_sm_sright)\nendalign*","category":"page"},{"location":"formulas/#Likelihood-2","page":"Formulas","title":"Likelihood","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Since we normalized baralpha at each time step,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"1 = sum_i=1^N baralpha_iT = left(sum_i=1^N alpha_iTright) left(prod_s=1^T fracc_sm_sright) ","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"which means","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"mathcalL = sum_i=1^N alpha_iT = prod_s=1^T fracm_sc_s","category":"page"},{"location":"formulas/#Marginals-2","page":"Formulas","title":"Marginals","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"We can now express the marginals using scaled variables:","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\ngamma_it = frac1mathcalL alpha_it beta_it = frac1mathcalL left(baralpha_it prod_s=1^t fracm_sc_sright) left(barbeta_it frac1c_t prod_s=t+1^T fracm_sc_sright) \n= frac1mathcalL fracbaralpha_it barbeta_itc_t left(prod_s=1^T fracm_sc_sright) = fracbaralpha_it barbeta_itc_t\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nxi_ijt = frac1mathcalL alpha_it a_ij b_jt+1 beta_jt+1 \n= frac1mathcalL left(baralpha_it prod_s=1^t fracm_sc_sright) a_ij b_jt+1 left(barbeta_jt+1 frac1c_t+1 prod_s=t+2^T fracm_sc_sright) \n= frac1mathcalL baralpha_it a_ij fracb_jt+1m_t+1 barbeta_jt+1 left(prod_s=1^T fracm_sc_sright) \n= baralpha_it a_ij fracb_jt+1m_t+1 barbeta_jt+1\nendalign*","category":"page"},{"location":"formulas/#Derivatives-2","page":"Formulas","title":"Derivatives","text":"","category":"section"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"And we also need to adapt the derivatives. For the initial distribution,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial pi_i = beta_i1 b_i1 = left(barbeta_i1 frac1c_1 prod_s=2^T fracm_sc_s right) b_i1 \n= left(prod_s=1^T fracm_sc_sright) barbeta_i1 fracb_i1m_1 = mathcalL barbeta_i1 fracb_i1m_1 \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"For the transition matrix,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial a_ij = sum_t=1^T-1 alpha_it b_jt+1 beta_jt+1 \n= sum_t=1^T-1 left(baralpha_it prod_s=1^t fracm_sc_s right) b_jt+1 left(barbeta_jt+1 frac1c_t+1 prod_s=t+2^T fracm_sc_s right) \n= sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 left(prod_s=1^T fracm_sc_s right) \n= mathcalL sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"And for the statewise observation likelihoods,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial b_j1 = pi_j beta_j1 = pi_j barbeta_j1 frac1c_1 prod_s=2^T fracm_sc_s = mathcalL pi_j barbeta_j1 frac1m_1\nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial mathcalLpartial b_jt = left(sum_i=1^N alpha_it-1 a_ijright) beta_jt \n= sum_i=1^N left(baralpha_it-1 prod_s=1^t-1 fracm_sc_sright) a_ij left(barbeta_jt frac1c_t prod_s=t+1^T fracm_sc_s right) \n= sum_i=1^N baralpha_it-1 a_ij barbeta_jt frac1m_t left(prod_s=1^T fracm_sc_sright) \n= mathcalL sum_i=1^N baralpha_it-1 a_ij barbeta_jt frac1m_t \nendalign*","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"Finally, we note that","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"fracpartial log mathcalLpartial log b_jt = fracpartial log mathcalLpartial b_jt b_jt","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"To sum up,","category":"page"},{"location":"formulas/","page":"Formulas","title":"Formulas","text":"beginalign*\nfracpartial log mathcalLpartial pi_i = fracb_i1m_1 barbeta_i1 \nfracpartial log mathcalLpartial a_ij = sum_t=1^T-1 baralpha_it fracb_jt+1m_t+1 barbeta_jt+1 \nfracpartial log mathcalLpartial log b_j1 = pi_j fracb_j1m_1 barbeta_j1 = fracbaralpha_j1 barbeta_j1c_1 = gamma_j1 \nfracpartial log mathcalLpartial log b_jt = sum_i=1^N baralpha_it-1 a_ij fracb_jtm_t barbeta_jt = fracbaralpha_jt barbeta_jtc_t = gamma_jt\nendalign*","category":"page"},{"location":"","page":"Home","title":"Home","text":"EditURL = \"https://github.com/gdalle/HiddenMarkovModels.jl/blob/main/README.md\"","category":"page"},{"location":"#HiddenMarkovModels.jl","page":"Home","title":"HiddenMarkovModels.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"(Image: Stable) (Image: Dev) (Image: Build Status) (Image: Coverage) (Image: Code Style: Blue)","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: DOI) (Image: Aqua QA) (Image: JET)","category":"page"},{"location":"","page":"Home","title":"Home","text":"A Julia package for HMM modeling, simulation, inference and learning.","category":"page"},{"location":"#Getting-started","page":"Home","title":"Getting started","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This package can be installed using Julia's package manager:","category":"page"},{"location":"","page":"Home","title":"Home","text":"pkg> add HiddenMarkovModels","category":"page"},{"location":"","page":"Home","title":"Home","text":"Then, you can create your first HMM as follows:","category":"page"},{"location":"","page":"Home","title":"Home","text":"using Distributions, HiddenMarkovModels\ninit = [0.2, 0.8]\ntrans = [0.1 0.9; 0.7 0.3]\ndists = [Normal(-1), Normal(1)]\nhmm = HMM(init, trans, dists)","category":"page"},{"location":"","page":"Home","title":"Home","text":"Take a look at the documentation to know what to do next!","category":"page"},{"location":"#Main-features","page":"Home","title":"Main features","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This package is generic. Observations can be arbitrary Julia objects, not just scalars or arrays. Their distributions only need to implement rand(rng, dist) and logdensityof(dist, x) from DensityInterface.jl. Number types are not restricted to floating point, and automatic differentiation is supported in forward mode (ForwardDiff.jl) and reverse mode (ChainRules.jl).","category":"page"},{"location":"","page":"Home","title":"Home","text":"This package is fast. All the inference functions have allocation-free versions, which leverage efficient linear algebra subroutines. Multithreading is used to parallelize computations across sequences, and compatibility with various array types is ensured. We include extensive benchmarks against Julia and Python competitors thanks to BenchmarkTools.jl and PythonCall.jl.","category":"page"},{"location":"","page":"Home","title":"Home","text":"This package is reliable. It gives the same results as the previous reference package HMMBase.jl up to numerical accuracy. The test suite incorporates quality checks with Aqua.jl, as well as linting and type stability checks with JET.jl. A detailed documentation will help you find the functions you need.","category":"page"},{"location":"","page":"Home","title":"Home","text":"But this package is limited in scope. It is designed for HMMs with a small number of states, because memory and runtime scale quadratically (even if the transitions are sparse). It is also meant to perform best on a CPU, and not tested at all on GPUs.","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you spot a bug or want to ask about a new feature, please open an issue on the GitHub repository. Once the issue receives positive feedback, feel free to try and fix it with a pull request that follows the BlueStyle guidelines.","category":"page"},{"location":"#Acknowledgements","page":"Home","title":"Acknowledgements","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"A big thank you to Maxime Mouchet and Jacob Schreiber, the respective lead devs of HMMBase.jl and pomegranate, for their help and advice. Logo by Clément Mantoux based on a portrait of Andrey Markov.","category":"page"}] } diff --git a/dev/tuto_builtin/index.html b/dev/tuto_builtin/index.html index 282b6cde..8026d3f4 100644 --- a/dev/tuto_builtin/index.html +++ b/dev/tuto_builtin/index.html @@ -38,4 +38,4 @@ [:, :, 2] = 0.323233 0.512336 0.164431 0.337336 0.0936302 0.569033 - 0.474491 0.211947 0.313561 + 0.474491 0.211947 0.313561 diff --git a/dev/tuto_custom/index.html b/dev/tuto_custom/index.html index 181707b1..dc0c5e0a 100644 --- a/dev/tuto_custom/index.html +++ b/dev/tuto_custom/index.html @@ -61,4 +61,4 @@ [:, :, 2] = 0.323233 0.512336 0.164431 0.337336 0.0936302 0.569033 - 0.474491 0.211947 0.313561 + 0.474491 0.211947 0.313561