class BayesianHMM

class deeptime.markov.hmm.BayesianHMM(initial_hmm: HiddenMarkovModel, n_samples: int = 100, n_transition_matrix_sampling_steps: int = 1000, stride: Union[str, int] = 'effective', initial_distribution_prior: Optional[Union[str, float, ndarray]] = 'mixed', transition_matrix_prior: Optional[Union[str, ndarray]] = 'mixed', store_hidden: bool = False, reversible: bool = True, stationary: bool = False)

Estimator for a Bayesian Hidden Markov state model.

The theory and estimation procedure are described in [1], [2].

Parameters:
  • initial_hmm (HMM) – Single-point estimate of HMM object around which errors will be evaluated. There is a static method available that can be used to generate a default prior, see default().

  • n_samples (int, optional, default=100) – Number of sampled models.

  • stride (str or int, default='effective') –

    Stride between two lagged trajectories extracted from the input trajectories. Given a trajectory s[t], stride and lag time tau result in the trajectories

    s[0], s[tau], s[2 tau], ...

    s[stride], s[stride + tau], s[stride + 2 tau], ...

    Setting stride = 1 uses all data (useful for a maximum likelihood estimator), while a Bayesian estimator requires a longer stride so that the extracted trajectories are statistically uncorrelated. Setting stride = ‘effective’ uses the largest neglected timescale as an estimate of the correlation time and sets the stride accordingly.

  • initial_distribution_prior (None, str, float or ndarray(n)) –

    Prior for the initial distribution of the HMM. Will only be active if stationary=False (stationary=True means that p0 is identical to the stationary distribution of the transition matrix). Currently implements different versions of the Dirichlet prior that is conjugate to the Dirichlet distribution of p0. p0 is sampled from:

    \[p0 \sim \prod_i (p0)_i^{a_i + n_i - 1} \]

    where \(n_i\) is the number of times a hidden trajectory was in state \(i\) at time step 0 and \(a_i\) is the prior count. The following options are available:

    • ’mixed’ (default), \(a_i = p_{0,\mathrm{init}}\), where \(p_{0,\mathrm{init}}\) is the initial distribution of initial_hmm.

    • ndarray(n) or float, the given value will be used as the prior counts \(a_i\).

    • ’uniform’, \(a_i = 1\)

    • None, \(a_i = 0\). This option ensures that the sample mean coincides with the MLE. It will sooner or later lead to sampling problems: as soon as zero trajectories are drawn from a given state, the sampler cannot recover and that state will never serve as a starting state again. Only recommended in the large data regime and when the probability of sampling zero trajectories from any state is negligible.

  • transition_matrix_prior (None, str or ndarray(n, n)) –

    Prior for the HMM transition matrix. Currently implements Dirichlet priors if reversible=False and reversible transition matrix priors as described in [3] if reversible=True. For the nonreversible case the posterior of transition matrix \(P\) is:

    \[P \sim \prod_{i,j} p_{ij}^{b_{ij} + c_{ij} - 1} \]

    where \(c_{ij}\) is the number of transitions from \(i\) to \(j\) found in the hidden trajectories and \(b_{ij}\) are the prior counts. The following options are available:

    • ’mixed’ (default), \(b_{ij} = p_{ij,\mathrm{init}}\), where \(p_{ij,\mathrm{init}}\) is the transition matrix of initial_hmm. Because each row of a transition matrix sums to one, this amounts to one prior count per row.

    • ndarray(n, n) or broadcastable, the given array will be used as the prior counts \(b_{ij}\).

    • ’uniform’, \(b_{ij} = 1\)

    • None, \(b_{ij} = 0\). This option ensures that the sample mean coincides with the MLE. It will sooner or later lead to sampling problems: as soon as a transition \(ij\) does not occur in a sample, the sampler cannot recover and that transition will never be sampled again. This option is not recommended unless you have a small HMM and a lot of data.

  • store_hidden (bool, optional, default=False) – Store hidden trajectories in sampled HMMs, see BayesianHMMPosterior.hidden_state_trajectories_samples.

  • reversible (bool, optional, default=True) – If True, a prior that enforces reversible transition matrices (detailed balance) is used; otherwise, a standard non-reversible prior is used.

  • stationary (bool, optional, default=False) – If True, the stationary distribution of the transition matrix will be used as initial distribution. Only use True if you are confident that the observation trajectories are started from a global equilibrium. If False, the initial distribution will be estimated as usual from the first step of the hidden trajectories.
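
The stride parameter described above can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper name lagged_subtrajectories is hypothetical, and the choice to keep start offsets below the lag time (so that no frame is used twice) is an assumption of this example.

```python
def lagged_subtrajectories(traj, lagtime, stride):
    """Split one trajectory into sub-trajectories sampled every `lagtime` steps.

    Sub-trajectory k starts at offset k * stride. Offsets are kept below
    `lagtime` (an assumption of this sketch) so that no frame is used twice.
    """
    return [traj[offset::lagtime] for offset in range(0, lagtime, stride)]


s = list(range(12))  # toy trajectory s[t] = t

# stride = 2, tau = 4: s[0], s[4], s[8] and s[2], s[6], s[10]
print(lagged_subtrajectories(s, lagtime=4, stride=2))  # [[0, 4, 8], [2, 6, 10]]

# stride = 1 uses every frame, spread over tau sub-trajectories
print(len(lagged_subtrajectories(s, lagtime=4, stride=1)))  # 4
```

A larger stride yields fewer, less correlated sub-trajectories, which is the trade-off the 'effective' setting is meant to pick automatically.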

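The Dirichlet posteriors given for initial_distribution_prior and for the non-reversible case of transition_matrix_prior can be sketched with the standard library alone. This is a minimal illustration, not the sampler used by the estimator: the helpers prior_counts and sample_dirichlet are hypothetical, the counts are made-up numbers, and the reversible prior from [3] is not covered.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalising Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]  # requires every alpha_i > 0
    total = sum(draws)
    return [d / total for d in draws]


def prior_counts(option, reference):
    """Translate a prior option into prior counts shaped like `reference`."""
    if option is None:                    # no prior counts
        return [0.0] * len(reference)
    if option == "mixed":                 # prior counts equal the point estimate
        return [float(x) for x in reference]
    if option == "uniform":               # one prior count per entry
        return [1.0] * len(reference)
    if isinstance(option, (int, float)):  # scalar, broadcast to all entries
        return [float(option)] * len(reference)
    return [float(x) for x in option]     # explicit array of prior counts


rng = random.Random(0)

# Initial distribution: p0 ~ Dirichlet(a + n)
p0_init = [0.5, 0.3, 0.2]  # made-up initial distribution of the point-estimate HMM
n = [4, 0, 1]              # made-up counts of hidden trajectories starting in each state
a = prior_counts("mixed", p0_init)
p0_sample = sample_dirichlet([ai + ni for ai, ni in zip(a, n)], rng)

# Non-reversible transition matrix: each row i ~ Dirichlet(b_i + c_i)
P_init = [[0.9, 0.1], [0.2, 0.8]]  # made-up point-estimate transition matrix
C = [[90, 10], [5, 45]]            # made-up hidden transition counts
P_sample = [
    sample_dirichlet([b + c for b, c in zip(prior_counts("mixed", p_row), c_row)], rng)
    for p_row, c_row in zip(P_init, C)
]

print(p0_sample)
print(P_sample)
```

With option None and a zero count, the concentration parameter becomes 0 and gammavariate raises a ValueError. That is the sketch's counterpart to the unrecoverable sampling problem described for the None priors.
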
References

Attributes

  • has_model – Property reporting whether this estimator contains an estimated model.

  • initial_distribution_prior – Prior for the initial distribution.

  • initial_hmm – The prior HMM.

  • model – Shortcut to fetch_model().

  • n_samples – Number of sampled models.

  • reversible – If True, a prior that enforces reversible transition matrices (detailed balance) is used; otherwise, a standard non-reversible prior is used.

  • stationary – If True, the stationary distribution of the transition matrix will be used as initial distribution.

  • store_hidden – Store hidden trajectories in sampled HMMs, see BayesianHMMPosterior.hidden_state_trajectories_samples.

  • transition_matrix_prior – Prior for the transition matrix.

Methods

  • default(dtrajs, n_hidden_states, lagtime[, ...]) – Computes a default prior for a BHMM and uses that for error estimation.

  • fetch_model() – Yields the current model or None if fit() was not yet called.

  • fit(data[, n_burn_in, n_thin, progress]) – Sample from the posterior.

  • fit_fetch(data, **kwargs) – Fits the internal model on data and subsequently fetches it in one call.

  • get_params([deep]) – Get the parameters.

  • set_params(**params) – Set the parameters of this estimator.

static default(dtrajs, n_hidden_states: int, lagtime: int, n_samples: int = 100, stride: Union[str, int] = 'effective', initial_distribution_prior: Optional[Union[str, float, ndarray]] = 'mixed', transition_matrix_prior: Optional[Union[str, ndarray]] = 'mixed', separate: Optional[Union[int, List[int]]] = None, store_hidden: bool = False, reversible: bool = True, stationary: bool = False, prior_submodel: bool = True)

Computes a default prior for a BHMM and uses that for error estimation. For a more detailed description of the arguments please refer to HMM or __init__().

Returns:

estimator – Estimator that is initialized with a default prior model.

Return type:

BayesianHMM

fetch_model() BayesianHMMPosterior

Yields the current model or None if fit() was not yet called.

Returns:

posterior – The model.

Return type:

BayesianHMMPosterior

fit(data, n_burn_in: int = 0, n_thin: int = 1, progress=None, **kwargs)

Sample from the posterior.

Parameters:
  • data (array_like or list of array_like) – Input time series data.

  • n_burn_in (int, optional, default=0) – The number of samples to discard as burn-in, after which n_samples samples will be generated.

  • n_thin (int, optional, default=1) – The number of Gibbs sampling updates used to generate each returned sample.

  • progress (iterable, optional, default=None) – Optional progress bar. Tested with tqdm.

  • **kwargs – Ignored kwargs for scikit-learn compatibility.

Returns:

self – Reference to self.

Return type:

BayesianHMM
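
How n_burn_in and n_thin interact with n_samples can be sketched with a generic sampling loop. This is an illustration under assumptions, not the estimator's code: run_sampler and update are hypothetical names, and the sketch assumes each discarded burn-in sample costs a single update.

```python
def run_sampler(update, state, n_samples, n_burn_in=0, n_thin=1):
    """Generic Gibbs-style loop: discard burn-in, then keep every n_thin-th state."""
    # Burn-in: advance the chain and throw the states away.
    for _ in range(n_burn_in):
        state = update(state)

    samples = []
    for _ in range(n_samples):
        # n_thin updates are performed for each returned sample.
        for _ in range(n_thin):
            state = update(state)
        samples.append(state)
    return samples


# Toy "chain" that just counts how many updates have been applied.
samples = run_sampler(lambda s: s + 1, 0, n_samples=3, n_burn_in=2, n_thin=4)
print(samples)  # [6, 10, 14]
```

In this sketch the total number of updates is n_burn_in + n_samples * n_thin, so raising n_thin reduces correlation between returned samples at a proportional cost in runtime.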

fit_fetch(data, **kwargs)

Fits the internal model on data and subsequently fetches it in one call.

Parameters:
  • data (array_like) – Data that is used to fit the model.

  • **kwargs – Additional arguments to fit().

Returns:

The estimated model.

Return type:

model

get_params(deep=False)

Get the parameters.

Returns:

params – Parameter names mapped to their values.

Return type:

mapping of string to any

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

object
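
The <component>__<parameter> naming convention can be illustrated with a toy re-implementation. The classes ToyComponent and ToyEstimator below are hypothetical and exist only for this sketch; the actual set_params may differ in validation and error handling.

```python
class ParamsMixin:
    """Toy version of the `<component>__<parameter>` routing convention."""

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, rest = key.partition("__")
            if not hasattr(self, name):
                raise ValueError(f"Invalid parameter {name!r}")
            if sep:
                # Nested key: forward the remainder to the named component.
                getattr(self, name).set_params(**{rest: value})
            else:
                # Plain key: set the attribute on this object.
                setattr(self, name, value)
        return self


class ToyComponent(ParamsMixin):
    def __init__(self, rate=1.0):
        self.rate = rate


class ToyEstimator(ParamsMixin):
    def __init__(self, n_samples=100):
        self.n_samples = n_samples
        self.inner = ToyComponent()


est = ToyEstimator().set_params(n_samples=50, inner__rate=0.5)
print(est.n_samples, est.inner.rate)  # 50 0.5
```

Because set_params returns the estimator itself, calls can be chained as in the last lines of the sketch.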

property has_model: bool

Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.

Type:

bool

property initial_distribution_prior: Optional[Union[str, float, ndarray]]

Prior for the initial distribution. For a more detailed description refer to __init__().

property initial_hmm: HiddenMarkovModel

The prior HMM. An estimator with a default prior HMM can be generated using the static default() method.

property model

Shortcut to fetch_model().

property n_samples: int

Number of sampled models.

property reversible

If True, a prior that enforces reversible transition matrices (detailed balance) is used; otherwise, a standard non-reversible prior is used.

property stationary

If True, the stationary distribution of the transition matrix will be used as initial distribution. Only use True if you are confident that the observation trajectories are started from a global equilibrium. If False, the initial distribution will be estimated as usual from the first step of the hidden trajectories.

property store_hidden

Store hidden trajectories in sampled HMMs, see BayesianHMMPosterior.hidden_state_trajectories_samples.

property transition_matrix_prior

Prior for the transition matrix. For a more detailed description refer to __init__().