class TRAM

class deeptime.markov.msm.TRAM(lagtime=1, count_mode='sliding', maxiter=1000, maxerr: float = 1e-08, init_strategy='MBAR', init_maxiter=1000, init_maxerr=1e-08, track_log_likelihoods=False, callback_interval=1, progress=None)

Transition(-based) Reweighting Analysis Method.

TRAM is described in [1]. The parameters in this code correspond to variables in the TRAM paper in the following way:

  • biased_conf_energies[k][i] corresponds to \(f_i^k\), the free energy for Markov state i in thermodynamic state k.

  • lagrangian_mult_log[k][i] corresponds to \(log\;v_i^k\), the logarithm of the lagrangian multiplier of Markov state i in thermodynamic state k.

  • modified_state_counts_log[k][i] corresponds to \(log\;R_i^k\), the logarithm of \(R_i^k\), the modified state counts for Markov state i in thermodynamic state k.

Parameters:
  • lagtime (int, default=1) – Integer lag time at which transitions are counted.

  • count_mode (str, default="sliding") –

    One of “sample”, “sliding”, “sliding-effective”, and “effective”.

    • ”sample” strides the trajectory with lagtime \(\tau\) and uses the strided counts as transitions.

    • ”sliding” uses a sliding window approach, yielding counts that are statistically correlated and too large by a factor of \(\tau\); in uncertainty estimation this yields wrong uncertainties.

    • ”sliding-effective” takes “sliding” and divides it by \(\tau\), which can be shown to provide a likelihood that is the geometrical average over shifted subsamples of the trajectory, \((s_1,\:s_{tau+1},\:...),\:(s_2,\:t_{tau+2},\:...),\) etc. This geometrical average converges to the correct likelihood in the statistical limit [2].

    • ”effective” uses an estimate of the transition counts that are statistically uncorrelated. Recommended when estimating Bayesian MSMs.

  • maxiter (int, optional, default=1000) – The maximum number of self-consistent iterations before the estimator exits unsuccessfully.

  • maxerr (float, optional, default=1E-8) – Convergence criterion based on the maximal free energy change in a self-consistent iteration step.

  • init_strategy (str, optional, default='MBAR') –

    Strategy of initialization of the free energies, lagrangian multipliers and state counts. Possible choices: “MBAR” or None.

    • ”MBAR” : Initialize free energies using MBAR [3]. MBAR iterations are executed before TRAM, to initialize the free energies with the MBAR estimate, which is a good initial estimate for the TRAM free energies and will significantly speed up the convergence of TRAM. Lagrangian multipliers are initialized to \(v_i^k = log( 1/2 * \sum_j (c_ij^k + c_ji^k))\) and modified state counts are zero-initialized.

    • None : Free energies and modified state counts are zero-initialized. Lagrangian multipliers are initialized to \(v_i^k = log( 1/2 * \sum_j (c_ij^k + c_ji^k))\)

  • init_maxiter (int, optional, default=1000) – The maximum number of iterations for parameter initialization, e.g. MBAR iterations. These initialization iterations are executed before TRAM, to initialize the parameters with the chosen init_strategy.

  • init_maxerr (float, optional, default = 1eE-8) – Convergence criterion for the initialization routine, based on the maximum energy change after one iteration step.

  • track_log_likelihoods (bool, optional, default=False) – If True, the log-likelihood is stored every callback_interval steps. For calculation of the log-likelihood the transition matrix needs to be constructed, which will slow down estimation. By default, log-likelihoods are not computed.

  • callback_interval (int, optional, default=0) – Every callback_interval iteration steps, the callback function is called and error increments are stored. If track_log_likelihoods=true, the log-likelihood are also stored. If callback_interval=0, no call to the callback function is done.

  • progress (object) – Progress bar object that TRAM will call to indicate progress to the user. Tested for a tqdm progress bar. The interface is checked via supports_progress_interface.

References

Attributes

has_model

Property reporting whether this estimator contains an estimated model.

init_strategy_options

model

Shortcut to fetch_model().

reversible

If true compute reversible MarkovStateModel, else non-reversible MarkovStateModel

sparse

If true compute count matrix, transition matrix and all derived quantities using sparse matrix algebra.

Methods

fetch_model()

Yields the most recent MarkovStateModelCollection that was estimated.

fit(data[, model])

Fit a MarkovStateModelCollection to the given input data, using TRAM.

fit_fetch(data, **kwargs)

Fits the internal model on data and subsequently fetches it in one call.

get_params([deep])

Get the parameters.

set_params(**params)

Set the parameters of this estimator.

fetch_model() Optional[TRAMModel]

Yields the most recent MarkovStateModelCollection that was estimated. Can be None if fit was not called.

Returns:

model – The most recent markov state model or None.

Return type:

MarkovStateModelCollection or None

fit(data, model=None, *args, **kw)

Fit a MarkovStateModelCollection to the given input data, using TRAM. For each thermodynamic state, there is one Markov Model in the collection at the respective thermodynamic state index.

Parameters:
  • data (TRAMDataset or tuple) –

    If data is supplied in form of a data tuple, a TRAMDataset is constructed from the data tuple inside fit(). The data tuple is of shape (dtrajs, bias_matrices) or (dtrajs, bias_matrices, ttrajs), with the inner values given by:

    • dtrajs: array-like(ndarray(n)), int

      The discrete trajectories in the form of a list or array of numpy arrays. dtrajs[i] contains one trajectory. dtrajs[i][n] equals the Markov state index that the \(n\)-th sample from the \(i\)-th trajectory was binned into. Each of the dtrajs can be of variable length.

    • bias_matrices: ndarray-like(ndarray(n,m)), float

      The bias energy matrices. bias_matrices[i][n, k] equals the bias energy of the \(n\)-th sample from the \(i\)-th trajectory, evaluated at thermodynamic state \(k\), \(b^k(x_{i,n})\). The bias energy matrices should have the same size as dtrajs in both the first and second dimensions. The third dimension is of size n_therm_states, i.e. for each sample, the bias energy in every thermodynamic state is calculated and stored in the bias_matrices.

    • ttrajs: array-like(ndarray(n)), int, optional

      ttrajs[i] indicates for each sample in the \(i\)-th trajectory what thermodynamic state that sample was sampled at. If ttrajs = None, we assume no replica exchange was done. In this case we assume each trajectory corresponds to a unique thermodynamic state, and n_therm_states equals the size of dtrajs.

    If the data is supplied in form of a TRAMDataset, the lagtime of the dataset overwrites the chosen lagtime of the TRAM estimator.

  • model (TRAMModel, optional, default=None) – If a TRAMModel is given, the parameters from the TRAMModel are loaded into the estimator, and estimation continues from the loaded parameters as a starting point. Input data may differ from the input data used to estimate the input model, but the input data should lie within bounds of the number of thermodynamic states and Markov states given by the model, meaning the highest occurring state indices in ttrajs and dtrajs may be model.n_therm_states - 1 and model.n_markov_states - 1 respectively, If no model is given, estimation starts from zero-initialized arrays for the free energies and modified state counts. The lagrangian multipliers are initialized with values \(v_i^{k, 0} = \mathrm{log} (c_{ij}^k + c_{ji}^k)/2\)

See also

TRAMDataset

fit_fetch(data, **kwargs)

Fits the internal model on data and subsequently fetches it in one call.

Parameters:
  • data (array_like) – Data that is used to fit the model.

  • **kwargs – Additional arguments to fit().

Returns:

The estimated model.

Return type:

model

get_params(deep=False)

Get the parameters.

Returns:

params – Parameter names mapped to their values.

Return type:

mapping of string to any

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

object

property has_model: bool

Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.

Type:

bool

property model

Shortcut to fetch_model().

property reversible: bool

If true compute reversible MarkovStateModel, else non-reversible MarkovStateModel

property sparse: bool

If true compute count matrix, transition matrix and all derived quantities using sparse matrix algebra. In this case python sparse matrices will be returned by the corresponding functions instead of numpy arrays. This behavior is suggested for very large numbers of states (e.g. > 4000) because it is likely to be much more efficient.