class VAMP

class deeptime.decomposition.VAMP(lagtime: int | None = None, dim: int | None = None, var_cutoff: float | None = None, scaling: str | None = None, epsilon: float = 1e-06, observable_transform: typing.Callable[[numpy.ndarray], numpy.ndarray] = <deeptime.basis._monomials.Identity object>)

Variational approach for Markov processes (VAMP).

The implementation is based on [1], [2].

Parameters:

lagtime (int or None, optional, default=None) – The lagtime under which covariances are estimated. This is only relevant when estimating from data; if covariances are provided, this should be either None or exactly the value that was used to estimate them.
dim (int, optional, default=None) –
Number of dimensions to keep:
- if dim is not set (None) all available ranks are kept: n_components == min(n_samples, n_uncorrelated_features)
- if dim is an integer >= 1, this number specifies the number of dimensions to keep.
var_cutoff (float, optional, default=None) – Determines the number of output dimensions by including dimensions until their cumulative kinetic variance exceeds the fraction var_cutoff of the total. var_cutoff=1.0 means all numerically available dimensions (see epsilon) will be used, unless set by dim. Setting var_cutoff smaller than 1.0 is exclusive with dim.
scaling (str, optional, default=None) –
Scaling to be applied to the VAMP order parameters upon transformation
- None: no scaling will be applied, variance of the order parameters is 1
- ‘kinetic_map’ or ‘km’: order parameters are scaled by singular value. Only the left singular functions induce a kinetic map with respect to the conventional forward propagator. The right singular functions induce a kinetic map with respect to the backward propagator.
epsilon (float, optional, default=1e-6) – Eigenvalue cutoff. Eigenvalues of $C_{00}$ and $C_{11}$ with norms <= epsilon will be cut off. The remaining number of eigenvalues together with the value of dim define the size of the output.
observable_transform (callable, optional, default=Identity) – A feature transformation on the raw data which is used to estimate the model.

See also

CovarianceKoopmanModel: type of model produced by this estimator

Notes

VAMP is a method for dimensionality reduction of Markov processes.

The Koopman operator $\mathcal{K}$ is an integral operator that describes conditional future expectation values. Let $p(\mathbf{x},\,\mathbf{y})$ be the conditional probability density of visiting an infinitesimal phase space volume around point $\mathbf{y}$ at time $t+\tau$ given that the phase space point $\mathbf{x}$ was visited at the earlier time $t$ . Then the action of the Koopman operator on a function $f$ can be written as follows:

(\mathcal{K}f)(\mathbf{x})=\int p(\mathbf{x},\,\mathbf{y})f(\mathbf{y})\,\mathrm{d}\mathbf{y}=\mathbb{E}\left[f(\mathbf{x}_{t+\tau})\mid\mathbf{x}_{t}=\mathbf{x}\right]

The Koopman operator is defined without any reference to an equilibrium distribution. It is therefore well-defined in situations where the dynamics are irreversible and/or non-stationary, such that no equilibrium distribution exists.
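As a discrete-state illustration of the conditional expectation above, the integral reduces to a matrix-vector product. This is only a sketch under the assumption of a finite state space; the transition matrix `P` and observable `f` are hypothetical and not part of the library.

```python
import numpy as np

# Hypothetical two-state Markov chain: P[x, y] is the probability of being in
# state y at time t + tau given state x at time t (each row sums to one).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# An observable f, given by its value in each state.
f = np.array([1.0, 3.0])

# Discrete analogue of (K f)(x) = sum_y p(x, y) f(y) = E[f(x_{t+tau}) | x_t = x].
Kf = P @ f
print(Kf)
```

No stationary distribution of `P` enters the computation, which mirrors the statement that the operator is defined without reference to equilibrium.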

If we approximate $f$ by a linear superposition of ansatz functions $\boldsymbol{\chi}$ of the conformational degrees of freedom (features), the operator $\mathcal{K}$ can be approximated by a (finite-dimensional) matrix $\mathbf{K}$ .

The approximation is computed as follows: From the time-dependent input features $\boldsymbol{\chi}(t)$ , we compute the mean $\boldsymbol{\mu}_{0}$ ( $\boldsymbol{\mu}_{1}$ ) from all data excluding the last (first) $\tau$ steps of every trajectory as follows:

\begin{aligned} \boldsymbol{\mu}_{0}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\boldsymbol{\chi}(t)\\ \boldsymbol{\mu}_{1}&:=\frac{1}{T-\tau}\sum_{t=\tau}^{T}\boldsymbol{\chi}(t) \end{aligned}

Next, we compute the instantaneous covariance matrices $\mathbf{C}_{00}$ and $\mathbf{C}_{11}$ and the time-lagged covariance matrix $\mathbf{C}_{01}$ as follows:

\begin{aligned} \mathbf{C}_{00}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]^{\top}\\ \mathbf{C}_{11}&:=\frac{1}{T-\tau}\sum_{t=\tau}^{T}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right]\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right]^{\top}\\ \mathbf{C}_{01}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\left[\boldsymbol{\chi}(t+\tau)-\boldsymbol{\mu}_{1}\right]^{\top} \end{aligned}

The Koopman matrix is then computed as follows:

\mathbf{K}=\mathbf{C}_{00}^{-1}\mathbf{C}_{01}
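The estimation steps above can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the toy process, the matrix `A`, and all variable names are hypothetical, and samples are stored as rows, so the outer products become `Xc.T @ Xc`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: a 2-D linear process chi(t+1) = chi(t) A + noise.
A = np.array([[0.9, 0.0],
              [0.0, 0.5]])
T, tau = 20000, 1
chi = np.zeros((T, 2))
for t in range(1, T):
    chi[t] = chi[t - 1] @ A + rng.normal(scale=0.1, size=2)

# Instantaneous frames exclude the last tau steps, lagged frames the first tau.
X, Y = chi[:-tau], chi[tau:]
n = len(X)

# Means and mean-free data.
mu0, mu1 = X.mean(axis=0), Y.mean(axis=0)
Xc, Yc = X - mu0, Y - mu1

# Instantaneous and time-lagged covariance matrices.
C00 = Xc.T @ Xc / n
C11 = Yc.T @ Yc / n
C01 = Xc.T @ Yc / n

# Koopman matrix K = C00^{-1} C01 (solve instead of forming the inverse).
K = np.linalg.solve(C00, C01)
print(K)
```

For this linear toy process the estimated `K` should be close to `A`, since the conditional expectation of the next frame is linear in the current one.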

It can be shown [1] that the leading singular functions of the half-weighted Koopman matrix

\bar{\mathbf{K}}:=\mathbf{C}_{00}^{-\frac{1}{2}}\mathbf{C}_{01}\mathbf{C}_{11}^{-\frac{1}{2}}

encode the best reduced dynamical model for the time series.

The singular functions can be computed by first performing the singular value decomposition

\bar{\mathbf{K}}=\mathbf{U}^{\prime}\mathbf{S}\mathbf{V}^{\prime\top}

and then mapping the input conformation to the left singular functions $\boldsymbol{\psi}$ and right singular functions $\boldsymbol{\phi}$ as follows:

\begin{aligned} \boldsymbol{\psi}(t)&:=\mathbf{U}^{\prime\top}\mathbf{C}_{00}^{-\frac{1}{2}}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\\ \boldsymbol{\phi}(t)&:=\mathbf{V}^{\prime\top}\mathbf{C}_{11}^{-\frac{1}{2}}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right] \end{aligned}
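A minimal NumPy sketch of the whitening, singular value decomposition, and projection steps follows. It is not the library's code: the helper `inv_sqrt`, the toy data, and the variable names are assumptions for illustration. Samples are stored as rows, so the column-vector formulas are applied from the right.

```python
import numpy as np

def inv_sqrt(C, epsilon=1e-6):
    """Inverse square root of a symmetric PSD matrix, dropping eigenvalues <= epsilon."""
    s, Q = np.linalg.eigh(C)
    keep = s > epsilon
    return Q[:, keep] @ np.diag(s[keep] ** -0.5) @ Q[:, keep].T

rng = np.random.default_rng(1)
T, tau = 5000, 5

# Hypothetical features: three independent autoregressive components.
decay = np.array([0.95, 0.8, 0.5])
noise = rng.normal(size=(T, 3))
chi = np.zeros((T, 3))
for t in range(1, T):
    chi[t] = decay * chi[t - 1] + noise[t]

X, Y = chi[:-tau], chi[tau:]
n = len(X)
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
C00, C11, C01 = Xc.T @ Xc / n, Yc.T @ Yc / n, Xc.T @ Yc / n

# Half-weighted Koopman matrix and its SVD, K_bar = U S V^T.
C00_isqrt, C11_isqrt = inv_sqrt(C00), inv_sqrt(C11)
K_bar = C00_isqrt @ C01 @ C11_isqrt
U, S, Vt = np.linalg.svd(K_bar)

# Left and right singular functions evaluated on the data (one row per frame).
psi = Xc @ C00_isqrt @ U
phi = Yc @ C11_isqrt @ Vt.T

print(S)  # singular values in descending order
```

In this sketch the components of `psi` are uncorrelated with unit variance, and the cross-covariance between `psi` and `phi` is the diagonal matrix of singular values.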

References

Attributes

`dim`	Dimension attribute.
`epsilon`	Eigenvalue cutoff.
`has_model`	Property reporting whether this estimator contains an estimated model.
`lagtime`	The lagtime under which covariances are estimated.
`model`	Shortcut to `fetch_model()`.
`scaling`	Scaling parameter to be applied to order parameters upon transformation.
`var_cutoff`	Variational cutoff which can be used to further restrict the dimension.

Methods

`covariance_estimator`(lagtime[, ncov])	Yields a properly configured covariance estimator so that its model can be used as input for the vamp estimator.
`fetch_model`()	Finalizes current model and yields new `CovarianceKoopmanModel`.
`fit`(data, *args, **kw)	Fits a new `CovarianceKoopmanModel` which can be obtained by a subsequent call to `fetch_model()`.
`fit_fetch`(data, **kwargs)	Fits the internal model on data and subsequently fetches it in one call.
`fit_from_covariances`(covariances)	Fits from existing covariance model (or covariance estimator containing model).
`fit_from_timeseries`(data[, weights])	Estimates a `CovarianceKoopmanModel` directly from time-series data using the `Covariance` estimator.
`fit_transform`(data[, fit_options, ...])	Fits a model which simultaneously functions as transformer and subsequently transforms the input data.
`get_params`([deep])	Get the parameters.
`partial_fit`(data)	Updates the covariance estimates through a new batch of data.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(data[, propagate])	Projects given timeseries onto dominant singular functions.

__call__(*args, **kwargs)

Call self as a function.

classmethod covariance_estimator(lagtime: int, ncov: int = inf)

Yields a properly configured covariance estimator so that its model can be used as input for the vamp estimator.

Parameters:

lagtime (int) – Positive integer denoting the time shift which is considered for autocorrelations.
ncov (int or float('inf'), optional, default=float('inf')) – Limit the memory usage of the algorithm from [3] to an amount that corresponds to ncov additional copies of each correlation matrix.

Returns:

estimator – Covariance estimator.

Return type:

Covariance

fetch_model() → CovarianceKoopmanModel

Finalizes current model and yields new CovarianceKoopmanModel.

Returns: model – The estimated model.
Return type: CovarianceKoopmanModel

fit(data, *args, **kw)

Fits a new CovarianceKoopmanModel which can be obtained by a subsequent call to fetch_model().

Parameters:

data (CovarianceModel or Covariance or timeseries) – Covariance matrices $C_{00}, C_{0t}, C_{tt}$ in the form of a CovarianceModel instance. To fit the model directly from data, see fit_from_timeseries(). Optionally, this can also be timeseries data directly, in which case a lagtime must be provided.
*args – Optional arguments
**kw – Ignored keyword arguments for scikit-learn compatibility.

Returns:

self – Reference to self.

Return type:

VAMP

Notes

If you run into memory problems, for example with multiple trajectories, you can reduce the memory load by calling partial_fit() on individual trajectories or by using it in conjunction with timeshifted_split:

>>> import numpy as np
>>> from deeptime.decomposition import VAMP
>>> from deeptime.util.data import timeshifted_split
>>> # random data, so that the estimated covariances are not degenerate
>>> data = np.random.default_rng(0).normal(size=(100, 5))
>>> estimator = VAMP(dim=1)
>>> for X, Y in timeshifted_split(data, lagtime=1, chunksize=40):
...     estimator.partial_fit((X, Y))
>>> joint_model = estimator.fetch_model()

fit_fetch(data, **kwargs)

Fits the internal model on data and subsequently fetches it in one call.

Parameters:

data (array_like) – Data that is used to fit the model.
**kwargs – Additional arguments to fit().

Returns:

The estimated model.

Return type:

model

fit_from_covariances(covariances: Covariance | CovarianceModel)

Fits from existing covariance model (or covariance estimator containing model).

Parameters: covariances (CovarianceModel or Covariance) – Covariance model containing covariances or Covariance estimator containing a covariance model. The model in particular has matrices $C_{00}, C_{0t}, C_{tt}$.
Returns: self – Reference to self.
Return type: VAMP

fit_from_timeseries(data, weights=None)

Estimates a CovarianceKoopmanModel directly from time-series data using the Covariance estimator. The parameters dim, scaling, and epsilon are taken from this estimator.

Parameters:

data – Input data, see to_dataset for options.
weights – See the Covariance estimator.

Returns:

self – Reference to self.

Return type:

VAMP

fit_transform(data, fit_options=None, transform_options=None)

Fits a model which simultaneously functions as transformer and subsequently transforms the input data. The estimated model can be accessed by calling fetch_model().

Parameters:

data (array_like) – The input data.
fit_options (dict, optional, default=None) – Optional keyword arguments passed on to the fit method.
transform_options (dict, optional, default=None) – Optional keyword arguments passed on to the transform method.

Returns:

output – Transformed data.

Return type:

array_like

get_params(deep=False)

Get the parameters.

Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

partial_fit(data)

Updates the covariance estimates through a new batch of data.

Parameters: data (tuple(ndarray, ndarray)) – A tuple of two ndarrays of the same shape, holding $X_t$ and $X_{t+\tau}$, respectively. Here, $\tau$ denotes the lagtime.
Returns: self – Reference to self.
Return type: VAMP
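The idea of updating covariance estimates batch by batch can be sketched with running sums. This is a simplified stand-in and not the algorithm the library uses: the class `RunningCovariances` is hypothetical, and accumulating raw moments like this is less numerically robust than a dedicated streaming algorithm.

```python
import numpy as np

class RunningCovariances:
    """Accumulates raw moments of (X_t, X_{t+tau}) pairs over batches."""

    def __init__(self, n_features):
        self.n = 0
        self.sx = np.zeros(n_features)
        self.sy = np.zeros(n_features)
        self.sxx = np.zeros((n_features, n_features))
        self.sxy = np.zeros((n_features, n_features))
        self.syy = np.zeros((n_features, n_features))

    def partial_fit(self, batch):
        x, y = batch  # two arrays of identical shape
        self.n += len(x)
        self.sx += x.sum(axis=0)
        self.sy += y.sum(axis=0)
        self.sxx += x.T @ x
        self.sxy += x.T @ y
        self.syy += y.T @ y
        return self

    def covariances(self):
        mu0, mu1 = self.sx / self.n, self.sy / self.n
        C00 = self.sxx / self.n - np.outer(mu0, mu0)
        C01 = self.sxy / self.n - np.outer(mu0, mu1)
        C11 = self.syy / self.n - np.outer(mu1, mu1)
        return C00, C01, C11

rng = np.random.default_rng(2)
data = rng.normal(size=(1000, 4))
tau = 2
X, Y = data[:-tau], data[tau:]

# Feed the pairs in chunks of 300 frames.
running = RunningCovariances(n_features=4)
for start in range(0, len(X), 300):
    running.partial_fit((X[start:start + 300], Y[start:start + 300]))
C00, C01, C11 = running.covariances()
```

The result matches the covariances computed from all pairs at once, while only one chunk has to be held in memory at a time.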

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object

transform(data, propagate=False)

Projects given timeseries onto dominant singular functions. This method dispatches to CovarianceKoopmanModel.transform().

Parameters:

data ((T, n) ndarray) – Input timeseries data.
propagate (bool, default=False) – Whether to apply the Koopman operator after data was transformed into the whitened feature space.

Returns:

Y – The projected data. If right is True, projection will be on the right singular functions. Otherwise, projection will be on the left singular functions.

Return type:

(T, m) ndarray

property dim: int | None

Dimension attribute. Can either be int or None. In case of

- int, it is interpreted as the actual dimension and must be strictly greater than 0,
- None, all numerically available components are used.

Getter: yields the dimension
Setter: sets a new dimension
Type: int or None

property epsilon

Eigenvalue cutoff.

Getter: Yields current eigenvalue cutoff.
Setter: Sets new eigenvalue cutoff.
Type: float

property has_model: bool

Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.

Type: bool

property lagtime: int | None

The lagtime under which covariances are estimated. Can be None in case covariances are provided directly instead of estimating them inside this estimator.

Getter: Yields the current lagtime.
Setter: Sets a new lagtime, must be positive.
Type: int or None

property model

Shortcut to fetch_model().

property scaling: str | None

Scaling parameter to be applied to order parameters upon transformation. Can be one of None, ‘kinetic_map’, or ‘km’.

Getter: Yields currently configured scaling parameter.
Setter: Sets a new scaling parameter (None, ‘kinetic_map’, or ‘km’)
Type: str or None
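The effect of the two scaling options on the variance of the order parameters can be illustrated as follows. The arrays `psi` and `sigma` are hypothetical stand-ins for unscaled order parameters and singular values; the sketch does not call the library.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical unscaled order parameters: three components, unit variance each.
psi = rng.normal(size=(10000, 3))
psi = (psi - psi.mean(axis=0)) / psi.std(axis=0)

# Hypothetical singular values, one per component.
sigma = np.array([0.9, 0.5, 0.1])

# Kinetic-map style scaling: multiply each component by its singular value.
psi_km = psi * sigma

print(psi.var(axis=0))     # unit variance without scaling
print(psi_km.var(axis=0))  # variance sigma**2 after scaling
```

After scaling, components with small singular values contribute little to Euclidean distances in the projected space.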

property var_cutoff: float | None

Variational cutoff which can be used to further restrict the dimension. This takes precedence over the dim property.

Getter: yields the currently set variational cutoff
Setter: sets a new cutoff
Type: float or None
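One way such a cutoff could translate into a number of dimensions is sketched below. It assumes that the kinetic variance of a component is its squared singular value and that the smallest number of components reaching the requested fraction is chosen; the function `select_dim` is hypothetical and the library's exact rule may differ.

```python
import numpy as np

def select_dim(singular_values, var_cutoff):
    """Smallest number of leading components whose cumulative squared
    singular values reach the fraction var_cutoff of the total."""
    cumvar = np.cumsum(np.asarray(singular_values, dtype=float) ** 2)
    cumvar /= cumvar[-1]
    # searchsorted returns the first index whose cumulative fraction is >= var_cutoff.
    return min(int(np.searchsorted(cumvar, var_cutoff)) + 1, len(cumvar))

# Hypothetical singular values in descending order.
sigma = np.array([0.9, 0.6, 0.3, 0.1])

dim_90 = select_dim(sigma, 0.90)
dim_95 = select_dim(sigma, 0.95)
dim_all = select_dim(sigma, 1.0)
print(dim_90, dim_95, dim_all)
```

With these values the first two components carry about 92% of the summed squared singular values, so a cutoff of 0.9 keeps two dimensions and a cutoff of 1.0 keeps all four.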