class VAMP
- class deeptime.decomposition.VAMP(lagtime: Optional[int] = None, dim: Optional[int] = None, var_cutoff: Optional[float] = None, scaling: Optional[str] = None, epsilon: float = 1e-06, observable_transform: Callable[[ndarray], ndarray] = <deeptime.basis._monomials.Identity object>)
Variational approach for Markov processes (VAMP).
The implementation is based on [1], [2].
- Parameters:
lagtime (int or None, optional, default=None) – The lagtime under which covariances are estimated. This is only relevant when estimating from data, in case covariances are provided this should either be None or exactly the value that was used to estimate said covariances.
dim (int, optional, default=None) – Number of dimensions to keep:
- If dim is not set (None), all available ranks are kept: n_components == min(n_samples, n_uncorrelated_features).
- If dim is an integer >= 1, it specifies the number of dimensions to keep.
var_cutoff (float, optional, default=None) – Determines the number of output dimensions by including dimensions until their cumulative kinetic variance exceeds the fraction var_cutoff of the total kinetic variance. var_cutoff=1.0 means that all numerically available dimensions (see epsilon) are used, unless dim is set. Setting var_cutoff to a value smaller than 1.0 is mutually exclusive with setting dim.
scaling (str, optional, default=None) – Scaling to be applied to the VAMP order parameters upon transformation:
- None: no scaling is applied; the variance of the order parameters is 1.
- 'kinetic_map' or 'km': order parameters are scaled by singular value. Only the left singular functions induce a kinetic map with respect to the conventional forward propagator. The right singular functions induce a kinetic map with respect to the backward propagator.
epsilon (float, optional, default=1e-6) – Eigenvalue cutoff. Eigenvalues of \(C_{00}\) and \(C_{11}\) with norms <= epsilon will be cut off. The remaining number of eigenvalues together with the value of dim define the size of the output.
observable_transform (callable, optional, default=Identity) – A feature transformation on the raw data which is used to estimate the model.
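As a rough illustration of how dim and var_cutoff determine the output size, the following NumPy sketch picks the number of components from a set of singular values. The helper name select_dimension is hypothetical, and treating the normalized cumulative sum of squared singular values as the "cumulative kinetic variance" is an assumption made for this sketch, not a statement about the library's internals.

```python
import numpy as np


def select_dimension(singular_values, dim=None, var_cutoff=None):
    """Hypothetical helper: number of components to keep for given settings."""
    s = np.asarray(singular_values, dtype=float)
    if dim is not None and var_cutoff is not None and var_cutoff < 1.0:
        raise ValueError("dim and var_cutoff < 1.0 are mutually exclusive")
    if dim is not None:
        # an explicit dimension wins, capped at what is numerically available
        return min(int(dim), s.size)
    if var_cutoff is None or var_cutoff >= 1.0:
        # keep everything that is numerically available
        return s.size
    # assumption: kinetic variance of a component is its squared singular value
    cumvar = np.cumsum(s ** 2)
    cumvar /= cumvar[-1]
    # first index at which the cumulative fraction reaches var_cutoff
    return int(np.searchsorted(cumvar, var_cutoff) + 1)


singular_values = np.array([0.9, 0.5, 0.1, 0.01])
n_keep = select_dimension(singular_values, var_cutoff=0.95)
print(n_keep)
```

With these toy singular values, the first two components already carry more than 95% of the summed squared singular values, so two dimensions are kept.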
See also
CovarianceKoopmanModel
type of model produced by this estimator
Notes
VAMP is a method for dimensionality reduction of Markov processes.
The Koopman operator \(\mathcal{K}\) is an integral operator that describes conditional future expectation values. Let \(p(\mathbf{x},\,\mathbf{y})\) be the conditional probability density of visiting an infinitesimal phase space volume around point \(\mathbf{y}\) at time \(t+\tau\) given that the phase space point \(\mathbf{x}\) was visited at the earlier time \(t\). Then the action of the Koopman operator on a function \(f\) can be written as follows:
\[\mathcal{K}f=\int p(\mathbf{x},\,\mathbf{y})f(\mathbf{y})\,\mathrm{d}\mathbf{y}=\mathbb{E}\left[f(\mathbf{x}_{t+\tau})\mid\mathbf{x}_{t}=\mathbf{x}\right]\]
The Koopman operator is defined without any reference to an equilibrium distribution. It is therefore well defined even when the dynamics are irreversible and/or non-stationary, such that no equilibrium distribution exists.
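For intuition, here is a discrete-state analogue of the integral above, written as a small NumPy sketch. The three-state transition matrix P and the observable f are made-up illustration values; in this finite setting the integral becomes a matrix-vector product.

```python
import numpy as np

# Row-stochastic transition matrix (illustrative values):
# P[x, y] = probability of being in state y at time t + tau given state x at time t.
P = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
])

# An observable f defined on the three states.
f = np.array([1.0, 0.0, -1.0])

# (K f)(x) = sum_y P[x, y] f(y) = E[f(x_{t+tau}) | x_t = x]
Kf = P @ f
print(Kf)
```

Note that only the conditional probabilities enter; no stationary distribution of P is needed to evaluate the conditional expectation.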
If we approximate \(f\) by a linear superposition of ansatz functions \(\boldsymbol{\chi}\) of the conformational degrees of freedom (features), the operator \(\mathcal{K}\) can be approximated by a (finite-dimensional) matrix \(\mathbf{K}\).
The approximation is computed as follows: From the time-dependent input features \(\boldsymbol{\chi}(t)\), we compute the mean \(\boldsymbol{\mu}_{0}\) (\(\boldsymbol{\mu}_{1}\)) from all data excluding the last (first) \(\tau\) steps of every trajectory as follows:
\[\begin{aligned} \boldsymbol{\mu}_{0}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\boldsymbol{\chi}(t)\\ \boldsymbol{\mu}_{1}&:=\frac{1}{T-\tau}\sum_{t=\tau}^{T}\boldsymbol{\chi}(t) \end{aligned}\]
Next, we compute the instantaneous covariance matrices \(\mathbf{C}_{00}\) and \(\mathbf{C}_{11}\) and the time-lagged covariance matrix \(\mathbf{C}_{01}\) as follows:
\[ \begin{aligned} \mathbf{C}_{00}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]^{\top}\\ \mathbf{C}_{11}&:=\frac{1}{T-\tau}\sum_{t=\tau}^{T}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right]\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right]^{\top}\\ \mathbf{C}_{01}&:=\frac{1}{T-\tau}\sum_{t=0}^{T-\tau}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\left[\boldsymbol{\chi}(t+\tau)-\boldsymbol{\mu}_{1}\right]^{\top} \end{aligned}\]
The Koopman matrix is then computed as follows:
\[\mathbf{K}=\mathbf{C}_{00}^{-1}\mathbf{C}_{01}\]
It can be shown [1] that the leading singular functions of the half-weighted Koopman matrix
\[\bar{\mathbf{K}}:=\mathbf{C}_{00}^{-\frac{1}{2}}\mathbf{C}_{01}\mathbf{C}_{11}^{-\frac{1}{2}}\]
encode the best reduced dynamical model for the time series.
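The estimation steps can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the toy trajectory, the helper inv_sqrt, and the variable names are invented for this example, and the sketch continues through the singular value decomposition and projection that the following paragraphs describe.

```python
import numpy as np


def inv_sqrt(C, epsilon=1e-6):
    """Symmetric inverse square root; eigenvalues <= epsilon are discarded."""
    w, V = np.linalg.eigh(C)
    keep = w > epsilon
    return (V[:, keep] / np.sqrt(w[keep])) @ V[:, keep].T


rng = np.random.default_rng(0)
tau = 1

# Toy trajectory: a linear autoregressive process with three features.
A = np.diag([0.95, 0.5, 0.1])
chi = np.zeros((5000, 3))
for t in range(1, len(chi)):
    chi[t] = chi[t - 1] @ A + rng.normal(size=3)

# Instantaneous (X) and time-lagged (Y) data, their means and covariances.
X, Y = chi[:-tau], chi[tau:]
mu0, mu1 = X.mean(axis=0), Y.mean(axis=0)
Xc, Yc = X - mu0, Y - mu1
n = len(X)
C00 = Xc.T @ Xc / n
C11 = Yc.T @ Yc / n
C01 = Xc.T @ Yc / n

# Koopman matrix and its half-weighted counterpart.
K = np.linalg.solve(C00, C01)
K_bar = inv_sqrt(C00) @ C01 @ inv_sqrt(C11)

# Singular value decomposition and projection onto singular functions.
U, S, Vt = np.linalg.svd(K_bar)
psi = Xc @ inv_sqrt(C00) @ U      # left singular functions, one row per frame
phi = Yc @ inv_sqrt(C11) @ Vt.T   # right singular functions, one row per frame
```

In this sketch the projected coordinates are whitened (unit covariance), and the time-lagged cross-covariance between psi and phi is diagonal with the singular values on the diagonal, which is a convenient sanity check.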
The singular functions can be computed by first performing the singular value decomposition
\[\bar{\mathbf{K}}=\mathbf{U}^{\prime}\mathbf{S}\mathbf{V}^{\prime\top}\]
and then mapping the input conformation to the left singular functions \(\boldsymbol{\psi}\) and right singular functions \(\boldsymbol{\phi}\) as follows:
\[\begin{aligned} \boldsymbol{\psi}(t)&:=\mathbf{U}^{\prime\top}\mathbf{C}_{00}^{-\frac{1}{2}}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{0}\right]\\ \boldsymbol{\phi}(t)&:=\mathbf{V}^{\prime\top}\mathbf{C}_{11}^{-\frac{1}{2}}\left[\boldsymbol{\chi}(t)-\boldsymbol{\mu}_{1}\right] \end{aligned}\]
References
Attributes
dim – Dimension attribute.
epsilon – Eigenvalue cutoff.
has_model – Property reporting whether this estimator contains an estimated model.
lagtime – The lagtime under which covariances are estimated.
model – Shortcut to fetch_model().
scaling – Scaling parameter to be applied to order parameters upon transformation.
var_cutoff – Variational cutoff which can be used to further restrict the dimension.
Methods
covariance_estimator(lagtime[, ncov]) – Yields a properly configured covariance estimator so that its model can be used as input for the VAMP estimator.
fetch_model() – Finalizes current model and yields new CovarianceKoopmanModel.
fit(data, *args, **kw) – Fits a new CovarianceKoopmanModel which can be obtained by a subsequent call to fetch_model().
fit_fetch(data, **kwargs) – Fits the internal model on data and subsequently fetches it in one call.
fit_from_covariances(covariances) – Fits from existing covariance model (or covariance estimator containing model).
fit_from_timeseries(data[, weights]) – Estimates a CovarianceKoopmanModel directly from time-series data using the Covariance estimator.
fit_transform(data[, fit_options, ...]) – Fits a model which simultaneously functions as transformer and subsequently transforms the input data.
get_params([deep]) – Get the parameters.
partial_fit(data) – Updates the covariance estimates through a new batch of data.
set_params(**params) – Set the parameters of this estimator.
transform(data[, propagate]) – Projects given timeseries onto dominant singular functions.
- __call__(*args, **kwargs)
Call self as a function.
- classmethod covariance_estimator(lagtime: int, ncov: int = inf)
Yields a properly configured covariance estimator so that its model can be used as input for the VAMP estimator.
- Parameters:
lagtime (int) – Positive integer denoting the time shift which is considered for autocorrelations.
ncov (int or float('inf'), optional, default=float('inf')) – Limit the memory usage of the algorithm from [3] to an amount that corresponds to ncov additional copies of each correlation matrix.
- Returns:
estimator – Covariance estimator.
- Return type:
Covariance
- fetch_model() → CovarianceKoopmanModel
Finalizes current model and yields new CovarianceKoopmanModel.
- Returns:
model – The estimated model.
- Return type:
CovarianceKoopmanModel
- fit(data, *args, **kw)
Fits a new CovarianceKoopmanModel which can be obtained by a subsequent call to fetch_model().
- Parameters:
data (CovarianceModel or Covariance or timeseries) – Covariance matrices \(C_{00}, C_{0t}, C_{tt}\) in the form of a CovarianceModel instance. If the model should be fitted directly from data, please see fit_from_timeseries(). Optionally, this can also be timeseries data directly, in which case a lagtime must be provided.
*args – Optional arguments.
**kw – Ignored keyword arguments for scikit-learn compatibility.
- Returns:
self – Reference to self.
- Return type:
VAMP
Notes
If you run into memory problems, for example with multiple trajectories, you can decrease memory load by using partial_fit() on individual trajectories or in conjunction with timeshifted_split:
>>> import numpy as np
>>> from deeptime.decomposition import TICA
>>> from deeptime.util.data import timeshifted_split
>>> estimator = TICA(dim=1)
>>> for X, Y in timeshifted_split(np.ones(shape=(100, 5)), lagtime=1, chunksize=40):
...     estimator.partial_fit((X, Y))
>>> joint_model = estimator.fetch_model()
- fit_fetch(data, **kwargs)
Fits the internal model on data and subsequently fetches it in one call.
- Parameters:
data (array_like) – Data that is used to fit the model.
**kwargs – Additional arguments to fit().
- Returns:
The estimated model.
- Return type:
model
- fit_from_covariances(covariances: Union[Covariance, CovarianceModel])
Fits from existing covariance model (or covariance estimator containing model).
- Parameters:
covariances (CovarianceModel or Covariance) – Covariance model containing covariances or Covariance estimator containing a covariance model. The model in particular has matrices \(C_{00}, C_{0t}, C_{tt}\).
- Returns:
self – Reference to self.
- Return type:
VAMP
- fit_from_timeseries(data, weights=None)
Estimates a CovarianceKoopmanModel directly from time-series data using the Covariance estimator. For the parameters dim, scaling, and epsilon, see the class-level parameter descriptions.
- Parameters:
data – Input data, see to_dataset for options.
weights – See the Covariance estimator.
- Returns:
self – Reference to self.
- Return type:
VAMP
- fit_transform(data, fit_options=None, transform_options=None)
Fits a model which simultaneously functions as transformer and subsequently transforms the input data. The estimated model can be accessed by calling fetch_model().
- Parameters:
data (array_like) – The input data.
fit_options (dict, optional, default=None) – Optional keyword arguments passed on to the fit method.
transform_options (dict, optional, default=None) – Optional keyword arguments passed on to the transform method.
- Returns:
output – Transformed data.
- Return type:
array_like
- get_params(deep=False)
Get the parameters.
- Returns:
params – Parameter names mapped to their values.
- Return type:
mapping of string to any
- partial_fit(data)
Updates the covariance estimates through a new batch of data.
- Parameters:
data (tuple(ndarray, ndarray)) – A tuple of two ndarrays of the same shape, holding \(X_t\) and \(X_{t+\tau}\), respectively. Here, \(\tau\) denotes the lagtime.
- Returns:
self – Reference to self.
- Return type:
VAMP
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
object
- transform(data, propagate=False)
Projects given timeseries onto dominant singular functions. This method dispatches to CovarianceKoopmanModel.transform().
- Parameters:
data ((T, n) ndarray) – Input timeseries data.
propagate (bool, default=False) – Whether to apply the Koopman operator after data was transformed into the whitened feature space.
- Returns:
Y – The projected data. If right is True, projection will be on the right singular functions. Otherwise, projection will be on the left singular functions.
- Return type:
(T, m) ndarray
- property dim: Optional[int]
Dimension attribute. Can be either int or None. If an int, it is taken as the actual dimension and must be strictly greater than 0; if None, all numerically available components are used.
- Getter:
yields the dimension
- Setter:
sets a new dimension
- Type:
int or None
- property epsilon
Eigenvalue cutoff.
- Getter:
Yields current eigenvalue cutoff.
- Setter:
Sets new eigenvalue cutoff.
- Type:
float
- property has_model: bool
Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.
- Type:
bool
- property lagtime: Optional[int]
The lagtime under which covariances are estimated. Can be None in case covariances are provided directly instead of estimating them inside this estimator.
- Getter:
Yields the current lagtime.
- Setter:
Sets a new lagtime, must be positive.
- Type:
int or None
- property model
Shortcut to fetch_model().
- property scaling: Optional[str]
Scaling parameter to be applied to order parameters upon transformation. Can be one of None, 'kinetic_map', or 'km'.
- Getter:
Yields currently configured scaling parameter.
- Setter:
Sets a new scaling parameter (None, 'kinetic_map', or 'km').
- Type:
str or None