class CovarianceKoopmanModel¶
- class deeptime.decomposition.CovarianceKoopmanModel(instantaneous_coefficients, singular_values, timelagged_coefficients, cov, rank_0: int, rank_t: int, dim=None, var_cutoff=None, scaling=None, epsilon=1e-10, instantaneous_obs: ~typing.Callable[[~numpy.ndarray], ~numpy.ndarray] = <deeptime.basis._monomials.Identity object>, timelagged_obs: ~typing.Callable[[~numpy.ndarray], ~numpy.ndarray] = <deeptime.basis._monomials.Identity object>)¶
A type of Koopman model \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_{t})]\) which was obtained through diagonalization of covariance matrices. This leads to a Koopman operator which is a diagonal matrix and can be used to project onto specific processes of the system.
In particular, this model expects matrices \(U\) and \(V\) as well as singular values \(\sigma_i\), such that
\[\mathbb{E}[V^\top\chi_1 (x_{t+\tau})]=\mathbb{E}[g(x_{t+\tau})] \approx K^\top \mathbb{E}[f(x_{t})] = \mathrm{diag}(\sigma_i) \mathbb{E}[U^\top\chi_0(x_{t})], \]
where \(\chi_0,\chi_1\) are basis transformations of the full state \(x_t\).
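One way such matrices \(U\), \(V\) and singular values \(\sigma_i\) can arise is sketched below with plain NumPy: whiten the instantaneous and time-lagged covariances, then take an SVD of the whitened cross-covariance. This is an illustration under assumptions, not the library's implementation: the helper names `inv_sqrt` and `covariance_koopman_sketch` are hypothetical, \(\chi_0\) and \(\chi_1\) are taken to be the identity, and the data is a synthetic linear process.

```python
import numpy as np

def inv_sqrt(c, eps=1e-10):
    """Return L with L.T @ c @ L = I on the retained eigenspace of the symmetric matrix c."""
    w, q = np.linalg.eigh(c)
    keep = w > eps                          # cut off eigenvalues <= eps
    return q[:, keep] / np.sqrt(w[keep])    # scale each kept eigenvector by 1/sqrt(eigenvalue)

def covariance_koopman_sketch(x, lag, eps=1e-10):
    """Hypothetical helper: coefficient matrices U, V and singular values from one trajectory."""
    x0, xt = x[:-lag], x[lag:]
    x0c, xtc = x0 - x0.mean(axis=0), xt - xt.mean(axis=0)
    n = len(x0c)
    c00, c0t, ctt = x0c.T @ x0c / n, x0c.T @ xtc / n, xtc.T @ xtc / n
    l0, lt = inv_sqrt(c00, eps), inv_sqrt(ctt, eps)
    a, sigma, bt = np.linalg.svd(l0.T @ c0t @ lt, full_matrices=False)
    return l0 @ a, sigma, lt @ bt.T, (c00, c0t, ctt)

# Synthetic two-dimensional linear process (an assumption for illustration only).
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.0], [0.1, 0.5]])
x = np.zeros((5000, 2))
for t in range(1, len(x)):
    x[t] = x[t - 1] @ A + rng.normal(size=2)

U, sigma, V, (c00, c0t, ctt) = covariance_koopman_sketch(x, lag=1)
# In the bases defined by U and V, the time-lagged covariance is diagonal.
diagonalized = U.T @ c0t @ V
```

In this sketch `U.T @ c00 @ U` is the identity and `U.T @ c0t @ V` equals `diag(sigma)`, which is the diagonal Koopman operator described above.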
The estimators which produce this kind of model are
VAMP and TICA. For a description of parameters operator, basis_transform_forward, basis_transform_backward, and output_dimension, please see TransferOperatorModel.
- Parameters:
instantaneous_coefficients ((n, k) ndarray) – The coefficient matrix \(U\).
singular_values ((k,) ndarray) – Singular values \(\sigma_i\).
timelagged_coefficients – The coefficient matrix \(V\).
cov (CovarianceModel) – Covariances \(C_{00}\), \(C_{0t}\), and \(C_{tt}\).
rank_0 (int) – Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).
rank_t (int) – Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).
scaling (str or None, default=None) – Scaling parameter which was applied to singular values for additional structure in the projected space. See the respective estimator for details.
epsilon (float, default=1e-10) – Eigenvalue / singular value cutoff. Eigenvalues (or singular values) of \(C_{00}\) and \(C_{tt}\) with norms <= epsilon were cut off. The number of remaining eigenvalues, together with the value of dim, defines the effective output dimension.
instantaneous_obs (Callable, optional, default=identity) – Transforms the current state \(x_t\) to \(\chi_0(x_t)\). Defaults to \(\chi_0(x) = x\).
timelagged_obs (Callable, optional, default=identity) – Transforms the future state \(x_{t+\tau}\) to \(\chi_1(x_{t+\tau})\). Defaults to \(\chi_1(x) = x\).
Attributes
cov – Estimated covariances.
cov_00 – Shortcut to cov_00.
cov_0t – Shortcut to cov_0t.
cov_tt – Shortcut to cov_tt.
cumulative_kinetic_variance – Yields the cumulative kinetic variance.
dim – Dimension attribute.
epsilon – Singular value cutoff.
feature_component_correlation – Instantaneous correlation matrix between mean-free input features and projection components.
instantaneous_coefficients – Coefficient matrix \(U\).
instantaneous_obs – Transforms the current state \(x_t\) to \(f(x_t)\).
koopman_matrix – Same as operator.
lagtime – The lagtime corresponding to this model.
mean_0 – Shortcut to mean_0.
mean_t – Shortcut to mean_t.
operator – The operator \(K\) so that \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_t)]\) in transformed bases.
operator_inverse – Inverse of the operator \(K\), i.e., \(K^{-1}\).
output_dimension – The dimension of data after propagation by \(K\).
scaling – Scaling of projection.
singular_values – The singular values of the half-weighted Koopman matrix.
singular_vectors_left – Transformation matrix that represents the linear map from mean-free feature space to the space of left singular functions.
singular_vectors_right – Transformation matrix that represents the linear map from mean-free feature space to the space of right singular functions.
timelagged_coefficients – Coefficient matrix \(V\).
timelagged_obs – Transforms the future state \(x_{t+\tau}\) to \(g(x_{t+\tau})\).
var_cutoff – Variance cutoff parameter.
whitening_rank_0 – Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).
whitening_rank_t – Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).
Methods
backward(data[, propagate]) – Maps data backward in time.
ck_test(models[, test_model, include_lag0, ...]) – Returns a Chapman-Kolmogorov validator based on this estimator and a test model.
copy() – Makes a deep copy of this model.
effective_output_dimension(rank0, rankt, ...) – Computes effective output dimension.
expectation(observables, statistics[, ...]) – Compute future expectation of observable or covariance using the approximated Koopman operator.
forward(data[, propagate]) – Maps data forward in time.
get_params([deep]) – Get the parameters.
propagate(trajectory[, components]) – Applies the forward transform to the trajectory in non-transformed space.
score(r[, test_model, epsilon, dim]) – Compute the VAMP score between this model and potentially a test model for cross-validation.
set_params(**params) – Set the parameters of this estimator.
timescales([k, lagtime]) – Implied timescales of the TICA transformation.
transform(data, **kw) – Projects data onto the Koopman modes \(f(x) = U^\top \chi_0 (x)\), where \(U\) are the coefficients of the basis \(\chi_0\).
- __call__(*args, **kwargs)¶
Call self as a function.
- backward(data: ndarray, propagate=True)¶
Maps data backward in time.
- Parameters:
data ((T, n) ndarray) – Input data
propagate (bool, default=True) – Whether to apply the Koopman operator to the featurized data.
- Returns:
mapped_data – Mapped data.
- Return type:
(T, m) ndarray
- ck_test(models, test_model=None, include_lag0=True, n_observables=None, observables='phi', statistics='psi', progress=None)¶
Returns a Chapman-Kolmogorov validator based on this estimator and a test model.
- Parameters:
models (list of models) – Multiple models with different lagtimes to test against.
test_model (CovarianceKoopmanModel, optional, default=None) – The model that is tested. If not provided, uses this estimator’s encapsulated model.
include_lag0 (bool, optional, default=True) – Whether to include lagtime 0.
n_observables (int, optional, default=None) – Limit the number of default observables (and of default statistics) to this number. Only used if observables are None or statistics are None.
observables ((input_dimension, n_observables) ndarray) – Coefficients that express one or multiple observables in the basis of the input features.
statistics ((input_dimension, n_statistics) ndarray) – Coefficients that express one or multiple statistics in the basis of the input features.
progress (ProgressBar, optional, default=None) – Optional progress bar, tested for tqdm.
- Returns:
test – The test results
- Return type:
See also
Notes
This method computes two sets of time-lagged covariance matrices:
- Estimates at higher lag times:
\[\left\langle \mathbf{K}(n\tau)g_{i},f_{j}\right\rangle_{\rho_{0}}\]
where \(\rho_{0}\) is the empirical distribution implicitly defined by all data points from time steps 0 to T-tau in all trajectories, \(\mathbf{K}(n\tau)\) is a rank-reduced Koopman matrix estimated at the lag time n*tau, and g and f are functions of the data.
- Predictions at higher lag times:
\[\left\langle \mathbf{K}^{n}(\tau)g_{i},f_{j}\right\rangle_{\rho_{0}}\]
where \(\mathbf{K}^{n}\) is the n-th power of the rank-reduced Koopman matrix contained in self.
The Chapman-Kolmogorov test compares the predictions to the estimates.
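The comparison between predictions and estimates can be illustrated in one dimension, where the rank-reduced Koopman matrix collapses to a single time-lagged correlation. The following is a minimal NumPy sketch under that assumption; `lagged_correlation` is a hypothetical helper and the data is a synthetic autoregressive process, so this is not what `ck_test` does internally.

```python
import numpy as np

def lagged_correlation(x, lag):
    """Time-lagged correlation of a 1D series (the single singular value at this lag)."""
    x0, xt = x[:-lag], x[lag:]
    x0c, xtc = x0 - x0.mean(), xt - xt.mean()
    return (x0c @ xtc) / np.sqrt((x0c @ x0c) * (xtc @ xtc))

# Synthetic Markovian data: x[t] = 0.9 * x[t-1] + noise (illustrative assumption).
rng = np.random.default_rng(1)
noise = rng.normal(size=50_000)
x = np.zeros_like(noise)
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + noise[t]

tau = 1
k_tau = lagged_correlation(x, tau)          # model estimated at lag tau
results = {}
for n in (1, 2, 3, 5):
    prediction = k_tau ** n                 # K(tau)^n, extrapolated from the lag-tau model
    estimate = lagged_correlation(x, n * tau)   # K(n*tau), re-estimated at the longer lag
    results[n] = (prediction, estimate)
```

For a process that is Markovian at lag `tau`, the two numbers in each `results[n]` pair should agree up to statistical error; a systematic gap suggests the lag time is too short.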
- copy() → Model¶
Makes a deep copy of this model.
- Returns:
copy – A new copy of this model.
- Return type:
Model
- static effective_output_dimension(rank0, rankt, dim, var_cutoff, singular_values) → int¶
Computes effective output dimension.
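The following sketch shows one plausible rule for combining these arguments, based only on what this page states: the whitening ranks bound the dimension, and var_cutoff takes precedence over dim. It assumes that the cumulative kinetic variance is the normalized cumulative sum of squared singular values; both function names are hypothetical and the library's actual logic may differ in detail.

```python
import numpy as np

def cumulative_kinetic_variance_sketch(singular_values):
    """Assumed definition: normalized cumulative sum of squared singular values."""
    cumvar = np.cumsum(np.asarray(singular_values, dtype=float) ** 2)
    return cumvar / cumvar[-1]

def effective_output_dimension_sketch(rank0, rankt, dim, var_cutoff, singular_values):
    """Hypothetical re-implementation for illustration only."""
    max_dim = min(rank0, rankt)             # cannot exceed either whitening rank
    if var_cutoff is not None:              # variance cutoff takes precedence over dim
        cumvar = cumulative_kinetic_variance_sketch(singular_values)
        n_components = int(np.searchsorted(cumvar, var_cutoff)) + 1
        return min(n_components, max_dim)
    if dim is not None:
        return min(dim, max_dim)
    return max_dim

singular_values = [0.9, 0.5, 0.1]
all_components = effective_output_dimension_sketch(3, 3, None, None, singular_values)
fixed_dim = effective_output_dimension_sketch(3, 3, 2, None, singular_values)
by_variance = effective_output_dimension_sketch(3, 3, 1, 0.95, singular_values)
```

With these singular values, the first component alone carries roughly 76% of the kinetic variance, so a cutoff of 0.95 keeps two components even though `dim=1` was passed.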
- expectation(observables, statistics, lag_multiple=1, observables_mean_free=False, statistics_mean_free=False)¶
Compute future expectation of observable or covariance using the approximated Koopman operator.
- Parameters:
observables (np.ndarray((input_dimension, n_observables))) – Coefficients that express one or multiple observables in the basis of the input features.
statistics (np.ndarray((input_dimension, n_statistics)), optional) – Coefficients that express one or multiple statistics in the basis of the input features. This parameter can be None. In that case, this method returns the future expectation value of the observable(s).
lag_multiple (int) – If > 1, extrapolate to a multiple of the estimator’s lag time by assuming Markovianity of the approximated Koopman operator.
observables_mean_free (bool, default=False) – If true, coefficients in observables refer to the input features with feature means removed. If false, coefficients in observables refer to the unmodified input features.
statistics_mean_free (bool, default=False) – If true, coefficients in statistics refer to the input features with feature means removed. If false, coefficients in statistics refer to the unmodified input features.
- Returns:
expectation – The equilibrium expectation of observables or covariance if statistics is not None.
- Return type:
ndarray
Notes
A “future expectation” of an observable \(g\) is the average of \(g\) computed over a time window that has the same total length as the input data from which the Koopman operator was estimated but is shifted by lag_multiple*tau time steps into the future (where tau is the lag time).
It is computed with the equation:
\[\mathbb{E}[g]_{\rho_{n}}=\mathbf{q}^{T}\mathbf{P}^{n-1}\mathbf{e}_{1}\]
where
\[P_{ij}=\sigma_{i}\langle\psi_{i},\phi_{j}\rangle_{\rho_{1}}\]
and
\[q_{i}=\langle g,\phi_{i}\rangle_{\rho_{1}}\]
and \(\mathbf{e}_{1}\) is the first canonical unit vector.
A model prediction of time-lagged covariances between the observable \(f\) and the statistic \(g\) at a lag-time of lag_multiple*tau is computed with the equation:
\[\mathrm{cov}[g,\,f;n\tau]=\mathbf{q}^{T}\mathbf{P}^{n-1}\boldsymbol{\Sigma}\mathbf{r}\]
where \(r_{i}=\langle\psi_{i},f\rangle_{\rho_{0}}\) and \(\boldsymbol{\Sigma}=\mathrm{diag}(\boldsymbol{\sigma})\).
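To make the matrix algebra of the covariance prediction concrete, the sketch below evaluates \(\mathbf{q}^{T}\mathbf{P}^{n-1}\boldsymbol{\Sigma}\mathbf{r}\) for given arrays. The function name and all numerical values are hypothetical placeholders; how \(\mathbf{q}\), \(\mathbf{P}\), and \(\mathbf{r}\) are obtained from data is not shown here.

```python
import numpy as np

def predicted_covariance_sketch(q, P, sigma, r, n=1):
    """Evaluate q^T P^(n-1) diag(sigma) r for a lag multiple n >= 1."""
    propagated = np.linalg.matrix_power(P, n - 1)   # identity for n = 1
    return q @ propagated @ (sigma * r)             # sigma * r equals diag(sigma) @ r

# Placeholder inputs chosen only to exercise the formula.
sigma = np.array([1.0, 0.8])
P = np.array([[1.0, 0.0], [0.1, 0.7]])
q = np.array([0.5, 2.0])
r = np.array([0.3, 1.5])

cov_lag_1 = predicted_covariance_sketch(q, P, sigma, r, n=1)
cov_lag_3 = predicted_covariance_sketch(q, P, sigma, r, n=3)
```

For `n=1` no extrapolation takes place and the result reduces to \(\mathbf{q}^{T}\boldsymbol{\Sigma}\mathbf{r}\); larger `n` applies \(\mathbf{P}\) repeatedly, which is where the Markovianity assumption enters.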
- forward(data: ndarray, propagate=True)¶
Maps data forward in time.
- Parameters:
data ((T, n) ndarray) – Input data
propagate (bool, default=True) – Whether to apply the Koopman operator to the featurized data.
- Returns:
mapped_data – Mapped data.
- Return type:
(T, m) ndarray
- get_params(deep=False)¶
Get the parameters.
- Returns:
params – Parameter names mapped to their values.
- Return type:
mapping of string to any
- propagate(trajectory: ndarray, components: Optional[Union[int, List[int]]] = None) → ndarray¶
Applies the forward transform to the trajectory in non-transformed space. Given the Koopman operator \(\Sigma\), transformations \(V^\top - \mu_t\) and \(U^\top -\mu_0\) for bases \(g\) and \(f\), respectively, this is achieved by transforming each frame \(X_t\) with
\[\hat{X}_{t+\tau} = (V^\top)^{-1} \Sigma U^\top (X_t - \mu_0) + \mu_t. \]
If the model stems from a VAMP estimator, \(V\) are the left singular vectors, \(\Sigma\) the singular values, and \(U\) the right singular vectors.
- Parameters:
trajectory ((T, n) ndarray) – The input trajectory
components (int or list of int or None, default=None) – Optional arguments for the Koopman operator if appropriate. If the model stems from a VAMP estimator, these are the component(s) to project onto. If None, all processes are taken into account. If an int or a list of ints, all singular values except those at the given indices are set to zero.
- Returns:
predictions – The predicted trajectory.
- Return type:
(T, n) ndarray
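The propagation formula can be sketched in NumPy with frames stored as rows, so that the column-vector expression \((V^\top)^{-1} \Sigma U^\top (X_t - \mu_0)\) becomes a chain of right-multiplications. This is an illustration, not the library's code: `propagate_sketch` is a hypothetical name, and a pseudo-inverse stands in for \(V^{-1}\) in case \(V\) is not square.

```python
import numpy as np

def propagate_sketch(trajectory, U, sigma, V, mean_0, mean_t, components=None):
    """Predict frames one lag time ahead; optionally keep only selected components."""
    sigma = np.asarray(sigma, dtype=float)
    if components is not None:
        mask = np.zeros_like(sigma)
        mask[np.atleast_1d(components)] = 1.0
        sigma = sigma * mask                    # zero all but the selected singular values
    projected = (trajectory - mean_0) @ U       # each row is U^T (X_t - mu_0)
    return (projected * sigma) @ np.linalg.pinv(V) + mean_t

# Toy setup (assumption): identity coefficient matrices and zero means.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
identity = np.eye(2)
zero_mean = np.zeros(2)
sigma = [0.9, 0.5]

all_processes = propagate_sketch(X, identity, sigma, identity, zero_mean, zero_mean)
slow_only = propagate_sketch(X, identity, sigma, identity, zero_mean, zero_mean, components=0)
```

With identity coefficients each coordinate is simply scaled by its singular value, which makes the effect of `components` easy to see: `slow_only` retains the first coordinate and zeroes the second.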
- score(r: Union[float, str], test_model=None, epsilon=1e-06, dim=None)¶
Compute the VAMP score between this model and potentially a test model for cross-validation.
- Parameters:
r (float or str) –
The type of score to evaluate. Can be a floating point value greater than or equal to 1 or ‘E’, yielding the VAMP-r score or the VAMP-E score, respectively [1]. Typical choices are:
- ’VAMP1’: Sum of singular values of the half-weighted Koopman matrix. If the model is reversible, this is equal to the sum of Koopman matrix eigenvalues, also called Rayleigh quotient [1].
- ’VAMPE’: Approximation error of the estimated Koopman operator with respect to the true Koopman operator up to an additive constant [1].
test_model (CovarianceKoopmanModel, optional, default=None) –
If test_model is not None, this method computes the cross-validation score between self and test_model. It is assumed that self was estimated from the “training” data and test_model was estimated from the “test” data. The score is computed for one realization of self and test_model. Estimation of the average cross-validation score and partitioning of data into test and training part is not performed by this method.
If test_model is None, this method computes the VAMP score for the model contained in self.
epsilon (float, default=1e-6) – Regularization parameter for computing sqrt-inverses of spd matrices.
dim (int, optional, default=None) – How many components to use for scoring.
- Returns:
score – If test_model is not None, returns the cross-validation VAMP score between self and test_model. Otherwise return the selected VAMP-score of self.
- Return type:
float
Notes
The VAMP-\(r\) and VAMP-E scores are computed according to [1], Equation (33) and Equation (30), respectively.
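As a rough illustration of the VAMP-\(r\) idea for a single model without a test model, the sketch below sums the \(r\)-th powers of the singular values of the half-weighted Koopman matrix \(C_{00}^{-1/2} C_{0t} C_{tt}^{-1/2}\). All function names are hypothetical and the data consists of synthetic time-lagged pairs; the library's score may differ, for example in constant offsets, in the handling of dim, or in the cross-validated case.

```python
import numpy as np

def sym_inv_sqrt(c, eps=1e-6):
    """Regularized symmetric inverse square root of a symmetric positive-definite matrix."""
    w, q = np.linalg.eigh(c)
    keep = w > eps
    return (q[:, keep] / np.sqrt(w[keep])) @ q[:, keep].T

def half_weighted_koopman(x0, xt, eps=1e-6):
    """C00^{-1/2} C0t Ctt^{-1/2} estimated from paired samples (x_t, x_{t+tau})."""
    x0c, xtc = x0 - x0.mean(axis=0), xt - xt.mean(axis=0)
    n = len(x0c)
    c00, c0t, ctt = x0c.T @ x0c / n, x0c.T @ xtc / n, xtc.T @ xtc / n
    return sym_inv_sqrt(c00, eps) @ c0t @ sym_inv_sqrt(ctt, eps)

def vamp_r_sketch(koopman, r=2):
    """Sum of the r-th powers of the singular values."""
    singular_values = np.linalg.svd(koopman, compute_uv=False)
    return float(np.sum(singular_values ** r))

# Synthetic pairs with a known linear relation (illustrative assumption).
rng = np.random.default_rng(2)
x0 = rng.normal(size=(2000, 3))
xt = 0.8 * x0 + 0.6 * rng.normal(size=(2000, 3))

K = half_weighted_koopman(x0, xt)
vamp1 = vamp_r_sketch(K, r=1)
vamp2 = vamp_r_sketch(K, r=2)
```

In this sketch VAMP-1 coincides with the nuclear norm of the half-weighted Koopman matrix and VAMP-2 with its squared Frobenius norm.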
References
- set_params(**params)¶
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
object
- timescales(k=None, lagtime: Optional[int] = None) → ndarray¶
Implied timescales of the TICA transformation.
For each \(i\)-th eigenvalue, this returns
\[t_i = -\frac{\tau}{\log(|\lambda_i|)}\]
where \(\tau\) is the lagtime of the TICA object and \(\lambda_i\) is the i-th eigenvalue of the TICA object.
- Parameters:
k (int, optional, default=None) – Number of timescales to be returned. By default with respect to all available singular values.
lagtime (int, optional, default=None) – The lagtime with respect to which to compute the timescale. If None, this defaults to the lagtime under which the covariances were estimated.
- Returns:
timescales – NumPy array with the implied timescales. In principle, one should expect as many timescales as there are input coordinates. However, fewer eigenvalues will be returned if the TICA matrices were not full rank or dim contained a floating point percentage, i.e., was interpreted as variance cutoff.
- Return type:
(n,) np.array
- Raises:
ValueError – If any of the singular values is not real, i.e., has a non-zero imaginary component.
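The timescale formula is simple enough to evaluate directly. A minimal NumPy sketch follows; `implied_timescales_sketch` is a hypothetical stand-in that mirrors the documented behavior, including the error for complex values, and does not call the library.

```python
import numpy as np

def implied_timescales_sketch(eigenvalues, lagtime, k=None):
    """t_i = -lagtime / log(|lambda_i|) for the first k eigenvalues (all if k is None)."""
    eigenvalues = np.asarray(eigenvalues)
    if np.any(np.iscomplex(eigenvalues)):
        raise ValueError("Eigenvalues must be real to define implied timescales.")
    magnitudes = np.abs(eigenvalues.real)[:k]
    return -lagtime / np.log(magnitudes)

# Eigenvalues constructed so that the timescales at lagtime 1 are 10 and 2 steps.
eigenvalues = np.exp([-1.0 / 10.0, -1.0 / 2.0])
timescales = implied_timescales_sketch(eigenvalues, lagtime=1)
slowest_only = implied_timescales_sketch(eigenvalues, lagtime=1, k=1)
```

Eigenvalues close to 1 correspond to slow processes with long timescales, while eigenvalues near 0 decay within roughly one lag time.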
- transform(data: ndarray, **kw)¶
Projects data onto the Koopman modes \(f(x) = U^\top \chi_0 (x)\), where \(U\) are the coefficients of the basis \(\chi_0\).
- Parameters:
data ((T, n) ndarray) – Input data.
- Returns:
transformed_data – Data projected onto the Koopman modes.
- Return type:
(T, k) ndarray
- property cov: CovarianceModel¶
Estimated covariances.
- property cumulative_kinetic_variance: ndarray¶
Yields the cumulative kinetic variance.
- property dim: Optional[int]¶
Dimension attribute. Can either be int or None. If int, it is used as the actual dimension and must be strictly greater than 0. If None, all numerically available components are used.
- Getter:
yields the dimension
- Setter:
sets a new dimension
- Type:
int or None
- property epsilon: float¶
Singular value cutoff.
- property feature_component_correlation¶
Instantaneous correlation matrix between mean-free input features and projection components.
Denoting the input features as \(X_i\) and the projection components as \(\theta_j\), the instantaneous, linear correlation between them can be written as
\[\mathbf{Corr}(X_i - \mu_i, \mathbf{\theta}_j) = \frac{1}{\sigma_{X_i - \mu_i}}\sum_l \sigma_{(X_i - \mu_i)(X_l - \mu_l)} \mathbf{U}_{lj} \]
The matrix \(\mathbf{U}\) is the matrix containing the eigenvectors of the generalized eigenvalue problem as column vectors.
- Returns:
corr – Correlation matrix between input features and projection components. There is a row for each feature and a column for each component.
- Return type:
ndarray(n,m)
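The correlation formula amounts to multiplying the instantaneous covariance matrix by \(\mathbf{U}\) and dividing each row by the standard deviation of the corresponding feature. The NumPy sketch below checks this against `np.corrcoef` on synthetic data. It assumes projection components with unit variance, which holds here because the illustrative \(\mathbf{U}\) is built from a whitening of \(C_{00}\); the variable names are not taken from the library.

```python
import numpy as np

# Synthetic correlated features (illustrative assumption).
rng = np.random.default_rng(3)
mixing = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 2.0]])
X = rng.normal(size=(1000, 3)) @ mixing
Xc = X - X.mean(axis=0)                         # mean-free input features
c00 = Xc.T @ Xc / len(Xc)

# Two unit-variance projection components from a whitening of C00.
w, q = np.linalg.eigh(c00)
U = (q / np.sqrt(w))[:, :2]

# Corr(X_i, theta_j) = sum_l C00[i, l] * U[l, j] / std(X_i)
feature_std = np.sqrt(np.diag(c00))
corr = (c00 @ U) / feature_std[:, None]

# Reference: empirical correlation between features and projected components.
theta = Xc @ U
reference = np.corrcoef(Xc.T, theta.T)[:3, 3:]
```

The result has one row per input feature and one column per component, matching the return shape described above.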
- property instantaneous_coefficients: ndarray¶
Coefficient matrix \(U\).
- property instantaneous_obs: Callable[[ndarray], ndarray]¶
Transforms the current state \(x_t\) to \(f(x_t)\). Defaults to f(x) = x.
- property lagtime¶
The lagtime corresponding to this model. See also CovarianceModel.lagtime.
- property operator: ndarray¶
The operator \(K\) so that \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_t)]\) in transformed bases.
- property operator_inverse: ndarray¶
Inverse of the operator \(K\), i.e., \(K^{-1}\). Potentially pseudo-inverse instead of true inverse.
- property output_dimension¶
The dimension of data after propagation by \(K\).
- property scaling: Optional[str]¶
Scaling of projection. Can be None, ‘kinetic map’, or ‘km’.
- property singular_values: ndarray¶
The singular values of the half-weighted Koopman matrix.
- property singular_vectors_left: ndarray¶
Transformation matrix that represents the linear map from mean-free feature space to the space of left singular functions.
- property singular_vectors_right: ndarray¶
Transformation matrix that represents the linear map from mean-free feature space to the space of right singular functions.
- property timelagged_coefficients: ndarray¶
Coefficient matrix \(V\).
- property timelagged_obs: Callable[[ndarray], ndarray]¶
Transforms the future state \(x_{t+\tau}\) to \(g(x_{t+\tau})\). Defaults to g(x) = x.
- property var_cutoff: Optional[float]¶
Variance cutoff parameter. Can be set to include dimensions up to a certain threshold. Takes precedence over the dim parameter.
- Getter:
Yields the current variance cutoff.
- Setter:
Sets a new variance cutoff or disables variance cutoff by setting the value to None.
- Type:
float or None
- property whitening_rank_0: int¶
Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).
- property whitening_rank_t: int¶
Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).