class CovarianceKoopmanModel

class deeptime.decomposition.CovarianceKoopmanModel(instantaneous_coefficients, singular_values, timelagged_coefficients, cov, rank_0: int, rank_t: int, dim=None, var_cutoff=None, scaling=None, epsilon=1e-10, instantaneous_obs: Callable[[ndarray], ndarray] = <deeptime.basis._monomials.Identity object>, timelagged_obs: Callable[[ndarray], ndarray] = <deeptime.basis._monomials.Identity object>)

A type of Koopman model \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_{t})]\) which was obtained through diagonalization of covariance matrices. This leads to a Koopman operator which is a diagonal matrix and can be used to project onto specific processes of the system.

In particular, this model expects matrices \(U\) and \(V\) as well as singular values \(\sigma_i\), such that

\[\mathbb{E}[V^\top\chi_1 (x_{t+\tau})]=\mathbb{E}[g(x_{t+\tau})] \approx K^\top \mathbb{E}[f(x_{t})] = \mathrm{diag}(\sigma_i) \mathbb{E}[U^\top\chi_0(x_{t})], \]

where \(\chi_0,\chi_1\) are basis transformations of the full state \(x_t\).

The estimators which produce this kind of model are VAMP and TICA.
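The diagonal structure can be illustrated with plain NumPy. The following is a minimal sketch, not deeptime's implementation: it assumes identity bases \(\chi_0(x) = \chi_1(x) = x\), full-rank covariances, and hypothetical helper names (`inv_sqrt`, `diagonalize_covariances`). It whitens \(C_{00}\) and \(C_{tt}\), takes an SVD of the half-weighted matrix, and checks that \(U^\top C_{0t} V\) is diagonal.

```python
import numpy as np

def inv_sqrt(c, epsilon=1e-10):
    """Inverse square root of a symmetric PSD matrix; eigenvalues <= epsilon are dropped."""
    w, q = np.linalg.eigh(c)
    keep = w > epsilon
    return q[:, keep] @ np.diag(1.0 / np.sqrt(w[keep])) @ q[:, keep].T

def diagonalize_covariances(c00, c0t, ctt, epsilon=1e-10):
    """Return (U, sigma, V) such that U^T C0t V = diag(sigma)."""
    w0 = inv_sqrt(c00, epsilon)
    wt = inv_sqrt(ctt, epsilon)
    u_w, sigma, vt_w = np.linalg.svd(w0 @ c0t @ wt)
    return w0 @ u_w, sigma, wt @ vt_w.T

# Toy data: a noisy linear system x_{t+1} = A x_t + noise (illustration only).
rng = np.random.default_rng(0)
a = np.array([[0.9, 0.1], [0.0, 0.5]])
traj = np.zeros((2000, 2))
for t in range(1, len(traj)):
    traj[t] = a @ traj[t - 1] + rng.normal(size=2)

x0 = traj[:-1] - traj[:-1].mean(axis=0)   # mean-free instantaneous data
xt = traj[1:] - traj[1:].mean(axis=0)     # mean-free time-lagged data
c00 = x0.T @ x0 / len(x0)
c0t = x0.T @ xt / len(x0)
ctt = xt.T @ xt / len(x0)

U, sigma, V = diagonalize_covariances(c00, c0t, ctt)
diagonal_check = U.T @ c0t @ V            # should equal diag(sigma)
whitening_check = U.T @ c00 @ U           # should equal the identity
```

In this basis the operator is simply `np.diag(sigma)`, which is why individual processes can be selected by keeping or zeroing single singular values.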

For a description of parameters operator, basis_transform_forward, basis_transform_backward, and output_dimension: please see TransferOperatorModel.

Parameters:
  • instantaneous_coefficients ((n, k) ndarray) – The coefficient matrix \(U\).

  • singular_values ((k,) ndarray) – Singular values \(\sigma_i\).

  • timelagged_coefficients – The coefficient matrix \(V\).

  • cov (CovarianceModel) – Covariances \(C_{00}\), \(C_{0t}\), and \(C_{tt}\).

  • rank_0 (int) – Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).

  • rank_t (int) – Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).

  • scaling (str or None, default=None) – Scaling parameter which was applied to singular values for additional structure in the projected space. See the respective estimator for details.

  • epsilon (float, default=1e-10) – Eigenvalue / singular value cutoff. Eigenvalues (or singular values) of \(C_{00}\) and \(C_{tt}\) with norms <= epsilon were cut off. The remaining number of eigenvalues together with the value of dim define the effective output dimension.

  • instantaneous_obs (Callable, optional, default=identity) – Transforms the current state \(x_t\) to \(\chi_0(x_t)\). Defaults to \(\chi_0(x) = x\).

  • timelagged_obs (Callable, optional, default=identity) – Transforms the future state \(x_{t+\tau}\) to \(\chi_1(x_{t+\tau})\). Defaults to \(\chi_1(x) = x\).

Attributes

cov

Estimated covariances.

cov_00

Shortcut to cov_00.

cov_0t

Shortcut to cov_0t.

cov_tt

Shortcut to cov_tt.

cumulative_kinetic_variance

Yields the cumulative kinetic variance.

dim

Dimension attribute.

epsilon

Singular value cutoff.

feature_component_correlation

Instantaneous correlation matrix between mean-free input features and projection components.

instantaneous_coefficients

Coefficient matrix \(U\).

instantaneous_obs

Transforms the current state \(x_t\) to \(f(x_t)\).

koopman_matrix

Same as operator.

lagtime

The lagtime corresponding to this model.

mean_0

Shortcut to mean_0.

mean_t

Shortcut to mean_t.

operator

The operator \(K\) so that \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_t)]\) in transformed bases.

operator_inverse

Inverse of the operator \(K\), i.e., \(K^{-1}\).

output_dimension

The dimension of data after propagation by \(K\).

scaling

Scaling of projection.

singular_values

The singular values of the half-weighted Koopman matrix.

singular_vectors_left

Transformation matrix that represents the linear map from mean-free feature space to the space of left singular functions.

singular_vectors_right

Transformation matrix that represents the linear map from mean-free feature space to the space of right singular functions.

timelagged_coefficients

Coefficient matrix \(V\).

timelagged_obs

Transforms the future state \(x_{t+\tau}\) to \(g(x_{t+\tau})\).

var_cutoff

Variance cutoff parameter.

whitening_rank_0

Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).

whitening_rank_t

Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).

Methods

backward(data[, propagate])

Maps data backward in time.

ck_test(models[, test_model, include_lag0, ...])

Returns a Chapman-Kolmogorov validator based on this estimator and a test model.

copy()

Makes a deep copy of this model.

effective_output_dimension(rank0, rankt, ...)

Computes effective output dimension.

expectation(observables, statistics[, ...])

Compute future expectation of observable or covariance using the approximated Koopman operator.

forward(data[, propagate])

Maps data forward in time.

get_params([deep])

Get the parameters.

propagate(trajectory[, components])

Applies the forward transform to the trajectory in non-transformed space.

score(r[, test_model, epsilon, dim])

Compute the VAMP score between this model and potentially a test model for cross-validation.

set_params(**params)

Set the parameters of this estimator.

timescales([k, lagtime])

Implied timescales of the TICA transformation

transform(data, **kw)

Projects data onto the Koopman modes \(f(x) = U^\top \chi_0 (x)\), where \(U\) are the coefficients of the basis \(\chi_0\).

__call__(*args, **kwargs)

Call self as a function.

backward(data: ndarray, propagate=True)

Maps data backward in time.

Parameters:
  • data ((T, n) ndarray) – Input data

  • propagate (bool, default=True) – Whether to apply the Koopman operator to the featurized data.

Returns:

mapped_data – Mapped data.

Return type:

(T, m) ndarray

ck_test(models, test_model=None, include_lag0=True, n_observables=None, observables='phi', statistics='psi', progress=None)

Returns a Chapman-Kolmogorov validator based on this estimator and a test model.

Parameters:
  • models (list of models) – Multiple models with different lagtimes to test against.

  • test_model (CovarianceKoopmanModel, optional, default=None) – The model that is tested. If not provided, uses this estimator’s encapsulated model.

  • include_lag0 (bool, optional, default=True) – Whether to include lagtime 0.

  • n_observables (int, optional, default=None) – Limit the number of default observables (and of default statistics) to this number. Only used if observables are None or statistics are None.

  • observables ((input_dimension, n_observables) ndarray) – Coefficients that express one or multiple observables in the basis of the input features.

  • statistics ((input_dimension, n_statistics) ndarray) – Coefficients that express one or multiple statistics in the basis of the input features.

  • progress (ProgressBar, optional, default=None) – Optional progress bar, tested for tqdm.

Returns:

test – The test results

Return type:

deeptime.util.validation.ChapmanKolmogorovTest

See also

ck_test

Notes

This method computes two sets of time-lagged covariance matrices

  • estimates at higher lag times:

    \[\left\langle \mathbf{K}(n\tau)g_{i},f_{j}\right\rangle_{\rho_{0}}\]

    where \(\rho_{0}\) is the empirical distribution implicitly defined by all data points from time steps 0 to T-tau in all trajectories, \(\mathbf{K}(n\tau)\) is a rank-reduced Koopman matrix estimated at the lag-time n*tau and g and f are some functions of the data.

  • predictions at higher lag times:

    \[\left\langle \mathbf{K}^{n}(\tau)g_{i},f_{j}\right\rangle_{\rho_{0}}\]

    where \(\mathbf{K}^{n}\) is the n’th power of the rank-reduced Koopman matrix contained in self.

The Chapman-Kolmogorov test is to compare the predictions to the estimates.
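The comparison of predictions and estimates can be sketched on a toy problem. The example below is an illustration under simplifying assumptions, not the `ck_test` implementation: it uses a two-state Markov chain and a count-based transition matrix as a stand-in for the rank-reduced Koopman matrix, and the helper `count_transition_matrix` is hypothetical.

```python
import numpy as np

def count_transition_matrix(states, lag, n_states):
    """Row-normalized transition counts at the given lag."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-lag], states[lag:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

# Simulate a two-state Markov chain with a known transition matrix.
rng = np.random.default_rng(1)
p = np.array([[0.95, 0.05], [0.10, 0.90]])
states = np.zeros(50000, dtype=int)
draws = rng.random(len(states))
for t in range(1, len(states)):
    # Jump to state 1 if the draw exceeds the probability of going to state 0.
    states[t] = int(draws[t] > p[states[t - 1], 0])

n = 3
estimate = count_transition_matrix(states, lag=n, n_states=2)      # K(n * tau)
prediction = np.linalg.matrix_power(
    count_transition_matrix(states, lag=1, n_states=2), n)         # K(tau)^n
max_deviation = np.abs(estimate - prediction).max()
```

For Markovian data the deviation stays within statistical error; a large deviation indicates that the model at lag \(\tau\) does not describe the dynamics at longer lag times.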

copy() → Model

Makes a deep copy of this model.

Returns:

A new copy of this model.

Return type:

copy

static effective_output_dimension(rank0, rankt, dim, var_cutoff, singular_values) → int

Computes effective output dimension.

expectation(observables, statistics, lag_multiple=1, observables_mean_free=False, statistics_mean_free=False)

Compute future expectation of observable or covariance using the approximated Koopman operator.

Parameters:
  • observables (np.ndarray((input_dimension, n_observables))) – Coefficients that express one or multiple observables in the basis of the input features.

  • statistics (np.ndarray((input_dimension, n_statistics)), optional) – Coefficients that express one or multiple statistics in the basis of the input features. This parameter can be None. In that case, this method returns the future expectation value of the observable(s).

  • lag_multiple (int) – If > 1, extrapolate to a multiple of the estimator’s lag time by assuming Markovianity of the approximated Koopman operator.

  • observables_mean_free (bool, default=False) – If true, coefficients in observables refer to the input features with feature means removed. If false, coefficients in observables refer to the unmodified input features.

  • statistics_mean_free (bool, default=False) – If true, coefficients in statistics refer to the input features with feature means removed. If false, coefficients in statistics refer to the unmodified input features.

Returns:

expectation – The equilibrium expectation of observables or covariance if statistics is not None.

Return type:

ndarray

Notes

A “future expectation” of an observable \(g\) is the average of \(g\) computed over a time window that has the same total length as the input data from which the Koopman operator was estimated but is shifted by lag_multiple*tau time steps into the future (where tau is the lag time).

It is computed with the equation:

\[\mathbb{E}[g]_{\rho_{n}}=\mathbf{q}^{T}\mathbf{P}^{n-1}\mathbf{e}_{1}\]

where

\[P_{ij}=\sigma_{i}\langle\psi_{i},\phi_{j}\rangle_{\rho_{1}}\]

and

\[q_{i}=\langle g,\phi_{i}\rangle_{\rho_{1}}\]

and \(\mathbf{e}_{1}\) is the first canonical unit vector.

A model prediction of time-lagged covariances between the observable \(f\) and the statistic \(g\) at a lag-time of lag_multiple*tau is computed with the equation:

\[\mathrm{cov}[g,\,f;n\tau]=\mathbf{q}^{T}\mathbf{P}^{n-1}\boldsymbol{\Sigma}\mathbf{r}\]

where \(r_{i}=\langle\psi_{i},f\rangle_{\rho_{0}}\) and \(\boldsymbol{\Sigma}=\mathrm{diag(\boldsymbol{\sigma})}\) .

forward(data: ndarray, propagate=True)

Maps data forward in time.

Parameters:
  • data ((T, n) ndarray) – Input data

  • propagate (bool, default=True) – Whether to apply the Koopman operator to the featurized data.

Returns:

mapped_data – Mapped data.

Return type:

(T, m) ndarray
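A minimal sketch of what the forward map amounts to for a diagonal operator, assuming identity bases, mean subtraction before projection, and a hypothetical function name `forward_sketch` (this is not the library code):

```python
import numpy as np

def forward_sketch(data, U, sigma, mean_0, propagate=True):
    """Featurize row-wise with U, then optionally apply K = diag(sigma)."""
    out = (data - mean_0) @ U              # f(x_t) = U^T (x_t - mu_0) for each row
    if propagate:
        out = out @ np.diag(sigma)         # apply the diagonal operator
    return out

X = np.array([[1.0, 2.0], [3.0, 4.0]])
mean_0 = np.array([2.0, 3.0])
U = np.eye(2)                              # trivial coefficients for illustration
sigma = np.array([0.9, 0.5])

features = forward_sketch(X, U, sigma, mean_0, propagate=False)
propagated = forward_sketch(X, U, sigma, mean_0, propagate=True)
```

With `propagate=False` only the featurization is applied; with `propagate=True` each component is additionally scaled by its singular value.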

get_params(deep=False)

Get the parameters.

Returns:

params – Parameter names mapped to their values.

Return type:

mapping of string to any

propagate(trajectory: ndarray, components: Optional[Union[int, List[int]]] = None) → ndarray

Applies the forward transform to the trajectory in non-transformed space. Given the Koopman operator \(\Sigma\), transformations \(U^\top - \mu_0\) and \(V^\top - \mu_t\) for bases \(f\) and \(g\), respectively, this is achieved by transforming each frame \(X_t\) with

\[\hat{X}_{t+\tau} = (V^\top)^{-1} \Sigma U^\top (X_t - \mu_0) + \mu_t. \]

If the model stems from a VAMP estimator, \(V\) are the left singular vectors, \(\Sigma\) the singular values, and \(U\) the right singular vectors.

Parameters:
  • trajectory ((T, n) ndarray) – The input trajectory

  • components (int or list of int or None, default=None) – Optional arguments for the Koopman operator if appropriate. If the model stems from a VAMP estimator, these are the component(s) to project onto. If None, all processes are taken into account. If an int or a list of ints, all singular values except those at the given component indices are set to zero.

Returns:

predictions – The predicted trajectory.

Return type:

(T, n) ndarray
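The propagation formula can be written in a few lines of NumPy. This is a sketch under stated assumptions rather than deeptime's code: identity bases, a pseudo-inverse in place of \((V^\top)^{-1}\), data stored row-wise, and a hypothetical function name `propagate_sketch`.

```python
import numpy as np

def propagate_sketch(X, U, sigma, V, mean_0, mean_t, components=None):
    """Row-wise version of (V^T)^+ Sigma U^T (x - mu_0) + mu_t."""
    sigma = np.asarray(sigma, dtype=float)
    if components is not None:
        # Keep only the selected processes, zero all other singular values.
        mask = np.zeros_like(sigma)
        mask[np.atleast_1d(components)] = 1.0
        sigma = sigma * mask
    return (X - mean_0) @ U @ np.diag(sigma) @ np.linalg.pinv(V) + mean_t

# Toy setup: whitened, mean-free data where C_0t = A^T, so U, sigma, V
# follow from an SVD of A^T and the exact prediction is x -> A x.
A = np.array([[0.9, 0.1], [0.0, 0.5]])
U, sigma, Vt = np.linalg.svd(A.T)
V = Vt.T
X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]])

full = propagate_sketch(X, U, sigma, V, mean_0=0.0, mean_t=0.0)
slowest = propagate_sketch(X, U, sigma, V, mean_0=0.0, mean_t=0.0, components=[0])
```

Passing `components=[0]` restricts the prediction to the process with the largest singular value, which yields a rank-1 approximation of the full propagation.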

score(r: Union[float, str], test_model=None, epsilon=1e-06, dim=None)

Compute the VAMP score between this model and potentially a test model for cross-validation.

Parameters:
  • r (float or str) –

    The type of score to evaluate. Can be a floating point value greater than or equal to 1 or ‘E’, yielding the VAMP-r score or the VAMP-E score, respectively [1]. Typical choices are:

    • ’VAMP1’ Sum of singular values of the half-weighted Koopman matrix. If the model is reversible, this is equal to the sum of Koopman matrix eigenvalues, also called Rayleigh quotient [1].

    • ’VAMP2’ Sum of squared singular values of the half-weighted Koopman matrix [1]. If the model is reversible, this is equal to the kinetic variance [2].

    • ’VAMPE’ Approximation error of the estimated Koopman operator with respect to the true Koopman operator up to an additive constant [1].

  • test_model (CovarianceKoopmanModel, optional, default=None) –

    If test_model is not None, this method computes the cross-validation score between self and test_model. It is assumed that self was estimated from the “training” data and test_model was estimated from the “test” data. The score is computed for one realization of self and test_model. Estimation of the average cross-validation score and partitioning of data into test and training part is not performed by this method.

    If test_model is None, this method computes the VAMP score for the model contained in self.

  • epsilon (float, default=1e-6) – Regularization parameter for computing sqrt-inverses of spd matrices.

  • dim (int, optional, default=None) – How many components to use for scoring.

Returns:

score – If test_model is not None, returns the cross-validation VAMP score between self and test_model. Otherwise return the selected VAMP-score of self.

Return type:

float

Notes

The VAMP-\(r\) and VAMP-E scores are computed according to [1], Equation (33) and Equation (30), respectively.
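For a single model without a test model, the VAMP-\(r\) score described above reduces to a sum over powers of singular values. The sketch below follows that description only; it is not the library routine, the helper names are hypothetical, and it assumes that no constant offset is added to the score.

```python
import numpy as np

def inv_sqrt(c, epsilon=1e-6):
    """Regularized inverse square root of a symmetric PSD matrix."""
    w, q = np.linalg.eigh(c)
    keep = w > epsilon
    return q[:, keep] @ np.diag(1.0 / np.sqrt(w[keep])) @ q[:, keep].T

def vamp_r_score_sketch(c00, c0t, ctt, r=2, epsilon=1e-6):
    """Sum of the r-th powers of the singular values of C00^{-1/2} C0t Ctt^{-1/2}."""
    k_half = inv_sqrt(c00, epsilon) @ c0t @ inv_sqrt(ctt, epsilon)
    sigma = np.linalg.svd(k_half, compute_uv=False)
    return float(np.sum(sigma ** r))

# Already whitened toy covariances, so the singular values are 0.9 and 0.5.
c00 = np.eye(2)
ctt = np.eye(2)
c0t = np.diag([0.9, 0.5])

vamp1 = vamp_r_score_sketch(c00, c0t, ctt, r=1)   # 0.9 + 0.5
vamp2 = vamp_r_score_sketch(c00, c0t, ctt, r=2)   # 0.81 + 0.25
```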

References

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

object

timescales(k=None, lagtime: Optional[int] = None) → ndarray

Implied timescales of the TICA transformation

For each \(i\)-th eigenvalue, this returns

\[t_i = -\frac{\tau}{\log(|\lambda_i|)}\]

where \(\tau\) is the lagtime of the TICA object and \(\lambda_i\) is the i-th eigenvalue of the TICA object.

Parameters:
  • k (int, optional, default=None) – Number of timescales to be returned. By default with respect to all available singular values.

  • lagtime (int, optional, default=None) – The lagtime with respect to which to compute the timescale. If None, this defaults to the lagtime under which the covariances were estimated.

Returns:

timescales – NumPy array with the implied timescales. In principle, one should expect as many timescales as there are input coordinates. However, fewer timescales are returned if the TICA matrices were not full rank or if dim contained a floating point percentage, i.e., was interpreted as variance cutoff.

Return type:

(n,) np.array

Raises:

ValueError – If any of the singular values is not real, i.e., has a non-zero imaginary component.
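The timescale formula is straightforward to evaluate by hand. A small sketch, with a hypothetical function name and the singular values standing in for \(\lambda_i\):

```python
import numpy as np

def implied_timescales_sketch(singular_values, lagtime, k=None):
    """t_i = -lagtime / log(|lambda_i|) for the first k values."""
    values = np.asarray(singular_values)
    if np.any(np.iscomplex(values)):
        raise ValueError("singular values must be real")
    values = np.real(values)[:k]           # k=None keeps all values
    return -lagtime / np.log(np.abs(values))

timescales = implied_timescales_sketch([0.9, 0.5], lagtime=10)
first_only = implied_timescales_sketch([0.9, 0.5], lagtime=10, k=1)
```

Values close to 1 correspond to slow processes: with a lag time of 10 steps, a value of 0.9 gives a timescale of roughly 95 steps, while 0.5 gives roughly 14 steps.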

transform(data: ndarray, **kw)

Projects data onto the Koopman modes \(f(x) = U^\top \chi_0 (x)\), where \(U\) are the coefficients of the basis \(\chi_0\).

Parameters:

data ((T, n) ndarray) – Input data.

Returns:

transformed_data – Data projected onto the Koopman modes.

Return type:

(T, k) ndarray

property cov: CovarianceModel

Estimated covariances.

property cov_00: ndarray

Shortcut to cov_00.

property cov_0t: ndarray

Shortcut to cov_0t.

property cov_tt: ndarray

Shortcut to cov_tt.

property cumulative_kinetic_variance: ndarray

Yields the cumulative kinetic variance.

property dim: Optional[int]

Dimension attribute. Can either be int or None. In case of

  • int it evaluates it as the actual dimension, must be strictly greater than 0,

  • None all numerically available components are used.

Getter:

yields the dimension

Setter:

sets a new dimension

Type:

int or None

property epsilon: float

Singular value cutoff.

property feature_component_correlation

Instantaneous correlation matrix between mean-free input features and projection components.

Denoting the input features as \(X_i\) and the projection components as \(\theta_j\), the instantaneous, linear correlation between them can be written as

\[\mathbf{Corr}(X_i - \mu_i, \mathbf{\theta}_j) = \frac{1}{\sigma_{X_i - \mu_i}}\sum_l \sigma_{(X_i - \mu_i)(X_l - \mu_l)} \mathbf{U}_{lj} \]

The matrix \(\mathbf{U}\) is the matrix containing the eigenvectors of the generalized eigenvalue problem as column vectors.

Returns:

corr – Correlation matrix between input features and projection components. There is a row for each feature and a column for each component.

Return type:

ndarray(n,m)
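In matrix form, the correlation above is the instantaneous covariance times the coefficient matrix, with each row divided by the standard deviation of the corresponding feature. A sketch with a hypothetical function name, assuming the projection components have unit variance (\(U^\top C_{00} U = I\)):

```python
import numpy as np

def feature_component_correlation_sketch(c00, U):
    """corr[i, j] = sum_l C00[i, l] * U[l, j] / std(X_i)."""
    feature_std = np.sqrt(np.diag(c00))
    return (c00 @ U) / feature_std[:, None]

# Toy covariance and whitening coefficients U = C00^{-1/2}.
c00 = np.array([[2.0, 0.5], [0.5, 1.0]])
w, q = np.linalg.eigh(c00)
U = q @ np.diag(1.0 / np.sqrt(w)) @ q.T

corr = feature_component_correlation_sketch(c00, U)
row_norms = (corr ** 2).sum(axis=1)
```

Because the components in this toy setup span the full feature space and are uncorrelated with unit variance, the squared correlations in each row sum to one.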

property instantaneous_coefficients: ndarray

Coefficient matrix \(U\).

property instantaneous_obs: Callable[[ndarray], ndarray]

Transforms the current state \(x_t\) to \(f(x_t)\). Defaults to f(x) = x.

property koopman_matrix: ndarray

Same as operator.

property lagtime

The lagtime corresponding to this model. See also CovarianceModel.lagtime.

property mean_0: ndarray

Shortcut to mean_0.

property mean_t: ndarray

Shortcut to mean_t.

property operator: ndarray

The operator \(K\) so that \(\mathbb{E}[g(x_{t+\tau})] = K^\top \mathbb{E}[f(x_t)]\) in transformed bases.

property operator_inverse: ndarray

Inverse of the operator \(K\), i.e., \(K^{-1}\). Potentially pseudo-inverse instead of true inverse.

property output_dimension

The dimension of data after propagation by \(K\).

property scaling: Optional[str]

Scaling of projection. Can be None, ‘kinetic map’, or ‘km’

property singular_values: ndarray

The singular values of the half-weighted Koopman matrix.

property singular_vectors_left: ndarray

Transformation matrix that represents the linear map from mean-free feature space to the space of left singular functions.

property singular_vectors_right: ndarray

Transformation matrix that represents the linear map from mean-free feature space to the space of right singular functions.

property timelagged_coefficients: ndarray

Coefficient matrix \(V\).

property timelagged_obs: Callable[[ndarray], ndarray]

Transforms the future state \(x_{t+\tau}\) to \(g(x_{t+\tau})\). Defaults to g(x) = x.

property var_cutoff: Optional[float]

Variance cutoff parameter. Can be set to include dimensions up to a certain threshold. Takes precedence over the dim parameter.

Getter:

Yields the current variance cutoff.

Setter:

Sets a new variance cutoff or disables variance cutoff by setting the value to None.

Type:

float or None
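How a variance cutoff might translate into an output dimension can be sketched as follows. This is an illustration of the precedence rule, not the library's implementation; the function names are hypothetical and the exact boundary handling is an assumption.

```python
import numpy as np

def cumulative_kinetic_variance_sketch(singular_values):
    """Normalized cumulative sum of squared singular values."""
    cumvar = np.cumsum(np.asarray(singular_values, dtype=float) ** 2)
    return cumvar / cumvar[-1]

def effective_dimension_sketch(rank0, rankt, dim, var_cutoff, singular_values):
    """var_cutoff takes precedence over dim; both are capped by the whitening ranks."""
    max_dim = min(rank0, rankt)
    if var_cutoff is not None:
        cumvar = cumulative_kinetic_variance_sketch(singular_values)
        # Smallest number of components whose cumulative variance reaches the cutoff.
        n = int(np.searchsorted(cumvar, var_cutoff)) + 1
        return min(n, max_dim)
    if dim is not None:
        return min(dim, max_dim)
    return max_dim

sv = [0.9, 0.5, 0.1]
by_cutoff = effective_dimension_sketch(3, 3, dim=1, var_cutoff=0.95, singular_values=sv)
by_dim = effective_dimension_sketch(3, 3, dim=1, var_cutoff=None, singular_values=sv)
by_rank = effective_dimension_sketch(3, 3, dim=None, var_cutoff=None, singular_values=sv)
```

Here the first component carries about 76% of the kinetic variance and the first two about 99%, so a cutoff of 0.95 selects two components even though `dim=1` was also given.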

property whitening_rank_0: int

Rank of the instantaneous whitening transformation \(C_{00}^{-1/2}\).

property whitening_rank_t: int

Rank of the time-lagged whitening transformation \(C_{tt}^{-1/2}\).