Dimension reduction

Here we introduce the dimension reduction / decomposition techniques implemented in the package.

Koopman operator methods

All methods contained in this sub-package relate to the Koopman operator \(\mathcal{K}_\tau\) defined as

\[[\mathcal{K}_\tau g](x) = \mathbb{E}[g(x_{t+\tau}) \mid x_t = x] \]

for a process \(\{x_t\}_{t\geq 0}\) with transition density \(p_\tau(x, y)\). When projecting \(\mathcal{K}_\tau\) into a finite basis, one seeks

\[\mathbb{E}[g(x_{t+\tau})] = K^{\top}\mathbb{E}[f(x_t)], \]

where \(K\in\mathbb{R}^{n\times m}\) is a finite-dimensional Koopman matrix which propagates the observable \(f\) of the system’s state \(x_t\) to the observable \(g\) at state \(x_{t+\tau}\).
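As a minimal sketch of this projection (not the package's implementation), consider a scalar AR(1) process \(x_{t+1} = a\,x_t + \text{noise}\) and choose \(f(x) = g(x) = x\), so that \(K\) reduces to a single number. The sketch assumes the standard least-squares estimate \(K = C_{00}^{-1} C_{0\tau}\); the function name, the value of `a`, and the sample size are illustrative choices.

```python
import random

def estimate_scalar_koopman(traj, lagtime):
    """Least-squares estimate K = C_00^{-1} C_0tau for the observable f(x) = g(x) = x."""
    x0 = traj[:-lagtime]   # samples of x_t
    xt = traj[lagtime:]    # samples of x_{t+tau}
    c00 = sum(u * u for u in x0) / len(x0)
    c0t = sum(u * v for u, v in zip(x0, xt)) / len(x0)
    return c0t / c00

# Simulate a zero-mean AR(1) process with a fixed seed for reproducibility.
rng = random.Random(0)
a = 0.9
traj = [0.0]
for _ in range(20000):
    traj.append(a * traj[-1] + rng.gauss(0.0, 1.0))

k1 = estimate_scalar_koopman(traj, lagtime=1)
k3 = estimate_scalar_koopman(traj, lagtime=3)
print(k1, k3)
```

For this process the estimates should lie close to \(a\) and \(a^3\), since the conditional expectation of \(x_{t+\tau}\) given \(x_t\) is \(a^\tau x_t\).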

When to use which method

All methods assume (approximate) Markovianity of the time series under lag-time \(\tau\).




TICA [1] [2]

The time series should be stationary with symmetric covariances (equivalently: reversible with detailed balance), and the Koopman operator should be compact (guaranteed in stochastic systems).

  • Based on variational principle

  • TICA uses the assumptions as prior and thus can yield more interpretable results than its generalization VAMP.

  • Might yield biased results if the observed process contains rare events which are not sufficiently reflected in the time-series.

  • Dual to DMD: The estimated matrix is the transpose of the one estimated by DMD; TICA estimates eigenfunctions and DMD estimates ‘modes’, the coefficients which lead to the approximate Koopman operator using the eigenfunctions.

  • Eigenvalues of the decomposition relate to relaxation timescales.

  • Can identify metastable sets.

  • Is VAMP if the system is reversible and the ansatz library contains only the full state observable \(\Psi(x) = x\)

  • Can deal with a large amount of frames due to online estimation of covariances
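The relation between eigenvalues and relaxation timescales mentioned in the list above is commonly written as \(t_i = -\tau / \ln|\lambda_i(\tau)|\). A small sketch of that formula follows; the helper name is hypothetical and not part of the package.

```python
import math

def implied_timescale(eigenvalue, lagtime):
    """Relaxation timescale t = -lagtime / ln|eigenvalue|, in the time units of lagtime."""
    magnitude = abs(eigenvalue)
    if not 0.0 < magnitude < 1.0:
        raise ValueError("expected 0 < |eigenvalue| < 1 for a decaying process")
    return -lagtime / math.log(magnitude)

# An eigenvalue of exp(-tau / 50) at lag tau = 10 corresponds to a timescale of 50.
timescale = implied_timescale(math.exp(-10 / 50), lagtime=10)
```

Eigenvalues close to one therefore indicate slow processes, while eigenvalues close to zero decay within a few lag times.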

VAMP [3]

Assumes a compact Koopman operator (guaranteed in stochastic systems).

  • Based on variational principle

  • Is canonical correlation analysis (CCA) in time, i.e., time-lagged CCA (TCCA)

  • Is VAC under reversible dynamics and VAMP-1 score

  • Uses a singular value decomposition of covariances instead of eigenvalue decomposition.

  • Deals with off-equilibrium data consistently.

  • The singular functions can be clustered to find metastable and coherent sets.

  • Equivalent to EDMD in case of the VAMP-1 score

  • Uses ansatz library \(\Psi\) of functions as basis

  • Can deal with a large amount of frames due to online estimation of covariances
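To illustrate the "CCA in time" view in the simplest setting, the sketch below computes the correlation between mean-free \(x_t\) and \(x_{t+\tau}\) for a scalar observable. With a single basis function this is the only singular value of the whitened covariance \(C_{00}^{-1/2} C_{0\tau} C_{\tau\tau}^{-1/2}\). It is a pure-Python illustration under these assumptions, not the package's estimator; the function name and the test process are hypothetical.

```python
import math
import random

def timelagged_correlation(traj, lagtime):
    """Correlation of x_t and x_{t+lagtime}: the scalar case of C00^{-1/2} C0t Ctt^{-1/2}."""
    x0 = traj[:-lagtime]
    xt = traj[lagtime:]
    n = len(x0)
    mean0 = sum(x0) / n
    meant = sum(xt) / n
    c00 = sum((u - mean0) ** 2 for u in x0) / n
    ctt = sum((v - meant) ** 2 for v in xt) / n
    c0t = sum((u - mean0) * (v - meant) for u, v in zip(x0, xt)) / n
    return c0t / math.sqrt(c00 * ctt)

# Scalar AR(1) test process with coefficient 0.8 and a fixed seed.
rng = random.Random(1)
traj = [0.0]
for _ in range(20000):
    traj.append(0.8 * traj[-1] + rng.gauss(0.0, 1.0))

sigma_1 = timelagged_correlation(traj, lagtime=1)
sigma_5 = timelagged_correlation(traj, lagtime=5)
```

For this process the value is expected near \(0.8\) at lag 1 and near \(0.8^5\) at lag 5, so the same data yield different singular values at different lag times.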

kernel VAMP / kernel CCA [4]

Assumes a compact Koopman operator (guaranteed in stochastic systems).

  • kernelized version of VAMP with \(k(x, x') = \langle\Psi(x), \Psi(x')\rangle\)

  • memory requirements grow quadratically with the number of frames

  • generalization of kernel EDMD to nonreversible dynamics
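The kernel identity \(k(x, x') = \langle\Psi(x), \Psi(x')\rangle\) can be checked by hand for a small case. The sketch below assumes a scalar state and the polynomial kernel \((1 + x x')^2\), whose explicit feature map is \(\Psi(x) = (1, \sqrt{2}\,x, x^2)\); both are chosen purely for illustration.

```python
import math

def poly_kernel(x, y):
    """Polynomial kernel of degree 2 on scalars."""
    return (1.0 + x * y) ** 2

def feature_map(x):
    """Explicit features whose inner product reproduces poly_kernel."""
    return (1.0, math.sqrt(2.0) * x, x * x)

def inner(u, v):
    """Euclidean inner product of two equally long tuples."""
    return sum(a * b for a, b in zip(u, v))

x, y = 0.7, -1.3
kernel_value = poly_kernel(x, y)
feature_value = inner(feature_map(x), feature_map(y))
```

Both values agree up to rounding. The kernelized methods only ever evaluate the left-hand side, which is what allows very large or implicit ansatz libraries.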

DMD [5] [6]

  • dual to TICA (DMD yields coefficients of eigenfunctions, TICA yields eigenfunctions)

  • not an online algorithm but better suited for very high-dimensional data with a lower number of frames

EDMD [7]

  • Equivalent to VAMP under the VAMP-1 score

  • Uses an ansatz library \(\Psi\)

  • not an online algorithm but better suited for very high-dimensional data with a lower number of frames

kernel EDMD [8]

  • kernelized version of EDMD with \(k(x, x') = \langle\Psi(x), \Psi(x')\rangle\)

  • memory requirements grow quadratically with the number of frames

  • considering generator kernel EDMD, SINDy arises as special case [9]
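The quadratic memory growth of the kernel methods follows from the Gram matrix, which holds one kernel evaluation per pair of frames. A back-of-the-envelope sketch, assuming a single dense matrix of 8-byte floats (the helper name is hypothetical; an actual estimator may hold several such matrices):

```python
def gram_matrix_bytes(n_frames, bytes_per_entry=8):
    """Memory of one dense n_frames x n_frames Gram matrix."""
    return n_frames * n_frames * bytes_per_entry

# Print the estimated size in gigabytes for a few trajectory lengths.
for n in (1_000, 10_000, 100_000):
    print(n, gram_matrix_bytes(n) / 1e9, "GB")
```

Doubling the number of frames quadruples the requirement, which is why these methods are typically applied to subsampled data.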

Note that scores, where available, are not comparable across different lag times because they relate to different operators.

What’s next?

While dimensionality reduction is useful in its own right because it makes the data easier to inspect, one can take further steps.

A common pipeline is to cluster the projected data and then build a Markov state model on the resulting discretized state space.
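A minimal sketch of this pipeline on a one-dimensional projection is given below. It assumes a simple threshold in place of a clustering algorithm and a row-normalized transition count matrix in place of a full Markov state model estimator; all names are hypothetical.

```python
def discretize(projection, threshold=0.0):
    """Assign each frame to state 0 or 1 (a stand-in for clustering)."""
    return [0 if value < threshold else 1 for value in projection]

def transition_matrix(dtraj, n_states, lagtime):
    """Row-normalized transition counts at the given lag time."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lagtime], dtraj[lagtime:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Toy projection that switches between a negative and a positive region.
projection = [-1.2, -0.8, -1.1, 0.9, 1.3, 1.0, 0.7, -0.9, -1.0, 1.1]
dtraj = discretize(projection)
T = transition_matrix(dtraj, n_states=2, lagtime=1)
```

Each row of `T` sums to one and gives the estimated probabilities of moving from one discrete state to the others within one lag time.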

Estimating covariances and how to deal with large amounts of data

While the implementations of TICA and its generalization VAMP can be fit directly on a time series that is held in the computer’s memory, this might not always be possible.

The implementations are based on estimating covariance matrices, by default using the covariance estimator.

This estimator makes use of an online algorithm, so that it can be fit in a streaming fashion:

import deeptime

estimator = deeptime.decomposition.TICA(lagtime=tau)  # create an estimator; tau is the chosen lag time
# estimator = deeptime.decomposition.VAMP(lagtime=tau)  # alternatively, use VAMP instead of TICA

Since toy data usually easily fits into memory, loading data from, e.g., a database or network is simulated with the timeshifted_split() utility function. It splits the data into timeshifted blocks \(X_t\) and \(X_{t+\tau}\).

These blocks are not trajectory-overlapping, i.e., if two or more trajectories are provided then the blocks are always completely contained in exactly one of these.

Note that both blocks, \(X_t\) and \(X_{t+\tau}\), are passed to partial_fit() as a tuple. This differs from fit(), where the splitting and shifting are performed internally, so that it suffices to pass the whole dataset as the argument.

for X, Y in deeptime.data.timeshifted_split(feature_trajectory, lagtime=tau, chunksize=100):
    estimator.partial_fit((X, Y))
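What such a split produces can be sketched for a single trajectory as follows. This only illustrates how the time-shifted pairs are formed; it is not the library's implementation, the generator name is hypothetical, and handling of multiple trajectories is omitted.

```python
def timeshifted_chunks(traj, lagtime, chunksize):
    """Yield (X, Y) chunks where Y[i] is observed lagtime steps after X[i]."""
    n_pairs = len(traj) - lagtime
    for start in range(0, n_pairs, chunksize):
        stop = min(start + chunksize, n_pairs)
        yield traj[start:stop], traj[start + lagtime:stop + lagtime]

# Frames 0..9 with lag time 2, processed in chunks of at most 3 pairs.
chunks = list(timeshifted_chunks(list(range(10)), lagtime=2, chunksize=3))
```

Every frame pair \((x_t, x_{t+\tau})\) appears in exactly one chunk, so feeding the chunks one after another covers the same pairs as processing the full trajectory at once.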

Furthermore, the online algorithm uses a tree-like moment storage with copies of intermediate covariance and mean estimates. During the learning procedure, these moment storages are combined so that the tree never exceeds a certain depth. This depth can be set by the ncov estimator parameter:

estimator = deeptime.decomposition.TICA(lagtime=1, ncov=50)  # ncov bounds the depth of the moment storage
for X, Y in deeptime.data.timeshifted_split(feature_trajectory, lagtime=1, chunksize=10):
    estimator.partial_fit((X, Y))

Another factor to consider is numerical stability. While memory consumption can increase with larger ncov, the stability generally improves.
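How partial moments can be merged is sketched below for a scalar series, using the standard pairwise update for counts, means, and sums of squared deviations. The package's moment storage handles full covariance matrices and the tree bookkeeping; the functions here are hypothetical and only show the combination step.

```python
def chunk_moments(chunk):
    """Return (count, mean, sum of squared deviations) of one chunk."""
    n = len(chunk)
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return n, mean, m2

def combine(a, b):
    """Merge two moment triples without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2

data = [0.5, 1.5, 2.0, 4.0, 3.5, 7.0, 6.5, 8.0]

# Process the data in three chunks and merge the partial results.
total = chunk_moments(data[:3])
for chunk in (data[3:5], data[5:]):
    total = combine(total, chunk_moments(chunk))

n, mean, m2 = total
variance = m2 / n  # population variance of the full series
```

The merged result matches a single pass over all data up to rounding. Keeping several partial results and merging them pairwise, rather than adding every chunk to one running total, is the idea behind a tree-like storage.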
