deeptime.markov.tools.estimation.count_matrix

deeptime.markov.tools.estimation.count_matrix(dtraj, lag, sliding=True, sparse_return=True, nstates=None)

Generate a count matrix from a given microstate trajectory. [1]

Parameters:

dtraj (array_like or list of array_like) – Discretized trajectory or list of discretized trajectories
lag (int) – Lag time in trajectory steps
sliding (bool, optional) – If True, the sliding window approach is used for transition counting; otherwise transitions are counted at lag.
sparse_return (bool, optional) – If True, return a sparse matrix; otherwise return a dense matrix.
nstates (int, optional) – Enforce a count-matrix with shape=(nstates, nstates)

Returns:

C – The count matrix at given lag in coordinate list format.

Return type:

scipy.sparse.coo_matrix

Notes

Transition counts can be obtained from a microstate trajectory using two methods: counting at lag and sliding window counting.

Lag

This approach subsamples the trajectory at multiples of the lag time \(\tau\): it skips all points that are separated from the last used point by less than \(\tau\).

Transition counts \(c_{ij}(\tau)\) for a trajectory \(X_0, \dots, X_{N-1}\) of \(N\) frames are generated according to

\[c_{ij}(\tau) = \sum_{k=0}^{\left \lfloor \frac{N-1}{\tau} \right \rfloor -1} \chi_{i}(X_{k\tau})\chi_{j}(X_{(k+1)\tau}). \]

\(\chi_{i}(x)\) is the indicator function of \(i\), i.e., \(\chi_{i}(x)=1\) for \(x=i\) and \(\chi_{i}(x)=0\) for \(x \neq i\).
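The lag approach can be illustrated with a few lines of NumPy. This is only a minimal sketch and not the library's implementation; the helper name `count_at_lag` and its `nstates` default are assumptions made for illustration. It keeps every \(\tau\)-th frame and counts consecutive pairs of the subsampled trajectory.

```python
import numpy as np


def count_at_lag(dtraj, tau, nstates=None):
    """Hypothetical helper: count transitions between frames 0, tau, 2*tau, ..."""
    dtraj = np.asarray(dtraj)
    # Assumption: states are labelled 0..n-1; infer n if it is not given.
    n = int(dtraj.max()) + 1 if nstates is None else nstates
    sub = dtraj[::tau]  # keep every tau-th frame
    C = np.zeros((n, n))
    # np.add.at accumulates repeated (i, j) index pairs instead of overwriting them.
    np.add.at(C, (sub[:-1], sub[1:]), 1.0)
    return C


C_lag = count_at_lag([0, 0, 1, 0, 1, 1, 0], tau=2)
print(C_lag)
```

For the trajectory used in the Examples section, the subsampled frames are 0, 1, 1, 0, so the sketch counts the transitions 0→1, 1→1, and 1→0.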

Sliding

The sliding approach slides along the trajectory and counts all transitions separated by the lag time \(\tau\).

Transition counts \(c_{ij}(\tau)\) for a trajectory \(X_0, \dots, X_{N-1}\) of \(N\) frames are generated according to

\[c_{ij}(\tau)=\sum_{k=0}^{N-\tau-1} \chi_{i}(X_{k}) \chi_{j}(X_{k+\tau}). \]
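The sliding sum can be sketched in NumPy by pairing each frame \(X_k\) with the frame \(X_{k+\tau}\). This is a minimal illustration under the assumption that states are labelled 0..n-1; the helper name `count_sliding` is hypothetical and does not belong to the library.

```python
import numpy as np


def count_sliding(dtraj, tau, nstates=None):
    """Hypothetical helper: count all pairs (X_k, X_{k+tau}) along the trajectory."""
    dtraj = np.asarray(dtraj)
    # Assumption: states are labelled 0..n-1; infer n if it is not given.
    n = int(dtraj.max()) + 1 if nstates is None else nstates
    C = np.zeros((n, n))
    # dtraj[:-tau] holds X_k and dtraj[tau:] holds X_{k+tau} for k = 0..N-tau-1.
    np.add.at(C, (dtraj[:-tau], dtraj[tau:]), 1.0)
    return C


C_sliding = count_sliding([0, 0, 1, 0, 1, 1, 0], tau=2)
print(C_sliding)
```

With \(N=7\) and \(\tau=2\) the sum runs over \(k=0,\dots,4\), so five transitions are counted, compared with three for the lag approach on the same data.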

References

Examples

>>> import numpy as np
>>> from deeptime.markov.tools.estimation import count_matrix

>>> dtraj = np.array([0, 0, 1, 0, 1, 1, 0])
>>> tau = 2

Use the sliding approach first:

>>> C_sliding = count_matrix(dtraj, tau)

The generated matrix is a SciPy sparse matrix. For convenient printing, we convert it to a dense ndarray.

>>> C_sliding.toarray()
array([[1., 2.],
       [1., 1.]])

Let us compare this to the count matrix we obtain using the lag approach:

>>> C_lag = count_matrix(dtraj, tau, sliding=False)
>>> C_lag.toarray()
array([[0., 1.],
       [1., 1.]])
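The parameters also allow a list of trajectories and a fixed number of states. The following sketch shows one way such counting could work, assuming that counts from separate trajectories are summed and that no transition is counted across a trajectory boundary. The helper `count_sliding_multi` is hypothetical and written only with NumPy and SciPy; it is not the library's code.

```python
import numpy as np
from scipy.sparse import coo_matrix


def count_sliding_multi(dtrajs, tau, nstates):
    """Hypothetical helper: sliding counts over several trajectories, COO output."""
    rows, cols = [], []
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj)
        if len(dtraj) > tau:  # trajectories shorter than tau + 1 contribute nothing
            rows.append(dtraj[:-tau])
            cols.append(dtraj[tau:])
    if not rows:
        return coo_matrix((nstates, nstates))
    row = np.concatenate(rows)
    col = np.concatenate(cols)
    data = np.ones(len(row))
    # Repeated (row, col) entries are summed when the COO matrix is converted.
    return coo_matrix((data, (row, col)), shape=(nstates, nstates))


C_multi = count_sliding_multi([[0, 0, 1, 0, 1, 1, 0], [1, 1, 2]], tau=2, nstates=4)
dense = C_multi.toarray()
print(dense)
```

Passing `nstates=4` pads the matrix with empty rows and columns for states that are never visited, which keeps shapes comparable across data sets.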