function timeshifted_split

deeptime.util.data.timeshifted_split(inputs, lagtime: int, chunksize: int = 1000, stride: int = 1, n_splits: Optional[int] = None, shuffle: bool = False, random_state: Optional[RandomState] = None)

Utility function which splits input trajectories into pairs of timeshifted data \((X_t, X_{t+\tau})\). In case multiple trajectories are provided, the timeshifted pairs are always within the same trajectory.
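
The pairing itself can be illustrated with plain NumPy slicing. This is a minimal sketch of the concept, not deeptime's implementation; the helper name `make_pairs` is hypothetical and exists only for this illustration.

```python
import numpy as np

def make_pairs(traj, lagtime):
    """Return (X_t, X_{t+tau}) for a single trajectory (hypothetical helper)."""
    traj = np.asarray(traj)
    if not 0 < lagtime < len(traj):
        raise ValueError("lagtime must be positive and shorter than the trajectory")
    # Drop the last `lagtime` frames for X and the first `lagtime` frames for Y.
    return traj[:-lagtime], traj[lagtime:]

# Pairs are formed per trajectory, so no pair spans two trajectories.
trajs = [np.arange(5), np.arange(10, 13)]
pairs = [make_pairs(t, lagtime=2) for t in trajs]
```

Because each trajectory is sliced on its own, the last frames of one trajectory are never paired with the first frames of the next.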

Parameters:

inputs ((T, n) ndarray or list of (T_i, n) ndarrays) – Input trajectory or trajectories. In case multiple trajectories are provided, they must have the same dimension in the second axis but may be of variable length.
lagtime (int) – The lag time \(\tau\) used to produce timeshifted blocks.
chunksize (int, default=1000) – The chunk size, i.e., the maximal length of the blocks.
stride (int, default=1) – Optional stride which is applied after creating a tau-shifted version of the dataset.
n_splits (int, optional, default=None) – Alternative to chunksize: the number of timeshifted blocks drawn from each provided trajectory. If given, it supersedes whatever was provided as chunksize.
shuffle (bool, default=False) – Whether to shuffle the data prior to splitting it.
random_state (np.random.RandomState, default=None) – When shuffling, this can be used to set a specific random state.
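
The order "shift first, then stride" described for the stride parameter matters for the effective lag. The following NumPy sketch shows one reading of that description and contrasts it with striding first; it is an illustration under that assumption, not the library's code.

```python
import numpy as np

data = np.arange(7)  # [0 1 2 3 4 5 6]
lagtime, stride = 2, 2

# Shift first, then stride: each pair stays exactly `lagtime` frames apart.
x = data[:-lagtime][::stride]
y = data[lagtime:][::stride]

# For contrast: striding first stretches the effective lag to lagtime * stride.
strided = data[::stride]
x_alt, y_alt = strided[:-lagtime], strided[lagtime:]
```

In the first variant the stride only thins out the pairs, while the time difference within each pair remains the requested lag time.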

Returns:

iterable – A Python generator which yields the timeshifted blocks one at a time; as in the examples below, each item unpacks into a pair (X, Y).

Return type:

Generator

Examples

Using chunksize:

>>> import numpy as np
>>> from deeptime.util.data import timeshifted_split
>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, chunksize=4):
...     print(X, Y)
[0 1 2 3] [1 2 3 4]
[4 5] [5 6]

Using n_splits:

>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, n_splits=2):
...     print(X, Y)
[0 1 2] [1 2 3]
[3 4 5] [4 5 6]
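
The block boundaries in the two examples above can be reproduced with plain NumPy, which may help when reasoning about block lengths. This is a sketch that mimics the documented outputs for this input; it makes no claim about how deeptime computes the splits internally.

```python
import numpy as np

data = np.array([0, 1, 2, 3, 4, 5, 6])
lagtime = 1
x, y = data[:-lagtime], data[lagtime:]  # six timeshifted pairs

# chunksize=4: cut every 4 pairs; the last block keeps the remainder.
cuts = np.arange(4, len(x), 4)
by_chunksize = list(zip(np.split(x, cuts), np.split(y, cuts)))

# n_splits=2: divide the pairs into 2 blocks of near-equal length.
by_n_splits = list(zip(np.array_split(x, 2), np.array_split(y, 2)))

for X, Y in by_chunksize + by_n_splits:
    print(X, Y)
```

With chunksize the block length is capped and the number of blocks follows from the data length, whereas with n_splits the number of blocks is fixed and their length follows.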