function timeshifted_split

deeptime.util.data.timeshifted_split(inputs, lagtime: int, chunksize: int = 1000, stride: int = 1, n_splits: Optional[int] = None, shuffle: bool = False, random_state: Optional[RandomState] = None)

Utility function which splits input trajectories into pairs of timeshifted data \((X_t, X_{t+\tau})\). If multiple trajectories are provided, each timeshifted pair is drawn from a single trajectory; pairs never span two trajectories.
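
A minimal NumPy sketch of the pairing described above, not the library's implementation: for each trajectory, X is the data with the last \(\tau\) frames dropped and Y is the same data with the first \(\tau\) frames dropped. The helper name `timeshifted_pairs` is hypothetical, and the sketch assumes a lag time of at least 1.

```python
import numpy as np


def timeshifted_pairs(trajectories, lagtime):
    """Hypothetical helper: yield (X_t, X_{t+tau}) for each trajectory separately."""
    if lagtime < 1:
        raise ValueError("this sketch assumes lagtime >= 1")
    for traj in trajectories:
        traj = np.asarray(traj)
        if len(traj) <= lagtime:
            raise ValueError("each trajectory must be longer than the lag time")
        # Drop the last `lagtime` frames for X and the first `lagtime` frames for Y.
        yield traj[:-lagtime], traj[lagtime:]


# Two trajectories of different length; pairs stay within each one.
trajs = [np.arange(5), np.arange(10, 13)]
pairs = list(timeshifted_pairs(trajs, lagtime=2))
for X, Y in pairs:
    print(X, Y)
```

No pair combines a frame from the first trajectory with a frame from the second, because the slicing happens per trajectory.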

Parameters:
  • inputs ((T, n) ndarray or list of (T_i, n) ndarrays) – Input trajectory or trajectories. In case multiple trajectories are provided, they must have the same dimension in the second axis but may be of variable length.

  • lagtime (int) – The lag time \(\tau\) used to produce timeshifted blocks.

  • chunksize (int, default=1000) – The chunk size, i.e., the maximal length of the blocks.

  • stride (int, default=1) – Optional stride which is applied after creating the \(\tau\)-shifted version of the dataset, so the lag time is measured in frames of the unstrided data.

  • n_splits (int, optional, default=None) – Alternative to chunksize: the number of timeshifted blocks drawn from each provided trajectory. If given, it overrides chunksize.

  • shuffle (bool, default=False) – Whether to shuffle the data prior to splitting it.

  • random_state (np.random.RandomState, default=None) – When shuffling, this can be used to set a specific random state, for example to make the shuffle reproducible.
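
To illustrate the order of operations stated for `stride`, here is a hedged sketch using plain NumPy slicing rather than a call into deeptime: the lag is applied first, and only then is every `stride`-th pair kept. The variable names are illustrative.

```python
import numpy as np

data = np.arange(10)
lagtime, stride = 3, 2

# Shift first: the lag is measured in frames of the unstrided data.
X_full, Y_full = data[:-lagtime], data[lagtime:]

# Then subsample both halves with the same stride so the pairs stay aligned.
X, Y = X_full[::stride], Y_full[::stride]

print(X)  # [0 2 4 6]
print(Y)  # [3 5 7 9]
```

Every retained pair is still separated by exactly `lagtime` original frames; the stride only thins out how many pairs are kept.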

Returns:

iterable – A Python generator that yields the timeshifted blocks one at a time, e.g. as pairs (X, Y) in the examples below.

Return type:

Generator

Examples

Using chunksize:

>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, chunksize=4):
...     print(X, Y)
[0 1 2 3] [1 2 3 4]
[4 5] [5 6]

Using n_splits:

>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, n_splits=2):
...     print(X, Y)
[0 1 2] [1 2 3]
[3 4 5] [4 5 6]
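
The two examples above differ only in how the six timeshifted pairs are cut into blocks. The following sketch reproduces both outputs with plain NumPy. The splitting rules are an assumption inferred from these examples, not a description of deeptime's internals, and the helper names `split_by_chunksize` and `split_by_n_splits` are hypothetical.

```python
import numpy as np


def split_by_chunksize(X, Y, chunksize):
    """Hypothetical helper: consecutive blocks of at most `chunksize` pairs."""
    for start in range(0, len(X), chunksize):
        yield X[start:start + chunksize], Y[start:start + chunksize]


def split_by_n_splits(X, Y, n_splits):
    """Hypothetical helper: `n_splits` blocks of near-equal length (assumed rule)."""
    for ix in np.array_split(np.arange(len(X)), n_splits):
        if len(ix) > 0:
            yield X[ix], Y[ix]


data = np.array([0, 1, 2, 3, 4, 5, 6])
X, Y = data[:-1], data[1:]  # timeshifted pairs for lagtime=1

by_chunk = list(split_by_chunksize(X, Y, chunksize=4))
by_split = list(split_by_n_splits(X, Y, n_splits=2))

for Xb, Yb in by_chunk:
    print(Xb, Yb)
for Xb, Yb in by_split:
    print(Xb, Yb)
```

With `chunksize` the last block may be shorter than the others, whereas `n_splits` fixes the number of blocks and lets their length follow from the data.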