function timeshifted_split
- deeptime.util.data.timeshifted_split(inputs, lagtime: int, chunksize: int = 1000, stride: int = 1, n_splits: Optional[int] = None, shuffle: bool = False, random_state: Optional[RandomState] = None)
Utility function which splits input trajectories into pairs of timeshifted data \((X_t, X_{t+\tau})\). In case multiple trajectories are provided, the timeshifted pairs are always within the same trajectory.
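The pairing described above can be sketched in plain Python. This is only an illustration of the idea, not deeptime's implementation: the helper name `timeshifted_pairs` is hypothetical, trajectories are plain lists rather than ndarrays, and skipping trajectories that are too short for the lag time is an assumption of this sketch.

```python
def timeshifted_pairs(trajectories, lagtime):
    """Yield (X_t, X_{t+lagtime}) for each trajectory separately.

    Hypothetical helper for illustration; not part of deeptime.
    Assumes lagtime is a non-negative integer.
    """
    for traj in trajectories:
        if lagtime >= len(traj):
            # Assumption of this sketch: trajectories too short to form
            # a single pair are skipped.
            continue
        x = traj[:len(traj) - lagtime]  # frames at time t
        y = traj[lagtime:]              # frames at time t + lagtime
        yield x, y


# Two trajectories of different length; no pair mixes frames from both.
trajs = [[0, 1, 2, 3], [10, 11, 12]]
pairs = list(timeshifted_pairs(trajs, lagtime=2))
# pairs == [([0, 1], [2, 3]), ([10], [12])]
```

Because each trajectory is sliced on its own, the last frames of one trajectory are never paired with the first frames of the next.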
- Parameters:
inputs ((T, n) ndarray or list of (T_i, n) ndarrays) – Input trajectory or trajectories. In case multiple trajectories are provided, they must have the same dimension in the second axis but may be of variable length.
lagtime (int) – The lag time \(\tau\) used to produce timeshifted blocks.
chunksize (int, default=1000) – The chunk size, i.e., the maximal length of the blocks.
stride (int, default=1) – Optional stride which is applied after creating a tau-shifted version of the dataset.
n_splits (int, optional, default=None) – Alternative to chunksize: the number of timeshifted blocks drawn from each provided trajectory. If given, it takes precedence over chunksize.
shuffle (bool, default=False) – Whether to shuffle the data prior to splitting it.
random_state (np.random.RandomState, default=None) – Random state used for shuffling. Set it to make the shuffle reproducible.
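The order of operations stated for stride (shift first, then subsample) matters for the effective lag between paired frames. The following sketch contrasts the two orders on a plain list. Both helper names are hypothetical and the code is an illustration under that reading of the parameter description, not the library's code.

```python
def shift_then_stride(traj, lagtime, stride=1):
    """Shift by lagtime first, then keep every stride-th frame.

    Hypothetical helper mirroring the documented order; not part of deeptime.
    """
    x = traj[:len(traj) - lagtime][::stride]
    y = traj[lagtime:][::stride]
    return x, y


def stride_then_shift(traj, lagtime, stride=1):
    """Opposite order, shown only for contrast."""
    sub = traj[::stride]
    return sub[:len(sub) - lagtime], sub[lagtime:]


data = list(range(10))

x, y = shift_then_stride(data, lagtime=1, stride=3)
# x == [0, 3, 6], y == [1, 4, 7]: partners stay 1 original step apart

x2, y2 = stride_then_shift(data, lagtime=1, stride=3)
# x2 == [0, 3, 6], y2 == [3, 6, 9]: partners are now 3 original steps apart
```

With the shift applied first, the stride only thins out the pairs and the lag time keeps its meaning in units of the original frames.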
- Returns:
iterable – A Python generator that yields the timeshifted blocks \((X_t, X_{t+\tau})\) one at a time.
- Return type:
Generator
Examples
Using chunksize:
>>> import numpy as np
>>> from deeptime.util.data import timeshifted_split
>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, chunksize=4):
...     print(X, Y)
[0 1 2 3] [1 2 3 4]
[4 5] [5 6]
Using n_splits:
>>> data = np.array([0, 1, 2, 3, 4, 5, 6])
>>> for X, Y in timeshifted_split(data, lagtime=1, n_splits=2):
...     print(X, Y)
[0 1 2] [1 2 3]
[3 4 5] [4 5 6]