function to_dataset

deeptime.util.types.to_dataset(data: Union[TimeLaggedDataset, Tuple[ndarray, ndarray], ndarray], lagtime: Optional[int] = None)

Converts input data to a TimeLaggedDataset if possible, otherwise assumes that data implements __len__ as well as __getitem__, where __getitem__ yields a tuple of data.
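To illustrate the fallback protocol, here is a minimal sketch of a custom container that implements __len__ and __getitem__, with __getitem__ returning a tuple of data. The class name PairDataset and its attributes are hypothetical and not part of deeptime; the sketch only shows the shape of object the fallback assumes.

```python
import numpy as np


class PairDataset:
    """Hypothetical container (not part of deeptime) following the assumed protocol."""

    def __init__(self, x, y):
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        self.x = np.asarray(x)
        self.y = np.asarray(y)

    def __len__(self):
        # Number of (instantaneous, time-lagged) pairs.
        return len(self.x)

    def __getitem__(self, item):
        # Returns a tuple of data, for an integer index as well as for a slice.
        return self.x[item], self.y[item]


pairs = PairDataset(np.arange(5), np.arange(1, 6))
first_x, first_y = pairs[0]
```

Indexing with a slice, e.g. pairs[1:3], returns a tuple of two arrays in the same way.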

The possible cases are:

  • input data is already a time-lagged dataset, in which case it is returned immediately (see is_timelagged_dataset()).

  • input data is a tuple of (X, Y), where X and Y are ndarrays. In this case they are interpreted as time-lagged versions of one another, i.e., \(Y_i = g(X_i)\), where \(g(\cdot)\) describes the temporal evolution, and lagtime is ignored.

  • input data is a list of trajectories, in which case a concatenated TrajectoriesDataset is created.

  • input is an ndarray, in which case Y[i] = X[i+lagtime] and the result is a dataset of length len(data) - lagtime.
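The single-array case can be reproduced with plain NumPy slicing. The sketch below is not deeptime's implementation; lagged_pairs is a hypothetical helper, and the requirement 0 < lagtime < len(data) is an assumption made here so that both slices are non-empty.

```python
import numpy as np


def lagged_pairs(data, lagtime):
    """Hypothetical helper: return (X, Y) such that Y[i] = data[i + lagtime]."""
    data = np.asarray(data)
    # Assumed constraint for this sketch, so that both slices are non-empty.
    if not 0 < lagtime < len(data):
        raise ValueError("lagtime must satisfy 0 < lagtime < len(data)")
    # X drops the last `lagtime` frames, Y drops the first `lagtime` frames.
    return data[:-lagtime], data[lagtime:]


trajectory = np.arange(0, 6)
x, y = lagged_pairs(trajectory, lagtime=2)
```

Here x and y each contain len(trajectory) - lagtime = 4 entries, matching the dataset length stated above.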

Parameters:
  • data (TimeLaggedDataset or tuple of arrays or array) – Input data.

  • lagtime (int, optional, default=None) – Lagtime; only considered if the input is an array.

Returns:

A dataset based on input arguments.

Return type:

dataset

Raises:

ValueError – If data is a single array but no lagtime is provided, or if the input is a list or tuple whose length is not equal to 2.
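The two clearest error conditions can be sketched as a standalone check. This mirrors the documented behavior for a single array and for a tuple only; it is not deeptime's code, the function names are hypothetical, and list inputs are deliberately left out of the sketch.

```python
import numpy as np


def check_input(data, lagtime=None):
    """Hypothetical check mirroring two documented conditions (not deeptime's code)."""
    if isinstance(data, np.ndarray) and lagtime is None:
        raise ValueError("a single array requires a lagtime")
    if isinstance(data, tuple) and len(data) != 2:
        raise ValueError("a tuple must contain exactly two arrays (X, Y)")


def raises_value_error(data, lagtime=None):
    # Demonstration helper: True if check_input rejects the input.
    try:
        check_input(data, lagtime)
    except ValueError:
        return True
    return False


missing_lagtime = raises_value_error(np.arange(6))
bad_tuple = raises_value_error((np.arange(3),))
valid_tuple = raises_value_error((np.arange(3), np.arange(3)))
```

In this sketch, missing_lagtime and bad_tuple are True, while valid_tuple is False.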

Examples

Create dataset via trajectory + lagtime

>>> import numpy as np
>>> from deeptime.util.types import to_dataset
>>> data = np.arange(0, 6)
>>> dataset = to_dataset(data, lagtime=1)
>>> print(dataset[:])
(array([0, 1, 2, 3, 4]), array([1, 2, 3, 4, 5]))

Create dataset via corresponding data matrices

>>> data_instantaneous = np.array([0, 1, 2, 3])
>>> data_timelagged = np.zeros((4, 2))
>>> dataset = to_dataset((data_instantaneous, data_timelagged))

Printing instantaneous data

>>> print(dataset[:][0])
[0 1 2 3]

Printing timelagged data

>>> print(dataset[:][1])
[[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]]