For developers

This document is mainly for aspiring contributors who want to familiarize themselves with the code structure and the underlying API. If you plan to add a new estimator, model, or transformer, the module deeptime.base defines the relevant interfaces, which are described in more detail below.

Writing a custom estimator

When writing a custom estimator, first decide what is supposed to be estimated. To this end, a Model can be implemented. For example, a model which can hold mean values:

from deeptime.base import Model

class MeanModel(Model):

    def __init__(self, mean):
        self._mean = mean

    @property
    def mean(self):
        return self._mean

Such a model is nothing more than a plain container for the estimated quantities which inherits from Model.

Subsequently, an Estimator for MeanModel instances can be implemented:

import typing

import numpy as np
from deeptime.base import Estimator

class MeanEstimator(Estimator):

    def __init__(self, axis=-1):
        super(MeanEstimator, self).__init__()
        self.axis = axis

    def fetch_model(self) -> typing.Optional[MeanModel]:
        # takes no data argument; returns None if fit was not called yet
        return self._model

    def fit(self, data):
        self._model = MeanModel(np.mean(data, axis=self.axis))
        return self

Some estimators also offer a partial_fit, in which case an existing model is updated.
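
As a minimal sketch of what a partial_fit could look like, the following hypothetical RunningMeanEstimator updates a per-feature mean (over axis 0) batch by batch. The names RunningMeanModel, RunningMeanEstimator, and n_frames are assumptions made for illustration and are not part of deeptime. The placeholder base classes in the except branch only exist so that the snippet also runs where deeptime is not installed.

```python
import numpy as np

try:
    from deeptime.base import Estimator, Model
except ImportError:
    # Placeholders so the sketch also runs without deeptime installed.
    class Model:
        pass

    class Estimator:
        pass


class RunningMeanModel(Model):
    """Hypothetical model: a mean plus the number of frames it is based on."""

    def __init__(self, mean, n_frames):
        self._mean = mean
        self._n_frames = n_frames

    @property
    def mean(self):
        return self._mean

    @property
    def n_frames(self):
        return self._n_frames


class RunningMeanEstimator(Estimator):
    """Hypothetical estimator of the per-feature mean over frames (axis 0)."""

    def __init__(self):
        super().__init__()
        self._model = None

    def fetch_model(self):
        return self._model

    def partial_fit(self, data):
        data = np.asarray(data, dtype=float)
        if self._model is None:
            # first batch: start from an empty sum
            total, n_old = np.zeros(data.shape[1:]), 0
        else:
            # recover the sum over all frames seen so far
            n_old = self._model.n_frames
            total = self._model.mean * n_old
        n_new = n_old + data.shape[0]
        self._model = RunningMeanModel((total + data.sum(axis=0)) / n_new, n_new)
        return self

    def fit(self, data):
        # a full fit discards the previous model and starts over
        self._model = None
        return self.partial_fit(data)


data = np.arange(12, dtype=float).reshape(6, 2)
estimator = RunningMeanEstimator()
estimator.partial_fit(data[:2]).partial_fit(data[2:])
print(estimator.fetch_model().mean)
```

In this sketch fit resets the model while partial_fit keeps accumulating, so feeding the data in chunks yields the same mean as a single call to fit on the full array.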

Now the estimator and the model can be used:

data = np.random.normal(size=(100000, 4))
mean_model = MeanEstimator(axis=-1).fit(data).fetch_model()
print(mean_model.mean)

Adding transformer capabilities

Some models have the capability to transform / project data. For example, k-means can be used to transform time series to discrete series of states by assigning each frame to its respective cluster center.

To add this kind of functionality, one can use the Transformer interface and implement the abstract Transformer.transform() method:

import numpy as np
from deeptime.base import Model, Transformer

class Projector(Model, Transformer):

    def __init__(self, dim):
        self.dim = dim

    def transform(self, data: np.ndarray):
        # projects time series data onto its "dim"-th dimension
        return data[:, self.dim]

It usually also makes sense to implement the transformer interface for estimators whose models are transformers by simply calling self.fetch_model().transform(data), i.e., dispatching the transform call to the current model.
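
The dispatching described above can be sketched as follows. MaxVarianceEstimator is a hypothetical estimator invented for this illustration (it picks the dimension with the largest variance), and the placeholder base classes in the except branch are only there so the snippet runs without deeptime installed.

```python
import numpy as np

try:
    from deeptime.base import Estimator, Model, Transformer
except ImportError:
    # Placeholders so the sketch also runs without deeptime installed.
    class Model:
        pass

    class Transformer:
        pass

    class Estimator:
        pass


class Projector(Model, Transformer):

    def __init__(self, dim):
        self.dim = dim

    def transform(self, data, **kwargs):
        # projects time series data onto its "dim"-th dimension
        return np.asarray(data)[:, self.dim]


class MaxVarianceEstimator(Estimator, Transformer):
    """Hypothetical estimator selecting the dimension with the largest variance."""

    def __init__(self):
        super().__init__()
        self._model = None

    def fetch_model(self):
        return self._model

    def fit(self, data, **kwargs):
        variances = np.var(data, axis=0)
        self._model = Projector(dim=int(np.argmax(variances)))
        return self

    def transform(self, data, **kwargs):
        # dispatch the transform call to the current model
        model = self.fetch_model()
        if model is None:
            raise RuntimeError("fit() must be called before transform()")
        return model.transform(data, **kwargs)


rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3)) * np.array([1.0, 10.0, 0.1])
projected = MaxVarianceEstimator().fit(data).transform(data)
print(projected.shape)
```

Raising an error when no model is available is one possible design choice in this sketch; it makes a missing call to fit visible early instead of failing with an attribute error on None.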

Depending on PyTorch

If your code depends on PyTorch, it is no problem to import it at module level (at the top of your implementation file). To make your code accessible to the parent package via __init__, however, the import should be guarded by a check with module_available, like so:

# ... the init
from ..util.platform import module_available
if module_available("torch"):
    from .your_module import MeanEstimator, MeanModel
del module_available

This is necessary because there is no hard dependency on PyTorch, so functionality should only be exposed if PyTorch is available.
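
To illustrate the idea behind such a guard, here is a minimal sketch of an availability check using only the standard library. The function is_importable is a hypothetical stand-in written for this example; it is an assumption about how a helper like module_available could behave, not deeptime's actual implementation.

```python
import importlib


def is_importable(name: str) -> bool:
    """Return True if the module `name` can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# An optional dependency would be checked the same way, e.g. is_importable("torch").
print(is_importable("json"))
```

Attempting the import, rather than only looking the module up, also catches installations that are present but cannot be imported because one of their own dependencies is missing.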

Testing your code

Tests are designed to be run with pytest, which can be obtained via, e.g., PyPI or conda. All tests (except for doctests) are placed inside the top-level tests directory, which is organized in the same way as the deeptime package itself. For example, if you developed a new estimator MeanEstimator in the package deeptime.some.package, then its tests should go into tests/some/package/test_mean_estimator.py.

To execute the tests, a call to pytest tests/ suffices. To execute the doctests, call pytest --doctest-modules deeptime.
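
A test module for the MeanEstimator example could look roughly like the sketch below. In a real test file the estimator would be imported from its package (deeptime.some.package is only the placeholder path used in this document). To keep the sketch self-contained, a minimal stand-in without the deeptime base classes is defined inline, and the test names are illustrative assumptions.

```python
import numpy as np


# Stand-in for the class under test; a real test module would import it instead.
class MeanModel:

    def __init__(self, mean):
        self.mean = mean


class MeanEstimator:

    def __init__(self, axis=-1):
        self.axis = axis
        self._model = None

    def fetch_model(self):
        return self._model

    def fit(self, data):
        self._model = MeanModel(np.mean(data, axis=self.axis))
        return self


def test_fetch_model_before_fit_is_none():
    assert MeanEstimator().fetch_model() is None


def test_mean_matches_numpy():
    data = np.arange(12, dtype=float).reshape(3, 4)
    model = MeanEstimator(axis=0).fit(data).fetch_model()
    np.testing.assert_allclose(model.mean, data.mean(axis=0))


def test_fit_returns_self():
    estimator = MeanEstimator()
    assert estimator.fit(np.zeros((2, 2))) is estimator
```

pytest collects functions whose names start with test_ from files named test_*.py, so no additional registration is needed.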

Documenting the code

When documenting your code, numpydoc style should be used. Going back to the example of the MeanEstimator, this style of documentation would look like the following:

class MeanEstimator(deeptime.base.Estimator):
    r""" The mean estimator. It estimates the mean using a complicated algorithm
    :footcite:`authorofthecomplicatedalgo1988`.

    Parameters
    ----------
    axis : int, optional, default=-1
        The axis over which to compute the mean. Defaults to -1, which refers to the last axis.

    References
    ----------
    .. footbibliography::

    See Also
    --------
    MeanModel
    """

    def __init__(self, axis=-1):
        super(MeanEstimator, self).__init__()
        self.axis = axis

    def fetch_model(self) -> typing.Optional[MeanModel]:
        r"""Fetches the current model. Can be `None` in case :meth:`fit` was not called yet.

        Returns
        -------
        model : MeanModel or None
            the latest estimated model
        """
        return self._model

    def fit(self, data):
        r""" Performs the estimation.

        Parameters
        ----------
        data : ndarray
            Array over which the mean should be estimated.

        Returns
        -------
        self : MeanEstimator
            Reference to self.
        """
        self._model = MeanModel(np.mean(data, axis=self.axis))
        return self

Note the specific style of using citations. For citations, there is a package-global BibTeX file under docs/source/references.bib. References from this file can be included in the documentation website by using the citation key as defined in the references file.

The documentation website is hosted via GitHub Pages; its sources can be found here. Please see the README on GitHub for instructions on how to build it.