class KMeansModel

class deeptime.clustering.KMeansModel(cluster_centers, metric: str, tolerance: Optional[float] = None, inertias: Optional[ndarray] = None, converged: bool = False)

The K-means clustering model. Stores all important information resulting from the estimation procedure. It can also be used to transform data by assigning each frame to its closest cluster center. For an example, see the documentation of the superclass ClusterModel.

Parameters:
  • cluster_centers ((k, d) ndarray) – The d-dimensional cluster centers; the length of the array should coincide with n_clusters.

  • metric (str) – The metric that was used.

  • tolerance (float, optional, default=None) – Tolerance that was used as the convergence criterion. Defaults to None so that clustering models can be constructed purely from cluster centers and metric.

  • inertias ((t,) ndarray or None, optional, default=None) – Value of the cost function over t iterations. Defaults to None so that clustering models can be constructed purely from cluster centers and metric.

  • converged (bool, optional, default=False) – Whether the convergence criterion was met.

Attributes

cluster_centers

Gets the cluster centers that were estimated for this model.

converged

Whether the estimation process converged.

dim

inertia

Sum of squared distances of the training data to their assigned cluster centers.

inertias

Series of inertias over the iterations of k-means.

metric

The metric that was used.

n_clusters

The number of cluster centers.

tolerance

The tolerance used as stopping criterion in the kmeans clustering loop.

Methods

copy()

Makes a deep copy of this model.

get_params([deep])

Get the parameters.

score(data[, n_jobs])

Computes how well the model fits the given data by computing the inertia.

set_params(**params)

Set the parameters of this estimator.

transform(data[, n_jobs])

For each frame in data, yields the index of the closest point in cluster_centers.

__call__(*args, **kwargs)

Call self as a function.

copy() Model

Makes a deep copy of this model.

Returns:

A new copy of this model.

Return type:

Model

get_params(deep=False)

Get the parameters.

Returns:

params – Parameter names mapped to their values.

Return type:

mapping of string to any

score(data: ndarray, n_jobs: Optional[int] = None) float

Computes how well the model fits the given data by computing the inertia.

Parameters:
  • data ((T, d) ndarray, dtype=float or double) – dataset with T entries and d dimensions

  • n_jobs (int, optional, default=None) – number of jobs to use

Returns:

score – the inertia

Return type:

float
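The inertia computed by score can be illustrated with a small NumPy sketch. It assumes the Euclidean metric and is not deeptime's implementation; the helper name inertia_score is hypothetical.

```python
import numpy as np


def inertia_score(data, cluster_centers):
    """Sum of squared Euclidean distances from each frame to its closest center.

    Hypothetical helper for illustration; assumes the Euclidean metric.
    """
    data = np.asarray(data, dtype=float)
    centers = np.asarray(cluster_centers, dtype=float)
    # Squared distances between every frame and every center: shape (T, k).
    sq_dists = np.sum((data[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    # Each frame contributes only its distance to the nearest center.
    return float(np.sum(np.min(sq_dists, axis=1)))


centers = np.array([[0.0], [10.0]])
frames = np.array([[1.0], [-1.0], [12.0]])
score = inertia_score(frames, centers)  # 1 + 1 + 4
```

A lower value means the frames lie closer to the cluster centers; data that coincides exactly with the centers gives an inertia of zero.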

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

object
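The <component>__<parameter> naming convention can be illustrated with a short sketch that groups such keys by component. This is not the library's implementation; the helper group_nested_params and the component name "clustering" are hypothetical.

```python
def group_nested_params(params):
    """Split keys of the form '<component>__<parameter>' into per-component dicts.

    Hypothetical helper for illustration only.
    """
    own, nested = {}, {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            # Key addresses a parameter of a nested component.
            nested.setdefault(component, {})[sub_key] = value
        else:
            # Key addresses a parameter of the object itself.
            own[key] = value
    return own, nested


own, nested = group_nested_params(
    {"tolerance": 1e-5, "clustering__metric": "euclidean"}
)
```

Here own holds the parameters of the outer object, while nested maps each component name to the parameters meant for it.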

transform(data, n_jobs=None) ndarray

For each frame in data, yields the index of the closest point in cluster_centers.

Parameters:
  • data ((T, d) ndarray) – frames

  • n_jobs (int, optional, default=None) – number of jobs to use for assignment

Returns:

discrete_trajectory – A discrete trajectory in which each entry is the index of the cluster center closest to the corresponding frame.

Return type:

(T, 1) ndarray
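The assignment step can be sketched in plain NumPy as follows. The sketch assumes the Euclidean metric, returns a flat array of indices, and is not deeptime's implementation; the helper name assign_to_closest_center is hypothetical.

```python
import numpy as np


def assign_to_closest_center(data, cluster_centers):
    """Return, for each frame, the index of the nearest center.

    Hypothetical helper for illustration; assumes the Euclidean metric.
    """
    data = np.asarray(data, dtype=float)
    centers = np.asarray(cluster_centers, dtype=float)
    # Pairwise differences via broadcasting: shape (T, k, d).
    diffs = data[:, None, :] - centers[None, :, :]
    # Squared Euclidean distances: shape (T, k).
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Index of the smallest distance per frame: shape (T,).
    return np.argmin(sq_dists, axis=1)


centers = np.array([[0.0, 0.0], [10.0, 10.0]])
frames = np.array([[1.0, -1.0], [9.0, 11.0], [0.5, 0.5]])
dtraj = assign_to_closest_center(frames, centers)
```

The first and third frames lie near the origin and map to index 0, while the second frame maps to index 1.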

property cluster_centers: ndarray

Gets the cluster centers that were estimated for this model.

Returns:

Array containing estimated cluster centers.

Return type:

np.ndarray

property converged: bool

Whether the estimation process converged. By default this is set to False, which can also indicate that the model was created manually and does not stem directly from an Estimator.

Returns:

converged – Whether the clustering converged

Return type:

bool

property inertia: Optional[int]

Sum of squared distances of the training data to their assigned cluster centers,

\[\sum_{i=1}^k \sum_{x\in S_i} d(x, \mu_i)^2, \]

where \(S_i\) is the set of frames \(x\) assigned to cluster center \(\mu_i\), and \(d\) denotes the distance under the model's metric.

Type:

float or None

property inertias: Optional[ndarray]

Series of inertias over the iterations of k-means.

Type:

(t,) ndarray of float or None

property metric: str

The metric that was used.

Returns:

metric – Name of the metric that was used. The name is related to the implementation via the metric registry.

Return type:

str

property n_clusters: int

The number of cluster centers.

Returns:

The number of cluster centers.

Return type:

int

property tolerance

The tolerance used as the stopping criterion in the k-means clustering loop. In particular, the loop stops when the relative change in the inertia is smaller than the given tolerance value.

Returns:

tolerance – the tolerance

Return type:

float
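A relative-change stopping rule of this kind can be sketched as below. The exact formula used by the library is not stated here, so the normalization by the previous inertia is an assumption, and the helper name first_converged_iteration is hypothetical.

```python
def first_converged_iteration(inertias, tolerance):
    """Return the first iteration whose relative inertia change is below tolerance.

    Hypothetical helper; assumes the change is measured relative to the
    previous inertia. Returns None if the criterion is never met.
    """
    for i in range(1, len(inertias)):
        previous, current = inertias[i - 1], inertias[i]
        if previous == 0:
            # Inertia is already zero and cannot improve further.
            return i
        relative_change = abs(previous - current) / abs(previous)
        if relative_change < tolerance:
            return i
    return None


history = [100.0, 40.0, 39.0, 38.9999]
stop_at = first_converged_iteration(history, tolerance=1e-3)
```

With this history the relative changes are roughly 0.6, 0.025, and 2.6e-6, so only the last step falls below a tolerance of 1e-3.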