class RegularSpace
- class deeptime.clustering.RegularSpace(dmin: float, max_centers: int = 1000, metric: str = 'euclidean', n_jobs=None)
Clusters data objects such that cluster centers are at least a distance of dmin apart from each other according to the given metric. The assignment of data objects to cluster centers is performed by Voronoi partitioning.
Regular space clustering [1] is very similar to Hartigan’s leader algorithm [2]. It consists of two passes through the data. Initially, the first data point is added to the list of centers. For every subsequent data point, if it has a greater distance than dmin from every center, it also becomes a center. In the second pass, a Voronoi discretization with the computed centers is used to partition the data.
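The two passes described above can be sketched in plain Python. This is only an illustration of the idea under stated assumptions, not deeptime's implementation: the helper names `regular_space_centers` and `voronoi_assign` are hypothetical, the Euclidean metric is hard-coded through `math.dist`, and the cutoff handling is a guess at how `max_centers` behaves.

```python
import math

def regular_space_centers(points, dmin, max_centers=1000):
    """First pass (hypothetical helper): collect centers more than dmin apart."""
    centers = []
    for p in points:
        if len(centers) >= max_centers:
            break  # assumed cutoff behaviour: stop looking at further points
        # all() over an empty list is True, so the first point always becomes a center.
        if all(math.dist(p, c) > dmin for c in centers):
            centers.append(p)
    return centers

def voronoi_assign(points, centers):
    """Second pass (hypothetical helper): index of the nearest center per point."""
    return [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in points
    ]

data = [(0.0, 0.0), (0.3, 0.1), (2.0, 0.0), (2.2, 0.1), (0.1, 1.9)]
centers = regular_space_centers(data, dmin=1.0)
labels = voronoi_assign(data, centers)
print(centers)  # [(0.0, 0.0), (2.0, 0.0), (0.1, 1.9)]
print(labels)   # [0, 0, 1, 1, 2]
```

Because centers are accepted in the order the data is visited, the result of this sketch depends on the ordering of the input points.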
- Parameters:
dmin (float) – Minimum distance between cluster centers, must be non-negative.
max_centers (int) – If this number of centers is reached while the centers are being determined, the algorithm terminates. Must be positive.
metric (str, default='euclidean') – The metric to use during clustering. For a list of available metrics, see the metric registry.
n_jobs (int, optional, default=None) – Number of threads to use during estimation.
References
Attributes
dmin – Minimum distance between cluster centers.
has_model – Property reporting whether this estimator contains an estimated model.
max_centers – Cutoff during clustering.
metric – The metric that is used for clustering.
model – Shortcut to fetch_model().
n_clusters – Alias to max_centers.
n_jobs – The number of threads to use during estimation.
Methods
fetch_model() – Fetches the current model.
fit(data[, n_jobs]) – Fits this estimator onto data.
fit_fetch(data, **kwargs) – Fits the internal model on data and subsequently fetches it in one call.
get_params([deep]) – Get the parameters.
partial_fit(data[, n_jobs]) – Fits data to an existing model.
set_params(**params) – Set the parameters of this estimator.
- fetch_model() → ClusterModel
Fetches the current model. Can be None in case fit() was not called yet.
- Returns:
model – The latest estimated model or None.
- Return type:
ClusterModel or None
- fit(data, n_jobs=None)
Fits this estimator onto data. The estimation is carried out by:
Choosing the first data frame as a centroid.
For all frames \(x\in X\): calculating the distance to all cluster centers.
Adding a new centroid if the minimal distance to all other cluster centers is larger than or equal to dmin.
- Parameters:
data ((T, n) ndarray or list of ndarray) – the data to fit
n_jobs (int, optional, default=None) – Number of jobs; supersedes the estimator's n_jobs setting if set to an integer value.
- Returns:
self – reference to self
- Return type:
- fit_fetch(data, **kwargs)
Fits the internal model on data and subsequently fetches it in one call.
- Parameters:
data (array_like) – Data that is used to fit the model.
**kwargs – Additional arguments to fit().
- Returns:
The estimated model.
- Return type:
model
- get_params(deep=False)
Get the parameters.
- Returns:
params – Parameter names mapped to their values.
- Return type:
mapping of string to any
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
object
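The `<component>__<parameter>` convention can be illustrated with a small routing sketch. The classes `Component` and `Pipeline` below are hypothetical stand-ins written for this example and are not part of deeptime. They only show how a double-underscore key might be split and forwarded to a nested object.

```python
class Component:
    """Hypothetical stand-in for an estimator with a flat parameter set."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        self.params.update(params)
        return self


class Pipeline(Component):
    """Hypothetical container that routes '<component>__<parameter>' keys."""

    def __init__(self, **components):
        super().__init__()
        self.components = components

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # Nested key: forward the remainder to the named component.
                self.components[name].set_params(**{sub_key: value})
            else:
                # Plain key: the parameter belongs to the container itself.
                self.params[key] = value
        return self


pipe = Pipeline(clustering=Component(dmin=0.5, max_centers=1000))
pipe.set_params(clustering__dmin=2.0, verbose=True)
print(pipe.components["clustering"].params)  # {'dmin': 2.0, 'max_centers': 1000}
print(pipe.params)                           # {'verbose': True}
```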
- property dmin: float
Minimum distance between cluster centers.
- Getter:
Yields the currently set minimum distance.
- Setter:
Sets a new minimum distance, must be non-negative.
- Type:
float
- property has_model: bool
Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.
- Type:
bool
- property max_centers: int
Cutoff during clustering. Once this number of centers is reached, no more data is taken into account. In that case, consider a larger max_centers value or a larger dmin value.
- Getter:
Current maximum number of cluster centers.
- Setter:
Sets a new maximum number of cluster centers, must be non-negative.
- Type:
int
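The interplay between the cutoff and dmin can be seen in a toy example. The function `pick_centers` is a hypothetical pure-Python helper, not deeptime code, and it assumes that the search simply stops once the cutoff is reached. With a small dmin the cutoff is hit before the data is exhausted, whereas a larger dmin covers the same data with the same number of centers.

```python
import math

def pick_centers(points, dmin, max_centers):
    """Hypothetical helper: returns (centers, truncated).

    truncated is True when the cutoff was reached while data still remained.
    """
    centers = []
    for p in points:
        if len(centers) >= max_centers:
            return centers, True
        if all(math.dist(p, c) > dmin for c in centers):
            centers.append(p)
    return centers, False

# Ten equally spaced one-dimensional points: 0.0, 1.0, ..., 9.0
data = [(float(i),) for i in range(10)]

small, small_truncated = pick_centers(data, dmin=0.5, max_centers=4)
large, large_truncated = pick_centers(data, dmin=2.5, max_centers=4)

print(small, small_truncated)  # [(0.0,), (1.0,), (2.0,), (3.0,)] True
print(large, large_truncated)  # [(0.0,), (3.0,), (6.0,), (9.0,)] False
```

In the first call the centers only cover the beginning of the data, which is the situation where raising max_centers or dmin is worth considering.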
- property metric: str
The metric that is used for clustering.
- Type:
str
- property model
Shortcut to fetch_model().
- property n_clusters: int
Alias to max_centers.
- property n_jobs: int
The number of threads to use during estimation.
- Getter:
Yields the number of threads to use, -1 is an allowed value for all available threads.
- Setter:
Sets the number of threads to use, can be None in which case it defaults to 1 thread.
- Type:
int
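The getter and setter rules above (None falls back to one thread, -1 means all available threads) could be mirrored by a helper along the following lines. `resolve_n_jobs` is a hypothetical name introduced for this sketch. Using `os.cpu_count()` for "all available threads" and rejecting other non-positive values are assumptions, not documented behaviour.

```python
import os

def resolve_n_jobs(n_jobs):
    """Hypothetical helper translating an n_jobs setting into a thread count."""
    if n_jobs is None:
        return 1  # documented default: a single thread
    if n_jobs == -1:
        # Assumption: "all available threads" is taken from os.cpu_count().
        return os.cpu_count() or 1
    if n_jobs < 1:
        # Assumption: other non-positive values are treated as invalid here.
        raise ValueError(f"n_jobs must be None, -1 or positive, got {n_jobs}")
    return n_jobs

print(resolve_n_jobs(None))  # 1
print(resolve_n_jobs(4))     # 4
```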