class MiniBatchKMeans¶
- class deeptime.clustering.MiniBatchKMeans(n_clusters, batch_size=100, max_iter=5, metric='euclidean', tolerance=1e-05, init_strategy='kmeans++', n_jobs=None, initial_centers=None)¶
K-means clustering in a mini-batched fashion.
- Parameters:
batch_size (int, optional, default=100) – The maximum number of samples per batch when calling fit().
Attributes
fixed_seed – Seed for random choice of initial cluster centers.
has_model – Property reporting whether this estimator contains an estimated model.
init_strategy – Strategy to get an initial guess for the centers.
initial_centers – Yields initial centers which override the init_strategy().
max_iter – Maximum number of clustering iterations before stop.
metric – The metric that is used for clustering.
model – Shortcut to fetch_model().
n_clusters – The number of cluster centers to use.
n_jobs – Number of threads to use during clustering and assignment of data.
tolerance – Stopping criterion for the k-means iteration.
Methods
fetch_model() – Fetches the current model.
fit(data[, initial_centers, ...]) – Perform clustering on whole data.
fit_fetch(data, **kwargs) – Fits the internal model on data and subsequently fetches it in one call.
fit_transform(data[, fit_options, ...]) – Fits a model which simultaneously functions as transformer and subsequently transforms the input data.
get_params([deep]) – Get the parameters.
partial_fit(data[, n_jobs]) – Updates the current model (or creates a new one) with data.
set_params(**params) – Set the parameters of this estimator.
transform(data, **kw) – Transforms a trajectory to a discrete trajectory by assigning each frame to its respective cluster center.
- __call__(*args, **kwargs)¶
Call self as a function.
- fetch_model() → Optional[KMeansModel]¶
Fetches the current model. Can be None in case fit() was not called yet.
- Returns:
model – the latest estimated model
- Return type:
KMeansModel or None
- fit(data, initial_centers=None, callback_init_centers=None, callback_loop=None, n_jobs=None)¶
Perform clustering on whole data.
- fit_fetch(data, **kwargs)¶
Fits the internal model on data and subsequently fetches it in one call.
- Parameters:
data (array_like) – Data that is used to fit the model.
**kwargs – Additional arguments to fit().
- Returns:
The estimated model.
- Return type:
model
- fit_transform(data, fit_options=None, transform_options=None)¶
Fits a model which simultaneously functions as transformer and subsequently transforms the input data. The estimated model can be accessed by calling fetch_model().
- Parameters:
data (array_like) – The input data.
fit_options (dict, optional, default=None) – Optional keyword arguments passed on to the fit method.
transform_options (dict, optional, default=None) – Optional keyword arguments passed on to the transform method.
- Returns:
output – Transformed data.
- Return type:
array_like
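The call pattern described for fit_transform can be sketched with a toy estimator. NearestCenterEstimator and its n_centers keyword are hypothetical names invented for this illustration; they are not part of deeptime, and the "fit" step is deliberately trivial.

```python
import numpy as np

class NearestCenterEstimator:
    """Hypothetical toy estimator, used only to show the call pattern."""

    def __init__(self):
        self.centers = None

    def fit(self, data, n_centers=2):
        # Toy "fit": use the first n_centers frames as centers.
        self.centers = np.asarray(data[:n_centers], dtype=float)
        return self

    def transform(self, data):
        # Assign each frame to the closest center (squared Euclidean distance).
        d2 = ((np.asarray(data)[:, None, :] - self.centers[None]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    def fit_transform(self, data, fit_options=None, transform_options=None):
        # None means "no extra keyword arguments".
        self.fit(data, **(fit_options or {}))
        return self.transform(data, **(transform_options or {}))

data = np.array([[0.0], [5.0], [0.2], [4.9]])
labels = NearestCenterEstimator().fit_transform(data, fit_options={"n_centers": 2})
```

Passing the options as dictionaries keeps the keyword arguments of the two steps apart, so a key meant for fit can never be forwarded to transform by accident.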
- get_params(deep=False)¶
Get the parameters.
- Returns:
params – Parameter names mapped to their values.
- Return type:
mapping of string to any
- partial_fit(data, n_jobs=None)¶
Updates the current model (or creates a new one) with data. This method can be called repeatedly and can thus be used to train a model in an online fashion. Note that usually multiple passes over the data are used. Also, this method should not be mixed with calls to fit(), as the model is then overwritten with a new instance based on the data passed to fit().
- Parameters:
data ((T, n) ndarray) – Data with which the model is updated and/or initialized.
n_jobs (int, optional, default=None) – number of jobs to use when updating the model, supersedes the n_jobs attribute of the estimator.
- Returns:
self – reference to self
- Return type:
- set_params(**params)¶
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
object
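The <component>__<parameter> naming convention can be illustrated with a small pure-Python sketch. split_nested_params is a hypothetical helper and "clustering" a hypothetical component name; the sketch only shows how such keys can be routed, not how deeptime implements set_params.

```python
def split_nested_params(params):
    """Group parameters by component using the '<component>__<parameter>' form."""
    own, nested = {}, {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:  # key addresses a parameter of a nested component
            nested.setdefault(component, {})[sub_key] = value
        else:    # key addresses a parameter of the estimator itself
            own[key] = value
    return own, nested

own, nested = split_nested_params({"n_clusters": 10, "clustering__max_iter": 20})
```

Because the key is split at the first double underscore only, deeper nesting is preserved in the remainder and can be resolved by the component that receives it.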
- transform(data, **kw) → ndarray¶
Transforms a trajectory to a discrete trajectory by assigning each frame to its respective cluster center.
- Parameters:
data ((T, n) ndarray) – trajectory with T frames and data points in n dimensions.
**kw – ignored kwargs for scikit-learn compatibility
- Returns:
discrete_trajectory – discrete trajectory
- Return type:
(T, 1) ndarray
See also
ClusterModel.transform
transform method of cluster model, implicitly called.
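Assigning each frame to its respective cluster center amounts to a nearest-neighbour lookup. The following NumPy sketch assumes the Euclidean metric; assign_to_centers is a hypothetical helper, and it returns a flat array of T indices.

```python
import numpy as np

def assign_to_centers(data, centers):
    """Return, for each frame, the index of the closest center (Euclidean)."""
    # Squared distances between all frames and all centers, shape (T, k).
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
trajectory = np.array([[0.1, -0.2], [9.5, 10.3], [0.4, 0.4], [8.0, 9.0]])
discrete_trajectory = assign_to_centers(trajectory, centers)
```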
- property fixed_seed¶
Seed for the random choice of initial cluster centers.
Fix this together with n_jobs=0 to get reproducible results. The latter is needed because parallel execution reintroduces non-deterministic behaviour.
- property has_model: bool¶
Property reporting whether this estimator contains an estimated model. This assumes that the model is initialized with None otherwise.
- Type:
bool
- property init_strategy¶
Strategy to get an initial guess for the centers.
- Getter:
Yields the strategy, can be one of “kmeans++” or “uniform”.
- Setter:
Setter for the initialization strategy that is used when no initial centers are provided.
- Type:
string
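The two strategy names can be made concrete with a short NumPy sketch. It follows the textbook definitions (uniform sampling of frames, and k-means++ seeding with probabilities proportional to the squared distance to the nearest chosen center); init_uniform and init_kmeans_pp are hypothetical helpers, not deeptime functions, and deeptime's own implementation may differ in detail.

```python
import numpy as np

def init_uniform(data, n_clusters, rng):
    """Pick n_clusters distinct frames uniformly at random."""
    return data[rng.choice(len(data), size=n_clusters, replace=False)]

def init_kmeans_pp(data, n_clusters, rng):
    """k-means++ seeding: favor frames far from the centers chosen so far."""
    centers = [data[rng.integers(len(data))]]
    for _ in range(n_clusters - 1):
        # Squared distance of every frame to its nearest chosen center.
        d2 = np.min([((data - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(data[rng.choice(len(data), p=d2 / d2.sum())])
    return np.array(centers)

data = np.random.default_rng(1).normal(size=(200, 2))
# Using the same seed twice yields the same initial centers.
a = init_kmeans_pp(data, 3, np.random.default_rng(42))
b = init_kmeans_pp(data, 3, np.random.default_rng(42))
u = init_uniform(data, 3, np.random.default_rng(42))
```

A frame that is already a center has squared distance zero and therefore cannot be drawn again, so k-means++ spreads the initial guess across the data.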
- property initial_centers: Optional[ndarray]¶
Yields initial centers which override the init_strategy(). Can be used to resume k-means iterations.
- Getter:
The initial centers or None.
- Setter:
Sets the initial centers. If not None, the array is expected to have length n_clusters.
- Type:
(k, n) ndarray or None
- property max_iter: int¶
Maximum number of clustering iterations before stop.
- Getter:
Yields the maximum number of clustering iterations.
- Setter:
Sets the maximum number of clustering iterations.
- Type:
int
- property metric: str¶
The metric that is used for clustering.
See also
_clustering_bindings.Metric
The metric class, can be subclassed
metrics
Metrics registry which maps from metric label to actual implementation
- property model¶
Shortcut to fetch_model().
- property n_clusters: int¶
The number of cluster centers to use.
- Getter:
Yields the number of cluster centers.
- Setter:
Sets the number of cluster centers.
- Type:
int
- property n_jobs: int¶
Number of threads to use during clustering and assignment of data.
- Getter:
Yields the number of threads. If -1, all available threads are used.
- Setter:
Sets the number of threads to use. If -1, all available threads are used; if None, one thread is used.
- Type:
int
- property tolerance: float¶
Stopping criterion for the k-means iteration. When the relative change of the cost function between two iterations is less than the tolerance, the algorithm is considered to be converged.
- Getter:
Yields the currently set tolerance.
- Setter:
Sets a new tolerance.
- Type:
float
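The stopping criterion can be written down as a small check. This sketch assumes one common definition of the relative change, |current - previous| / |previous|; has_converged is a hypothetical helper and the exact formula used by deeptime may differ.

```python
def has_converged(previous_cost, current_cost, tolerance=1e-5):
    """True if the relative change of the cost is below the tolerance."""
    if previous_cost == 0:
        # Avoid division by zero: only an unchanged cost counts as converged.
        return current_cost == 0
    return abs(current_cost - previous_cost) / abs(previous_cost) < tolerance

# Cost after each of four iterations (made-up numbers for illustration).
costs = [120.0, 80.0, 79.5, 79.4999]
flags = [has_converged(p, c) for p, c in zip(costs, costs[1:])]
```

With the default tolerance, only the last step in this made-up sequence counts as converged, since the cost changes by roughly one part in a million there.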