Combining Statistical Information and Distance Computation for K-Means Initialization

Wei Du; Lin Hu; Jianwei Sun; Bo Yu; Haibo Yang

doi:10.1109/skg.2016.022

Abstract

K-Means, the representative partition-based clustering method, is well known and widely used in many fields because it is easy to implement and highly efficient. However, the choice of initial centers can affect the final clustering result, and a poor choice can even leave some clusters empty. This paper proposes a new K-Means initialization method that combines statistical information with distance computation. The statistical information comprises the mean, the median, and a Gaussian kernel density estimate. First, high-density points are selected in each dimension. Then distance and density are used together to evaluate every candidate initial center. This process is applied dimension by dimension, from the highest-variance dimension to the lowest, and the final initial cluster centers are then constructed using the K nearest neighbors. Experiments on public datasets show that the method achieves results comparable to those of other conventional methods.
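The abstract gives only an outline of the procedure, so the following Python sketch is a loose illustration of the general idea rather than the authors' algorithm. It makes several simplifying assumptions that are not in the source: density is estimated with a one-dimensional Gaussian kernel along the single highest-variance dimension only, candidates are picked greedily by the product of density and distance to the centers already chosen, and each center is the mean of the chosen point's nearest neighbors. The function names `kde_1d` and `sketch_initial_centers`, the `n_neighbors` parameter, and the rule-of-thumb bandwidth are all hypothetical choices made for this example.

```python
import math
import statistics


def kde_1d(values, bandwidth):
    """Gaussian kernel density estimate evaluated at each input value."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((v - u) / bandwidth) ** 2) for u in values)
        for v in values
    ]


def sketch_initial_centers(points, k, n_neighbors=3):
    """Pick k initial centers from dense, mutually distant points.

    Simplified illustration only (assumption): density comes from the
    highest-variance dimension alone, not from every dimension in turn.
    """
    dims = len(points[0])
    variances = [statistics.pvariance([p[d] for p in points]) for d in range(dims)]
    top = max(range(dims), key=lambda d: variances[d])
    column = [p[top] for p in points]

    # Rule-of-thumb bandwidth (assumption); fall back to 1.0 for constant data.
    bandwidth = 1.06 * math.sqrt(variances[top]) * len(points) ** -0.2
    if bandwidth == 0.0:
        bandwidth = 1.0
    density = kde_1d(column, bandwidth)

    # Start from the densest point, then greedily add the point that
    # maximizes density times distance to the nearest chosen point.
    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            nearest = min(math.dist(points[i], points[j]) for j in chosen)
            return density[i] * nearest
        chosen.append(max(range(len(points)), key=score))

    # Each center is the mean of the chosen point's nearest neighbors
    # (the chosen point itself counts as its own nearest neighbor).
    centers = []
    for i in chosen:
        nearest = sorted(points, key=lambda p: math.dist(p, points[i]))[:n_neighbors]
        centers.append([statistics.fmean([p[d] for p in nearest]) for d in range(dims)])
    return centers
```

Centers produced this way could be passed to any K-Means implementation that accepts explicit starting centers. Because every selected point scores zero on the distance term, it cannot be selected twice as long as the data contain at least k distinct points.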

Related publications

A new projection-based K-Means initialization algorithm (Article, 2016)

One-shot Federated K-means Clustering based on Density Cores (Preprint, 2024)

One-shot Federated K-means Clustering based on Density Cores (Preprint, 2023)

Viewpoint-Based Kernel Fuzzy Clustering With Weight Information Granules (Article, 2022)

One-Shot Secure Federated K-Means Clustering Based on Density Cores (Article, 2025)
