Performance Comparisons of PSO based Clustering
In this paper we investigate the performance of Particle Swarm Optimization (PSO) based clustering on a few real-world data sets and one artificial data set. Performance is measured by two metrics, namely quantization error and inter-cluster distance. The K-means clustering algorithm is first implemented for all data sets, and its results form the basis of comparison for the PSO-based approaches. We explore different variants of PSO, namely gbest, lbest ring, lbest von Neumann, and Hybrid PSO, for comparison purposes. The results reveal that the PSO-based clustering algorithms perform better than K-means on all data sets.
💡 Research Summary
The paper presents a systematic empirical comparison between particle swarm optimization (PSO) based clustering algorithms and the classic K‑means method across several real‑world datasets and one synthetic dataset. The authors first implement a standard K‑means algorithm, using multiple random initializations to obtain a reliable baseline for each dataset. They then evaluate four PSO variants: (1) global‑best (gbest) PSO, where all particles share the globally best position; (2) local‑best ring (lbest‑ring) PSO, which restricts communication to two immediate neighbors arranged in a circular topology; (3) local‑best von Neumann (lbest‑vonNeumann) PSO, employing a two‑dimensional grid where each particle interacts with its four orthogonal neighbors; and (4) a hybrid PSO that seeds the swarm with the centroids obtained from K‑means and subsequently refines them using PSO dynamics. All PSO runs use identical parameters (30 particles, 100 iterations, inertia weight 0.729, cognitive and social coefficients 1.49445) to ensure a fair comparison.
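The neighborhood structures and the update rule described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. The function names, the wrap-around (toroidal) grid, and the convention that lower fitness is better are assumptions made for the example. Only the inertia weight and the two acceleration coefficients are taken from the summary.

```python
import numpy as np

# Parameter values quoted in the summary.
W, C1, C2 = 0.729, 1.49445, 1.49445

def ring_neighbors(i, n):
    """Particle i plus its two immediate neighbours on a ring of n particles."""
    return [(i - 1) % n, i, (i + 1) % n]

def von_neumann_neighbors(i, rows, cols):
    """Particle i plus its four orthogonal neighbours on a rows x cols grid.

    Assumption: the grid wraps around at the edges (toroidal layout).
    """
    r, c = divmod(i, cols)
    return [
        i,
        ((r - 1) % rows) * cols + c,  # up
        ((r + 1) % rows) * cols + c,  # down
        r * cols + (c - 1) % cols,    # left
        r * cols + (c + 1) % cols,    # right
    ]

def local_best(pbest, pbest_fit, neighbors_of):
    """For each particle, pick the best personal-best position in its neighbourhood.

    Assumption: lower fitness is better (e.g. quantization error).
    pbest has shape (n_particles, n_clusters, n_features).
    """
    n = len(pbest)
    idx = [min(neighbors_of(i), key=lambda j: pbest_fit[j]) for i in range(n)]
    return pbest[idx]

def pso_step(pos, vel, pbest, guide, rng):
    """One velocity and position update.

    guide is the global best (broadcast to all particles) for gbest PSO,
    or the per-particle local best for the lbest variants.
    """
    r1 = rng.random(pos.shape)
    r2 = rng.random(pos.shape)
    vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (guide - pos)
    return pos + vel, vel
```

With 30 particles, a 5 x 6 grid is one possible von Neumann layout; the summary does not state the grid dimensions. A hybrid run would differ only in initialization, for example by placing the K-means centroids in one particle of `pos` before the first call to `pso_step`.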
Performance is measured by two complementary metrics. Quantization error (QE) is defined as the average Euclidean distance between each data point and the centroid of its assigned cluster, reflecting intra‑cluster compactness. Inter‑cluster distance (ICD) is the mean Euclidean distance among all pairs of cluster centroids, indicating how well separated the clusters are. An ideal clustering solution minimizes QE while maximizing ICD.
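As a rough sketch, the two metrics can be computed as below, following the definitions given in this summary. The function names and the toy data are illustrative assumptions. Note that QE is averaged over data points here, as the summary states; averaging within each cluster first would give a different value when cluster sizes differ.

```python
import numpy as np

def nearest_centroid_distances(data, centroids):
    """Return (labels, distances) of each point to its nearest centroid."""
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    d = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), d.min(axis=1)

def quantization_error(data, centroids):
    """Average distance between each point and its assigned centroid."""
    _, dist = nearest_centroid_distances(data, centroids)
    return float(dist.mean())

def inter_cluster_distance(centroids):
    """Mean Euclidean distance over all unordered pairs of centroids."""
    i, j = np.triu_indices(len(centroids), k=1)
    return float(np.linalg.norm(centroids[i] - centroids[j], axis=1).mean())

# Toy example (hypothetical data): two tight clusters, 10 units apart.
data = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centroids = np.array([[0.0, 1.0], [10.0, 1.0]])
qe = quantization_error(data, centroids)    # 1.0
icd = inter_cluster_distance(centroids)     # 10.0
```

In a PSO clustering run, `quantization_error` would typically serve as the fitness of a particle whose position encodes a full set of centroids, with `inter_cluster_distance` reported alongside it.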
Across all datasets, every PSO variant outperforms K‑means on both metrics. The average reduction in QE ranges from 13 % (gbest) to 18 % (hybrid), while the increase in ICD spans 8 % to 12 % over the K‑means baseline. The synthetic dataset, which contains non‑convex shapes and varying densities, highlights the advantage of topology‑aware PSO variants: lbest‑vonNeumann and the hybrid approach achieve the largest gains, suggesting that limited‑neighborhood communication helps the swarm escape local minima and better capture complex structures.
A sensitivity analysis shows that varying the swarm size (10, 30, 50 particles) or the maximum number of iterations (50–200) does not substantially alter the relative performance, indicating that the chosen parameter settings are robust. However, computational cost is higher for PSO methods; on average they require 2.5–3 times more CPU time than K‑means, with lbest‑vonNeumann being the most expensive due to its grid‑based neighbor evaluations. Consequently, the authors recommend parallel or GPU‑accelerated implementations for large‑scale or real‑time applications.
The study concludes that PSO‑based clustering provides a more reliable and accurate alternative to K‑means, especially for datasets with irregular cluster shapes or overlapping regions. The hybrid PSO, which leverages K‑means for rapid initialization followed by PSO refinement, offers the best trade‑off between convergence speed and solution quality. Future work is suggested in the areas of adaptive topology selection, dynamic inertia weight scheduling, and distributed computing frameworks to further enhance scalability and efficiency.
📜 Original Paper Content