K-means Clustering Based on Improved Dung Beetle Optimizer

Operations Research and Management Science, 2025, Vol. 34, Issue (9): 77-83. DOI: 10.12005/orms.2025.0278

MA Zhihai, LIU Sheng

School of Management, Shanghai University of Engineering Sciences, Shanghai 201620, China

Received: 2024-01-12 Online: 2025-09-25 Published: 2026-01-19
Corresponding author: LIU Sheng (born 1966), male, from Huangshi, Hubei; professor, Ph.D.; research interests: intelligent computing, swarm intelligence systems, evolutionary algorithms. Email: ls6601@163.com.
About the author: MA Zhihai (born 1998), male, from Fuyang, Anhui; master's student; research interests: swarm intelligence algorithms, user profiling.
Funding: National Natural Science Foundation of China (61673258, 61075115); Natural Science Foundation of Shanghai (19ZR1421600).

Abstract

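Summary: To address the K-means clustering algorithm's sensitivity to the initial cluster centers and its tendency to become trapped in local optima, a K-means clustering algorithm based on an improved dung beetle optimization algorithm is proposed. First, a Piecewise Linear Chaotic Map (PWLCM) is introduced to improve population diversity and thereby raise the algorithm's solution accuracy and convergence speed. Second, inspired by the position identification and fishing strategy of the osprey optimization algorithm, its global exploration strategy replaces the ball-rolling stage strategy of the dung beetle optimization algorithm; this compensates for the ball-rolling stage relying only on the worst value and being unable to communicate with other dung beetles, and so strengthens global exploration. Third, a dynamically selected adaptive t-distribution perturbation is added to strengthen both global exploitation and local search, and the effectiveness and superiority of the improved dung beetle optimization algorithm are verified on the CEC2017 test functions. Finally, the improved algorithm is combined with K-means clustering, and six real datasets from the UCI repository are used in comparative simulation experiments against K-means variants optimized by swarm intelligence algorithms proposed by other researchers. The results show that the improved clustering algorithm achieves better solution accuracy and robustness.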
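As a rough illustration of chaotic initialization with a PWLCM, the sketch below iterates the standard piecewise linear chaotic map on random seeds and scales the result into the search bounds. The control parameter `p=0.4`, the `warmup` count, and the per-entry seeding scheme are assumptions made for this example; the abstract does not state the settings the authors used.

```python
import numpy as np

def pwlcm_step(x, p=0.4):
    """Apply one PWLCM iteration to values in [0, 1]; p in (0, 0.5) is the control parameter."""
    x = np.where(x >= 0.5, 1.0 - x, x)               # the map is symmetric about 0.5
    y = np.where(x < p, x / p, (x - p) / (0.5 - p))  # two linear pieces on [0, 0.5)
    return np.clip(y, 0.0, 1.0)                      # guard against floating-point overshoot

def pwlcm_init(pop_size, dim, lb, ub, p=0.4, warmup=20, seed=0):
    """Build a (pop_size, dim) population from PWLCM iterates (illustrative scheme, not the paper's)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(pop_size, dim))  # one random seed value per entry (assumption)
    for _ in range(warmup):
        x = pwlcm_step(x, p)
    return lb + x * (ub - lb)                        # scale chaotic values into [lb, ub]

population = pwlcm_init(pop_size=30, dim=5, lb=-10.0, ub=10.0)
```

Here `lb` and `ub` may be scalars or per-dimension arrays, since the final scaling relies on NumPy broadcasting.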


Abstract: Cluster analysis is a data analysis method used to group similar data points into different categories or clusters. It is an unsupervised learning approach that does not require pre-defined class labels but rather automatically classifies data points based on their similarity. Through cluster analysis, similar samples are assigned to the same group, revealing similarities and differences among samples, and providing a preliminary classification of the data. Cluster analysis has been widely applied in various fields such as data mining, image processing, natural language processing, and market segmentation.
The K-means clustering algorithm is the most commonly used algorithm in cluster analysis because of its simplicity, scalability, suitability for high-dimensional data, and robustness. However, K-means is highly sensitive to the initial choice of cluster centers, and poor initialization can lead to inaccurate or unstable clustering results. Swarm intelligence algorithms are stochastic search algorithms capable of escaping local optima; researchers have used them to optimize clustering algorithms with promising results. The dung beetle optimization algorithm (DBO) is a swarm intelligence optimization algorithm proposed in 2022, inspired by the ball-rolling, dancing, foraging, stealing, and reproduction behaviors of dung beetles. Compared with classical algorithms such as particle swarm optimization and the whale optimization algorithm, DBO exhibits better optimization performance. However, like other swarm intelligence algorithms, DBO may produce an unevenly distributed initial population that lacks diversity. In addition, when positions are updated during the ball-rolling phase, the algorithm relies solely on the worst value, which weakens its global exploration capability.
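To make the link between swarm search and clustering concrete, the following sketch shows one common way to let a metaheuristic score candidate cluster centers: each individual is a flattened vector of k centers, and its fitness is the within-cluster sum of squared errors. The function name `sse_fitness` and the choice of SSE as the objective are assumptions for illustration; the abstract does not specify the fitness function used in the paper.

```python
import numpy as np

def sse_fitness(position, data, k):
    """Score a candidate solution: a flat vector holding k cluster centers.

    Returns the within-cluster sum of squared errors and the label of each point.
    """
    centers = position.reshape(k, data.shape[1])
    # Squared Euclidean distance from every point to every center, shape (n, k).
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)                       # nearest center per point
    sse = d2[np.arange(len(data)), labels].sum()     # sum of nearest squared distances
    return float(sse), labels

data = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
candidate = np.array([0.0, 1.0, 10.0, 1.0])          # two centers: (0, 1) and (10, 1)
sse, labels = sse_fitness(candidate, data, k=2)
```

A swarm optimizer would minimize this value over the population, so a poor random start for one individual matters less than it does for a single run of plain K-means.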
To overcome K-means clustering's heavy reliance on the initial cluster centers, this study proposes a K-means clustering algorithm based on an improved dung beetle optimization algorithm, called POTDBO-K-means. First, the dung beetle optimization algorithm is enhanced with a Piecewise Linear Chaotic Map (PWLCM) to improve population diversity, solution accuracy, and convergence speed. Second, inspired by the position recognition and fishing strategy of the osprey optimization algorithm, the ball-rolling stage strategy of the dung beetle optimization algorithm is replaced with the osprey algorithm's global exploration strategy. This compensates for the ball-rolling stage's reliance on only the worst value and its inability to communicate with other dung beetles, thereby enhancing global exploration. Third, a dynamically selected adaptive t-distribution perturbation is introduced to increase both global exploitation and local search capabilities. The effectiveness and superiority of the improved dung beetle optimizer are verified through experiments on the CEC2017 test functions. Finally, the improved dung beetle optimizer is combined with the K-means clustering algorithm and compared, on six UCI datasets with different characteristics, with K-means clustering algorithms enhanced by swarm intelligence algorithms proposed by other researchers. The simulation results show that the POTDBO-K-means algorithm exhibits faster convergence, stronger optimization ability, and higher clustering accuracy.
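The adaptive t-distribution perturbation can be pictured with the minimal sketch below. It follows a form often used in the swarm intelligence literature, where the degrees of freedom grow with the iteration count so that early steps are heavy-tailed (Cauchy-like) and later steps approach Gaussian noise. The update rule, the linearly decaying selection probability, and the greedy acceptance step are all assumptions for this example and may differ from the paper's formulation.

```python
import numpy as np

def t_perturb(position, iteration, rng):
    """Perturb a position with Student-t noise whose degrees of freedom equal the iteration count."""
    df = max(1, iteration)                            # df=1 is Cauchy; large df is near Gaussian
    return position + position * rng.standard_t(df, size=position.shape)

def maybe_perturb(position, fitness_fn, iteration, max_iter, rng):
    """Apply the perturbation with a decaying probability and keep it only if fitness improves."""
    prob = 0.5 - 0.4 * iteration / max_iter           # hypothetical selection schedule
    if rng.random() >= prob:
        return position                               # perturbation not selected this time
    candidate = t_perturb(position, iteration, rng)
    # Greedy acceptance for a minimization problem.
    return candidate if fitness_fn(candidate) < fitness_fn(position) else position

rng = np.random.default_rng(0)
sphere = lambda v: float((v ** 2).sum())              # toy objective for demonstration
x = np.array([1.0, -2.0, 3.0])
x_new = maybe_perturb(x, sphere, iteration=5, max_iter=100, rng=rng)
```

Bound handling is omitted here; in a full optimizer the candidate would normally be clipped to the search range before evaluation.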
In future work, the proposed POTDBO-K-means clustering algorithm can be applied to challenging problems such as credit risk assessment, potential customer segmentation for the automotive industry, and user profiling for insurance products. Further research will also combine swarm intelligence algorithms with K-means clustering to improve the convergence speed and clustering accuracy of the K-means algorithm.

Key words: dung beetle optimization algorithm, PWLCM mapping, K-means clustering algorithm, adaptive t-distribution

CLC number: TP301.6

Cite this article: MA Zhihai, LIU Sheng. K-means Clustering Based on Improved Dung Beetle Optimizer[J]. Operations Research and Management Science, 2025, 34(9): 77-83.

