Operations Research and Management Science ›› 2025, Vol. 34 ›› Issue (5): 89-96. DOI: 10.12005/orms.2025.0148

• Theoretical Analysis and Methodological Study •

Feature Block Decomposition Algorithm for Sparse Support Vector Machines Based on the Conjugate Gradient Direction of the SCN Function

PAN Yang, MENG Zhiqing, WEN Guodong, JIANG Min

  1. School of Management, Zhejiang University of Technology, Hangzhou 310023, Zhejiang, China
  • Received: 2023-04-03; Published: 2025-08-26
  • Corresponding author: MENG Zhiqing (1962-), male, born in Shanghai, professor, PhD; research interests: optimization theory and data mining.
  • About the author: PAN Yang (1998-), male, born in Ningbo, Zhejiang, master's student; research interests: data mining and machine learning.
  • Funding:
    General Program of the National Natural Science Foundation of China (11871434)

Feature Block Decomposition Algorithm of Sparse Support Vector Machine Based on the Conjugate Gradient Direction of the SCN Function

PAN Yang, MENG Zhiqing, WEN Guodong, JIANG Min   

  1. School of Management, Zhejiang University of Technology, Hangzhou 310023, China
  • Received:2023-04-03 Published:2025-08-26

Abstract: With the widespread application of machine learning classification algorithms to multimodal big data, accurately classifying high-dimensional data has become urgent and important. When handling high-dimensional data, traditional support vector machine models are often affected by redundant features, which lowers classification accuracy, so methods that achieve feature sparsification are crucial. Although many researchers have proposed sparsification by adding regularization terms, these methods essentially construct a function that approximates the L0 norm, and a gap in sparsity remains between them and the L0 norm itself. To obtain sparser classification results, this paper builds a sparse support vector machine model with the L0 norm and uses a strongly convertible nonconvex (SCN) function to transform the L0 norm into a differentiable convex-concave continuous function, thereby overcoming the difficulty of computing with the L0 norm directly and allowing the model to be solved with gradient descent algorithms. Comparative experiments between the proposed CGDL-SVM algorithm and other classical algorithms on five high-dimensional datasets show that, while maintaining comparable classification accuracy, CGDL-SVM is significantly sparser than the other algorithms.
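The abstract above states that, once the L0 norm has been smoothed, the model can be solved with gradient descent algorithms, and the algorithm's name highlights conjugate gradient directions in particular. As a minimal, generic illustration of conjugate-gradient directions — classic linear CG on a quadratic model, not the paper's actual SVM solver — the update loop might look like this:

```python
import numpy as np

# Generic linear conjugate gradient on the quadratic model
# f(x) = 0.5 x^T A x - b^T x, whose residual r = b - A x is the
# negative gradient. Only an illustration of conjugate directions;
# the paper's solver works on the smoothed SVM objective instead.

def conjugate_gradient(A, b, iters=50, tol=1e-10):
    x = np.zeros_like(b)
    r = b - A @ x          # negative gradient at the current point
    d = r.copy()           # first search direction: steepest descent
    for _ in range(iters):
        Ad = A @ d
        alpha = (r @ r) / (d @ Ad)        # exact line-search step
        x = x + alpha * d
        r_new = r - alpha * Ad
        if np.linalg.norm(r_new) < tol:   # converged
            return x
        beta = (r_new @ r_new) / (r @ r)  # conjugacy coefficient
        d = r_new + beta * d              # new conjugate direction
        r = r_new
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
b = np.array([1.0, 1.0])
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b, atol=1e-8))  # True
```

For an n-dimensional quadratic, exact-line-search CG terminates in at most n steps, which is why conjugate directions typically need far fewer iterations than plain steepest descent.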

Keywords: sparsity, L0 norm, support vector machine, strongly convertible nonconvex function

Abstract: With the widespread application of machine learning classification algorithms to multimodal big data, accurate classification of high-dimensional data has become urgent and essential. As the feature dimension of the objects to be classified continues to grow, the number of features involved in the final classification result also increases, which lowers classification accuracy and makes such high-dimensional classification results less useful in practice. Therefore, in multimodal large models, sparsifying high-dimensional features has become a pressing issue in many practical classification applications.
Traditional support vector machine models lack feature selection capability and are often affected by redundant features, which lowers classification accuracy. Methods that achieve feature sparsification are therefore crucial. Many researchers have proposed sparsification methods that add regularization terms. Since the L0 norm is non-convex and non-smooth and its minimization is NP-hard, optimizing it directly is computationally challenging, so some researchers use the L1 norm or the Lp norm as the penalty term to simplify the computation while achieving a similar sparsifying effect. The core idea of these methods, however, is to construct a function that approximates the L0 norm: although they sidestep the computational difficulty, there is still room for improvement in sparsity.
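The relationship between the exact L0 count, its convex L1 relaxation, and a smooth approximation can be made concrete with a small sketch. The exponential surrogate below is a common differentiable concave approximation chosen for illustration; it is not the paper's specific SCN construction.

```python
import numpy as np

# Compare sparsity penalties on a weight vector: the L0 norm counts
# nonzeros exactly (but is non-convex, discontinuous, and NP-hard to
# optimize), the L1 norm is its usual convex relaxation, and a smooth
# surrogate such as sum(1 - exp(-|w|/eps)) approaches the L0 count as
# eps -> 0 while staying differentiable.

def l0_norm(w):
    """Exact sparsity measure: number of nonzero entries."""
    return int(np.count_nonzero(w))

def l1_norm(w):
    """Convex relaxation commonly used as a tractable penalty."""
    return float(np.sum(np.abs(w)))

def smooth_l0_surrogate(w, eps=0.1):
    """A differentiable concave approximation of the L0 norm."""
    return float(np.sum(1.0 - np.exp(-np.abs(w) / eps)))

w = np.array([0.0, 2.0, 0.0, -0.5, 0.0])
print(l0_norm(w))                                  # 2
print(l1_norm(w))                                  # 2.5
print(round(smooth_l0_surrogate(w, eps=0.01), 4))  # 2.0
```

Note how the L1 value depends on the magnitudes of the nonzero weights, while the L0 count and its smooth surrogate depend only on which entries are nonzero — the gap in sparsity behavior that motivates working with the L0 norm directly.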
This paper presents a sparse support vector machine approach based on the L0 norm. Given the non-convex and discontinuous nature of the L0 norm, the paper employs a strongly convertible nonconvex (SCN) function to convert the L0 norm into a differentiable convex-concave continuous function. The transformed convex-concave minimax problem is equivalent to a bilevel programming problem: the lower-level problem's optimal solution can be obtained directly and substituted into the upper-level problem, which is then solved with the conjugate gradient descent and steepest descent algorithms. This turns the original problem into a convex optimization problem, overcoming the difficulty of computing with the L0 norm directly, and, because the transformation is equivalent, the model retains the sparsity of the L0 norm. Finally, a feature block decomposition algorithm named CGDL-SVM is constructed. Its basic idea is to divide the samples into multiple small blocks by feature and solve each block sparsely, then merge the block-sparsified samples and perform a further sparsification optimization to obtain the final classification decision surface. This ensures classification accuracy while reducing the cost of high-dimensional feature computation.
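The block-decomposition idea — sparsify each feature block, merge the surviving features, then sparsify once more — can be sketched as follows. This is a heavily simplified stand-in, not the paper's method: an L1-penalized hinge loss solved by proximal subgradient steps replaces the SCN-based conjugate-gradient solver, and all function names and parameters are illustrative assumptions.

```python
import numpy as np

def sparse_linear_svm(X, y, lam=0.1, lr=0.01, iters=500):
    """L1-regularized linear hinge loss via proximal subgradient steps.
    A simplified stand-in for the paper's sparse block solver; y in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)
        active = margins < 1                      # margin-violating samples
        grad = -(X[active] * y[active, None]).sum(axis=0) / n
        w -= lr * grad                            # hinge subgradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w

def block_sparse_svm(X, y, n_blocks=4, lam=0.1):
    """Sparsify each feature block, merge survivors, then refit once."""
    blocks = np.array_split(np.arange(X.shape[1]), n_blocks)
    kept = []
    for idx in blocks:
        w = sparse_linear_svm(X[:, idx], y, lam=lam)
        kept.extend(idx[np.abs(w) > 1e-8])        # block-level survivors
    kept = np.array(sorted(kept), dtype=int)
    w_final = sparse_linear_svm(X[:, kept], y, lam=lam)
    return kept[np.abs(w_final) > 1e-8]           # final selected features

# toy data: only features 0 and 1 carry the class signal
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
selected = block_sparse_svm(X, y)
print(sorted(selected.tolist()))
```

Per-block solves touch only a slice of the feature matrix, which is the source of the computational saving on high-dimensional data; the final refit over merged survivors restores interactions between features that landed in different blocks.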
In the numerical experiments, we first compare the CGDL-SVM algorithm with three sparse support vector machine algorithms that use other regularization terms on five UCI datasets, demonstrating that L0-norm regularization is superior to the other regularization terms in sparsity. We then compare the CGDL-SVM algorithm with four classical sparse algorithms in terms of sparsity and accuracy on five high-dimensional real-world datasets. The results show that CGDL-SVM not only performs well in classification accuracy, excelling especially in sparsity, but also performs strongly on high-dimensional datasets, indicating good practicality.
In summary, the proposed algorithm achieves better sparsity while maintaining high classification accuracy, effectively balancing the trade-off between classification accuracy and sparsity and offering new ideas for research on sparse support vector machines.

Key words: sparsity, L0 norm, support vector machine, strongly convertible nonconvex function

CLC Number: