Operations Research and Management Science (运筹与管理), 2025, Vol. 34, Issue (2): 111-117. DOI: 10.12005/orms.2025.0050

• Theoretical Analysis and Methodology •

Robust Multiple Regression Prediction Model Based on Level Dependent Choquet Integral and its Application

GAO Xiaohui, GONG Zaiwu

  1. School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China;
  2. Research Institute for Risk Governance and Emergency Decision-making, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Received: 2023-02-25 Online: 2025-02-25 Published: 2025-06-04
  • Corresponding author: GONG Zaiwu (b. 1975), male, from Linyi, Shandong; professor and doctoral supervisor; research interests: group decision-making, disaster management. Email: zwgong26@163.com.
  • About the author: GAO Xiaohui (b. 1988), male, from Yan'an, Shaanxi; doctoral candidate; research interest: forecasting theory.
  • Funding:
    National Natural Science Foundation of China (71971121); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21-1034)

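Abstract (Chinese version): When samples are limited, traditional multiple regression extracts too little information from the data. To address this, the paper proposes a robust multiple regression model based on the level dependent Choquet integral. First, the model tests the relationship between the independent and dependent variables through 0-1 programming and removes outliers, so that the data used are free of interference. Second, the data are processed with the level dependent Choquet integral, which accounts for interactions among indicators, to obtain a richer data sample and mine the information in the original data more deeply. Finally, fractional order accumulation is applied to the refined sample data, a multiple regression model is built with its parameters estimated by least squares, and the grey wolf optimization algorithm selects the fractional order accumulation coefficient to improve model performance. The new model is then applied to forecasting the carbon dioxide emissions of the Chinese fleet. The results show that the robust multiple regression model based on the level dependent Choquet integral predicts better than other classic models, and that the data mining framework designed in the paper (0-1 outlier testing plus data enrichment via the level dependent Choquet integral) can also be applied in many other prediction models.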

Abstract: Existing outlier detection methods struggle to cope with complex data. Modeling data in particular are subject to various kinds of interference, which causes the modeling results to deviate from the model. Consistency plays an important role in modeling complex, fluctuating data, and hidden inconsistencies pose a significant threat to model performance, so suitable methods for correcting the data are urgently needed. In the field of decision-making, robust ordinal regression obtains more robust parameter results by communicating repeatedly with decision-makers. This paper introduces that idea into the prediction model to identify inconsistencies in the data and thereby improve the robustness of the model. In addition, limited data restrict the performance of forecasting models, and how to fully mine the hidden information in the data that are available is a concern for current prediction models. The level dependent Choquet integral divides the traditional Choquet integral into intervals and obtains more information from the data through this finer division, which can effectively address insufficient information mining. Multiple regression is widely used in many fields, but it still has two shortcomings in multivariate prediction problems. First, outlier detection should consider the relationship between the dependent and independent variables simultaneously, that is, whether the data as a whole contain an exception rather than just a single sequence. Second, traditional decomposition techniques ignore the interaction between variables when enriching data, whereas the sample data should be enriched and mined with that interaction taken into account.
Effectively solving these two problems is therefore of great significance for improving the performance of multiple regression models, and it provides new ideas for the development of prediction models.
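As an illustration of the interval division described above, the following is a minimal Python sketch of a level dependent Choquet integral. It assumes the common piecewise-constant formulation, in which a separate capacity applies on each level interval and the integral accumulates the capacity of the upper level set {i : x_i >= t} over t. The abstract does not state the exact formulation the paper uses, so the function name `level_dependent_choquet`, its arguments, and the example capacities are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence

# A capacity maps a set of criterion indices to a number in [0, 1].
Capacity = Callable[[FrozenSet[int]], float]


def level_dependent_choquet(x: Sequence[float],
                            breakpoints: Sequence[float],
                            capacities: Sequence[Capacity]) -> float:
    """Integrate mu_j({i : x[i] >= t}) over t, where capacity mu_j is used
    on the level interval (breakpoints[j], breakpoints[j + 1]].

    Assumes every value of x lies in [breakpoints[0], breakpoints[-1]].
    """
    if len(capacities) != len(breakpoints) - 1:
        raise ValueError("need exactly one capacity per level interval")
    total = 0.0
    for lo, hi, mu in zip(breakpoints[:-1], breakpoints[1:], capacities):
        # Inside (lo, hi] the upper level set changes only at values of x.
        cuts = sorted({lo, hi, *(v for v in x if lo < v < hi)})
        for a, b in zip(cuts[:-1], cuts[1:]):
            # For every t in (a, b] the upper level set is {i : x[i] >= b}.
            upper = frozenset(i for i, v in enumerate(x) if v >= b)
            total += (b - a) * mu(upper)
    return total


# Hypothetical example: three indicators scored on [0, 1].
weights = [0.5, 0.3, 0.2]
additive = lambda A: sum(weights[i] for i in A)   # no interaction
optimistic = lambda A: 1.0 if A else 0.0          # any indicator above the level counts fully

x = [0.2, 0.8, 0.5]
one_level = level_dependent_choquet(x, [0.0, 1.0], [additive])
two_levels = level_dependent_choquet(x, [0.0, 0.5, 1.0], [additive, optimistic])
```

With a single level interval and an additive capacity the result reduces to the ordinary weighted sum of the scores. Splitting the range at 0.5 and using a different capacity on the upper interval changes the aggregate, which is the sense in which a finer division can carry more information about the same data.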
This paper proposes a robust multiple regression model based on the level dependent Choquet integral. First, the model checks the relationship between the dependent and independent variables through 0-1 programming. If the results are all 0, there are no outliers and the data are consistent. If they are not all 0, outliers make the data inconsistent, and these outliers are eliminated so that the data used are not disturbed. Second, the data are processed with the level dependent Choquet integral to obtain more precise data through interval division. The purpose is to obtain richer data samples while considering the interaction between indicators, so as to mine the information in the original data more deeply. Finally, the refined sample data are subjected to fractional order accumulation, the multiple regression model is established, and its parameters are estimated by the least squares principle. The grey wolf optimization algorithm is used to optimize the fractional order accumulation coefficient. The purpose of fractional order accumulation is to improve the predictive performance of the multiple regression model: a model built on the sequence produced by the r-order accumulation operator offers more choice than traditional multiple regression. When r equals 0, it reduces to traditional multiple regression, so adding the r-order accumulation operator extends the traditional model. On this basis, the new model is applied to predicting the carbon dioxide emissions of the Chinese fleet. The results show that the robust multiple regression model based on the level dependent Choquet integral predicts better than other classic models. The data mining system designed in this paper can also be applied to many other prediction models.
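The fractional order accumulation and least squares steps can be sketched in Python as follows. The abstract does not give the operator, so this sketch assumes the form commonly used in the grey forecasting literature, x^(r)(k) = Σ_{i=1..k} C(k-i+r-1, k-i) x(i), which returns the original sequence when r = 0 and the ordinary cumulative sum when r = 1. The function names `fractional_accumulate` and `fit_accumulated_regression`, and the choice to include an intercept column, are assumptions made for illustration. The search for r by grey wolf optimization is not shown.

```python
import numpy as np


def fractional_accumulate(x, r):
    """r-order accumulation along the first axis (assumed operator form)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # w[j] = C(r + j - 1, j), built recursively so that r = 0 is handled
    # without evaluating a gamma function at its pole.
    w = np.ones(n)
    for j in range(1, n):
        w[j] = w[j - 1] * (r + j - 1) / j
    # x_r[k] = sum over i <= k of w[k - i] * x[i]; works column-wise for 2-D input.
    return np.array([w[k::-1] @ x[:k + 1] for k in range(n)])


def fit_accumulated_regression(X, y, r):
    """Least squares fit of accumulated y on accumulated X plus an intercept."""
    Xr = fractional_accumulate(X, r)
    yr = fractional_accumulate(y, r)
    A = np.column_stack([np.ones(len(yr)), Xr])
    coef, *_ = np.linalg.lstsq(A, yr, rcond=None)
    return coef  # [intercept, coefficient for each column of X]


# Hypothetical data: five observations of two regressors.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1]

coef_r0 = fit_accumulated_regression(X, y, r=0.0)   # ordinary multiple regression
coef_r05 = fit_accumulated_regression(X, y, r=0.5)  # regression on 0.5-order accumulated data
```

Because the accumulation is a linear operator, an exact linear relation without intercept between `y` and `X` is preserved for any r, so both fits recover the same slopes on this noise-free example. On real data the fitted model depends on r, which is why the coefficient is left to an optimizer.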

Key words: level dependent Choquet integral, robust multiple regression, fractional order accumulation, grey wolf optimization

CLC number: