Operations Research and Management Science ›› 2021, Vol. 30 ›› Issue (11): 127-134. DOI: 10.12005/orms.2021.0360

• Applied Research •

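Air Pollution Impact Prediction of Chemical Industry Park Based on Ensemble Learning Strategy

WANG Xu-ping1, YU Xiu-li1,2, WANG Tian-teng1

  1. School of Economics and Management, Dalian University of Technology, Dalian 116023, Liaoning, China;
  2. School of Marine Law and Humanities, Dalian Ocean University, Dalian 116023, Liaoning, China
  • Received: 2020-02-05 Published: 2021-11-25
  • Corresponding author: YU Xiu-li (b. 1980), female, from Dalian, Liaoning; associate professor, Ph.D.; research interests: information systems integration and emergency management.
  • About the authors: WANG Xu-ping (b. 1962), male, from Jinzhou, Liaoning; Ph.D., professor, doctoral supervisor; research interests: e-commerce and logistics management, emergency management. WANG Tian-teng (b. 1997), male, from Mudanjiang, Heilongjiang; master's degree; research interests: data mining, emergency management.
  • Funding:
    National Natural Science Foundation of China (72071028); Liaoning Province Economic and Social Development Research Project (2021lslybkt-058); Liaoning Province Social Science Planning Fund Project (L20BGL051)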

Abstract: An accurate and reliable air quality prediction system has both scientific value and practical significance for protecting public health and supporting social stability. This study focuses on chemical industry parks. Using real-time enterprise emission data collected in an Internet of Things setting, combined with meteorological information, it applies several supervised machine learning methods (decision tree, multiple linear regression, Lasso regression, support vector machine, XGBoost, gradient boosting machine, LightGBM, and multilayer perceptron (MLP)) together with an improved Stacking ensemble learning strategy to predict air quality in a chemical industry park and to identify the key factors affecting air pollution. The results show that: (1) the prediction framework under the Stacking strategy gives a statistically significant improvement over single-model predictions; (2) within the Stacking strategy, the choice of primary and secondary learners affects prediction accuracy and generalization, and the best configuration uses strong learners at the primary level and a linear model at the secondary level; (3) within the same park, the pollutant outlets of different enterprises affect air quality differently. These findings can provide decision support for government regulators in the governance and control of chemical industry parks.

Key words: Stacking strategy, machine learning, air pollution impact, chemical industry park
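
Finding (2) of the abstract describes a two-level Stacking layout: primary learners produce predictions, and a linear secondary learner combines them. The sketch below is one minimal way to illustrate that layout using only the Python standard library. It is not the authors' implementation. The data are synthetic, the two input features (an emission level and a wind speed) are hypothetical, and the primary learners (`fit_knn`, `fit_stump`) are toy stand-ins for the strong learners named in the abstract. All function names are illustrative assumptions.

```python
import random
from statistics import mean

def fit_knn(X, y, k=3):
    """Primary learner (toy stand-in): k-nearest-neighbour regression."""
    def predict(x):
        order = sorted(range(len(X)),
                       key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
        return mean(y[i] for i in order[:k])
    return predict

def fit_stump(X, y):
    """Primary learner (toy stand-in): a one-split regression tree."""
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # every feature is constant: fall back to the mean
        m = mean(y)
        return lambda x: m
    _, j, t, ml, mr = best
    return lambda x: ml if x[j] <= t else mr

def fit_linear(Z, y, ridge=1e-8):
    """Secondary learner: least squares with intercept (normal equations)."""
    A = [[1.0] + list(z) for z in Z]
    n = len(A[0])
    # Augmented system [A^T A + ridge*I | A^T y]; the tiny ridge avoids a zero pivot.
    M = [[sum(r[i] * r[j] for r in A) + (ridge if i == j else 0.0) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(A, y))] for i in range(n)]
    for c in range(n):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    w = [M[i][n] / M[i][i] for i in range(n)]
    return lambda z: w[0] + sum(wi * zi for wi, zi in zip(w[1:], z))

def fit_stacking(X, y, base_fitters, n_folds=5):
    """Train the secondary learner on out-of-fold primary predictions."""
    n = len(X)
    Z = [[0.0] * len(base_fitters) for _ in range(n)]
    for f in range(n_folds):
        held_out = list(range(f, n, n_folds))
        held = set(held_out)
        X_tr = [X[i] for i in range(n) if i not in held]
        y_tr = [y[i] for i in range(n) if i not in held]
        for m, fit in enumerate(base_fitters):
            model = fit(X_tr, y_tr)
            for i in held_out:
                Z[i][m] = model(X[i])
    meta = fit_linear(Z, y)
    bases = [fit(X, y) for fit in base_fitters]  # refit primaries on all data
    return lambda x: meta([b(x) for b in bases])

def rmse(model, X, y):
    return mean((model(x) - t) ** 2 for x, t in zip(X, y)) ** 0.5

# Synthetic data (assumption): target depends on emission level and wind speed.
rng = random.Random(0)
X = [[rng.uniform(0, 10), rng.uniform(0, 5)] for _ in range(120)]
y = [2.0 * e - 1.5 * w + rng.gauss(0, 0.5) for e, w in X]
X_train, y_train, X_test, y_test = X[:90], y[:90], X[90:], y[90:]

stacked = fit_stacking(X_train, y_train, [fit_knn, fit_stump])
print("stacked RMSE:", rmse(stacked, X_test, y_test))
print("stump RMSE:  ", rmse(fit_stump(X_train, y_train), X_test, y_test))
```

Training the secondary learner on out-of-fold predictions, rather than on predictions for samples the primary learners have already seen, is the usual way to keep the combiner from rewarding an overfitted primary learner. A linear combiner adds few parameters of its own, which is one plausible reason for the configuration the abstract reports as best.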

CLC number: