Operations Research and Management Science ›› 2024, Vol. 33 ›› Issue (8): 213-218. DOI: 10.12005/orms.2024.0273

• Application Research •

Three-way Clustering Model Integrated Decomposition Ensemble Learning for Forecasting Stock Price

BAI Juncheng1, SUN Bingzhen1, GUO Yuqi1, CHEN Youwei1, GUO Jianfeng2   

  1. School of Economics and Management, Xidian University, Xi’an 710071, China;
  2. Research Center for Computational Finance and Risk Management, Xi’an University of Posts and Telecommunications, Xi’an 710061, China
  • Received: 2021-11-18 Online: 2024-08-25 Published: 2024-10-29

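  • About the authors: BAI Juncheng (born 1993), male, from Zhenyuan, Gansu, doctoral candidate; research interest: machine learning. SUN Bingzhen (born 1979), corresponding author, male, from Ningxian, Gansu, Ph.D., professor and doctoral supervisor; research interests: data science and intelligent decision-making, emergency management decision-making, and data-driven medical decision-making.
  • Funding:
    National Natural Science Foundation of China (72071152); Shaanxi Province Science Fund for Distinguished Young Scholars (2023-JC-JQ-11); Xi’an Science and Technology Project (2022RKYJ0030); Humanities and Social Sciences Research Project of the Ministry of Education (22YJA630008); Fundamental Research Funds for the Central Universities (20101236618, 20101236262)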

Abstract: Accurate trend analysis and real-time price prediction are effective ways to achieve optimal investment returns. However, financial markets are shaped by changes in the objective economic environment, investors’ expected returns, and other underlying factors, which makes them difficult for traditional forecasting methods. Finding a forecasting tool that stays reliable in uncertain environments and improves prediction accuracy is therefore a scientific problem worth in-depth exploration.
   This paper combines the decomposition ensemble idea with the theory of three-way decisions and proposes a composite forecasting model based on three-way clustering. First, the Complementary Ensemble Empirical Mode Decomposition (CEEMD) method decomposes the original time series into several relatively stable sub-series, which reduces the complexity of the original series while uncovering hidden information. Next, because the sub-series differ in their properties, sample entropy is used to measure the complexity of each sub-series, and a probabilistic rough set based on Bayesian risk decision is constructed to classify the sub-series into core, marginal, and trivial domains. Then, to avoid both missing input information and interference from redundant information, phase space reconstruction is used to determine the optimal input structures of the Elman neural network, the extreme learning machine, and the BP neural network, which predict the core, marginal, and trivial domains, respectively. Finally, the proposed model is applied to forecasting the price of the stock ANY in the U.S. market, as well as important international and domestic stock indices and their constituent stocks.
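The complexity-based grouping step described above can be illustrated with a short Python sketch that computes sample entropy for each sub-series and then splits the sub-series into three regions using the standard Bayesian-risk thresholds of three-way decision theory. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the loss values, the min-max scaling of entropy into a score in [0, 1], and the mapping of high scores to the core domain are all hypothetical choices made for this example.

```python
import math
import statistics


def sample_entropy(series, m=2, r_factor=0.2):
    """SampEn(m, r) = -ln(A / B), with tolerance r = r_factor * std(series)."""
    n = len(series)
    r = r_factor * statistics.pstdev(series)

    def count_matches(length):
        # Use n - m templates for both lengths so A and B are comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                # Chebyshev distance between the templates starting at i and j.
                dist = max(abs(series[i + k] - series[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined when no template pairs match
    return -math.log(a / b)


def three_way_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Thresholds (alpha, beta) from a Bayesian-risk loss table.

    l_xy is the loss of action x (p = accept, b = defer, n = reject) when the
    object does (y = p) or does not (y = n) belong to the target concept.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def three_way_partition(entropies, alpha, beta):
    """Assign each sub-series index to the core, marginal, or trivial region."""
    lo, hi = min(entropies), max(entropies)
    width = (hi - lo) or 1.0  # guard against identical entropies
    regions = {"core": [], "marginal": [], "trivial": []}
    for idx, value in enumerate(entropies):
        score = (value - lo) / width  # ASSUMPTION: min-max scaled entropy
        if score >= alpha:
            regions["core"].append(idx)      # ASSUMPTION: high score -> core
        elif score <= beta:
            regions["trivial"].append(idx)
        else:
            regions["marginal"].append(idx)
    return regions


# Hypothetical loss values, chosen only so that alpha > beta.
alpha, beta = three_way_thresholds(l_pp=0, l_bp=2, l_np=6, l_pn=6, l_bn=2, l_nn=0)
# Hypothetical entropy values standing in for four decomposed sub-series.
regions = three_way_partition([0.05, 0.4, 0.9, 1.6], alpha, beta)
```

In practice the entropies passed to `three_way_partition` would come from calling `sample_entropy` on each sub-series produced by the decomposition; the hard-coded list here only keeps the example short.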
   The proposed method shows good predictive performance for stock prices, which can be attributed to three factors. First, CEEMD effectively uncovers hidden information in the time series. Second, three-way clustering enhances the adaptability of the forecasting method. Third, phase space reconstruction adaptively constructs the input structures of the neural networks. Theoretically, integrating granular computing with decomposition ensemble methods is a useful exploration in constructing forecasting and decision models for complex dynamic data. From the perspective of time series complexity, a three-way clustering model built on Bayesian risk decision and probabilistic rough sets offers a new angle for enriching the theory of three-way decisions. In practice, accurate stock price predictions can help investors avoid future risks more effectively and provide scientific support and reference for practical investment decisions.
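To make the role of phase space reconstruction concrete, the following Python sketch builds neural-network inputs by time-delay embedding. It is only an illustration under stated assumptions: the function name `delay_embed` is hypothetical, the embedding dimension `dim` and delay `tau` are treated as given (the abstract does not say how they are selected), and the one-step-ahead target is an assumed forecasting setup.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding for one-step-ahead forecasting.

    Each input vector is (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]) and
    its target is the value one step after the last element of the vector.
    """
    span = (dim - 1) * tau  # distance from first to last element of a vector
    inputs, targets = [], []
    for t in range(len(series) - span - 1):
        inputs.append([series[t + k * tau] for k in range(dim)])
        targets.append(series[t + span + 1])
    return inputs, targets


# Toy series 0..9 with hypothetical settings dim=3, tau=2.
inputs, targets = delay_embed(list(range(10)), dim=3, tau=2)
# The first pair is ([0, 2, 4], 5) and the last pair is ([4, 6, 8], 9).
```

Under this setup, `dim` fixes the number of input nodes of each network, so a sub-series with a different reconstructed dimension gets a differently sized input layer.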

Key words: three-way clustering, complementary ensemble empirical mode decomposition, stock price forecasting

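Abstract (translated from the Chinese): Accurate trend judgment and real-time price prediction are effective ways to obtain desirable investment returns. Real financial markets are affected by changes in the objective economic environment, investors’ expected returns, and other latent factors, which puts traditional forecasting methods under considerable challenge and pressure. How to find a reliable forecasting tool in an uncertain environment and improve prediction accuracy is a scientific problem worth in-depth study. To obtain accurate forecasts and help investors maximize profit, this paper introduces the decomposition ensemble idea and three-way decision theory and proposes a composite forecasting method based on three-way clustering and decomposition ensemble. First, complementary ensemble empirical mode decomposition is used to decompose the original time series into several relatively stationary sub-series, reducing the complexity of the original series while mining hidden information. Second, to handle sub-series with different properties in a targeted way, a probabilistic rough set based on Bayesian risk decision is constructed to perform three-way clustering. Next, to avoid a lack of input information or interference from redundant information, a feature selection method based on phase space reconstruction is used to determine the input structures of the different neural networks. Finally, the proposed method is applied to forecasting the price of the U.S. stock ANY and to forecasting important international and domestic stock indices and their constituent stocks, verifying its effectiveness and practicality. The work is also a useful attempt at integrating granular computing with decomposition ensemble to build forecasting and decision models and methods for complex dynamic data. In addition, the results provide scientific support and reference for investors’ practical investment decisions.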

