Operations Research and Management Science ›› 2025, Vol. 34 ›› Issue (3): 183-189. DOI: 10.12005/orms.2025.0094

• Applied Research •

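Forecasting Futures Price Fluctuations by Integrating Multivariate Network Information: Empirical Evidence from Agricultural Corn Futures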

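ZHANG Dabin1,2,3, ZENG Zhimei1, LING Liwen1,2, YU Zehui1

  1. School of Mathematics and Information, South China Agricultural University, Guangzhou 510642, Guangdong, China;
  2. Rural Development Institute, South China Agricultural University, Guangzhou 510642, Guangdong, China;
  3. School of Big Data and Computer Science, Guangdong Baiyun University, Guangzhou 510450, Guangdong, China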
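  • Received: 2022-10-14 Online: 2025-03-25 Published: 2025-07-04
  • About the author: ZHANG Dabin (1969-), male, born in Qianjiang, Hubei; professor and doctoral supervisor; research interests: forecasting theory and methods.
  • Funding:
    National Natural Science Foundation of China (71971089, 72001083); Natural Science Foundation of Guangdong Province (2022A1515011612); Special Project in Key Fields of Guangdong Provincial Universities (2020ZDZX3009)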

Prediction of Futures Price by Integrating Multivariate Network Information: An Empirical Study of Agricultural Corn Futures

ZHANG Dabin1,2,3, ZENG Zhimei1, LING Liwen1,2, YU Zehui1   

  1. School of Mathematics and Information, South China Agricultural University, Guangzhou 510642, China;
  2. Rural Development Institute, South China Agricultural University, Guangzhou 510642, China;
  3. School of Big Data and Computer Science, Guangdong Baiyun University, Guangzhou 510450, China
  • Received: 2022-10-14 Online: 2025-03-25 Published: 2025-07-04

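Summary: Media information carried by the Internet is the public's main source of information. It influences the investment decisions of futures market participants and also the performance of the market itself. This paper focuses on mining the enabling effect of network information on the futures market and proposes a method for forecasting futures price fluctuations that integrates multivariate network information, using agricultural corn futures as the empirical subject to verify the method's effectiveness. First, the KL-LDA model and the SnowNLP method are applied to relevant news to construct a topic index and a sentiment index, respectively, and a cumulative decay factor is introduced to optimize the sentiment index. Second, the Baidu demand graph is used to build a core keyword library, and the corresponding Baidu Index values are used to build a network attention index. Finally, Recursive Feature Elimination (RFE) is used to construct the combination of predictor variables, and the deep learning model LSTM is used to forecast the futures price. Empirical results for corn futures: compared with a univariate LSTM model, the method reduces MAE, RMSE and MAPE by 45%, 41% and 43%, respectively. It can therefore effectively measure the value of network information for corn futures price forecasting and improve the model's prediction accuracy.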
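The abstract does not give the form of the cumulative decay factor. As a minimal sketch, assuming a geometric decay over a fixed look-back window (the function name `cumulative_decay` and the parameters `decay` and `window` are hypothetical and not taken from the paper), each day's sentiment score could be combined with discounted scores from earlier days:

```python
def cumulative_decay(daily_scores, decay=0.5, window=3):
    """Decay-weighted sentiment index (illustrative assumption, not the paper's formula).

    For day t, returns the sum over k = 0..window of decay**k * daily_scores[t - k],
    skipping lags that fall before the start of the series.
    """
    smoothed = []
    for t in range(len(daily_scores)):
        total = 0.0
        for k in range(window + 1):
            if t - k < 0:
                break  # no earlier observations available
            total += (decay ** k) * daily_scores[t - k]
        smoothed.append(total)
    return smoothed
```

Under this assumption, a single strongly positive news day keeps contributing to the index on the following days, with a weight that shrinks by the factor `decay` each day.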

Keywords: multivariate network information, news information, network attention, deep learning model, corn futures price forecasting

Abstract: In China's financial market, the futures market performs the important functions of risk hedging and price discovery. Futures prices embody market expectations for agricultural products and can reflect their actual supply and demand. Corn futures are a representative bulk agricultural commodity and one of the most actively traded contracts in China's futures market, consistently ranking near the top of commodity futures in trading volume and investor participation. As the corn market has become more market-oriented, uncertainty has risen sharply, corn futures prices fluctuate continuously, the risk of investing in grain storage has intensified, and industrial demand for hedging has grown. Effective and accurate price prediction can guide farmers in planting and trading, give spot enterprises a reference for production and trade, and supply regulators with timely information that makes national policy regulation more forward-looking and better targeted.
Benefiting from the rapid development of the Internet, a large volume of market-related unstructured data now carries an enormous amount of information, which affects both the investment decisions of market participants and the performance of the market itself. On the one hand, news is received passively by the public; reports on major policies, emergencies and weather in producing regions in particular can strongly affect market sentiment and thereby drive fluctuations in futures prices. On the other hand, search engines are a way for people to seek information actively. Each trader sends a request to the network in the form of unstructured keywords, and the engine returns corresponding content that helps the trader judge market conditions and develop trading strategies. At the same time, the engine records search frequency as structured data. In the current futures market, these two channels are also the easiest ways for traders and supervisors to obtain diverse information. News topics and sentiment are the main expressions of the influence of news on market fluctuations, and search data is the main reflection of trends in public attention. The key research issues are therefore determining the number of topics, quantifying the sentiment index and selecting the search keywords.
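Keyword selection can be illustrated with a rank-correlation screen. The following is a minimal sketch, assuming that a keyword is kept when the absolute Spearman correlation between its search volume and the closing price reaches a cutoff. The abstract names a Spearman correlation test but not its criterion, so the `threshold` parameter and the helper names `average_ranks`, `spearman` and `select_keywords` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank series."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (sx * sy)


def select_keywords(search_volumes, prices, threshold=0.5):
    """Keep keywords whose |Spearman correlation| with price meets the cutoff."""
    return [
        keyword
        for keyword, series in search_volumes.items()
        if abs(spearman(series, prices)) >= threshold
    ]
```

A fixed cutoff on the correlation is only one possible screening rule; a significance test on the correlation would be an equally plausible reading of the abstract.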
In addition, many studies in price forecasting show that neural network models, owing to their structural characteristics and learning ability, can fit and model observed data efficiently, which greatly improves the accuracy of financial time series forecasting.
Based on the above ideas, the base data set used in this paper includes, in addition to the closing price of corn futures, two further data sources: relevant news and keyword search volume. On this basis, the paper proposes a corn futures price prediction method that integrates multivariate network information. Because major events are mostly published as news text, and effective prior knowledge about their influence on corn futures prices is lacking, the method first uses Kullback-Leibler (KL) divergence to determine the key parameters of the Latent Dirichlet Allocation (LDA) topic model, and then analyzes the news texts and extracts the topic index. Second, the SnowNLP method is used to judge the sentiment of the news, and the text sentiment index is further optimized to account for the cumulative effect over time. Third, a map of keywords related to corn futures is constructed with the help of Baidu; the Baidu search volume for each keyword in the map is collected, and the network attention index is synthesized from the keywords that pass a Spearman correlation test. To integrate the most effective predictor variables and reduce information redundancy, Recursive Feature Elimination (RFE) is used to construct the combination of predictor variables. Because the extracted multivariate indexes are all time series, a Long Short-Term Memory (LSTM) model is used for the final multi-step prediction of the corn futures price. The empirical results show that the proposed method outperforms the benchmark models SVR, RF and BPNN and has better prediction performance over medium- and long-term horizons. In addition, compared with a univariate LSTM model, the proposed method reduces MAE, RMSE and MAPE by 45%, 41% and 43%, respectively, and shows statistically significant performance advantages in Diebold-Mariano (DM) tests.
These results indicate that the proposed method can effectively measure the value of multivariate network information for predicting corn futures prices and improve the model's prediction accuracy.
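For readers unfamiliar with the three error measures, the sketch below uses their standard definitions and shows how a percentage reduction relative to a baseline can be computed. The function names are illustrative, and MAPE is assumed to be reported in percent; the paper's own implementation is not described in the abstract.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model's error falls below the baseline's."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error
```

A reported reduction of 45% in MAE, for example, corresponds to `relative_reduction(baseline_mae, proposed_mae)` being about 45, where the baseline is the univariate LSTM model.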
Because this paper focuses on the enabling effect of multivariate network information on futures price prediction, it discusses only the effectiveness of the predictor variables and does not further analyze the relative importance of the different predictor variables. Moreover, the corn futures price is the only financial variable used in the study. Subsequent research will continue to improve and extend this work.

Key words: multivariate network information, news information, Internet attention, deep learning model, corn futures price prediction

CLC number: