Operations Research and Management Science ›› 2025, Vol. 34 ›› Issue (3): 163-169. DOI: 10.12005/orms.2025.0091

• Applied Research •

A New Prediction Method for Shanghai Copper Futures Price Integrating Multi-source Data Information

SUN Jingyun1,2, BING Guiying1

  1. School of Statistics and Data Science, Lanzhou University of Finance and Economics, Lanzhou 730020, China;
  2. Center for Quantitative Analysis of Gansu Economic Development, Lanzhou 730020, China
  • Received: 2023-03-02 Online: 2025-03-25 Published: 2025-07-04
  • Supported by:
    National Natural Science Foundation of China (72061020); Gansu Province Science and Technology Program (21JR1RA280); 2022 Longyuan Youth Innovation and Entrepreneurship Talent Project (2022LYQN77)

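Summary: This paper studies Shanghai copper futures traded on the Shanghai futures market. Macroeconomic data and Baidu search keyword information are taken, respectively, as the macroeconomic factors and the micro-level investor attention features that drive changes in the Shanghai copper futures price, and a new prediction model, SC-KPCA-KELM, is proposed. The multi-source dataset is first grouped by systematic (hierarchical) clustering; kernel principal component analysis is then applied to the clustering results for feature extraction; finally, the extracted principal features serve as predictors, and four machine learning methods are used to forecast the monthly price of Shanghai copper futures. The empirical results show that, within this framework, the prediction model that combines macroeconomic data with Baidu search information achieves better performance in both level and directional prediction accuracy. The method can provide a basis for decisions by copper-related enterprises and copper futures speculators.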

Abstract: Among commodities, copper, as the most important industrial raw material, is widely used across China’s national economy. In futures trading, the futures price is the focus of market participants. However, the price of Shanghai copper futures is affected by many factors, which makes its movements highly uncertain. This uncertainty not only exposes speculative traders in the copper futures market to considerable risk, but also affects the production and operation of enterprises and the stability of the market. This paper therefore takes Shanghai copper futures as its research object and analyzes the main factors affecting their price, so as to reveal the efficiency of the copper futures market and the pattern of price changes. This is of great significance for using futures successfully as an investment tool and for ensuring stable economic development.
Although many scholars have studied the prediction of non-ferrous metal futures prices and related financial time series in recent years, research that forecasts Shanghai copper prices from multiple factors is still in its infancy. In terms of indicator selection, forecasting studies of Shanghai copper prices have mostly used historical price data and macroeconomic data, and have rarely considered the impact of investor attention on Shanghai copper futures prices; measuring that impact remains a challenging task. In terms of feature extraction, dimensionality reduction is unavoidable because of the large number of exogenous variables. If a large and heterogeneous set of exogenous information is reduced directly, the extraction may be insufficient. Clustering the information before dimensionality reduction, and then extracting features cluster by cluster, should therefore make information extraction more effective. This paper first integrates macroeconomic variables and Baidu search keyword information, then follows the idea of classifying first and reducing dimensionality second to extract effective auxiliary predictive information from the multi-source data. It then builds Shanghai copper futures price prediction models with several machine learning methods and evaluates them with improvement-rate indicators and statistical tests.
This paper integrates Baidu search information and macroeconomic data to propose a new model for forecasting Shanghai copper futures prices. First, systematic (hierarchical) clustering is used to classify and integrate the multi-source dataset; next, the KPCA method is used for dimensionality reduction and feature extraction; finally, a machine learning method produces the monthly price forecast for Shanghai copper futures. The research reaches four main conclusions: (1) Using a mixed dataset as exogenous auxiliary predictive information gives better prediction accuracy than using a single dataset. (2) Clustering multi-source, multi-dimensional data first and then extracting kernel principal components is effective. Similar information is grouped by the clustering step, and KPCA is then applied to each group of highly similar series for extraction and dimensionality reduction, which extracts more fully the exogenous auxiliary information related to the Shanghai copper futures price and thereby improves prediction accuracy. (3) Among the four machine learning prediction methods compared (SVR, RF, ELM and KELM), the prediction model using the KELM method is significantly better than the benchmark models in both level and directional prediction accuracy. (4) Based on the prediction results of this paper, the approach of clustering first and then extracting features shows good prediction performance for different research objects, indicating that the prediction framework is reasonably robust in multi-source information processing.
This paper considers the impact of investor attention on price fluctuations of Shanghai copper futures and integrates multi-source information for combined prediction, obtaining good predictive results. There is still room to improve the model, however, by incorporating more exogenous information as auxiliary predictors. For example, unstructured text related to Shanghai copper futures, such as financial news headlines and comments on Weibo and stock forums, could be used to construct investor sentiment indexes and further improve forecast accuracy.
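
The three-stage framework summarized in the abstract (cluster the indicators, apply KPCA within each cluster, then feed the extracted features to a KELM regressor) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function and class names (`cluster_columns`, `KPCA`, `KELM`, `sc_kpca_features`), the RBF kernel, the `1 - |correlation|` distance with average linkage, all parameter values, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def cluster_columns(X, n_clusters):
    """Average-linkage agglomerative clustering of the columns of X,
    using 1 - |correlation| as the distance (an assumed choice)."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    groups = [[j] for j in range(X.shape[1])]
    while len(groups) > n_clusters:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = dist[np.ix_(groups[a], groups[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]  # merge the closest pair
        del groups[b]
    return groups

class KPCA:
    """Kernel PCA with an RBF kernel; supports projecting new samples."""
    def __init__(self, n_components, gamma):
        self.n_components, self.gamma = n_components, gamma

    def fit(self, X):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        self.col_mean, self.all_mean = K.mean(axis=0), K.mean()
        Kc = K - self.col_mean[None, :] - K.mean(axis=1)[:, None] + self.all_mean
        vals, vecs = np.linalg.eigh(Kc)
        idx = np.argsort(vals)[::-1][: self.n_components]
        self.alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
        return self

    def transform(self, Z):
        K = rbf_kernel(Z, self.X, self.gamma)
        Kc = K - self.col_mean[None, :] - K.mean(axis=1)[:, None] + self.all_mean
        return Kc @ self.alphas

class KELM:
    """Kernel extreme learning machine: beta = (K + I/C)^{-1} y."""
    def __init__(self, C, gamma):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, y)
        return self

    def predict(self, Z):
        return rbf_kernel(Z, self.X, self.gamma) @ self.beta

def sc_kpca_features(X_train, X_test, n_clusters=3, n_components=1, gamma=0.1):
    """Standardize, cluster the indicators, then run KPCA inside each cluster."""
    mu, sd = X_train.mean(0), X_train.std(0)
    Xtr, Xte = (X_train - mu) / sd, (X_test - mu) / sd
    f_tr, f_te = [], []
    for cols in cluster_columns(Xtr, n_clusters):
        kpca = KPCA(n_components, gamma).fit(Xtr[:, cols])
        f_tr.append(kpca.transform(Xtr[:, cols]))
        f_te.append(kpca.transform(Xte[:, cols]))
    return np.hstack(f_tr), np.hstack(f_te)

# Synthetic stand-in for the multi-source indicators (NOT the paper's data):
# nine noisy series driven by three latent factors, and a target built from them.
rng = np.random.default_rng(0)
latent = rng.normal(size=(120, 3))
X = np.repeat(latent, 3, axis=1) + 0.1 * rng.normal(size=(120, 9))
y = latent @ np.array([1.0, -0.5, 0.3]) + 0.05 * rng.normal(size=120)

F_train, F_test = sc_kpca_features(X[:100], X[100:])
model = KELM(C=10.0, gamma=0.5).fit(F_train, y[:100])
pred = model.predict(F_test)
rmse = float(np.sqrt(np.mean((pred - y[100:]) ** 2)))
print("out-of-sample RMSE:", rmse)
```

Fitting the scaler, the clusters and each KPCA on the training window only, and then projecting the held-out months, keeps the test period out of the feature construction; whether the paper follows the same split is not stated in the abstract.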

Key words: Shanghai copper futures, Baidu index, system clustering, kernel principal component analysis, kernel extreme learning machine
