Mean-variance Portfolio Selection Using Machine Learning Hyperparameter Optimization

doi:10.12005/orms.2024.0373

Abstract

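Summary: This paper proposes a mean-variance portfolio model based on machine learning with hyperparameter optimization. The model consists of two stages: stock prediction and portfolio optimization. Specifically, first, a firefly algorithm improved with a specific probability (pFA) is used to optimize the hyperparameters of XGBoost, yielding a hybrid pFAXGBoost model that predicts stock returns; its predictive ability is compared with that of the XGBoost, LSTM, and SVR models. Second, stocks with higher predicted returns are selected to form a stock pool, and, subject to transaction costs and upper and lower bound constraints, the MV and 1/N models are used to construct the optimal portfolio of the preselected stocks. An empirical analysis based on the constituent stocks of the CSI 300 Index shows that, compared with the other models and the benchmark index, the pFAXGBoost+MV model performs better on return and risk metrics.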

Keywords: mean-variance portfolio, machine learning, hyperparameter optimization, stock prediction, firefly algorithm

Abstract: The returns and risks of assets in the securities market are uncertain. The core problem investors face is how to allocate wealth in an uncertain environment so that risk is minimized or return is maximized. To this end, artificial intelligence and optimization methods have been applied to portfolio selection problems, providing investors with new theoretical frameworks and investment strategies. Numerous widely accepted empirical studies suggest that stock prices and returns are predictable to some extent. It is therefore necessary to observe the changes and volatility of financial data over a long historical period in order to forecast future trends and inform investment decisions. Existing studies have mainly focused on two families of methods: (1) statistical methods, which predict by analyzing past price characteristics, such as linear regression, autoregressive conditional heteroscedasticity (ARCH), autoregressive integrated moving average (ARIMA), and generalized autoregressive conditional heteroscedasticity (GARCH) models; and (2) machine learning methods, such as K-nearest neighbors, support vector regression (SVR), eXtreme Gradient Boosting (XGBoost), deep neural networks, long short-term memory (LSTM), and convolutional neural networks. Some comparative studies emphasize that machine learning handles non-linear and non-stationary problems better than statistical models.
This paper proposes a portfolio selection model that uses machine learning with hyperparameter optimization for stock prediction and the mean-variance (MV) model for portfolio selection. The model has two stages: stock prediction and portfolio optimization. In the first stage, a hybrid model (pFAXGBoost) combining XGBoost with a modified firefly algorithm based on specific probability (pFA) is proposed to predict stock prices for the next period, and its predictive ability is compared with that of XGBoost, LSTM, and SVR. The pFA is used to optimize the hyperparameters of XGBoost. The second stage consists of stock selection (stocks with higher predicted returns are selected) and asset allocation (funds are spread across the selected stocks). Subject to realistic constraints such as transaction costs and threshold (upper and lower bound) constraints, the MV model and the equally weighted (1/N) model are employed for portfolio selection. The MV model seeks a trade-off between maximizing return and minimizing risk, which is typically expressed as a multi-objective optimization problem; a risk aversion coefficient is introduced to convert it into a single-objective formulation.
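The second stage can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one common scalarization, risk_aversion * variance - (1 - risk_aversion) * return (lower is better), which the abstract does not spell out, and it omits transaction costs and bound constraints. All function names, tickers, and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def select_top_k(predicted_returns: Dict[str, float], k: int) -> List[str]:
    """Stock preselection: keep the k tickers with the highest predicted return."""
    ranked = sorted(predicted_returns, key=lambda t: predicted_returns[t], reverse=True)
    return ranked[:k]


def equal_weights(n: int) -> List[float]:
    """1/N allocation: every selected stock receives the same weight."""
    return [1.0 / n] * n


def portfolio_return(weights: Sequence[float], mu: Sequence[float]) -> float:
    """Expected portfolio return, the weighted sum of asset returns."""
    return sum(w * m for w, m in zip(weights, mu))


def portfolio_variance(weights: Sequence[float], cov: Sequence[Sequence[float]]) -> float:
    """Portfolio variance, the quadratic form w' * cov * w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


def scalarized_objective(
    weights: Sequence[float],
    mu: Sequence[float],
    cov: Sequence[Sequence[float]],
    risk_aversion: float,
) -> float:
    """Assumed single-objective form (lower is better); not taken from the paper."""
    risk = portfolio_variance(weights, cov)
    reward = portfolio_return(weights, mu)
    return risk_aversion * risk - (1.0 - risk_aversion) * reward


# Hypothetical predicted returns and covariance matrix, for illustration only.
predicted = {"A": 0.02, "B": 0.05, "C": -0.01, "D": 0.03}
pool = select_top_k(predicted, k=2)
mu = [predicted[t] for t in pool]
cov = [[0.04, 0.01], [0.01, 0.02]]
weights = equal_weights(len(pool))
score = scalarized_objective(weights, mu, cov, risk_aversion=0.5)
```

An MV solver would search over the weights to minimize this score subject to the constraints; the 1/N portfolio simply evaluates it at equal weights.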
Using the constituent stocks of the CSI 300 Index as the study sample, we give a numerical example to demonstrate the performance of the designed algorithm and the application of the proposed model. Specifically, we randomly select 48 CSI 300 stocks as candidate assets, a pool large enough for individual investors to choose from before forming portfolios. The sample interval runs from January 2013 to December 2021. The historical data are divided into six periods, each containing four years of data split into a training set and a test set at a ratio of 8:2. The training set is used to train the model and tune the hyperparameters for good generalization, and the test set is used to evaluate the performance of the final model.
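The period construction can be sketched as follows. The abstract does not say how the six periods are formed. This sketch assumes a four-year window rolled forward one year at a time, which yields exactly six windows between 2013 and 2021, and a chronological (unshuffled) 8:2 split. Both are assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def rolling_periods(first_year: int, last_year: int, window_years: int) -> List[Tuple[int, int]]:
    """Assumed scheme: slide a fixed-length window forward one year at a time."""
    starts = range(first_year, last_year - window_years + 2)
    return [(start, start + window_years - 1) for start in starts]


def split_chronologically(
    observations: Sequence, train_parts: int = 8, test_parts: int = 2
) -> Tuple[list, list]:
    """Split in time order: the earliest 8/10 for training, the rest for testing."""
    cut = len(observations) * train_parts // (train_parts + test_parts)
    return list(observations[:cut]), list(observations[cut:])


periods = rolling_periods(2013, 2021, window_years=4)
train, test = split_chronologically(list(range(10)))
```

Keeping the split in time order avoids training on observations that come after the test data, which matters for financial series.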
We select 19 indicators as inputs for stock prediction, and the MV model is used to allocate assets among the selected high-quality stocks. We use the pivoting algorithm to solve the MV model without short sales and obtain the optimal portfolio strategy. In addition to the MV model, an equally weighted portfolio is also studied. Four metrics are used to assess the accuracy of the stock prediction methods: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and hit ratio. The experimental results yield two main findings: (1) the pFA is developed to improve the optimization ability of the FA, and its advantage over FA, PSO, AFSA, GA, and DE is verified on a set of unimodal and multimodal test functions; (2) the empirical results show that the pFAXGBoost+MV model outperforms its counterparts and the market index in terms of return and return-risk metrics.
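The four accuracy metrics can be sketched in plain Python. MSE, RMSE, and MAE follow their standard definitions. The hit ratio is assumed here to be the share of periods in which the predicted and actual returns have the same sign; definitions vary, and the abstract does not give the one used in the paper. The sample values are hypothetical.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def hit_ratio(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed definition: fraction of periods where both returns share a sign."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return hits / len(actual)


# Hypothetical actual and predicted returns, for illustration only.
actual_returns = [0.01, -0.02, 0.03, -0.01]
predicted_returns = [0.02, -0.01, -0.01, -0.02]
```

With these sample values the direction is predicted correctly in three of the four periods.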
In summary, considering realistic constraints, this paper proposes a portfolio selection model that uses machine learning with hyperparameter optimization for stock prediction and the mean-variance model for portfolio selection. The resulting model is a convex quadratic programming problem with equality and linear inequality constraints, which is solved by a novel improved pivoting algorithm. On the one hand, the study enriches modern financial decision-making theory and provides a new machine-learning-based approach to stock prediction. On the other hand, it helps investors adjust their strategies in quantitative investment and enhances the ability of individual and institutional investors to adapt to the investment environment. The study has some limitations and can be improved and extended in the following respects: (1) semi-variance, VaR, and skewness can be used for portfolio selection; (2) cardinality constraints and minimum trading volume can be considered.

Key words: mean-variance portfolio selection, machine learning, hyperparameter optimization, stock prediction, firefly algorithm

CLC number: F727

ZHANG Peng, DANG Shili, HUANG Meiyu. Mean-variance Portfolio Selection Using Machine Learning Hyperparameter Optimization[J]. Operations Research and Management Science, 2024, 33(11): 197-203.
