运筹与管理 ›› 2025, Vol. 34 ›› Issue (1): 1-7.DOI: 10.12005/orms.2025.0001

• 理论分析与方法探讨 •    下一篇

融入概率矩阵分解模型的改进二部图推荐算法

甘沛露, 宋一豪, 朱晓雄, 周支立   

  1. 西安交通大学 管理学院,陕西 西安 710049
  • 收稿日期:2022-05-29 出版日期:2025-01-25 发布日期:2025-05-16
  • 通讯作者: 宋一豪(1998-),男,山东临沂人,硕士研究生,研究方向:信息管理。Email: 13184136737@163.com。
  • 作者简介:甘沛露(1979-),男,陕西西安人,博士研究生,研究方向:信息管理。
  • 基金资助:
    国家自然科学基金资助项目(71971168)

Improved Bipartite Graph Recommendation Algorithm Integrating Probability Matrix Factorization Model

GAN Peilu, SONG Yihao, ZHU Xiaoxiong, ZHOU Zhili   

  1. School of Management. Xi’an Jiaotong University, Xi’an 710049, China
  • Received:2022-05-29 Online:2025-01-25 Published:2025-05-16

摘要: 针对历史数据稀疏和分布不均衡影响二部图算法推荐效果的问题,一方面通过带约束的概率矩阵分解模型预测项目评分,设置权重对初始评分数据矩阵进行填充以扩充数据;另一方面,在传统二部图推荐算法的研究基础上,通过修正用户评分标准、融入时间效应因素、扩充用户评分信息,从而改进资源初始配置和分配方式以充分利用历史数据,实现对二部图推荐算法进行改进。最后,使用推荐算法领域常用的MovieLens数据集采用五折交叉验证的方式进行实验,并与传统二部图推荐算法进行比较。实验结果表明,每一步改进都提高了二部图算法的推荐效果,并且二部图算法与概率矩阵分解模型结合后,算法的推荐效果有显著提升。

关键词: 二部图推荐算法, 数据稀疏性, 概率矩阵分解, 矩阵填充

Abstract: With the development of the Internet technology and the diversity of personal demand, personalized recommendation attracts wide attention. For some items, such as movie, goods, news and so on, interaction data is generated as users and items interact mutually, which is mainly achieved by users’ selection of items. Based on such interaction data, user preference behavior information can be mined by the recommendation algorithm. Accordingly, more items that may be favored by users will be recommended to them, which implies that personal recommendation is achieved. Such data mining requires adequate historical data to achieve a great recommendation performance. However, interaction data is often sparse and unbalanced as each user has only selected a few items, and each item has not been selected by all users. Therefore, the recommendation effect of the algorithm will be greatly affected. A bipartite graph based on the characteristics of interaction between users and items is widely used as an effective recommendation algorithm. Although it has a good recommendation effect, it is similar to other recommendation algorithms and is also affected by sparse and unbalanced interaction data of users and items. In the light of this challenge, this study is committed to mitigating impacts of sparse and unbalanced data on the bipartite graph recommendation algorithm and improving its recommendation performance.
The core idea of the traditional bipartite graph recommendation algorithm is to allocate resources to items by two steps that include initial resources acquisition and reallocating resources to obtain final items resources so that the item can be ranked and recommended to each user. Accordingly, on the one hand, this study improves the acquisition criteria of initial resources and reallocation method to obtain final resources to make full use of historical interaction data. Firstly, modify the user scoring criteria. Then, incorporate time effect factors. Lastly, expand the item attribute information of user scoring. On the other hand, the probability matrix decomposition model transforms the initial matrix with higher dimensions into two lower matrices, and estimates the initial matrix through the product of these two matrices, thus mitigating the problem caused by the lack of data. Therefore, in the light of the characteristics of the problem, this study predicts score of some items which are not selected by users through the constrained probability matrix decomposition model. Then predicted data will be filled into the initial interaction data in set weights to make full use of real data and predicted data. Afterwards, we use a benchmark dataset which is commonly used in the research field of recommendation algorithms, and the MovieLens dataset to test performance of algorithms improved to different degrees. The data has been proved to be sparse. The experiment is conducted in the 50 fold cross validation, and compares each improved steps with the results of the traditional bipartite graph recommendation algorithm. The experiment uses commonly used recommended indicators, including accuracy rate, hit rate, ranking rate, and comprehensive indicators F1, which integrate accuracy rate and hit rate, as well as precision indicators, including root mean square error, average absolute error, to evaluate the performance of the algorithms.
The experiment results show that the hit rate of all algorithms exceeds 50%, but the improved algorithm improves the hit rate to more than 52% while other factors are the same. In addition, the accuracy rate of the algorithm also rises slowly. With the continuous improvement of the algorithm, the F1 value considering the hit rate and accuracy rate of the algorithm gradually increases, which shows that the improvement of the bipartite graph recommendation algorithm in this study is effective. Therefore, there may be some latent user preference information in time factor or item category attribute factor. The algorithm has to integrate other information related to user interest more comprehensively, which is of great significance to improving the performance. What is more, the recommendation performance of the bipartite graph algorithm will significantly improve when it is merged with the probability matrix decomposition model. As a result, the improvement in this study is very effective and makes great contributions to further development of recommendation algorithms. In the time of information overload, it is very important and meaningful to reduce people’s information filtering costs and improve the efficiency of information searching. As one of the most important research directions of information filtering, there are still many aspects of the recommendation algorithm worth being further explored, which has great significance for both theoretical development and practical problems solving.
Although the item attribute information is integrated into this study, only the category attribute information of the item is considered. For the item, there is more attribute information. Take movie as an example, there is also attribute information such as the character and director of the movie. The more item attribute information is complemented, the better recommendation performance of the bipartite graph algorithm will be achieved. Besides, the research on the probability matrix decomposition algorithm in this study is limited to the potential eigenvectors of users and projects. There are still some details to improve, such as incorporating time factors, adding trust relationships, etc. Thus, the accuracy rate of predicted data by probability matrix decomposition model may be higher, improving the recommendation performance of the bipartite graph recommendation algorithm.

Key words: graph recommendation algorithm, data sparsity, probabilistic matrix factorization, matrix filling

中图分类号: