Operations Research and Management Science ›› 2023, Vol. 32 ›› Issue (9): 215-221. DOI: 10.12005/orms.2023.0307

• Applied Research •

Research on Social Tag Similarity Based on Hyper-networks

潘旭伟, 曾雪梅, 李涛   

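  1. School of Economics and Management, Zhejiang Sci-Tech University, Hangzhou 310018, Zhejiang, China
  • Publication date: 2023-09-25  Release date: 2023-11-02
  • Corresponding author: PAN Xuwei (born 1977), male, from Lishui, Zhejiang; PhD, professor; research interests: personalized services, knowledge management, knowledge networks.
  • About the authors: ZENG Xuemei (born 1997), female, from Hengyang, Hunan; master's student; research interests: information and knowledge networks, social tagging. LI Tao (born 1991), male, from Kaifeng, Henan; master's student; research interests: personalized recommendation, social tagging.
  • Funding:
    Key Project of the Zhejiang Provincial Natural Science Foundation (LZ18G010001)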

Hypernetwork-based Tags Similarity Measure for Social Tagging Systems

PAN Xuwei, ZENG Xuemei, LI Tao   

  1. School of Economics and Management, Zhejiang Sci-Tech University, Hangzhou 310018, China
  • Online:2023-09-25 Published:2023-11-02

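Abstract (Chinese): Evaluating the similarity of social tags is the foundation of tag-based link prediction and personalized recommendation. Existing methods measure tag similarity with vector space matrices or with tag co-occurrence relations based on graphs or networks; they split the intrinsic "user-resource-tag" ternary relationship of social tagging systems and lose semantic associations. To address this problem, this paper introduces the hyper-network model, which can systematically characterize the "user-resource-tag" ternary relationship, and proposes a hyper-network-based method for evaluating social tag similarity. Starting from users' social tagging behavior, the method represents tags as nodes and each user's tagging of a resource as a hyper-edge to construct a social tags hyper-network. On this basis, two basic principles for hyper-network-based tag similarity measurement are established, the common hyper-edge principle and the principle of the number of nodes in a hyper-edge, and a series of hyper-network-based tag similarity indices is built accordingly. Representative social tagging data sets are selected, and the indices are evaluated experimentally with the AUC and Precision evaluation methods of link prediction. The results show that the indices built on the common hyper-edge principle alone, and on the combination of the common hyper-edge principle and the number-of-nodes principle, outperform the indices built on the tag co-occurrence network, with a particularly marked improvement in Precision.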

Key words (Chinese): social tag, hyper-network, tag similarity, link prediction, similarity measure

Abstract: Social tags express users' preferences through user-defined descriptions of online resources and build connections between users and resources. As a valuable resource, social tags have been exploited in link prediction and personalized recommendation to address information overload in the era of big data. Evaluating the similarity of social tags is the foundational issue of tag-based link prediction and personalized recommendation. Current tag similarity methods, based on representations such as the vector space matrix, bipartite graph, tripartite graph, and tag co-occurrence network, split the internal user-resource-tag relationship of social tagging systems when they transform the data, which loses part of the semantic association between tags. To overcome this problem, this paper introduces the hyper-network model, which can systematically describe the internal ternary relationship of user-resource-tag, and proposes an approach to measuring social tags similarity based on the hyper-network.
The proposed approach starts from users' social tagging behavior to build a social tags hyper-network in which each tagging action is expressed as a hyper-edge and tags are expressed as nodes. Because its hyper-edges link the users, resources, and tags involved in tagging activities, the constructed hyper-network depicts users' tagging behavior more accurately and preserves the intrinsic semantic association information of the user-resource-tag ternary relationship. Combining the topological structure of the social tags hyper-network with two fundamentals for describing the association and similarity of objects from their relations, the proximity relation rule and triadic closure, two basic principles are established for measuring social tags similarity on the constructed hyper-network. One is the principle of common hyper-edges: the more hyper-edges two tag nodes share, the more similar the two tag nodes are. The other is the principle of the number of nodes in a hyper-edge: the fewer tag nodes a hyper-edge contains, the more similar those tag nodes are. Based on these two basic principles, a series of social tags similarity measures is established, following the logic used to construct node similarity indices in general complex networks. An experimental study verifies the constructed similarity measures on data sets from two representative social tagging applications, Delicious and Last.fm, using the AUC and Precision evaluation methods of link prediction.
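The two principles above can be illustrated with a minimal Python sketch. It is not the paper's set of indices, which the abstract does not spell out. It assumes each tagging record is a hypothetical `(user, resource, tags)` tuple, counts shared hyper-edges for the first principle, and, as an illustrative assumption only, discounts each shared hyper-edge by `1 / len(edge)` for the second principle.

```python
from collections import defaultdict
from itertools import combinations


def build_hyperedges(records):
    """Turn each (user, resource, tags) tagging action into one hyper-edge.

    A hyper-edge is stored as a frozenset of the tags used in that action.
    """
    return [frozenset(tags) for _user, _resource, tags in records]


def tag_similarities(hyperedges):
    """Return two pairwise scores keyed by alphabetically sorted tag pairs.

    common:   number of hyper-edges shared by the two tags (principle 1).
    weighted: shared hyper-edges, each discounted by its size, so smaller
              hyper-edges count more (principle 2). The 1/len(edge) weight
              is an assumption made for illustration.
    """
    common = defaultdict(int)
    weighted = defaultdict(float)
    for edge in hyperedges:
        for a, b in combinations(sorted(edge), 2):
            common[(a, b)] += 1
            weighted[(a, b)] += 1.0 / len(edge)
    return dict(common), dict(weighted)


# Toy tagging records (invented for illustration).
records = [
    ("u1", "r1", ["python", "code"]),
    ("u2", "r1", ["python", "code", "tutorial", "web"]),
    ("u1", "r2", ["web", "tutorial"]),
]
edges = build_hyperedges(records)
common, weighted = tag_similarities(edges)
print(common[("code", "python")], weighted[("code", "python")])
```

In this toy data, "code" and "python" share two hyper-edges, while "python" and "web" share only one large hyper-edge, so the latter pair scores lower under both measures.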
In terms of the AUC and Precision criteria of link prediction, the experimental results show that the tag similarity measures constructed on the pure common hyper-edge principle, and on the combined principles of common hyper-edges and the number of nodes in a hyper-edge, perform clearly better than the tag similarity indices constructed on the tag co-occurrence network. In particular, the distinct improvement in the Top-N Precision evaluation of link prediction is significant for improving the accuracy of personalized recommendation. The experimental results also show that adding different normalizations of node hyper-degree to the common hyper-edge measures has a certain negative effect on the accuracy of tag similarity measurement.
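For readers unfamiliar with the two criteria, the following Python sketch shows how AUC and Top-N Precision are commonly computed in link prediction. It assumes the standard definitions (AUC as the probability that a held-out link scores higher than a nonexistent link, with ties counted as one half; Precision as the fraction of the top N ranked candidates that are held-out links). The score table and link sets are invented, and the AUC here compares all pairs exhaustively rather than by sampling.

```python
def auc(scores, test_links, nonexistent_links):
    """Probability that a held-out link outscores a nonexistent link.

    Ties contribute 0.5. Pairs missing from `scores` are scored 0.0.
    """
    higher = ties = 0
    for t in test_links:
        for n in nonexistent_links:
            s_t, s_n = scores.get(t, 0.0), scores.get(n, 0.0)
            if s_t > s_n:
                higher += 1
            elif s_t == s_n:
                ties += 1
    total = len(test_links) * len(nonexistent_links)
    return (higher + 0.5 * ties) / total


def precision_at(scores, test_links, candidates, top_n):
    """Fraction of the top_n highest-scored candidates that are held-out links."""
    ranked = sorted(candidates, key=lambda p: scores.get(p, 0.0), reverse=True)
    hits = sum(1 for pair in ranked[:top_n] if pair in test_links)
    return hits / top_n


# Hypothetical similarity scores for four candidate tag pairs.
scores = {("a", "b"): 0.9, ("b", "d"): 0.6, ("a", "c"): 0.4, ("c", "d"): 0.4}
test_links = [("a", "b"), ("a", "c")]         # held-out links that do exist
nonexistent_links = [("b", "d"), ("c", "d")]  # links absent from the data

auc_value = auc(scores, test_links, nonexistent_links)
precision_value = precision_at(
    scores, set(test_links), test_links + nonexistent_links, top_n=2
)
print(auc_value, precision_value)
```

An AUC of 0.5 corresponds to random scoring, so values well above 0.5 indicate that the similarity measure ranks held-out links ahead of nonexistent ones.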
The social tags similarity measures in the proposed hyper-network-based approach are built mainly by combining two basic structural features of the network: the common hyper-edges of nodes and the number of nodes in a common hyper-edge. However, the situations and factors that affect the semantic similarity of tags are complicated. For example, the “weak connection effect” that exists in networks may reduce the predictive performance of a method that captures strong connections through common hyper-edges, which is worth further exploration. In addition, social tags hyper-networks have many other topological features, such as the distance and paths between nodes. Further work can explore the relationship between such topological features and the similarity of tag nodes in order to build more effective social tags similarity measures.

Key words: social tag, hyper-network, tag similarity, link prediction, similarity measure
