Operations Research and Management Science, 2021, Vol. 30, Issue (11): 71-75. DOI: 10.12005/orms.2021.0352

• Theory Analysis and Methodology Study •

A Knowledge Points Labeling Method for Test Questions Based on Bipartite Graph

GUO Chong-hui1, XING Xiao-yu1, WEI Wei1,2   

  1. Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China;
  2. Center for Energy, Environment & Economy Research, Zhengzhou University, Zhengzhou 450001, China
  • Received: 2019-03-25 Online: 2021-11-25

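  • About the authors: GUO Chong-hui (郭崇慧, born 1973), male, Ph.D., professor and doctoral supervisor, main research interests: data mining and knowledge discovery, decision theory and methods; XING Xiao-yu (邢小宇, born 1994), male, doctoral student, research interests: data mining, natural language processing; WEI Wei (魏伟, born 1988), male, Ph.D., lecturer, research interests: text mining and knowledge discovery.
  • Funding:
    National Natural Science Foundation of China (71771034, 72001191); Fundamental Research Funds for the Central Universities (DUT21YG108)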

Abstract: To address the problem of automatically labeling test questions with knowledge points in online education, this paper proposes a labeling method based on bipartite graphs. First, because the granularity of knowledge points is fuzzy, a knowledge graph of knowledge points is constructed to merge them at a reasonable granularity. Second, from a corpus of textbooks and test questions, two bipartite graphs are extracted, one linking knowledge points to feature words and one linking test questions to feature words; the edge weights in both are computed with TF-IDF, and the two graphs are combined into a bipartite graph model between test questions and knowledge points. Third, a similarity measure based on term frequency weighting is used to compute the similarity between each test question and each knowledge point, and the knowledge point with the highest similarity is assigned to the question as its label. Finally, high school history test questions from an online education platform are used as the experimental data set. The experimental results show that the proposed method significantly outperforms classical machine learning methods such as Naive Bayes, K-Nearest Neighbor, Random Forest and Support Vector Machines.

Key words: educational data mining, knowledge points labeling, bipartite graph
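
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact TF-IDF variant or the term-frequency-weighted similarity formula, so the sketch assumes a smoothed TF-IDF for the edge weights and substitutes plain cosine similarity over shared feature words. The function names (`tfidf_edges`, `cosine`, `label_questions`) and the toy feature words are hypothetical.

```python
import math
from collections import Counter


def tfidf_edges(docs):
    """Build one bipartite graph: {node: {feature_word: TF-IDF edge weight}}.

    Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1; the paper's exact
    formula is not stated in the abstract.
    """
    n = len(docs)
    df = Counter(word for words in docs.values() for word in set(words))
    edges = {}
    for node, words in docs.items():
        if not words:
            edges[node] = {}
            continue
        tf = Counter(words)
        edges[node] = {
            word: (count / len(words)) * (math.log((1 + n) / (1 + df[word])) + 1)
            for word, count in tf.items()
        }
    return edges


def cosine(a, b):
    """Cosine similarity between two sparse weight dicts (a stand-in measure)."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_questions(kp_docs, question_docs):
    """Assign each question the knowledge point with the highest similarity."""
    kp_edges = tfidf_edges(kp_docs)
    q_edges = tfidf_edges(question_docs)
    return {
        q: max(kp_edges, key=lambda kp: cosine(q_vec, kp_edges[kp]))
        for q, q_vec in q_edges.items()
    }


# Hypothetical, already-tokenized toy corpus (not data from the paper).
kp_docs = {
    "Opium War": ["opium", "treaty", "nanjing", "britain", "trade"],
    "Industrial Revolution": ["steam", "factory", "britain", "textile", "machine"],
}
question_docs = {
    "q1": ["treaty", "nanjing", "signed", "after", "opium"],
    "q2": ["steam", "machine", "factory", "output"],
}

labels = label_questions(kp_docs, question_docs)
print(labels)
```

In this toy corpus each question shares feature words with exactly one knowledge point, so `q1` is labeled "Opium War" and `q2` is labeled "Industrial Revolution". Real test questions would first need word segmentation and feature word extraction, which the sketch leaves out.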

