运筹与管理 (Operations Research and Management Science) ›› 2025, Vol. 34 ›› Issue (6): 123-130. DOI: 10.12005/orms.2025.0184

• Applied Research •

Method for Determining Experts’ Weights of Group Decision-making Based on Scoring System Error and Its Application in Evaluation with Incomplete Judgment Information

GUO Dongwei1, ZHU Yingming1, CHEN Yulei2, ZHANG Yao1

  1. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China;
    2. School of Mathematics and Statistics, Zhoukou Normal University, Zhoukou 466000, China
  • Received: 2023-06-29  Published: 2025-09-28
  • Corresponding author: GUO Dongwei (born 1987), male, from Kaifeng, Henan, China; doctoral candidate and lecturer; research interests: evaluation and decision-making, regional economics. Email: guo.dongwei@njust.edu.cn
  • Funding:
    Annual Project of the Henan Province Philosophy and Social Science Planning Fund (2022CJY060, 2024CJY075); Major Project of Philosophy and Social Science Research in Jiangsu Universities (2024SJZD037)

Abstract: Expert weights play an important role in group decision-making and affect the quality of evaluation and decision outcomes. For group decision-making problems with incomplete subjective scoring, and in order to improve the accuracy and fairness of judging and reduce decision bias, this paper proposes a method for computing expert weights that fully accounts for scoring system error. First, the feature information and correlation information of expert evaluations are defined. Second, a pairwise comparison matrix is constructed from the correlation information, and the experts' posterior weights are determined by minimizing the sum of squared errors. Finally, taking the China Undergraduate Mathematical Contest in Modeling as an example, 100 simulation experiments are conducted in two groups of different scales, and the methods are compared on test indices including consistency rate, difference degree, error degree, and controversy degree. The results show that, compared with the traditional method and the standardized-score (T-score) method, the proposed method raises the consistency rate and reduces the difference, error, and controversy degrees, verifying its effectiveness and scientific soundness.

Abstract: The marking of large-scale subjective-type competitions or examinations is a typical group decision-making problem. Because the number of participants is large while the number of experts and the marking time are limited, each answer sheet can be randomly assigned to only a few experts, so the scoring matrix of a large-scale competition is incomplete. Subjective questions generally have no standard answers and are susceptible to experts' subjective factors during marking, which produces scoring system errors. These systematic errors fall into two categories. The first is unequal means: some experts score generally higher, while others score generally lower. The second is unequal variance: some experts' ratings are widely dispersed (large variance), while others' are tightly clustered (small variance). Because of scoring system error, the raw scores given by different judges are not directly comparable under incomplete scoring, so the traditional method of simply averaging the raw scores is unfair to the contestants. At present, the T-score (standardized score) method is often used instead: the mean and variance of the scores each expert assigns to the answer sheets he or she reviews are leveled to a common scale. However, under incomplete scoring each expert reviews a different set of answer sheets, and the mean quality of each expert's batch differs, so directly applying the T-score method is not scientifically sound either.
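To make the two baselines concrete, the following is a minimal sketch, not taken from the paper. It assumes the incomplete score matrix S is laid out as sheets x experts with NaN marking cells an expert did not review; the traditional method averages the raw scores per sheet, and the T-score method rescales each expert's column via T = 50 + 10*(x - mu_e)/sd_e, where mu_e and sd_e are computed only over the sheets expert e actually reviewed.

```python
import numpy as np

def traditional_scores(S: np.ndarray) -> np.ndarray:
    """Traditional method: directly average the raw scores each sheet
    received. S is (n_sheets x n_experts); NaN = expert did not review."""
    return np.nanmean(S, axis=1)

def t_scores(S: np.ndarray) -> np.ndarray:
    """T-score method: rescale each expert's column to mean 50 and
    standard deviation 10 over the sheets that expert reviewed, then
    average per sheet. As noted in the text, this equalizes batch means,
    which is fair only if every expert's batch has comparable quality."""
    mu = np.nanmean(S, axis=0)            # per-expert mean over reviewed sheets
    sd = np.nanstd(S, axis=0, ddof=1)     # per-expert sample std
    return np.nanmean(50 + 10 * (S - mu) / sd, axis=1)

# Toy example: expert 0 is lenient (high mean), expert 1 is strict.
S = np.array([[85.0, 62.0, np.nan],
              [90.0, np.nan, 71.0],
              [np.nan, 55.0, 78.0]])
print(traditional_scores(S))   # raw-score averages
print(t_scores(S))             # standardized averages
```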
In order to reduce the systematic scoring error between experts, we first define the concept of "feature information" and its calculation formula; it reflects, to a certain extent, the leniency or severity of an expert's evaluations and represents the expert's evaluation characteristics. Second, inspired by the formula of information entropy, we define the concept of "correlation information" and its calculation formula, and we establish a least-sum-of-squared-errors model that determines expert weights from the pairwise comparison matrix of correlation information. To test the reliability of the proposed weighting method, we conduct 100 simulation runs in total, using the China Undergraduate Mathematical Contest in Modeling as an example. The runs are divided into two groups: the first group has 40 modeling papers and 5 judges; the second has 100 papers and 8 judges. Each paper is reviewed by three experts, and each group is simulated 50 times. When assigning papers to experts, we follow the principles of even and cross assignment, which increases the comparability of ratings across experts and makes the pairwise comparison matrix of correlation information more reliable. To demonstrate the scientific soundness of our method, each simulation is scored with the traditional method (directly averaging the raw scores), the T-score (standardized score) method, and our method, and the three are compared using consistency rate, difference degree, error degree, and controversy degree as evaluation indices. The simulation results show that our method achieves a higher consistency rate and lower difference, error, and controversy degrees than the traditional and T-score methods, indicating that it is more scientific and reasonable than the other two.
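The abstract does not reproduce the formulas for feature information, correlation information, or the exact form of the least-sum-of-squared-errors model, so the sketch below is only a generic stand-in under stated assumptions: A is a hypothetical positive, reciprocal pairwise comparison matrix built from correlation information, the weights solve a standard least-squares prioritization, minimize sum_{i,j} (a_ij*w_j - w_i)^2 subject to sum_i w_i = 1, and the resulting posterior weights then aggregate each sheet's scores over only the experts who actually reviewed it.

```python
import numpy as np

def ls_weights(A: np.ndarray) -> np.ndarray:
    """Least-squares weights from a pairwise comparison matrix A
    (assumed positive and reciprocal, a_ji = 1/a_ij):
    minimize sum_ij (a_ij*w_j - w_i)^2  s.t.  sum(w) = 1.
    A generic prioritization model, not necessarily the paper's exact one."""
    m = A.shape[0]
    I = np.eye(m)
    Q = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            v = A[i, j] * I[j] - I[i]   # residual a_ij*w_j - w_i equals v @ w
            Q += np.outer(v, v)
    # KKT linear system of the equality-constrained quadratic program
    K = np.zeros((m + 1, m + 1))
    K[:m, :m] = 2 * Q
    K[:m, m] = 1                        # Lagrange multiplier column
    K[m, :m] = 1                        # constraint row: sum(w) = 1
    rhs = np.zeros(m + 1)
    rhs[m] = 1
    return np.linalg.solve(K, rhs)[:m]

def weighted_scores(S: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Final score of each sheet: weight-normalized mean over the experts
    who actually reviewed it (NaN marks a missing review)."""
    mask = ~np.isnan(S)
    Sz = np.where(mask, S, 0.0)
    return (Sz @ w) / (mask @ w)

# Toy run: 3 experts; expert 2 is judged twice as reliable as the others.
A = np.array([[1.0, 1.0, 0.5],
              [1.0, 1.0, 0.5],
              [2.0, 2.0, 1.0]])
w = ls_weights(A)                       # ~[0.25, 0.25, 0.5] here
S = np.array([[80.0, 75.0, np.nan],
              [np.nan, 60.0, 70.0]])
print(w, weighted_scores(S, w))
```

For a consistent matrix (a_ij = w_i/w_j exactly), this recovers the underlying weights with zero residual; for an inconsistent one it returns the closest fit in the squared-error sense, which matches the spirit, though not necessarily the letter, of the paper's error-minimizing posterior weights.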
The expert-weighting method in this paper fully accounts for systematic errors, but gives less consideration to factors such as random error and scoring drift. Further research could therefore incorporate random error, scoring drift, and related factors to design dynamic expert weights. The simulation experiments here are based on holistic scoring; future work could consider itemized scoring, that is, how posterior weights should be designed under an itemized scoring scheme to improve the quality of expert marking and reduce systematic error.

Key words: group decision-making, expert weight, systematic error, subjective scoring, correlation information

CLC number: