Research on Legal Provisions Knowledge Recommendation Algorithm Based on Similar Case Generation

doi:10.12005/orms.2025.0345

Operations Research and Management Science, 2025, Vol. 34, Issue (11): 74-80. DOI: 10.12005/orms.2025.0345

SI Linsheng¹, YAN Yanfei¹, CUI Chunsheng¹, LIU Jun²

1. School of Data Science and E-commerce, Henan University of Economics and Law, Zhengzhou 450046, China;
2. School of Intelligent Accounting, Zhengzhou College of Finance and Economics, Zhengzhou 450044, China

Received: 2025-02-04 Online: 2025-11-25 Published: 2026-03-30
Corresponding author: CUI Chunsheng (b. 1974), male, from Nanyang, Henan; PhD, professor; research interests: operations research and optimization, recommender systems, decision theory and methods. Email: traition@126.com.
About the author: SI Linsheng (b. 1967), male, from Fengzhen, Inner Mongolia; PhD, professor; research interests: systems engineering, systems decision-making.
Funding:
Humanities and Social Sciences Research Planning Fund of the Ministry of Education (23YJA860004, 24YJA860023); Major Project of Basic Research in Philosophy and Social Sciences of Henan Higher Education Institutions (2024-JCZD-27); Henan Provincial Science and Technology R&D Program Joint Fund (Industry Category) Project (225101610054)

Abstract

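Summary: Quickly and accurately retrieving similar past cases and using them as a basis for deciding new ones is important for improving both the efficiency and the accuracy of adjudication. Starting from the multi-faceted data features of similar-case texts, this paper proposes a legal provision knowledge recommendation algorithm based on similar case generation. The algorithm uses KeyBERT to extract a keyword sequence from the facts of each case and builds a legal provision knowledge base; it then computes the cosine similarity between the semantic representation vectors of the keywords and of the provision knowledge, yielding a multi-dimensional vector of provisions and their similarity scores. Next, it computes a combined similarity between cases along three dimensions (charge, legal provision knowledge, and prison term) and generates the case recommendations. Experiments show that the proposed method outperforms traditional text matching methods on the DCG@p metric, supporting its feasibility and effectiveness in practice. The results can assist judicial personnel in analyzing and diagnosing cases and offer a reference for judicial artificial intelligence and its applications.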

Keywords: recommendation algorithm, similar case recommendation, KeyBERT algorithm, legal provision knowledge
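The keyword-to-provision matching step, which compares semantic vectors by cosine similarity, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-dimensional toy vectors, the labels `provision_A` to `provision_C`, and the helper names `cosine` and `match_provisions` are hypothetical stand-ins for real embeddings and a real knowledge base. Scoring each provision by its best-matching keyword is also an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_provisions(keyword_vecs, provision_vecs, top_k=2):
    """Score each provision by its best-matching keyword (an assumed
    aggregation rule) and return the top_k (label, score) pairs."""
    scored = []
    for label, p_vec in provision_vecs.items():
        best = max((cosine(k_vec, p_vec) for k_vec in keyword_vecs), default=0.0)
        scored.append((label, best))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real vectors would come from a language model.
keyword_vecs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]
provision_vecs = {
    "provision_A": [1.0, 0.0, 0.0],
    "provision_B": [0.0, 1.0, 0.0],
    "provision_C": [0.0, 0.0, 1.0],
}

matches = match_provisions(keyword_vecs, provision_vecs)
print(matches)
```

The returned list of (provision, score) pairs plays the role of the "vector of provisions and their similarity scores" mentioned above.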

Abstract: As the number of judicial cases surges, traditional text matching methods struggle to meet the demand for precise retrieval over massive collections of legal documents: they are inefficient, generalize poorly, and offer little interpretability. Because legal texts are specialized and complex, a model must capture semantic information and also incorporate structured domain knowledge. Effectively linking case facts to legal provisions to build an interpretable case recommendation model is therefore crucial for judicial efficiency and consistent decisions. This study proposes a case recommendation algorithm that integrates legal knowledge with case characteristics by constructing a legal provision knowledge base and a multi-dimensional similarity computation framework, with the aim of improving recommendation accuracy and interpretability. The algorithm helps judicial professionals locate similar cases efficiently, improves the transparency and consistency of legal reasoning, and offers a technical pathway for judicial intelligence that has both theoretical and practical value.
This paper presents a precedent recommendation model based on multi-dimensional feature fusion. Its core framework has four parts. (1) Data acquisition and preprocessing: 215 criminal judgment documents on intentional injury are crawled from China Judgments Online, and their factual descriptions are extracted as raw data. The text is segmented with THULAC and then filtered twice, through a general stop word list and a legal-domain stop word list (e.g., “public security bureau”, “review”), to improve the text representation. (2) Keyword extraction and legal knowledge base construction: the KeyBERT algorithm extracts the top 10 keywords of each case, filtered through BERT’s semantic understanding. Criminal law provisions are transformed into element-based structures (e.g., the crime of fraud is decomposed into elements such as “defrauding public or private property” and “large amount”) and stored in Elasticsearch as structured knowledge. (3) Semantic matching and similarity computation: XS-BERT, a pre-trained model optimized for legal text, generates semantic vectors for the keywords and the legal elements. A weighted similarity function integrates three dimensions: charge overlap (Jaccard index), legal knowledge similarity (vector inner product), and sentence difference (normalized distance). (4) Recommendation and validation: with DCG as the core metric, comparative experiments against TF-IDF and Word2Vec baselines show better retrieval accuracy and interpretability.
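The weighted similarity function in step (3) might look roughly like the sketch below. The abstract names the three dimensions but does not give weights or normalization details, so the weights `(0.4, 0.4, 0.2)`, the normalizer `max_term=180.0` months, the conversion of sentence distance into a similarity as one minus the normalized distance, and the `Case` structure are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    charges: frozenset     # charge names attached to the case
    knowledge_vec: tuple   # assumed unit-length vector summarising matched provisions
    term_months: float     # sentence length in months


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def inner_product(u, v):
    """Plain inner product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def sentence_similarity(t1, t2, max_term):
    """One minus the normalized sentence distance, clipped to [0, 1]."""
    return 1.0 - min(abs(t1 - t2) / max_term, 1.0)


def combined_similarity(a, b, weights=(0.4, 0.4, 0.2), max_term=180.0):
    """Weighted sum over charge, legal knowledge and sentence dimensions.
    The weights and max_term are hypothetical, not values from the paper."""
    w_charge, w_know, w_term = weights
    return (w_charge * jaccard(a.charges, b.charges)
            + w_know * inner_product(a.knowledge_vec, b.knowledge_vec)
            + w_term * sentence_similarity(a.term_months, b.term_months, max_term))


query = Case(frozenset({"intentional injury"}), (1.0, 0.0), 36.0)
c1 = Case(frozenset({"intentional injury"}), (1.0, 0.0), 36.0)
c2 = Case(frozenset({"fraud"}), (0.0, 1.0), 126.0)

# Rank candidate cases by combined similarity to the query, best first.
ranked = sorted([c2, c1], key=lambda c: combined_similarity(query, c), reverse=True)
print([round(combined_similarity(query, c), 3) for c in ranked])
```

Keeping the three components as separate functions makes each dimension's contribution to a recommendation easy to inspect, which fits the interpretability goal stated in the abstract.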
The experimental results show significant advantages over traditional methods on the DCG@5, DCG@10, and DCG@20 metrics. The recommended cases are highly consistent with real cases in charges, legal provisions, and sentencing patterns, which supports the practical feasibility of the approach. By integrating a legal knowledge base into deep learning, the model addresses the semantic gaps and logical inconsistencies of conventional legal text processing. The algorithm improves recommendation precision and also strengthens credibility through structured matching of legal provisions, giving judicial AI systems an efficient and reliable way to help judicial professionals locate similar precedents quickly while keeping decisions interpretable. Future work will extend the model to joint recommendation for multiple offenses, incorporate external knowledge such as courtroom debate and trial arguments, and optimize adaptability across jurisdictions.
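For readers unfamiliar with the metric, DCG@p can be computed as in the short sketch below. It assumes the common linear-gain form, DCG@p = Σ rel_i / log2(i + 1) over the top p ranks; the abstract does not say which DCG variant the authors use, and the graded relevance labels here are made-up illustrative values.

```python
import math


def dcg_at_p(relevances, p):
    """Discounted cumulative gain over the first p results.
    relevances[i] is the graded relevance of the result at rank i + 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:p], start=1))


# Hypothetical relevance grades for one ranked recommendation list.
ranking = [3, 2, 3, 0, 1, 2]
ideal = sorted(ranking, reverse=True)

for p in (3, 5):
    print(p, round(dcg_at_p(ranking, p), 4), round(dcg_at_p(ideal, p), 4))
```

Because lower ranks are discounted logarithmically, a method that places highly relevant cases near the top scores higher than one that returns the same cases in a worse order.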


CLC numbers: D926; TP391.3

SI Linsheng, YAN Yanfei, CUI Chunsheng, LIU Jun. Research on Legal Provisions Knowledge Recommendation Algorithm Based on Similar Case Generation[J]. Operations Research and Management Science, 2025, 34(11): 74-80.

