Operations Research and Management Science ›› 2024, Vol. 33 ›› Issue (10): 65-72.DOI: 10.12005/orms.2024.0321

• Theory Analysis and Methodology Study •

Research into Mining New Types of Cybercriminal Tricks and Management Countermeasures

NI Peifeng1, SHI Jiangfeng2   

  1. School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049, China;
  2. Beijing RICHAI Information Technology Co., Ltd., Beijing 100007, China
  • Received:2022-07-22 Online:2024-10-25 Published:2025-02-26

新型网络犯罪手法挖掘研究与管理对策

倪培峰1, 石江枫2   

  1. 中国科学院大学 经济与管理学院, 北京 100049;
  2. 北京睿企信息科技有限公司, 北京 100007
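  • Corresponding author: NI Peifeng (born 1979), male, from Changzhi, Shanxi; doctoral candidate; research interest: management science.
  • About the author: SHI Jiangfeng (born 1988), male, from Huanggang, Hubei; master's degree; research interests: recommender systems and natural language processing.
  • Funding:
    Shandong Province Major Science and Technology Innovation Project (2019TSLH0213)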

Abstract: New types of cybercrime have grown steadily in recent years and are increasingly novel, sophisticated, and professional. Research on mining and analyzing new types of cybercrime, and on the corresponding countermeasures, can help public security organizations prevent cybercrime proactively and strike it precisely. However, such research remains rare both at home and abroad.
Analyzing new cybercriminal tricks involves five main tasks: 1) establish a research framework, because none yet exists for mining these tricks from open-source intelligence data; 2) classify cybercriminal tricks accurately, because the tricks keep evolving, which complicates classification; 3) extract representative cybercrime keywords accurately, because traditional unsupervised keyword recognition models are too inaccurate to meet the business requirements of keyword extraction for cybercriminal tricks, while supervised learning models suffer from sample imbalance; 4) identify hot words accurately, because hot words with prominent changes deserve the most attention, yet the traditional hot word identification method based on word frequency statistics performs poorly; 5) summarize new cybercriminal tricks, because they change rapidly, are difficult to define directly with classification categories, and must be expressed accurately to support management countermeasures.
In this paper, we propose a framework for mining new types of cybercriminal tricks using an interdisciplinary method. Drawing on the cybercriminal tricks published on the website of the Ministry of Public Security, we define the common cybercriminal tricks as a two-level classification system. For new cybercriminal tricks that the existing categories do not cover, we use keyword recognition technology to detect representative keywords and manually confirm whether these keywords are sufficient to represent the tricks. Building on the existing research process for hot word recognition, we propose a new method for extracting cybercriminal tricks based on the classification of cybercrime-related content and the recognition of cybercrime keywords.
To achieve high accuracy in both the cybercrime classification model and the keyword extraction model, we design the BERT-JTFL joint training model. This model enables the cybercriminal trick classification model and the cybercrime keyword extraction model to share knowledge and reinforce each other. To deal with sample imbalance, we propose a multiclass Focal Loss that balances the weights of samples in the keyword extraction loss.
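As a rough illustration of the loss described above, the following sketch applies the standard focal loss form, -alpha * (1 - p_t)^gamma * log(p_t), to a softmax over several classes. It is not the authors' implementation: the function names, the default `gamma=2.0`, the optional per-class `alpha` weights, and the `kw_weight` used to combine the two task losses are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multiclass_focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample (e.g. one token in keyword tagging).

    gamma down-weights samples the model already classifies well;
    alpha (optional, assumed) is a list of per-class weights.
    With gamma=0 and alpha=None this reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

def joint_loss(cls_loss, kw_token_losses, kw_weight=1.0):
    """Hypothetical joint objective: classification loss plus the
    mean per-token keyword loss, scaled by an assumed kw_weight."""
    return cls_loss + kw_weight * sum(kw_token_losses) / len(kw_token_losses)
```

A confidently correct prediction contributes far less to this loss than an uncertain one, which is how the focal term keeps the abundant easy (mostly non-keyword) samples from dominating training.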
To mine cybercrime hot words, we propose a hot word recognition model that works as follows: first, it uses the classification of cybercriminal tricks to filter for texts in the cybercrime domain; next, it identifies keywords for each text, ensuring the text is representative of cybercrime activity; finally, it calculates keyword popularity over time using a historical weighted average and applies Bayesian smoothing to determine the cybercrime hot words.
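The abstract does not give the scoring formula, so the following is only one plausible sketch of the last step. It assumes per-period keyword counts and document totals, smooths each period's rate toward a prior rate with a pseudo-count (`strength`), and compares the latest smoothed rate against an exponentially decayed average of earlier periods. The names `smoothed_rate`, `weighted_history`, and `hotness`, and the parameters `strength` and `decay`, are hypothetical.

```python
def smoothed_rate(count, total, prior_rate, strength=10.0):
    """Bayesian-style smoothing: shrink count/total toward prior_rate.

    strength acts as a pseudo-count of prior observations, so periods
    with little data stay close to the prior instead of spiking.
    """
    return (count + strength * prior_rate) / (total + strength)

def weighted_history(rates, decay=0.5):
    """Weighted average of past rates (oldest first); the most recent
    past period gets weight 1, each older one is multiplied by decay."""
    n = len(rates)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)

def hotness(counts, totals, prior_rate, strength=10.0, decay=0.5):
    """Relative rise of the latest period over the weighted history.

    counts/totals are per-period lists, oldest first, with at least
    two periods; prior_rate must be positive.
    """
    if len(counts) < 2 or len(counts) != len(totals):
        raise ValueError("need at least two aligned periods")
    if prior_rate <= 0:
        raise ValueError("prior_rate must be positive")
    rates = [smoothed_rate(c, n, prior_rate, strength)
             for c, n in zip(counts, totals)]
    baseline = weighted_history(rates[:-1], decay)
    return (rates[-1] - baseline) / baseline
```

Under these assumptions, a keyword whose share of texts is flat scores about zero, while one that jumps in the latest period scores well above zero; ranking keywords by this score would give a candidate hot word list.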
New cybercriminal tricks have usually not appeared before. We therefore propose a mining model for new cybercriminal tricks based on cybercrime hot words: it screens the hot words for new words, then identifies and mines representative combinations of related keywords within a sliding window as a new type of cybercriminal trick.
These models are trained on preprocessed public police notices from the Internet and Weibo data from 2019 and 2020. The research results show that: 1) the BERT-JTFL joint training model designed in this paper outperforms the BERT and RoBERTa models in both text classification and keyword recognition tasks; 2) the novel hot word model tracks recent changes in keywords with smoothing and effectively captures hot cybercriminal tricks, with P@10 up to 83.3%; 3) the extracted new keywords and related keywords can effectively capture and identify new cybercriminal tricks and summarize their characteristics.
To prevent cybercrime proactively, to predict and fight new types of cybercrime precisely, and to make full use of the open-source intelligence already available on the Internet, we also propose ways to prevent and fight cybercrime from a management perspective.
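The screening and sliding-window steps can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's model: texts are assumed to be pre-tokenized lists, a "new word" is assumed to be any hot word absent from a historical vocabulary, and related keywords are ranked by simple co-occurrence counts within `window` tokens on either side. All function and parameter names are hypothetical.

```python
from collections import Counter

def find_new_words(hot_words, historical_vocab):
    """Keep hot words that never appeared in the historical vocabulary."""
    return [w for w in hot_words if w not in historical_vocab]

def related_keywords(docs, new_word, keywords, window=3, top_n=3):
    """Rank keywords that co-occur with new_word inside a sliding window.

    docs is a list of token lists; keywords is the set of recognized
    cybercrime keywords; window is the number of tokens on each side.
    """
    counts = Counter()
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok != new_word:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                neighbor = tokens[j]
                if j != i and neighbor in keywords and neighbor != new_word:
                    counts[neighbor] += 1
    return [w for w, _ in counts.most_common(top_n)]
```

For instance, with `docs = [["scan", "qr", "code", "refund"], ["fake", "refund", "qr", "code"]]` and `window=1`, the keywords found next to the new word `"qr"` would together describe a candidate trick for manual review.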
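For readers unfamiliar with the P@10 figure, precision at k is the fraction of the top k ranked items that are judged relevant. The short sketch below uses the common definition that divides by k; how the paper obtained its relevance judgments and averaged its score is not stated in the abstract, and the function name is an assumption.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that appear in the relevant set.

    ranked is ordered best first; relevant is a set of items judged
    correct (here, genuine cybercrime hot words).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k
```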

Key words: new cybercriminal tricks, BERT-JTFL, joint training, cybercrime management countermeasures, content classification, keyword, hot word

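Abstract (translated from the Chinese version): In recent years, the number of new types of cybercrime has kept growing; their methods are novel and refined, and the situation is severe and complex. Yet methods for mining and managing new cybercriminal tricks have so far received little study at home or abroad. Against this background, and to address the difficulty of mining and analyzing new cybercriminal tricks, this paper adopts an interdisciplinary approach, builds a framework for mining new cybercriminal tricks, and designs the BERT-JTFL joint training model, which lets the cybercriminal trick classification task and the keyword recognition task share knowledge and reinforce each other. The paper further designs a cybercrime hot word recognition model, mines new cybercriminal tricks from open-source intelligence data, and analyzes the mining results to suggest management countermeasures. The results show that: 1) the BERT-JTFL joint training model outperforms known models such as BERT and RoBERTa on both the text classification task and the keyword recognition task; 2) the hot word recognition model tracks recent shifts in keywords with smoothing and effectively captures trending cybercriminal tricks, with P@10 reaching 83.3%; 3) the extracted new words and related keywords can summarize new cybercriminal tricks and show that they develop rapidly and vary in method. Based on these results, the paper puts forward targeted management countermeasures.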

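Key words (translated from the Chinese version): new types of cybercrime, BERT-JTFL, joint training, cybercrime management, text classification, keyword, hot word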
