Operations Research and Management Science, 2025, Vol. 34, Issue (5): 177-184. DOI: 10.12005/orms.2025.0160

• Applied Research •

Analysis of Multidimensional Time-series Data Enhancement Based on TL-TimeGAN and its Application

ZHI Luping, WANG Wanmin

  1. Business School, University of Shanghai for Science & Technology, Shanghai 200093, China
  • Received: 2023-05-04 Published: 2025-08-26
  • Corresponding author: WANG Wanmin (1996-), female, from Lu'an, Anhui; master's student; research interests: anomaly detection, supply chain management.
  • About the author: ZHI Luping (1982-), male, from Taiyuan, Shanxi; Ph.D., senior laboratory engineer, master's supervisor; research interest: supply chain management.
  • Funding:
    Shanghai Philosophy and Social Science Planning Project (2024BGL001)

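Summary: For time-series data that, in some scenarios, has few labels and imbalanced samples, and in order to better capture the stepwise dependencies within sequences, this paper builds the generative adversarial network from temporal convolutional networks, which have the causality property, and builds the embedding network and the recovery network from long short-term memory networks, so that the model handles short-term and long-term dependencies at the same time. The result is a time-series generative adversarial network based on a temporal convolutional network and a long short-term memory network (TL-TimeGAN). A combined analysis of coverage, usefulness, and similarity tests serves as the quality measure for the synthetic data, assessing its coverage, predictive usefulness, and similarity. Finally, on the Ethereum fraud detection dataset, a TabNet network performs anomaly detection on the augmented data and yields local and global feature importance, which strengthens the practical guidance value of the augmented data.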

Abstract: To address data scarcity and data imbalance in time-series anomaly detection, this paper proposes a multidimensional time-series anomaly detection model based on TL-TimeGAN (a Time-series Generative Adversarial Network based on a Temporal convolutional network and a Long short-term memory network). The model consists of data preprocessing, sliding-time-window construction, TL-TimeGAN, synthetic data quality evaluation, time-series data augmentation, a TabNet network, and model evaluation and interpretation.
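The sliding-time-window step can be illustrated with a minimal sketch. The function name `sliding_windows`, the `stride` parameter, and the toy data are assumptions made for illustration; the abstract does not state the window length or stride used in the paper.

```python
import numpy as np

def sliding_windows(series, window, stride=1):
    """Cut a (n_steps, n_features) series into overlapping windows.

    Returns an array of shape (n_windows, window, n_features).
    """
    series = np.asarray(series)
    if window > len(series):
        raise ValueError("window is longer than the series")
    # Each window starts `stride` steps after the previous one.
    starts = range(0, len(series) - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

# Toy multidimensional series: 10 time steps, 2 features (hypothetical data).
data = np.arange(20, dtype=float).reshape(10, 2)
windows = sliding_windows(data, window=4)
print(windows.shape)  # (7, 4, 2)
```

Each window is then one training sequence for a sequence generator such as TimeGAN.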
To better capture the stepwise dependencies within sequences, this paper builds the generative adversarial network from temporal convolutional networks, whose convolutions are causal, and builds the embedding network and the recovery network from long short-term memory networks, so that the model handles short-term and long-term dependencies simultaneously. The resulting framework combines supervised and unsupervised learning: it learns the distribution of features within each time series as well as the latent, complex relationships between variables at different time points, which explain the correlation of the series. It also retains the joint training of TimeGAN (Time-series Generative Adversarial Networks), which relies on different loss functions to train the autoencoder networks and the generative adversarial networks.
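The causality property of a temporal convolution can be shown with a single-channel sketch: the output at step t depends only on inputs at step t and earlier. This is a simplification under stated assumptions (one channel, a fixed hand-written kernel, no residual blocks); the function name `causal_conv1d` is hypothetical and is not the authors' implementation.

```python
import numpy as np

def causal_conv1d(x, kernel, dilation=1):
    """Causal dilated 1D convolution for a single channel.

    y[t] = sum_j kernel[j] * x[t - j * dilation], with zeros before the start,
    so y[t] never sees x[t + 1], x[t + 2], ...
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation          # left padding only
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for t in range(len(x)):
        for j, w in enumerate(kernel):
            # kernel[j] weights the input j * dilation steps in the past
            y[t] += w * padded[pad + t - j * dilation]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
y1 = causal_conv1d(x, kernel=[1.0, 1.0], dilation=1)  # x[t] + x[t-1]
y2 = causal_conv1d(x, kernel=[1.0, 1.0], dilation=2)  # x[t] + x[t-2]
```

Stacking such layers with growing dilation widens the receptive field, which is the usual way a temporal convolutional network reaches further into the past.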
To assess synthetic data quality, we propose an evaluation method that combines qualitative and quantitative analyses: coverage, usefulness, and similarity tests jointly evaluate the coverage, predictive usefulness, and similarity of the synthetic data. The empirical results show that TL-TimeGAN outperforms TimeGAN in the coverage, usefulness, and similarity of the synthesized time-series data, captures the “time-series dynamics” in historical data well, synthesizes high-quality time-series data, and thereby addresses the problem of data scarcity.
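One common way to test usefulness is "train on synthetic, test on real": fit a predictor on synthetic windows and measure its error on real windows. The sketch below assumes this reading and uses a linear one-step-ahead predictor for brevity; the abstract does not specify the paper's exact test, and the name `tstr_mae` and the toy data are hypothetical.

```python
import numpy as np

def tstr_mae(synthetic, real):
    """Train on synthetic, test on real.

    Both inputs have shape (n_windows, seq_len, n_features). A linear model
    predicts the last step of each window from the earlier steps; it is
    fitted on synthetic windows and scored (mean absolute error) on real ones.
    """
    def split(windows):
        x = windows[:, :-1, :].reshape(len(windows), -1)  # all steps but the last
        y = windows[:, -1, :]                             # last step is the target
        x = np.hstack([x, np.ones((len(x), 1))])          # bias column
        return x, y

    x_syn, y_syn = split(np.asarray(synthetic, dtype=float))
    x_real, y_real = split(np.asarray(real, dtype=float))
    coef, *_ = np.linalg.lstsq(x_syn, y_syn, rcond=None)
    return float(np.mean(np.abs(x_real @ coef - y_real)))

# Toy check: windows whose last step is half the sum of the earlier steps.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 5, 2))
real[:, -1, :] = 0.5 * real[:, :-1, :].sum(axis=1)
good = rng.normal(size=(200, 5, 2))
good[:, -1, :] = 0.5 * good[:, :-1, :].sum(axis=1)   # follows the same dynamics
bad = rng.normal(size=(200, 5, 2))                   # ignores the dynamics
```

A lower score for `good` than for `bad` indicates that the synthetic set preserves the temporal relationship a downstream predictor needs.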
Because blockchain is anonymous and smart contracts execute automatically, undetected fraud may cause irreversible economic losses or even harm personal interests. Accurate and timely anomaly detection can therefore warn users, avoid unnecessary economic losses, and promote the healthy development and application of blockchain technology. For this reason, based on the Ethereum fraud detection dataset, we use a TabNet network to detect anomalies in the augmented data and obtain both local and global feature importance, which strengthens the practical guidance value of the augmented data. During TabNet training, the AMEX evaluation metric is introduced as a custom evaluation metric to trigger early stopping and prevent overfitting.
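The abstract does not define the AMEX evaluation metric. The sketch below assumes it refers to the metric from the American Express default prediction competition: the mean of a normalized weighted Gini coefficient and the share of positives captured in the top 4% of weighted predictions, with negatives weighted 20. The function name `amex_metric` and its parameters are illustrative, not taken from the paper.

```python
import numpy as np

def amex_metric(y_true, y_score, neg_weight=20.0, top_frac=0.04):
    """Assumed AMEX metric: 0.5 * (normalized weighted Gini + top-4% capture rate)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)

    def weighted_gini(order_by):
        # Rank samples from highest to lowest score, then compare the
        # cumulative share of positives found with the cumulative weight share.
        order = np.argsort(-order_by, kind="stable")
        y = y_true[order]
        w = np.where(y == 0, neg_weight, 1.0)
        cum_weight = np.cumsum(w) / w.sum()
        cum_found = np.cumsum(y * w) / np.sum(y * w)
        return np.sum((cum_found - cum_weight) * w)

    # Capture rate: positives among the top `top_frac` of total weight.
    order = np.argsort(-y_score, kind="stable")
    y = y_true[order]
    w = np.where(y == 0, neg_weight, 1.0)
    in_top = np.cumsum(w) <= top_frac * w.sum()
    captured = y[in_top].sum() / y.sum()

    # Normalize by the Gini of a perfect ranking.
    gini = weighted_gini(y_score) / weighted_gini(y_true)
    return 0.5 * (gini + captured)

labels = np.array([1] * 10 + [0] * 90)
perfect = amex_metric(labels, labels.astype(float))
reversed_rank = amex_metric(labels, 1.0 - labels)
```

A function of this shape can be evaluated on a validation set after each epoch, and training stopped once the score stops improving.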
The TabNet network sparsely selects the most salient features through a masking layer, so that the learning capacity of each decision step is not wasted on irrelevant features, which improves the parameter efficiency of the model. To achieve global interpretability, we visualize feature importance. According to the ranking, the ten most important features are: the number of ERC20 token transactions sent to unique account addresses, the maximum value of Ether received, the average value of Ether sent, the total number of normal transactions received, the total number of ERC20 token transactions sent in Ether, the total number of contract-creation transactions, the total number of ERC20 token transactions received in Ether, the total amount of ERC20 tokens transferred to other contracts in Ether, the time difference (in minutes) between the first and last transaction, and the total Ether balance after enacted transactions.
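Sparse feature selection of this kind is typically produced by a sparsemax activation, which maps scores to a probability vector with exact zeros. The sketch below is a toy illustration under that assumption: the scores are made up, and averaging the masks is a simplified stand-in for global importance, whereas TabNet itself also weights each decision step by its contribution.

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex; low scores become exactly 0."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum     # true for a prefix of the sorted scores
    k_max = k[support][-1]                  # number of features kept
    tau = (cumsum[k_max - 1] - 1) / k_max   # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

# Hypothetical attention scores: 2 samples, 3 features.
scores = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, -1.0]])
masks = np.array([sparsemax(row) for row in scores])  # per-sample (local) importance
global_importance = masks.mean(axis=0)                # simplified global importance
```

Features with a zero mask entry receive no weight at that decision step, which is what keeps capacity from being spent on irrelevant inputs.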
In future work, the theoretical foundations of the autoencoder and the generative adversarial network need to be studied in depth in order to further optimize the network structure, reduce the model's memory usage, and improve its performance.

Key words: temporal convolutional networks, long short-term memory networks, time-series generative adversarial networks, time-series data augmentation, multidimensional time-series
