-
Analysis of Multidimensional Time-series Data Enhancement Based on TL-TimeGAN and its Application
- ZHI Luping, WANG Wanmin
-
2025, 34(5): 177-184.
DOI: 10.12005/orms.2025.0160
To address data scarcity and data imbalance in time-series anomaly detection, this paper proposes a multidimensional time-series anomaly detection model based on TL-TimeGAN (a Time-series Generative Adversarial Network based on a Temporal convolutional network and a Long short-term memory network). The model consists of data preprocessing, sliding time window creation, TL-TimeGAN, synthetic data quality evaluation, time-series data augmentation, a TabNet network, and model evaluation and interpretation.
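The sliding time window step can be illustrated with a minimal sketch. The function name `sliding_windows` and the `window` and `stride` parameters are hypothetical, chosen for illustration; the window length and stride actually used in the paper are not stated in this abstract.

```python
def sliding_windows(series, window, stride=1):
    """Split a multidimensional series into overlapping windows.

    series: list of time steps, each a list of feature values.
    Returns a list of windows, each containing `window` consecutive steps.
    (Hypothetical helper; not taken from the paper.)
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, stride)
    ]


# Toy series with 5 time steps and 2 features per step.
toy_series = [[0, 10], [1, 11], [2, 12], [3, 13], [4, 14]]
toy_windows = sliding_windows(toy_series, window=3)
```

Each window has shape (window, features), which is the usual input layout for sequence models such as a TimeGAN-style generator.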
To better capture the stepwise dependencies within sequences, this paper uses a temporal convolutional network (TCN) with the causality property to construct the generative adversarial network, and a long short-term memory (LSTM) network to construct the embedding network and the recurrent network, so that the model handles short-term and long-term dependencies simultaneously. The resulting framework combines supervised and unsupervised learning to learn not only the distribution of features within each time series but also the latent, complex relationships between variables at different time points, which explain the correlation of the series. It also retains the co-training characteristic of TimeGAN (Time-series Generative Adversarial Networks), in which the autoencoder networks and the generative adversarial networks are trained with different loss functions.
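The causality property of a temporal convolutional network means that the output at time t depends only on inputs at time t and earlier. A minimal single-channel sketch of a dilated causal convolution is given below; the function name `causal_conv1d` and the toy kernel are assumptions for illustration and do not reproduce the paper's network.

```python
def causal_conv1d(x, kernel, dilation=1):
    """Dilated causal 1D convolution on a single channel.

    y[t] = sum_k kernel[k] * x[t - k * dilation],
    where positions before the start of the sequence count as zero.
    (Illustrative sketch only; not the paper's implementation.)
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0]
y_d1 = causal_conv1d(signal, kernel=[1.0, 1.0], dilation=1)
y_d2 = causal_conv1d(signal, kernel=[1.0, 1.0], dilation=2)
```

Increasing the dilation widens the receptive field without adding parameters, which is how stacked TCN layers reach further back in time.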
This paper proposes a comprehensive evaluation method that combines qualitative and quantitative analyses to assess synthetic data quality. It evaluates the coverage, usefulness (degree of prediction), and similarity of the synthetic data through a combined analysis of coverage, usefulness, and similarity tests. The empirical results show that TL-TimeGAN outperforms TimeGAN in the coverage, usefulness, and similarity of the synthesized time-series data, captures the “time-series dynamics” in historical data well, synthesizes high-quality time-series data, and addresses the problem of data scarcity.
Because blockchain is anonymous and smart contracts execute automatically, undetected fraud may lead to irreversible economic losses or even harm to personal interests. Accurate and timely anomaly detection can warn users, avoid unnecessary economic losses, and promote the healthy development and application of blockchain technology. Therefore, using the Ethereum fraud detection dataset, this paper applies a TabNet network to detect anomalies in the augmented data and obtains both local and global feature importance, which enhances the practical guidance value of the augmented data. During TabNet training, the AMEX evaluation index is introduced as a customized evaluation metric to trigger early stopping and prevent overfitting.
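The sketch below shows one possible reading of this step. It assumes that the "AMEX evaluation index" refers to the metric of the Kaggle American Express default prediction competition (the mean of a weighted normalized Gini coefficient and the positive capture rate in the top 4% of weighted samples, with negatives weighted 20); the abstract does not confirm this. The function names `amex_metric` and `early_stop_epoch` are hypothetical.

```python
def amex_metric(y_true, y_score, neg_weight=20.0, top_frac=0.04):
    """Assumed AMEX metric: 0.5 * (normalized weighted Gini + top-4% capture)."""
    n = len(y_true)
    weights = [neg_weight if y == 0 else 1.0 for y in y_true]
    total_w = sum(weights)
    total_pos = sum(y * w for y, w in zip(y_true, weights))

    def weighted_gini(sort_key):
        # Rank samples by sort_key (descending) and integrate the gap between
        # the cumulative positive share and the cumulative weight share.
        order = sorted(range(n), key=lambda i: sort_key[i], reverse=True)
        cum_w = cum_pos = gini = 0.0
        for i in order:
            cum_w += weights[i]
            cum_pos += y_true[i] * weights[i]
            gini += (cum_pos / total_pos - cum_w / total_w) * weights[i]
        return gini

    # Share of positives captured within the top 4% of total weight.
    order = sorted(range(n), key=lambda i: y_score[i], reverse=True)
    cutoff = int(top_frac * total_w)
    cum_w = captured = 0.0
    for i in order:
        cum_w += weights[i]
        if cum_w > cutoff:
            break
        captured += y_true[i]
    capture_rate = captured / sum(y_true)

    normalized_gini = weighted_gini(y_score) / weighted_gini(y_true)
    return 0.5 * (normalized_gini + capture_rate)


def early_stop_epoch(scores, patience):
    """Return the epoch at which training would stop, or None.

    Stops once the validation score has not improved for `patience` epochs.
    """
    best, best_epoch = float("-inf"), -1
    for epoch, score in enumerate(scores):
        if score > best:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None


# Toy labels: 1 fraudulent account among 25, ranked perfectly vs. worst-case.
labels = [1] + [0] * 24
perfect = amex_metric(labels, [1.0] + [0.0] * 24)
worst = amex_metric(labels, [0.0] + [1.0] * 24)
```

A metric of this kind rewards ranking the rare positive class near the top, which suits imbalanced fraud data; in training, it would be evaluated on a validation set after each epoch and passed to an early-stopping rule like the one above.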
The TabNet network sparsely selects the most salient features through a masking layer, so that the learning capacity of each decision step is not wasted on irrelevant features, which improves the parameter efficiency of the model. To achieve global interpretability, the feature importances are visualized. According to the ranking, the ten most important features are: the number of ERC20 token transactions sent to unique account addresses, the maximum value of Ether received, the average value of Ether sent, the total number of normal transactions received, the total number of ERC20 token transactions sent in Ether, the total number of contract-creation transactions, the total number of ERC20 token transactions received in Ether, the total amount of ERC20 tokens transferred to other contracts in Ether, the time difference (in minutes) between the first and last transactions, and the total Ether balance after enacted transactions.
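In the original TabNet design, the sparse feature mask is produced with the sparsemax function, which projects scores onto the probability simplex and sets low-scoring entries to exactly zero. The sketch below assumes the same mechanism applies here; the function name `sparsemax` and the toy scores are illustrative only.

```python
def sparsemax(scores):
    """Sparsemax projection of a score vector onto the probability simplex.

    Unlike softmax, entries with low scores become exactly zero, so the
    resulting mask selects only a subset of features.
    """
    z_sorted = sorted(scores, reverse=True)
    cumulative = 0.0
    support_size, support_sum = 0, 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumulative += z
        # Largest k with 1 + k * z_(k) > sum of the top-k scores.
        if 1.0 + k * z > cumulative:
            support_size, support_sum = k, cumulative
    tau = (support_sum - 1.0) / support_size  # threshold
    return [max(z - tau, 0.0) for z in scores]


# Toy scores for three features: one dominant, two weak.
mask_sparse = sparsemax([2.0, 1.0, 0.1])
# Toy scores that already lie on the simplex are left unchanged.
mask_dense = sparsemax([0.5, 0.4, 0.1])
```

Features whose mask entry is zero contribute nothing to that decision step, which is what makes per-step (local) and aggregated (global) feature importances directly readable from the masks.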
In future work, the theoretical foundations of the autoencoder and the generative adversarial network need to be studied in depth to further optimize the network structure, reduce the memory usage of the model, and improve its performance.