📝 Original Info
- Title: PHANTOM: Progressive High-fidelity Adversarial Network for Threat Object Modeling
- ArXiv ID: 2512.15768
- Date: 2025-12-12
- Authors: Jamal Al‑Karaki¹٬², Muhammad Al‑Zafar Khan¹* (corresponding author), Rand Derar Mohammad Al Athamneh¹
- Affiliations: ¹ College of Interdisciplinary Studies, Zayed University, Abu Dhabi, UAE. ² College of Engineering, The Hashemite University, Zarqa, Jordan.
- E‑mail: Muhammad.Khan@zu.ac.ae (corresponding), Jamal.Al-Karaki@zu.ac.ae
📝 Abstract
The scarcity of high-quality cyberattack datasets poses a fundamental challenge to developing robust machine learning-based intrusion detection systems. Real-world attack data is difficult to obtain due to privacy regulations, organizational reluctance to share breach information, and the rapidly evolving threat landscape. This paper introduces PHANTOM (Progressive High-fidelity Adversarial Network for Threat Object Modeling), a novel multi-task adversarial variational framework specifically designed for generating synthetic cyberattack datasets. PHANTOM addresses the unique challenges of cybersecurity data through three key innovations: Progressive training that captures attack patterns at multiple resolutions, dual-path learning that combines VAE stability with GAN fidelity, and domain-specific feature matching that preserves temporal causality and behavioral semantics. We implement a Multi-Task Adversarial VAE with Progressive Feature Matching (MAV-PFM) architecture that incorporates specialized loss functions for reconstruction, adversarial training, feature preservation, classification accuracy, and cyber-specific constraints. Experimental validation on a realistic synthetic dataset of 100 000 network traffic samples across five attack categories demonstrates that PHANTOM achieves 98% weighted accuracy when used to train intrusion detection models tested on real attack samples. Statistical analyses, including kernel density estimation, nearest neighbor distance distributions, and t-SNE visualizations, confirm that generated attacks preserve the distributional properties, diversity, and class separability of authentic cyberattack patterns. However, results also reveal limitations in generating rare attack types, highlighting the need for specialized handling of severely imbalanced classes. This work advances the state-of-the-art in synthetic cybersecurity data generation, providing a foundation for training more robust threat detection systems while maintaining privacy and security.
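The abstract describes validation in which intrusion detection models are trained on PHANTOM-generated samples and then tested on real attack samples, i.e. a train-synthetic, test-real (TSTR) style protocol. The sketch below only illustrates that general protocol under assumed placeholder data and an assumed random-forest detector; it is not the authors' code, dataset, or model choice.

```python
# Hypothetical TSTR-style evaluation sketch (not the paper's implementation).
# X_synth / y_synth stand in for PHANTOM-generated flows; X_real / y_real for real traffic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Placeholder data: 5 attack categories, 20 flow features (purely illustrative shapes).
X_synth, y_synth = rng.normal(size=(5000, 20)), rng.integers(0, 5, size=5000)
X_real, y_real = rng.normal(size=(1000, 20)), rng.integers(0, 5, size=1000)

# Train the detector on synthetic attacks only ...
detector = RandomForestClassifier(n_estimators=200, random_state=0)
detector.fit(X_synth, y_synth)

# ... and score it on real attacks; the paper reports weighted metrics for this step.
print(classification_report(y_real, detector.predict(X_real), zero_division=0))
```

Weighted accuracy and per-class recall from such a report are what reveal the rare-class limitation the abstract mentions: overall scores can stay high while minority attack categories are missed.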
💡 Deep Analysis
📄 Full Content
PHANTOM: Progressive High-fidelity Adversarial Network for Threat Object Modeling
Jamal Al-Karaki¹٬², Muhammad Al-Zafar Khan¹*, Rand Derar Mohammad Al Athamneh¹
¹ College of Interdisciplinary Studies, Zayed University, Abu Dhabi, UAE.
² College of Engineering, The Hashemite University, Zarqa, Jordan.
*Corresponding author. E-mail: Muhammad.Khan@zu.ac.ae
Contributing author: Jamal.Al-Karaki@zu.ac.ae
Abstract
The scarcity of high-quality cyberattack datasets poses a fundamental challenge to developing robust machine learning-based intrusion detection systems. Real-world attack data is difficult to obtain due to privacy regulations, organizational reluctance to share breach information, and the rapidly evolving threat landscape. This paper introduces PHANTOM (Progressive High-fidelity Adversarial Network for Threat Object Modeling), a novel multi-task adversarial variational framework specifically designed for generating synthetic cyberattack datasets. PHANTOM addresses the unique challenges of cybersecurity data through three key innovations: Progressive training that captures attack patterns at multiple resolutions, dual-path learning that combines VAE stability with GAN fidelity, and domain-specific feature matching that preserves temporal causality and behavioral semantics. We implement a Multi-Task Adversarial VAE with Progressive Feature Matching (MAV-PFM) architecture that incorporates specialized loss functions for reconstruction, adversarial training, feature preservation, classification accuracy, and cyber-specific constraints. Experimental validation on a realistic synthetic dataset of 100 000 network traffic samples across five attack categories demonstrates that PHANTOM achieves 98% weighted accuracy when used to train intrusion detection models tested on real attack samples. Statistical analyses, including kernel density estimation, nearest neighbor distance distributions, and t-SNE visualizations, confirm that generated attacks preserve the distributional properties, diversity, and class separability of authentic cyberattack patterns. However, results also reveal limitations in generating rare attack types, highlighting the need for specialized handling of severely imbalanced classes. This work advances the state-of-the-art in synthetic cybersecurity data generation, providing a foundation for training more robust threat detection systems while maintaining privacy and security.
Keywords: Synthetic Cyberattack Generation, Adversarial Generative Modeling, Cybersecurity Data Scarcity, Intrusion Detection Augmentation
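The abstract states that the MAV-PFM architecture combines loss terms for reconstruction, adversarial training, feature preservation, classification accuracy, and cyber-specific constraints, but this excerpt does not give their exact form. A plausible composite objective consistent with that description is sketched below; the weighting coefficients λ and the individual terms are assumptions for illustration, not the paper's definitions.

```latex
% Illustrative composite objective; term definitions and weights are assumed, not from the paper.
\mathcal{L}_{\text{MAV-PFM}}
  = \lambda_{\text{rec}}   \, \mathcal{L}_{\text{rec}}    % VAE reconstruction (with KL regularization)
  + \lambda_{\text{adv}}   \, \mathcal{L}_{\text{adv}}    % GAN adversarial loss
  + \lambda_{\text{fm}}    \, \mathcal{L}_{\text{fm}}     % progressive feature matching
  + \lambda_{\text{cls}}   \, \mathcal{L}_{\text{cls}}    % attack-class classification loss
  + \lambda_{\text{cyber}} \, \mathcal{L}_{\text{cyber}}  % cyber-specific (temporal/behavioral) constraints
```

Each λ would balance its term against the others during progressive training; how PHANTOM actually weights or schedules these terms is not specified in this excerpt.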
1 Introduction
The exponential growth of cyber threats in recent years has created an urgent demand for robust cybersecurity systems capable of detecting and mitigating sophisticated attacks [1–3]. Machine Learning (ML) and Deep Learning (DL) models have emerged as powerful tools for threat detection [4, 5], enabling automated analysis of network traffic [6], system logs [7], and user behavior patterns [8]. However, the effectiveness of these models hinges critically on the availability of diverse, representative training data that captures the full spectrum of attack vectors and techniques employed by adversaries.
Despite this need, obtaining high-quality cyberattack datasets remains one of the most significant challenges in cybersecurity research and practice. Real-world attack data is inherently scarce due to several factors:
1. Organizations are often reluctant to share sensitive breach information due to legal and reputational concerns [9].
2. Privacy regulations restrict the dissemination of network traffic containing potentially identifiable information [10].
3. The rapidly evolving threat landscape means that historical datasets quickly become obsolete [11].
Additionally, even when attack data is available, it often suffers from severe class imbalance, with benign traffic vastly outnumbering malicious samples, leading to biased models that struggle to detect novel or rare attack patterns.
Synthetic data generation has emerged as a promising solution to address these limitations [12, 13]. By artificially creating realistic cyberattack samples, researchers can augment existing datasets, balance class distributions, and generate examples of rare or emerging threats that may not yet exist in operational environments. However, traditional synthetic data generation techniques, such as rule-based simulation and simple statistical sampling, often produce oversimplified attack patterns that lack the complexity and variability of real-world threats. Models trained on such synthetic data frequently exhibit poor generalization when deployed in production environments, as they fail to capture the nuanced behavioral characteristics of actual attackers.
Recent advances in generative modeling, particularly Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), offer a paradigm shift in synthetic data generation. The
Reference
This content is AI-processed based on open access ArXiv data.