Detection of Thin Boundaries between Different Types of Anomalies in Outlier Detection using Enhanced Neural Networks
Outlier detection has received special attention in various fields, mainly those dealing with machine learning and artificial intelligence. As strong outliers, anomalies are divided into point, contextual, and collective outliers. The most important challenges in outlier detection include the thin boundary between remote points and the normal region, the tendency of new data and noise to mimic real data, unlabelled datasets, and the differing definitions of outliers across applications. Considering these challenges, we define new types of anomalies, called Collective Normal Anomaly and Collective Point Anomaly, to enable better detection of the thin boundary between different types of anomalies. Basic domain-independent methods are introduced to detect these defined anomalies in both unsupervised and supervised datasets. The Multi-Layer Perceptron Neural Network is enhanced using the Genetic Algorithm to detect the newly defined anomalies with higher precision, ensuring a test error lower than that of the conventional Multi-Layer Perceptron Neural Network. Experimental results on benchmark datasets show a reduced anomaly-detection error in comparison to baselines.
💡 Research Summary
The paper addresses a persistent challenge in anomaly detection: the “thin boundary” that separates normal data from various types of anomalies, especially when the boundary is so narrow that conventional methods struggle to distinguish between them. While traditional taxonomy classifies anomalies into point, contextual, and collective categories, the authors argue that this division is insufficient for real‑world data where subtle overlaps occur. To bridge this gap, they introduce two novel anomaly types: Collective Normal Anomaly (CNA) and Collective Point Anomaly (CPA).
CNA is defined as a cluster of data points whose standard‑deviation density is equal to or exceeds a global threshold, meaning it lies very close to the normal data distribution but can still be identified through clustering. CPA, on the other hand, is a subset of point anomalies whose neighborhood radius is smaller than the average radius of all point anomalies, making it difficult to cluster and thus representing a “thin” region within the point‑anomaly space. Formal definitions are provided through equations (1)–(8), establishing precise mathematical criteria for distinguishing these new classes.
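The CPA criterion described above can be sketched in a few lines of Python. Since equations (1)–(8) are not reproduced here, the choice of neighborhood radius (distance to the k-th nearest neighbor) is an assumption made only for illustration; the comparison against the average radius of all point anomalies follows the prose definition.

```python
import numpy as np

def label_cpa(point_anomalies: np.ndarray, k: int = 3) -> np.ndarray:
    """Label each point anomaly as 'CPA' or 'PA'.

    A point anomaly's neighborhood radius is taken here as the distance
    to its k-th nearest neighbor (an illustrative assumption; the paper
    defines it formally via equations (1)-(8)).  Anomalies whose radius
    falls below the average radius of all point anomalies are marked CPA.
    """
    # Pairwise Euclidean distances between all point anomalies.
    diff = point_anomalies[:, None, :] - point_anomalies[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Column 0 of the sorted rows is the self-distance (0), so the
    # k-th nearest neighbor sits at column k.
    radii = np.sort(dist, axis=1)[:, min(k, len(point_anomalies) - 1)]
    return np.where(radii < radii.mean(), "CPA", "PA")

pts = np.array([[0.0, 0.0], [0.1, 0.1], [0.05, 0.0],   # tightly packed
                [5.0, 5.0], [9.0, -4.0]])              # widely spread
print(label_cpa(pts, k=1))
```

Points whose nearest-neighbor radius is below the group average are flagged CPA, matching the "thin region within the point-anomaly space" described above.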
The authors propose a five‑step framework for handling both supervised and unsupervised datasets:
- Feature Selection & Weighting – From the original k features, l discriminative features are selected based on separation capability. Normalization and a weighting scheme (Equations 9‑10) generate new composite features, ensuring balanced scales across dimensions.
- Dataset Partitioning & Clustering – Supervised data are split into class‑specific sub‑datasets; unsupervised data undergo clustering (e.g., K‑means, X‑means) to form provisional groups.
- Labeling of Anomalies – Using the mathematical definitions, each instance is labeled as ND (normal), CNA, PA (point anomaly), or CPA.
- Normalization of Sub‑datasets – After weighting, a second normalization step removes residual scale differences before integration.
- Integration – All sub‑datasets are merged into a single training set for the downstream model.
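For the unsupervised case, the five steps above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: min-max scaling stands in for the paper's weighting scheme (Equations 9–10), a plain k-means stands in for the clustering step, and the labeling rule (largest cluster = ND, the rest = candidate anomalies) is a placeholder for the formal definitions in equations (1)–(8).

```python
import numpy as np

def min_max_normalize(X):
    """Steps 1 and 4: rescale every feature to [0, 1].  A simple min-max
    scheme is used here in place of the paper's weighting equations."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi - lo == 0, 1, hi - lo)

def kmeans(X, k, iters=50, seed=0):
    """Step 2: plain k-means to form provisional groups."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def build_training_set(X, k=2):
    """Steps 1-5 chained: normalize, cluster, label, integrate.
    The labeling here is a placeholder (largest cluster = ND, the rest
    = candidate anomalies); the paper applies its formal definitions."""
    Xn = min_max_normalize(X)
    clusters = kmeans(Xn, k)
    biggest = np.bincount(clusters).argmax()
    labels = np.where(clusters == biggest, "ND", "candidate-anomaly")
    return Xn, labels   # merged single training set for the MLP

# Toy run: a dense normal group plus a small remote group.
X = np.vstack([np.zeros((8, 2)), np.full((2, 2), 10.0)])
Xn, labels = build_training_set(X, k=2)
print(labels)
```

In the supervised case, the clustering step would be replaced by splitting on class labels, with the remaining steps unchanged.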
The core detection engine is a Multi‑Layer Perceptron Neural Network (MLP‑NN) whose performance is notoriously sensitive to initial weight values and can become trapped in local minima. To overcome this, the authors embed a Genetic Algorithm (GA) into the training pipeline. Two identical MLPs are instantiated: one follows standard back‑propagation, while the other serves as the fitness evaluator within the GA. The GA searches the weight‑bias space (bounded between 0 and 1) using crossover and mutation, evaluating each candidate by the Mean Squared Error (MSE) of the corresponding MLP. Key architectural choices include:
- Input layer with 2 neurons (assuming a 2‑dimensional feature space)
- One hidden layer with 10 neurons (determined empirically to balance under‑fitting and over‑fitting)
- Output layer with 4 neurons, each representing one of the four classes (ND, CNA, PA, CPA) using a one‑hot encoding scheme
- Tanh‑type sigmoid (Tansig) activation for both hidden and output layers, providing non‑linear capability while keeping outputs within a manageable range
- Scaled Conjugate Gradient (Trainscg) as the back‑propagation optimizer for rapid convergence
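The GA-over-weights idea can be sketched as follows. The 2-10-4 tanh architecture and the [0, 1] weight bound follow the choices listed above, while the population size, truncation selection, one-point crossover, and mutation rate are illustrative assumptions; the paper's companion MLP trained with scaled conjugate gradient back-propagation is omitted here.

```python
import numpy as np

rng = np.random.default_rng(42)
N_IN, N_HID, N_OUT = 2, 10, 4          # architecture from the paper
N_W = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # weights + biases

def mlp_forward(w, X):
    """Run the 2-10-4 MLP with Tansig layers from a flat weight vector."""
    i = 0
    W1 = w[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = w[i:i + N_HID]; i += N_HID
    W2 = w[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = w[i:i + N_OUT]
    h = np.tanh(X @ W1 + b1)           # Tansig hidden layer
    return np.tanh(h @ W2 + b2)        # Tansig output layer

def mse(w, X, Y):
    """GA fitness: mean squared error of the candidate MLP."""
    return ((mlp_forward(w, X) - Y) ** 2).mean()

def ga_train(X, Y, pop_size=30, gens=100, mut_rate=0.1):
    """Search the weight-bias space in [0, 1] with crossover + mutation.
    Population size, generations, and mutation rate are illustrative."""
    pop = rng.random((pop_size, N_W))            # weights bounded in [0, 1]
    for _ in range(gens):
        fit = np.array([mse(w, X, Y) for w in pop])
        order = np.argsort(fit)
        parents = pop[order[:pop_size // 2]]     # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, N_W)           # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            mask = rng.random(N_W) < mut_rate    # mutation, stays in [0, 1]
            child[mask] = rng.random(mask.sum())
            children.append(child)
        pop = np.vstack([parents, children])
    fit = np.array([mse(w, X, Y) for w in pop])
    return pop[fit.argmin()], fit.min()
```

Because the fittest half of each generation is carried over unchanged, the best MSE is non-increasing across generations, which is the mechanism the authors rely on to avoid the poor local minima that a purely gradient-based MLP can fall into.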
Through extensive experiments on benchmark datasets such as KDD‑Cup, NSL‑KDD, and several public anomaly repositories, the GA‑enhanced MLP consistently outperforms baseline MLP, Support Vector Machines, Isolation Forest, and LOF. Quantitatively, the proposed method reduces test error by an average of 12 % and achieves an F1‑score above 0.85 for the newly defined CPA and CNA classes—areas where conventional detectors typically falter. Additionally, the authors present scalability and efficiency analyses (Figures 2 and 3) showing that the GA‑MLP maintains competitive runtime and memory usage even as dimensionality rises to 80 features, highlighting its suitability for high‑dimensional applications.
The paper also acknowledges limitations. The GA introduces additional computational overhead, especially during hyper‑parameter tuning (population size, crossover/mutation rates). Moreover, the generalizability of CNA and CPA across domains beyond the tested network‑traffic and credit‑card datasets remains to be validated. Real‑time or streaming scenarios would require further optimization to meet latency constraints.
In conclusion, this work makes three substantive contributions:
- Conceptual Extension – By formally defining CNA and CPA, it enriches the anomaly taxonomy to capture subtle, thin‑boundary cases.
- Methodological Innovation – The integration of GA for weight optimization in MLP provides a robust mechanism to escape local minima and improve detection accuracy for complex, non‑linear boundaries.
- Practical Framework – The five‑step pipeline offers a reproducible process for preparing both labeled and unlabeled data, making the approach applicable across diverse domains.
Future research directions suggested include exploring hybrid meta‑heuristics (e.g., Particle Swarm or Differential Evolution) to reduce GA’s computational cost, extending the framework to online learning environments, and conducting domain‑specific studies to assess the prevalence and impact of CNA/CPA in fields such as IoT sensor networks, medical diagnostics, and financial fraud detection.