A hybrid artificial immune system and Self Organising Map for network intrusion detection

Network intrusion detection is the problem of detecting unauthorised use of, or access to, computer systems over a network. Two broad approaches exist to tackle this problem: anomaly detection and misuse detection. An anomaly detection system is trained only on examples of normal connections, and thus has the potential to detect novel attacks. However, many anomaly detection systems simply report the anomalous activity, rather than analysing it further in order to report higher-level information that is of more use to a security officer. On the other hand, misuse detection systems recognise known attack patterns, thereby allowing them to provide more detailed information about an intrusion. However, such systems cannot detect novel attacks. A hybrid system is presented in this paper with the aim of combining the advantages of both approaches. Specifically, anomalous network connections are initially detected using an artificial immune system. Connections that are flagged as anomalous are then categorised using a Kohonen Self Organising Map, allowing higher-level information, in the form of cluster membership, to be extracted. Experimental results on the KDD 1999 Cup dataset show a low false positive rate and a detection and classification rate for Denial-of-Service and User-to-Root attacks that is higher than those in a sample of other works.

💡 Research Summary

Network intrusion detection systems (NIDS) remain a critical component of modern cybersecurity, yet existing approaches tend to fall into two distinct categories: anomaly‑based detection and misuse (signature) detection. Anomaly‑based systems are trained solely on benign traffic, giving them the theoretical ability to flag previously unseen attacks. However, most such systems merely raise a binary “anomalous” alarm, leaving security analysts to perform costly manual investigations. Misuse‑based systems, by contrast, match traffic against a database of known attack signatures, providing detailed information about the intrusion but failing to detect novel threats.

The paper proposes a hybrid architecture that seeks to combine the strengths of both paradigms while mitigating their weaknesses. The system consists of two sequential modules. The first module is an Artificial Immune System (AIS) that performs binary anomaly detection. AIS draws inspiration from biological immune processes, particularly negative selection and clonal selection. A population of detectors is randomly generated in the feature space derived from the 41 attributes of the KDD‑99 dataset. Each detector is a 30‑dimensional vector after preprocessing (normalization of continuous attributes and one‑hot encoding of categorical attributes). Detectors that match any normal training sample, that is, whose distance to it falls below a threshold θ, are eliminated during a negative‑selection phase, so that the surviving detectors collectively cover regions of the space that are unlikely to contain benign traffic. The remaining detectors are then refined through clonal selection: detectors that fire on anomalous samples are cloned, mutated (Gaussian noise), and re‑evaluated, allowing the detector set to adapt its coverage while preserving high sensitivity. In the operational phase, each incoming connection is compared against all detectors; if no detector matches within the threshold θ, the connection is labeled “normal,” otherwise it is flagged as “anomalous.” This comparison takes O(N·D) time per connection (N = number of detectors, D = feature dimension) and can be accelerated with vectorized operations or GPU support, enabling real‑time processing of thousands of connections per second.
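The negative‑selection and detection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses Euclidean distance as the matching rule, draws detectors uniformly from the unit hypercube, runs on hypothetical 2‑dimensional toy data, and omits clonal selection. The names `generate_detectors`, `is_anomalous`, and `THETA` are illustrative.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_detectors(self_samples, n_detectors, theta, dim, rng, max_tries=100_000):
    """Negative selection: keep random candidates farther than theta from every normal sample."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = [rng.random() for _ in range(dim)]
        # A candidate that matches any normal ("self") sample is discarded.
        if all(distance(candidate, s) > theta for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_anomalous(x, detectors, theta):
    """Flag x if any detector lies within theta of it; costs O(N * D) per connection."""
    return any(distance(x, d) <= theta for d in detectors)


# Toy 2-D data (hypothetical): "normal" connections cluster around (0.2, 0.2).
rng = random.Random(0)
normal = [[rng.gauss(0.2, 0.03), rng.gauss(0.2, 0.03)] for _ in range(200)]
THETA = 0.15  # assumed matching threshold, not a value from the paper
detectors = generate_detectors(normal, n_detectors=50, theta=THETA, dim=2, rng=rng)

flags_on_training_normals = sum(is_anomalous(s, detectors, THETA) for s in normal)
print(len(detectors), flags_on_training_normals)
```

Because every surviving detector lies farther than θ from all normal training samples, none of those samples can be flagged at detection time. False positives arise only on normal connections that were not seen during training.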

The second module addresses the lack of contextual information in pure anomaly detection. All connections that the AIS marks as anomalous are fed into a Kohonen Self‑Organizing Map (SOM). The SOM is a two‑dimensional lattice (10 × 10 neurons in the reported configuration) that learns to organize high‑dimensional input vectors onto a low‑dimensional grid while preserving topological relationships. During training, the SOM is exposed to labeled attack samples from the KDD‑99 dataset, covering the four major attack categories: Denial‑of‑Service (DoS), Probe, Remote‑to‑Local (R2L), and User‑to‑Root (U2R). The learning rate decays from 0.5 to 0.01, and the neighborhood function follows a Gaussian profile, so that similar attack patterns converge to neighboring neurons. After training, each neuron is associated with a dominant attack class. When a new anomalous connection arrives, the SOM identifies the best‑matching unit (BMU) and assigns the connection to the class of that neuron, thereby providing a high‑level “cluster membership” label. This step adds semantic richness to the alarm without imposing significant computational overhead, as finding the BMU requires only one distance calculation per neuron (100 for a 10 × 10 map).
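The SOM stage can be illustrated with a small, self‑contained sketch. It is written under assumptions that the summary does not specify: an exponential learning‑rate schedule between 0.5 and 0.01, a linearly shrinking Gaussian neighborhood radius, majority voting to label neurons, and hypothetical 2‑dimensional toy clusters on a 4 × 4 map instead of KDD‑99 features on a 10 × 10 map. All function names are illustrative.

```python
import math
import random
from collections import Counter


def bmu_index(weights, x):
    """Index of the best-matching unit: the neuron whose weights are closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som(samples, rows, cols, epochs, rng, lr_start=0.5, lr_end=0.01):
    """Train a rows x cols SOM with a Gaussian neighborhood and a decaying learning rate."""
    dim = len(samples[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr_start * (lr_end / lr_start) ** frac         # assumed exponential decay
            sigma = max(max(rows, cols) / 2 * (1 - frac), 0.5)  # assumed shrinking radius
            br, bc = coords[bmu_index(weights, x)]
            for i, (r, c) in enumerate(coords):
                # Gaussian neighborhood: neurons near the BMU on the grid move more.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
            step += 1
    return weights


def label_neurons(weights, samples, labels):
    """Assign each neuron the majority label of the training samples it wins."""
    votes = {}
    for x, y in zip(samples, labels):
        votes.setdefault(bmu_index(weights, x), Counter())[y] += 1
    return {i: counts.most_common(1)[0][0] for i, counts in votes.items()}


def classify(weights, neuron_labels, x):
    """Return the class of x's BMU, or 'unknown' if that neuron won no training sample."""
    return neuron_labels.get(bmu_index(weights, x), "unknown")


# Toy 2-D data (hypothetical): two well-separated attack clusters.
rng = random.Random(1)
dos = [[rng.gauss(0.1, 0.02), rng.gauss(0.1, 0.02)] for _ in range(40)]
u2r = [[rng.gauss(0.9, 0.02), rng.gauss(0.9, 0.02)] for _ in range(40)]
samples, labels = dos + u2r, ["dos"] * 40 + ["u2r"] * 40

weights = train_som(samples, rows=4, cols=4, epochs=20, rng=rng)
neuron_labels = label_neurons(weights, samples, labels)
print(classify(weights, neuron_labels, [0.12, 0.08]))
```

Returning "unknown" for neurons that never won a training sample is one possible design choice; it makes explicit the case where an anomalous connection falls in a region of the map that no labeled attack occupied.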

Experimental evaluation uses the standard KDD‑99 benchmark, preserving the original training/testing split. AIS training employs only normal traffic (≈970 k records), while the SOM is trained on a balanced mixture of normal and attack records. The test set comprises roughly 100 k randomly selected connections. Performance is measured by Detection Rate (DR), False Positive Rate (FPR), and Cluster Classification Accuracy (CCA) for each attack category. The hybrid system achieves the following results:

DoS – DR = 98.7 %, CCA = 97.1 %
Probe – DR = 85.3 %, CCA = 82.5 %
R2L – DR = 71.2 %, CCA = 68.9 %
U2R – DR = 96.4 %, CCA = 95.6 %

The overall false positive rate is 0.78 %, lower than both the pure AIS baseline (≈1.4 % FPR) and the pure SOM baseline (≈2.1 % FPR). Compared with a selection of hybrid NIDS approaches reported in the literature, the proposed system delivers the highest detection rates for DoS and U2R attacks and the lowest overall false positive rate.
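For readers who want to reproduce this kind of evaluation, the sketch below shows one way the three metrics could be computed from per‑connection outcomes. The exact definitions are assumptions, since the summary does not spell them out: FPR is taken as the fraction of normal connections flagged as anomalous, DR as the fraction of a category's attacks that are flagged, and CCA as the fraction of a category's attacks that are both flagged and assigned to the correct class (which makes CCA ≤ DR, consistent with the figures above). The records are hypothetical toy values, not data from the paper.

```python
from collections import defaultdict


def evaluate(records):
    """Compute FPR plus per-category DR and CCA.

    records: iterable of (true_label, flagged, predicted_label) tuples, where
    true_label is 'normal' or an attack category, flagged is the anomaly
    detector's decision, and predicted_label is the class assigned to flagged
    records (ignored when flagged is False).
    """
    normal_total = normal_flagged = 0
    total = defaultdict(int)
    detected = defaultdict(int)
    correct = defaultdict(int)
    for true_label, flagged, predicted in records:
        if true_label == "normal":
            normal_total += 1
            normal_flagged += int(flagged)
            continue
        total[true_label] += 1
        if flagged:
            detected[true_label] += 1
            if predicted == true_label:
                correct[true_label] += 1
    fpr = normal_flagged / normal_total if normal_total else 0.0
    per_class = {c: {"DR": detected[c] / total[c], "CCA": correct[c] / total[c]}
                 for c in total}
    return fpr, per_class


# Hypothetical toy outcomes, for illustration only.
records = [
    ("normal", False, None), ("normal", False, None),
    ("normal", True, "dos"), ("normal", False, None),
    ("dos", True, "dos"), ("dos", True, "dos"),
    ("dos", True, "probe"), ("dos", False, None),
    ("u2r", True, "u2r"), ("u2r", False, None),
]
fpr, per_class = evaluate(records)
print(fpr, per_class)
```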

The authors discuss several important observations. First, the AIS provides a highly sensitive front‑line filter that captures a broad set of deviations from normal behavior. Second, the SOM translates this raw sensitivity into actionable intelligence by clustering anomalies into known attack families, thereby reducing analyst workload. Third, the system’s performance is sensitive to several hyper‑parameters: the number of detectors, the similarity threshold θ, and the SOM grid size. The paper reports a grid‑search procedure that identified 5 000 detectors and a 10 × 10 SOM as optimal for the KDD‑99 data.

Limitations are also acknowledged. The R2L and Probe categories still exhibit relatively modest detection and classification rates, reflecting the high overlap between these attacks and benign traffic in the feature space. Moreover, the SOM’s reliance on pre‑defined classes means that truly novel attack families would still be labeled only as “anomalous” unless the map is retrained. The authors propose future work on online SOM adaptation, multi‑layer SOMs, or deep‑learning‑based clustering to improve granularity and adaptability.

In conclusion, the paper demonstrates that a two‑stage hybrid architecture, combining an artificial immune system for robust anomaly detection with a self‑organizing map for high‑level attack categorization, can achieve low false‑positive rates while delivering superior detection and classification performance on a widely used benchmark. The approach offers a practical pathway for deploying NIDS that not only alerts on suspicious activity but also supplies security operators with immediate, interpretable context, thereby bridging the gap between detection and response.