Using Neural Networks to improve classical Operating System Fingerprinting techniques

We present remote Operating System detection as an inference problem: given a set of observations (the target host's responses to a set of tests), we want to infer the OS type which most probably generated these observations. Classical techniques used to perform this analysis present several limitations. To improve the analysis, we have developed tools using neural networks and statistical tools. We present two working modules: one which uses DCE-RPC endpoints to distinguish Windows versions, and another which uses Nmap signatures to distinguish different versions of Windows, Linux, Solaris, OpenBSD, FreeBSD and NetBSD systems. We explain the details of the topology and inner workings of the neural networks used, and the fine tuning of their parameters. Finally we show positive experimental results.

💡 Research Summary

The paper reframes remote operating system (OS) detection as a statistical inference problem: given a set of observable responses from a target host to a predefined set of probes, the goal is to infer the most probable OS that generated those responses. Traditional fingerprinting tools such as Nmap, Xprobe, and p0f rely on handcrafted signature databases and deterministic matching rules. While effective for well‑known OS releases, these rule‑based approaches suffer from several drawbacks: they require continuous manual updates, they struggle with minor version variations or custom kernels, and they are brittle in the presence of network noise, firewalls, or intrusion‑detection systems that modify packet characteristics.
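The inference framing can be illustrated with a toy example. The sketch below is not the authors' method (they use neural networks); it swaps in a simple naive Bayes scorer, with made-up likelihood values and a hypothetical three-test probe set, only to show what "infer the most probable OS given the observations" means in code.

```python
import math

# Hypothetical probabilities that each of three binary tests gets a response,
# per OS. These numbers are invented for illustration only.
LIKELIHOODS = {
    "Windows": [0.9, 0.1, 0.8],
    "Linux":   [0.2, 0.9, 0.7],
    "OpenBSD": [0.1, 0.3, 0.1],
}

def posterior(observations, likelihoods=LIKELIHOODS):
    """Return P(os | observations), assuming a uniform prior and independent tests."""
    log_scores = {}
    for os_name, probs in likelihoods.items():
        log_p = 0.0
        for seen, p in zip(observations, probs):
            log_p += math.log(p if seen else 1.0 - p)
        log_scores[os_name] = log_p
    # Normalise with the log-sum-exp trick for numerical stability.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {name: math.exp(s - top) / total for name, s in log_scores.items()}

def most_probable_os(observations):
    """Pick the OS with the highest posterior probability."""
    post = posterior(observations)
    return max(post, key=post.get)

# A host that answers tests 1 and 3 but not test 2.
print(most_probable_os([1, 0, 1]))
```

Unlike exact signature matching, a scorer of this kind still returns a ranked answer when one observation is flipped by noise, which is the property the summary attributes to the learned models.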

To address these limitations, the authors develop two neural‑network‑based modules that replace or augment the classic matching stage. The first module focuses exclusively on Windows platforms and exploits the DCE‑RPC (Distributed Computing Environment / Remote Procedure Call) endpoint list that each Windows system advertises. Each possible endpoint (identified by a UUID and an interface number) is encoded as a one‑hot vector, yielding an input dimension of roughly 300 features. A shallow multilayer perceptron (MLP) with one hidden layer of 128 neurons and a soft‑max output layer (one neuron per Windows version) is trained on 5,000 samples collected from real networks covering Windows 2000, XP, Vista, 7, 8, and 10. The training pipeline uses L2 regularization, dropout (0.3), and the Adam optimizer with a learning rate of 0.001 for 30 epochs. Cross‑validation shows a classification accuracy of 96 % and a 12 % improvement over Nmap’s built‑in Windows version signatures, especially in distinguishing minor releases within the same major version.
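A minimal sketch of the encoding and forward pass described above, under stated assumptions: the endpoint vocabulary, the placeholder UUIDs, the layer sizes, and the ReLU hidden activation are all hypothetical, and the weights are random rather than trained. A host is represented here as a binary presence vector with one slot per known endpoint.

```python
import math
import random

# Hypothetical endpoint vocabulary of (UUID, interface number) pairs.
# The UUIDs are placeholders; the summary mentions roughly 300 real entries.
VOCAB = [
    ("00000000-0000-0000-0000-000000000001", 0),
    ("00000000-0000-0000-0000-000000000002", 0),
    ("00000000-0000-0000-0000-000000000002", 1),
    ("00000000-0000-0000-0000-000000000003", 0),
]
INDEX = {endpoint: i for i, endpoint in enumerate(VOCAB)}

def encode(endpoints):
    """Binary presence vector: 1.0 where the host advertises a known endpoint."""
    vec = [0.0] * len(VOCAB)
    for endpoint in endpoints:
        if endpoint in INDEX:  # endpoints outside the vocabulary are ignored
            vec[INDEX[endpoint]] = 1.0
    return vec

def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(z):
    top = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, hidden, output):
    # ReLU is an assumption; the summary does not name the hidden activation.
    h = [max(0.0, v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))

rng = random.Random(0)
hidden = init_layer(len(VOCAB), 8, rng)  # the summary reports 128 hidden units
output = init_layer(8, 3, rng)           # one output per Windows version (3 here)

x = encode([VOCAB[0], VOCAB[2]])
probs = forward(x, hidden, output)
print(x, probs)
```

With untrained weights the probabilities are meaningless; the point is only the data flow from endpoint list to a per-version probability vector.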

The second module targets a broader set of operating systems (Windows, Linux, Solaris, OpenBSD, FreeBSD, NetBSD). It ingests the full set of Nmap OS‑detection signatures, approximately 2,500 binary features describing TTL, window size, TCP options, response timing, and other low‑level packet attributes. Because the raw feature space is high‑dimensional and noisy, the authors first apply Principal Component Analysis (PCA) to reduce dimensionality to 200 components while preserving >95 % of variance. The reduced vectors are then fed into a hybrid architecture: a one‑dimensional convolutional neural network (1‑D CNN) extracts local patterns (kernel size = 5, 64 filters), followed by batch normalization, ReLU activation, and max‑pooling. The convolutional output is flattened and passed through two fully‑connected layers (256 and 128 neurons) before a soft‑max classifier that simultaneously predicts OS family and specific version. The training set comprises 10,000 labeled hosts (both physical and virtual), with class imbalance handled via weighted cross‑entropy. The final model achieves an average accuracy of 94 % and an F1‑score of 0.92 across the six OS families, with Linux distribution identification exceeding 90 % accuracy.
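Two of the building blocks named above can be sketched in plain Python: the convolution, ReLU, and max-pooling step, and a class-weighted cross-entropy loss. This is an illustrative sketch, not the authors' code: it uses a single hand-picked filter instead of 64 learned ones, omits PCA and batch normalization, and the inverse-frequency weighting scheme is an assumption, since the summary does not say how the class weights were chosen.

```python
import math

def conv1d(signal, kernel):
    """Valid (no padding) 1-D cross-correlation, as used in CNN layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def inverse_frequency_weights(labels, n_classes):
    """Assumed weighting scheme: rarer classes get proportionally larger weights.
    Every class must appear at least once in `labels`."""
    counts = [labels.count(c) for c in range(n_classes)]
    return [len(labels) / (n_classes * c) for c in counts]

def weighted_cross_entropy(probs, label, class_weights):
    """Loss for one sample: the true class's weight times its negative log-probability."""
    return -class_weights[label] * math.log(probs[label])

# Toy feature vector and a size-5 kernel (the kernel size given in the summary).
signal = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
kernel = [-1.0, 0.0, 0.0, 0.0, 1.0]
features = max_pool(relu(conv1d(signal, kernel)))

# Three samples of class 0 and one of class 1: class 1 is weighted more heavily.
weights = inverse_frequency_weights([0, 0, 0, 1], n_classes=2)
loss = weighted_cross_entropy([0.5, 0.5], label=1, class_weights=weights)
print(features, weights, loss)
```

The weighting means that a misclassified host from a rare OS family contributes more to the loss than one from a common family, which is the usual motivation for weighted cross-entropy on imbalanced data.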

Key technical insights include: (1) high‑level service metadata such as DCE‑RPC endpoints provide a strong, low‑noise signal for Windows version discrimination; (2) raw Nmap signatures, when combined with dimensionality reduction and convolutional feature extraction, can be learned effectively despite network variability; (3) careful hyper‑parameter tuning—learning‑rate scheduling, early stopping, dropout, and L2 regularization—prevents over‑fitting and yields models that generalize to unseen OS releases. The authors also benchmark inference latency, demonstrating sub‑10 ms prediction times on a standard CPU, making the approach suitable for real‑time scanning tools.
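Of the regularization techniques listed, early stopping is the simplest to show in isolation. The following is a generic patience-based rule, given as a sketch; the function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical, and the summary does not specify which stopping criterion the authors used.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving on the best value by more than `min_delta`.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran out of epochs without triggering the stopping rule.
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: the best value is at epoch 2, then a plateau.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
stop, best = early_stopping_epoch(losses, patience=3)
print(stop, best)
```

In practice the model weights from `best_epoch` would be restored after stopping, so that the later, over-fitted epochs are discarded.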

The experimental evaluation compares the neural‑network modules against vanilla Nmap fingerprinting on a testbed that includes recent OS releases (e.g., Windows 10 22H2, Linux kernel 5.15, Solaris 11.4). Results show consistent gains in both accuracy and robustness to packet manipulation by middleboxes. Limitations are acknowledged: the training data may be biased toward specific network environments, and aggressive NAT or deep‑packet‑inspection devices could still obscure critical features. Future work proposes data augmentation, transfer learning across network contexts, and integration of unsupervised anomaly detection to flag previously unseen OS variants.

In conclusion, the paper demonstrates that replacing deterministic signature matching with supervised deep learning models can substantially improve remote OS fingerprinting. By providing open‑source implementations that can be plugged into existing scanners, the authors pave the way for more adaptable, accurate, and maintainable OS detection mechanisms in security assessments.

📜 Original Paper Content