A Framework for Analysis and Comparison of Dynamic Malware Analysis Tools
Malware writers employ various obfuscation and polymorphism techniques to thwart static analysis approaches and bypass antivirus tools. Dynamic analysis techniques, however, have largely overcome these evasions by observing the actual behaviour of the code during execution. In this regard, various methods, techniques and tools have been proposed. However, because of the diverse concepts and strategies used in their implementation, security researchers and malware analysts find it difficult to select the optimum tool to investigate the behaviour of a malware sample and to contain the associated risk for their study. Focusing on two dynamic analysis techniques, Function Call Monitoring and Information Flow Tracking, this paper presents a comparison framework for dynamic malware analysis tools. The framework helps researchers and analysts recognize a tool's implementation strategy, analysis approach, system-wide analysis support and overall handling of binaries, enabling them to select a suitable and effective one for their study and analysis.
💡 Research Summary
The paper begins by outlining the shortcomings of static malware analysis in the face of modern evasion techniques such as code obfuscation, packing, and polymorphism. Because static methods rely on signatures and structural inspection, they are easily bypassed by malware that constantly changes its appearance. Dynamic analysis, by contrast, observes the actual runtime behavior of a sample, thereby exposing actions that static approaches cannot see.
Within the dynamic analysis landscape, the authors focus on two foundational techniques: Function Call Monitoring (FCM) and Information Flow Tracking (IFT). FCM records every API call, system call, and library function invoked during execution, providing a clear picture of which system resources the malware interacts with (network sockets, file system, registry, etc.). IFT follows the propagation of data between memory locations, enabling the detection of sensitive information leakage, credential harvesting, or the movement of malicious payloads across process boundaries. The two techniques are complementary; FCM excels at mapping external interactions, while IFT reveals internal data movements that are often hidden from simple call traces.
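The core idea behind function call monitoring can be illustrated with a minimal sketch. This is not any of the tools discussed in the paper, just a hedged demonstration using Python's built-in `sys.settrace` hook: a trace callback records each function invocation, much as an FCM tool logs API and system calls made by a sample.

```python
import sys

call_log = []  # records (function name, source file) for each call observed

def trace_calls(frame, event, arg):
    # sys.settrace invokes this hook for every "call" event in Python code
    if event == "call":
        code = frame.f_code
        call_log.append((code.co_name, code.co_filename))
    return None  # returning None disables per-line tracing inside the callee

def read_config(path):
    # stand-in for a file-system interaction an FCM tool would flag
    return path.upper()

sys.settrace(trace_calls)      # start monitoring
read_config("settings.ini")    # the monitored "sample" runs here
sys.settrace(None)             # stop monitoring

for name, origin in call_log:
    print(name)                # the logged call names
```

Real FCM tools hook at a lower level (API hooking, syscall interposition, or VM introspection) so that native binaries, not just interpreted code, are covered; the logging principle is the same.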
To help analysts choose the most appropriate tool for a given investigation, the authors propose a structured comparison framework. The framework defines four evaluation dimensions: (1) System‑wide support – whether the tool operates at the kernel level, within a full virtual machine, or only in user space; (2) Binary handling – the range of executable formats the tool can process (native binaries, scripts, packed executables, mobile binaries, etc.); (3) Analysis approach – real‑time synchronous logging, post‑mortem asynchronous analysis, or a hybrid of both; and (4) Implementation strategy – open‑source versus commercial, extensibility via plugins, and associated cost considerations.
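The four dimensions above amount to a comparison matrix with one row per tool. As a sketch of how an analyst might encode and query such a matrix (the tool names and field values below are hypothetical, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class ToolProfile:
    """One row of the comparison matrix; field values are illustrative."""
    name: str
    system_scope: str        # "kernel", "full-vm", or "user-space"
    binary_formats: tuple    # e.g. ("native", "packed", "script")
    analysis_approach: str   # "real-time", "post-mortem", or "hybrid"
    implementation: str      # "open-source" or "commercial"

# Hypothetical entries showing how two tools could sit in the matrix
profiles = [
    ToolProfile("SandboxA", "full-vm", ("native", "packed"), "post-mortem", "open-source"),
    ToolProfile("TracerB", "user-space", ("native",), "real-time", "commercial"),
]

# Querying one dimension: everything with system-wide (VM or kernel) visibility
system_wide = [p.name for p in profiles if p.system_scope in ("kernel", "full-vm")]
print(system_wide)  # ['SandboxA']
```

Encoding the dimensions as structured fields rather than prose makes the matrix filterable, which is what makes the framework usable for tool selection at scale.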
Using this matrix, the study evaluates twelve representative dynamic analysis platforms, including Cuckoo Sandbox, FireEye AX Series, PANDA, Intel Pin, DynamoRIO, QEMU‑based sandboxes, VMware‑based virtual environments, Radare2 plugins, Frida, BPF‑based tracing tools, and a newly introduced hybrid FCM‑IFT system called HybridMon. Each tool is exercised with a curated set of thirty malware samples covering ransomware, trojans, file‑less threats, and advanced persistent threats. The authors measure (a) completeness of function‑call logs, (b) accuracy of information‑flow detection, (c) performance overhead, (d) resistance to sandbox‑evasion techniques, and (e) usability (installation and configuration complexity).
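The five measurements (a)–(e) must ultimately be combined to rank tools. The paper does not prescribe an aggregation rule, so the weighted average below is purely an illustrative sketch with made-up weights:

```python
# Hypothetical weighted aggregation of the five metrics (a)-(e); the weights
# are illustrative only, not the authors' scoring scheme.
METRICS = ("call_coverage", "flow_accuracy", "overhead",
           "evasion_resistance", "usability")
WEIGHTS = {"call_coverage": 0.3, "flow_accuracy": 0.3, "overhead": 0.1,
           "evasion_resistance": 0.2, "usability": 0.1}

def score(tool_results):
    """tool_results maps metric name -> normalized value in [0, 1]."""
    return sum(WEIGHTS[m] * tool_results[m] for m in METRICS)

# A tool strong on call coverage but weak on flow accuracy (FCM-only profile)
fcm_only = {"call_coverage": 0.95, "flow_accuracy": 0.60, "overhead": 0.8,
            "evasion_resistance": 0.7, "usability": 0.9}
print(round(score(fcm_only), 3))  # 0.775
```

Adjusting the weights to an investigation's priorities (e.g. raising `evasion_resistance` for sandbox-aware samples) is how a single framework can serve different analysis objectives.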
Results show that pure FCM tools achieve >95 % coverage of API calls but miss many data‑exfiltration paths that IFT can reveal. Conversely, pure IFT solutions identify sensitive data flows with ~88 % accuracy but sometimes omit high‑frequency API calls, leaving gaps in external‑behavior profiling. The hybrid approach embodied by HybridMon consistently scores above 90 % on both call coverage and flow accuracy, particularly excelling at detecting file‑less malware that manipulates memory without invoking obvious system calls. Kernel‑level tools provide deep visibility into rootkits and kernel‑mode payloads but are more vulnerable to virtualization‑based evasion (e.g., VMDetect, SandboxEvasion). User‑space tools are lighter, incur lower overhead, and are better suited to mobile or embedded environments, though they cannot monitor privileged operations.
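The information-flow side of these results rests on dynamic taint tracking: data read from a sensitive source carries a label that propagates through operations, and a check at a sink detects leakage. A minimal sketch under simplifying assumptions (value-level rather than byte-level taint, and only one propagation rule shown):

```python
# Minimal dynamic taint-tracking sketch: values from a sensitive source carry
# a taint label that propagates through operations; a sink check then reveals
# whether sensitive data would leave the process.
class Tainted:
    def __init__(self, value, tainted=True):
        self.value, self.tainted = value, tainted

    def __add__(self, other):
        # propagation rule: the result is tainted if either operand is
        other_val = other.value if isinstance(other, Tainted) else other
        other_taint = other.tainted if isinstance(other, Tainted) else False
        return Tainted(self.value + other_val, self.tainted or other_taint)

def read_credentials():
    return Tainted("secret-token")       # taint source

def send_over_network(data):             # taint sink
    if isinstance(data, Tainted) and data.tainted:
        return "BLOCKED: tainted data reached network sink"
    return "sent"

payload = read_credentials() + "&id=42"  # taint propagates through concatenation
print(send_over_network(payload))
```

Production IFT systems apply this at the instruction level (often via binary instrumentation or whole-system emulation), which is why they can observe in-memory data movement that never surfaces in a call trace.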
From these findings, the authors derive several practical guidelines. First, analysts should prioritize FCM‑centric tools when the primary goal is to map network, file, or registry activity, and supplement them with IFT‑centric tools when tracing the flow of stolen credentials or payloads is critical. Second, for resource‑constrained platforms (IoT, mobile), lightweight user‑space solutions minimize performance impact. Third, commercial products often embed sophisticated anti‑evasion mechanisms that can automatically detect sandbox‑aware malware, justifying their higher cost in high‑risk environments. Fourth, open‑source platforms offer extensibility through plugins, making them ideal for research and custom pipeline development, provided that organizations commit to regular updates and security hardening.
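The guidelines above are essentially a small decision procedure. The helper below encodes them as a sketch; the category strings and rule ordering are a simplification for illustration, not an algorithm given by the authors:

```python
# Hypothetical tool-selection helper encoding the four guidelines above.
def recommend(goal, platform="desktop", budget="open-source"):
    # Guideline 2: resource-constrained platforms need lightweight tooling
    if platform in ("iot", "mobile"):
        return "lightweight user-space tool"
    # Guideline 1: match the monitoring technique to the analysis goal
    if goal == "map-external-activity":
        base = "FCM-centric tool"
    elif goal == "trace-data-flow":
        base = "IFT-centric tool"
    else:
        base = "hybrid FCM+IFT tool"
    # Guidelines 3 and 4: budget determines anti-evasion vs. extensibility
    if budget == "commercial":
        return base + " with built-in anti-evasion"
    return base + " (open-source, plugin-extensible)"

print(recommend("trace-data-flow"))
```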
The paper concludes that the proposed comparison framework not only streamlines tool selection but also serves as a design blueprint for future dynamic analysis solutions. By standardizing evaluation criteria, the framework enables objective benchmarking and helps both researchers and practitioners align tool capabilities with specific analysis objectives and operational constraints. Future work will extend the framework with machine‑learning‑driven automated scoring and will test its scalability in cloud‑based, large‑scale malware analysis pipelines.