Andlantis: Large-scale Android Dynamic Analysis

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Analyzing Android applications for malicious behavior is an important area of research, and is made difficult, in part, by the increasingly large number of applications available for the platform. While techniques exist to perform static analysis on a large number of applications, dynamic analysis techniques are relatively limited in scale due to the computational resources required to emulate the full Android system to achieve accurate execution. We present Andlantis, a scalable dynamic analysis system capable of processing over 3000 Android applications per hour. During this processing, the system is able to collect valuable forensic data, which helps reverse-engineers and malware researchers identify and understand anomalous application behavior. We discuss the results of running 1261 malware samples through the system, and provide examples of malware analysis performed with the resulting data.


💡 Research Summary

The paper introduces Andlantis, a scalable dynamic analysis platform designed to handle the massive volume of Android applications that must be examined for malicious behavior. The authors begin by outlining the challenges inherent in Android malware research: while static analysis can be applied to millions of apps, it often fails against obfuscation, dynamic code loading, and encrypted payloads. Dynamic analysis, on the other hand, requires a full Android runtime environment, which traditionally demands substantial computational resources and therefore limits throughput.

Andlantis addresses this bottleneck through a cloud‑native architecture that combines pre‑built Android system images, containerized QEMU/KVM virtual machines, and an intelligent job scheduler. The system image module stores a frozen Android 4.4 (KitKat) image that can be rapidly cloned for each analysis instance. The execution engine launches dozens of lightweight VMs per physical host, each isolated in its own network namespace and file system, thereby preventing cross‑contamination between samples. A scheduler monitors VM health, enforces a per‑sample timeout (default two minutes), and automatically reclaims resources from hung or crashed instances.
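The scheduler behavior described above (a per-sample timeout plus reclaiming hung or crashed instances) can be sketched roughly as follows. This is an illustrative sketch only, not the Andlantis implementation: the names `analyze` and `run_batch` are hypothetical, each "VM" is stood in for by an ordinary child process, and the 120-second default simply mirrors the two-minute timeout mentioned in the summary.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumption: mirrors the two-minute per-sample default from the summary.
SAMPLE_TIMEOUT_S = 120


def analyze(cmd, timeout_s=SAMPLE_TIMEOUT_S):
    """Run one analysis job (hypothetical stand-in for a VM run).

    subprocess.run kills the child when the timeout expires, which plays
    the role of reclaiming a hung instance.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "crashed"


def run_batch(jobs, max_workers=4, timeout_s=SAMPLE_TIMEOUT_S):
    """Run named jobs concurrently and return {name: status}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(analyze, cmd, timeout_s)
            for name, cmd in jobs.items()
        }
        return {name: fut.result() for name, fut in futures.items()}
```

Here `jobs` maps a sample name to the command that would launch its analysis; a real deployment would start an emulator instance instead of a plain process and would also collect the instance's logs before tearing it down.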

Instrumentation is performed on three complementary layers. At the Android framework level, the platform injects hooks into Activity, Service, BroadcastReceiver, and other lifecycle callbacks, logging the order and parameters of each invocation. At the kernel level, a strace‑like tracer records system calls such as open, read, write, fork, and socket operations, providing a fine‑grained view of file system and process activity. Finally, a virtual router acts as a transparent proxy for all network traffic, capturing DNS queries, HTTP/HTTPS requests, and raw packet payloads in PCAP format. The collected data are normalized into JSON structures, enabling downstream forensic analysis, behavior clustering, and machine‑learning classification.
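As a rough illustration of the normalization step, the sketch below converts strace-style text lines into JSON records. The input line format (`pid syscall(args) = ret`), the record fields, and the function names `normalize` and `to_json_lines` are all assumptions made for this example; the summary does not specify the actual schema Andlantis uses.

```python
import json
import re

# Assumed input format, modeled on `strace -f` output:
#   1234 open("/data/local/tmp/x", O_RDONLY) = 3
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
)


def normalize(line, sample_id):
    """Turn one trace line into a dict, or None if it does not parse."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "sample": sample_id,
        "layer": "kernel",  # hypothetical tag for the syscall layer
        "pid": int(m["pid"]),
        "syscall": m["syscall"],
        "args": m["args"],
        "ret": int(m["ret"]),
    }


def to_json_lines(lines, sample_id):
    """Serialize all parseable lines as newline-delimited JSON."""
    records = (normalize(line, sample_id) for line in lines)
    return "\n".join(
        json.dumps(rec, sort_keys=True) for rec in records if rec is not None
    )
```

Records from the framework and network layers could share the same envelope (sample identifier plus a layer tag), which would make it straightforward to merge all three telemetry streams per sample.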

To evaluate scalability, the authors deployed Andlantis on a 48‑node cluster, each node equipped with 24 CPU cores, 128 GB RAM, and 2 TB SSD storage. The platform processed over 3,200 applications per hour, with an average execution time of 70 seconds per sample and a maximum of 120 seconds. At roughly 76,800 samples per day, this throughput is about two orders of magnitude higher than that of prior single‑instance dynamic analysis tools such as DroidBox and TaintDroid, which typically handle a few hundred samples per day.
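The throughput figures imply a minimum number of analysis instances that must be running at once. The back-of-the-envelope calculation below uses only the numbers quoted in this summary and assumes steady-state operation with no boot or teardown overhead, so it is a lower bound rather than a description of the actual deployment. The function names are illustrative.

```python
import math


def required_concurrency(samples_per_hour, seconds_per_sample):
    """Average number of instances in flight at a given throughput.

    Little's law: concurrency = arrival rate * time in system.
    """
    return samples_per_hour * seconds_per_sample / 3600


def samples_per_day(samples_per_hour):
    """Daily throughput if the hourly rate is sustained for 24 hours."""
    return samples_per_hour * 24


# Figures quoted in the summary: 3,200 samples/hour, 70 s average, 120 s max.
avg_instances = math.ceil(required_concurrency(3200, 70))     # 63
worst_instances = math.ceil(required_concurrency(3200, 120))  # 107
daily = samples_per_day(3200)                                 # 76800
```

Under these assumptions, sustaining 3,200 samples per hour needs about 63 concurrent instances at the average run time, and about 107 if every sample ran to the 120-second limit.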

For functional validation, the authors ran 1,261 real‑world malware specimens spanning trojans, spyware, adware, and ransomware families. The multi‑layered telemetry uncovered behaviors that static analysis missed, including encrypted command‑and‑control (C2) communications that were decrypted at runtime, dynamic loading of additional payloads from remote servers, and privilege‑escalation attempts via Android’s binder IPC mechanism. When the extracted behavioral features were fed into a supervised machine‑learning classifier, detection accuracy rose from 78 % (static‑only features) to 86 %, demonstrating the practical security benefit of large‑scale dynamic analysis.

The discussion acknowledges several limitations. Andlantis currently relies on a KitKat image, which may not faithfully reproduce the behavior of apps targeting newer Android releases that introduce stricter SELinux policies and runtime permission models. The instrumentation depends on root access; future Android hardening could impede hook injection. Moreover, the emulated environment lacks GPU and sensor virtualization, limiting analysis of graphics‑intensive or sensor‑driven applications. The authors propose extending the image repository to support multiple Android versions, integrating SELinux‑aware hooking techniques, and exploring hardware‑accelerated virtualization to broaden coverage.

In conclusion, Andlantis demonstrates that careful architectural design (pre‑built images, containerized emulators, automated job scheduling, and comprehensive telemetry) allows dynamic analysis to scale to thousands of Android applications per hour. This capability provides malware researchers and reverse engineers with rich runtime data at a scale previously achievable only for static analysis, thereby improving detection, attribution, and mitigation of Android‑based threats.

