Context-Sensitive Pointer Analysis for ArkTS
Current call graph generation methods for ArkTS, a new programming language for OpenHarmony, exhibit precision limitations when supporting advanced static analysis tasks such as data flow analysis and vulnerability pattern detection, while the workflow of traditional JavaScript (JS)/TypeScript (TS) analysis tools fails to interpret ArkUI component tree semantics. The core technical bottleneck originates from the closure mechanisms inherent in TypeScript’s dynamic language features and the interaction patterns involving OpenHarmony’s framework APIs. Existing static analysis tools for ArkTS struggle to track and precisely deduce object reference relationships, leading to topological fractures in call graph reachability and diminished analysis coverage. This technical limitation fundamentally constrains the implementation of advanced program analysis techniques. Therefore, in this paper, we propose a tool named ArkAnalyzer Pointer Analysis Kit (APAK), the first context-sensitive pointer analysis framework specifically designed for ArkTS. APAK addresses these challenges through a unique ArkTS heap object model and a highly extensible plugin architecture, ensuring future adaptability to the evolving OpenHarmony ecosystem. In the evaluation, we construct a dataset from 1,663 real-world applications in the OpenHarmony ecosystem to evaluate APAK, demonstrating APAK’s superior performance over CHA/RTA approaches in critical metrics including valid edge coverage (e.g., a 7.1% reduction compared to CHA and a 34.2% increase over RTA). The improvement in edge coverage systematically reduces false positive rates from 20% to 2%, enabling more complex program analysis tools to be built on our framework. Our proposed APAK has been merged into the official static analysis framework ArkAnalyzer for OpenHarmony.
💡 Research Summary
The paper addresses a critical gap in static analysis for ArkTS, the new programming language used in the OpenHarmony ecosystem. Existing call‑graph generation techniques, which rely on Class Hierarchy Analysis (CHA) or Rapid Type Analysis (RTA), fail to capture the dynamic features of ArkTS—particularly the declarative UI framework ArkUI, the use of decorators such as @Component and @Entry, and the global reactive store (AppStorage) that propagates objects and closures across components. These shortcomings lead to fractured call graphs, high false‑positive rates, and limited coverage for downstream analyses like data‑flow tracking or vulnerability detection.
To overcome these limitations, the authors introduce the ArkAnalyzer Pointer Analysis Kit (APAK), the first context‑sensitive Andersen‑style pointer analysis framework specifically designed for ArkTS. APAK’s architecture rests on three pillars:
- A dedicated heap object model – Every class instance, function pointer, container, and AppStorage key‑value pair is abstracted as a distinct heap object. This fine‑grained representation enables the analysis to maintain precise points‑to sets (e.g., pts(o₁.f) = {o₂}) even when objects are created inside lambda expressions and later stored in a global store.
- A plugin‑based extensibility mechanism – ArkTS and OpenHarmony evolve rapidly; new SDK APIs or language constructs can be supported by writing a small rule‑based plugin rather than modifying the core engine. Plugins encode domain‑specific constraints such as the propagation rule for AppStorage.setOrCreate or the semantics of the @StorageProp decorator.
- Context‑sensitive processing – APAK distinguishes between call‑site sensitivity (for dynamically dispatched calls) and function‑level sensitivity (for static calls). By generating a separate analysis context for each call site that involves a closure or a stored object, the framework avoids the over‑approximation typical of CHA/RTA while keeping the analysis scalable.
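To make the plugin idea more concrete, the following is a minimal sketch of how a rule for AppStorage might be expressed as extra points‑to constraints. Every name in it (CallSite, Constraint, Plugin, storageSlot, applyPlugins) is hypothetical and chosen for illustration; it is not APAK's actual plugin API. It assumes that a write via AppStorage.setOrCreate and a read via AppStorage.get can each be modeled as a subset constraint on one abstract heap slot per storage key.

```typescript
// Hypothetical types for illustration only; not APAK's real interfaces.
interface CallSite {
  callee: string;    // e.g. "AppStorage.setOrCreate"
  args: string[];    // storage key first, then the stored variable (if any)
  result?: string;   // variable that receives the return value, if any
}

// A subset constraint: pts(from) ⊆ pts(to).
interface Constraint {
  from: string;
  to: string;
}

interface Plugin {
  matches(call: CallSite): boolean;
  constraints(call: CallSite): Constraint[];
}

// One abstract heap slot per AppStorage key (an assumption of this sketch).
const storageSlot = (key: string): string => `AppStorage[${key}]`;

const appStoragePlugin: Plugin = {
  matches: (call) =>
    call.callee === "AppStorage.setOrCreate" || call.callee === "AppStorage.get",
  constraints: (call) => {
    const [key, value] = call.args;
    if (key === undefined) return [];
    if (call.callee === "AppStorage.setOrCreate" && value !== undefined) {
      // The stored value flows into the slot for this key.
      return [{ from: value, to: storageSlot(key) }];
    }
    if (call.callee === "AppStorage.get" && call.result !== undefined) {
      // Whatever the slot holds flows into the variable receiving the result.
      return [{ from: storageSlot(key), to: call.result }];
    }
    return [];
  },
};

// Collect the constraints contributed by all matching plugins.
function applyPlugins(plugins: Plugin[], calls: CallSite[]): Constraint[] {
  const out: Constraint[] = [];
  for (const call of calls) {
    for (const plugin of plugins) {
      if (plugin.matches(call)) out.push(...plugin.constraints(call));
    }
  }
  return out;
}
```

With a rule like this, a closure stored by one component under a key and read by another component under the same key ends up in the reader's points‑to set, which is the kind of cross‑component flow a purely type‑based call graph cannot see.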
The implementation consumes ArkIR, an intermediate representation produced by the existing ArkAnalyzer tool. ArkIR normalizes source code, desugars syntactic sugar, and isolates declaration, statement, and expression layers. The language model deliberately omits control‑flow constructs to focus on pointer‑relevant operations such as AssignmentStmt, PropertyAccessStmt, and CallStmt. New object creation (NewExpr) and lambda expressions (LambdaExpr) are treated as the primary heap allocation nodes.
During analysis, each statement generates a set of pointer constraints (e.g., x := new C() produces pts(x) ⊇ {o_C}). The core work‑list algorithm iteratively propagates these constraints across the program graph, while plugins inject additional rules on the fly. When the fix‑point is reached, the points‑to information is used to construct a call graph that reflects the true dynamic dispatches of ArkTS programs.
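The constraint generation and work‑list propagation described above can be sketched as a small, context‑insensitive Andersen‑style solver. This is a simplified illustration under stated assumptions, not the paper's implementation: the Stmt shapes are a hypothetical mini‑IR rather than ArkIR's real statement classes, contexts and plugins are omitted, and field nodes are named `<object>.<field>`.

```typescript
// Hypothetical mini-IR; these shapes are illustrative, not ArkIR's API.
type Stmt =
  | { kind: "new"; lhs: string; site: string }                   // lhs = new C()
  | { kind: "assign"; lhs: string; rhs: string }                 // lhs = rhs
  | { kind: "store"; base: string; field: string; rhs: string }  // base.field = rhs
  | { kind: "load"; lhs: string; base: string; field: string };  // lhs = base.field

type PointsTo = Map<string, Set<string>>;

function solve(stmts: Stmt[]): PointsTo {
  const pts: PointsTo = new Map();
  const succ = new Map<string, Set<string>>(); // edge src -> dst means pts(src) ⊆ pts(dst)
  const worklist: string[] = [];

  const get = (node: string): Set<string> => {
    let s = pts.get(node);
    if (!s) {
      s = new Set();
      pts.set(node, s);
    }
    return s;
  };

  // Add objects to pts(node); re-queue the node only if its set grew.
  const addObjs = (node: string, objs: Iterable<string>): void => {
    const s = get(node);
    let changed = false;
    for (const o of objs) {
      if (!s.has(o)) {
        s.add(o);
        changed = true;
      }
    }
    if (changed) worklist.push(node);
  };

  // Add a subset edge and propagate what src already points to.
  const addEdge = (src: string, dst: string): void => {
    let out = succ.get(src);
    if (!out) {
      out = new Set();
      succ.set(src, out);
    }
    if (!out.has(dst)) {
      out.add(dst);
      addObjs(dst, [...get(src)]);
    }
  };

  // Base constraints: allocations and direct copies.
  for (const s of stmts) {
    if (s.kind === "new") addObjs(s.lhs, [s.site]);
    else if (s.kind === "assign") addEdge(s.rhs, s.lhs);
  }

  // Iterate until no points-to set changes (the fixed point).
  while (worklist.length > 0) {
    const node = worklist.pop()!;
    for (const dst of succ.get(node) ?? []) addObjs(dst, [...get(node)]);
    // Field accesses add edges once the base's pointees are known.
    for (const s of stmts) {
      if (s.kind === "store" && s.base === node) {
        for (const o of [...get(node)]) addEdge(s.rhs, `${o}.${s.field}`);
      } else if (s.kind === "load" && s.base === node) {
        for (const o of [...get(node)]) addEdge(`${o}.${s.field}`, s.lhs);
      }
    }
  }
  return pts;
}
```

For the program `a = new A(); b = new B(); a.f = b; c = a; d = c.f`, with allocation sites labeled o1 and o2, this solver yields pts(o1.f) = {o2} and pts(d) = {o2}, matching the points‑to notation used earlier. A context‑sensitive variant would additionally qualify variable and object names with a context such as the call site.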
The authors evaluate APAK on a substantial corpus of 1,663 real‑world OpenHarmony applications collected from public repositories. They compare four metrics across CHA, RTA, and APAK:
- Valid edge coverage – APAK's figure is 7.1 % lower than CHA's and 34.2 % higher than RTA's.
- Call‑graph accuracy – Across 12 manually verified samples, APAK’s precision improves by 5.1 % to 49.8 % relative to the baselines.
- False‑positive rate – The rate drops dramatically from 20 % (CHA/RTA) to 2 % with APAK.
- Performance – Analysis time remains on par with CHA; the overhead introduced by plugins is negligible.
These results demonstrate that APAK can accurately track dynamically created objects, closure propagation, and indirect calls mediated by AppStorage—scenarios where CHA and RTA either miss edges or introduce excessive spurious edges. By reducing false positives and increasing coverage, APAK creates a reliable foundation for higher‑level analyses such as taint tracking, memory‑leak detection, or automated refactoring.
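The difference between the three ways of resolving a dynamically dispatched call can be illustrated with a small sketch. It uses the textbook definitions of CHA and RTA and a hypothetical class table; the names ClassInfo, ClassTable, chaTargets, rtaTargets, and ptaTargets are assumptions made for this example and do not come from the paper or from ArkAnalyzer.

```typescript
// Hypothetical class table: class name -> parent and declared methods.
interface ClassInfo {
  parent?: string;
  methods: string[];
}
type ClassTable = Record<string, ClassInfo>;

// Walk up the hierarchy to the class that actually provides `method`.
function dispatch(table: ClassTable, cls: string, method: string): string | undefined {
  let cur: string | undefined = cls;
  while (cur !== undefined) {
    const info: ClassInfo | undefined = table[cur];
    if (!info) return undefined;
    if (info.methods.indexOf(method) >= 0) return `${cur}.${method}`;
    cur = info.parent;
  }
  return undefined;
}

function isSubtype(table: ClassTable, cls: string, ancestor: string): boolean {
  let cur: string | undefined = cls;
  while (cur !== undefined) {
    if (cur === ancestor) return true;
    const info: ClassInfo | undefined = table[cur];
    cur = info ? info.parent : undefined;
  }
  return false;
}

// CHA: every subtype of the receiver's declared type is a possible target.
function chaTargets(table: ClassTable, declared: string, method: string): Set<string> {
  const out = new Set<string>();
  for (const cls of Object.keys(table)) {
    if (!isSubtype(table, cls, declared)) continue;
    const target = dispatch(table, cls, method);
    if (target) out.add(target);
  }
  return out;
}

// RTA: like CHA, but only classes instantiated somewhere in the program count.
function rtaTargets(
  table: ClassTable, declared: string, method: string, instantiated: Set<string>
): Set<string> {
  const out = new Set<string>();
  for (const cls of Object.keys(table)) {
    if (!instantiated.has(cls) || !isSubtype(table, cls, declared)) continue;
    const target = dispatch(table, cls, method);
    if (target) out.add(target);
  }
  return out;
}

// Pointer analysis: only the classes of objects in the receiver's points-to set.
function ptaTargets(table: ClassTable, receiverClasses: string[], method: string): Set<string> {
  const out = new Set<string>();
  for (const cls of receiverClasses) {
    const target = dispatch(table, cls, method);
    if (target) out.add(target);
  }
  return out;
}
```

For a receiver declared as a base class with two overriding subclasses, CHA reports three targets, RTA drops those whose class is never instantiated, and the points‑to based resolution keeps only the targets for objects that can actually reach the receiver.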
The paper lists four concrete contributions: (1) the design of the first context‑sensitive pointer analysis framework for ArkTS; (2) a customized heap abstraction and context strategy tailored to OpenHarmony’s architecture; (3) an extensive empirical evaluation on a large real‑world dataset; and (4) open‑sourcing both the dataset and the APAK implementation, the latter being merged into the official ArkAnalyzer repository.
The authors acknowledge some limitations: the current language model abstracts away control‑flow constructs (loops, conditionals), which may affect analyses that depend on precise path sensitivity; and while the plugin system is powerful, authoring new plugins still requires expertise in the underlying IR. Future work includes integrating control‑flow modeling, automating plugin generation, and extending APAK to support data‑flow, taint, and security analyses.
In summary, APAK represents a significant advancement for static analysis in the OpenHarmony ecosystem. By providing a precise, extensible, and context‑aware pointer analysis, it overcomes the inherent dynamism of ArkTS, dramatically improves call‑graph quality, and paves the way for sophisticated security and performance tools built atop the ArkAnalyzer platform.