Array Based Java Source Code Obfuscation Using Classes with Restructured Arrays
Array restructuring operations obscure arrays. Our work aims at obfuscating Java source code that contains arrays. Our main proposal is a set of classes with restructured array members and obscured member methods for setting and getting array elements and for obtaining array lengths. The method definitions of these classes are obscured through index transformation and constant hiding. Objects instantiated from these classes are then used when writing source code. A tool named JDATATRANS was developed for generating the classes; to the best of our knowledge, it is the first tool available for array restructuring on Java source code.
💡 Research Summary
The paper addresses the problem of source‑level obfuscation for Java programs, focusing specifically on arrays, which are often a critical data structure but remain relatively easy to understand after decompilation. The authors propose a systematic approach that restructures arrays through four fundamental transformations—splitting, merging, folding, and flattening—and encapsulates the transformed arrays within automatically generated Java classes. These classes provide standard methods (setArray, getArray, lengthArray) whose implementations are deliberately obscured by applying index transformation functions (e.g., i → 2*i+3) and a constant‑hiding routine named F(y, count). The constant‑hiding function uses a pre‑computed table of integer pairs whose sums are prime numbers, thereby allowing a target constant (such as the integer 2) to be reconstructed only after a series of arithmetic operations, which thwarts straightforward static analysis.
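To make the idea concrete, here is a minimal sketch of what a class wrapping a split array might look like. It is not the code JDATATRANS generates: the class name `SplitIntArray`, the even/odd split rule, the `PAIRS` table, and the helper `two()` are all assumptions made for illustration, and the constant hiding shown is a much simplified stand-in for the paper's F(y, count).

```java
// Illustrative sketch only; not the output of JDATATRANS.
class SplitIntArray {
    // Hypothetical table of integer pairs whose sums are prime (5, 3, 7).
    private static final int[][] PAIRS = { {2, 3}, {1, 2}, {3, 4} };

    private final int[] even;  // holds logical elements 0, 2, 4, ...
    private final int[] odd;   // holds logical elements 1, 3, 5, ...
    private final int length;  // logical length seen by the caller

    SplitIntArray(int length) {
        this.length = length;
        this.even = new int[(length + 1) / 2];
        this.odd = new int[length / 2];
    }

    // Rebuilds the constant 2 as the difference of two prime pair sums (5 - 3),
    // so the literal 2 never appears in the accessor bodies.
    private static int two() {
        return (PAIRS[0][0] + PAIRS[0][1]) - (PAIRS[1][0] + PAIRS[1][1]);
    }

    private void check(int i) {
        if (i < 0 || i >= length) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
    }

    void setArray(int i, int value) {
        check(i);
        if (i % two() == 0) {
            even[i / two()] = value;
        } else {
            odd[i / two()] = value;
        }
    }

    int getArray(int i) {
        check(i);
        return (i % two() == 0) ? even[i / two()] : odd[i / two()];
    }

    int lengthArray() {
        return length;
    }
}
```

A caller sees only `setArray`, `getArray`, and `lengthArray`; the fact that the data lives in two physical arrays, and that the split depends on a reconstructed constant, is visible only inside the class.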
To automate the creation of the obfuscating classes, the authors built a tool called JDATATRANS. The workflow begins with an input file (InFile.txt) that lists array declarations in ordinary Java syntax. JDATATRANS parses this file, determines the required transformation type and data type (int, double, String, char), and generates a set of “framework” class files for each transformation. Initially these class files contain only empty method bodies or trivial return statements, enabling the original source code to compile without the detailed implementations. When the user supplies a target source file (e.g., test.java), JDATATRANS parses it, replaces each plain array declaration with an instance of the appropriate generated class, and rewrites the class files to embed the full, obfuscated method bodies. The resulting program behaves identically to the original but its internal array handling is heavily obfuscated.
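The effect of the rewriting step can be sketched as a before/after pair. The example below assumes a hypothetical generated class, `ObscuredIntArray`, that stores logical element i at physical slot 2*i+3 (the index transformation mentioned in the summary); the class and method layout are illustrative and are not taken from the tool's actual output.

```java
// Hypothetical stand-in for a generated class; names are illustrative.
class ObscuredIntArray {
    private final int[] store;
    private final int length;

    ObscuredIntArray(int length) {
        this.length = length;
        // Large enough for the biggest physical index, 2 * (length - 1) + 3.
        this.store = new int[2 * length + 3];
    }

    // Index transformation i -> 2*i + 3, applied after a bounds check.
    private int map(int i) {
        if (i < 0 || i >= length) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return 2 * i + 3;
    }

    void setArray(int i, int v) { store[map(i)] = v; }
    int getArray(int i) { return store[map(i)]; }
    int lengthArray() { return length; }
}

class RewriteDemo {
    // Original form: a plain array declaration and direct indexing.
    static int sumPlain() {
        int[] a = new int[4];
        for (int i = 0; i < a.length; i++) a[i] = i * i;
        int sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        return sum;
    }

    // Rewritten form: the declaration becomes an object, and every
    // element access or length query goes through its methods.
    static int sumRewritten() {
        ObscuredIntArray a = new ObscuredIntArray(4);
        for (int i = 0; i < a.lengthArray(); i++) a.setArray(i, i * i);
        int sum = 0;
        for (int i = 0; i < a.lengthArray(); i++) sum += a.getArray(i);
        return sum;
    }

    public static void main(String[] args) {
        // Both variants compute 0 + 1 + 4 + 9.
        System.out.println(sumPlain() + " " + sumRewritten()); // prints "14 14"
    }
}
```

Both methods return the same value, which mirrors the requirement that the rewritten program behave identically to the original while its array handling is routed through the generated class.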
The authors evaluate the approach on four sample programs that perform simple array initialization and printing: search_orig (plain array), search_ArraySplit, search_ArrayFold, and search_ArrayFlatten. For each program they measure (1) the increase in lines of code (LOC), (2) the number of additional statements introduced by the generated classes, (3) the number of calls to the constant‑hiding function, (4) runtime on two hardware platforms (Pentium IV 3 GHz and Core Duo 1.66 GHz), and (5) the resulting binary size. They adopt the quality model from prior work, defining Potency (S_pot) as a weighted ratio of added LOC, Cost (S_cst) as a combination of storage overhead (file‑size increase) and runtime overhead (with weights y² = 0.15 and z² = 0.45), and overall Quality (S_quality) as 0.4 × S_pot − S_cst.
Results show that the array‑splitting transformation yields the highest quality score (≈62), because it adds a substantial amount of code (high potency) while incurring modest runtime and size penalties. The merging, folding, and flattening transformations produce lower quality scores (≈21–32) due to either higher storage overhead or lower potency. From these scores the authors conclude that the proposed technique can substantially increase the difficulty of reverse engineering at an acceptable performance cost.
The paper’s contributions are threefold: (1) a novel, structured method for array‑centric source‑level obfuscation, (2) the JDATATRANS tool that automates class generation and source rewriting, and (3) a quantitative assessment of obfuscation quality using established metrics. Limitations include the absence of control‑flow or variable‑name obfuscation, which means sophisticated deobfuscators that rely on data‑flow analysis may still succeed. Moreover, the constant‑hiding function relies on a static table; once the table is discovered, the hidden constants can be recovered.
Future work outlined by the authors involves strengthening the approach by (a) obscuring the indices used in setArray/getArray calls themselves, (b) layering multiple constant‑hiding functions to increase cryptographic strength, and (c) integrating the array‑based transformations with other obfuscation techniques such as control‑flow flattening, opaque predicates, and identifier renaming to build a comprehensive protection framework.
In summary, the study demonstrates that systematic restructuring of arrays, combined with automated class generation and index/constant transformation, provides an effective and practical means of source‑level Java obfuscation. The JDATATRANS prototype validates the concept and offers a foundation for further research into more robust, multi‑layered software protection mechanisms.