Open Source Software: How Can Design Metrics Facilitate Architecture Recovery?
Modern software development methodologies include the reuse of open source code. Reuse is facilitated by architectural knowledge of the software, which is not necessarily provided in the documentation of open source projects. The effort required to comprehend a system's source code and discover its architecture is therefore a major obstacle to reuse. In a recent study we examined the correlations between design metrics and a class's architecture layer. In this paper, we apply our methodology to more open source projects to verify its applicability.

Keywords: system understanding; program comprehension; object-oriented; reuse; architecture layer; design metrics
💡 Research Summary
The paper addresses the challenge of reusing open‑source software when architectural documentation is missing or insufficient. The authors propose a methodology that infers the architectural layering of a system solely from its source code, using object‑oriented design metrics: the six Chidamber‑Kemerer (CK) metrics plus two common extensions. Their approach consists of five steps. First, a static analysis tool (Classycle) extracts a directed acyclic graph (DAG) of class dependencies and, from it, derives "D‑Layers": subsets of classes in which higher layers depend only on lower ones. Second, consecutive D‑Layers are heuristically merged into four tentative architectural layers, mirroring the classic four‑tier architecture (User Interface, Controllers, Business Logic, Infrastructure). Third, each class is measured with eight metrics: Weighted Methods per Class (WMC), Depth of Inheritance Tree (DIT), Number of Children (NOC), Coupling Between Objects (CBO), Response For a Class (RFC), Lack of Cohesion in Methods (LCOM), Afferent Coupling (Ca), and Number of Public Methods (NPM). Fourth, the authors compute Spearman correlation coefficients between each metric and the layer assignments across four Java‑based open‑source projects (JabRef, Jbpm, RapidMiner, SweetHome3D); DIT, CBO, RFC, LCOM, and Ca consistently correlate with the layer variable, suggesting that inheritance depth, coupling, and cohesion are strong indicators of a class's architectural role. Fifth, they discretize the continuous metric values using the Minimal Description Length Principle (MDLP) and train a rule‑based classifier (JRip from the WEKA suite) to generate IF‑THEN rules that map metric bins to layer labels.
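The first and fourth steps can be sketched in a few lines. The following is illustrative only, not the authors' tooling: it derives D‑Layers from a toy class‑dependency DAG by longest‑path layering, then correlates a per‑class metric with the layer index using Spearman's rank correlation. The class names and CBO values are invented for the example.

```python
# Sketch only: D-Layer assignment on a toy dependency DAG, plus Spearman
# correlation between a design metric and the resulting layer index.
from functools import lru_cache

# deps[c] = set of classes that c depends on (edges point toward lower layers)
deps = {
    "LoginView": {"AuthController"},
    "AuthController": {"UserService"},
    "UserService": {"UserDao"},
    "UserDao": set(),
}

@lru_cache(maxsize=None)
def d_layer(cls):
    """Layer 1 = classes with no outgoing dependencies; else 1 + max over deps."""
    return 1 + max((d_layer(d) for d in deps[cls]), default=0)

layers = {c: d_layer(c) for c in deps}

# Hypothetical CBO values per class, as a metrics tool might report them.
cbo = {"LoginView": 9, "AuthController": 6, "UserService": 4, "UserDao": 2}

def spearman(xs, ys):
    """Spearman's rho for tie-free data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

classes = sorted(deps)
rho = spearman([layers[c] for c in classes], [cbo[c] for c in classes])
print(f"Spearman rho between layer and CBO: {rho:.2f}")
```

Note that a real dependency graph may contain cycles, which must first be condensed into single nodes to obtain a DAG before layering; the toy graph above is already acyclic.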
The experimental results show mixed success. For the UI (layer 4) and Infrastructure (layer 1) tiers, precision values range from 0.74 to 0.98 and recall from 0.60 to 0.88, indicating that the metric‑based rules can reliably identify extreme layers. However, for the intermediate Controller (layer 3) and Business Logic (layer 2) tiers, both precision and recall drop dramatically (often below 0.2), revealing that the chosen metrics and simple rule‑based model struggle to capture the nuanced behavior of middle‑tier classes. In some projects (e.g., Jbpm, SweetHome3D) the classifier fails to generate any useful rules for certain layers, underscoring the fragility of the approach when the underlying architecture does not align neatly with the assumed four‑tier decomposition.
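The per‑layer figures quoted above are one‑vs‑rest precision and recall. As a minimal sketch of how such scores are computed from actual and predicted layer labels (the label vectors here are invented for illustration, not the paper's data):

```python
# Sketch: one-vs-rest precision/recall per architectural layer (1..4).
# Both label vectors are made up; they stand in for D-Layer ground truth
# and a rule-based classifier's predictions.
actual    = [1, 1, 2, 2, 3, 3, 4, 4, 4, 1]
predicted = [1, 1, 3, 2, 2, 3, 4, 4, 1, 1]

def precision_recall(layer):
    tp = sum(a == p == layer for a, p in zip(actual, predicted))
    fp = sum(p == layer and a != layer for a, p in zip(actual, predicted))
    fn = sum(a == layer and p != layer for a, p in zip(actual, predicted))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec  = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

for layer in (1, 2, 3, 4):
    p, r = precision_recall(layer)
    print(f"layer {layer}: precision={p:.2f} recall={r:.2f}")
```

In this toy example, as in the paper's results, the extreme layers score well while the middle layers suffer, because misclassifications concentrate between adjacent intermediate tiers.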
The authors acknowledge several limitations. The process of merging D‑Layers into four groups is heuristic and may not reflect the true architectural intent of a system. Relying solely on class‑level design metrics ignores richer information such as method‑level call graphs, dynamic binding, or runtime behavior, which could be crucial for distinguishing middle layers. Moreover, the rule‑based learner captures only simple conjunctions of per‑metric thresholds, leaving non‑linear interactions between metrics unexplored.
Future work is outlined along four dimensions: (1) assigning learned weights to metrics to improve predictive power; (2) incorporating additional static and dynamic metrics (e.g., call‑graph centrality, cyclomatic complexity) to enrich the feature set; (3) applying more sophisticated machine‑learning techniques such as decision trees, random forests, or graph neural networks that can model complex dependencies; and (4) validating the methodology on a broader set of open‑source projects across different programming languages to assess generalizability. By addressing these points, the authors aim to evolve their metric‑driven approach into a robust, language‑agnostic tool for automatic architecture recovery, ultimately lowering the barrier for developers to understand and reuse open‑source codebases.