Identifying Clusters of Concepts in a Low Cohesive Class for Extract Class Refactoring Using Metrics Supplemented Agglomerative Clustering Technique

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Object-oriented software with low-cohesive classes can increase maintenance cost. Low-cohesive classes are likely to be introduced during initial design, through deviation from design principles, and during evolution, through software deterioration. A low-cohesive class performs operations that should be done by two or more classes. Such classes need to be identified and refactored using Extract Class refactoring to improve cohesion. Two aspects are involved: the first is to identify the low-cohesive classes, and the second is to identify the clusters of concepts within them for Extract Class refactoring. In this paper, we propose a metrics-supplemented agglomerative clustering technique that covers both aspects. The proposed metrics are validated using Weyuker’s properties. The approach is applied successfully to two examples and a case study.


💡 Research Summary

The paper addresses a persistent problem in object‑oriented software engineering: classes that exhibit low cohesion, often referred to as “low‑cohesive” or “god‑like” classes. Such classes tend to accumulate unrelated responsibilities over time, increasing maintenance effort, reducing understandability, and amplifying the ripple effect of changes. The authors identify two interrelated tasks that must be solved to refactor these classes effectively. First, a reliable detection mechanism is needed to pinpoint low‑cohesive classes among the many components of a system. Second, once a problematic class is identified, its internal responsibilities must be partitioned into coherent groups that can be extracted into separate classes—a process known as Extract Class refactoring.

To meet these goals, the authors propose a “metrics‑supplemented agglomerative clustering” technique. The novelty lies in augmenting traditional cohesion metrics (such as LCOM) with additional quantitative signals that capture richer relationships among methods. Specifically, three sub‑metrics are defined: (1) method‑call affinity (how often two methods invoke each other directly or indirectly), (2) field‑sharing degree (the proportion of instance variables accessed by both methods), and (3) parameter‑type overlap (the similarity of method signatures). Each sub‑metric is normalized to the range
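
As a rough illustration of the general idea, the sketch below clusters the methods of a class using only a field-sharing similarity, taken here as the Jaccard index over the instance variables each method accesses. It is not the paper's algorithm: the summary does not give the exact metric formulas or how the sub-metrics are combined, so the average-linkage rule, the stopping `threshold`, and the example class data (`method_fields`) are all assumptions made for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Field-sharing similarity: shared fields / fields used by either method."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_similarity(c1, c2, fields):
    """Average-linkage similarity between two clusters of method names."""
    pairs = [(m, n) for m in c1 for n in c2]
    return sum(jaccard(fields[m], fields[n]) for m, n in pairs) / len(pairs)

def agglomerate(fields, threshold):
    """Repeatedly merge the most similar pair of clusters until the best
    remaining pair falls below `threshold` (an assumed stopping rule)."""
    clusters = [frozenset([m]) for m in sorted(fields)]
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda p: cluster_similarity(p[0], p[1], fields),
        )
        if cluster_similarity(a, b, fields) < threshold:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return sorted(sorted(c) for c in clusters)

# Hypothetical low-cohesive class: method name -> instance variables it accesses.
method_fields = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "get_balance": {"balance"},
    "set_address": {"street", "city"},
    "format_address": {"street", "city"},
}

clusters = agglomerate(method_fields, threshold=0.5)
print(clusters)
# [['deposit', 'get_balance', 'withdraw'], ['format_address', 'set_address']]
```

Each resulting cluster is a candidate for its own extracted class. In this toy input the account-related and address-related methods share no fields, so they separate cleanly; real classes would typically need the additional signals described above to resolve less clear-cut cases.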

