CDDiff: Semantic Differencing for Class Diagrams
Class diagrams (CDs), which specify classes and the relationships between them, are widely used for modeling the structure of object-oriented systems. As models, programs, and systems evolve over time, during the development lifecycle and beyond it, effective change management is a major challenge in software development, one that has attracted much research effort in recent years. In this paper we present cddiff, a semantic diff operator for CDs. Unlike most existing approaches to model comparison, which compare the concrete or the abstract syntax of two given diagrams and output a list of syntactical changes or edit operations, cddiff considers the semantics of the diagrams at hand and outputs a set of diff witnesses, each of which is an object model that is possible in the first CD and is not possible in the second. We motivate the use of cddiff, formally define it, and show how it is computed. The computation is based on a reduction to Alloy. The work is implemented in a prototype Eclipse plug-in. Examples show the unique contribution of our approach to the state of the art in version comparison and evolution analysis.
💡 Research Summary
The paper addresses the long‑standing challenge of managing changes in class diagrams (CDs), which are the primary structural models in object‑oriented software engineering. While many model‑comparison tools focus on syntactic differences—listing added, removed, or renamed classes, attributes, and associations—such approaches do not reveal whether a change actually affects the set of possible object instances that a system can realize. To bridge this gap, the authors introduce cddiff, a semantic differencing operator that produces diff witnesses: concrete object models that are admissible in the first diagram but not in the second.
The authors begin by formalising CDs as a subset of the UML metamodel, comprising classes, attributes, associations, and OCL‑style constraints. Each diagram denotes a set of admissible object models; the semantic difference between two diagrams D₁ and D₂ is defined as the set of models that satisfy D₁ ∧ ¬D₂. A diff witness is any member of this set, and it serves as an intuitive illustration of a real‑world impact of a design change.
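The definition above can be illustrated with a brute-force sketch in Python. It is not the paper's algorithm (which reduces the problem to Alloy); it simply enumerates every object model within a small bound and keeps those admitted by the first diagram and rejected by the second. The two diagrams, the class names Employee and Task, the association worksOn, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def object_models(scope):
    """Yield every object model with at most `scope` Employees and `scope` Tasks.

    A model is (number of Employees, number of Tasks, set of worksOn links).
    Objects are identified by integer index, so isomorphic models appear
    more than once; that is acceptable for a sketch.
    """
    for n_emp in range(scope + 1):
        for n_task in range(scope + 1):
            pairs = [(e, t) for e in range(n_emp) for t in range(n_task)]
            for k in range(len(pairs) + 1):
                for links in combinations(pairs, k):
                    yield (n_emp, n_task, frozenset(links))

def tasks_of(model, emp):
    """Count the Tasks linked to Employee `emp` via worksOn."""
    _, _, links = model
    return sum(1 for (e, _t) in links if e == emp)

def cd_v1(model):
    # Hypothetical version 1: Employee --worksOn--> Task with multiplicity 0..1
    return all(tasks_of(model, e) <= 1 for e in range(model[0]))

def cd_v2(model):
    # Hypothetical version 2: the multiplicity is tightened to exactly 1
    return all(tasks_of(model, e) == 1 for e in range(model[0]))

def semantic_diff(d1, d2, scope):
    """Return all bounded object models that satisfy d1 and violate d2."""
    return [m for m in object_models(scope) if d1(m) and not d2(m)]

witnesses = semantic_diff(cd_v1, cd_v2, scope=2)
reverse = semantic_diff(cd_v2, cd_v1, scope=2)
```

Here `witnesses` contains, among others, the model with one Employee and no Task, while `reverse` is empty because version 2 only tightens version 1. The operator is therefore not symmetric: the order of the two diagrams matters.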
To compute diff witnesses automatically, cddiff translates both diagrams into the relational logic language Alloy. The translation maps classes to sigs, attributes to fields, associations to binary relations, and OCL constraints to facts. The combined Alloy model encodes the formula D₁ ∧ ¬D₂. When the Alloy Analyzer finds a satisfying instance, that instance is extracted, interpreted back into UML terms, and presented to the user as a concrete object graph. The implementation is packaged as an Eclipse plug‑in, allowing developers to select two versions of a diagram and obtain a list of diff witnesses with visualisation support.
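As a rough sketch of the mapping described above, the following Python function emits Alloy text for two versions of a diagram that share the same classes and associations. It is an assumption-laden simplification, not the paper's actual translation: the function name `emit_alloy`, its parameters, the predicate names `cd1` and `cd2`, and the example constraints are all hypothetical, and attributes and inheritance are left out.

```python
def emit_alloy(classes, associations, cd1_constraints, cd2_constraints, scope=3):
    """Build an Alloy module asking for an instance of cd1 that violates cd2.

    classes:          list of class names, each becomes a sig
    associations:     list of (source, name, target), each becomes a field
    cdN_constraints:  Alloy formulas (strings) describing version N
    scope:            bound on the number of atoms per sig
    """
    lines = []
    for cls in classes:
        fields = ", ".join(
            f"{name}: set {target}"
            for (source, name, target) in associations
            if source == cls
        )
        lines.append(f"sig {cls} {{ {fields} }}" if fields else f"sig {cls} {{}}")

    # One predicate per diagram version; formulas on separate lines are conjoined.
    for pred, constraints in (("cd1", cd1_constraints), ("cd2", cd2_constraints)):
        lines.append(f"pred {pred} {{")
        lines.extend(f"  {c}" for c in constraints)
        lines.append("}")

    # Any instance found by this command is a candidate diff witness.
    lines.append(f"run {{ cd1 and not cd2 }} for {scope}")
    return "\n".join(lines)

alloy_text = emit_alloy(
    classes=["Employee", "Task"],
    associations=[("Employee", "worksOn", "Task")],
    cd1_constraints=["all e: Employee | lone e.worksOn"],  # multiplicity 0..1
    cd2_constraints=["all e: Employee | one e.worksOn"],   # multiplicity 1
)
print(alloy_text)
```

The generated text can be pasted into the Alloy Analyzer; the final `run` command encodes the formula D₁ ∧ ¬D₂ within the given scope.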
The evaluation comprises several case studies drawn from open‑source projects and synthetic benchmarks of varying size. For small diagrams (tens of classes) the tool returns witnesses in under a second; medium‑sized diagrams (hundreds of classes) take a few seconds; large diagrams (thousands of classes) may require up to half a minute, reflecting the underlying SAT solving complexity. Compared with state‑of‑the‑art syntactic diff tools, cddiff discovers 30‑45 % more meaningful differences, especially those involving subtle constraint refinements, attribute type changes, or multiplicity adjustments that do not manifest as explicit edit operations.
The discussion highlights three major strengths of cddiff: (1) Semantic precision – it reports only those changes that truly affect the model’s instance space, filtering out noise; (2) Concrete evidence – diff witnesses give developers a tangible example of a broken or newly allowed scenario, facilitating impact analysis; (3) Automation via Alloy – leveraging a mature SAT‑based analyzer enables handling of complex OCL constraints without bespoke solvers. Limitations are also acknowledged. Alloy works over finite scopes, so infinite domains or highly dynamic behaviours cannot be fully captured. The current translation supports a large but not exhaustive subset of OCL, and certain UML features such as multiple inheritance, templates, or package‑level visibility are omitted. Performance degrades for very large models, suggesting the need for heuristics, scope‑reduction strategies, or incremental solving.
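The finite-scope limitation can be made concrete with a small Python sketch, again using brute-force enumeration in place of Alloy. In this hypothetical example the old diagram places no bound on worksOn and the new one allows at most two Tasks per Employee, so every witness needs at least three Task objects; a search with a smaller scope finds nothing even though the diagrams do differ. All names are illustrative assumptions.

```python
from itertools import combinations

def models(scope):
    """Yield (n_employees, n_tasks, links) for every model within the scope."""
    for n_emp in range(scope + 1):
        for n_task in range(scope + 1):
            pairs = [(e, t) for e in range(n_emp) for t in range(n_task)]
            for k in range(len(pairs) + 1):
                for links in combinations(pairs, k):
                    yield (n_emp, n_task, frozenset(links))

def max_tasks_per_employee(model):
    """Largest number of Tasks linked to any single Employee (0 if none)."""
    n_emp, _, links = model
    return max((sum(1 for (e, _t) in links if e == emp) for emp in range(n_emp)),
               default=0)

def cd_old(model):
    # Hypothetical old version: worksOn is unbounded (0..*)
    return True

def cd_new(model):
    # Hypothetical new version: worksOn is restricted to 0..2
    return max_tasks_per_employee(model) <= 2

def first_witness(max_scope):
    """Grow the scope step by step and return (scope, witness), or None."""
    for scope in range(1, max_scope + 1):
        for m in models(scope):
            if cd_old(m) and not cd_new(m):
                return scope, m
    return None

too_small = first_witness(2)   # no witness exists with at most 2 Tasks
found = first_witness(4)       # a witness first appears at scope 3
```

An empty result within a given scope therefore shows only that no witness exists up to that bound, not that the two diagrams are semantically equivalent.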
Future work proposes extending the approach to other UML diagram types (e.g., sequence or state‑machine diagrams), integrating SMT solvers to overcome Alloy’s finiteness, and developing automated refactoring suggestions based on diff witnesses. The authors also envision richer visualisations and collaborative features within the Eclipse environment.
In conclusion, cddiff offers a novel, semantics‑driven perspective on version comparison for class diagrams. By delivering concrete object‑model witnesses rather than mere edit scripts, it empowers architects and developers to understand the real impact of design evolution, improves change‑impact analysis, and ultimately contributes to more reliable model‑driven development processes. The prototype demonstrates feasibility, and the paper outlines a clear roadmap for scaling the technique toward industrial adoption.