UML Artifacts Reuse: State of the Art


The benefits that can be derived from reusing software include accelerated development, reduced cost, reduced risk, and effective use of specialists. Reuse of software artifacts during the initial stages of software development increases reuse benefits, because it allows subsequent reuse of later-stage artifacts derived from earlier artifacts. UML is the de facto modeling language used by software developers during the initial stages of software development, such as requirements engineering, architectural design, and detailed design. This survey analyzes previous works on UML artifacts reuse. The analysis considers four perspectives: retrieval method, artifact support, tool support, and experiments performed. As an outcome of the analysis, some suggestions for future work on UML artifacts reuse are also provided.


💡 Research Summary

The paper provides a comprehensive survey of research on reusing UML artifacts, emphasizing that reuse at the earliest stages of software development yields the greatest benefits because later‑stage artifacts can be derived from the reused models. The authors collected a representative set of publications from major journals and conferences, then classified each work according to four dimensions: retrieval method, supported UML artifact, tool support, and experimental evaluation.

In the retrieval‑method dimension, five families of techniques are identified. Ontology‑based approaches construct domain ontologies to capture semantic relationships among UML elements, enabling high‑precision matching but incurring substantial modeling effort and limited scalability. Graph‑matching methods treat class, sequence, or state diagrams as graphs and apply sub‑graph isomorphism or similarity measures; these methods are powerful for structural similarity, but sub‑graph isomorphism is NP‑complete in general, so they scale poorly to large repositories without aggressive pruning. Case‑Based Reasoning (CBR) reuses previously solved design cases, offering a pragmatic, experience‑driven workflow; however, the quality and maintenance of the case base become critical success factors. Text‑similarity techniques rely on natural‑language descriptions, comments, or requirements associated with UML models, providing fast retrieval at the expense of semantic depth. Finally, hybrid approaches combine two or more of the above to balance precision, recall, and performance, and the authors regard hybrids as the most promising direction.
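To make the hybrid idea concrete, here is a minimal sketch in Python. It is not taken from any of the surveyed works. It assumes each model is a dictionary with a free-text `description` and a list of `edges` given as `(source, relation, target)` tuples, and it uses Jaccard overlap of edge sets as a deliberately crude stand-in for real graph matching. The names `hybrid_score`, `retrieve`, and `weight_text` are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def hybrid_score(query, candidate, weight_text=0.5):
    """Weighted mix of text similarity and (simplified) structural similarity.

    Assumption: structural similarity is approximated by the overlap of
    labeled edges, which ignores renaming and partial matches that a real
    graph-matching method would handle.
    """
    text_sim = jaccard(tokens(query["description"]),
                       tokens(candidate["description"]))
    struct_sim = jaccard(set(query["edges"]), set(candidate["edges"]))
    return weight_text * text_sim + (1 - weight_text) * struct_sim


def retrieve(query, repository, top_k=3):
    """Return the top_k repository models ranked by hybrid score."""
    ranked = sorted(repository,
                    key=lambda model: hybrid_score(query, model),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    query = {"description": "online shop order",
             "edges": [("Order", "contains", "Item")]}
    repository = [
        {"name": "library", "description": "library book loan",
         "edges": [("Loan", "refers_to", "Book")]},
        {"name": "shop", "description": "online shop order handling",
         "edges": [("Order", "contains", "Item"),
                   ("Customer", "places", "Order")]},
    ]
    for model in retrieve(query, repository, top_k=2):
        print(model["name"], round(hybrid_score(query, model), 3))
```

Raising `weight_text` favors the fast but shallow text signal, and lowering it favors structure. How to set that weight is the precision/performance trade-off the paragraph above describes.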

Regarding supported artifacts, the survey shows a clear bias toward structural diagrams—class diagrams and component diagrams—because their metamodels are well‑defined and many tools already expose their metadata. Behavioral diagrams such as sequence diagrams, state machines, and activity diagrams receive comparatively less attention due to the added difficulty of matching dynamic execution flows. A minority of studies attempt multi‑view reuse, integrating structural and behavioral information to preserve design consistency across views.

Tool support is examined across three categories: standalone applications, IDE plug‑ins (e.g., Eclipse, IBM Rational), and web‑based services. Most prototypes are academic, often released as open‑source, but they lack robust integration with commercial modeling environments and standardized APIs. Consequently, practitioners face hurdles when trying to adopt these tools in real projects.

Experimental evaluation is another focal point. The authors note that many papers report quantitative metrics such as precision, recall, and F‑score, yet they typically use small, self‑constructed datasets (10–30 models) and rarely make the data publicly available. Some works complement quantitative results with qualitative user studies, measuring perceived usefulness and cognitive load, but the sample sizes are limited. The lack of benchmark repositories and large‑scale industrial case studies hampers the ability to compare approaches objectively and to assess their real‑world impact.
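For reference, the metrics mentioned above can be computed per query as in the following sketch. It uses the standard set-based definitions of precision, recall, and the balanced F-score (F1). The function name and the model identifiers are illustrative assumptions, not taken from the surveyed papers.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall, and F1 for one query.

    retrieved: identifiers of the models returned by the retrieval method.
    relevant:  identifiers of the models judged relevant (ground truth).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical query: 4 models returned, 3 relevant overall, 2 in common.
    p, r, f1 = precision_recall_f1(["m1", "m2", "m3", "m4"],
                                   ["m1", "m2", "m5"])
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With datasets of only a few dozen models, a single relevance judgment shifts these values noticeably, which is one reason shared benchmark repositories matter for comparing approaches.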

The paper concludes by summarizing the state of the art: current research strives to improve semantic accuracy while keeping retrieval costs manageable, but significant gaps remain. The authors propose several avenues for future work: (1) development of standardized metadata and ontology frameworks to reduce the effort of semantic modeling; (2) design of scalable, cloud‑based repositories that can host millions of UML artifacts and support real‑time hybrid search; (3) creation of multi‑view retrieval algorithms that jointly consider structural, behavioral, and deployment diagrams; and (4) extensive industrial validation through large‑scale experiments and continuous feedback loops. Addressing these challenges would enable UML artifact reuse to become a mainstream practice, delivering measurable reductions in development time, cost, and risk across the software lifecycle.

