Toward Refactoring of DMARF and GIPSY Case Studies -- a Team 9 SOEN6471-S14 Project Report


Software architecture consists of a series of decisions that yield a structural solution meeting all technical and operational requirements. This paper concerns code refactoring, the process of changing the internal structure of code without altering its external behavior. It focuses on experimental studies of two open source systems, DMARF and GIPSY. We reviewed various research papers on both systems and analyzed their architectures. Refactoring improves the understandability, maintainability, and extensibility of code. Code smells were identified with tools such as JDeodorant, Logiscope, and CodePro. DMARF and GIPSY were reverse engineered with ObjectAid UML to understand the systems. To aid understanding further, use cases, a domain model, and design class diagrams were built.


💡 Research Summary

This project report presents a case study of refactoring two open‑source systems, DMARF (Distributed Modular Audio Recognition Framework) and GIPSY (General Intensional Programming System), with the goal of improving architectural quality, maintainability, and extensibility. The authors begin by outlining the motivation: both systems have evolved over several years and accumulated technical debt in the form of tangled code, high coupling, low cohesion, and duplicated logic, which hampers future development and increases the risk of defects.

To diagnose the problems, the team employed three static analysis tools. JDeodorant was used to automatically locate classic code smells such as God Classes, Long Methods, Feature Envy, and Switch‑Case abuse. Logiscope supplied quantitative metrics—cyclomatic complexity, lines of code, duplicate code percentage—while CodePro offered rule‑based quality checks and trend analysis. The analysis revealed that DMARF’s central AudioProcessor class suffered from excessive responsibilities, and its preprocessing pipeline contained several long methods and duplicated filter implementations. GIPSY exhibited a proliferation of Context objects (data clumps), repeated compiler step code, and deep conditional branches that obscured the intent of the execution engine.
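
To make the cyclomatic complexity metric concrete, here is a minimal Java sketch of a crude estimate: count the decision points in a method's source text and add one. This is not how Logiscope computes the metric. The class name `ComplexityEstimate` and the keyword list are assumptions chosen for illustration, and a plain text scan like this will also count matches inside comments, string literals, and generic wildcards.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DMARF, GIPSY, or Logiscope.
final class ComplexityEstimate {
    // Each branching keyword or short-circuit/ternary operator counts as one decision point.
    private static final Pattern DECISION =
            Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    // Returns decision points + 1, the usual shortcut for McCabe's metric on a single method.
    static int estimate(String methodSource) {
        Matcher m = DECISION.matcher(methodSource);
        int decisions = 0;
        while (m.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        String src = "int sign(int x) { if (x > 0 && x < 100) return 1; "
                + "for (;;) { break; } return 0; }";
        // Decision points here: if, &&, for
        System.out.println(estimate(src)); // prints 4
    }
}
```

A method whose estimate is far above that of its neighbours is a reasonable first candidate for Extract Method, although a real tool would parse the code rather than scan its text.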

Having identified the hotspots, the authors performed reverse engineering using the ObjectAid UML plugin for Eclipse. This step generated class, package, and sequence diagrams directly from the source, exposing hidden inheritance relationships, interface implementations, and runtime interactions. The visual models served as a basis for constructing updated domain models and use‑case diagrams, which clarified functional boundaries and stakeholder requirements.
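
ObjectAid derives its diagrams from source code inside Eclipse. As a rough analogy only, the inheritance and interface relationships that a class diagram exposes can also be listed with Java reflection. The sketch below assumes a hypothetical helper named `HierarchyDump`; it inspects compiled classes on the classpath and says nothing about how ObjectAid itself works.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists a class, its superclasses, and the interfaces each one declares.
final class HierarchyDump {
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        // Walk up the superclass chain until Object's superclass (null) is reached.
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            StringBuilder sb = new StringBuilder(c.getName());
            Class<?>[] interfaces = c.getInterfaces(); // directly declared interfaces only
            for (int i = 0; i < interfaces.length; i++) {
                sb.append(i == 0 ? " implements " : ", ").append(interfaces[i].getName());
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Any class on the classpath can be inspected; ArrayList is used as a neutral example.
        for (String line : describe(java.util.ArrayList.class)) {
            System.out.println(line);
        }
    }
}
```

Running the same method over the classes of a package gives a textual counterpart to a class diagram, which can help cross-check a generated model.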

The refactoring phase was guided by a prioritized list of smells and impact assessments. Seven well‑known refactoring patterns were applied: Extract Class, Extract Method, Move Method/Field, Introduce Interface, Replace Conditional with Polymorphism, Simplify Conditional Expression, and Introduce Strategy. In DMARF, the monolithic AudioProcessor was split into three cohesive classes—AudioPreprocessor, FeatureExtractor, and ResultAggregator—while filter logic was abstracted behind an AudioFilter interface and concrete implementations. Long methods were broken into smaller, purpose‑specific functions, and global variables were encapsulated. In GIPSY, the scattered Context objects were unified under a ContextFactory, and duplicated compiler step code was abstracted into an AbstractCompilerStep base class. Complex switch statements governing compilation strategies were replaced with a Strategy pattern, enabling runtime selection and reducing conditional depth.
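
The filter abstraction described above can be sketched in Java as follows. Only the names `AudioFilter` and `AudioPreprocessor` come from the summary; the method signatures, the two concrete filters (`GainFilter`, `ClipFilter`), and the use of `double[]` for samples are assumptions made for illustration and are not taken from the DMARF code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Assumed shape of the interface; the summary names it but does not give its methods.
interface AudioFilter {
    double[] apply(double[] samples);
}

// Hypothetical filter: scales every sample by a constant gain.
final class GainFilter implements AudioFilter {
    private final double gain;

    GainFilter(double gain) {
        this.gain = gain;
    }

    @Override
    public double[] apply(double[] samples) {
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] * gain;
        }
        return out;
    }
}

// Hypothetical filter: clamps every sample to the range [-limit, limit].
final class ClipFilter implements AudioFilter {
    private final double limit;

    ClipFilter(double limit) {
        this.limit = limit;
    }

    @Override
    public double[] apply(double[] samples) {
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = Math.max(-limit, Math.min(limit, samples[i]));
        }
        return out;
    }
}

// Stand-in for the extracted preprocessing class: it runs the filters in order
// and contains no switch over filter types.
final class AudioPreprocessor {
    private final List<AudioFilter> filters;

    AudioPreprocessor(List<AudioFilter> filters) {
        this.filters = new ArrayList<>(filters); // defensive copy
    }

    double[] preprocess(double[] samples) {
        double[] current = samples;
        for (AudioFilter filter : filters) {
            current = filter.apply(current);
        }
        return current;
    }
}

final class FilterDemo {
    public static void main(String[] args) {
        AudioPreprocessor preprocessor = new AudioPreprocessor(
                Arrays.asList(new GainFilter(2.0), new ClipFilter(1.0)));
        double[] out = preprocessor.preprocess(new double[] {0.25, -0.75, 0.5});
        System.out.println(Arrays.toString(out)); // prints [0.5, -1.0, 1.0]
    }
}
```

Adding a new filter in this arrangement means adding one class rather than editing a conditional, which is the effect Replace Conditional with Polymorphism and the Strategy pattern aim for.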

Post‑refactoring validation involved re‑running the existing JUnit regression suite, which showed an increase in test coverage from 85 % to 92 %. Re‑analysis with Logiscope indicated an average 15 % reduction in cyclomatic complexity and a more than 30 % drop in duplicate code. Performance measurements demonstrated a modest 5–7 % improvement in audio preprocessing throughput, while the added abstraction layers introduced negligible overhead in other components.

The authors discuss the broader implications of their work. The refactored architecture exhibits higher cohesion and lower coupling, making future feature additions more predictable. The updated UML artifacts serve as living documentation, shortening onboarding time for new contributors and facilitating clearer code reviews. However, the study also acknowledges limitations: automated tools missed higher‑level design issues such as mixing business logic with infrastructure concerns, and some refactorings required domain expertise that could not be fully automated.

In conclusion, the report validates a systematic approach that combines static analysis, reverse engineering, and pattern‑based refactoring to rejuvenate legacy open‑source projects. The authors propose future work that includes building an automated refactoring pipeline integrated into continuous integration (CI) environments, adding quality gates to enforce architectural standards, and extending the methodology to other domains such as machine‑learning pipelines or microservice architectures. This work demonstrates that disciplined refactoring can turn codebases laden with technical debt into maintainable, extensible platforms capable of supporting long‑term evolution.