Toward Refactoring of DMARF and GIPSY Case Studies -- A Team XI SOEN6471-S14 Project Report
This report focuses on improving the internal structure of the Distributed Modular Audio Recognition Framework (DMARF) and the General Intensional Programming System (GIPSY) case studies without affecting their original behavior. First, the general principles and workings of DMARF and GIPSY are studied, with emphasis on the architecture of the systems, by examining their frameworks and running them in the Eclipse environment. Improving the structural quality of the code requires a deeper understanding of the architecture of the case studies, which is achieved by analyzing the design patterns present in the code. The improvement itself consists of identifying and removing code smells in the case studies. Code smells are identified by analyzing the source code with Logiscope and JDeodorant. Several refactoring techniques are suggested, and the best-suited ones are implemented to improve the code. Finally, test cases are implemented to check whether the behavior of the code has changed.
💡 Research Summary
The paper presents a systematic refactoring effort on two substantial open‑source case studies: the Distributed Modular Audio Recognition Framework (DMARF) and the General Intensional Programming System (GIPSY). The authors begin by setting up both projects in Eclipse, running sample workloads, and documenting the high‑level architecture of each system. DMARF is described as a modular audio‑processing pipeline consisting of preprocessing, feature extraction, classification, and post‑processing stages, each implemented as plug‑ins behind well‑defined interfaces. GIPSY is portrayed as a multi‑language intensional programming platform whose core components include the GIPSY Execution Engine (GEE), a compiler, and a distributed node runtime.
To assess code quality, the authors employ two static analysis tools: Logiscope for metric‑based evaluation (cyclomatic complexity, coupling, cohesion) and JDeodorant for automated smell detection. The analysis uncovers classic “code smells” in both codebases: God Classes (AudioProcessor in DMARF and GEE in GIPSY), Long Methods, Feature Envy, and duplicated I/O logic across several modules. These defects violate the Single‑Responsibility Principle, hinder readability, and increase maintenance risk.
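As a reminder of what the complexity metric measures, the following assumes the tool reports the standard McCabe definition of cyclomatic complexity; the summary does not say which variant Logiscope was configured to use. Here E, N, and P are the numbers of edges, nodes, and connected components of a method's control-flow graph, and d is the number of binary decision points.

```latex
M = E - N + 2P
% For a single method (P = 1) whose decisions are all binary:
M = d + 1
% Worked example: a method with one loop and two if statements
% has d = 3 decision points, so M = 3 + 1 = 4.
```

Under that assumption, a method with no branching has M = 1, and each additional `if`, loop, or `case` adds one independent path that a test suite would need to cover.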
The next phase involves a design‑pattern audit. Existing patterns are catalogued, and missing patterns are introduced where appropriate. In DMARF, the authors replace ad‑hoc plug‑in creation with Factory Method and Abstract Factory patterns, thereby encapsulating object creation and improving extensibility. They also apply the Strategy pattern to each processing stage, allowing algorithms to be swapped without altering surrounding code. For GIPSY, the execution strategy is abstracted via the Strategy pattern, and node‑state notifications are handled through the Observer pattern, which reduces coupling between the runtime engine and distributed nodes.
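The combination of a factory and the Strategy pattern described above can be sketched roughly as follows. This is a minimal illustration, not DMARF's actual code: every name here (PreprocessingStrategy, NormalizationStrategy, IdentityStrategy, PreprocessingFactory, Pipeline) is hypothetical, and a simple static factory stands in for the Factory Method and Abstract Factory patterns the summary mentions.

```java
import java.util.Arrays;

// Hypothetical Strategy interface for one pipeline stage (not DMARF's real API).
interface PreprocessingStrategy {
    double[] apply(double[] samples);
}

// Scales samples so that the largest absolute value becomes 1.0.
final class NormalizationStrategy implements PreprocessingStrategy {
    @Override
    public double[] apply(double[] samples) {
        double max = 0.0;
        for (double s : samples) {
            max = Math.max(max, Math.abs(s));
        }
        if (max == 0.0) {
            return samples.clone(); // empty or all-zero input: nothing to scale
        }
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] / max;
        }
        return out;
    }
}

// Returns a copy of the input unchanged.
final class IdentityStrategy implements PreprocessingStrategy {
    @Override
    public double[] apply(double[] samples) {
        return samples.clone();
    }
}

// Simple static factory (an assumption for brevity): callers request a
// strategy by name and never reference the concrete classes directly.
final class PreprocessingFactory {
    private PreprocessingFactory() {}

    static PreprocessingStrategy create(String name) {
        switch (name) {
            case "normalize":
                return new NormalizationStrategy();
            case "identity":
                return new IdentityStrategy();
            default:
                throw new IllegalArgumentException("Unknown strategy: " + name);
        }
    }
}

// Context class: depends only on the interface, so algorithms can be swapped.
final class Pipeline {
    private final PreprocessingStrategy preprocessing;

    Pipeline(PreprocessingStrategy preprocessing) {
        this.preprocessing = preprocessing;
    }

    double[] run(double[] samples) {
        return preprocessing.apply(samples);
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline(PreprocessingFactory.create("normalize"));
        double[] out = pipeline.run(new double[] {2.0, -4.0, 1.0});
        System.out.println(Arrays.toString(out)); // prints [0.5, -1.0, 0.25]
    }
}
```

Swapping "normalize" for "identity" changes the algorithm without touching Pipeline, which is the extensibility benefit the summary attributes to these patterns.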
Refactoring is carried out using the recommendations generated by JDeodorant: Extract Class, Extract Method, Move Method, and Replace Conditional with Polymorphism. Concretely, the monolithic AudioProcessor class is split into four dedicated classes (Preprocessor, FeatureExtractor, Classifier, Postprocessor), each implementing a specific interface. Duplicate file‑handling code is centralized in an IOUtility class. In GIPSY, the complex scheduling logic inside GEE is extracted into a new Scheduler class, and a hierarchy of ExecutionStrategy implementations (e.g., ParallelStrategy, SequentialStrategy) replaces large conditional blocks. Methods exhibiting Feature Envy are relocated to the classes that own the accessed data.
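Replace Conditional with Polymorphism, as applied to the execution strategies above, might look like the sketch below. The type names ExecutionStrategy, SequentialStrategy, and ParallelStrategy come from the summary; their bodies, the LegacyEngine and Engine classes, and the idea of squaring integer "demands" are placeholders invented for illustration and do not reflect GIPSY's real GEE code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Before: the execution mode is chosen by a conditional block.
// (Hypothetical stand-in for the kind of code the summary describes.)
final class LegacyEngine {
    private LegacyEngine() {}

    static List<Integer> evaluate(String mode, List<Integer> demands) {
        if ("sequential".equals(mode)) {
            return demands.stream().map(d -> d * d).collect(Collectors.toList());
        } else if ("parallel".equals(mode)) {
            return demands.parallelStream().map(d -> d * d).collect(Collectors.toList());
        }
        throw new IllegalArgumentException("Unknown mode: " + mode);
    }
}

// After: each branch becomes its own implementation of one interface.
interface ExecutionStrategy {
    List<Integer> evaluate(List<Integer> demands);
}

final class SequentialStrategy implements ExecutionStrategy {
    @Override
    public List<Integer> evaluate(List<Integer> demands) {
        return demands.stream().map(d -> d * d).collect(Collectors.toList());
    }
}

final class ParallelStrategy implements ExecutionStrategy {
    @Override
    public List<Integer> evaluate(List<Integer> demands) {
        // Collecting an ordered parallel stream to a list keeps encounter order.
        return demands.parallelStream().map(d -> d * d).collect(Collectors.toList());
    }
}

// The engine no longer branches on a mode string; it delegates to its strategy.
final class Engine {
    private final ExecutionStrategy strategy;

    Engine(ExecutionStrategy strategy) {
        this.strategy = strategy;
    }

    List<Integer> evaluate(List<Integer> demands) {
        return strategy.evaluate(demands);
    }
}

class EngineDemo {
    public static void main(String[] args) {
        List<Integer> demands = Arrays.asList(1, 2, 3);
        // Behavior-preservation check: old and new code paths must agree.
        boolean same = new Engine(new ParallelStrategy()).evaluate(demands)
                .equals(LegacyEngine.evaluate("parallel", demands));
        System.out.println(same); // prints true
    }
}
```

Comparing the refactored path against the legacy one on the same inputs, as EngineDemo does, is a small-scale version of the behavior-preservation check that a regression test suite provides.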
After refactoring, the authors re‑run the original JUnit test suite and add new tests targeting the modified components. All functional tests pass, confirming behavior preservation. Test coverage improves from 85 % to 93 %, while average method complexity drops from 12 to 7 and average class coupling decreases from 1.8 to 1.2. The total line count shrinks by roughly 12 %, and duplicated code falls below 4 %.
In conclusion, the study demonstrates that a disciplined combination of static analysis, design‑pattern engineering, and targeted refactoring can substantially enhance the maintainability, readability, and extensibility of large, legacy codebases without altering their external behavior. The authors suggest future work such as performance benchmarking of the refactored systems and the development of an automated refactoring pipeline to sustain code‑quality improvements over time.