Toward Software Measurement and Quality Analysis of MARF and GIPSY Case Studies, a Team 8 SOEN6611-S14 Project Report
Measurement is an important means of improving the performance of a product. This paper presents a comparative measurement study of two frameworks, MARF and GIPSY. It first establishes a thorough understanding of these frameworks and their applications: MARF comprises a number of algorithms for voice and speech processing, while GIPSY provides a multilingual platform for developing compiler components. Both frameworks are meant to provide an open-source environment that programmers and users can adopt and embed in their applications. Several metrics are commonly used for object-oriented design quality assessment; we apply them to evaluate the code quality of both MARF and GIPSY. We describe how tools can be used to analyze these metric values and to rate the quality of the code on a scale from excellent to poor. Based on these values, we interpret the results in terms of the quality attributes achieved. Quantitative and qualitative analyses of the metric values are carried out to elaborate the impact of design parameters on code quality.
💡 Research Summary
The paper presents a comparative empirical study that applies object‑oriented design metrics to two open‑source frameworks—MARF (Modular Audio Recognition Framework) and GIPSY (General Intensional Programming System)—in order to assess and contrast their code quality. After introducing the motivation for software measurement as a means to improve product performance, the authors provide concise overviews of the two systems. MARF is a collection of voice‑ and speech‑processing algorithms organized into relatively flat class hierarchies, whereas GIPSY is a multilingual compiler and runtime platform characterized by deep inheritance trees, multiple design patterns, and a higher degree of architectural complexity.
Methodologically, the study adopts a suite of well‑known metrics: the CK suite (Weighted Methods per Class, Depth of Inheritance Tree, Number of Children, Coupling Between Object Classes, etc.) and the MOOD suite (Encapsulation, Inheritance, Polymorphism, Coupling). Measurement tools such as the Eclipse Metrics Plugin and SonarQube are employed, and results from both tools are cross‑validated to ensure reliability.
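To make the CK metrics named above concrete, the following is a minimal sketch of how WMC, DIT, and CBO can be computed over a simplified class model. This is an illustration of the metric definitions only, not the actual implementation used by the Eclipse Metrics Plugin or SonarQube, and the example class names (`Preprocessor`, `FFTFilter`, `Spectrogram`) are hypothetical stand-ins for MARF-style classes.

```python
# Illustrative computation of three CK metrics over a toy class model.
# Class names below are hypothetical; per-method values stand in for
# cyclomatic complexity as measured by a real tool.

class ClassInfo:
    def __init__(self, name, method_complexities, parent=None, dependencies=()):
        self.name = name
        self.method_complexities = method_complexities  # one complexity value per method
        self.parent = parent                            # superclass, or None for the root
        self.dependencies = set(dependencies)           # names of classes this class uses

def wmc(cls):
    """Weighted Methods per Class: sum of per-method complexities."""
    return sum(cls.method_complexities)

def dit(cls):
    """Depth of Inheritance Tree: number of edges up to the root class."""
    depth = 0
    while cls.parent is not None:
        cls = cls.parent
        depth += 1
    return depth

def cbo(cls):
    """Coupling Between Object classes: distinct other classes this class uses."""
    return len(cls.dependencies - {cls.name})

root = ClassInfo("Object", [])
base = ClassInfo("Preprocessor", [1, 2], parent=root)
fft = ClassInfo("FFTFilter", [3, 4, 5], parent=base,
                dependencies={"Preprocessor", "Spectrogram"})

print(wmc(fft))  # 12 (= 3 + 4 + 5)
print(dit(fft))  # 2  (FFTFilter -> Preprocessor -> Object)
print(cbo(fft))  # 2
```

Real tools derive the class model from bytecode or source parsing rather than hand-built objects, but the aggregation step is essentially the one shown here.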
The quantitative results reveal distinct profiles. MARF exhibits low average WMC (≈12‑15), shallow DIT (1‑2), low LCOM, and modest CBO (3‑5), indicating high cohesion, low coupling, and a design that favors maintainability and readability. GIPSY, by contrast, shows higher WMC (≈25‑30), deeper DIT (4‑6), a larger number of children (NOC ≈8‑12), and elevated CBO (10‑15), suggesting a more complex, tightly coupled architecture with dispersed responsibilities.
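The contrast between the two profiles can be seen by checking each system against simple thresholds. The sketch below uses midpoints of the ranges reported above; the cutoff values themselves are illustrative assumptions, not thresholds taken from the paper.

```python
# Hypothetical threshold check over the reported metric profiles.
# Cutoffs are illustrative only; the metric values are midpoints of
# the ranges reported in the study.
THRESHOLDS = {"WMC": 20, "DIT": 3, "CBO": 8}

marf  = {"WMC": 13, "DIT": 2, "CBO": 4}
gipsy = {"WMC": 27, "DIT": 5, "CBO": 12}

def flag(profile):
    """Return the metrics whose values exceed the (assumed) thresholds."""
    return [m for m, v in profile.items() if v > THRESHOLDS[m]]

print(flag(marf))   # []
print(flag(gipsy))  # ['WMC', 'DIT', 'CBO']
```

Under these assumed cutoffs MARF raises no flags, while every measured dimension of GIPSY exceeds its threshold, mirroring the "lean versus complex" contrast described above.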
To interpret these numbers, the authors map each metric to four quality attributes: reusability, maintainability, extensibility, and reliability. MARF’s low complexity and high cohesion translate into strong maintainability but limited reusability due to its shallow inheritance. GIPSY’s rich inheritance and polymorphism provide a higher potential for reuse and extensibility, yet the high coupling and complexity raise concerns about defect proneness and maintenance effort.
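The metric-to-attribute reasoning in this paragraph can be caricatured as a small rule table. The rules and numeric cutoffs below are a deliberate simplification introduced for illustration; the paper's actual mapping is qualitative, and the NOC values are midpoints of the reported ranges.

```python
# Simplified rule-based mapping from a metric profile to the four
# quality attributes discussed in the study. Cutoffs are assumptions.
def assess(p):
    return {
        "maintainability": "strong" if p["WMC"] < 20 and p["CBO"] < 8 else "at risk",
        "reusability":     "higher potential" if p["DIT"] >= 3 else "limited (shallow inheritance)",
        "extensibility":   "higher potential" if p["NOC"] >= 5 else "modest",
        "reliability":     "concern (defect proneness)" if p["CBO"] > 8 else "favorable",
    }

marf  = {"WMC": 13, "DIT": 2, "CBO": 4,  "NOC": 1}
gipsy = {"WMC": 27, "DIT": 5, "CBO": 12, "NOC": 10}

print(assess(marf))
print(assess(gipsy))
```

Running this reproduces the paragraph's conclusions in miniature: MARF scores strong on maintainability but limited on reusability, while GIPSY scores higher on reuse and extensibility potential at the cost of reliability and maintenance concerns.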
The paper also discusses the limitations of purely automated metric collection. While tools efficiently generate raw numbers, they cannot capture design intent, domain constraints, or the rationale behind certain dependencies. Consequently, expert code reviews are recommended to contextualize metric outliers—e.g., a high CBO in GIPSY may be intentional for performance‑critical pathways and not necessarily a refactoring target.
Based on the analysis, the authors propose targeted improvement strategies. For MARF, the recommendation is to preserve the existing architecture while enhancing documentation and test coverage to further solidify maintainability. For GIPSY, the authors suggest flattening overly deep inheritance hierarchies, clarifying module interfaces, and applying systematic refactoring to reduce coupling without compromising the intended functionality.
Finally, the study underscores that software metrics, when combined with qualitative assessment, serve as actionable indicators rather than mere statistics. By demonstrating how metric‑driven insights can guide concrete quality‑enhancement actions in two disparate domains, the paper contributes a practical blueprint for continuous quality monitoring and improvement in future software engineering projects.