The more Product Complexity, the more Actual Effort? An Empirical Investigation into Software Developments


[Background:] Software effort prediction methods and models typically assume a positive correlation between software product complexity and development effort. However, we have witnessed conflicting observations, i.e., a negative correlation between product complexity and actual effort, in our experience with the COCOMO81 dataset. [Aim:] Given our doubt about whether the observed phenomenon is a coincidence, this study investigates whether an increase in product complexity can produce this counter-intuitive trend in software development projects. [Method:] A modified association rule mining approach is applied to the transformed COCOMO81 dataset. To reduce analysis noise, the approach fixes a constant antecedent (Complexity increases while Effort decreases) and mines potential consequents with pruning. [Results:] The experiment mined four, five, and seven association rules from the general, embedded, and organic project data, respectively. The consequents of the mined rules point to two main aspects, human capability and product scale, as warranting particular attention in this study. [Conclusions:] The negative correlation between complexity and effort is not a coincidence under particular conditions. In a software project, interactions between product complexity and other factors, such as Programmer Capability and Analyst Capability, can play a “friction” role that weakens the practical influence of product complexity on actual development effort.


💡 Research Summary

The paper challenges a foundational assumption of the COCOMO family of effort‑estimation models: that higher product complexity inevitably leads to higher development effort. While the original COCOMO81 dataset was built on the premise of a positive correlation, the authors observed a strikingly opposite pattern in several records—projects where the “Product Complexity” rating increased while the recorded “Actual Effort” (person‑months) decreased. Rather than dismissing these cases as outliers or data errors, the study asks whether this counter‑intuitive relationship can be systematically explained.

To answer this, the authors adopt a modified association‑rule‑mining approach. They first discretize the continuous variables of the COCOMO81 dataset into binary states (e.g., “Complexity ↑” vs. “Complexity ↓”, “Effort ↑” vs. “Effort ↓”). The key methodological twist is the use of a constant antecedent: every rule must begin with the statement “Complexity increases while Effort decreases.” By fixing this antecedent, the mining algorithm searches only for consequents—combinations of other project attributes—that co‑occur with the antecedent more often than would be expected by chance. This pruning dramatically reduces the search space and filters out noisy associations that would otherwise dominate a conventional Apriori‑style mining process.
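The constant-antecedent idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the attribute labels (PCAP_vhigh, SIZE_small, ...), the toy records, and the support/confidence thresholds are all invented for the example.

```python
from itertools import combinations

# Hypothetical binarized project records: each project is a set of attribute
# states. "CPLX_up"/"EFFORT_down" mirror the paper's fixed antecedent; the
# remaining labels are illustrative, not taken from COCOMO81.
projects = [
    {"CPLX_up", "EFFORT_down", "PCAP_vhigh", "SIZE_small"},
    {"CPLX_up", "EFFORT_down", "PCAP_vhigh", "ACAP_vhigh"},
    {"CPLX_up", "EFFORT_up", "PCAP_low", "SIZE_large"},
    {"CPLX_up", "EFFORT_down", "SIZE_small", "ACAP_vhigh"},
    {"CPLX_down", "EFFORT_down", "PCAP_nominal"},
]

ANTECEDENT = {"CPLX_up", "EFFORT_down"}  # held constant for every candidate rule

def mine_consequents(records, antecedent, min_support=0.4,
                     min_confidence=0.6, max_len=2):
    """Mine rules 'antecedent => consequent' with the antecedent fixed.

    Support is counted over all records; confidence is
    support(antecedent + consequent) / support(antecedent). Candidate
    consequents are pruned to items outside the antecedent, which shrinks
    the search space compared with open-ended Apriori mining.
    """
    n = len(records)
    matching = [r for r in records if antecedent <= r]
    if not matching:
        return []
    sup_a = len(matching) / n
    items = set().union(*records) - antecedent
    rules = []
    for k in range(1, max_len + 1):
        for cons in combinations(sorted(items), k):
            both = sum(1 for r in matching if set(cons) <= r)
            support = both / n
            if support < min_support:
                continue  # prune infrequent consequents early
            confidence = support / sup_a
            if confidence >= min_confidence:
                rules.append((cons, support, confidence))
    return rules

for cons, sup, conf in mine_consequents(projects, ANTECEDENT):
    print(f"CPLX_up & EFFORT_down => {cons}  support={sup:.2f} conf={conf:.2f}")
```

With these toy records, the miner surfaces exactly the kind of consequents the paper reports: high capability ratings and small product size co-occurring with the fixed antecedent.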

The dataset is split according to the three COCOMO81 development modes—Organic, Semi‑Detached (referred to as “General” in the paper), and Embedded—and the rule‑mining procedure is applied to each subset independently. The results are as follows:

  • General (Semi‑Detached) projects: 4 significant rules.
  • Embedded projects: 5 significant rules.
  • Organic projects: 7 significant rules.

Across all modes, the consequents fall into two dominant thematic clusters:

  1. Human Capability Factors – primarily Programmer Capability (PCAP) and Analyst Capability (ACAP), in COCOMO81's cost-driver terminology. A typical rule reads, for example, “Complexity ↑ & Effort ↓ ⇒ Programmer Capability is Very High.” This suggests that when a project is rated as more complex, managers tend to allocate more skilled personnel, and the higher skill level can offset the expected increase in effort.

  2. Product Scale Factors – mainly Product Size (KLOC) and related scale metrics. Rules such as “Complexity ↑ & Effort ↓ ⇒ Product Size is Small” indicate that a rise in complexity does not always accompany a proportional increase in system size; a complex but narrowly scoped component can be tackled with less overall work.

These findings lead the authors to propose a “friction” view of effort estimation. Product complexity does not act in isolation; its impact on effort is moderated, and sometimes even reversed, by interactions with human capability and product scale. In the paper's metaphor, high-skill programmers and analysts supply the friction that absorbs the extra effort complexity would otherwise impose. Conversely, when a complex product is also large, that friction is insufficient, and effort rises as traditional models predict.
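The friction reading is consistent with how intermediate COCOMO81 combines cost drivers multiplicatively. The sketch below uses the commonly published mode coefficients and effort-multiplier values from Boehm's 1981 tables; these numbers are background assumptions for illustration, not figures re-estimated by the paper.

```python
# Commonly published intermediate-COCOMO81 mode coefficients (a, b).
MODE = {"organic": (3.2, 1.05), "semi-detached": (3.0, 1.12), "embedded": (2.8, 1.20)}

# Effort multipliers for two ratings of three standard cost drivers.
CPLX = {"nominal": 1.00, "very_high": 1.30}  # product complexity
PCAP = {"nominal": 1.00, "very_high": 0.70}  # programmer capability
ACAP = {"nominal": 1.00, "very_high": 0.71}  # analyst capability

def effort_pm(kloc, mode, cplx, pcap, acap):
    """Intermediate COCOMO: PM = a * KLOC^b * product of effort multipliers."""
    a, b = MODE[mode]
    return a * kloc ** b * CPLX[cplx] * PCAP[pcap] * ACAP[acap]

# A more complex product staffed with very capable analysts and programmers
# can need less effort than a nominal product with nominal staff, because
# 1.30 * 0.70 * 0.71 < 1: the capability multipliers outweigh the CPLX penalty.
baseline = effort_pm(30, "embedded", "nominal", "nominal", "nominal")
complex_skilled = effort_pm(30, "embedded", "very_high", "very_high", "very_high")
print(baseline, complex_skilled)  # complex_skilled < baseline
```

In other words, the multiplicative form already allows complexity's influence to be cancelled by capability; the paper's contribution is showing that this cancellation actually occurs in the recorded data.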

The paper acknowledges several limitations. Association‑rule mining captures co‑occurrence but not causality, and it may miss subtle non‑linear relationships that more sophisticated statistical or machine learning models could uncover. Moreover, COCOMO81 reflects software development practices of the early 1980s; modern agile, DevOps, and cloud‑native environments differ markedly in how complexity is managed and measured.

Future research directions proposed include:

  • Applying the same antecedent‑driven mining technique to contemporary datasets (e.g., ISBSG, NASA‑CPS) to test the generality of the friction effect.
  • Integrating causal inference methods (e.g., structural equation modeling, Bayesian networks) to move beyond correlation.
  • Extending effort‑estimation models to include interaction terms or dynamic weighting for capability and scale factors, thereby allowing the model to “turn down” the complexity coefficient when high capability or small scale is present.
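The last direction can be illustrated with a toy log-linear regression in which a complexity × capability interaction term lets high capability “turn down” the effective complexity coefficient. All data below is synthetic and the coefficient values are invented; the paper fits no such model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
log_size = rng.uniform(1.0, 4.0, n)   # log product size (log KLOC)
cplx = rng.uniform(-1.0, 1.0, n)      # centered complexity rating
cap = rng.uniform(-1.0, 1.0, n)       # centered capability rating

# Synthetic ground truth: capability dampens the complexity effect via a
# negative interaction term, so the effective complexity slope is
# 0.5 - 0.6 * cap (smaller when capability is high).
log_effort = (1.0 + 1.1 * log_size + 0.5 * cplx - 0.4 * cap
              - 0.6 * cplx * cap + rng.normal(0, 0.05, n))

# Design matrix: intercept, size, complexity, capability, interaction.
X = np.column_stack([np.ones(n), log_size, cplx, cap, cplx * cap])
beta, *_ = np.linalg.lstsq(X, log_effort, rcond=None)
print(beta)  # last entry (interaction) recovered near -0.6
```

An estimation model extended this way keeps the familiar size and complexity terms but lets the fitted interaction coefficient encode the friction effect directly.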

In conclusion, the study demonstrates that the negative correlation between product complexity and actual effort observed in COCOMO81 is not a statistical fluke but a systematic phenomenon that emerges under specific conditions. By highlighting the moderating roles of programmer/analyst capability and product size, the paper calls for a more nuanced, interaction‑aware approach to software effort estimation—one that can adapt its complexity weighting based on the human and structural context of each project.

