Another Systematic Review? A Critical Analysis of Systematic Literature Reviews on Agile Effort and Cost Estimation
Background: Systematic literature reviews (SLRs) have become prevalent in software engineering research. In the absence of a prospective register for SLR protocols, several researchers may unknowingly conduct SLRs on similar topics at the same time. However, even setting aside such unavoidable duplication of effort in simultaneously conducted SLRs, the proliferation of overlapping and often repetitive SLRs indicates that researchers are not thoroughly checking for existing SLRs on a topic. Given how effort-intensive it is to design, conduct, and report an SLR, this situation is less than ideal for software engineering research. Aim: To understand how authors justify conducting additional SLRs on a topic. Method: To illustrate the issue and develop suggestions for improvement, we intentionally picked a narrow but well-researched topic: effort estimation in Agile software development. We identify common justification patterns through a qualitative content analysis of 18 published SLRs, and we further consider citation data, publication years, publication venues, and the quality of the SLRs when interpreting the results. Results: Common justification patterns include claimed gaps in coverage, methodological limitations in prior studies, temporal obsolescence of previous SLRs, and rapid technological or methodological advances said to necessitate updated syntheses. Conclusion: Our in-depth analysis of SLRs on a fairly narrow topic offers insights into SLRs in software engineering in general. By emphasizing the need to identify existing SLRs and to justify undertaking further ones, both in design and review guidelines and as a policy of conferences and journals, we can reduce the duplication of effort and increase the rate of progress in the field.
💡 Research Summary
The paper investigates the growing phenomenon of duplicated systematic literature reviews (SLRs) in software engineering, using the narrowly defined domain of effort and cost estimation in Agile development as a case study. The authors followed Kitchenham's SLR guidelines to locate relevant secondary studies across four major bibliographic databases (Scopus, IEEE Xplore, ACM Digital Library, Web of Science) and supplemented the search with Google Scholar. Their search string combined terms for systematic reviews, estimation, and Agile methodologies, yielding 162 records on March 11, 2025. After removing 18 duplicates, the remaining 144 papers were screened in two stages (title/abstract, then full text) against four inclusion criteria: (i) focus on Agile effort/cost estimation, (ii) an explicit claim of being a systematic review, (iii) peer-reviewed publication, and (iv) English language. This yielded a final set of 18 SLRs published between 2014 and 2024.
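The summary does not reproduce the authors' exact query, but the three term groups it describes suggest a boolean structure like the following sketch; every concrete term and the quoting/field conventions below are assumptions, not taken from the paper:

```python
# Illustrative reconstruction of the kind of boolean search string described
# above. The exact terms used by the authors are not given in this summary,
# so treat every term here as an assumption.
review_terms = ["systematic review", "systematic literature review", "SLR", "systematic mapping"]
estimation_terms = ["effort estimation", "cost estimation", "size estimation"]
agile_terms = ["agile", "scrum", "extreme programming", "kanban"]

def or_group(terms):
    """Join a term group into a parenthesized OR clause of quoted phrases."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# The three groups are ANDed together, mirroring the description in the text.
query = " AND ".join(or_group(g) for g in (review_terms, estimation_terms, agile_terms))
print(query)
```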
Four research questions guided the analysis: (RQ1) which SLRs exist on the topic; (RQ2) what are their characteristics (aims, year coverage, number of primary studies, quality scores, venue rankings); (RQ3) are the authors aware of prior SLRs; and (RQ4) how do authors justify conducting another SLR. For each SLR the authors extracted the research questions, coded the justification statements, assessed methodological rigor using the five‑item DARE checklist (scored 0, 0.5, or 1 per item), and recorded whether earlier SLRs were cited.
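Because each of the five DARE items is scored 0, 0.5, or 1, a review's quality score is simply the sum over items, with a maximum of 5. A minimal Python sketch of that scoring rule follows; the item labels are paraphrased assumptions, not the checklist's official wording:

```python
# Minimal DARE-style scoring sketch. The five item labels are paraphrased
# assumptions; each item takes 0, 0.5, or 1 as described in the summary.
DARE_ITEMS = [
    "inclusion/exclusion criteria reported",
    "search adequately comprehensive",
    "quality of primary studies assessed",
    "primary studies adequately described",
    "synthesis appropriate",
]

def dare_score(item_scores: dict[str, float]) -> float:
    """Sum the five item scores; the maximum possible score is 5."""
    assert set(item_scores) == set(DARE_ITEMS)
    assert all(s in (0, 0.5, 1) for s in item_scores.values())
    return sum(item_scores.values())

# Example: a review that reports everything except a quality appraisal,
# the most common omission according to the paper's findings.
example = dict.fromkeys(DARE_ITEMS, 1.0)
example["quality of primary studies assessed"] = 0.0
print(dare_score(example))  # 4.0
```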
The temporal distribution shows a steady increase, with at least one SLR each year except 2015, and a peak of four publications in 2024. The number of primary studies per SLR ranges from 8 to 86 (mean = 33, median = 26.5), indicating a right‑skewed distribution. A Spearman correlation of 0.50 (p = 0.034) between publication year and primary‑study count suggests moderate growth of the evidence base over time, although the authors did not examine overlap among primary studies.
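The reported statistic is a standard rank correlation over (publication year, primary-study count) pairs, computable with a single scipy call. In the sketch below the pairs are placeholders invented for illustration, so the script will not reproduce the paper's rho = 0.50:

```python
# Sketch of the correlation analysis described above, using scipy.
# The (year, count) pairs are PLACEHOLDERS for illustration only; the paper's
# per-SLR data are not reproduced in this summary, so the output will differ
# from the reported rho = 0.50, p = 0.034.
from scipy.stats import spearmanr

years = [2014, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
counts = [8, 12, 20, 25, 26, 27, 30, 40, 55, 86]

rho, p_value = spearmanr(years, counts)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```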
Quality assessment using DARE reveals an average score of 4.1 out of 5, with half of the SLRs failing to report a quality appraisal of their primary studies. Most SLRs provide explicit inclusion/exclusion criteria and search at least four digital libraries, yet DARE does not evaluate whether the SLR itself was warranted, so high scores may mask redundancy concerns.
A striking finding is the lack of awareness of prior work: 11 of the 18 SLRs (≈61%) do not cite any earlier SLR on the topic, and even where citations exist they are sparse (e.g., SLR6 cites only two of the five preceding reviews). This indicates that many authors embark on new reviews without a comprehensive map of the existing secondary literature.
The authors identified four dominant justification patterns for launching a new SLR: (1) claiming coverage gaps for specific sub‑topics (e.g., user‑story estimation); (2) pointing out methodological shortcomings in earlier reviews (limited search scope, missing quality assessment); (3) asserting temporal obsolescence (e.g., “the last review is more than five years old”); and (4) invoking rapid technological or methodological advances (e.g., emergence of machine‑learning‑based estimation techniques). While some of these rationales can be legitimate, the empirical analysis shows that many new SLRs largely overlap with previous work, leading to inefficient use of research effort.
To mitigate redundancy, the paper proposes interventions on two levels. First, at the study design stage, the community should adopt a registration mechanism for SLR protocols—analogous to PROSPERO in health sciences—so that researchers can publicly record their intended scope and check for existing reviews before proceeding. Second, journals and conferences should require authors to explicitly discuss prior SLRs in the abstract and introduction, and reviewers should be instructed to evaluate the novelty claim as a formal review criterion. The authors also suggest updating existing SLR guidelines (e.g., Kitchenham, Petersen) to embed a “duplication check” step, and to encourage venues to prioritize submissions that demonstrate clear differentiation from earlier secondary studies.
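As a purely hypothetical illustration of such a duplication check (not part of the paper's proposal), a researcher could query an open bibliographic index before drafting a protocol. The sketch below uses the public OpenAlex works endpoint; the query terms and result handling are assumptions:

```python
# Hypothetical "duplication check" sketch: search an open bibliographic index
# for existing SLRs on the intended topic before registering a new protocol.
# This is NOT the paper's method; the query terms here are assumptions.
import requests

def find_existing_slrs(topic: str, per_page: int = 25) -> list[dict]:
    """Return candidate prior reviews matching the topic from OpenAlex."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={"search": f'"systematic literature review" {topic}',
                "per-page": per_page},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

# Usage: list year and title of candidate prior reviews for manual screening.
for work in find_existing_slrs("agile effort estimation"):
    print(work.get("publication_year"), "-", work.get("display_name"))
```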
In conclusion, the study provides empirical evidence that even within a narrowly bounded research area, the software engineering community produces a substantial number of overlapping systematic reviews. The lack of systematic awareness and the reliance on weak justification patterns contribute to unnecessary duplication. By instituting protocol registration and strengthening editorial/review policies, the field can reduce redundant effort, improve the efficiency of knowledge synthesis, and accelerate genuine progress in understanding Agile effort and cost estimation.