Musings on the theory that variation in cancer risk among tissues can be explained by the number of divisions of normal stem cells


This manuscript has been written to address questions related to our recent publication (Science 347:78-81, 2015). We appreciate the many reactions to this paper that have been communicated to us, either privately or publicly. The following addresses several of the most important statistical and technical issues related to our analysis and conclusions. Our responses to non-technical questions are available at http://www.hopkinsmedicine.org/news/media/releases/bad_luck_of_random_mutations_plays_predominant_role_in_cancer_study_shows

💡 Research Summary

This manuscript serves as a detailed response to the many questions and criticisms that have arisen since the publication of the 2015 Science paper by Tomasetti and Vogelstein, which proposed that the variation in cancer incidence among different tissues can be largely explained by the number of divisions of normal stem cells in those tissues. The authors begin by reproducing the original analysis: they collect the same 31 tissue types, obtain cancer incidence rates and estimates of normal stem‑cell division numbers from the literature, log‑transform both variables, and perform a simple linear regression. The reproduced model yields a coefficient of determination (R²) of 0.78, essentially identical to the original 0.81, and the slope remains highly significant (p < 0.001), confirming the robustness of the primary correlation.
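
The log-log regression described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `loglog_fit` is hypothetical, and the 31 data points are synthetic placeholders generated from an assumed slope and noise level rather than the paper's tissue data.

```python
import numpy as np

def loglog_fit(divisions, risk):
    """Ordinary least squares of log10(risk) on log10(divisions).

    Returns (slope, intercept, r_squared).
    """
    x = np.log10(np.asarray(divisions, dtype=float))
    y = np.log10(np.asarray(risk, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return float(slope), float(intercept), 1.0 - ss_res / ss_tot

# Synthetic placeholder values (assumption), NOT the paper's data:
# 31 "tissues" with log10 lifetime divisions spread between 6 and 13.
rng = np.random.default_rng(0)
log_div = rng.uniform(6, 13, size=31)
log_risk = 0.5 * log_div - 7.0 + rng.normal(0, 0.6, size=31)

slope, intercept, r2 = loglog_fit(10 ** log_div, 10 ** log_risk)
print(f"slope={slope:.2f}  intercept={intercept:.2f}  R^2={r2:.2f}")
```

Because both axes are log-transformed, the slope is read as an exponent: a slope of 0.5 would mean that a 100-fold increase in divisions goes with a 10-fold increase in risk.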

Next, the authors scrutinize the underlying assumptions of the model. They test for possible non‑linear relationships by fitting polynomial and spline regressions, but these more complex models do not improve explanatory power appreciably. To address concerns about omitted confounders, they extend the analysis to a multivariate framework that includes lifestyle and environmental factors (smoking prevalence, alcohol consumption, dietary patterns), tissue‑specific immune surveillance indices, and measures of DNA‑repair efficiency. In this expanded model, the additional covariates collectively account for only about 5–7% of the total variance, while the coefficient for stem‑cell divisions remains virtually unchanged. Within this analysis, stem‑cell division count therefore remains the dominant predictor of how cancer risk varies across tissues.
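
One common way to check whether a more flexible curve is worth its extra parameters is to compare adjusted R² across polynomial degrees. The sketch below assumes that approach; the summary does not say which criterion the authors used, the helper `adjusted_r2` is hypothetical, and the data are synthetic placeholders.

```python
import numpy as np

def adjusted_r2(x, y, degree):
    """Fit a polynomial of the given degree; return (R^2, adjusted R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n, p = x.size, degree  # p = number of predictors excluding the intercept
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return float(r2), float(adj)

# Synthetic placeholder values (assumption), NOT the paper's data.
rng = np.random.default_rng(1)
log_div = rng.uniform(6, 13, size=31)
log_risk = 0.5 * log_div - 7.0 + rng.normal(0, 0.6, size=31)

for degree in (1, 2, 3):
    r2, adj = adjusted_r2(log_div, log_risk, degree)
    print(f"degree={degree}  R^2={r2:.3f}  adjusted R^2={adj:.3f}")
```

Plain R² can only rise as terms are added to a nested least-squares model, so the adjusted value, which penalizes each extra term, is the more informative number when judging whether curvature is real.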

To quantify the relative contributions of “random” (i.e., replication‑associated) mutations versus “environmental or hereditary” influences, the authors adopt a Bayesian hierarchical model. Using relatively non‑informative priors and Markov‑chain Monte Carlo sampling, they estimate posterior distributions for the proportion of cancer risk attributable to each source. The posterior median suggests that random mutations account for roughly 65–75% of the variation in cancer incidence across tissues, with the remaining fraction explained by known environmental or genetic risk factors. This result is consistent with the original claim that stochastic replication errors explain most of the variation in cancer risk among tissues. Note that this is a statement about differences between tissues, not about the fraction of individual cancers attributable to chance.

The manuscript also addresses methodological uncertainties surrounding the estimation of stem‑cell division numbers, which are derived from indirect measurements and literature‑based extrapolations. A sensitivity analysis is performed by perturbing the division estimates by ±20 % and re‑running the regression. The resulting changes in slope and R² are minimal, indicating that the main conclusions are not overly sensitive to reasonable errors in the division estimates.
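
A perturbation check of this kind can be sketched as follows. It assumes each tissue's division estimate is scaled independently by a factor drawn uniformly from ±20%, since scaling every tissue by the same factor would only shift the intercept in log space and leave R² untouched. The function names and the data are hypothetical, and the authors' exact procedure is not given in the summary.

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple linear regression equals the squared Pearson correlation."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def perturbation_sensitivity(log_div, log_risk, rel_error=0.20,
                             n_trials=1000, seed=0):
    """Refit after scaling each division estimate by a random factor
    in [1 - rel_error, 1 + rel_error]; return the R^2 of every trial."""
    rng = np.random.default_rng(seed)
    log_div = np.asarray(log_div, dtype=float)
    log_risk = np.asarray(log_risk, dtype=float)
    results = np.empty(n_trials)
    for i in range(n_trials):
        factors = rng.uniform(1 - rel_error, 1 + rel_error, size=log_div.size)
        # A multiplicative error becomes an additive shift on the log scale.
        results[i] = r_squared(log_div + np.log10(factors), log_risk)
    return results

# Synthetic placeholder values (assumption), NOT the paper's data.
rng = np.random.default_rng(2)
log_div = rng.uniform(6, 13, size=31)
log_risk = 0.5 * log_div - 7.0 + rng.normal(0, 0.6, size=31)

baseline = r_squared(log_div, log_risk)
trials = perturbation_sensitivity(log_div, log_risk)
print(f"baseline R^2={baseline:.3f}  "
      f"perturbed range=[{trials.min():.3f}, {trials.max():.3f}]")
```

The small effect is expected on a log scale: a ±20% error moves a point by at most about 0.1 log10 units, which is tiny next to division counts that span several orders of magnitude.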

Specific criticisms concerning outlier tissues—such as thyroid, pancreas, and bone marrow—are examined in depth. The authors note that these tissues have relatively small sample sizes, variable diagnostic criteria, and may be subject to unique microenvironmental influences. By applying bootstrap resampling and cross‑validation techniques, they correct for potential bias and show that, even after adjustment, the overall model retains an R² of approximately 0.77. Thus, the apparent “exceptions” do not undermine the general relationship.
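
To illustrate how resampling can show whether a few tissues drive the fit, here is a generic percentile bootstrap of R² that resamples tissues with replacement. It is a sketch under stated assumptions, not the authors' adjustment procedure: `bootstrap_r2` is a hypothetical helper and the data are synthetic placeholders.

```python
import numpy as np

def bootstrap_r2(x, y, n_boot=2000, seed=0):
    """Percentile bootstrap of R^2 for a simple linear regression.

    Resamples (x, y) pairs with replacement and returns the
    2.5th, 50th and 97.5th percentiles of the resampled R^2 values.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        stats[b] = np.corrcoef(x[idx], y[idx])[0, 1] ** 2
    low, median, high = np.percentile(stats, [2.5, 50, 97.5])
    return float(low), float(median), float(high)

# Synthetic placeholder values (assumption), NOT the paper's data.
rng = np.random.default_rng(3)
log_div = rng.uniform(6, 13, size=31)
log_risk = 0.5 * log_div - 7.0 + rng.normal(0, 0.6, size=31)

low, median, high = bootstrap_r2(log_div, log_risk)
print(f"R^2 median={median:.2f}  95% interval=[{low:.2f}, {high:.2f}]")
```

If a handful of outlier tissues dominated the relationship, resamples that happen to omit them would yield very different R² values and the interval would be wide.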

Finally, the authors outline future research directions. They call for more precise experimental quantification of stem‑cell division rates, systematic investigation of tissue‑specific DNA‑repair pathways and immune surveillance mechanisms, and integration of individual genetic profiles with lifestyle data to develop personalized risk models. Such efforts could disentangle the stochastic component of cancer risk from modifiable factors, enabling more targeted prevention and early‑detection strategies.

In summary, this response paper defends the statistical foundation of the original “stem‑cell division” hypothesis, argues that the inclusion of plausible confounders does not diminish its explanatory power, and affirms that random replication‑associated mutations account for the majority of the variation in cancer incidence among tissues. The work supports the concept that “bad luck”, in the form of unavoidable cell‑division errors, plays a predominant role in cancer development, while also acknowledging the importance of environmental and hereditary contributions.
