
On the incidence-prevalence relation and length-biased sampling

February 23, 2026


📝 Original Info

Title: On the incidence-prevalence relation and length-biased sampling
ArXiv ID: 0808.1226
Date: 2008-08-11
Authors: Vittorio Addona, Masoud Asgharian, David B. Wolfson

📝 Abstract

For many diseases, logistic and other constraints often render large incidence studies difficult, if not impossible, to carry out. This becomes a drawback, particularly when a new incidence study is needed each time the disease incidence rate is investigated in a different population. However, by carrying out a prevalent cohort study with follow-up it is possible to estimate the incidence rate if it is constant. In this paper we derive the maximum likelihood estimator (MLE) of the overall incidence rate, $\lambda$, as well as age-specific incidence rates, by exploiting the well-known epidemiologic relationship, prevalence = incidence $\times$ mean duration ($P = \lambda \times \mu$). We establish the asymptotic distributions of the MLEs, provide approximate confidence intervals for the parameters, and point out that the MLE of $\lambda$ is asymptotically most efficient. Moreover, the MLE of $\lambda$ is the natural estimator obtained by substituting the marginal maximum likelihood estimators for $P$ and $\mu$, respectively, in the expression $P = \lambda \times \mu$. Our work is related to that of Keiding (1991, 2006), who, using a Markov process model, proposed estimators for the incidence rate from a prevalent cohort study without follow-up, under three different scenarios. However, each scenario requires assumptions that are both disease specific and depend on the availability of epidemiologic data at the population level. With follow-up, we are able to remove these restrictions, and our results apply in a wide range of circumstances. We apply our methods to data collected as part of the Canadian Study of Health and Ageing to estimate the incidence rate of dementia amongst elderly Canadians.

💡 Deep Analysis

Deep Dive into On the incidence-prevalence relation and length-biased sampling.

📄 Full Content

arXiv:0808.1226v1 [stat.ME] 8 Aug 2008

ON THE INCIDENCE-PREVALENCE RELATION AND LENGTH-BIASED SAMPLING

VITTORIO ADDONA, MASOUD ASGHARIAN AND DAVID B. WOLFSON

Macalester College and McGill University

Abstract. For many diseases, logistic and other constraints often render large incidence studies difficult, if not impossible, to carry out. This becomes a drawback, particularly when a new incidence study is needed each time the disease incidence rate is investigated in a different population. However, by carrying out a prevalent cohort study with follow-up it is possible to estimate the incidence rate if it is constant. In this paper we derive the maximum likelihood estimator (MLE) of the overall incidence rate, $\lambda$, as well as age-specific incidence rates, by exploiting the well known epidemiologic relationship, prevalence = incidence $\times$ mean duration ($P = \lambda \times \mu$). We establish the asymptotic distributions of the MLEs, provide approximate confidence intervals for the parameters, and point out that the MLE of $\lambda$ is asymptotically most efficient. Moreover, the MLE of $\lambda$ is the natural estimator obtained by substituting the marginal maximum likelihood estimators for $P$ and $\mu$, respectively, in the expression $P = \lambda \times \mu$.

Our work is related to that of Keiding (1991, 2006), who, using a Markov process model, proposed estimators for the incidence rate from a prevalent cohort study without follow-up, under three different scenarios. However, each scenario requires assumptions that are both disease specific and depend on the availability of epidemiologic data at the population level. With follow-up, we are able to remove these restrictions, and our results apply in a wide range of circumstances. We apply our methods to data collected as part of the Canadian Study of Health and Ageing to estimate the incidence rate of dementia amongst elderly Canadians.
¹ Supported in part by FQRNT and NSERC of Canada.

Key words and phrases: prevalent cohort, right censoring, left truncation, incidence rate, and nonparametric maximum likelihood estimator (NPMLE).

1. Introduction

In an incidence study, whose goal is to estimate a disease incidence rate, a cohort of initially disease-free subjects is followed forward in time. The subjects are monitored closely and for those who develop the disease their approximate times of disease onset are recorded. Often, as part of an incidence study, these diseased subjects are followed until "failure" or censoring. The data collected from such an incidence study may then be used to directly estimate both the disease incidence rate and the survival function for the time from onset to failure. The estimators of the incidence rate and the survival function from such data are standard.

For many diseases, however, logistic and other constraints often render large incidence studies difficult, if not impossible, to carry out. This becomes a drawback, particularly when a new incidence study is needed each time the disease incidence rate is investigated in a different population. Nevertheless, by carrying out a prevalent cohort study with follow-up it is possible to estimate the incidence rate if it is constant, thus avoiding the problems associated with incidence studies.

In this paper we derive the maximum likelihood estimator (MLE) of the overall incidence rate, $\lambda$, as well as age-specific incidence rates from data collected as part of a prevalent cohort study with follow-up. We exploit the well known epidemiologic relationship, prevalence = incidence $\times$ mean duration ($P = \lambda \times \mu$), to suggest that the likelihood be derived as a function of the vector $(P, \mu)$. Once the MLE, $(\hat{P}, \hat{\mu})$ of $(P, \mu)$, is obtained, the MLE of $\lambda = P/\mu$ follows by invariance. A similar approach may be used to find the MLEs of age specific incidence rates.
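The plug-in step described above can be illustrated with a minimal Python sketch. It is not the authors' estimator: it assumes, for simplicity, that every prevalent case is followed to failure (no censoring) and that incidence is constant, so that the observed lifetimes are length-biased and their harmonic mean estimates $\mu$. The function names and the numbers in the example are hypothetical.

```python
from typing import Sequence


def harmonic_mean_duration(lifetimes: Sequence[float]) -> float:
    """Estimate the mean duration mu from length-biased lifetimes.

    Assumption (not from the paper): lifetimes are fully observed, with
    no censoring. Under length-biased sampling E[1/X_observed] = 1/mu,
    so the harmonic mean of the observed lifetimes estimates mu.
    """
    if len(lifetimes) == 0 or any(x <= 0 for x in lifetimes):
        raise ValueError("lifetimes must be non-empty and strictly positive")
    return len(lifetimes) / sum(1.0 / x for x in lifetimes)


def plug_in_incidence(
    n_prevalent: int, n_sampled: int, lifetimes: Sequence[float]
) -> float:
    """Plug-in estimate lambda_hat = P_hat / mu_hat.

    P_hat is the sample proportion of prevalent cases; mu_hat is the
    length-bias-corrected mean duration from the helper above.
    """
    if n_sampled <= 0 or not 0 <= n_prevalent <= n_sampled:
        raise ValueError("need 0 <= n_prevalent <= n_sampled and n_sampled > 0")
    p_hat = n_prevalent / n_sampled
    mu_hat = harmonic_mean_duration(lifetimes)
    return p_hat / mu_hat


# Hypothetical data: 3 prevalent cases among 100 sampled subjects, with
# onset-to-failure durations (in years) of 2, 4 and 4.
example_lifetimes = [2.0, 4.0, 4.0]
lambda_hat = plug_in_incidence(3, 100, example_lifetimes)
```

With these made-up numbers the estimated prevalence is 0.03 and the corrected mean duration is 3 years, giving an incidence of about 0.01 new cases per person-year. The ordinary sample mean of the same durations (about 3.33 years) would overstate $\mu$, because long survivors are over-represented among prevalent cases. Handling censored follow-up requires the nonparametric MLE discussed in the paper rather than this shortcut.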
The asymptotic distributional properties of the estimators may be obtained by modifying previous results for the MLE of the survival function, based on survival data from a prevalent cohort study with follow-up (see Section 4). It is comforting that the MLE $\hat{\lambda} = \hat{P}/\hat{\mu}$ is, therefore, also the natural ad hoc estimator of $\lambda$.

In a medical setting, a prevalent cohort study with follow-up (Wang 1991) begins with the identification, from a sampled cohort, of those with existing (prevalent) disease. The dates of onset for the diseased are ascertained and the diseased subjects are followed forward in time until failure or censoring. Other data collected include the ages at the time of recruitment, the failure/censoring times of the subjects who are followed, and covariates of interest to the researchers. There are two main features of the data collected from such studies. First, the dates of disease onset of the prevalent cases do not include the dates of onset of those who died prior to the start of the prevalent cohort study; we can only speculate as to the existence of such subjects. Hence, direct use of the ob

…(Full text truncated)…


Reference

This content is AI-processed based on ArXiv data.
