📝 Original Info
- Title: On the incidence-prevalence relation and length-biased sampling
- ArXiv ID: 0808.1226
- Date: 2008-08-11
- Authors: Vittorio Addona, Masoud Asgharian, David B. Wolfson
📝 Abstract
For many diseases, logistic and other constraints often render large incidence studies difficult, if not impossible, to carry out. This becomes a drawback, particularly when a new incidence study is needed each time the disease incidence rate is investigated in a different population. However, by carrying out a prevalent cohort study with follow-up it is possible to estimate the incidence rate if it is constant. In this paper we derive the maximum likelihood estimator (MLE) of the overall incidence rate, $\lambda$, as well as age-specific incidence rates, by exploiting the well known epidemiologic relationship, prevalence = incidence $\times$ mean duration ($P = \lambda \times \mu$). We establish the asymptotic distributions of the MLEs, provide approximate confidence intervals for the parameters, and point out that the MLE of $\lambda$ is asymptotically most efficient. Moreover, the MLE of $\lambda$ is the natural estimator obtained by substituting the marginal maximum likelihood estimators for P and $\mu$, respectively, in the expression $P = \lambda \times \mu$. Our work is related to that of Keiding (1991, 2006), who, using a Markov process model, proposed estimators for the incidence rate from a prevalent cohort study \emph{without} follow-up, under three different scenarios. However, each scenario requires assumptions that are both disease specific and depend on the availability of epidemiologic data at the population level. With follow-up, we are able to remove these restrictions, and our results apply in a wide range of circumstances. We apply our methods to data collected as part of the Canadian Study of Health and Ageing to estimate the incidence rate of dementia amongst elderly Canadians.
📄 Full Content
arXiv:0808.1226v1 [stat.ME] 8 Aug 2008
ON THE INCIDENCE-PREVALENCE RELATION AND
LENGTH-BIASED SAMPLING
VITTORIO ADDONA, MASOUD ASGHARIAN AND DAVID B. WOLFSON
Macalester College and McGill University
Abstract. For many diseases, logistic and other constraints often render large
incidence studies difficult, if not impossible, to carry out. This becomes a drawback,
particularly when a new incidence study is needed each time the disease incidence
rate is investigated in a different population. However, by carrying out a prevalent
cohort study with follow-up it is possible to estimate the incidence rate if it is
constant. In this paper we derive the maximum likelihood estimator (MLE) of the overall
incidence rate, λ, as well as age-specific incidence rates, by exploiting the well known
epidemiologic relationship, prevalence = incidence × mean duration (P = λ × µ). We
establish the asymptotic distributions of the MLEs, provide approximate confidence
intervals for the parameters, and point out that the MLE of λ is asymptotically most
efficient. Moreover, the MLE of λ is the natural estimator obtained by substituting the
marginal maximum likelihood estimators for P and µ, respectively, in the expression
P = λ × µ. Our work is related to that of Keiding (1991, 2006), who, using a Markov
process model, proposed estimators for the incidence rate from a prevalent cohort study
without follow-up, under three different scenarios. However, each scenario requires
assumptions that are both disease specific and depend on the availability of epidemiologic
data at the population level. With follow-up, we are able to remove these restrictions,
and our results apply in a wide range of circumstances. We apply our methods to data
collected as part of the Canadian Study of Health and Ageing to estimate the incidence
rate of dementia amongst elderly Canadians.
Supported in part by FQRNT and NSERC of Canada.
Key words and phrases: prevalent cohort, right censoring, left truncation, incidence rate,
and nonparametric maximum likelihood estimator (NPMLE)
1. Introduction
In an incidence study, whose goal is to estimate a disease incidence rate, a cohort
of initially disease-free subjects is followed forward in time. The subjects are monitored
closely and for those who develop the disease their approximate times of disease onset
are recorded. Often, as part of an incidence study, these diseased subjects are followed
until “failure” or censoring. The data collected from such an incidence study may then
be used to directly estimate both the disease incidence rate and the survival function
for the time from onset to failure. The estimators of the incidence rate and the survival
function from such data are standard.
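
As an illustration of the standard estimator referred to above, the crude incidence rate from an incidence study is the number of new cases divided by the total disease-free person-time at risk. The sketch below assumes a constant rate; the function name `crude_incidence_rate` and the cohort values are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def crude_incidence_rate(follow_up: Iterable[Tuple[float, bool]]) -> float:
    """Crude incidence rate: new cases per unit of person-time at risk.

    Each record is (time_at_risk, developed_disease), where time_at_risk is
    the disease-free follow-up time contributed by one subject.
    """
    cases = 0
    person_time = 0.0
    for time_at_risk, developed_disease in follow_up:
        if time_at_risk < 0:
            raise ValueError("time at risk must be non-negative")
        person_time += time_at_risk
        cases += int(developed_disease)
    if person_time == 0:
        raise ValueError("total person-time at risk must be positive")
    return cases / person_time


# Hypothetical cohort: five subjects, follow-up measured in years.
cohort = [(5.0, False), (2.5, True), (5.0, False), (1.5, True), (4.0, False)]
rate = crude_incidence_rate(cohort)  # 2 cases over 18 person-years
```

The difficulty described in the next paragraph is that collecting enough disease-free person-time for this calculation can require a very large and lengthy study.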
For many diseases, however, logistic and other constraints often render large
incidence studies difficult, if not impossible, to carry out. This becomes a drawback,
particularly when a new incidence study is needed each time the disease incidence rate
is investigated in a different population. Nevertheless, by carrying out a prevalent
cohort study with follow-up it is possible to estimate the incidence rate if it is constant,
thus avoiding the problems associated with incidence studies. In this paper we derive
the maximum likelihood estimator (MLE) of the overall incidence rate, λ, as well as
age-specific incidence rates from data collected as part of a prevalent cohort study with
follow-up. We exploit the well known epidemiologic relationship, prevalence = incidence
× mean duration (P = λ × µ), to suggest that the likelihood be derived as a function of
the vector (P, µ). Once the MLE, $(\hat{P}, \hat{\mu})$ of (P, µ), is obtained, the MLE of
λ = P/µ follows by invariance. A similar approach may be used to find the MLEs of age
specific incidence rates. The asymptotic distributional properties of the estimators may
be obtained by modifying previous results for the MLE of the survival function, based on
survival data from a prevalent cohort study with follow-up (see Section 4). It is comforting
that the MLE $\hat{\lambda} = \hat{P}/\hat{\mu}$ is, therefore, also the natural ad hoc
estimator of λ.
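
The invariance step amounts to simple plug-in arithmetic once estimates of P and µ are available. The following is a minimal sketch of that final step only; it does not reproduce the likelihood for (P, µ), which must account for left truncation and right censoring. The function name `plug_in_incidence_rate` and the numerical inputs are hypothetical, not values from the paper.

```python
def plug_in_incidence_rate(prevalence_hat: float, mean_duration_hat: float) -> float:
    """Plug-in estimate of a constant incidence rate via P = lambda * mu.

    prevalence_hat    -- estimated prevalence P, as a proportion in [0, 1]
    mean_duration_hat -- estimated mean disease duration mu (e.g. in years)

    Returns lambda_hat = P_hat / mu_hat, in cases per person per time unit.
    """
    if not 0.0 <= prevalence_hat <= 1.0:
        raise ValueError("prevalence must be a proportion in [0, 1]")
    if mean_duration_hat <= 0.0:
        raise ValueError("mean duration must be positive")
    return prevalence_hat / mean_duration_hat


# Hypothetical inputs: 8% prevalence and a mean duration of 4 years.
lambda_hat = plug_in_incidence_rate(0.08, 4.0)  # 0.08 / 4 cases per person-year
```

Because λ is a function of (P, µ), no separate maximisation over λ is needed; the estimate inherits its properties from the joint estimate of (P, µ).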
In a medical setting, a prevalent cohort study with follow-up (Wang 1991) begins
with the identification, from a sampled cohort, of those with existing (prevalent) disease.
The dates of onset for the diseased are ascertained and the diseased subjects are followed
forward in time until failure or censoring. Other data collected include the ages at the
time of recruitment, the failure/censoring times of the subjects who are followed, and
covariates of interest to the researchers. There are two main features of the data collected
from such studies. First, the dates of disease onset of the prevalent cases do not include
the dates of onset of those who died prior to the start of the prevalent cohort study;
we can only speculate as to the existence of such subjects. Hence, direct use of the
ob
…(Full text truncated)…