An estimating equations approach to fitting latent exposure models with longitudinal health outcomes
The analysis of data arising from environmental health studies that collect a large number of measures of exposure can benefit from using latent variable models to summarize exposure information. However, difficulties with estimation of model parameters may arise, since existing fitting procedures for linear latent variable models require correctly specified residual variance structures for unbiased estimation of the regression parameters quantifying the association between (latent) exposure and health outcomes. We propose an estimating equations approach for latent exposure models with longitudinal health outcomes which is robust to misspecification of the outcome variance. We show that, compared with maximum likelihood, the loss of efficiency of the proposed method is relatively small when the model is correctly specified. The proposed equations formalize the ad hoc regression-on-factor-scores procedure and generalize regression calibration. We propose two weighting schemes for the equations and compare their efficiency. We apply this method to a study of the effects of in-utero lead exposure on child development.
💡 Research Summary
Environmental health studies increasingly collect numerous exposure measurements for each participant, creating a high‑dimensional exposure profile that is often summarized using latent variable techniques such as factor analysis or structural equation modeling. While latent exposure models efficiently reduce dimensionality, linking these latent constructs to longitudinal health outcomes poses a statistical challenge: conventional maximum‑likelihood (ML) or Bayesian mixed‑effects approaches require a correctly specified residual covariance structure for the outcome data. Misspecification of this structure—common in real‑world studies with irregular visit schedules, missing data, or complex autocorrelation—can lead to biased estimates of the exposure‑outcome association.
The authors propose an estimating‑equations framework that circumvents the need for a correctly modeled outcome covariance. Their method proceeds in two stages. First, a latent exposure factor is estimated from the multiple exposure indicators using standard factor analysis, yielding individual factor scores. Second, these scores are treated as fixed covariates in a generalized estimating equation (GEE) for the longitudinal outcome. The GEE is constructed as
\
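The two-stage idea described above can be illustrated with a minimal sketch in Python. It is not the authors' estimator: the function names, the simulated data, and all parameter values are hypothetical, the first principal component stands in for a fitted one-factor model, and the second stage uses a linear GEE with an independence working correlation (pooled least squares with a cluster-robust sandwich variance). Because the scores are treated as error-free covariates, the exposure coefficient is expected to be somewhat attenuated relative to the simulated latent-exposure slope.

```python
import numpy as np


def first_factor_scores(X):
    """Stage 1 stand-in: standardized first principal component scores.

    Assumption: PCA replaces a fitted one-factor model for illustration.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # First right singular vector of the standardized indicators gives the weights.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    w = vt[0]
    if w.sum() < 0:  # fix the sign so scores increase with the indicators
        w = -w
    s = Z @ w
    return s / s.std(ddof=1)


def gee_independence(y, D, ids):
    """Stage 2: linear GEE with independence working correlation.

    Point estimates equal pooled OLS; standard errors come from the
    cluster-robust sandwich, so they do not rely on a correct
    within-subject covariance model.
    """
    bread = np.linalg.inv(D.T @ D)
    beta = bread @ D.T @ y
    resid = y - D @ beta
    meat = np.zeros((D.shape[1], D.shape[1]))
    for i in np.unique(ids):
        m = ids == i
        g = D[m].T @ resid[m]  # score contribution of subject i
        meat += np.outer(g, g)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


def simulate(n=500, n_visits=4, n_indicators=5, slope=-0.5, seed=0):
    """Hypothetical data: one latent exposure, several noisy indicators,
    and a repeated outcome with a subject-level random intercept."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=n)  # latent exposure
    loadings = np.linspace(0.7, 1.0, n_indicators)
    X = eta[:, None] * loadings + rng.normal(scale=0.5, size=(n, n_indicators))
    ids = np.repeat(np.arange(n), n_visits)
    t = np.tile(np.arange(n_visits, dtype=float), n)
    b = rng.normal(scale=1.0, size=n)  # induces within-subject correlation
    noise = rng.normal(scale=0.5, size=n * n_visits)
    y = 1.0 + 0.3 * t + slope * eta[ids] + b[ids] + noise
    return X, y, t, ids


X, y, t, ids = simulate()
scores = first_factor_scores(X)
D = np.column_stack([np.ones_like(t), t, scores[ids]])
beta, se = gee_independence(y, D, ids)
print("coefficients (intercept, time, exposure score):", np.round(beta, 3))
print("robust standard errors:", np.round(se, 3))
```

The independence working correlation is chosen here only because it keeps the sketch short; the outcome in the simulation is correlated within subject, and the sandwich variance is what accounts for that correlation without modeling it.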