Analysis of Higher Education Dropouts Dynamics through Multilevel Functional Decomposition of Recurrent Events in Counting Processes
This paper analyzes the dynamics of higher education dropouts through an innovative approach that integrates recurrent events modeling and point process theory with functional data analysis. We propose a novel methodology that extends existing frameworks to accommodate hierarchical data structures, demonstrating its potential through a simulation study. Using administrative data from student careers at Politecnico di Milano, we explore dropout patterns during the first year across different bachelor’s degree programs and schools. Specifically, we employ Cox-based recurrent event models, treating dropouts as repeated occurrences within both programs and schools. Additionally, we apply functional modeling of recurrent events and multilevel principal component analysis to disentangle latent effects associated with degree programs and schools, identifying critical periods of dropout risk and providing valuable insights for institutions seeking to implement strategies aimed at reducing dropout rates.
Research Summary
This paper presents a novel multilevel functional decomposition framework for analyzing higher-education dropout dynamics, applied to administrative data from Politecnico di Milano (PoliMi). The authors treat dropout events as recurrent events in a counting process and model them using the Andersen-Gill (AG) extension of the Cox proportional hazards model. For each combination of school (upper level) and degree program (lower level), the intensity λij(t) is estimated, and the cumulative hazard (compensator) Λij(t) = ∫₀ᵗ λij(s) ds is reconstructed, providing a time-varying risk curve for each unit.
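To make the notion of a compensator curve concrete, the sketch below computes a Nelson-Aalen estimate of the cumulative hazard for a single program. It is a covariate-free simplification and not the Andersen-Gill fit described above; the function name `nelson_aalen` and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def nelson_aalen(event_times, exit_times, grid):
    """Nelson-Aalen estimate of the cumulative hazard on a time grid.

    event_times: times at which a dropout was observed
    exit_times:  time each student left the risk set (dropout or censoring)
    grid:        times at which the step function is evaluated
    """
    event_times = np.sort(np.asarray(event_times, dtype=float))
    exit_times = np.asarray(exit_times, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # Distinct event times and the number of dropouts at each one.
    distinct, counts = np.unique(event_times, return_counts=True)
    # Students still enrolled just before each event time.
    at_risk = np.array([(exit_times >= t).sum() for t in distinct], dtype=float)

    # Each event time adds (dropouts / students at risk) to the curve.
    cumulative = np.concatenate(([0.0], np.cumsum(counts / at_risk)))

    # Right-continuous step function: 0 before the first event.
    idx = np.searchsorted(distinct, grid, side="right")
    return cumulative[idx]
```

Evaluating the function on a common grid for every program yields one curve per unit, which is the kind of input a functional decomposition would then operate on.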
The first analytical phase uses the 2016 cohort (students who started in the 2016‑17 academic year) and follows them for the first three semesters. After fitting the AG model, the authors apply Multilevel Functional Principal Component Analysis (MFPCA) to the set of compensator curves. MFPCA separates the variation into school‑level and program‑level components, yielding a small number of functional principal scores that capture the dominant patterns of dropout risk across the hierarchy. The school‑level component reflects overall institutional risk, while program‑level components reveal program‑specific temporal patterns (e.g., early‑semester spikes versus mid‑semester peaks).
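The two-level split described above can be sketched as follows. This is a simplified plug-in version, assuming curves sampled on a common grid: school-level variation is taken from school mean curves and program-level variation from deviations around them. Published MFPCA estimators typically use method-of-moments covariance estimates instead, so this is an illustration of the idea rather than the authors' procedure. The function name `multilevel_fpca` and its return keys are hypothetical.

```python
import numpy as np

def multilevel_fpca(curves, groups, n_components=2):
    """Plug-in two-level FPCA for curves on a common grid.

    curves: array of shape (n_programs, n_grid), one compensator per program
    groups: school label for each program
    """
    curves = np.asarray(curves, dtype=float)
    labels, inverse = np.unique(np.asarray(groups), return_inverse=True)

    # Remove the overall mean curve.
    mean = curves.mean(axis=0)
    centered = curves - mean

    # Level 1: mean centered curve of each school.
    school_means = np.vstack(
        [centered[inverse == g].mean(axis=0) for g in range(len(labels))]
    )
    # Level 2: deviation of each program from its school mean.
    within = centered - school_means[inverse]

    def top_components(data):
        # Eigendecomposition of the empirical covariance on the grid.
        cov = data.T @ data / data.shape[0]
        values, vectors = np.linalg.eigh(cov)
        order = np.argsort(values)[::-1][:n_components]
        return values[order], vectors[:, order]

    val_b, phi_b = top_components(school_means[inverse])
    val_w, phi_w = top_components(within)

    return {
        "mean": mean,
        "school_eigenvalues": val_b,
        "school_components": phi_b,
        "school_scores": school_means @ phi_b,   # one row per school
        "program_eigenvalues": val_w,
        "program_components": phi_w,
        "program_scores": within @ phi_w,        # one row per program
    }
```

The score matrices returned here play the role of the functional principal scores mentioned above: a few numbers per school and per program that summarize the shape of the risk curve.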
In the second phase, the 2017 cohort is used to predict whether a student will drop out within three years (binary outcome dropout3y). Predictors include baseline demographics, first‑semester academic performance (ECTS credits), the hierarchical grouping factors (program and school), and the functional principal scores derived from the first phase. A Cox model augmented with these functional covariates is compared against traditional shared‑frailty Cox models. Results show that incorporating the functional scores improves discrimination (AUC rises from ~0.78 to ~0.84) and provides clearer insight into when and where dropout risk is highest. The model also integrates time‑varying marks (e.g., count of prior dropout events) to capture re‑dropout dynamics.
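The discrimination figures quoted above are AUC values. As a reminder of what that statistic measures, the sketch below computes it as the probability that a randomly chosen dropout receives a higher risk score than a randomly chosen non-dropout, with ties counted as one half. The function name `auc` and the toy inputs in the usage note are hypothetical and unrelated to the paper's data.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: predicted risk for each student
    labels: 1 if the student dropped out, 0 otherwise
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos = scores[labels]    # students who dropped out
    neg = scores[~labels]   # students who did not

    # Compare every dropout with every non-dropout.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))
```

For example, `auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])` evaluates to 0.75, because three of the four dropout/non-dropout pairs are ordered correctly.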
A simulation study validates the methodology, demonstrating accurate recovery of the underlying compensators and principal components under realistic hierarchical correlation structures and non-stationary hazard patterns. The empirical application confirms that different schools exhibit distinct overall risk levels and that individual programs display characteristic risk trajectories: some peak in the first semester (e.g., civil engineering), others later (e.g., industrial engineering).
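One common way to generate recurrent events with a non-stationary hazard for a simulation of this kind is Lewis-Shedler thinning of a homogeneous Poisson process. The sketch below is an illustration under assumptions, not the paper's simulation design: the exponentially decaying baseline, the log-normal school and program effects, and all names (`simulate_nhpp`, `school_effect`, `program_effect`) are hypothetical.

```python
import numpy as np

def simulate_nhpp(intensity, t_max, lam_max, rng):
    """Event times of a non-homogeneous Poisson process on [0, t_max].

    intensity: vectorized function t -> lambda(t), with lambda(t) <= lam_max
    lam_max:   upper bound on the intensity over [0, t_max]
    rng:       numpy.random.Generator
    """
    # Candidate events from a homogeneous process with rate lam_max.
    n = rng.poisson(lam_max * t_max)
    candidates = np.sort(rng.uniform(0.0, t_max, size=n))
    # Keep each candidate with probability lambda(t) / lam_max.
    keep = rng.uniform(0.0, 1.0, size=n) < intensity(candidates) / lam_max
    return candidates[keep]

# Illustrative hierarchy (an assumption): multiplicative school and
# program effects acting on a shared decaying baseline intensity.
rng = np.random.default_rng(1)
school_effect = rng.normal(0.0, 0.3)
program_effect = rng.normal(0.0, 0.2)
scale = np.exp(school_effect + program_effect)

def baseline(t):
    return 2.0 * np.exp(-t)   # risk concentrated early in the follow-up

events = simulate_nhpp(lambda t: scale * baseline(t),
                       t_max=3.0, lam_max=2.0 * scale, rng=rng)
```

Because the true cumulative intensity is known in closed form for such a baseline, estimated compensators can be compared against it directly.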
The paper contributes to the dropout literature by moving beyond student‑level binary classifiers to a hierarchical, functional perspective that respects the temporal evolution of risk. It offers university administrators a statistical tool to pinpoint critical periods and program‑specific vulnerabilities, enabling targeted early‑intervention policies such as tutoring, mentorship, or financial aid. By integrating historical dropout trajectories into predictive models, the approach provides both explanatory depth and practical forecasting power for reducing dropout rates in higher education.