Structural Drift: The Population Dynamics of Sequential Learning
We introduce a theory of sequential causal inference in which learners in a chain estimate a structural model from their upstream teacher and then pass samples from the model to their downstream student. It extends the population dynamics of genetic drift, recasting Kimura’s selectively neutral theory as a special case of a generalized drift process using structured populations with memory. We examine the diffusion and fixation properties of several drift processes and propose applications to learning, inference, and evolution. We also demonstrate how the organization of drift process space controls fidelity, facilitates innovations, and leads to information loss in sequential learning with and without memory.
💡 Research Summary
The paper introduces a novel theoretical framework called structural drift, which models the dynamics of sequential learning in a chain of agents (teachers and students). Each learner receives a set of samples generated by an upstream teacher, infers a structured probabilistic model (e.g., a Bayesian network, a Markov chain, or any parametric graphical model) from those samples, and then passes new samples drawn from the inferred model to the downstream learner. This process is a direct analogue of genetic drift, but instead of copying a single allele, agents copy an entire model structure together with its parameters.
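The teacher-to-student loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' construction: it assumes the structured model is a two-state Markov chain fitted by maximum likelihood, and the names `sample_sequence`, `estimate_transitions`, and `learning_chain` are hypothetical.

```python
import random

def sample_sequence(trans, length, rng, start=0):
    """Draw a binary sequence from a two-state Markov chain.

    trans[s] is the probability of moving to state 1 from state s.
    """
    seq, state = [], start
    for _ in range(length):
        state = 1 if rng.random() < trans[state] else 0
        seq.append(state)
    return seq

def estimate_transitions(seq, start=0):
    """Maximum-likelihood estimate of P(next = 1 | current = s)."""
    ones = [0, 0]    # transitions into state 1, indexed by current state
    visits = [0, 0]  # visits to each current state
    prev = start
    for x in seq:
        visits[prev] += 1
        ones[prev] += x
        prev = x
    # Assumption: a state that was never visited keeps a 0.5 placeholder.
    return [ones[s] / visits[s] if visits[s] else 0.5 for s in (0, 1)]

def learning_chain(trans0, n_learners, n_samples, rng):
    """Each learner fits a model to its teacher's samples, then teaches."""
    trans = list(trans0)
    history = [list(trans)]
    for _ in range(n_learners):
        data = sample_sequence(trans, n_samples, rng)  # teacher emits data
        trans = estimate_transitions(data)             # student infers model
        history.append(trans)
    return history

rng = random.Random(1)
history = learning_chain([0.5, 0.5], n_learners=50, n_samples=100, rng=rng)
```

Because each learner sees only a finite sample, the estimated transition probabilities wander from generation to generation even though nothing favors one model over another. That wandering is the drift the summary refers to.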
Link to classical genetic drift – The authors first recast Kimura’s neutral theory of genetic drift as a special case of structural drift where the “structure” is trivial (a single locus) and there is no memory of past generations. In that limit, the transition kernel reduces to a simple binomial sampling process, and the well‑known diffusion and fixation results of population genetics are recovered.
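The memoryless, single-locus limit corresponds to the standard Wright-Fisher model, in which each generation's allele count is a binomial draw from the previous generation's frequency. The sketch below simulates that binomial kernel. The population size, starting frequency, and the function name `wright_fisher` are illustrative choices, not values from the paper.

```python
import random

def wright_fisher(n_individuals, p0, max_generations, rng):
    """Neutral drift: resample N alleles binomially each generation."""
    count = round(p0 * n_individuals)
    trajectory = [count]
    for _ in range(max_generations):
        if count == 0 or count == n_individuals:
            break  # allele lost or fixed; both states are absorbing
        p = count / n_individuals
        # Binomial(N, p) draw, written as a sum of N Bernoulli trials
        count = sum(1 for _ in range(n_individuals) if rng.random() < p)
        trajectory.append(count)
    return trajectory

rng = random.Random(0)
runs = [wright_fisher(20, 0.5, 10_000, rng) for _ in range(200)]
fixed = sum(1 for r in runs if r[-1] == 20)  # allele took over
lost = sum(1 for r in runs if r[-1] == 0)    # allele disappeared
```

Under neutrality the fixation probability equals the initial frequency, so with `p0 = 0.5` roughly half of the runs should end in fixation and the rest in loss.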
Mathematical formulation – Let \(\Theta\) denote the space of possible model parameters (and, when relevant, structures). At generation \(t\) the learner’s belief is a posterior distribution \(p_t(\theta)\). The downstream learner receives a finite sample set \(S_t\) drawn from the predictive distribution induced by \(p_t\) and conditions on it to form its own posterior \(p_{t+1}(\theta)\).
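One concrete instance of this Bayesian setup, offered only as an assumed example, is a Bernoulli model with a conjugate Beta prior. Each learner starts from the same Beta(1, 1) prior, conditions on the samples it receives, and passes on samples drawn from its posterior predictive distribution. The prior, the sample sizes, and the function names are hypothetical and are not taken from the paper.

```python
import random

def posterior_params(samples, a0=1.0, b0=1.0):
    """Beta(a0, b0) prior plus Bernoulli samples -> Beta posterior params."""
    k = sum(samples)
    return a0 + k, b0 + len(samples) - k

def predictive_samples(a, b, n, rng):
    """Draw n samples from the posterior predictive of a Beta(a, b) belief.

    One theta is drawn from the posterior, then n Bernoulli(theta) samples.
    """
    theta = rng.betavariate(a, b)
    return [1 if rng.random() < theta else 0 for _ in range(n)]

def bayesian_chain(first_samples, n_learners, n_samples, rng):
    """Return the (a, b) posterior parameters held by each learner."""
    samples = first_samples
    beliefs = []
    for _ in range(n_learners):
        a, b = posterior_params(samples)  # learner conditions on S_t
        beliefs.append((a, b))
        samples = predictive_samples(a, b, n_samples, rng)  # emits next S_t
    return beliefs

rng = random.Random(2)
beliefs = bayesian_chain([1, 0] * 10, n_learners=30, n_samples=20, rng=rng)
```

Each entry of `beliefs` summarizes one learner's posterior. Tracking the posterior mean `a / (a + b)` along the chain shows how the belief drifts as finite samples are passed downstream.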