Social Interactions Models with Latent Structures
This paper studies estimation and inference of heterogeneous peer effects featuring group fixed effects and slope heterogeneity under latent structure. We adapt the Classifier-Lasso algorithm to consistently discover latent structures and determine the number of clusters. To solve the incidental parameter problem in the binary choice model with social interactions, we propose a parametric bootstrap method to debias the estimator and establish its asymptotic validity. Monte Carlo simulations confirm strong finite sample performance of our methods. In an application to students’ risky behaviors, the algorithm detects two latent clusters and finds that peer effects are significant within one of the clusters, demonstrating the method's practical value for uncovering heterogeneous social interactions.
💡 Research Summary
This paper tackles two major challenges that arise when estimating binary‑choice social‑interaction models with group fixed effects: (i) how to allow heterogeneity across groups without sacrificing efficiency, and (ii) how to correct the incidental‑parameter bias that fixed effects introduce in a game‑theoretic setting. The authors propose a novel framework that assumes a latent‑cluster structure: groups are partitioned into a finite number of unknown clusters, and within each cluster all structural parameters (the peer‑effect coefficient, the coefficients on individual covariates, and the group fixed effect) are homogeneous, while they differ across clusters. Neither the number of clusters nor the cluster memberships are known a priori.
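To make the game-theoretic setting concrete, the following minimal sketch computes equilibrium choice probabilities for a single group by fixed-point iteration. The functional form (a logit index with a leave-one-out mean of peers' choice probabilities) and the names `equilibrium_ccps`, `beta`, `gamma`, and `alpha` are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np


def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-z))


def equilibrium_ccps(x, beta, gamma, alpha, tol=1e-10, max_iter=1000):
    """Solve p_i = logistic(x_i'beta + gamma * mean_{j != i} p_j + alpha).

    Assumed (hypothetical) specification: each member best-responds to the
    average choice probability of the other members of the same group.
    """
    n = x.shape[0]
    p = np.full(n, 0.5)  # neutral starting guess
    for _ in range(max_iter):
        peer_mean = (p.sum() - p) / (n - 1)  # leave-one-out average
        p_new = logistic(x @ beta + gamma * peer_mean + alpha)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")


rng = np.random.default_rng(0)
x = rng.normal(size=(30, 2))  # 30 group members, 2 covariates
p = equilibrium_ccps(x, beta=np.array([0.5, -0.3]), gamma=1.0, alpha=-0.2)
```

With a logistic link the mapping is a contraction whenever |gamma| < 4, so simple iteration converges in this example; for stronger peer effects there may be multiple equilibria and the iteration can fail.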
To estimate this model, the authors combine the Nested Pseudo‑Likelihood (NPL) algorithm with the Classifier‑Lasso (C‑Lasso) penalization method in a three‑step procedure. First, an initial NPL run provides preliminary estimates of the conditional choice probabilities (CCPs) for each individual. Second, the estimated CCPs are fed into C‑Lasso, which solves a penalized optimization problem that simultaneously selects the number of clusters and assigns each group to a cluster. The penalty term forces groups with similar parameter vectors to be merged, and a data‑driven tuning rule determines the optimal number of clusters. Third, after the cluster structure is fixed, NPL is run again on each cluster to obtain the final structural‑parameter estimates.
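The merging behavior of the penalty in the second step can be illustrated with a small sketch. It assumes the multiplicative form commonly used for C‑Lasso in the panel literature, (λ/G) Σ_g Π_k ‖θ_g − c_k‖; whether the paper uses exactly this form is an assumption, and the nearest-center rule in `assign_clusters` is a simplification for illustration rather than the paper's classification rule. All names are hypothetical.

```python
import numpy as np


def classo_penalty(theta, centers, lam):
    """Multiplicative penalty: (lam / G) * sum_g prod_k ||theta_g - c_k||.

    theta:   (G, d) array of group-specific parameter vectors
    centers: (K, d) array of candidate cluster-level parameter vectors
    """
    # dist[g, k] = Euclidean distance between group g and center k
    dist = np.linalg.norm(theta[:, None, :] - centers[None, :, :], axis=2)
    return lam * dist.prod(axis=1).mean()


def assign_clusters(theta, centers):
    """Assign each group to its nearest center (illustrative shortcut)."""
    dist = np.linalg.norm(theta[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


theta = np.array([[1.0, 0.5], [1.1, 0.4], [-0.9, 2.0], [-1.0, 2.1]])
centers = np.array([[1.05, 0.45], [-0.95, 2.05]])
labels = assign_clusters(theta, centers)
penalty = classo_penalty(theta, centers, lam=0.5)
```

Because the penalty is a product over centers, a group contributes zero as soon as its parameter vector coincides with any one center, which is what pushes groups with similar parameters into the same cluster.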
A key contribution is the treatment of the incidental‑parameter bias that arises because the group fixed effects are estimated alongside the structural parameters. The authors derive a first‑order bias expansion for the NPL estimator in a setting where both the number of groups G and the group size n grow to infinity while their ratio remains bounded. Standard split‑panel jackknife or split‑sample corrections are infeasible because the CCPs depend on the entire group, so the paper adopts a parametric bootstrap approach. Using the preliminary NPL‑C‑Lasso estimates, synthetic datasets are generated under the assumed logistic error distribution; the full three‑step estimation is repeated on each bootstrap sample, and the bias is estimated as the difference between the bootstrap mean and the original estimate. Subtracting this bias yields a debiased estimator that is shown to be √(Gn)‑consistent and asymptotically normal. The bootstrap also provides confidence intervals with asymptotically correct coverage.
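The bias-correction logic (simulate from the fitted model, re-estimate, subtract the average discrepancy) can be sketched on a toy problem. The example below is not the paper's model: it uses the maximum-likelihood variance estimator for normal data, whose bias is known to be −σ²/n, so the bootstrap estimate of the bias can be checked. The helper `bootstrap_bias_correct` and its arguments are hypothetical names.

```python
import numpy as np


def bootstrap_bias_correct(theta_hat, simulate, estimate, n_boot, rng):
    """Parametric bootstrap bias correction.

    simulate(theta, rng) draws a synthetic dataset from the model at theta;
    estimate(data) re-runs the estimator on that dataset.
    Returns the corrected estimate and the estimated bias.
    """
    boot = np.array([estimate(simulate(theta_hat, rng)) for _ in range(n_boot)])
    bias_hat = boot.mean(axis=0) - theta_hat  # bootstrap mean minus original
    return theta_hat - bias_hat, bias_hat


n = 20
rng = np.random.default_rng(1)
y = rng.normal(loc=0.0, scale=2.0, size=n)


def estimate(data):
    # MLE of the variance (divides by n), biased downward by sigma^2 / n
    return np.var(data)


def simulate(sigma2, rng):
    # The variance estimator is location-invariant, so the mean is set to 0
    return rng.normal(0.0, np.sqrt(sigma2), size=n)


theta_hat = estimate(y)
theta_bc, bias_hat = bootstrap_bias_correct(theta_hat, simulate, estimate, 2000, rng)
```

In the setting described above, `estimate` would stand for the full three-step procedure and `simulate` for drawing binary outcomes from the fitted model, which makes each bootstrap replication far more expensive than in this toy case.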
Theoretical results are extensive. The authors prove (1) classification consistency: as G,n→∞, C‑Lasso recovers the true cluster memberships with probability approaching one, provided the between‑cluster parameter distance exceeds a certain threshold; (2) asymptotic normality of the post‑classification NPL estimator for the common parameters within each cluster; (3) validity of the bootstrap bias correction and the resulting confidence intervals. Computationally, the decoupling of C‑Lasso from the NPL updates reduces the burden of solving a large non‑convex problem, because C‑Lasso operates on the relatively low‑dimensional CCP vectors rather than on the full likelihood surface.
Monte‑Carlo experiments explore a variety of designs: different numbers of latent clusters (2–4), varying cluster sizes, and differing magnitudes of fixed effects. The simulations confirm that (i) the algorithm accurately selects the correct number of clusters and correctly classifies groups; (ii) the bias‑corrected estimator removes the incidental‑parameter bias, in line with the √(Gn) rate established in the theory; (iii) bootstrap confidence intervals achieve nominal coverage even in modest sample sizes.
The empirical application uses the National Longitudinal Study of Adolescent to Adult Health (Add Health) to examine risky behaviors (e.g., drinking, smoking) among high‑school students. Applying the proposed method to roughly 100 schools, the algorithm discovers two latent clusters. In the larger‑school cluster, the peer‑effect coefficient is positive and statistically significant, indicating that a higher prevalence of risky behavior among peers raises an individual’s likelihood of engaging in the same behavior. In the smaller‑school cluster, the peer effect is not distinguishable from zero. These findings illustrate that ignoring latent heterogeneity could mask important peer influences that are confined to particular subpopulations.
In sum, the paper makes three substantive contributions to the literature on social interactions and panel data models: (1) it introduces a flexible latent‑cluster specification that bridges the gap between fully homogeneous and fully heterogeneous specifications; (2) it integrates C‑Lasso with NPL to achieve consistent cluster detection and efficient parameter estimation; (3) it provides a theoretically justified bootstrap bias correction for fixed‑effect models in binary‑choice games. The methodology applies to settings where agents interact within groups and where group‑level heterogeneity is suspected but not directly observable, giving researchers a way to uncover peer effects that differ across latent clusters.