Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies
We study one of the most popular problems in neurosymbolic learning (NSL), that of learning neural classifiers given only the result of applying a symbolic component $σ$ to the gold labels of the elements of a vector $\mathbf x$. The gold labels of the elements in $\mathbf x$ are unknown to the learner. We make multiple contributions, theoretical and practical, to address a problem that has not been studied so far in this context: characterizing and mitigating learning imbalances, i.e., major differences in the errors that occur when classifying instances of different classes (also known as class-specific risks). Our theoretical analysis reveals a unique phenomenon: $σ$ can greatly impact learning imbalances. This result sharply contrasts with previous research on supervised and weakly supervised learning, which studies learning imbalances only under data imbalances. On the practical side, we introduce a technique for estimating the marginal of the hidden gold labels using weakly supervised data. We then introduce algorithms that mitigate imbalances at training and testing time by treating the marginal of the hidden labels as a constraint. We demonstrate the effectiveness of our techniques against strong baselines from NSL and long-tailed learning, with performance improvements of up to 14%.
💡 Research Summary
The paper addresses a previously unexplored aspect of Neurosymbolic Learning (NSL), namely the emergence of class‑specific learning imbalances when training neural classifiers under the NeSy setting. In this setting, a learner receives only a vector of inputs x = (x₁,…,x_M) and a weak label s that results from applying a known symbolic function σ to the hidden gold labels y₁,…,y_M. The gold labels are never observed during training. While prior work on supervised and weakly‑supervised learning has focused on class imbalance caused by skewed data distributions, the authors demonstrate that the symbolic component σ itself can induce substantial disparities in per‑class risk, even when the hidden or weak label distributions are uniform.
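The setting above can be sketched with a toy instantiation. The snippet below is a minimal illustration, not the paper's benchmark: it assumes σ is integer addition over M = 2 hidden digit labels (an MNIST-addition-style task), and shows how the learner sees only s = σ(y₁,…,y_M). It also makes the paper's key observation concrete: even when the hidden labels are uniform, σ skews the weak-label distribution, since different weak labels are consistent with very different numbers of hidden label tuples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma(labels):
    """Known symbolic component: maps hidden gold labels to a weak label.
    Here we assume integer addition, as in MNIST-addition-style tasks."""
    return sum(labels)

M = 2            # elements per input vector x
num_classes = 10 # hidden label space, e.g. digits 0..9

# Hidden gold labels y1, ..., yM (never shown to the learner).
y = rng.integers(0, num_classes, size=M)

# The learner observes only the weak label s = sigma(y).
s = sigma(y)

# Even with uniform hidden labels, sigma induces imbalance:
# count how many hidden label pairs map to each weak label.
counts = {t: sum(1 for a in range(num_classes)
                   for b in range(num_classes) if a + b == t)
          for t in range(2 * num_classes - 1)}
# s = 9 is consistent with 10 pairs, while s = 0 or s = 18 with only one.
```

The `counts` dictionary makes the asymmetry visible: under uniform hidden labels, weak labels near the middle of the range are far more likely than extreme ones, which is the kind of σ-induced imbalance the paper characterizes.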
Theoretical Contributions
The authors formalize the problem by introducing the class‑conditional confusion matrix H(f), where H(f)_{i,j} = P(f(x) = j | y = i) is the probability that the classifier f predicts class j on an instance whose gold label is i; the per‑class risks are determined by the off‑diagonal mass of the corresponding rows.