Manipulating language models' training data to study syntactic constraint learning: the case of English passivization

Notice: This research summary and analysis were automatically generated using AI technology. For complete accuracy, please refer to the original arXiv source.

Grammatical rules in natural languages are often characterized by exceptions. How do language learners learn these exceptions to otherwise general patterns? Here, we study this question through the case study of English passivization. While passivization is in general quite productive, there are cases where it cannot apply (cf. the following sentence is ungrammatical: *One hour was lasted by the meeting). Using neural network language models as theories of language acquisition, we explore the sources of indirect evidence that a learner can leverage to learn whether a verb can be passivized. We first characterize English speakers’ judgments of exceptions to the passive, and confirm that speakers find some verbs more passivizable than others. We then show that a neural network language model’s verb passivizability judgments are largely similar to those displayed by humans, suggesting that evidence for these exceptions is available in the linguistic input. Finally, we test two hypotheses as to the source of evidence that language models use to learn these restrictions: frequency (entrenchment) and semantics (affectedness). We do so by training models on versions of the corpus that have had sentences of the types implicated by each hypothesis removed, altered, or introduced. We find support for both hypotheses: entrenchment and affectedness make independent contributions to a verb’s passivizability. From a methodological point of view, this study highlights the utility of altering a language model’s training data for answering questions where complete control over a learner’s input is vital.


💡 Research Summary

This paper investigates how language learners acquire the exceptions to a generally productive syntactic rule—English passivization—by using neural network language models as computational analogues of human acquisition. While most transitive verbs can appear in the passive voice, a small set (e.g., “last”, “cost”, “emit”) resists passivization, creating a classic “Baker’s Paradox”: learners must distinguish forms that are ungrammatical from those that are merely unattested. The authors formulate two competing sources of indirect evidence that could guide this learning. The “entrenchment hypothesis” posits that learners rely on the statistical distribution of a verb across constructions; a verb that never appears in the passive but frequently occurs in active contexts is inferred to be unpassivizable. The “affectedness hypothesis” argues that lexical semantics—specifically whether the verb’s theme undergoes a change of state, location, or existence (i.e., is “affected”)—drives the restriction. Because low‑affectedness verbs are also rare in the passive, the two factors are highly correlated in natural corpora, making causal inference difficult.

To disentangle them, the authors treat large‑scale neural language models as controlled learners whose training data can be manipulated. They conduct three sets of experiments. Experiment 1A collects acceptability judgments from native speakers on 140 sentence pairs (active vs. passive) covering 28 verbs (10 control, 18 “critical” verbs known from the literature to be unpassivizable). Results confirm that speakers rate passive versions of critical verbs as significantly less acceptable, and reveal fine‑grained gradients across verb classes. Experiment 1B trains a transformer‑based language model on roughly 100 million words of English. By comparing model‑derived probability differences for each active‑passive pair with human ratings, they obtain a Pearson correlation of about 0.9, far exceeding that of simple frequency‑based baselines. This demonstrates that the model can acquire human‑like judgments from realistic input.

Experiment 2A tests the entrenchment hypothesis by creating a counterfactual corpus in which the passive occurrences of selected verbs are dramatically reduced while active frequencies are kept unchanged. Models trained on this altered corpus assign lower passive acceptability to those verbs, confirming that passive frequency is a causal learning signal. Experiment 2B addresses the affectedness hypothesis: the authors replace the typical arguments of an unpassivizable verb with “affected” arguments (e.g., objects that normally undergo a state change), thereby inflating the proportion of affected contexts for that verb. Models exposed to this manipulation increase the passive acceptability of the target verb, indicating that semantic context independently informs the model’s judgments.
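A minimal sketch of the kind of corpus filter Experiment 2A requires: drop sentences containing a passive of a target verb while leaving its actives untouched. This toy version spots passives with a crude "be + past participle" regex over hypothetical target verbs; the paper's actual pipeline is not specified here and would need real parsing.

```python
import re

# Hypothetical target verbs, keyed by past participle.
TARGET_PARTICIPLES = {"broken", "built"}

# Crude passive detector: a form of "be" followed by one word.
# Real corpus manipulation would need syntactic annotation.
PASSIVE_RE = re.compile(r"\b(?:is|are|was|were|been|being|be)\s+(\w+)\b")

def keep_sentence(sentence: str) -> bool:
    """Return False iff the sentence looks like a passive of a target verb."""
    m = PASSIVE_RE.search(sentence.lower())
    return not (m and m.group(1) in TARGET_PARTICIPLES)

corpus = [
    "The vase was broken by the cat.",   # passive of target -> removed
    "The cat broke the vase.",           # active -> kept
    "The meeting lasted one hour.",      # irrelevant -> kept
]
filtered = [s for s in corpus if keep_sentence(s)]
```

The key property is asymmetry: only the passive frequency of the chosen verbs changes, so any shift in the trained model's judgments can be attributed to that signal.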

Experiment 3 introduces a novel verb that initially appears only in active sentences. The authors systematically vary two dimensions: (i) the overall active‑to‑passive ratio (frequency) and (ii) the proportion of its arguments that are semantically affected. By training separate models on each combination, they assess whether the two factors interact. The results show additive, but not interactive, effects: both higher passive frequency and higher affectedness raise the model’s passive acceptability, yet there is no evidence that one amplifies the other.
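The two dimensions varied in Experiment 3 can be illustrated with a toy generator that emits training sentences for a nonce verb, parameterized by the passive ratio and by the share of semantically affected objects. The nonce verb, templates, and object lists below are invented for illustration; the study's actual stimuli will differ.

```python
import random

# Illustrative object pools (not the paper's stimuli):
AFFECTED = ["the window", "the cake"]    # themes that undergo a change
UNAFFECTED = ["the answer", "the view"]  # themes that do not

def make_sentences(verb, n, passive_ratio, affected_ratio, seed=0):
    """Generate n sentences for a nonce verb, independently sampling
    voice (passive_ratio) and theme affectedness (affected_ratio)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        obj = rng.choice(AFFECTED if rng.random() < affected_ratio
                         else UNAFFECTED)
        if rng.random() < passive_ratio:
            out.append(f"{obj} was {verb}ed by the girl .")
        else:
            out.append(f"the girl {verb}ed {obj} .")
    return out

# One cell of the design grid: some passives, all-affected themes.
sents = make_sentences("daxx", n=8, passive_ratio=0.25, affected_ratio=1.0)
```

Training a separate model per (passive_ratio, affected_ratio) cell and probing each one is what lets the additive vs. interactive question be asked at all.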

Across all experiments, the findings converge on three main conclusions: (1) human speakers exhibit graded, verb‑specific judgments about passivizability; (2) large neural language models can learn these judgments from naturalistic input, achieving near‑human correlation; (3) both entrenchment (frequency) and affectedness (semantics) contribute independently to the acquisition of passivization restrictions. Moreover, the study demonstrates a powerful methodological tool—manipulating a learner’s training data—to test causal hypotheses about language acquisition that are otherwise infeasible with human participants. This approach offers a concrete way to resolve “Baker’s Paradox” and suggests that learners exploit multiple, partially redundant cues in the input to infer grammatical exceptions.

