Word-Centered Semantic Graphs for Interpretable Diachronic Sense Tracking
We propose an interpretable, graph-based framework for analyzing semantic shift in diachronic corpora. For each target word and time slice, we induce a word-centered semantic network that integrates distributional similarity from diachronic Skip-gram embeddings with lexical substitutability from time-specific masked language models. We identify sense-related structure by clustering the peripheral graph, align clusters across time via node overlap, and track change through cluster composition and normalized cluster mass. In an application study on a corpus of New York Times Magazine articles (1980–2017), we show that graph connectivity reflects polysemy dynamics and that the induced communities capture contrasting trajectories: event-driven sense replacement (trump), semantic stability with cluster over-segmentation effects (god), and gradual association shifts tied to digital communication (post). Overall, word-centered semantic graphs offer a compact and transparent representation for exploring sense evolution without relying on predefined sense inventories.
💡 Research Summary
The paper introduces a novel, interpretable framework for tracking semantic change over time by constructing word‑centered semantic graphs for each target word and time slice. For every time period, the authors combine two complementary signals: (1) distributional similarity obtained from a Skip‑gram Word2Vec model trained on that slice, and (2) lexical substitutability derived from a RoBERTa masked language model fine‑tuned on the same slice. The top‑k most similar words from the static embeddings (N_dist) and the top‑k predicted substitutes from the masked LM (N_sub) are merged to form the first‑layer neighborhood of the target word. The graph is then expanded layer‑by‑layer up to a configurable depth L, adding both distributional and substitution neighbors for each newly introduced node. Edges are labeled as either distributional (blue) or substitution‑based (yellow), and when both relations hold the edge is shown as substitution‑based, providing an immediate visual cue about the nature of each connection.
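The layered expansion described above can be illustrated with a minimal sketch. It assumes the two neighbor sources are already available as plain dictionaries mapping a word to a ranked neighbor list; the names `build_word_graph`, `dist_neighbors`, and `sub_neighbors` are hypothetical and do not come from the paper, and a single `k` is used for every layer, which may differ from the authors' settings.

```python
def build_word_graph(target, dist_neighbors, sub_neighbors, k=3, depth=2):
    """Expand a word-centered graph layer by layer (illustrative sketch).

    dist_neighbors / sub_neighbors: hypothetical dicts, word -> ranked neighbors.
    Returns a dict mapping frozenset({u, v}) -> "dist" or "sub".
    """
    edges = {}
    seen = {target}
    frontier = [target]
    for _ in range(depth):
        next_frontier = []
        for word in frontier:
            for label, table in (("dist", dist_neighbors), ("sub", sub_neighbors)):
                for neighbor in table.get(word, [])[:k]:
                    if neighbor == word:
                        continue
                    key = frozenset((word, neighbor))
                    # When both relations hold, the substitution label wins.
                    if label == "sub" or key not in edges:
                        edges[key] = label
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
        frontier = next_frontier
    return edges


# Toy neighbor lists, invented purely for illustration.
toy_dist = {"trump": ["card", "suit"], "card": ["deck"]}
toy_sub = {"trump": ["card", "beat"], "beat": ["defeat"]}
toy_edges = build_word_graph("trump", toy_dist, toy_sub, k=2, depth=2)
```

In the toy data, "card" is both a distributional neighbor and a substitute of "trump", so that edge ends up labeled as substitution-based, matching the precedence rule in the summary.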
To extract sense‑related structure, the central target node is removed, and the peripheral graph is partitioned into connected components. Each component is interpreted as a “sense community” because its members are connected to one another through paths of distributional and/or substitution edges, while different components are only linked via the central word. The number and size of components therefore serve as proxies for polysemy: many small components suggest a highly polysemous word, whereas a single large component indicates one cohesive, dominant meaning.
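As a rough sketch of this step, the following code drops the target node and collects the connected components of what remains, using a plain depth-first search. The function name `sense_communities` and the toy edge list are assumptions made for illustration, not the authors' code.

```python
def sense_communities(edges, target):
    """Remove the target node and return connected components of the rest.

    edges: iterable of (u, v) word pairs; edge labels are ignored here.
    Returns a list of node sets, largest first.
    """
    adjacency = {}
    for u, v in edges:
        if target in (u, v):
            # Keep the other endpoint as a node, but drop the edge itself.
            for word in (u, v):
                if word != target:
                    adjacency.setdefault(word, set())
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    components, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy graph, invented for illustration.
toy_pairs = [
    ("trump", "card"), ("trump", "suit"), ("card", "suit"),
    ("trump", "beat"), ("beat", "defeat"),
    ("trump", "donald"),
]
toy_components = sense_communities(toy_pairs, "trump")
```

Here "donald" is attached only to the target, so it becomes a singleton component once the target is removed, while the card-game and "defeat" neighbors each form their own community.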
Temporal alignment of senses is achieved through simple node‑overlap heuristics. For a component C_t^i at time t, the algorithm selects the component in the previous slice with the largest intersection as its predecessor. An extended version, AlignHist, looks back across all earlier slices to capture re‑emerging senses that may have disappeared temporarily. Components without any overlap are treated as new senses; those that persist for fewer than two slices are merged into a residual component representing infrequent or fleeting meanings. Sense usage distributions are then estimated by normalizing component sizes, yielding a probability distribution over senses that is comparable across time despite varying graph sizes.
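The previous-slice alignment and the size normalization can be sketched as follows. This is a simplified illustration under stated assumptions: components are plain Python sets, ties in overlap go to the earliest component (the summary does not specify tie-breaking), and the AlignHist variant and the residual-component merge are not implemented. The function names are hypothetical.

```python
def align_to_previous(current, previous):
    """For each current component, return the index of the previous-slice
    component with the largest node overlap, or None if nothing overlaps."""
    links = []
    for component in current:
        best_index, best_overlap = None, 0
        for j, candidate in enumerate(previous):
            overlap = len(component & candidate)
            # Strict '>' means ties go to the earliest candidate (an assumption).
            if overlap > best_overlap:
                best_index, best_overlap = j, overlap
        links.append(best_index)
    return links


def normalized_mass(components):
    """Turn component sizes into a distribution that sums to 1."""
    total = sum(len(c) for c in components)
    return [len(c) / total for c in components] if total else []


# Toy slices, invented for illustration.
previous_slice = [{"card", "suit", "deck"}, {"beat", "defeat"}]
current_slice = [{"card", "deck", "ace"}, {"president", "senator"}]
links = align_to_previous(current_slice, previous_slice)
masses = normalized_mass(current_slice)
```

A `None` link marks a component with no overlapping predecessor, which the summary treats as a new sense. Because the masses are normalized within each slice, they stay comparable even when the graphs differ in size.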
The methodology is evaluated on a diachronic corpus of New York Times Magazine articles spanning 1980–2017. Three target words (“trump”, “god”, and “post”) are examined to showcase distinct semantic trajectories. For “trump”, early graphs (1980) are dominated by card‑game terms (e.g., “diamond”, “heart”), forming a dense, single component. By the 1990s and especially the 2010s, political terms (“whitehouse”, “democrat”, “senator”) replace the earlier neighbors, and the original component dissolves while a new component emerges, illustrating an event‑driven sense replacement. The “god” graphs remain densely connected with consistently high cosine similarities (>0.90) across all periods, yet the peripheral clustering over‑segments the neighborhood into many small components, indicating that the graph is sensitive to minor fluctuations even when the underlying meaning is stable. Finally, “post” shows a gradual shift: early neighbors (“position”, “paper”) give way to digital‑communication terms (“share”, “socialmedia”) over the study period, reflecting a smooth association shift rather than an abrupt sense jump.
Overall, the study demonstrates that (i) graph connectivity patterns naturally encode polysemy, (ii) peripheral connected components provide an interpretable, corpus‑driven proxy for word senses without requiring predefined sense inventories, and (iii) simple overlap‑based alignment can track sense continuity and emergence across time. The approach offers a compact, visualizable representation of semantic change that is both transparent and adaptable to different languages or domains. Limitations include sensitivity to hyper‑parameters (k_i, k_c, L), potential scalability issues for very large vocabularies, and the reliance on a hard partition into connected components, which may miss nuanced sense overlaps. Suggested future work includes automatic hyper‑parameter tuning, graph compression techniques, and more sophisticated clustering that incorporates edge weights or probabilistic overlap to enhance robustness and scalability.