A Reactive Tabu Search Algorithm for Stimuli Generation in Psycholinguistics
The generation of meaningless “words” matching certain statistical and/or linguistic criteria is frequently needed for experimental purposes in psycholinguistics. Such stimuli are known as pseudowords or nonwords in the cognitive neuroscience literature. The process for building nonwords sometimes has to be based on linguistic units such as syllables or morphemes, resulting in a combinatorial explosion as the size of the nonwords increases. In this paper, a reactive tabu search scheme is proposed to generate nonwords of variable size. The approach builds pseudowords using a modified metaheuristic algorithm based on a local search procedure enhanced by a feedback-based scheme. Experimental results show that the new algorithm is a practical and effective tool for nonword generation.
💡 Research Summary
The paper addresses a practical problem in psycholinguistic experimentation: the need to generate large sets of meaningless “words” (pseudowords or nonwords) that satisfy specific statistical and linguistic constraints. Such stimuli are essential for controlling lexical variables while probing language processing mechanisms. Traditional approaches—rule‑based filtering, simple random sampling, or exhaustive enumeration—quickly become infeasible as the length of the target nonwords grows, because the number of possible syllable or morpheme combinations explodes combinatorially. Consequently, researchers often spend considerable time manually curating stimulus lists, and the resulting sets may still violate desired constraints (e.g., syllable frequency distributions, consonant‑vowel ratios, phonotactic probabilities).
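To give a sense of scale, the short sketch below counts candidate strings under a simplifying assumption: every position can be filled independently by any syllable from an inventory of fixed size. The inventory size of 1,000 and the function name `search_space_size` are hypothetical choices for illustration, not values or names reported in the paper.

```python
def search_space_size(inventory_size: int, n_syllables: int) -> int:
    """Number of syllable strings of a given length, assuming every
    syllable may fill every position (no phonotactic filtering)."""
    return inventory_size ** n_syllables


# Hypothetical inventory of 1,000 syllables (an assumption for illustration).
for n in range(4, 9):
    print(f"{n} syllables: {search_space_size(1000, n):.1e} candidates")
```

Under this assumption each added syllable multiplies the number of candidates by the inventory size, which is why exhaustive enumeration stops being an option after only a few syllables.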
To overcome these limitations, the authors propose a Reactive Tabu Search (RTS) algorithm, a meta‑heuristic that builds on the classic Tabu Search framework but adds a dynamic feedback component. Tabu Search is a local‑search technique that prevents cycling by maintaining a “tabu list” of recently visited solutions; this encourages exploration of new regions of the search space. In the reactive variant, the algorithm monitors the progress of the search and automatically adjusts key parameters—most notably the size of the tabu list and the intensity of diversification—based on observed stagnation or rapid improvement. When the search becomes trapped (e.g., the same solution is repeatedly selected), the tabu tenure is increased, forcing the algorithm to move farther away from the current region. Conversely, when improvement is steady, the tenure is reduced to allow finer‑grained exploitation.
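The feedback rule described above can be sketched as a small helper that tracks visited solutions and adjusts the tenure. This is a minimal illustration, not the authors' implementation: the class name `ReactiveTenure`, the step size of one, the bounds, and the `calm_period` parameter are all assumptions, and "steady progress" is approximated here by "no repeated solution for a while".

```python
class ReactiveTenure:
    """Hypothetical helper: grows the tabu tenure when a solution is
    revisited and shrinks it after a calm period without revisits."""

    def __init__(self, initial=5, minimum=1, maximum=50, calm_period=20):
        self.tenure = initial
        self.minimum = minimum
        self.maximum = maximum
        self.calm_period = calm_period  # iterations without repeats before shrinking
        self.seen = set()               # every solution visited so far
        self.last_change = 0            # iteration of the last tenure change

    def observe(self, solution, iteration):
        """Record a visited solution and return the updated tenure."""
        if solution in self.seen:
            # Revisit detected: the search is cycling, so lengthen the tenure.
            self.tenure = min(self.maximum, self.tenure + 1)
            self.last_change = iteration
        elif iteration - self.last_change > self.calm_period:
            # No revisit for a while: shorten the tenure to allow finer search.
            self.tenure = max(self.minimum, self.tenure - 1)
            self.last_change = iteration
        self.seen.add(solution)
        return self.tenure


reactor = ReactiveTenure(initial=5, calm_period=20)
print(reactor.observe("ba-ko", 0))   # first visit
print(reactor.observe("ba-ko", 1))   # repeated visit
print(reactor.observe("ne-la", 30))  # new solution after a calm period
```

Solutions only need to be hashable, so the same helper works whether a nonword is stored as a string or as a tuple of syllables.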
The RTS procedure for nonword generation proceeds as follows: (1) a syllable/morpheme inventory is compiled from a language‑specific corpus; (2) an initial population of candidate strings is created either randomly or by applying simple linguistic heuristics; (3) each candidate is evaluated by a composite cost function that aggregates violations of the imposed constraints (frequency deviation, phonotactic legality, similarity to real words, etc.); (4) a neighborhood is defined by elementary edit operations—substituting, inserting, or deleting a single syllable; (5) the best neighbor that is not tabu is selected, the current solution is added to the tabu list, and the cost is updated; (6) the algorithm checks for stagnation and reacts by adjusting tabu tenure; (7) the process repeats until a predefined number of valid nonwords are found or a computational budget is exhausted.
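The seven steps above can be sketched as a compact loop. This is a toy illustration under loudly stated assumptions, not the paper's algorithm: the syllable inventory, the lexicon, the cost terms, and the reaction rule are invented for the example; the search starts from a single random string instead of an initial population; and nonwords already collected are excluded from later moves so that each accepted item is new.

```python
import random
from collections import deque

# Hypothetical inventory and lexicon (assumptions for illustration only).
SYLLABLES = ["ba", "ko", "mi", "tu", "ne", "la", "si", "po"]
LEXICON = {("ba", "la"), ("mi", "ne", "la")}  # stand-ins for real words


def cost(word, target_len):
    """Toy composite cost: length deviation, adjacent repeated syllables,
    and a penalty for matching a lexicon entry. Zero means valid."""
    penalty = abs(len(word) - target_len)
    penalty += sum(1 for a, b in zip(word, word[1:]) if a == b)
    if word in LEXICON:
        penalty += 5
    return penalty


def neighbors(word):
    """All strings one edit away: substitute, delete, or insert a syllable."""
    out = []
    for i in range(len(word)):
        out += [word[:i] + (s,) + word[i + 1:] for s in SYLLABLES if s != word[i]]
        if len(word) > 1:
            out.append(word[:i] + word[i + 1:])
    for i in range(len(word) + 1):
        out += [word[:i] + (s,) + word[i:] for s in SYLLABLES]
    return out


def reactive_tabu_search(target_len=4, wanted=5, max_iters=500, seed=0):
    rng = random.Random(seed)
    current = tuple(rng.choice(SYLLABLES) for _ in range(target_len))
    tenure, tabu, seen, found = 5, deque(), {current}, []
    for _ in range(max_iters):
        if cost(current, target_len) == 0 and current not in found:
            found.append(current)          # collect a valid nonword
            if len(found) >= wanted:
                break
        candidates = [n for n in neighbors(current)
                      if n not in tabu and n not in found]
        if not candidates:
            tabu.clear()                   # everything is forbidden: reset
            continue
        rng.shuffle(candidates)            # random tie-breaking
        best = min(candidates, key=lambda w: cost(w, target_len))
        tabu.append(current)               # forbid returning here for a while
        # Reaction: lengthen the tenure on a revisit, otherwise let it decay.
        tenure = min(tenure + 2, 50) if best in seen else max(tenure - 1, 2)
        seen.add(best)
        while len(tabu) > tenure:
            tabu.popleft()                 # drop the oldest tabu entries
        current = best
    return found


for nonword in reactive_tabu_search():
    print("-".join(nonword))
```

A realistic cost function would replace the toy penalties with the constraint terms named in step (3), such as frequency deviation and phonotactic legality, computed from a language-specific corpus.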
The experimental evaluation focuses on English‑language stimuli of lengths 4 to 8 syllables, with three distinct constraint sets: (a) matching a target syllable‑frequency distribution, (b) preserving a prescribed consonant‑vowel ratio, and (c) limiting phonological similarity to existing lexical items. The RTS method is benchmarked against three baselines: pure random sampling, a conventional (non‑reactive) Tabu Search, and a Genetic Algorithm (GA) tailored for the same problem. Performance metrics include (i) the total number of valid nonwords generated within a fixed time window, (ii) the proportion of generated items that satisfy all constraints, (iii) average computational time per valid item, and (iv) convergence behavior (iterations to reach a stable solution set).
Results demonstrate that RTS consistently outperforms the baselines. For the 6‑syllable condition, RTS produced roughly 35 % more valid nonwords than random sampling and 20 % more than the static Tabu Search, while maintaining a constraint‑satisfaction rate above 92 %. The reactive adjustment of tabu tenure proved crucial: during periods of stagnation, the algorithm automatically expanded its tabu list, which forced exploration of previously unvisited regions and reduced the number of iterations spent cycling around local minima. Compared with the GA, RTS required about 20 % less CPU time per valid stimulus, highlighting its suitability for real‑time stimulus generation where researchers may need to generate hundreds of items on the fly.
The authors discuss several implications. First, the integration of feedback‑driven parameter tuning into a meta‑heuristic offers a general strategy for tackling combinatorial generation problems with complex, multi‑dimensional constraints. Second, automating nonword creation can dramatically reduce the labor and expertise required to assemble experimental stimulus sets, thereby increasing reproducibility across labs. Third, while the current implementation focuses on syllable‑level constraints, the framework is extensible to incorporate phonetic features (e.g., sonority sequencing), morphological rules, or even semantic plausibility scores derived from modern language models. Finally, the authors suggest future work that could combine RTS with neural language models to generate nonwords that are not only statistically plausible but also acoustically natural, and to test the algorithm on typologically diverse languages where syllable inventories and phonotactic rules differ substantially.
In summary, the Reactive Tabu Search algorithm provides a practical, efficient, and scalable solution for generating constrained nonwords in psycholinguistic research. By dynamically adapting its search strategy based on real‑time feedback, it overcomes the combinatorial explosion inherent in syllable‑based stimulus generation, delivers higher quality stimulus sets, and opens avenues for broader applications in cognitive science, computational linguistics, and experimental psychology.