Epsilon-Lexicase Selection for Regression
Lexicase selection is a parent selection method that considers test cases separately, rather than in aggregate, when performing parent selection. It performs well in discrete error spaces but not on the continuous-valued problems that compose most system identification tasks. In this paper, we develop a new form of lexicase selection for symbolic regression, named epsilon-lexicase selection, that redefines the pass condition for individuals on each test case in a more effective way. We run a series of experiments on real-world and synthetic problems with several treatments of epsilon and quantify how epsilon affects parent selection and model performance. epsilon-lexicase selection is shown to be effective for regression, producing better fit models compared to other techniques such as tournament selection and age-fitness Pareto optimization. We demonstrate that epsilon can be adapted automatically for individual test cases based on the population performance distribution. Our experiments show that epsilon-lexicase selection with automatic epsilon produces the most accurate models across tested problems with negligible computational overhead. We show that behavioral diversity is exceptionally high in lexicase selection treatments, and that epsilon-lexicase selection makes use of more fitness cases when selecting parents than lexicase selection, which helps explain the performance improvement.
💡 Research Summary
The paper addresses a notable shortcoming of standard lexicase selection when applied to continuous‑valued symbolic regression problems. While lexicase selection has proven effective for discrete or “uncompromising” tasks—where an individual must be perfect on a test case to survive—it struggles in regression because exact elitism on any single case is rarely achieved. Consequently, only a single test case typically influences each parent selection event, leading to weak selective pressure, reduced behavioral diversity, and sub‑optimal model accuracy.
To overcome this limitation, the authors introduce ε‑lexicase selection, a simple yet powerful modification that relaxes the pass condition on each fitness case by allowing a tolerance ε. Four concrete definitions of ε are explored:
- εₑ (fixed, relative) – an individual passes a case if its error is below the best error on that case multiplied by (1 + εₑ).
- εᵧ (fixed, absolute) – an individual passes if the absolute difference between its prediction and the true target is below a fixed εᵧ.
- εₑλ (MAD‑based, relative) – uses the median absolute deviation (MAD) λ of the population's error distribution on the case; an individual passes if its error is below the best error plus λ.
- εᵧλ (MAD‑based, absolute) – an individual passes if its error is below λ itself.
The MAD‑based definitions automatically adapt ε to the current population’s performance on each case, eliminating the need for manual, problem‑specific tuning.
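The four pass conditions above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the function name, the dictionary keys, and the default tolerance values are all hypothetical, and the strict `<` comparison follows the "less than" wording of the definitions above.

```python
import numpy as np

def pass_masks(case_errors, eps_e=0.1, eps_y=0.1):
    """Boolean masks saying which individuals 'pass' a single test case
    under each of the four epsilon definitions (names are illustrative).
    case_errors: (P,) array of absolute errors |y - yhat| on one case."""
    best = case_errors.min()
    # median absolute deviation (MAD) of the population's errors on this case
    lam = np.median(np.abs(case_errors - np.median(case_errors)))
    return {
        "eps_e":     case_errors < best * (1 + eps_e),  # fixed, relative to best error
        "eps_y":     case_errors < eps_y,               # fixed, absolute tolerance on |y - yhat|
        "eps_e_lam": case_errors < best + lam,          # adaptive: best error plus MAD
        "eps_y_lam": case_errors < lam,                 # adaptive: below MAD itself
    }
```

Note how the two λ-based masks require no user-supplied tolerance at all: the threshold is recomputed from the current population's errors on each case, which is what makes the adaptive variants parameter-free.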
The algorithmic change is minimal: during the filtering step of lexicase, instead of discarding any individual whose error exceeds the best error, the algorithm discards only those whose error exceeds the chosen ε‑threshold. The authors discuss worst‑case time complexity (O(|P|²·N)) but show that in practice the overhead is negligible because filtering usually terminates early.
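A single selection event with the adaptive relative threshold (the εₑλ variant) might look like the sketch below. This is an assumed implementation for illustration only; the function name and the `<=` comparison (used so the case-elite always survives even when the MAD is zero) are my choices, not the paper's.

```python
import random
import numpy as np

def epsilon_lexicase_select(error_matrix, rng=random):
    """Return the index of one selected parent.
    error_matrix: (P, N) absolute errors of P individuals on N test cases.
    Uses the MAD-based relative pass condition: error <= best + lambda."""
    candidates = list(range(error_matrix.shape[0]))
    cases = list(range(error_matrix.shape[1]))
    rng.shuffle(cases)  # cases are considered in random order, as in lexicase
    for c in cases:
        if len(candidates) == 1:
            break  # filtering usually terminates early in practice
        errs = error_matrix[candidates, c]
        best = errs.min()
        # adaptive tolerance: median absolute deviation of the survivors' errors
        lam = np.median(np.abs(errs - np.median(errs)))
        # relaxed pass condition: keep anyone within lambda of the best error
        candidates = [i for i, e in zip(candidates, errs) if e <= best + lam]
    return rng.choice(candidates)
```

The only change from standard lexicase is the filtering line: replacing `e <= best + lam` with `e <= best` recovers the original elitist behavior, which is why the worst-case complexity is unchanged.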
Experimental evaluation spans six benchmark regression tasks: three synthetic (including the UBall5D problem) and three real‑world datasets (Boston housing, a chemical distillation tower, and a wind turbine model). Each problem is split 70%/30% into training and test sets, normalized, and run for 30 independent trials with a population of 1,000 individuals for up to 1,000 generations. The methods compared are: standard lexicase, tournament selection (size 2), random selection, age‑fitness Pareto (AFP) survival, and the four ε‑lexicase variants. Performance metrics include test‑set mean absolute error (MAE), generalization gap, and behavioral diversity measured as the number of unique error vectors across the population.
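The behavioral diversity metric described above is straightforward to compute: round each individual's error vector and count distinct rows. A small sketch, assuming the metric is reported as a fraction of the population (the function name and rounding precision are illustrative):

```python
import numpy as np

def behavioral_diversity(error_matrix, decimals=6):
    """Fraction of unique error vectors (rows) in a (P, N) error matrix,
    used as a proxy for behavioral diversity of the population."""
    rounded = np.round(error_matrix, decimals)     # tolerate float noise
    unique_rows = np.unique(rounded, axis=0)       # distinct behaviors
    return unique_rows.shape[0] / error_matrix.shape[0]
```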
Results show that the adaptive MAD‑based variants (εₑλ and εᵧλ) consistently achieve the lowest MAE across all problems, often improving over standard lexicase by 12 %–18 % on noisy real‑world data. They also maintain 2–3× higher behavioral diversity, indicating that more individuals survive because they excel on different subsets of cases. Computationally, ε‑lexicase incurs at most a 5 % increase in wall‑clock time relative to tournament selection, confirming that the added selectivity does not compromise efficiency. Fixed‑ε variants (εₑ, εᵧ) can perform well when ε is carefully tuned, but their performance is highly problem‑dependent, underscoring the advantage of the automatic λ‑based approach.
The authors conclude that ε‑lexicase selection restores the benefits of case‑wise filtering for continuous error spaces by softening the elitism requirement. The automatic adaptation of ε based on population statistics provides a robust, parameter‑free mechanism that scales to large numbers of test cases typical in symbolic regression. By preserving a richer set of partial solutions, ε‑lexicase enhances both exploration (through diversity) and exploitation (through focused case‑wise pressure), leading to more accurate and generalizable models.
Future work suggested includes extending ε‑lexicase to multi‑objective settings, integrating dynamic ε schedules that evolve during a run, and combining the method with other GP enhancements such as semantic operators or ensemble learning. The paper thus positions ε‑lexicase as a practical, low‑overhead improvement for GP practitioners tackling real‑world regression and system‑identification challenges.