Efficient Natural Evolution Strategies

Efficient Natural Evolution Strategies (eNES) is a novel alternative to conventional evolutionary algorithms, using the natural gradient to adapt the mutation distribution. Unlike previous methods based on natural gradients, eNES uses a fast algorithm to calculate the inverse of the exact Fisher information matrix, thus increasing both robustness and performance of its evolution gradient estimation, even in higher dimensions. Additional novel aspects of eNES include optimal fitness baselines and importance mixing (a procedure for updating the population with very few fitness evaluations). The algorithm yields competitive results on both unimodal and multimodal benchmarks.

💡 Research Summary

Efficient Natural Evolution Strategies (eNES) introduces a fundamentally different way of adapting the mutation distribution in evolutionary algorithms by directly applying the natural gradient to the parameters of a multivariate Gaussian search distribution. Unlike earlier natural‑gradient based methods that rely on approximations of the Fisher information matrix (FIM) or on stochastic estimates, eNES computes the exact FIM for the mean vector μ and covariance matrix Σ and inverts it with a specialized O(d³) algorithm that exploits symmetry and block‑matrix structure. This exact inversion guarantees that the parameter update follows the true Riemannian geometry of the probability distribution, leading to more reliable and faster convergence, especially in higher‑dimensional spaces.

A second major contribution is the derivation of an optimal fitness baseline. By analytically minimizing the variance of the gradient estimator with respect to the baseline, eNES obtains a closed‑form expression that depends on the current gradient and covariance. This baseline is more effective than the commonly used fixed or sample‑mean baselines, reducing estimator noise and improving the stability of the update.

The third innovation, importance mixing, reuses individuals from previous generations. For each previously sampled individual x, the algorithm computes an importance weight w = p_new(x)/p_old(x). If w exceeds a predefined threshold, the fitness of x is retained without reevaluation; otherwise, a new fitness evaluation is performed. This mechanism dramatically cuts the number of expensive fitness evaluations—empirically by 30–50 %—while preserving the statistical correctness of the gradient estimate.

The eNES algorithm proceeds as follows: (1) initialise μ and Σ; (2) draw λ offspring from the current Gaussian; (3) apply importance mixing to decide which offspring need fresh evaluations; (4) compute the weighted fitnesses, the optimal baseline, and the ordinary gradient of the expected fitness; (5) construct the exact FIM and its inverse; (6) update μ and Σ using the natural gradient step μ←μ+η F⁻¹∇J, Σ←Σ+η F⁻¹∇J_Σ; and repeat. The authors provide pseudo‑code and discuss computational complexity, showing that the dominant cost is the O(d³) matrix inversion, which remains tractable up to several hundred dimensions.

Experimental evaluation covers a broad suite of benchmark functions, including unimodal (Sphere, Rosenbrock) and multimodal (Rastrigin, Ackley, Lunacek) problems, with dimensionalities ranging from 2 to 100. eNES is compared against state‑of‑the‑art methods such as CMA‑ES, xNES, and NES‑FD. Results demonstrate that eNES converges faster than competitors in high‑dimensional settings, largely because the exact natural gradient avoids the distortion introduced by approximate FIMs. In low‑dimensional cases, the optimal baseline and importance mixing yield a noticeable reduction in the number of fitness evaluations while maintaining comparable solution quality. On multimodal landscapes, eNES shows a lower propensity to become trapped in local optima, achieving higher success rates across random seeds.

The paper also analyses the sensitivity of eNES to hyper‑parameters such as learning rate η and the importance‑mixing threshold, finding that the method is relatively robust. A discussion of limitations notes that the O(d³) inversion may become prohibitive for problems with thousands of variables, suggesting future work on low‑rank approximations or stochastic FIM estimators.

In conclusion, eNES successfully combines three technical advances—exact Fisher matrix inversion, analytically optimal fitness baselines, and importance mixing—to deliver a natural‑gradient evolutionary algorithm that is both more accurate and more evaluation‑efficient than existing approaches. The empirical evidence supports its applicability to a wide range of optimisation tasks, and the authors outline promising directions for scaling the method to even larger problem sizes.