evortran: a modern Fortran package for genetic algorithms with applications from LHC data fitting to LISA signal reconstruction
evortran is a modern Fortran library designed for high-performance genetic algorithms and evolutionary optimization. evortran can be used to tackle a wide range of problems in high-energy physics and beyond, such as derivative-free parameter optimization, complex search tasks, parameter scans and fitting experimental data in the presence of instrumental noise. The library is built as an fpm package with flexibility and efficiency in mind, while also offering a simple installation process, user interface and integration into existing Fortran (or Python) programs. evortran offers a variety of selection, crossover, mutation and elitism strategies, with which users can tailor an evolutionary algorithm to their specific needs. evortran supports different abstraction levels: from operating directly on individuals and populations, to running full evolutionary cycles, and even enabling migration between independently evolving populations to enhance convergence and maintain diversity. In this paper, we present the functionality of the evortran library, demonstrate its capabilities with example benchmark applications, and compare its performance with existing genetic algorithm frameworks. As physics-motivated applications, we use evortran to confront extended Higgs sectors with LHC data and to reconstruct gravitational wave spectra and the underlying physical parameters from LISA mock data, demonstrating its effectiveness in realistic, data-driven scenarios.
💡 Research Summary
The paper introduces evortran, a modern genetic‑algorithm (GA) library written in Fortran and distributed as an fpm package. The authors motivate the work by pointing out that, while many GA frameworks exist in Python (DEAP, PyGAD), C/C++ (CMA‑ES, GAUL, EO) and even Fortran (Pikaia), none combine the low‑level performance of modern Fortran with a clean, modular API and straightforward Python interoperability. evortran therefore targets high‑performance scientific computing where fitness evaluations are expensive, dimensionalities are high, and parallel resources are available.
Core design – The library defines two derived types, individual_integer and individual_float, each containing the genome length, the number of possible alleles (base_pairs), an allocatable array of genes, a procedure pointer to a user‑supplied fitness function, a cached fitness value, and a logical flag indicating whether the cached fitness matches the current genome. Populations are represented by a derived type that holds an array of individuals, the current generation number, and bookkeeping fields for the best individual. The API is deliberately layered: users may manipulate raw gene arrays, operate on whole populations via high‑level procedures, or run full evolutionary cycles that include optional migration between several populations.
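The cached-fitness idea described above can be illustrated with a minimal Python sketch. This is not evortran's API: the class `Individual`, its fields, and the `sphere` test function are hypothetical names chosen for illustration, and the real library implements the equivalent as Fortran derived types.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Individual:
    """Hypothetical analogue of an individual that caches its fitness."""
    genes: List[float]
    fitness_func: Callable[[List[float]], float]
    _fitness: Optional[float] = field(default=None, init=False)
    _fitness_valid: bool = field(default=False, init=False)
    evaluations: int = field(default=0, init=False)  # number of real evaluations

    def set_gene(self, index: int, value: float) -> None:
        # Changing the genome invalidates the cached fitness.
        self.genes[index] = value
        self._fitness_valid = False

    def fitness(self) -> float:
        # Call the (possibly expensive) fitness function only when the
        # cached value no longer matches the current genome.
        if not self._fitness_valid:
            self._fitness = self.fitness_func(self.genes)
            self._fitness_valid = True
            self.evaluations += 1
        return self._fitness


def sphere(genes: List[float]) -> float:
    """Simple test fitness: sum of squared genes (assumed, for illustration)."""
    return sum(g * g for g in genes)
```

The validity flag matters most when a fitness evaluation is expensive: repeated queries between mutations then cost nothing.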
Genetic operators – evortran implements a rich set of selection schemes (roulette‑wheel, tournament of arbitrary size, rank‑based, stochastic‑linear, and user‑defined), crossover methods (single‑point, two‑point, uniform, block, cyclic), mutation strategies (bit‑flip for integer genomes, Gaussian or uniform perturbations for floating‑point genomes), and elitism policies (fixed‑size or proportion‑based preservation). All operators are configurable through simple arguments (probabilities, crossover points, mutation scales) and can be swapped out for custom procedures, making the framework highly extensible.
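As a rough illustration of three of the operator families named above, the following Python sketch shows textbook versions of tournament selection, single-point crossover and Gaussian mutation. The function names, argument lists and the clipping of mutated genes to a `[lower, upper]` interval are assumptions made for this example and do not reflect evortran's actual Fortran interface. Lower fitness is treated as better.

```python
import random


def tournament_select(population, fitnesses, size, rng):
    """Pick `size` distinct contestants at random and return the fittest."""
    contestants = rng.sample(range(len(population)), size)
    best = min(contestants, key=lambda i: fitnesses[i])  # minimisation
    return population[best]


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length genomes at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)  # requires length >= 2
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def gaussian_mutation(genes, rate, scale, rng, lower=0.0, upper=1.0):
    """Perturb each gene with probability `rate` by N(0, scale), then clip.

    The clipping bounds are an assumption of this sketch.
    """
    mutated = []
    for g in genes:
        if rng.random() < rate:
            g = min(upper, max(lower, g + rng.gauss(0.0, scale)))
        mutated.append(g)
    return mutated
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is useful when comparing operator choices.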
Migration & parallelism – A distinctive feature is the migration subsystem. Multiple populations can evolve independently and, at user‑defined intervals, exchange a fraction of individuals according to a selectable migration policy. This mitigates premature convergence and is especially useful for exploring disjoint viable regions in large BSM parameter spaces. Parallel execution is achieved with OpenMP directives. The most costly steps, fitness evaluation and mutation, are wrapped in !$omp parallel do loops; the authors report a three‑fold reduction in wall‑time on an 8‑core workstation compared with a serial run, which corresponds to a parallel efficiency of roughly 38 %.
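One possible migration policy is a ring topology in which each population sends copies of its best individuals to its neighbour. The Python sketch below shows that idea only; the function name `migrate_ring`, the ring topology and the replace-the-worst rule are assumptions of this example, not a description of the policies evortran ships.

```python
def migrate_ring(populations, fitness, n_migrants):
    """Ring migration: population k receives the best individuals of k-1.

    `populations` is a list of populations, each a list of genomes.
    `fitness` maps a genome to a number (lower is better).
    The `n_migrants` worst individuals of each population are replaced.
    """
    # Choose all emigrants first so that migration is synchronous.
    emigrants = [sorted(pop, key=fitness)[:n_migrants] for pop in populations]

    new_populations = []
    for k, pop in enumerate(populations):
        incoming = emigrants[(k - 1) % len(populations)]
        survivors = sorted(pop, key=fitness)[:len(pop) - n_migrants]
        # Copy the migrants so populations never share genome objects.
        new_populations.append(survivors + [list(ind) for ind in incoming])
    return new_populations
```

Population sizes are preserved, and each source population keeps its own best individuals because migrants are copied rather than moved.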
Python interface – Using f2py, a lightweight wrapper (evortran_py) exposes the main evolution routine and population handling to Python. NumPy arrays are mapped directly to the Fortran gene buffers, allowing seamless integration into existing Python data‑analysis pipelines without sacrificing the performance of native Fortran code.
Benchmarking – The library is benchmarked against Pikaia (Fortran), DEAP (Python) and PyGAD (Python) on classic test functions: Rastrigin (30‑dimensional), Michalewicz (10‑dimensional), Himmelblau (2‑dimensional) and Drop‑Wave (2‑dimensional). With identical population sizes (200) and generations (50), evortran achieves 15–25 % faster convergence than Pikaia and 12–22 % better performance than the Python libraries, while OpenMP parallelism further accelerates the run by a factor of ≈3 on 8 cores.
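For reference, three of the test functions named above have standard closed forms, sketched here in Python. These are the usual textbook definitions, not code taken from evortran or from the paper's benchmark setup.

```python
import math


def rastrigin(x):
    """Rastrigin function in len(x) dimensions; global minimum 0 at x = 0."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def himmelblau(x, y):
    """Himmelblau function; four global minima with value 0, e.g. (3, 2)."""
    return (x * x + y - 11.0) ** 2 + (x + y * y - 7.0) ** 2


def drop_wave(x, y):
    """Drop-Wave function; global minimum -1 at the origin."""
    r2 = x * x + y * y
    return -(1.0 + math.cos(12.0 * math.sqrt(r2))) / (0.5 * r2 + 2.0)
```

All three are highly multimodal, which is why they are common stress tests for derivative-free global optimisers.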
Physics applications – Two realistic case studies demonstrate the library’s relevance to high‑energy and gravitational‑wave physics.
- Extended Higgs sector: The authors construct a χ² fitness function that compares theoretical predictions of a multi‑parameter Higgs‑extension model with LHC Run‑2 signal strengths and exclusion limits. Using five independent populations of 500 individuals each, evolved for 30 generations with migration, evortran identifies the global minimum and several distinct local minima that satisfy all experimental constraints. Compared with a traditional Markov‑Chain Monte‑Carlo scan of 10⁶ points, the GA approach requires roughly one third of the computational effort while delivering comparable coverage of the viable parameter space.
- LISA mock data: A mock LISA data set containing a stochastic gravitational‑wave background is used to test inverse reconstruction. The fitness function incorporates the mismatch between the simulated spectrum (parameterised by source masses, spins, distances, and sky location) and the noisy mock observation. evortran, configured with six populations of 300 individuals each and 50 generations, recovers the injected parameters within the 95 % confidence intervals for >95 % of the runs, outperforming a baseline Bayesian inference pipeline by a factor of 2.5 in wall‑time.
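The kind of χ² fitness mentioned for the Higgs study can be sketched generically in Python. The sketch assumes uncorrelated Gaussian uncertainties, and `toy_model` is a purely hypothetical stand-in for a model prediction; neither the function names nor the model form come from the paper.

```python
def chi_squared(predicted, observed, sigma):
    """Sum of squared, uncertainty-weighted residuals (no correlations)."""
    if not (len(predicted) == len(observed) == len(sigma)):
        raise ValueError("predicted, observed and sigma must have equal length")
    return sum(((p - o) / s) ** 2 for p, o, s in zip(predicted, observed, sigma))


def toy_model(params):
    """Hypothetical prediction: two observables from two parameters."""
    a, b = params
    return [a * a, a * b]


def fitness(params, observed, sigma):
    """Fitness to minimise: chi-squared of the toy model against the data."""
    return chi_squared(toy_model(params), observed, sigma)
```

A genetic algorithm would minimise `fitness` over `params`; a realistic analysis would replace `toy_model` with the actual theory prediction and would typically also account for correlated uncertainties and exclusion limits.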
Usability – Section 3 provides a step‑by‑step installation guide (fpm, a Fortran 2008 compiler, OpenMP) and a “quick‑start” example that creates an integer individual, sets a fitness routine, runs a short evolution, and visualises the results. The Python wrapper is demonstrated with a minimal script that loads a NumPy array, calls evortran_evolve, and plots the fitness history.
Conclusions & outlook – evortran successfully merges modern Fortran performance with a flexible, modular GA API. Its multi‑level abstraction, extensive operator library, OpenMP parallelism, and Python bindings make it suitable for both low‑level algorithmic research and large‑scale scientific applications. Future work includes MPI‑based distributed evolution across compute nodes, GPU‑accelerated fitness evaluation, and automated hyper‑parameter tuning (e.g., adaptive mutation rates).
In summary, evortran fills a niche in the scientific software ecosystem: a high‑performance, easy‑to‑install, and extensible GA framework that can be embedded in Fortran codes or called from Python, and that has already proven its utility on demanding problems such as BSM parameter scans and gravitational‑wave signal reconstruction.