A few-shot and physically restorable symbolic regression turbulence model based on normalized general effective-viscosity hypothesis
Turbulence is a complex, irregular flow phenomenon ubiquitous in natural processes and engineering applications. The Reynolds-averaged Navier-Stokes (RANS) method, owing to its low computational cost, has become the primary approach for rapid simulation of engineering turbulence problems. However, the inaccuracy of classical turbulence models constitutes the main drawback of the RANS framework. With the rapid development of data-driven approaches, many data-driven turbulence models have been proposed, yet they still suffer from issues of generalizability and accuracy. In this work, we propose a few-shot, physically restorable, symbolic regression turbulence model based on the normalized general effective-viscosity hypothesis. Few-shot indicates that our model is trained on limited flow configurations spanning only a narrow subset of turbulent flow physics, yet can still outperform the baseline model in substantially different turbulent flows. Physically restorable means our model can nearly revert to the baseline model in regimes satisfying specific physical conditions, using only the symbolic regression training results. The normalized general effective-viscosity hypothesis was proposed in our previous study. Specifically, we first formalize the concept of few-shot data-driven turbulence models. Second, we train our symbolic regression turbulence models using only direct numerical simulation (DNS) data for three-dimensional periodic hill flow slices. Third, we evaluate our models on periodic hill flows, zero pressure gradient flat plate flow, NACA0012 airfoil flows, and NASA Rotor 37 transonic axial compressor flows. One of our symbolic regression turbulence models consistently outperforms the baseline model, and we further demonstrate that this model can nearly revert to baseline behavior in certain flow regimes.
💡 Research Summary
This paper introduces a data‑driven turbulence modeling framework that combines three ideas (few‑shot learning, a normalized general effective‑viscosity hypothesis, and physical restorability) in a single symbolic‑regression (SR) model. The authors first formalize the notion of a “few‑shot” turbulence model: instead of requiring large, diverse training datasets, the model is trained on a very limited set of flow configurations (in this case, DNS data on slices of a three‑dimensional periodic hill flow). The goal is to demonstrate that, even with such a narrow training base, the model can outperform the baseline model on flows that are substantially different from the training case.
The theoretical foundation builds on the classic general effective‑viscosity hypothesis (Pope, 1975), which expresses the Reynolds‑stress anisotropy tensor as a linear combination of tensor bases with scalar coefficients that are functions of invariant scalars. The authors adopt the normalized formulation from their previous study, in which each tensor basis $\hat{T}_i$ is scaled to unit Frobenius norm, and the corresponding coefficients $\hat{g}_i$ are likewise normalized. This normalization removes dimensional disparities among the bases, allowing the magnitude of $\hat{g}_i$ to serve directly as a measure of each term’s relative importance. It also simplifies the learning problem for symbolic regression, because the model only needs to discover dimensionless functional relationships.
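As a rough illustration of the normalization step, the sketch below builds the first four tensor bases of Pope (1975) from a strain‑rate tensor S and a rotation‑rate tensor W and scales each to unit Frobenius norm. The helper names (`normalized_bases`, `unit`, and so on), the sign convention for W, and the simple‑shear test input are assumptions made for this example; the exact nondimensionalization used in the paper is not given in this summary.

```python
import math

# 3x3 identity, used to remove the trace of a tensor.
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add_scaled(a, b, alpha=1.0):
    """Return a + alpha * b, element-wise."""
    return [[a[i][j] + alpha * b[i][j] for j in range(3)] for i in range(3)]

def deviator(a):
    """Trace-free part of a 3x3 tensor."""
    tr = a[0][0] + a[1][1] + a[2][2]
    return add_scaled(a, I3, -tr / 3.0)

def frobenius(a):
    """Frobenius norm of a 3x3 tensor."""
    return math.sqrt(sum(a[i][j] ** 2 for i in range(3) for j in range(3)))

def unit(a, eps=1e-30):
    """Scale a tensor to unit Frobenius norm (zero tensor stays zero)."""
    n = frobenius(a)
    if n < eps:
        return [[0.0] * 3 for _ in range(3)]
    return [[a[i][j] / n for j in range(3)] for i in range(3)]

def normalized_bases(S, W):
    """First four Pope (1975) bases, each scaled to unit Frobenius norm."""
    T1 = S
    T2 = add_scaled(matmul(S, W), matmul(W, S), -1.0)  # S W - W S
    T3 = deviator(matmul(S, S))                        # S^2 - tr(S^2) I / 3
    T4 = deviator(matmul(W, W))                        # W^2 - tr(W^2) I / 3
    return [unit(T) for T in (T1, T2, T3, T4)]

# Hypothetical input: simple shear du/dy = 1, with W = (grad u - grad u^T) / 2.
S = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]
W = [[0.0, 0.5, 0.0], [-0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]

bases_hat = normalized_bases(S, W)
norms = [frobenius(T) for T in bases_hat]
```

Every entry of `norms` is 1 up to rounding, so the bases differ only in direction. In this planar example the normalized third and fourth bases come out as exact opposites, which shows how some bases stop being independent in simple flows.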
Input features are divided into two groups. The first group consists of ten second‑order tensor invariants ($I_1$‑$I_{17}$) derived from normalized strain‑rate, rotation‑rate, pressure‑gradient, and turbulent‑kinetic‑energy‑gradient tensors. The second group comprises five additional nondimensional quantities ($q_\beta$) such as the Q‑criterion, wall‑distance‑based Reynolds number, ratios of turbulent to molecular stresses, and various time‑scale ratios. Together they provide a rich description of the local flow state while remaining inexpensive to compute.
Symbolic regression is performed with the open‑source multi‑population evolutionary library PySR. The algorithm evolves mathematical expressions through mutation, crossover, simplification, and constant optimization across several populations, while allowing migration of individuals. Two SR models are obtained: SR 3T, which uses only the three tensors relevant to two‑dimensional flows, and SR 5T, which incorporates five tensors ($T_1$, $T_2$, $T_3$, $T_4$, $T_6$). Tensor $T_5$ is omitted because it exhibits discontinuities in the DNS dataset. The target for regression is the normalized coefficient $\hat{g}_i$, obtained from the DNS anisotropy tensor $b$ via the double‑dot product $\hat{g}_i = b : \hat{T}_i$.
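The regression target described above can be sketched as a plain double‑dot product between an anisotropy tensor and unit‑norm bases. The anisotropy values below are made up for illustration (symmetric and trace‑free, but not taken from any DNS dataset), and the two bases are those of a simple shear flow; `double_dot` and `unit` are hypothetical helper names.

```python
import math

def double_dot(a, b):
    """Double-dot product a : b = sum_ij a_ij * b_ij for 3x3 tensors."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def unit(a):
    """Scale a nonzero 3x3 tensor to unit Frobenius norm."""
    n = math.sqrt(double_dot(a, a))
    return [[a[i][j] / n for j in range(3)] for i in range(3)]

# Hypothetical anisotropy tensor b (symmetric, zero trace); illustrative only.
b = [[0.10, -0.15, 0.00],
     [-0.15, -0.04, 0.00],
     [0.00, 0.00, -0.06]]

# Unit-norm bases for simple shear: T1 = S and T2 = S W - W S.
T1_hat = unit([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]])
T2_hat = unit([[-0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]])

# Regression targets: projections of b onto each normalized basis.
g1_hat = double_dot(b, T1_hat)
g2_hat = double_dot(b, T2_hat)

# Because each basis has unit norm, |g_hat| cannot exceed the norm of b.
b_norm = math.sqrt(double_dot(b, b))
```

Since each basis has unit Frobenius norm, the Cauchy-Schwarz inequality bounds every coefficient by the norm of `b`, which is what makes the coefficient magnitudes directly comparable across terms.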
The baseline turbulence model is the widely used k‑ω‑SST formulation, including its production, dissipation, blending functions, and Boussinesq linear eddy‑viscosity closure. The SR models replace the linear eddy‑viscosity term with the learned tensor‑basis expansion while retaining the rest of the RANS equations unchanged.
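For orientation, the replacement described above can be written out as follows. The first two lines are the standard definition of the anisotropy tensor and the Boussinesq closure for incompressible flow; the last line is the normalized expansion as described in this summary. The index set $\mathcal{I}$ and the feature arguments are notation assumed here, and the exact form used in the paper may differ.

```latex
% Anisotropy tensor (standard definition), k = turbulent kinetic energy
b_{ij} = \frac{\overline{u_i' u_j'}}{2k} - \frac{1}{3}\,\delta_{ij}

% Baseline: Boussinesq linear eddy-viscosity closure (incompressible form),
% with S_{ij} the mean strain-rate tensor and \nu_t the eddy viscosity
b_{ij} = -\frac{\nu_t}{k}\, S_{ij}

% SR models: expansion in unit-norm tensor bases, with learned coefficients
% that depend on the invariants I and the extra features q
b = \sum_{i \in \mathcal{I}} \hat{g}_i(I_1, \dots, q_\beta)\, \hat{T}_i,
\qquad \lVert \hat{T}_i \rVert_F = 1
```

When every coefficient except the one multiplying the strain‑rate basis becomes negligible, the expansion has the same tensorial form as the Boussinesq closure, which is the sense in which the model can revert toward baseline behavior.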
Extensive validation is carried out on four flow cases: (i) the periodic hill flow (the training case), (ii) a zero‑pressure‑gradient flat‑plate boundary layer, (iii) a subsonic NACA 0012 airfoil at moderate angle of attack, and (iv) the transonic NASA Rotor 37 axial‑compressor rotor. Across all cases, SR 5T consistently outperforms the baseline k‑ω‑SST and the simpler SR 3T. In the flat‑plate case, the additional SR terms vanish near the wall, effectively restoring the baseline model and yielding accurate wall‑shear predictions. In the airfoil and compressor cases, SR 5T captures pressure‑gradient‑induced separation and compressibility effects more accurately, reducing prediction errors in lift, drag, and surface pressure distribution. Notably, in regions where the baseline model is already accurate (e.g., near‑wall, non‑wake zones), the learned coefficients become negligibly small, demonstrating the “physically restorable” property: the model automatically reverts to the baseline without any explicit shielding function or auxiliary neural network.
The paper’s contributions are threefold. First, it provides a clear definition and demonstration of few‑shot turbulence modeling, showing that high‑fidelity DNS data from a single geometry can be leveraged to improve predictions on unrelated geometries. Second, the normalized general effective‑viscosity hypothesis offers a mathematically clean and physically interpretable basis for symbolic regression, facilitating direct assessment of term importance. Third, the physical restorability mechanism ensures that the data‑driven model does not degrade performance in regimes where the classical model is already trustworthy, thereby preserving robustness and interpretability.
In summary, this work advances data‑driven turbulence modeling by reducing data requirements, enhancing generalizability, and embedding a self‑regulating physical safeguard. The approach holds promise for rapid development of high‑accuracy, low‑cost turbulence closures applicable to a wide range of engineering flows.