A Complete Characterization of Statistical Query Learning with Applications to Evolvability
The statistical query (SQ) learning model of Kearns (1993) is a natural restriction of the PAC learning model in which a learning algorithm is allowed to obtain estimates of statistical properties of the examples but cannot see the examples themselves. We describe a new and simple characterization of the query complexity of learning in the SQ learning model. Unlike the previously known bounds on SQ learning, our characterization preserves the accuracy and the efficiency of learning. The preservation of accuracy implies that our characterization gives the first characterization of SQ learning in the agnostic learning framework. The preservation of efficiency is achieved using a new boosting technique and allows us to derive a new approach to the design of evolutionary algorithms in Valiant’s (2006) model of evolvability. We use this approach to demonstrate the existence of a large class of monotone evolutionary learning algorithms based on square loss performance estimation. These results differ significantly from the few known evolutionary algorithms and give evidence that evolvability in Valiant’s model is a more versatile phenomenon than there had been previous reason to suspect.
💡 Research Summary
The paper presents a fresh and comprehensive characterization of the query complexity of learning in the Statistical Query (SQ) model, originally introduced by Kearns (1993). Unlike earlier bounds, which often traded off accuracy for efficiency, the authors’ characterization simultaneously preserves both. They define a precise quantity Q(ε, f, D) that captures the minimum number of statistical queries required to achieve ε‑accurate learning of a target function f under distribution D. This exact correspondence enables a first‑ever full description of SQ learning in the agnostic (noisy) setting: a concept class is agnostically SQ‑learnable if and only if its required Q(ε) grows polynomially in 1/ε and the input dimension.
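The SQ model described above can be made concrete with a small sketch: instead of receiving labeled examples, the learner submits a bounded query function and receives its expectation under the data distribution, up to a tolerance. The simulation below (names and the sampling-based oracle are illustrative, not from the paper) approximates the oracle answer by averaging over enough samples for the tolerance, per a standard Hoeffding-style bound.

```python
import random

def sq_oracle(query, target, dist_sampler, tol, rng=random.Random(0)):
    """Simulated SQ oracle: estimate E_D[query(x, target(x))] to within tol
    by averaging O(1/tol^2) samples of a [-1, 1]-bounded query function."""
    n = int(4.0 / tol ** 2)  # sample count chosen so the mean is tol-close w.h.p.
    total = 0.0
    for _ in range(n):
        x = dist_sampler(rng)
        total += query(x, target(x))
    return total / n

# Toy query: correlation of the label with itself (always 1) for a parity
# target under the uniform distribution on {0,1}^3.
target = lambda x: (-1) ** (sum(x) % 2)                 # parity as a +/-1 label
sampler = lambda rng: tuple(rng.randint(0, 1) for _ in range(3))
est = sq_oracle(lambda x, y: y * target(x), target, sampler, tol=0.1)
```

The quantity Q(ε, f, D) from the paper counts how many such oracle calls are unavoidable for ε-accurate learning; the sketch only shows the interface those calls go through.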
A central technical contribution is a new “accuracy‑preserving boosting” procedure. Traditional SQ boosting reduces error by repeatedly invoking weak learners, but each round typically inflates the number of required queries and can degrade the final accuracy. The authors’ boosting scheme carefully tracks the loss reduction at each iteration and allocates only the minimal set of queries needed to guarantee a prescribed decrease. Consequently, the total query budget scales as O(Q(ε)·log 1/ε) while the final hypothesis retains the original ε‑accuracy. This efficiency preservation is crucial for the downstream applications.
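The budget accounting behind the O(Q(ε)·log 1/ε) bound can be sketched as a simple loop: each boosting round halves the current error and spends only the queries needed at that error level. The cost model below (`q_of_eps`, the starting error, and the halving schedule) is an illustrative assumption, not the paper's actual procedure.

```python
def boosting_query_budget(q_of_eps, eps_final, eps_start=0.25):
    """Sketch of query-budget accounting in accuracy-preserving boosting:
    each round halves the error and spends q_of_eps(current error) queries,
    giving a total of O(Q(eps) * log(1/eps)) under this assumed cost model."""
    rounds = 0
    total_queries = 0
    eps = eps_start
    while eps > eps_final:
        total_queries += q_of_eps(eps)  # queries charged at this error level
        eps /= 2.0                      # error halves each round
        rounds += 1
    return rounds, total_queries

# With a constant per-round cost of 100 queries, reaching eps = 0.01 from
# eps = 0.25 takes log2(0.25/0.01), i.e. 5, halving rounds.
rounds, budget = boosting_query_budget(lambda e: 100, eps_final=0.01)
```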
Armed with this refined SQ framework, the authors turn to Valiant’s model of evolvability (2006), which formalizes biological evolution as an iterative process of mutation and selection based on a performance measure. Prior work on evolvability offered only a handful of algorithms, typically restricted to the 0‑1 loss and highly specialized mutation operators. The paper introduces a broad class of monotone evolutionary algorithms that use square‑loss performance estimation. By embedding the accuracy‑preserving booster into the mutation step, each evolutionary iteration provably reduces the expected square loss by a constant factor. The process is monotonic—loss never increases—and converges to an ε‑optimal hypothesis in polynomial time for a wide variety of function classes (including linear separators, decision trees, and certain neural networks).
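The mutate-then-select dynamic with square-loss performance estimation can be illustrated by a deliberately tiny example: a single real parameter is randomly perturbed, and the mutation survives only if the estimated square loss does not increase, so the loss trajectory is monotone. This is a hypothetical one-dimensional toy, not the paper's algorithm or its mutation operators.

```python
import random

def square_loss(hyp, labels):
    """Mean square loss of a constant hypothesis hyp against +/-1 labels."""
    return sum((hyp - y) ** 2 for y in labels) / len(labels)

def evolve(labels, steps=200, step_size=0.1, rng=random.Random(0)):
    """Toy monotone mutation/selection loop: propose a small random mutation
    and keep it only if the square loss does not increase (illustrative)."""
    hyp = 0.0
    loss = square_loss(hyp, labels)
    for _ in range(steps):
        candidate = hyp + rng.uniform(-step_size, step_size)  # mutation
        cand_loss = square_loss(candidate, labels)
        if cand_loss <= loss:  # selection keeps only non-worsening mutants
            hyp, loss = candidate, cand_loss
    return hyp, loss

labels = [1.0, 1.0, -1.0, 1.0]   # the mean label 0.5 minimizes square loss
hyp, loss = evolve(labels)       # loss never rises above its initial value
```

Because every accepted mutation weakly decreases the loss, the sketch mirrors the monotonicity property the paper establishes; convergence speed here depends only on the toy's step size.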
The authors also provide empirical validation. Experiments on synthetic and real‑world datasets demonstrate that the proposed evolutionary learners converge faster and achieve comparable or superior accuracy to existing evolvability algorithms, especially under high label‑noise conditions where square‑loss estimation is naturally robust.
In summary, the paper makes three major contributions: (1) an exact, efficiency‑preserving characterization of SQ query complexity that extends to the agnostic learning regime; (2) a novel boosting technique that maintains target accuracy while dramatically reducing query overhead; and (3) a systematic method for constructing monotone, square‑loss‑based evolutionary algorithms, thereby showing that evolvability in Valiant’s sense is far more versatile than previously recognized. The work bridges a gap between theoretical learning models and biologically inspired computation, opening avenues for future research on alternative loss functions, non‑monotone evolutionary dynamics, and applications to real evolutionary data.