Optimal discovery with probabilistic expert advice

We consider an original problem that arises from the issue of security analysis of a power system and that we name optimal discovery with probabilistic expert advice. We address it with an algorithm based on the optimistic paradigm and the Good-Turing missing mass estimator. We show that this strategy uniformly attains the optimal discovery rate in a macroscopic limit sense, under some assumptions on the probabilistic experts. We also provide numerical experiments suggesting that this optimal behavior may still hold under weaker assumptions.


💡 Research Summary

The paper introduces a novel “optimal discovery with probabilistic expert advice” problem motivated by real‑time security analysis of large‑scale power systems. In such settings, a huge number of possible contingencies (often >10⁵) must be screened for rare but dangerous events. Simulating every contingency is computationally infeasible, so the operator must identify as many dangerous contingencies as possible within a very short time budget. The authors formalize this as a sequential decision problem: a finite set X contains all possible items, a subset A⊂X are the “interesting” (dangerous) items, and K probabilistic experts each draw independently from a fixed distribution P_i over X. At each time step t the decision maker selects an expert I_t, observes a sample X_{I_t, n_{I_t,t}} drawn from P_{I_t}, and wishes to maximize the number of distinct elements of A discovered after a fixed horizon or, equivalently, to minimize the time needed to discover all of A.

To make the problem analytically tractable, the authors impose three strong assumptions: (i) the supports of the experts are disjoint, (ii) each support has the same cardinality N, and (iii) each expert’s distribution is uniform over its support. Under these conditions the problem can be re‑parameterized as a K‑by‑N grid X_N = {1,…,K}×{1,…,N}, with Q_{N,i}=|A∩({i}×{1,…,N})| the number of interesting items accessible through expert i. They assume that the ratios Q_{N,i}/N converge to constants q_i∈(0,1) as N→∞.
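This structured setting is easy to instantiate in a few lines. The sketch below builds a toy instance under the three assumptions (disjoint supports, equal support size N, uniform distributions); the function names are illustrative, not from the paper.

```python
import random

def make_instance(K, N, q, seed=0):
    """Toy instance under the paper's structural assumptions: K experts with
    disjoint uniform supports of size N, expert i hiding round(q[i]*N)
    interesting items.  Returns the interesting set A as (expert, column) pairs."""
    rng = random.Random(seed)
    interesting = set()
    for i in range(K):
        # Expert i's support is {(i, 0), ..., (i, N-1)}; mark Q_i = round(q[i]*N)
        # of its columns as interesting (A restricted to expert i).
        cols = rng.sample(range(N), round(q[i] * N))
        interesting.update((i, j) for j in cols)
    return interesting

def draw(i, N, rng):
    """One query to expert i: a uniform sample from its support."""
    return (i, rng.randrange(N))

A = make_instance(K=3, N=100, q=[0.1, 0.3, 0.5])
```

By construction the supports are disjoint (distinct first coordinates), so |A| is simply the sum of the Q_i.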

The central contribution is the Good‑UCB algorithm, which merges two ideas: (1) the Good‑Turing estimator for the “missing mass” (the total probability of interesting items not yet observed) and (2) the optimistic (upper‑confidence‑bound) principle popular in multi‑armed bandits. For each expert i at time t the algorithm computes the estimate R̂_{i,t−1} = U_{i,t−1}/n_{i,t−1}, where U_{i,t−1} counts the interesting items from expert i that have been seen exactly once among its n_{i,t−1} draws, and adds a confidence bonus c·√(log t / n_{i,t−1}). The expert with the largest upper bound is queried next. The algorithm starts by pulling each expert once (t ≤ K) to obtain initial data.
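The index computation can be sketched as follows; this is a minimal illustration of the selection rule, with `counts` holding how often each interesting item from one expert has been drawn, and constants (e.g. the value of c) chosen for readability rather than taken from the paper.

```python
import math
from collections import Counter

def good_ucb_index(counts, n, t, c=1.0):
    """Good-UCB index for one expert: Good-Turing estimate of the missing
    mass of interesting items (fraction of the expert's n draws that hit an
    interesting item seen exactly once) plus an optimistic bonus."""
    if n == 0:
        return float("inf")  # force one initial pull of each expert (t <= K)
    ones = sum(1 for v in counts.values() if v == 1)  # items seen exactly once
    return ones / n + c * math.sqrt(math.log(t) / n)
```

At each step the algorithm evaluates this index for every expert and queries the argmax.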

Theoretical analysis proceeds by comparing Good‑UCB to an “oracle closed‑loop” (OCL) policy that knows, at every step, the exact remaining number of interesting items behind each expert and always selects the expert with the largest remaining count. Under the disjoint‑support, uniform‑distribution assumptions, the OCL policy’s performance becomes deterministic in the macroscopic limit: the cumulative number of discovered items after t steps, scaled by N, converges to a deterministic function F(t) that can be expressed analytically in terms of the q_i. The authors prove (Theorem 1) that the OCL policy’s scaled discovery curve converges uniformly on ℝ⁺ as N→∞.
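The oracle benchmark is simple to state in code. The sketch below is an illustrative implementation of the OCL rule under the uniform, disjoint-support assumptions: always query the expert with the most undiscovered interesting items; function and variable names are mine, not the paper's.

```python
import random

def ocl_discoveries(remaining, T, N, seed=0):
    """Oracle closed-loop policy sketch: at each of T steps, query the expert
    with the largest number of undiscovered interesting items and sample
    uniformly from its size-N support.  `remaining` maps each expert to the
    set of its still-undiscovered interesting columns.  Returns the number
    of discoveries made."""
    rng = random.Random(seed)
    found = 0
    for _ in range(T):
        i = max(remaining, key=lambda e: len(remaining[e]))  # greedy oracle choice
        j = rng.randrange(N)                                 # uniform draw from expert i
        if j in remaining[i]:
            remaining[i].discard(j)
            found += 1
    return found
```

Theorem 1 says that, after scaling time and discoveries by N, the (random) output of this policy concentrates on a deterministic curve F(t) as N → ∞.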

The main result (Theorem 2) shows that Good‑UCB is macroscopically optimal: the scaled discovery curve of Good‑UCB, F̃_N(t)/N, converges uniformly to the same limit F(t) as the OCL oracle. The proof hinges on concentration bounds for the Good‑Turing estimator (derived via McDiarmid’s inequality) and on the fact that the optimism‑driven selection forces each expert’s sampling proportion to approach its optimal share q_i, thereby matching the oracle’s allocation in the limit.
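A standard way to obtain a concentration bound of this type (the paper's exact constants may differ) is via bounded differences: U_n/n is a function of the expert's n i.i.d. draws, and changing any single draw changes U_n by at most 2, so McDiarmid's inequality with differences c_j = 2/n gives

```latex
\mathbb{P}\!\left(\left|\frac{U_n}{n} - \mathbb{E}\,\frac{U_n}{n}\right| \ge \varepsilon\right)
\;\le\; 2\exp\!\left(-\frac{2\varepsilon^2}{\sum_{j=1}^{n} c_j^2}\right)
\;=\; 2\exp\!\left(-\frac{n\varepsilon^2}{2}\right)
```

Deviations of order √(log t / n) therefore hold with high probability, which is exactly the scale of the confidence bonus in the Good‑UCB index.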

Section 3 details the OCL policy, introducing the random variables D_{N,i,k} (the draw index at which the k‑th new interesting item from expert i appears) and S_{N,i,k}=D_{N,i,k}−D_{N,i,k−1}, which are geometrically distributed. Expected waiting times and cumulative discovery counts are derived, leading to explicit formulas for the macroscopic discovery rate.
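Under the uniform-support assumption these waiting times have a simple closed form; as a sketch in the notation above (the approximation step is the usual harmonic-sum estimate):

```latex
% After k-1 interesting items from expert i have been found, each uniform
% draw from its size-N support hits a new interesting item with probability
% (Q_{N,i} - k + 1)/N, so S_{N,i,k} is geometric with that parameter and
\mathbb{E}[S_{N,i,k}] = \frac{N}{Q_{N,i} - k + 1},
\qquad
\mathbb{E}[D_{N,i,m}] = \sum_{k=1}^{m} \frac{N}{Q_{N,i} - k + 1}
\;\approx\; N \log\!\frac{Q_{N,i}}{Q_{N,i} - m}
```

Dividing by N and passing to the limit Q_{N,i}/N → q_i yields the deterministic macroscopic discovery rate used in the analysis.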

Section 4 contains the rigorous analysis of Good‑UCB, showing that the upper confidence bounds dominate the true missing mass with high probability, and that the algorithm’s regret (the gap to the oracle) vanishes after scaling by N. The authors also discuss the role of the tuning parameter c and argue that any constant c>0 suffices for asymptotic optimality.

Section 5 compares the OCL policy to an “oracle open‑loop” policy that fixes a static allocation of draws among experts. Surprisingly, the open‑loop optimal allocation yields the same macroscopic discovery rate as the closed‑loop oracle, confirming that the problem’s optimal rate is determined solely by the q_i’s, not by dynamic adaptation.

Section 6 presents extensive simulations. Even when the three structural assumptions are violated—supports overlap, distributions are non‑uniform, or the number of experts is large—the Good‑UCB algorithm still outperforms naïve baselines such as uniform sampling and closely tracks the oracle’s performance. Experiments on synthetic data mimicking power‑system contingency analysis demonstrate that Good‑UCB can discover a substantially larger fraction of dangerous contingencies within a fixed computational budget.
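A small self-contained experiment in the spirit of these simulations is sketched below: it pits a Good‑UCB-style policy against uniform expert sampling on a toy instance satisfying the structural assumptions. Parameter values and function names are illustrative, not the paper's experimental protocol.

```python
import math
import random
from collections import Counter

def simulate(policy, K=3, N=50, q=(0.1, 0.3, 0.6), T=300, seed=0):
    """Run one expert-selection policy on a toy instance (disjoint uniform
    supports) and return the number of distinct interesting items found
    after T draws."""
    rng = random.Random(seed)
    interesting = {i: set(rng.sample(range(N), round(q[i] * N))) for i in range(K)}
    seen = {i: Counter() for i in range(K)}   # counts of interesting draws per expert
    n = [0] * K                               # number of draws per expert
    discovered = set()
    for t in range(1, T + 1):
        i = policy(seen, n, t, rng)
        j = rng.randrange(N)                  # uniform sample from expert i's support
        n[i] += 1
        if j in interesting[i]:
            seen[i][j] += 1
            discovered.add((i, j))
    return len(discovered)

def good_ucb(seen, n, t, rng, c=0.5):
    """Good-Turing missing-mass estimate plus optimistic bonus, per expert."""
    def index(i):
        if n[i] == 0:
            return float("inf")
        ones = sum(1 for v in seen[i].values() if v == 1)
        return ones / n[i] + c * math.sqrt(math.log(t) / n[i])
    return max(range(len(n)), key=index)

def uniform(seen, n, t, rng):
    """Baseline: pick an expert uniformly at random."""
    return rng.randrange(len(n))

ucb_found = simulate(good_ucb)
unif_found = simulate(uniform)
```

On instances like this one, with 50 interesting items in total, the Good‑UCB policy tends to concentrate its draws on the richer experts and discover more items than the uniform baseline, mirroring the paper's qualitative findings.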

The paper concludes by emphasizing the broad applicability of the framework beyond power‑system security, e.g., web‑content recommendation, adaptive testing, or any setting where rare but valuable items must be discovered through stochastic advice sources. The combination of Good‑Turing missing‑mass estimation with optimistic bandit exploration yields a simple, provably optimal algorithm that is robust to model misspecification and scalable to large‑scale problems.

