An Exact Algorithm for the Stratification Problem with Proportional Allocation
We report a new exact solution method for the statistical stratification problem under proportional sampling allocation among strata. Consider a finite population of N units, a random sample of n units selected from this population, and a number L of strata. The task is to decide which units belong to each stratum so as to minimize the variance of a total estimator for one variable of interest in each stratum, and consequently reduce the overall variance of that estimator. To solve this problem, an exact algorithm based on the concept of a minimal path in a graph is proposed and assessed. Computational results using real data from IBGE (Brazilian Central Statistical Office) are provided.
💡 Research Summary
The paper tackles the classic problem of designing an optimal stratified sampling scheme when the allocation of the sample to strata follows a proportional rule. Given a finite population of N units, a total sample size n, and a desired number of strata L, the objective is to assign each population unit to a stratum so that the variance of the estimator of a target variable’s total is minimized. Traditional approaches rely on heuristics, ad‑hoc rules, or meta‑heuristic optimization, which cannot guarantee optimality, especially as N or L grows.
The authors introduce a novel exact algorithm based on a graph‑theoretic formulation. First, the population units are ordered according to the value of the auxiliary variable that drives stratification. A graph is then constructed over the ordered units: each vertex corresponds to a unit, and each edge (i, j), with i < j, carries a weight equal to the variance contribution that results if the units from i+1 to j are treated as a single stratum. Under proportional allocation, the variance contribution of a stratum depends only on the number of units it contains and their within‑stratum variance, which can be expressed analytically; thus the edge weights precisely encode the objective function.
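The edge weight described above can be illustrated with a short sketch. It assumes the textbook result that, with proportional allocation n_h = n·N_h/N, the variance of the stratified total estimator equals ((N − n)/n)·Σ N_h·S_h², so each stratum contributes N_h·S_h² up to a constant factor. The summary does not state this formula explicitly, and the function name `make_stratum_cost`, the prefix-sum technique, and the requirement of at least two units per stratum are illustrative assumptions rather than details taken from the paper.

```python
from itertools import accumulate


def make_stratum_cost(values):
    """Build cost(i, j) for the sorted units i+1..j (1-based), i.e. xs[i:j].

    Assumption: the cost of a stratum is N_h * S_h^2, where S_h^2 is the
    within-stratum variance with an (N_h - 1) denominator.
    """
    xs = sorted(values)
    # Prefix sums of x and x^2 make every cost query O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def cost(i, j):
        size = j - i
        if size < 2:
            # Assumption: S_h^2 is undefined for a single unit, so reject it.
            raise ValueError("a stratum needs at least two units")
        total = s1[j] - s1[i]
        # Sum of squared deviations from the stratum mean (clamped at 0
        # to absorb floating-point round-off).
        ss = max((s2[j] - s2[i]) - total * total / size, 0.0)
        return size * ss / (size - 1)

    return cost


cost = make_stratum_cost([4, 1, 3, 2])
print(cost(0, 2))  # weight of the stratum holding the two smallest units
```

Precomputing the prefix sums keeps the cost of building all O(N²) candidate edge weights at O(1) per edge, which matters because every pair (i, j) is a potential stratum.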
With this construction, the problem of finding the optimal L‑strata partition becomes equivalent to finding a minimum‑cost path from the first to the last vertex that uses exactly L−1 cuts, i.e., divides the path into L contiguous segments. The authors solve this via dynamic programming (DP). Let DP
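A dynamic program of this kind can be sketched as follows. This is a minimal illustration of a shortest path with exactly L contiguous segments, not the authors' implementation: the names `optimal_strata`, `best`, `back`, and `min_size` are hypothetical, and the stratum cost N_h·S_h² is an assumed form of the proportional-allocation objective (the constant factor (N − n)/n is dropped because it does not affect the minimizer).

```python
import math


def optimal_strata(values, num_strata, min_size=2):
    """Split sorted values into num_strata contiguous strata minimizing
    sum(N_h * S_h^2). Returns (objective, list of strata)."""
    xs = sorted(values)
    n = len(xs)
    if min_size < 1 or num_strata < 1 or num_strata * min_size > n:
        raise ValueError("infeasible combination of sizes")

    # Prefix sums of x and x^2 give O(1) stratum costs.
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        size = j - i
        if size < 2:
            return 0.0  # assumption: a single unit contributes no variance
        total = s1[j] - s1[i]
        ss = max((s2[j] - s2[i]) - total * total / size, 0.0)
        return size * ss / (size - 1)

    # best[k][j]: minimal cost of covering the first j units with k strata.
    best = [[math.inf] * (n + 1) for _ in range(num_strata + 1)]
    back = [[0] * (n + 1) for _ in range(num_strata + 1)]
    best[0][0] = 0.0
    for k in range(1, num_strata + 1):
        for j in range(k * min_size, n + 1):
            for i in range((k - 1) * min_size, j - min_size + 1):
                if best[k - 1][i] == math.inf:
                    continue
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Walk the back-pointers from the last vertex to recover the cuts.
    strata, j = [], n
    for k in range(num_strata, 0, -1):
        i = back[k][j]
        strata.append(xs[i:j])
        j = i
    strata.reverse()
    return best[num_strata][n], strata


objective, strata = optimal_strata([12, 1, 10, 3, 2, 11], num_strata=2)
print(objective, strata)
```

The triple loop runs in O(L·N²) time and O(L·N) memory. Restricting the search to contiguous segments of the ordered units is what makes the enumeration tractable.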