A Divergence Formula for Randomness and Dimension (Short Version)
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

If $S$ is an infinite sequence over a finite alphabet $\Sigma$ and $\beta$ is a probability measure on $\Sigma$, then the {\it dimension} of $S$ with respect to $\beta$, written $\dim^\beta(S)$, is a constructive version of Billingsley dimension that coincides with the (constructive Hausdorff) dimension $\dim(S)$ when $\beta$ is the uniform probability measure. This paper shows that $\dim^\beta(S)$ and its dual $\mathrm{Dim}^\beta(S)$, the {\it strong dimension} of $S$ with respect to $\beta$, can be used in conjunction with randomness to measure the similarity of two probability measures $\alpha$ and $\beta$ on $\Sigma$. Specifically, we prove that the {\it divergence formula} $$\dim^\beta(R) = \mathrm{Dim}^\beta(R) = \frac{\mathcal{H}(\alpha)}{\mathcal{H}(\alpha) + \mathcal{D}(\alpha \| \beta)}$$ holds whenever $\alpha$ and $\beta$ are computable, positive probability measures on $\Sigma$ and $R \in \Sigma^\infty$ is random with respect to $\alpha$. In this formula, $\mathcal{H}(\alpha)$ is the Shannon entropy of $\alpha$, and $\mathcal{D}(\alpha \| \beta)$ is the Kullback-Leibler divergence between $\alpha$ and $\beta$.


💡 Research Summary

The paper investigates the interplay between algorithmic randomness, information theory, and fractal dimensions by introducing a constructive version of Billingsley dimension relative to an arbitrary computable probability measure β on a finite alphabet Σ. For an infinite sequence S∈Σ^∞, the β‑weighted constructive dimension dim^β(S) and its strong counterpart Dim^β(S) are defined via Kolmogorov complexity K and the β‑probability of prefixes:

dim^β(S) = lim inf_{n→∞} K(S↾n) / (−log β(S↾n)),
Dim^β(S) = lim sup_{n→∞} K(S↾n) / (−log β(S↾n)).

When β is the uniform distribution these coincide with the classical constructive Hausdorff dimension and strong dimension. The main contribution is a "divergence formula" that precisely relates these dimensions to the Shannon entropy of a second computable measure α and the Kullback‑Leibler (KL) divergence between α and β.
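Since K is uncomputable, any concrete estimate of this ratio must substitute a real compressor for K. A minimal Python sketch of the finite-prefix ratio behind dim^β, using zlib as an admitted stand-in for K and assuming β is an i.i.d. measure given as a symbol-to-probability dict (both choices are illustrative, not from the paper):

```python
import math
import zlib

def beta_log_loss(prefix, beta):
    """-log2 beta(w) for a prefix w under an i.i.d. measure beta
    (dict mapping each symbol to its probability)."""
    return -sum(math.log2(beta[s]) for s in prefix)

def dim_beta_estimate(seq, beta):
    """Crude finite-prefix surrogate for dim^beta(S): compressed length
    in bits of the prefix, divided by its beta-log-loss. zlib stands in
    for the uncomputable Kolmogorov complexity K."""
    compressed_bits = 8 * len(zlib.compress(seq.encode()))
    return compressed_bits / beta_log_loss(seq, beta)

# Under the uniform measure on {0,1}, the log-loss of a length-n prefix
# is exactly n bits, so the ratio is compressed bits per input bit.
uniform = {"0": 0.5, "1": 0.5}
print(dim_beta_estimate("01" * 500, uniform))
```

For short prefixes the compressor's header overhead dominates, so this estimate is only meaningful asymptotically, mirroring the lim inf / lim sup in the definitions.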

The authors first recall that a sequence R is α‑random (Martin‑Löf random with respect to α) if it passes all effective α‑tests. This randomness implies that the empirical frequencies of symbols in R converge to the probabilities prescribed by α, and consequently the Kolmogorov complexity of its prefixes satisfies K(R↾n)=H(α)·n+o(n), where H(α)=−∑_{a∈Σ}α(a)log α(a) is the Shannon entropy of α.
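This prefix-complexity behavior can be checked empirically with a real compressor standing in for K (again a surrogate, since K itself is uncomputable): sampling i.i.d. from a hypothetical biased measure α, the compressed rate in bits per symbol should sit near, and not appreciably below, H(α). The measure and sample size here are illustrative:

```python
import math
import random
import zlib

def shannon_entropy(alpha):
    """H(alpha) = -sum_a alpha(a) log2 alpha(a), in bits per symbol."""
    return -sum(p * math.log2(p) for p in alpha.values() if p > 0)

random.seed(0)
alpha = {"a": 0.8, "b": 0.2}   # hypothetical biased measure; H(alpha) ~ 0.722 bits
n = 50_000
seq = "".join(random.choices(list(alpha), weights=alpha.values(), k=n))

bits = 8 * len(zlib.compress(seq.encode(), 9))
rate = bits / n
# rate should be near H(alpha), plus compressor overhead; no lossless
# compressor can do appreciably better than H(alpha) bits per symbol.
print(rate, shannon_entropy(alpha))
```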

Next, they construct a β‑martingale (or equivalently a β‑weighted compressor) M_β that achieves the optimal expected loss under β. By analyzing the growth of −log β(R↾n) along an α‑random sequence they obtain the asymptotic identity  −log β(R↾n)=H(α)·n + D(α‖β)·n + o(n), where D(α‖β)=∑_{a∈Σ}α(a)log(α(a)/β(a)) is the KL divergence. Substituting these asymptotics into the definitions of dim^β and Dim^β yields  dim^β(R)=Dim^β(R)= H(α) / ( H(α) + D(α‖β) ). Because the o(n) error term is negligible for both lim inf and lim sup, the two dimensions coincide for any α‑random R.
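The closed-form right-hand side of the divergence formula is elementary to compute. A short Python check, with measures represented as symbol-to-probability dicts and both assumed positive (as the theorem requires):

```python
import math

def entropy(alpha):
    """Shannon entropy H(alpha) in bits."""
    return -sum(p * math.log2(p) for p in alpha.values())

def kl(alpha, beta):
    """KL divergence D(alpha || beta) in bits; requires beta > 0 everywhere."""
    return sum(alpha[a] * math.log2(alpha[a] / beta[a]) for a in alpha)

def divergence_formula(alpha, beta):
    """dim^beta(R) = Dim^beta(R) = H(alpha) / (H(alpha) + D(alpha || beta))
    for any alpha-random R, per the paper's main theorem."""
    h = entropy(alpha)
    return h / (h + kl(alpha, beta))

alpha = {"0": 0.5, "1": 0.5}
print(divergence_formula(alpha, alpha))                  # beta = alpha: KL vanishes -> 1.0
print(divergence_formula(alpha, {"0": 0.9, "1": 0.1}))   # mismatched beta: strictly below 1
```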

The formula has an intuitive interpretation: the β‑dimension of an α‑random sequence measures how “compatible” β is with the true generating distribution α. If β=α, the KL term vanishes, the ratio becomes 1, and R is maximally random also with respect to β. As β diverges from α, D(α‖β) grows, the ratio shrinks toward 0, indicating that β perceives R as highly compressible (i.e., non‑random). Thus the divergence formula provides a quantitative bridge between statistical distance (KL divergence) and algorithmic randomness.

The paper discusses several implications. First, the formula can be used as an algorithmic similarity metric between probability measures: by sampling an α‑random sequence (or approximating one) and estimating dim^β(R), one can infer D(α‖β) up to the unknown entropy H(α). Second, in data compression and model selection, the β‑dimension of a data source reflects the efficiency of a compressor tuned to β; the divergence formula predicts the loss incurred when the compressor’s model mismatches the true source. Third, the result enriches the theory of constructive dimensions by showing that strong dimension (lim sup) does not differ from ordinary dimension (lim inf) under the strong regularity imposed by α‑randomness.
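Reading the formula backwards gives the similarity-metric use just described: given H(α) and an estimate of dim^β(R), the KL divergence follows by inverting the ratio. A hypothetical helper (the name and the idea of plugging in a compressor-based dimension estimate are illustrative, not from the paper):

```python
def kl_from_dimension(dim_beta, h_alpha):
    """Invert dim^beta(R) = H(alpha) / (H(alpha) + D(alpha || beta)):
    D(alpha || beta) = H(alpha) * (1 / dim_beta - 1).
    In practice dim_beta would itself be estimated, e.g. via a
    surrogate compressor, so the result inherits that estimate's error."""
    return h_alpha * (1.0 / dim_beta - 1.0)

# e.g. a fair-coin source (H = 1 bit) whose beta-dimension is measured
# as 0.5 implies D(alpha || beta) = 1 bit:
print(kl_from_dimension(0.5, 1.0))  # -> 1.0
```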

The authors also note the limitations of their framework. Both α and β must be computable and assign positive probability to every symbol, which excludes measures with zero‑probability events or non‑computable distributions. The analysis relies on Kolmogorov complexity, an uncomputable quantity, so practical applications require surrogate compressors or empirical estimators. Moreover, the results are asymptotic; finite‑sample behavior may deviate, and additional work is needed to translate the theory into concrete algorithms for finite data.

In conclusion, the paper establishes a precise analytic connection between algorithmic dimensions relative to a measure β and the information‑theoretic divergence between the true generating measure α and β. The divergence formula dim^β(R)=Dim^β(R)=H(α)/(H(α)+D(α‖β)) not only deepens our understanding of constructive dimension theory but also opens avenues for applying algorithmic randomness to statistical inference, compression, and the quantitative comparison of probabilistic models.

