Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation
Precision-recall (PR) curves and the areas under them are widely used to summarize machine learning results, especially for data sets exhibiting class skew. They are often used analogously to ROC curves and the area under ROC curves. It is known that PR curves vary as class skew changes. What was not recognized before this paper is that there is a region of PR space that is completely unachievable, and the size of this region depends only on the skew. This paper precisely characterizes the size of that region and discusses its implications for empirical evaluation methodology in machine learning.
💡 Research Summary
The paper investigates a fundamental geometric property of precision‑recall (PR) curves that has been largely overlooked in the machine‑learning community. While it is well known that PR curves shift with class imbalance, the authors demonstrate that a whole region of the PR space is mathematically unattainable for any classifier, and that the size of this “unachievable region” depends solely on the positive‑class prevalence π (the proportion of positive examples).
Starting from the definition of a confusion matrix, they derive a lower bound on precision for any given recall r:
p_min(r) = (π·r) / (π·r + (1 − π)).
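This bound is easy to compute directly. The sketch below implements p_min(r) as a small Python function (the name `min_precision` is ours, chosen for illustration):

```python
def min_precision(recall, pi):
    """Lower bound on precision at a given recall.

    Implements p_min(r) = (pi * r) / (pi * r + (1 - pi)),
    where pi is the positive-class prevalence, 0 < pi < 1.
    """
    return (pi * recall) / (pi * recall + (1.0 - pi))
```

For instance, at full recall the minimum precision equals the prevalence itself: `min_precision(1.0, 0.1)` returns `0.1`, since predicting everything positive yields recall 1 and precision π.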
The curve defined by p_min(r) partitions the unit square into achievable and unachievable parts. No point below the curve can be attained by any classifier, and integrating p_min(r) over [0, 1] shows that the area of this unachievable region equals 1 + ((1 − π) ln(1 − π)) / π. Because every PR curve lies on or above p_min, the observed area under the PR curve (AUPR) always includes this unavoidable area, giving the metric a nonzero floor whose size depends only on the class skew.
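The size of the unachievable region follows from integrating p_min(r) over recall in [0, 1], which gives the closed form 1 + ((1 − π) ln(1 − π)) / π. A quick numerical sketch (function names are ours) confirms the integral agrees with the closed form:

```python
import math

def min_precision(recall, pi):
    # p_min(r) = pi*r / (pi*r + (1 - pi))
    return (pi * recall) / (pi * recall + (1.0 - pi))

def unachievable_area(pi, steps=100_000):
    # Trapezoidal integration of p_min over recall in [0, 1].
    h = 1.0 / steps
    total = 0.5 * (min_precision(0.0, pi) + min_precision(1.0, pi))
    total += sum(min_precision(k * h, pi) for k in range(1, steps))
    return total * h

def unachievable_area_closed_form(pi):
    # 1 + ((1 - pi) * ln(1 - pi)) / pi
    return 1.0 + (1.0 - pi) * math.log(1.0 - pi) / pi
```

Note that the area grows with π: for π = 0.1 it is only about 0.052, while for π = 0.5 it is about 0.307, so the floor built into AUPR varies substantially across data sets with different skews.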
To obtain a performance measure that reflects only the classifier’s discriminative ability, the authors propose normalizing by the minimum achievable area:
AUPR_norm = (AUPR_observed − AUPR_min) / (1 − AUPR_min), where AUPR_min = 1 + ((1 − π) ln(1 − π)) / π is the size of the unachievable region.
This rescales the metric to the interval [0, 1], so that the worst achievable PR curve scores 0 and a perfect classifier scores 1.
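The normalization can be sketched in a few lines. This assumes the floor is the minimum achievable AUPR, 1 + ((1 − π) ln(1 − π)) / π; the function names are ours:

```python
import math

def aupr_min(pi):
    # Minimum achievable AUPR at prevalence pi (the unachievable area).
    return 1.0 + (1.0 - pi) * math.log(1.0 - pi) / pi

def normalized_aupr(aupr, pi):
    # Rescale so the worst achievable PR curve scores 0 and a perfect one scores 1.
    floor = aupr_min(pi)
    return (aupr - floor) / (1.0 - floor)
```

For example, at π = 0.1 the floor is about 0.052, so an observed AUPR of 0.5 normalizes to roughly 0.47, stripping out the portion of the raw score that any classifier would get for free.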