Probability matrices, non-negative rank, and parameterizations of mixture models


In this paper we parameterize non-negative matrices of sum one and rank at most two. More precisely, we give a family of parameterizations using the least possible number of parameters. We also show how these parameterizations relate to a class of statistical models, known in Probability and Statistics as mixture models for contingency tables.

💡 Research Summary

The paper investigates the structure of non‑negative matrices that are also probability matrices (all entries non‑negative and summing to one) and whose ordinary matrix rank does not exceed two. The central object of study is the non‑negative rank, denoted r₊(A), which is the smallest integer k such that A can be written as a product of two non‑negative matrices of dimensions m×k and k×n. While the ordinary rank of a matrix is a lower bound for r₊, the two notions diverge in general; however, when rank(A) ≤ 2, the non‑negative rank can only be 1 or 2.
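As a small illustration of the definition, the sketch below builds a probability matrix from two non‑negative factors with inner dimension k = 2, which certifies r₊(A) ≤ 2 for that matrix. The factors `U` and `V` and the helper `matmul` are hypothetical choices made for this example and are not taken from the paper; exact fractions are used so the checks are free of rounding error.

```python
from fractions import Fraction as F

def matmul(U, V):
    """Multiply an m x k matrix by a k x n matrix (both given as lists of rows)."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*V)]
            for row in U]

# Hypothetical non-negative factors with inner dimension k = 2.
U = [[F(1, 3), F(0)],
     [F(1, 6), F(1, 6)],
     [F(0),    F(1, 3)]]            # 3 x 2, each column sums to 1/2
V = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(0),    F(1, 2), F(1, 2)]]   # 2 x 3, each row sums to 1

A = matmul(U, V)

# A product of non-negative factors is non-negative, and here it sums to one,
# so A is a probability matrix with non-negative rank at most 2.
is_nonnegative = all(x >= 0 for row in A for x in row)
total = sum(x for row in A for x in row)
```

Exhibiting such a factorisation only gives an upper bound on r₊(A); showing that no smaller k works is a separate question.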

The authors first recall that r₊(A)=1 corresponds to a rank‑one non‑negative matrix, which can be expressed uniquely (up to scaling) as an outer product a bᵀ where a∈Δ^{m‑1} and b∈Δ^{n‑1} are probability vectors (Δ^{d‑1} denotes the (d‑1)‑dimensional probability simplex in ℝ^d, i.e. the non‑negative vectors of length d that sum to one). This representation uses (m‑1)+(n‑1) free parameters.
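The rank‑one case can be checked directly. In the sketch below, the vectors `a` and `b` are arbitrary illustrative values (not from the paper); the outer product sums to one, and its row and column sums give back `a` and `b`, which is why the two probability vectors carry all of the free parameters.

```python
from fractions import Fraction as F

def outer(a, b):
    """Outer product a b^T of two vectors, returned as a list of rows."""
    return [[x * y for y in b] for x in a]

# Illustrative probability vectors (assumed values).
a = [F(1, 2), F(1, 3), F(1, 6)]   # point of the simplex in R^3, m = 3
b = [F(1, 4), F(3, 4)]            # point of the simplex in R^2, n = 2

A = outer(a, b)

row_sums = [sum(row) for row in A]         # recovers a
col_sums = [sum(col) for col in zip(*A)]   # recovers b
total = sum(row_sums)                      # equals 1

# Each probability vector loses one degree of freedom to its sum constraint.
free_parameters = (len(a) - 1) + (len(b) - 1)
```

In statistical terms this is the independence model for a two‑way contingency table: the matrix is the product of its marginals.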

The main contribution is a complete, minimal‑parameter description of all probability matrices with r₊(A)≤2. They prove that any such matrix can be written as a convex combination of two rank‑one non‑negative matrices:

A = λ a bᵀ + (1 − λ) c dᵀ, 0 ≤ λ ≤ 1,

where a, c∈Δ^{m‑1} and b, d∈Δ^{n‑1}. The key insight is that the naïve factorisation A = U Vᵀ with U∈ℝ^{m×2}_{≥0}, V∈ℝ^{n×2}_{≥0} involves 2(m + n) non‑negative parameters, many of which are redundant because of the global sum‑to‑one constraint and the scale invariance of outer products. By fixing the scale of the first component (forcing a and b to be probability vectors) and by choosing the second component (c, d) in the orthogonal complement of the first within the simplex, the authors reduce the parameter count to the theoretical minimum:

#parameters = (m‑1) + (n‑1) + 1 = m + n − 1.
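The convex‑combination form A = λ a bᵀ + (1 − λ) c dᵀ can be sketched numerically as follows. The values of `lam`, `a`, `b`, `c` and `d` are assumptions chosen for illustration, and the helpers `outer`, `mixture` and `det3` are hypothetical names; the example only checks that such a mixture is a probability matrix of rank at most two and does not reproduce the authors' reduced parameterization.

```python
from fractions import Fraction as F

def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[x * y for y in b] for x in a]

def mixture(lam, a, b, c, d):
    """Convex combination lam * a b^T + (1 - lam) * c d^T."""
    P, Q = outer(a, b), outer(c, d)
    return [[lam * p + (1 - lam) * q for p, q in zip(rp, rq)]
            for rp, rq in zip(P, Q)]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Illustrative mixture weight and probability vectors (assumed values).
lam = F(1, 3)
a = [F(1, 2), F(1, 4), F(1, 4)]
b = [F(1, 5), F(2, 5), F(2, 5)]
c = [F(1, 10), F(3, 10), F(3, 5)]
d = [F(1, 2), F(1, 3), F(1, 6)]

A = mixture(lam, a, b, c, d)

total = sum(x for row in A for x in row)                 # sums to one
is_nonnegative = all(x >= 0 for row in A for x in row)
determinant = det3(A)   # zero, since a sum of two rank-one matrices has rank <= 2
```

Read as a contingency table, `A` is a two‑component mixture: with probability `lam` the row and column categories are drawn independently from `a` and `b`, otherwise from `c` and `d`.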

They formalise this reduction using a standardisation map φ(x)=x/∑x that projects any non‑negative vector onto the simplex, thereby eliminating the scaling degrees of freedom. A rigorous two‑step proof shows (i) existence of such a decomposition for any A with r₊≤2, and (ii) uniqueness of the parameterisation up to trivial permutations, establishing a bijection between the set of admissible matrices and the product space Δ^{m‑1} × Δ^{n‑1} ×
