Probability matrices, non-negative rank, and parameterizations of mixture models

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

In this paper we parameterize non-negative matrices of sum one and rank at most two. More precisely, we give a family of parameterizations using the least possible number of parameters. We also show how these parameterizations relate to a class of statistical models, known in Probability and Statistics as mixture models for contingency tables.


💡 Research Summary

The paper investigates the structure of probability matrices (matrices whose entries are non-negative and sum to one) whose ordinary matrix rank does not exceed two. The central object of study is the non-negative rank, denoted r₊(A), which is the smallest integer k such that A can be written as a product of two non-negative matrices of dimensions m×k and k×n. The ordinary rank of a matrix is a lower bound for r₊, and the two notions diverge in general; however, when rank(A) ≤ 2, the non-negative rank coincides with the ordinary rank, so for a probability matrix it can only be 1 or 2.
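The definition of r₊ can be illustrated with a small check. The sketch below is only an illustration under stated assumptions: the helper names `matmul` and `certifies_nonneg_rank_at_most` and the matrices `A`, `W`, `H` are hypothetical and not taken from the paper. A pair of non-negative factors W (m×k) and H (k×n) with W H = A shows r₊(A) ≤ k; it does not show that k is minimal.

```python
from fractions import Fraction as F


def matmul(W, H):
    """Product of two matrices given as lists of rows."""
    return [[sum(w * h for w, h in zip(row, col)) for col in zip(*H)]
            for row in W]


def certifies_nonneg_rank_at_most(A, W, H):
    """True if W and H are entrywise non-negative and W H == A.

    This certifies r_+(A) <= k, where k is the number of columns of W.
    It says nothing about whether a smaller k would also work.
    """
    nonneg = all(x >= 0 for M in (W, H) for row in M for x in row)
    return nonneg and matmul(W, H) == A


# Hypothetical 2x2 probability matrix and a non-negative factorization, k = 2.
A = [[F(1, 4), F(1, 4)],
     [F(1, 8), F(3, 8)]]
W = [[F(1, 2), F(0)],
     [F(0), F(1, 2)]]
H = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]

print(certifies_nonneg_rank_at_most(A, W, H))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality test is not affected by floating-point rounding.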

The authors first recall that r₊(A)=1 corresponds to a rank-one non-negative matrix, which can be expressed as an outer product a bᵀ; the factors are unique once a∈Δ^{m-1} and b∈Δ^{n-1} are required to be probability vectors (Δ^{d-1} denotes the (d-1)-dimensional simplex of probability vectors in ℝ^d). This representation uses (m-1)+(n-1) free parameters.
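For a rank-one probability matrix the two factors can be read off directly: because a and b each sum to one, a is the vector of row sums of A and b is the vector of column sums. The following sketch checks this on a made-up example; the helper names `marginals` and `outer` and the numerical values are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]


def marginals(A):
    """Return (row sums, column sums) of a matrix given as a list of rows."""
    rows = [sum(r) for r in A]
    cols = [sum(c) for c in zip(*A)]
    return rows, cols


# Hypothetical probability vectors: a in the simplex of R^3, b in that of R^2.
a = [F(1, 2), F(1, 3), F(1, 6)]
b = [F(1, 4), F(3, 4)]
A = outer(a, b)  # rank-one probability matrix

# Recover the factors from the marginals and rebuild the matrix.
a_rec, b_rec = marginals(A)
print(a_rec == a, b_rec == b, outer(a_rec, b_rec) == A)
```

In statistical terms this is the independence model for a two-way contingency table: the joint distribution is the product of its marginals.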

The main contribution is a complete, minimal-parameter description of all probability matrices with r₊(A)≤2. They prove that any such matrix can be written as a convex combination of two rank-one non-negative matrices:

 A = λ a bᵀ + (1 − λ) c dᵀ, 0 ≤ λ ≤ 1,

where a, c∈Δ^{m-1} and b, d∈Δ^{n-1}. The key insight is that the naïve factorisation A = U Vᵀ with U∈ℝ^{m×2}_{≥0}, V∈ℝ^{n×2}_{≥0} involves 2(m + n) non-negative parameters, many of which are redundant because of the global sum-to-one constraint and the scale invariance of outer products. By fixing the scale of the first component (forcing a and b to be probability vectors) and by choosing the second component (c, d) in the orthogonal complement of the first within the simplex, the authors reduce the parameter count to the theoretical minimum:

 #parameters = (m-1) + (n-1) + 1 = m + n − 1.
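The scale redundancy of the naïve factorisation can be made concrete. Writing A = u₁v₁ᵀ + u₂v₂ᵀ with non-negative columns, each term can be rescaled as (∑u)(∑v) · (u/∑u)(v/∑v)ᵀ, and when A sums to one the two weights (∑u)(∑v) sum to one as well, which gives the mixture form with λ as the first weight. The sketch below is an illustration under assumptions, not the authors' construction: all function names and the numerical columns are hypothetical. It also checks rank ≤ 2 by testing that every 3×3 minor vanishes.

```python
from fractions import Fraction as F
from itertools import combinations


def normalize(x):
    """Map a non-negative vector with positive sum to x / sum(x)."""
    s = sum(x)
    if s <= 0:
        raise ValueError("vector must have positive sum")
    return [F(xi) / s for xi in x]


def mixture(lam, a, b, c, d):
    """Entrywise lam * a b^T + (1 - lam) * c d^T."""
    return [[lam * ai * bj + (1 - lam) * ci * dj for bj, dj in zip(b, d)]
            for ai, ci in zip(a, c)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rank_at_most_two(A):
    """True if every 3x3 minor of A is zero."""
    m, n = len(A), len(A[0])
    return all(det3([[A[r][c] for c in cols] for r in rows]) == 0
               for rows in combinations(range(m), 3)
               for cols in combinations(range(n), 3))


# Hypothetical columns of U (u1, u2) and V (v1, v2), scaled so A sums to one.
u1, u2 = [2, 1, 1], [1, 0, 3]
v1 = [F(1, 10), F(1, 20), F(0)]
v2 = [F(1, 20), F(1, 40), F(1, 40)]

# Naive form: A = u1 v1^T + u2 v2^T.
A = [[u1[i] * v1[j] + u2[i] * v2[j] for j in range(3)] for i in range(3)]

# Mixture form: absorb the scales into a single weight lam.
lam = sum(u1) * sum(v1)
B = mixture(lam, normalize(u1), normalize(v1), normalize(u2), normalize(v2))

print(lam, B == A, rank_at_most_two(A))
```

Multiplying u₁ by a positive constant and dividing v₁ by the same constant leaves A unchanged, which is why normalising the columns removes parameters without losing any matrices.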

They formalise this reduction using a standardisation map φ(x)=x/∑x that projects any non-negative vector with positive sum onto the simplex, thereby eliminating the scaling degrees of freedom. A rigorous two-step proof shows (i) existence of such a decomposition for any A with r₊≤2, and (ii) uniqueness of the parameterisation up to trivial permutations, establishing a bijection between the set of admissible matrices and the product space Δ^{m-1} × Δ^{n-1} ×

