VC dimension of ellipsoids
📝 Original Info
- Title: VC dimension of ellipsoids
- ArXiv ID: 1109.4347
- Date: 2011-09-21
- Authors: Yohji Akama and Kei Irie
📝 Abstract
We will establish that the VC dimension of the class of d-dimensional ellipsoids is (d^2+3d)/2, and that the maximum likelihood estimate with N-component d-dimensional Gaussian mixture models induces a geometric class having VC dimension at least N(d^2+3d)/2.
Keywords: VC dimension; finite dimensional ellipsoid; Gaussian mixture model
💡 Deep Analysis

📄 Full Content
The VC dimension of a class measures the complexity of the class and is employed in empirical process theory [4], statistical and computational learning theory [8,3], and discrete geometry [6]. Although asymptotic estimates of VC dimensions are given for many classes, exact values are known for only a few (e.g. the class of Euclidean balls [10] and the class of halfspaces [6]).
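In Section 2, we prove:
Theorem 1. The class of $d$-dimensional ellipsoids has VC dimension $(d^2+3d)/2$.
Let $G_d$ be the class of $d$-dimensional Gaussian distributions, each given by a mean vector in $\mathbb{R}^d$ and a covariance matrix, where a covariance matrix of size $d$ is, by definition, a real, positive definite matrix. As in statistical learning theory [8], for a class $P$ of probability density functions we consider the class $D(P)$ of sets $\{x \in \mathbb{R}^d ;\ f(x) > s\}$ such that $f$ is any probability density function in $P$ and $s$ is any positive real number. Then $D(G_d)$ is the class of $d$-dimensional ellipsoids.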
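The identification of the sets $\{x ;\ f(x) > s\}$ with ellipsoids can be checked numerically. The following is a minimal sketch for $d = 2$ only, assuming the usual Gaussian density with mean `mu` and symmetric positive definite covariance `sigma`; the helper names (`mahalanobis_sq`, `density`, `radius_sq`, `agree_on_random_points`) are illustrative and do not come from the paper. It relies on the equivalence $f(x) > s \iff (x-\mu)^{\top}\Sigma^{-1}(x-\mu) < -2\log\bigl(2\pi s\sqrt{\det\Sigma}\bigr)$, which describes a nonempty ellipsoid exactly when $s$ is below the maximum of the density.

```python
import math
import random


def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance (x - mu)^T sigma^{-1} (x - mu) for a 2x2 sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # sigma^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def density(x, mu, sigma):
    """Density of the 2-dimensional Gaussian with mean mu and covariance sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    return math.exp(-0.5 * mahalanobis_sq(x, mu, sigma)) / (2 * math.pi * math.sqrt(det))


def radius_sq(s, sigma):
    """r^2 with {density > s} = {mahalanobis_sq < r^2}; positive iff s < max density."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    return -2.0 * math.log(s * 2 * math.pi * math.sqrt(det))


def agree_on_random_points(mu, sigma, s, trials=1000, seed=0):
    """Compare the superlevel-set test with the ellipsoid test on random points."""
    rng = random.Random(seed)
    r2 = radius_sq(s, sigma)
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        m = mahalanobis_sq(x, mu, sigma)
        if abs(m - r2) < 1e-9:  # skip points numerically on the boundary
            continue
        if (density(x, mu, sigma) > s) != (m < r2):
            return False
    return True
```

For example, `agree_on_random_points((1.0, -0.5), ((2.0, 0.5), (0.5, 1.0)), 0.05)` compares the two membership tests on 1000 random points of the square $[-5,5]^2$.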
For a positive integer $N$, an $N$-component $d$-dimensional Gaussian mixture model [7] (an $(N, d)$-gmm) is, by definition, any probability distribution belonging to the convex hull of some $N$ $d$-dimensional Gaussian distributions. Suppose we are given a sample from a population $(N, d)$-gmm but the number $N$ of components is unknown. Selecting $N$ from the sample is an instance of Akaike's model selection problem [1] (see [5] for a recent approach). The authors of [9] proposed choosing $N$ by the structural risk minimization principle [8], in which an important role is played by the VC dimension of the class $D((G_d)^N)$, where $(G_d)^N$ is the class of $(N, d)$-gmms. Our result is that the VC dimension of $D((G_d)^N)$ is greater than or equal to $N(d^2+3d)/2$.
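To make the two quantities concrete, the short sketch below tabulates them. The function names are hypothetical. The observation that $(d^2+3d)/2 = d + d(d+1)/2$ matches the number of free parameters of a center together with a symmetric $d \times d$ matrix is offered only as a sanity check, not as part of the proof.

```python
def ellipsoid_vc_dimension(d: int) -> int:
    """(d^2 + 3d)/2, the value stated for d-dimensional ellipsoids."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return (d * d + 3 * d) // 2  # d * (d + 3) is always even


def gmm_vc_lower_bound(n_components: int, d: int) -> int:
    """N(d^2 + 3d)/2, the stated lower bound for N-component mixtures."""
    if n_components < 1:
        raise ValueError("n_components must be a positive integer")
    return n_components * ellipsoid_vc_dimension(d)


def center_and_symmetric_matrix_parameters(d: int) -> int:
    """d coordinates of a center plus d(d+1)/2 entries of a symmetric matrix."""
    return d + d * (d + 1) // 2


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, ellipsoid_vc_dimension(d), gmm_vc_lower_bound(3, d))
```

For $d = 1$ the formula gives 2, which agrees with the familiar VC dimension of intervals on the real line.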
We will prove Theorem 1. For a positive integer $B$, a vector $a \in \mathbb{R}^B \setminus \{0\}$, and $c \in \mathbb{R}$, we write an affine function $\ell_{a,c}(x) := {}^{t}a\,x + c$ ($x \in \mathbb{R}^B$) and an open halfspace $H_{a,c} := \{x \in \mathbb{R}^B ;\ \ell_{a,c}(x) < 0\}$. We say a set $W \subseteq \mathbb{R}^B$ spans an affine subspace $H \subseteq \mathbb{R}^B$ if $H$ is the smallest affine subspace that contains $W$. The cardinality of a set $S$ is denoted by $|S|$. For a vector $a = {}^{t}(a_1, \ldots, a$
Proof. By an affine transformation we can assume without loss of generality that all the components of the vector $a$ are 1 and that $S$ is the canonical basis $\{e$
Proof. Let $B$ be the right-hand side. Let $\phi$ be a map $S^{d-1} \to \mathbb{R}^B$ which maps
there is some set $S \subset S^{d-1}$ such that $|S| = B$ and $\phi(S)$ spans the hyperplane. Let $a \in \mathbb{R}^B$ be a vector with the first $d$ components being 1 and the other components being 0. By Lemma 2, for any $\varepsilon > 0$ the family
By the definition of $\phi$, the class of sets defined by quadratic inequalities
But, when ε is sufficiently small, all of these sets are ellipsoids.
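The mechanism behind this argument is that a quadratic inequality in $\mathbb{R}^d$ becomes a linear inequality once the monomials of degree 1 and 2 are taken as coordinates of $\mathbb{R}^B$ with $B = (d^2+3d)/2$. The sketch below illustrates this with a hypothetical feature map `lift`, whose ordering and normalization are assumptions and need not coincide with the map $\phi$ of the paper. It converts an ellipsoid $(x-\mu)^{\top} M (x-\mu) < r^2$ with symmetric $M$ into a pair $(a, c)$ such that the same set is $\{x ;\ \langle a, \mathrm{lift}(x)\rangle + c < 0\}$.

```python
import random
from itertools import combinations_with_replacement


def lift(x):
    """Hypothetical map R^d -> R^B: all x_i, then all products x_i * x_j with i <= j."""
    d = len(x)
    quad = [x[i] * x[j] for i, j in combinations_with_replacement(range(d), 2)]
    return list(x) + quad


def ellipsoid_as_halfspace(mu, m, r2):
    """Return (a, c) with (x-mu)^T m (x-mu) - r2 == <a, lift(x)> + c, for symmetric m."""
    d = len(mu)
    m_mu = [sum(m[i][j] * mu[j] for j in range(d)) for i in range(d)]
    linear = [-2.0 * v for v in m_mu]  # coefficients of x_i
    quad = [m[i][j] if i == j else 2.0 * m[i][j]  # coefficients of x_i * x_j
            for i, j in combinations_with_replacement(range(d), 2)]
    c = sum(mu[i] * m_mu[i] for i in range(d)) - r2  # constant term
    return linear + quad, c


def quadratic_value(x, mu, m, r2):
    """Direct evaluation of (x-mu)^T m (x-mu) - r2."""
    d = len(x)
    diff = [x[i] - mu[i] for i in range(d)]
    return sum(diff[i] * m[i][j] * diff[j] for i in range(d) for j in range(d)) - r2


def affine_value(a, c, z):
    """Evaluation of the affine function <a, z> + c."""
    return sum(ai * zi for ai, zi in zip(a, z)) + c


def check_random_points(mu, m, r2, trials=200, seed=1):
    """Check that both evaluations agree (up to rounding) on random points."""
    rng = random.Random(seed)
    a, c = ellipsoid_as_halfspace(mu, m, r2)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in mu]
        if abs(quadratic_value(x, mu, m, r2) - affine_value(a, c, lift(x))) > 1e-9:
            return False
    return True
```

Because every ellipsoid is the preimage of an open halfspace under such a map, bounds on the VC dimension of suitable families of halfspaces in $\mathbb{R}^B$ transfer to ellipsoids in $\mathbb{R}^d$.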
We verify the converse inequality.
Below, the convex hull of a set A is denoted by conv(A).
If there are $x = (u, x_B)$, $y = (u, y_B) \in S$ such that $x_B < y_B$, then for any $a \in \mathbb{R}^B$ with the last component nonnegative and for any $c \in \mathbb{R}$ we have $\ell_{a,c}(x) \le \ell_{a,c}(y)$, and thus $x \in H_{a,c} = \{x \in \mathbb{R}^B ;\ \ell_{a,c}(x) < 0\}$ whenever $y \in H_{a,c}$. Hence no member of $C$ cuts $\{y\}$ out of $S$, which contradicts the assumption that $C$ shatters $S$. Therefore, for the canonical projection $\pi$ :
By applying Radon's theorem [6] to the set $\pi(S) \subset \mathbb{R}^{B-1}$, there is a partition $(T_1, T_2)$ of $S$ such that we can take $y$ from $\mathrm{conv}(\pi(T_1)) \cap \mathrm{conv}(\pi(T_2))$. Then we see that there are $z, z' \in \mathbb{R}$ such that $(y, z) \in \mathrm{conv}(T_1)$ and $(y, z') \in \mathrm{conv}(T_2)$. Because $C$ shatters $S$, there are some $a \in \mathbb{R}^B$ and some $c \in \mathbb{R}$ such that the last component $a_B$ of $a$ is nonnegative and the halfspace $H_{a,c} \in C$ cuts $T_1$ out of $S$. Thus, we have $\ell_{a,c}(x) < 0$ for all $x \in \mathrm{conv}(T_1)$ while $\ell_{a,c}(x) \ge 0$ for all $x \in \mathrm{conv}(T_2)$, where $T_2 = S \setminus T_1$. Therefore $\ell_{a,c}(y, z) < 0 \le \ell_{a,c}(y, z')$, that is, $a_B z < a_B z'$. Since $a_B \ge 0$, this forces $a_B > 0$ and $z' > z$. On the other hand, some member $H_{a',c'} \in C$ cuts $T_2$ out of $S$. By similar reasoning, we have $z > z'$, which is a contradiction.
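Radon's theorem states that any $d+2$ points in $\mathbb{R}^d$ can be split into two sets whose convex hulls intersect. As an illustration of the step above, the following sketch computes such a partition exactly with rational arithmetic: it finds a nonzero $\lambda$ with $\sum_i \lambda_i x_i = 0$ and $\sum_i \lambda_i = 0$, then separates the indices by the sign of $\lambda_i$. The function names `null_vector` and `radon_partition` are hypothetical, and this is one standard construction rather than anything taken from the paper.

```python
from fractions import Fraction


def null_vector(rows, n):
    """A nonzero rational solution v of rows @ v = 0, assuming len(rows) < n."""
    mat = [[Fraction(v) for v in row] for row in rows]
    pivots = []  # pivots[k] is the pivot column of reduced row k
    r = 0
    for col in range(n):
        pr = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column: it is a free column
        mat[r], mat[pr] = mat[pr], mat[r]
        pv = mat[r][col]
        mat[r] = [v / pv for v in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][col] != 0:
                f = mat[i][col]
                mat[i] = [vi - f * vr for vi, vr in zip(mat[i], mat[r])]
        pivots.append(col)
        r += 1
        if r == len(mat):
            break
    free = next(c for c in range(n) if c not in pivots)
    v = [Fraction(0)] * n
    v[free] = Fraction(1)  # set one free variable to 1, the others to 0
    for k, col in enumerate(pivots):
        v[col] = -mat[k][free]
    return v


def radon_partition(points):
    """Return (pos, rest, common): index sets whose convex hulls share the point common."""
    d, n = len(points[0]), len(points)
    if n < d + 2:
        raise ValueError("need at least d + 2 points in R^d")
    # d coordinate equations plus the equation sum(lambda) = 0
    rows = [[p[k] for p in points] for k in range(d)] + [[1] * n]
    lam = null_vector(rows, n)
    pos = [i for i in range(n) if lam[i] > 0]
    rest = [i for i in range(n) if lam[i] <= 0]
    total = sum(lam[i] for i in pos)
    common = [sum(lam[i] * points[i][k] for i in pos) / total for k in range(d)]
    return pos, rest, common
```

For the four corners `[(0, 0), (1, 0), (0, 1), (1, 1)]` of the unit square, the two index sets are the two diagonals and the common point is the center of the square.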
Proof. Let $0 \notin \mathrm{conv}(A)$. Then for every finite subset $A'$ of $A$, $0 \notin \mathrm{conv}(A')$, and there is a hyperplane $J$ through 0 such that $\mathrm{conv}(A')$ is contained in one of the two open halfspaces determined by $J$. So there is a new rectangular coordinate system with the same origin as the original one, in which one of the coordinate axes is normal to $J$ and any $a \in A'$ is represented as $(a_1, \ldots, a_B)$ with $a_B > 0$. So $\mathrm{VCdim}(\{H_{a,c}\}_{a \in A', c \in \mathbb{R}}) \le B$ by Lemma 4. Since shattering a finite set involves only finitely many halfspaces, it follows that $\mathrm{VCdim}(\{H_{a,c}\}_{a \in A, c \in \mathbb{R}}) \le B$.
The proof of Theorem 1 is as follows. By Lemma 3, we have only to establish that the class of $d$-dimensional ellipsoids has VC dimension less than or equal to $B := (d^2+3d)/2$. Assume otherwise. For $a = {}^{t}(a_1, \ldots, a_B) \in \mathbb{R}^B$ and $x = {}^{t}(x_1, \ldots, x_d)$, define a quadratic form $q_a(x)$ and a quadratic polynomial $p_a(x)$ by
Let $A$ be the set of $a \in \mathbb{R}^B$ s