Almost Euclidean subspaces of ℓ_1^N via expander codes
We give an explicit (in particular, deterministic polynomial time) construction of subspaces X of R^N of dimension (1-o(1))N such that for every element x in X, |x|_1 and N^{1/2} |x|_2 are equivalent up to a factor of (log N)^{log log log N}. If we are allowed to use N^{o(1)} random bits, this factor can be improved to poly(log N). Our construction makes use of unbalanced bipartite graphs to impose local linear constraints on vectors in the subspace, and our analysis relies on expansion properties of the graph. This is inspired by similar constructions of error-correcting codes.
Classical results in high-dimensional geometry [13, 23] state that a random (with respect to the Haar measure) subspace X ⊆ R^N of dimension εN [13] or even (1 − ε)N [23] is an almost Euclidean section of ℓ_1^N, in the sense that ‖x‖_1 and √N · ‖x‖_2 are within constant factors of each other, uniformly for every x ∈ X. Indeed, this is a particular example of the use of the probabilistic method, a technique which is now ubiquitous in asymptotic geometric analysis.
On the other hand, it is usually the case that objects constructed in such a manner are very hard to come by explicitly. Motivated in part by ever growing connections with combinatorics and theoretical computer science, the problem of explicit constructions of such subspaces has gained substantially in popularity over the last several years; see, e.g. [36,Sec. 4], [30,Prob. 8], [22,Sec. 2.2]. Indeed, such subspaces (viewed as embeddings) are important for problems like high-dimensional nearest-neighbor search [19] and compressed sensing [10], and one expects that explicit constructions will lead, in particular, to a better understanding of the underlying geometric structure. (See also the end of the introduction for a discussion of the relevance to compressed sensing.)
If one relaxes the requirement that dim(X) = Ω(N), or allows a limited amount of randomness in the construction, a number of results are known. In order to review these, we define the distortion ∆(X) of X ⊆ R^N by

  ∆(X) = √N · max_{0 ≠ x ∈ X} ‖x‖_2 / ‖x‖_1.
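The definition above can be probed numerically. The sketch below (all function names are ours, purely illustrative) draws Gaussian samples from ker(A_{k,N}) for a random sign matrix and reports the largest observed value of √N · ‖x‖_2 / ‖x‖_1; since the true distortion is a maximum over the whole subspace, sampling only gives a lower bound on ∆(X):

```python
import math
import random

def gram_schmidt(rows):
    # Orthonormalize the given rows, dropping near-dependent ones.
    basis = []
    for r in rows:
        v = list(r)
        for b in basis:
            c = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - c * bi for vi, bi in zip(v, b)]
        n = math.sqrt(sum(vi * vi for vi in v))
        if n > 1e-10:
            basis.append([vi / n for vi in v])
    return basis

def sample_kernel_vector(row_basis, N, rng):
    # A Gaussian vector projected onto the orthogonal complement of the
    # row space of A, i.e. a random element of ker(A).
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    for b in row_basis:
        c = sum(xi * bi for xi, bi in zip(x, b))
        x = [xi - c * bi for xi, bi in zip(x, b)]
    return x

def distortion_lower_bound(N=64, k=32, samples=200, seed=1):
    rng = random.Random(seed)
    # Random k x N sign matrix A_{k,N}.
    A = [[rng.choice([-1.0, 1.0]) for _ in range(N)] for _ in range(k)]
    row_basis = gram_schmidt(A)
    best = 0.0
    for _ in range(samples):
        x = sample_kernel_vector(row_basis, N, rng)
        l1 = sum(abs(xi) for xi in x)
        l2 = math.sqrt(sum(xi * xi for xi in x))
        if l1 > 1e-12:
            best = max(best, math.sqrt(N) * l2 / l1)
    return best

print(distortion_lower_bound())
```

By Cauchy–Schwarz, ‖x‖_1 ≤ √N · ‖x‖_2, so every sampled ratio is at least 1; typical samples from a random half-dimensional kernel stay within a small constant, matching the classical results cited above.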
In the first direction, it is well known that an explicit construction with distortion O(1) and dim(X) = Ω(√N) can be extracted from Rudin [32] (see also [26] for a more accessible exposition). Indyk [20] presented a deterministic polynomial-time construction with distortion 1 + o(1) and dim(X) ≥ N/exp(O(log log N)^2). Another very interesting line of research, pursued by various authors in quite different contexts, is to achieve, in the terminology of theoretical computer science, a partial derandomization of the original (existential) results. The goal is to come up with a "constructive" discrete probability measure on subspaces X of R^N such that a random (with respect to this measure) subspace still has low distortion almost surely, while the entropy of this measure (that is, the number of truly random bits necessary to sample from it) is as low as possible.
Denoting by A_{k,N} a random k × N sign matrix (i.e., with i.i.d. Bernoulli ±1 entries), one can extract from the paper [23] by Kashin that ker(A_{k,N}), a subspace of codimension at most k, has, with high probability, distortion √(N/k) · polylog(N/k). Schechtman [33] arrived at similar conclusions for subspaces generated by the rows of A_{N−k,N}. Artstein-Avidan and Milman [2] considered again the model ker(A_{k,N}) and derandomized it further, from O(N^2) to O(N log N) bits of randomness. We remark that the pseudorandom generator approach of Indyk [19] can be used to efficiently construct such subspaces using O(N log^2 N) random bits. This was further improved to O(N) bits by Lovett and Sodin [27]. Subsequent to our work, Guruswami, Lee, and Wigderson [16] used the construction approach from this paper to reduce the number of random bits to O(N^δ) for any δ > 0, while achieving distortion 2^{O(1/δ)}.
As far as deterministic constructions with dim(X) = Ω(N) are concerned, we are aware of only one result: implicit in various papers (see, e.g., [11]) is a subspace with dim(X) = N/2 and distortion O(N^{1/4}). For dim(X) ≥ 3N/4, say, it appears that nothing non-trivial was shown prior to our work.
Our main result is as follows.
Theorem 1.1. For every η = η(N), there is an explicit, deterministic polynomial-time construction of subspaces X ⊆ R^N with dim(X) ≥ (1 − η)N; in particular, a suitable choice of η = o(1) yields distortion at most (log N)^{log log log N}.
As in [23, 2, 27], our space X has the form ker(A_{k,N}) for a sign matrix A_{k,N}, but in our case this matrix is completely explicit (and, in particular, polynomial-time computable). A high-level overview of it is given in Section 1.2.3 below.
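The abstract's idea of a bipartite graph imposing local linear constraints can be rendered as a toy sketch. Below, each of k right (constraint) vertices is joined to d of the N left (coordinate) vertices, and contributes one linear constraint supported only on its neighborhood; the subspace is the kernel of the resulting matrix. This is only schematic: the paper's actual construction uses an explicit unbalanced expander rather than a random graph, and the function names here are ours:

```python
import random

def constraints_from_bipartite_graph(N, k, d, seed=0):
    # Each constraint (right) vertex picks a neighborhood of d coordinate
    # (left) vertices and carries random signs there, zeros elsewhere.
    # The subspace X is the kernel of this k x N matrix, so membership in X
    # is enforced by k purely local conditions.
    rng = random.Random(seed)
    A = []
    for _ in range(k):
        row = [0] * N
        for v in rng.sample(range(N), d):
            row[v] = rng.choice([-1, 1])
        A.append(row)
    return A

A = constraints_from_bipartite_graph(N=12, k=4, d=3, seed=7)
# Every row is supported on exactly d coordinates: a local constraint.
print([sum(1 for a in row if a != 0) for row in A])
```

Whether such a subspace is well-spread depends on the expansion of the underlying graph, which is exactly what the paper's analysis exploits; a random sparse graph here is only a stand-in for the explicit expanders used there.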
On the other hand, if we allow ourselves a small number of random bits, then we can slightly improve the bound on distortion.
Theorem 1.2. For every fixed η > 0 there is a polynomial-time algorithm using N^{1/log log N} random bits that almost surely produces a subspace X ⊆ R^N with dim(X) ≥ (1 − η)N and distortion (log N)^{O(1)}.
Low distortion of a section X ⊆ R^N intuitively means that for every non-zero x ∈ X, a "substantial" portion of its mass is spread over "many" coordinates, and we formalize this intuition by introducing the concept of a spread subspace (Definition 2.10). While this concept is tightly related to distortion, it is far more convenient to work with. In particular, using a simple spectral argument and Kerdock codes [25], [29, Chap. 15], we initialize our proof by presenting explicit subspaces with reasonably good spreading properties. These codes also appeared in the approach of Indyk [20], though they were used in a dual capacity (i.e., as generator matrices instead of check matrices).
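Definition 2.10 itself lies outside this excerpt, but the intuition admits an informal numerical rendering: a vector is "spread" when deleting its few largest-magnitude coordinates still leaves a constant fraction of its ℓ_2 mass. The helper below is a hypothetical illustration of that intuition, not the paper's formal definition:

```python
import math

def mass_outside_top_t(x, t):
    # Fraction of the l2 norm of x that survives after its t
    # largest-magnitude coordinates are deleted -- an informal proxy
    # for how "spread" the vector x is.
    idx = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    total = sum(v * v for v in x)
    kept = sum(x[i] * x[i] for i in idx[t:])
    return math.sqrt(kept / total)

# A vector concentrated on one coordinate is maximally "unspread" ...
print(mass_outside_top_t([1.0, 0.0, 0.0, 0.0], 1))  # 0.0
# ... while a flat vector keeps most of its mass: sqrt(3)/2 ~ 0.866.
print(mass_outside_top_t([1.0, 1.0, 1.0, 1.0], 1))
```

A subspace in which every non-zero vector keeps a constant fraction of its mass off any small coordinate set is, in this informal sense, well-spread, and this is what connects back to low distortion.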