Circuits with arbitrary gates for random operators

We consider general circuits computing $n$-operators $f : \{0,1\}^n \to \{0,1\}^n$. As gates we allow arbitrary boolean functions of their inputs; there is no restriction on their fanin or fanout. Thus, what makes such circuits complex is information transfer rather than information processing, which is the source of complexity in the case of single functions.

Such a circuit is a directed acyclic graph with $n$ input nodes $x_1, \dots, x_n$ and $n$ output nodes $y_1, \dots, y_n$. Each non-input node computes some boolean function of its predecessors. A circuit computes an operator $f = (f_1, \dots, f_n)$ if, for all $i = 1, \dots, n$, the boolean function computed at the $i$th output node $y_i$ is the $i$th component $f_i$ of the operator $f$. The depth of a circuit is the largest number of wires in a path from an input to an output node. The size of a circuit is the total number of wires in it. We denote by $s_d(f)$ the smallest number of wires in a general circuit of depth at most $d$ computing $f$. If there are no restrictions on the depth, the corresponding measure is denoted by $s(f)$. Note that $s(f) \le s_1(f) \le n^2$ holds for any $n$-operator, so quadratic lower bounds are the highest possible.

Circuits of depth 2 constitute the first non-trivial model. Interest in depth-2 circuits comes from the following important result of Valiant [17]: if in every depth-2 circuit computing $f$ with $O(n/\ln\ln n)$ gates on the middle layer, at least $n^{1+\Omega(1)}$ direct wires must connect inputs with output gates, then $f$ cannot be computed by log-depth circuits with a linear number of fanin-2 gates. Proving a super-linear lower bound for log-depth circuits is an old and well-known problem in circuit complexity.

Super-linear lower bounds up to $s_2(f) = \Omega(n \log^2 n)$ were proved using graph-theoretic arguments, by analyzing some super-concentration properties of the circuit as a graph [5,9,10,12,11,1,13,14,15].
Higher lower bounds of the form $s_2(f) = \Omega(n^{3/2})$ were recently proved using information-theoretic arguments [4,6]. For larger depth $d$, the known lower bounds are only slightly non-linear. All these bounds, however, are on the total number of wires, so they still have no consequences for log-depth circuits.

In fact, in the class of general circuits, even the complexity of a random operator remained unclear. In particular, it was unclear whether operators requiring a quadratic number of wires (even in depth 2) exist at all.

2 Circuits for general operators

Note that a direct counting argument, as in the case of constant fanin circuits, does not work for general circuits: already for $d > n + \log n$, the number $2^{2^d}$ of possible boolean functions that may be assigned to a node of fanin $d$ may be larger than the total number $2^{n2^n}$ of $n$-operators. Our first result is the observation that this bad situation can be excluded by turning the power of circuits against themselves to ensure that, in an optimal circuit, no gate has fanin larger than $n$. This leads us to

Theorem 1. For almost all $n$-operators $f$, $s(f) = \Omega(n^2)$.

Proof. Let $\mu(L)$ be the number of different $n$-operators computable by boolean circuits with at most $L$ wires. Our goal is to upper bound this number in terms of $n$ and $L$, and to compare this bound with the total number $2^{n2^n}$ of $n$-operators.

Take an optimal circuit with $\ell \le L$ wires computing some $n$-operator; hence, $\ell \le n^2$. Then $\ell = \sum_{i=1}^{m} d_i$, where $d_1, \dots, d_m$ are the fanins of its gates. It is clear that we need $m \ge n$ gates, since we must have $n$ input gates. On the other hand, $m \le \ell + n + 2 \le 2n^2$ gates are always enough, since every non-input gate, besides two possible constant gates, must have nonzero fanin. We now make use of the fact that the gates in our circuits may be arbitrary boolean functions: this allows us to assume that $d_i \le n$ for all $i$.
Indeed, if $d_i > n$, then we can replace the $i$th gate by the boolean function of the input variables computed at this gate and join it to all $n$ input variables; when doing this, the total number of wires in the circuit can only decrease.

The number of sequences $d_1, \dots, d_m$ of fanins with $0 \le d_i \le n$ does not exceed $(n+1)^m$. For each such sequence and for each $i = 1, \dots, m$, there are at most $\binom{m}{d_i} \le m^{d_i}$ possibilities to choose the set of inputs for the $i$th node and at most $2^{2^{d_i}}$ possibilities to assign a boolean function to this node. Hence, since there are at most $2n^2$ possible values of $m$ and $\sum_{i=1}^{m} d_i \le L$, we have for some $m \le 2n^2$ and some fanin sequence $d_1, \dots, d_m$ that

$$\log_2 \mu(L) \le m \log_2(n+1) + L \log_2 m + \sum_{i=1}^{m} 2^{d_i} + O(\log n).$$

We now observe that at most $n/2$ nodes can have fanin larger than $2L/n$, for otherwise we would have more than $(2L/n) \cdot (n/2) = L$ wires in total. Since $m \le 2n^2$ and since the fanin of each gate does not exceed $n$, we obtain that $\sum_{i=1}^{m} 2^{d_i} \le \frac{n}{2} 2^n + 2n^2 \cdot 2^{2L/n}$. Together with $L \le n^2$, this yields

$$\log_2 \mu(L) \le \frac{n}{2}\, 2^n + 2n^2 \cdot 4^{L/n} + O(n^2 \log n). \qquad (1)$$

Since the total number of operators $f : \{0,1\}^n \to \{0,1\}^n$ is $2^{n2^n}$, the smallest number $L$ of wires sufficient to compute all of them must satisfy $\log_2 \mu(L) \ge n2^n$. By (1), this implies

$$2n^2 \cdot 4^{L/n} \ge \frac{n}{2}\, 2^n - O(n^2 \log n).$$

Dividing both sides by $2n^2$, we obtain that $4^{L/n} = \Omega(2^n/n)$, that is, $2L/n \ge n - \log_2 n - O(1)$, and hence, $L = \Omega(n^2)$.

An important class of operators are linear ones. Each such operator computes $n$ linear forms, that is, computes a matrix-vector product $f_A(\vec{x}) = A\vec{x}$ over GF(2), where $A$ is an $n \times n$ (0,1)-matrix. We are interested in the complexity $s_2(f_A)$ of such operators in the class of depth-2 circuits.

If all gates are required to be linear (parities and their negations), then easy counting shows that some linear operators require $\Omega(n^2/\log n)$ wires. It is also known that $O(n^2/\log n)$ wires are sufficient to compute any linear operator [16,3,2]. But what if we allow arbitrary (non-linear) boolean functions as gates: can we then compute linear operators $f_A$ more efficiently? The largest known lower bound for an explicit linear operator $f_A$ has the form $s_2(f_A) = \Omega(n \log n)$ [11]. This raises the following question: do linear $n$-operators requiring $s_2(f_A) = \Omega(n^2/\log n)$ wires exist at all?
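To make the wire count of a linear depth-2 circuit concrete, here is a minimal Python sketch. It is an illustration only and not taken from the paper: the all-ones matrix, the factor matrices `B` and `C`, and the helper names `mat_vec_gf2` and `count_ones` are assumptions chosen for the example. A linear depth-2 circuit without direct input-output wires corresponds to a factorization $A = C \cdot B$ over GF(2), and its number of wires is the number of 1s in $B$ plus the number of 1s in $C$.

```python
from itertools import product

def mat_vec_gf2(A, x):
    """Return the product A x over GF(2); A is a list of 0/1 rows, x a 0/1 list."""
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

def count_ones(M):
    """Number of 1-entries of a 0/1 matrix, i.e. the number of wires it encodes."""
    return sum(sum(row) for row in M)

n = 4
A = [[1] * n for _ in range(n)]   # hypothetical example: the all-ones matrix
B = [[1] * n]                     # first level: one middle gate, the parity of all inputs
C = [[1] for _ in range(n)]       # second level: every output copies that middle gate

def depth2(x):
    """Evaluate the linear depth-2 circuit given by B (first level) and C (second level)."""
    return mat_vec_gf2(C, mat_vec_gf2(B, x))

# The depth-2 circuit agrees with x -> A x on all 2^n inputs.
agree = all(depth2(list(x)) == mat_vec_gf2(A, list(x))
            for x in product([0, 1], repeat=n))

depth1_wires = count_ones(A)                  # one wire per 1-entry of A
depth2_wires = count_ones(B) + count_ones(C)  # wires on both levels
```

For this particular low-rank matrix the depth-2 circuit uses $2n$ wires instead of the $n^2$ wires of the obvious depth-1 circuit; the question raised above is whether some matrices admit no such saving even when non-linear gates are allowed.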
We are only able to answer this question positively under the additional restriction that either all output gates or all gates on the middle layer must be linear functions. The next theorem shows that the non-linearity of middle gates is no problem: any such circuit can be transformed into a linear circuit with almost the same number of wires. Hence, some linear $n$-operators require about $n^2/\log n$ wires in such circuits.

Theorem 2. If a depth-2 circuit computes a linear $n$-operator and only has linear gates on the output layer, then it can be transformed to an equivalent linear circuit by adding at most $2n$ new wires.

Proof. Let $A$ be an $n$-by-$n$ (0,1)-matrix, and let $\Phi$ be a depth-2 circuit computing $A\vec{x}$. We may assume, for simplicity, that there are no direct wires from inputs to outputs: this can be easily achieved by adding $n$ new wires on the first level. Assume that all output gates of $\Phi$ are linear boolean functions. By adding one constant-1 function on the middle layer and at most $n$ new wires on the second level, we can also assume that each output gate computes just the sum modulo 2 of its inputs (and not the negation of this sum).

Let $h = (h_1, \dots, h_r) : \{0,1\}^n \to \{0,1\}^r$ be the operator computed by the gates on the middle layer. Since $A\vec{0} = \vec{0}$ and each output gate computes the sum modulo 2 of its inputs, we may assume that $h(\vec{0}) = \vec{0}$ as well: if $h_j(\vec{0}) = 1$ for some $j$, then replace the function $h_j$ by the function $h'_j$ such that $h'_j(\vec{0}) = 0$ and $h'_j(\vec{x}) = h_j(\vec{x})$ for all $\vec{x} \ne \vec{0}$. Let $B$ be the $n$-by-$r$ adjacency (0,1)-matrix of the bipartite graph formed by the wires joining the gates on the middle layer with those on the output layer. Then $A\vec{x} = B \cdot h(\vec{x})$ for all $\vec{x} \in \{0,1\}^n$.

Write each vector $\vec{x} = (x_1, \dots, x_n)$ as the linear combination $\vec{x} = \sum_{i=1}^{n} x_i \vec{e}_i$ of unit vectors $\vec{e}_1, \dots, \vec{e}_n \in \{0,1\}^n$, and replace the operator $h$ computed on the middle layer by the linear operator

$$h'(\vec{x}) := \sum_{i=1}^{n} x_i\, h(\vec{e}_i) \bmod 2.$$

Hence, $h'(\vec{x}) = \vec{x}^{\top} M$, where $M$ is an $n \times r$ matrix with rows $h(\vec{e}_1), \dots, h(\vec{e}_n)$. Using the linearity of the matrix-vector product, we obtain that (with all sums mod 2):

$$A\vec{x} = \sum_{i=1}^{n} x_i\, A\vec{e}_i = \sum_{i=1}^{n} x_i\, B \cdot h(\vec{e}_i) = B \cdot \sum_{i=1}^{n} x_i\, h(\vec{e}_i) = B \cdot h'(\vec{x}).$$

Hence, the new (linear) circuit $\Phi'$ computes $A\vec{x}$ as well. It remains to show that the number of wires in $\Phi'$ does not exceed the number of wires in $\Phi$. The wires on the second level have not changed at all. To show that the number of wires on the first level has not increased either, let $\mathrm{fanout}(x_i)$ be the fanout of the $i$th input node $x_i$, and $\mathrm{fanin}(h_j)$ the fanin of the $j$th gate $h_j$ on the middle layer. Then $\sum_{i=1}^{n} \mathrm{fanout}(x_i) = \sum_{j=1}^{r} \mathrm{fanin}(h_j)$ is the total number $L$ of wires on the first level. We know that $h(\vec{0}) = \vec{0}$, that is, $h_j(\vec{0}) = 0$ for all $j = 1, \dots, r$. Now we make a simple (but crucial) observation: if there is no wire from $x_i$ to $h_j$, then $h_j(\vec{e}_i) = h_j(\vec{0}) = 0$. This implies that the $j$th column of $M$ can have at most $\mathrm{fanin}(h_j)$ ones. Since the number of wires on the first level of $\Phi'$ is just the total number of 1s in $M$, we are done.

The second case, when only gates on the middle layer are required to be linear, is more delicate. It was shown in [7] that such circuits can be more powerful than linear ones. Given a boolean $n \times n$ matrix $A$, say that a circuit weakly computes the operator $f_A(\vec{x}) = A\vec{x}$ if it correctly computes it on all $n$ unit vectors $\vec{e}_1, \dots, \vec{e}_n$. Note that, for linear circuits, this is no relaxation: such a circuit weakly computes $f_A$ iff it correctly computes $f_A$ on all inputs. Hence, weakly computing some linear operators by linear depth-2 circuits requires $\Omega(n^2/\log n)$ wires. It is, however, shown in [7] that the situation changes drastically if we only use linear gates on the middle layer but allow non-linear gates on the output layer: then any linear $n$-operator can be weakly computed using only $O(n \log n)$ wires.
Still, using Kolmogorov complexity arguments, we can prove that, for some matrices $A$, such circuits require an almost quadratic number of wires to compute the entire operator $A\vec{x}$.

Theorem 3. If middle gates are required to be linear, then linear $n$-operators requiring $\Omega(n^2/\log n)$ wires in depth-2 circuits exist.

Proof. We use the Kolmogorov complexity argument known as the incompressibility argument (see [8] for background). Since we have $2^{n^2}$ matrices, some matrix $A$ requires $n^2$ bits to describe it. Hence, the linear operator $f_A(\vec{x}) = A\vec{x}$ cannot be described using fewer than $n^2 - O(1)$ bits either.

Fix an arbitrary depth-2 circuit $\Phi$ computing $f_A$, and assume that all its gates on the middle layer are linear. Let $L$ be the number of wires in $\Phi$. As before, we may assume that there are no direct wires from inputs to outputs. Our goal is to show that, using the circuit $\Phi$, the operator $f_A$ can be described using $O(L \log n)$ bits. This will imply the desired lower bound $L = \Omega(n^2/\log n)$ on the number of wires.

Let $r$ be the number of nodes on the middle layer of $\Phi$. Since only linear functions are computed at these nodes, the first level (between inputs and middle layer) computes some linear operator $\vec{y} = B\vec{x}$, where $B$ is the $r$-by-$n$ adjacency matrix of the bipartite graph formed by the wires joining the gates on the input layer with those on the middle layer. Let also $C$ be the $n$-by-$r$ adjacency matrix of the bipartite graph formed by the wires joining the gates on the middle layer with those on the output layer. Hence, $L = |B| + |C|$, where $|B|$ denotes the number of 1s in $B$.

Using these two matrices $B$ and $C$, as well as the fact that the operator computed by the circuit $\Phi$ is linear, we can encode this operator using $O(L \log n)$ bits as follows.

• Since $|B| + |C| = L$, both matrices $B$ and $C$ can be described using $O(L \log n)$ bits, just by describing the positions of their 1-entries.
• The $i$th output gate of $\Phi$ computes $g_i(B\vec{x})$, where $g_i : \{0,1\}^r \to \{0,1\}$ is some boolean function depending only on the rows of $B$ seen by this gate, that is, on the rows corresponding to the $d_i$ nodes on the middle layer seen by this gate. Let $B_i$ be the $d_i \times n$ submatrix of $B$ formed by these rows. Let $\mathrm{Im}(B_i) = \{B_i\vec{x} : \vec{x} \in \{0,1\}^n\}$ be the column space of $B_i$. If this space has dimension $t$, then any $t$ linearly independent columns of $B_i$ form its basis. Take the set $B'_i = \{\vec{u}_1, \dots, \vec{u}_t\}$ of the first $t$ linearly independent columns of $B_i$, and call it the first basis of $\mathrm{Im}(B_i)$.
• Encode the behavior of $g_i$ on this basis $B'_i$ by the string $g_i(\vec{u}_1), \dots, g_i(\vec{u}_t)$ of $t \le d_i$ bits. The entire string, for all $n$ output gates $g_1, \dots, g_n$, has length at most $\sum_{i=1}^{n} d_i = |C| \le L$.

Having this encoding, we can recover the value of the $i$th output gate on a given input $\vec{x} \in \{0,1\}^n$ as follows.

1. Compute $\vec{y}_i = B_i\vec{x}$. We can do this since the $i$th row of $C$ tells us which rows of $B$ appear in $B_i$, and we know the entire matrix $B$.
2. Take the first basis $B'_i$ of $\mathrm{Im}(B_i)$ and write $\vec{y}_i$ as a linear combination $\vec{y}_i = \sum_{k=1}^{t} \lambda_k \vec{u}_k$ of basis vectors over GF(2).
3. Give $z_i = \sum_{k=1}^{t} \lambda_k\, g_i(\vec{u}_k) \bmod 2$ as an output. We can compute this number since we know the values $g_i(\vec{u}_1), \dots, g_i(\vec{u}_t)$.

Since the circuit computes $A\vec{x}$, the $i$th output gate must compute the scalar product $\langle \vec{a}_i, \vec{x} \rangle$ of the input vector $\vec{x}$ with the $i$th row $\vec{a}_i$ of $A$. Hence, $g_i(B\vec{x}) = \langle \vec{a}_i, \vec{x} \rangle$, meaning that $g_i$ must be linear on $\mathrm{Im}(B)$. Since $g_i$ can only see the middle gates corresponding to the rows of $B_i$, this implies that $g_i$ must also be linear on $\mathrm{Im}(B_i)$. Thus,

$$z_i = \sum_{k=1}^{t} \lambda_k\, g_i(\vec{u}_k) = g_i\Big(\sum_{k=1}^{t} \lambda_k \vec{u}_k\Big) = g_i(\vec{y}_i) = g_i(B_i\vec{x}) = g_i(B\vec{x}) = \langle \vec{a}_i, \vec{x} \rangle,$$

that is, $z_i$ is the scalar product of $\vec{x}$ with the $i$th row of $A$, as desired.

We have shown that, even when arbitrary boolean functions can be used as gates, some operators $f : \{0,1\}^n \to \{0,1\}^n$ require about $n^2$ wires.
We have also shown that some linear operators require about $n^2/\log n$ wires in depth-2 circuits, if either all output gates or all gates on the middle layer are required to be linear. We conjecture that the same lower bound for depth-2 circuits computing linear operators also holds without any restrictions on the gates used.
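
As a small sanity check of the linearization step in the proof of Theorem 2, the following Python sketch builds a toy depth-2 circuit with non-linear middle gates and linear (XOR) output gates, replaces the middle layer by the linear operator with matrix $M$ whose rows are $h(\vec{e}_i)$, and verifies that the result still computes $A\vec{x}$ with no more first-level wires. The concrete circuit (an AND gate, an OR gate and a copy gate on $n = 2$ inputs) and all names in the code are hypothetical choices made for illustration; they do not appear in the paper.

```python
from itertools import product

n = 2
# Middle layer: each gate is (indices of the inputs it is wired to, boolean function).
middle = [
    ((0, 1), lambda a, b: a & b),  # h_1 = x_1 AND x_2 (non-linear)
    ((0, 1), lambda a, b: a | b),  # h_2 = x_1 OR x_2  (non-linear)
    ((0,),   lambda a: a),         # h_3 = x_1
]
# Second level: output j is the XOR of the middle gates marked in row j of B.
B = [[1, 1, 0],   # y_1 = h_1 xor h_2 = x_1 xor x_2
     [0, 0, 1]]   # y_2 = h_3 = x_1
A = [[1, 1],
     [1, 0]]      # the linear operator computed by the circuit

def mat_vec_gf2(M, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in M]

def h(x):
    """Original (possibly non-linear) operator computed on the middle layer."""
    return [g(*(x[i] for i in wires)) for wires, g in middle]

def unit(i):
    """The i-th unit vector e_i of length n."""
    return [1 if k == i else 0 for k in range(n)]

# M has rows h(e_1), ..., h(e_n); the linearized middle layer is h'(x) = x^T M.
M = [h(unit(i)) for i in range(n)]

def h_lin(x):
    """Linear replacement h'(x) = x^T M over GF(2)."""
    return [sum(x[i] * M[i][j] for i in range(n)) % 2 for j in range(len(M[0]))]

inputs = [list(x) for x in product([0, 1], repeat=n)]
original_ok = all(mat_vec_gf2(B, h(x)) == mat_vec_gf2(A, x) for x in inputs)
linear_ok = all(mat_vec_gf2(B, h_lin(x)) == mat_vec_gf2(A, x) for x in inputs)

wires_before = sum(len(wires) for wires, _ in middle)  # first-level wires of the original circuit
wires_after = sum(sum(row) for row in M)               # first-level wires of the linear circuit
```

In this toy instance the first level shrinks from 5 wires to 3, in line with the observation that the $j$th column of $M$ has at most $\mathrm{fanin}(h_j)$ ones.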
