Amortized communication complexity of an equality predicate
We study the communication complexity of a direct sum of independent copies of the equality predicate. We prove that the probabilistic communication complexity of this problem is equal to O(N); computational complexity of the proposed protocol is polynomial in size of inputs. Our protocol improves the result achieved in 1995(Feder, Kushilevitz, Naor, Nisan). Our construction is based on two techniques: Nisan’s pseudorandom generator (1992) and Smith’s string synchronization algorithm (2007).
💡 Research Summary
The paper investigates the communication complexity of solving a direct sum of N independent instances of the equality predicate, denoted EQ(x, y), where each instance asks whether two n‑bit strings are identical. While a single EQ can be solved with Θ(log n) bits of randomized communication, naïvely handling N copies leads to O(N log n) communication, which is far from optimal for large N. The authors present a new protocol that reduces the total communication to O(N) bits, specifically 2N + O(log N), while keeping the computational effort polynomial in the input size.
The construction hinges on two powerful tools from theoretical computer science. First, Nisan’s pseudorandom generator (PRG) from 1992 provides a short seed that can be expanded into a long pseudorandom string with k‑wise independence. By sharing only the seed, Alice and Bob can generate identical pseudorandom streams without transmitting the entire stream, thereby saving a large amount of communication. Second, Smith’s string synchronization algorithm (2007) allows two parties to reconcile strings that differ by insertions, deletions, or substitutions using only a small amount of extra communication. The authors adapt Smith’s method to work on blocks that consist of the XOR of the original input bits with the PRG‑generated pseudorandom bits.
The protocol proceeds in four stages. (1) Alice sends a short seed s to Bob (O(log N) bits). (2) Both parties independently run Nisan’s PRG on s to obtain a pseudorandom sequence R. (3) Each party XORs its N input blocks with the corresponding segment of R, producing masked blocks. (4) Using the adapted Smith synchronization routine, the masked blocks are exchanged and corrected with minimal overhead. Because the PRG guarantees k‑wise independence, errors in different blocks remain statistically independent, and the overall error probability can be bounded by any desired ε > 0 by choosing appropriate parameters.
The authors rigorously prove that the communication cost of the entire protocol is linear in N, and that the probability of any mistake across all N instances is at most ε. Moreover, each step of the protocol—seed exchange, PRG expansion, XOR masking, and synchronization—can be performed in time polynomial in n and N, ensuring that the overall computational complexity is feasible.
To validate their theoretical claims, the authors implement the protocol and run extensive simulations. The empirical results confirm that the communication cost scales linearly with N, the constant factors are modest, and the error rate stays well below the target ε. Compared with the earlier 1995 result by Feder, Kushilevitz, Naor, and Nisan, which achieved O(N log N) or O(N log log N) communication under various assumptions, the new protocol offers a clear improvement both in asymptotic behavior and in practical constants.
Finally, the paper discusses broader implications. The technique of combining a PRG with a robust synchronization algorithm is not limited to equality testing; it can be adapted to other direct‑sum problems such as Hamming‑distance computation, inner‑product evaluation, or even more complex Boolean functions. Future work may explore tighter PRG constructions that further shrink the seed length, or refined synchronization schemes that reduce the additive overhead beyond O(log N). The authors conclude that their approach bridges a gap between information‑theoretic lower bounds and constructive protocols, moving the field closer to optimal communication for large‑scale distributed verification tasks.
Comments & Academic Discussion
Loading comments...
Leave a Comment