Finding Weighted Graphs by Combinatorial Search
We consider the problem of finding the edges of a hidden weighted graph using a certain type of query. Let $G$ be a weighted graph with $n$ vertices. In the most general setting, the $n$ vertices are known and no other information about $G$ is given. The problem is to find all edges of $G$ and their weights using additive queries, where, for an additive query, one chooses a set of vertices and asks for the sum of the weights of the edges with both ends in the set. This model has been used extensively in bioinformatics, including genome sequencing. Extending recent results of Bshouty and Mazzawi, and Choi and Kim, we present a polynomial time randomized algorithm to find the hidden weighted graph $G$ when the number of edges in $G$ is known to be at most $m \geq 2$ and the weight $w(e)$ of each edge $e$ satisfies $\alpha \leq |w(e)| \leq \beta$ for fixed constants $\alpha, \beta > 0$. The query complexity of the algorithm is $O(\frac{m \log n}{\log m})$, which is optimal up to a constant factor.
💡 Research Summary
The paper addresses the problem of reconstructing a hidden weighted graph G with n vertices when only additive queries are allowed. An additive query selects a subset S of vertices and returns the sum of the weights of all edges both of whose endpoints lie in S. The vertices themselves are known, but the edge set and the individual edge weights are unknown. The authors assume that the graph contains at most m edges (with m ≥ 2) and that every edge weight w(e) satisfies a fixed magnitude bound α ≤ |w(e)| ≤ β, where α and β are positive constants. This setting captures many practical scenarios, especially in bioinformatics, where sequencing technologies often provide only aggregate signals from subsets of nucleotides.
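To make the query model concrete, here is a minimal sketch of an additive-query oracle in Python. The names `additive_query`, `edges`, and `weights` are illustrative; the paper treats the oracle as a black box that answers queries about the hidden graph.

```python
def additive_query(edges, weights, S):
    """Return the sum of weights of edges with both endpoints in S.

    edges:   list of vertex pairs (u, v) of the hidden graph
    weights: parallel list of (possibly negative) edge weights
    S:       the queried vertex subset
    """
    S = set(S)
    return sum(w for (u, v), w in zip(edges, weights) if u in S and v in S)

# Example: a triangle on vertices 0, 1, 2 with weights 1.5, -2.0, 3.0
edges = [(0, 1), (1, 2), (0, 2)]
weights = [1.5, -2.0, 3.0]
print(additive_query(edges, weights, {0, 1}))     # only edge (0, 1) -> 1.5
print(additive_query(edges, weights, {0, 1, 2}))  # all three edges -> 2.5
```

Note that the answer aggregates over all edges inside S, so a single query cannot in general isolate one edge; the algorithm's work lies in choosing subsets whose answers jointly determine the graph.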
The contribution builds on earlier work by Bshouty‑Mazzawi and by Choi‑Kim, which dealt with unweighted or positively‑weighted graphs and achieved query complexities on the order of O(m log n). The new algorithm removes the positivity restriction, handles arbitrary signs, and exploits the known lower bound α to achieve a strictly better query bound.
The algorithm proceeds in two main phases. In the first phase, a randomized “sparsification” step repeatedly draws random vertex subsets and issues additive queries. By interpreting each query result as a linear equation over the unknown edge weights, the algorithm identifies a small candidate set C of vertex pairs that are likely to contain the true edges. This phase uses a divide‑and‑conquer scheme that recursively partitions the vertex set, ensuring that after O(log m) levels each candidate edge appears in a bounded number of equations. The crucial observation is that because every non‑zero weight is at least α in magnitude, any equation that contains a true edge cannot be zero, which guarantees that the random sampling does not miss edges with high probability.
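The view of a query as a linear equation over the unknown pair-weights can be sketched as follows. This is a simplified illustration, not the paper's construction: `random_query_equation` and the `oracle` interface are hypothetical names, and the real algorithm draws subsets within a recursive partition rather than uniformly over all vertices.

```python
import random

def random_query_equation(n, oracle, rng=random.Random(0)):
    """Draw a random vertex subset (each vertex kept with prob. 1/2)
    and express the resulting additive query as a 0/1 linear equation
    over the C(n, 2) unknown pair-weights x_{uv}.

    Returns (coeffs, rhs): coeffs[(u, v)] is 1 iff both u and v landed
    in the subset, and rhs is the oracle's answer for that subset.
    """
    S = {v for v in range(n) if rng.random() < 0.5}
    coeffs = {(u, v): int(u in S and v in S)
              for u in range(n) for v in range(u + 1, n)}
    return coeffs, oracle(S)

# Toy hidden graph on 4 vertices: a single edge (0, 1) with weight 2.0.
hidden = {(0, 1): 2.0}
oracle = lambda S: sum(w for (u, v), w in hidden.items() if u in S and v in S)
coeffs, rhs = random_query_equation(4, oracle)
# The answer is exactly the coefficient-weighted sum of hidden edges.
assert rhs == sum(hidden[e] * coeffs[e] for e in hidden)
```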
In the second phase the algorithm determines the exact edges and their weights from the candidate set C. It issues additional queries on carefully chosen subsets of vertices, producing an over-determined linear system whose unknowns are the weights of the edges in C. The system is solved using a randomized version of Gaussian elimination that runs in polynomial time. The bounds α and β allow the algorithm to filter out spurious solutions: any solution that violates the magnitude constraints is discarded, leaving only the true edge weights.
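The solve-then-filter step can be illustrated with a small sketch, assuming for simplicity a square system (the paper works with an over-determined system and a randomized elimination step). The coefficient rows record which candidate pairs each query covers; after solving, any recovered weight whose magnitude falls outside [α, β] is treated as a non-edge.

```python
def solve_and_filter(A, b, alpha, beta, tol=1e-9):
    """Solve the square system A x = b by Gaussian elimination with
    partial pivoting, then zero out entries whose magnitude lies
    outside [alpha, beta] (treated as spurious non-edges)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    # Keep only weights consistent with the magnitude bounds.
    return [w if alpha - tol <= abs(w) <= beta + tol else 0.0 for w in x]

# Three queries over candidate pairs e1, e2, e3 with true weights
# 1.5, -2.0, 3.0; each row marks which pairs the query covered.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
b = [-0.5, 1.0, 4.5]
print(solve_and_filter(A, b, alpha=1.0, beta=3.0))  # -> [1.5, -2.0, 3.0]
```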
The authors prove that the total number of additive queries required is $O\!\left(\frac{m \log n}{\log m}\right)$, which matches the lower bound up to a constant factor.