On the Complexity of Searching in Trees: Average-case Minimization

February 23, 2026

Reading time: 7 minutes

#Computer Science #Data Structures #Discrete Mathematics

📝 Original Info

Title: On the Complexity of Searching in Trees: Average-case Minimization
ArXiv ID: 0904.3503
Date: 2009-04-22
Authors: Ferdinando Cicalese, Tobias Jacobs, Eduardo Laber, Marco Molinaro

📝 Abstract

We focus on the average-case analysis: A function w : V -> Z+ is given which defines the likelihood for a node to be the one marked, and we want the strategy that minimizes the expected number of queries. Prior to this paper, very little was known about this natural question and the complexity of the problem had remained so far an open question. We close this question and prove that the above tree search problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the complexity of the problem with respect to the diameter size. In fact, for diameter not larger than 3 the problem can be shown to be polynomially solvable using a dynamic programming approach. In addition we prove that the problem is NP-complete even for the class of trees of maximum degree at most 16. To the best of our knowledge, the only known result in this direction is that the tree search problem is solvable in O(|V| log|V|) time for trees with degree at most 2 (paths). We match the above complexity results with a tight algorithmic analysis. We first show that a natural greedy algorithm attains a 2-approximation. Furthermore, for the bounded degree instances, we show that any optimal strategy (i.e., one that minimizes the expected number of queries) performs at most O(\Delta(T) (log |V| + log w(T))) queries in the worst case, where w(T) is the sum of the likelihoods of the nodes of T and \Delta(T) is the maximum degree of T. We combine this result with a non-trivial exponential time algorithm to provide an FPTAS for trees with bounded degree.

💡 Deep Analysis

Deep Dive into On the Complexity of Searching in Trees: Average-case Minimization.

📄 Full Content

Searching is one of the fundamental problems in Computer Science and Discrete Mathematics. In his classical book [20], D. Knuth discusses many variants of the searching problem, most of them dealing with totally ordered sets. There has been some effort to extend the available techniques for searching and for other fundamental problems (e.g. sorting and selection) to handle more complex structures such as partially ordered sets [26,11,29,28,8]. Here, we focus on searching in structures that lie between totally ordered sets and the most general posets: we wish to efficiently locate a particular node in a tree.

More formally, as input we are given a tree T = (V, E) which has a ‘hidden’ marked node and a function w : V → Z+ that gives the likelihood of a node being the one marked. In order to discover which node of T is marked, we can perform edge queries: after querying the edge e ∈ E we receive an answer stating in which of the two connected components of T \ e the marked node lies. To simplify our notation, let us assume that our input tree T is rooted at a node r so that we can specify a query to an edge e = uv, with u being the parent of v, by referring to v.
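The edge query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `parent` dictionary encoding, the function names `subtree_nodes` and `edge_query`, and the five-node example tree are all hypothetical and do not come from the paper.

```python
from collections import defaultdict

def subtree_nodes(parent, v):
    """Return the set of nodes in the subtree of the rooted tree below v (v included)."""
    children = defaultdict(list)
    for child, par in parent.items():
        if par is not None:
            children[par].append(child)
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        seen.add(u)
        stack.extend(children[u])
    return seen

def edge_query(parent, v, marked):
    """Query the edge (parent[v], v): True if the marked node lies on v's side."""
    if parent[v] is None:
        raise ValueError("the root has no parent edge to query")
    return marked in subtree_nodes(parent, v)

# Hypothetical tree rooted at "r": r has children a and b; a has children c and d.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
```

Referring to a query by the child endpoint `v`, as the text suggests, means a "yes" answer confines the marked node to the subtree below `v` and a "no" answer confines it to the rest of the tree.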

A search strategy is a procedure that decides the next query to be posed based on the outcome of the previous queries. Every search strategy for a tree T = (V, E) (or for a forest) can be represented by a binary search (decision) tree D such that a path from the root of D to a leaf indicates which queries should be made at each step to discover which node is the marked one. More precisely, a search tree for T is a triple D = (N, E', A), where N and E' are the nodes and edges of a binary tree and the assignment A : N → V satisfies the following properties: (a) for every node v of V there is exactly one leaf ℓ in D such that A(ℓ) = v; (b) [search property] if v is in the right (left) subtree of u in D then A(v) is (not) in the subtree of T rooted at A(u). For an example we refer to Figure 1.
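To make the definition concrete, here is a small Python sketch of a search tree and of the procedure that follows it. The tuple encoding, the names `run_search`, `SUBTREE`, and `locate`, and the example tree are assumptions made for illustration; the example is not the one in Figure 1.

```python
# Leaves are ("leaf", node); internal nodes are ("query", v, left, right),
# meaning "query the edge between v and its parent in T".
def run_search(D, in_subtree):
    """Follow D from its root; in_subtree(v) says whether the marked node is below v."""
    node, queries = D, 0
    while node[0] == "query":
        _, v, left, right = node
        node = right if in_subtree(v) else left  # right = "yes", left = "no"
        queries += 1
    return node[1], queries

# Hypothetical tree T rooted at "r": r has children a and b; a has children c and d.
SUBTREE = {"a": {"a", "c", "d"}, "b": {"b"}, "c": {"c"}, "d": {"d"}}

D = ("query", "a",
     ("query", "b", ("leaf", "r"), ("leaf", "b")),      # marked node not below a
     ("query", "c",
      ("query", "d", ("leaf", "a"), ("leaf", "d")),     # below a but not below c
      ("leaf", "c")))

def locate(marked):
    """Run the strategy D against an oracle that knows the marked node."""
    return run_search(D, lambda v: marked in SUBTREE[v])
```

Each node of T appears at exactly one leaf, matching property (a), and every node in a right subtree of D is assigned a node of T below the queried one, matching property (b). For instance, `locate("d")` identifies `d` after three queries, while `locate("r")` needs two.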

Given a search tree D for T, let d(u, v) be the length (in number of edges) of the path from u to v in D. Then the cost of D, or alternatively the expected number of queries of D, is given by
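The displayed formula is missing from this copy of the text. A formula consistent with the definitions above, offered as a reconstruction rather than a quotation, is the weighted sum of leaf depths. The notation root(D) and leaves(D) is assumed here for the root and the leaf set of D.

```latex
\mathrm{cost}(D) \;=\; \sum_{\ell \,\in\, \mathrm{leaves}(D)} d\bigl(\mathrm{root}(D), \ell\bigr)\, w\bigl(A(\ell)\bigr)
```

If w is read as unnormalized likelihoods, dividing this sum by w(T) yields the expected number of queries in the strict sense. That factor is the same for every search tree, so it does not affect which tree is optimal.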

Therefore, our problem can be stated as follows: given a rooted tree T = (V, E) with |V| = n and a function w : V → Z+, the goal is to compute a minimum cost search tree for T. This is a natural generalization of the problem of searching an element in a sorted list with non-uniform access probabilities.
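For very small instances the problem statement can be explored by exhaustive search. The sketch below is a plain exponential-time recursion over connected pieces of T, not an algorithm from the paper. It assumes the cost is the unnormalized weighted sum of leaf depths, and the `parent` dictionary encoding and the name `optimal_cost` are hypothetical.

```python
from functools import lru_cache

def optimal_cost(parent, w):
    """Minimum weighted sum of leaf depths over all search trees (brute force)."""
    children = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    def below(v, S):
        """Nodes of the connected piece S that lie in the subtree rooted at v."""
        out, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in S:
                out.add(u)
                stack.extend(children[u])
        return frozenset(out)

    @lru_cache(maxsize=None)
    def opt(S):
        if len(S) == 1:
            return 0  # a single candidate needs no further query
        total = sum(w[u] for u in S)  # one query adds 1 to the depth of every leaf in S
        best = None
        for v in S:
            if parent[v] in S:  # the edge (parent[v], v) lies inside S
                side = below(v, S)
                cand = total + opt(side) + opt(S - side)
                if best is None or cand < best:
                    best = cand
        return best

    return opt(frozenset(parent))

# A path a - b - c rooted at a, i.e. a "sorted list" with three elements.
path = {"a": None, "b": "a", "c": "b"}
print(optimal_cost(path, {"a": 1, "b": 1, "c": 1}))  # 5
print(optimal_cost(path, {"a": 8, "b": 1, "c": 1}))  # 12
```

On the skewed weights the best strategy first separates the heavy node `a`, which mirrors how non-uniform access probabilities shape an optimal search over a sorted list.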

The State of the Art. The variant of the problem in which the goal is to minimize the number of edge queries in the worst case, rather than minimizing the expected number of queries, has been studied in several recent papers [5,29,28]. It turns out that an optimal (worst-case) strategy can be found in linear time [28]. This is in great contrast with the state of the art (prior to this paper) about the average-case minimization we consider here. The known results amount to the O(log n)-approximation obtained by Kosaraju et al. [21], and Adler and Heeringa [2] for the much more general binary identification problem, and the constant factor approximation algorithm that two of the authors gave in [23]. However, the complexity of the average-case minimization of the tree search problem has so far remained unknown.

Our Results. We significantly narrow the gap of knowledge in the complexity landscape of the tree search problem from two different points of view. We prove that this problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the problem’s complexity with respect to the parametrization in terms of the diameter. In fact, the problem can be shown to be polynomially solvable for the class of trees of diameter at most 3. We also show that the tree search problem under average minimization is NP-complete for trees of degree at most 16 (note that in any infinite class of trees either the diameter or the degree is non-constant). This substantially improves upon the state of the art, the only known result in this direction being an O(n log n) time solution [16,14] for the class of trees with maximum degree 2. The hardness results are obtained by fairly involved reductions from the Exact 3-Set Cover problem (X3C) with multiplicity 3 [13].

In addition to the complexity results, we also significantly improve the previously known results from the algorithmic perspective. We first show that we can attain a 2-approximation by a simple greedy approach that always seeks to divide the remaining tree as evenly as possible. For bounded-degree trees, we match the new hardness results with an FPTAS. In order to obtain the FPTAS, we first devise a non-trivial Dynamic Programming based algorithm that, roughly speaking, computes the best possible search tree, among the search trees with height at most H, in O(n^2 2^H) time. Then, we show that every tree T admits a minimum cost search tree whose height is O(∆ · (log n + log w(T))), where ∆
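The greedy idea mentioned above can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: it assumes that "as evenly as possible" means balancing total weight on the two sides of the queried edge, it breaks ties by node name, and it reports the unnormalized weighted sum of leaf depths. The name `greedy_cost` and the `parent` encoding are hypothetical.

```python
def greedy_cost(parent, w):
    """Cost of the strategy that always queries the most weight-balanced edge."""
    children = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    def below(v, S):
        """Nodes of the connected piece S that lie in the subtree rooted at v."""
        out, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in S:
                out.add(u)
                stack.extend(children[u])
        return frozenset(out)

    def solve(S):
        if len(S) == 1:
            return 0
        total = sum(w[u] for u in S)
        best_side, best_gap = None, None
        for v in sorted(S):          # sorted only to make tie-breaking deterministic
            if parent[v] in S:       # the edge (parent[v], v) lies inside S
                side = below(v, S)
                gap = abs(total - 2 * sum(w[u] for u in side))  # weight imbalance
                if best_gap is None or gap < best_gap:
                    best_side, best_gap = side, gap
        # The query adds 1 to the depth of every remaining candidate.
        return total + solve(best_side) + solve(S - best_side)

    return solve(frozenset(parent))

# A path a - b - c - d rooted at a, with uniform weights.
path4 = {"a": None, "b": "a", "c": "b", "d": "c"}
print(greedy_cost(path4, {"a": 1, "b": 1, "c": 1, "d": 1}))  # 8
```

On this uniform path the greedy choice is the middle edge, which is what binary search would do. The 2-approximation guarantee quoted in the text is the paper's claim about its own greedy algorithm and is not verified by this sketch.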

…(Full text truncated)…

This content is AI-processed based on ArXiv data.