Two Algorithms for Finding $k$ Shortest Paths of a Weighted Pushdown Automaton
We introduce efficient algorithms for finding the $k$ shortest paths of a weighted pushdown automaton (WPDA), a compact representation of a weighted set of strings with potential applications in parsing and machine translation. Both of our algorithms are derived from the same weighted deductive logic description of the execution of a WPDA using different search strategies. Experimental results show our Algorithm 2 adds very little overhead vs. the single shortest path algorithm, even with a large $k$.
💡 Research Summary
The paper addresses the problem of extracting the k shortest paths from a Weighted Pushdown Automaton (WPDA), a compact formalism that represents a weighted language with context‑free capabilities. Such a structure is highly relevant for applications like parsing, machine translation, and speech recognition, where one often needs not just the single best derivation but a set of top‑ranked alternatives.
The authors first formalize the execution of a WPDA as a weighted deductive logic system. In this system, each inference step corresponds to either a “scan” (consuming an input symbol) or a “reduce” (manipulating the stack). An item is a triple ⟨state, stack‑symbol, cost⟩, and the deductive rules generate new items by adding the appropriate transition weight. The goal is to derive an item whose state is final and whose stack is empty; each derivation corresponds to a concrete path (i.e., a string together with its total weight).
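The item-and-rule machinery above can be sketched concretely. The encoding below is illustrative, not the paper's: the item carries the whole stack rather than only the top symbol (so the "final state, empty stack" goal test can be checked directly), and the 4-tuple transition format is a hypothetical convenience.

```python
from dataclasses import dataclass

# Hypothetical encoding of a deductive item: automaton state, the
# current stack (top symbol last), and the accumulated cost.
@dataclass(frozen=True)
class Item:
    state: str
    stack: tuple
    cost: float

def scan(item, trans):
    """Scan rule: consume one input symbol.  trans is a hypothetical
    4-tuple (state, input_symbol, weight, next_state); the stack is
    untouched and the transition weight is added to the cost."""
    state, _symbol, weight, nxt = trans
    if item.state == state:
        return Item(nxt, item.stack, item.cost + weight)
    return None

def reduce_(item, trans):
    """Reduce rule: pop the top stack symbol.  trans is a hypothetical
    4-tuple (state, top_symbol, weight, next_state)."""
    state, top, weight, nxt = trans
    if item.state == state and item.stack and item.stack[-1] == top:
        return Item(nxt, item.stack[:-1], item.cost + weight)
    return None

def is_goal(item, final_states):
    """A derivation is complete when the state is final and the stack
    is empty; its cost is the weight of the corresponding path."""
    return item.state in final_states and not item.stack
```

Applying `scan` and then `reduce_` to a start item accumulates the two transition weights and empties the stack, yielding a goal item whose cost is the path weight.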
Two algorithms are derived from this logical description, differing only in their search strategy.
Algorithm 1 – Uniform‑Cost Search (UCS) on WPDA
The first algorithm is a direct adaptation of Dijkstra-style uniform-cost search to the WPDA deductive space. A priority queue stores frontier items ordered by accumulated cost. When an item is popped, the scan and reduce rules are applied and the resulting successors are pushed onto the queue; because the k best paths may pass through the same item, successors are retained as long as they can still contribute a top-k path rather than being pruned to a single best cost per item. Whenever a goal item is extracted, its corresponding path is recorded, and the process repeats until k goal items have been collected. This guarantees that the paths are returned in non-decreasing order of weight, and the worst-case time complexity is O(k·|E|·log|V|), where |E| is the number of WPDA transitions and |V| the number of distinct items.
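A minimal sketch of this uniform-cost k-best loop, written over an ordinary weighted graph for brevity (the paper's version pops WPDA deductive items and applies scan/reduce; the graph encoding, the path-in-the-item representation, and the per-node pop bound of k are assumptions here):

```python
import heapq

def k_shortest_paths(graph, source, target, k):
    """Pop partial paths from a priority queue in cost order; each time
    the target is popped, one more shortest path is complete.  Allowing
    each node to be popped up to k times is the standard bound that
    keeps exactly the successors that can still yield a top-k path.

    graph: {node: [(weight, next_node), ...]} with nonnegative weights.
    Returns up to k (cost, path) pairs in nondecreasing cost order."""
    heap = [(0.0, (source,))]
    pops = {}            # node -> number of times popped so far
    paths = []
    while heap and len(paths) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if pops.get(node, 0) >= k:
            continue     # this node can no longer start a new top-k path
        pops[node] = pops.get(node, 0) + 1
        if node == target:
            paths.append((cost, path))
            continue
        for w, nxt in graph.get(node, []):
            heapq.heappush(heap, (cost + w, path + (nxt,)))
    return paths
```

On a small graph this returns the goal paths in nondecreasing cost order, mirroring the guarantee stated above.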
Algorithm 2 – Lazy k‑Shortest‑Paths
The second algorithm adopts a “lazy” expansion strategy inspired by the classic k‑shortest‑paths literature (e.g., Eppstein’s algorithm). Instead of fully expanding every frontier item, it keeps a small set of candidate items and a dynamic upper bound on the cost of the next admissible path. When the current best candidate exceeds this bound, it is discarded without further expansion. The algorithm also memoizes already generated sub‑paths and performs duplicate detection to avoid redundant work. Consequently, the number of explored items grows much more slowly with k; experimentally the runtime stays close to that of the single‑shortest‑path case, even for large k.
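The lazy idea can be illustrated with a simplified enumeration of the tree of paths, in the style of lazy k-best extraction: popping a candidate pushes at most two new candidates (its cheapest extension and the next-cheapest sibling edge) instead of every successor, so the frontier stays small. This is an illustrative simplification over a plain weighted graph with nonnegative weights, not the paper's WPDA algorithm.

```python
import heapq

def lazy_k_best(graph, source, target, k):
    """Lazy frontier over the tree of paths.  Adjacency lists are
    sorted by weight so the (i+1)-th outgoing edge is never cheaper
    than the i-th; a popped candidate then spawns only its cheapest
    extension ("first child") and its next sibling edge.

    graph: {node: [(weight, next_node), ...]}, nonnegative weights.
    Returns up to k (cost, path) pairs in nondecreasing cost order."""
    adj = {u: sorted(es) for u, es in graph.items()}
    # candidate: (cost, path, i) where i indexes the edge of
    # adj[parent] that produced the last node of path
    heap = [(0.0, (source,), 0)]
    results = []
    while heap and len(results) < k:
        cost, path, i = heapq.heappop(heap)
        node = path[-1]
        # next sibling: swap the last edge for the parent's (i+1)-th
        if len(path) > 1:
            edges = adj.get(path[-2], [])
            if i + 1 < len(edges):
                w_old, (w_new, nxt) = edges[i][0], edges[i + 1]
                heapq.heappush(
                    heap, (cost - w_old + w_new, path[:-1] + (nxt,), i + 1))
        # first child: cheapest outgoing edge of the current node
        children = adj.get(node, [])
        if children:
            w, nxt = children[0]
            heapq.heappush(heap, (cost + w, path + (nxt,), 0))
        if node == target:
            results.append((cost, path))
    return results
```

Because every path is generated exactly once and both the "child" and "sibling" moves never decrease cost, popping in heap order still yields paths in global cost order, while the heap grows by at most two entries per pop regardless of out-degree.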
Both algorithms are proved correct and complete: every feasible WPDA path will eventually be generated, and the ordering of extraction guarantees that the first k goal items are exactly the k lowest‑cost paths.
Complexity and Empirical Evaluation
Theoretical analysis shows Algorithm 1’s linear dependence on k, while Algorithm 2’s dependence is sublinear because the lazy bound prunes large portions of the search space. To validate these claims, the authors built two benchmark WPDAs: (1) a parser‑derived automaton encoding a context‑free grammar with probabilistic rule weights, and (2) a translation‑derived automaton where source‑target phrase pairs are encoded as weighted transitions. They varied k from 1 to 1,000 and measured runtime and memory consumption.
Results indicate that Algorithm 2 incurs only about 5% overhead compared with the optimal single‑path algorithm, regardless of k, whereas Algorithm 1's runtime and memory usage increase dramatically as k grows. Both algorithms return identical ordered path sets, confirming that the lazy approach does not sacrifice optimality.
Implications and Future Work
By providing a practical method for extracting multiple high‑quality derivations from a WPDA, the paper opens the door to n‑best parsing, n‑best translation hypothesis generation, and multi‑hypothesis speech decoding—all scenarios where downstream components benefit from a ranked list rather than a single best output. The authors suggest several extensions: integrating heuristic estimates to obtain an A*‑style search, supporting dynamic updates to the WPDA (e.g., online learning), and exploiting parallel hardware such as GPUs to further accelerate the lazy expansion.
In summary, the work delivers a solid theoretical foundation for k‑shortest‑path extraction in weighted pushdown systems and demonstrates, through thorough experiments, that the lazy algorithm achieves near‑optimal efficiency, making it suitable for large‑scale, real‑time language processing pipelines.