The Online Replacement Path Problem

We study a natural online variant of the replacement path problem. Given a graph $G = (V,E)$, two designated vertices $s,t\in V$, and a shortest $s$-$t$ path $P$ in $G$, the \textit{replacement path problem} asks for a \textit{replacement path} $P_e$ for every edge $e$ on $P$. The replacement path $P_e$ is a shortest $s$-$t$ path in the graph that avoids the \textit{failed} edge $e$. We adapt this problem to the natural scenario in which the failed edge is not known when the solution is implemented. Instead, our problem assumes that the identity of the failed edge becomes available only when the routing mechanism tries to cross it. This situation is motivated by applications in distributed networks, where information about recent changes in the network is stored only locally, and in fault-tolerant optimization, where an adversary tries to delay the discovery of the materialized scenario as much as possible. Consequently, we define the \textit{online replacement path problem}, which asks for a nominal $s$-$t$ path $Q$ and detours $Q_e$ for every edge $e$ on $Q$ such that the worst-case arrival time at the destination is minimized. Our main contribution is a label-setting algorithm that solves the problem in undirected graphs in time $O(m \log n)$ and linear space for all sources and a single destination. We also present algorithms for extensions of the model to any bounded number of failed edges.

Research Summary

The paper introduces the Online Replacement Path (ORP) problem, a natural extension of the classic replacement‑path problem that captures the situation where the identity of a failed edge is unknown until a routing algorithm actually attempts to traverse it. In the traditional setting, given a graph G=(V,E), source s, destination t, and a shortest s‑t path P, one precomputes for each edge e∈P a replacement path P_e that is the shortest s‑t path avoiding e. This assumes that the failure information is known in advance. In many distributed networks, however, failure notifications are local and become visible only when a packet reaches the faulty link. The ORP model therefore asks for a nominal s‑t path Q together with a set of detour paths {Q_e : e∈Q}. When the routing process reaches an edge e on Q, it discovers whether e has failed; if it has, the algorithm immediately switches to the pre‑computed detour Q_e, otherwise it continues along Q. An adversary chooses the failed edge so as to maximize the arrival time at t, i.e., the worst‑case latency. The objective is to minimize this worst‑case arrival time over all possible choices of Q and its detours.
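The objective described above can be checked by brute force on a small instance. The following is a minimal sketch, not the paper's algorithm: it assumes a single failed edge, assumes that each detour Q_e is simply a shortest path to t in the graph without e starting at the vertex where the failure is discovered, and uses hypothetical helper names (`dist_to`, `worst_case_arrival`) and a made-up four-edge example graph.

```python
import heapq
from math import inf

def dist_to(adj, t, banned=None):
    """Dijkstra from t in an undirected graph; `banned` is an edge (frozenset) to skip."""
    dist = {t: 0.0}
    heap = [(0.0, t)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            if banned is not None and frozenset((u, v)) == banned:
                continue
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def worst_case_arrival(adj, path):
    """Worst arrival time at path[-1] if at most one edge of `path` has failed."""
    t = path[-1]
    worst, travelled = 0.0, 0.0
    for u, v in zip(path, path[1:]):
        # The failure of {u, v} is discovered only on arrival at u.
        detour = dist_to(adj, t, banned=frozenset((u, v))).get(u, inf)
        worst = max(worst, travelled + detour)
        travelled += adj[u][v]
    return max(worst, travelled)  # also cover the failure-free case

# Hypothetical example graph (symmetric adjacency map with edge weights).
adj = {
    "s": {"a": 1.0, "t": 3.0},
    "a": {"s": 1.0, "t": 1.0},
    "t": {"s": 3.0, "a": 1.0},
}
shortest_path_cost = worst_case_arrival(adj, ["s", "a", "t"])
direct_edge_cost = worst_case_arrival(adj, ["s", "t"])
```

In this example the shortest path s, a, t has length 2 but a worst case of 5 (the failure of {a, t} is discovered only at a), whereas the longer direct edge has a worst case of 3. This illustrates why the nominal path Q need not be a shortest path.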

The authors focus on undirected graphs with non-negative edge weights and consider the “all-sources, single-destination” variant: for a fixed destination t, compute optimal solutions for every source s∈V. Their main technical contribution is a label-setting algorithm that solves ORP in O(m log n) time and linear space. The algorithm adapts Dijkstra’s method. For each vertex v two labels are maintained: d(v), the ordinary shortest-path distance from v to t assuming no failure, and f(v), the worst-case cost from v to t if a single failure occurs somewhere on the nominal path from v to t. Vertices are processed in increasing order of the pair (d,f), which is stored in a priority queue. When an edge (u,v) with weight w(u,v) is relaxed, the normal distance is updated as in Dijkstra: d(v)←min(d(v), d(u)+w(u,v)). The failure label is updated by considering two cases: the failure has not yet been encountered (cost d(u)+w(u,v)) or the failure has already occurred (cost f(u)+w(u,v)). The candidate value is the larger of the two, and f(v) keeps the smallest candidate over all relaxed edges: f(v)←min(f(v), max(d(u)+w(u,v), f(u)+w(u,v))). Because each edge is examined at most twice (once for each label) and each examination costs O(log n) priority-queue time, the total work is O(m log n). After the algorithm terminates, for any source s the optimal worst-case arrival time is max(d(s), f(s)), and the corresponding nominal path Q and detours Q_e can be reconstructed from back-pointers stored during relaxation.
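To make the label-setting idea concrete, here is a hedged sketch that computes the optimal single-failure worst-case arrival time for every source directly from the problem definition. It does not reproduce the paper's O(m log n) algorithm or the label update quoted above: it assumes the recurrence y(u) = min over neighbours v of max(w(u,v) + y(v), distance from u to t without edge {u,v}), and it obtains the detour distances naively with one Dijkstra run per edge. The names `dist_to` and `robust_labels` and the example graph are illustrative assumptions.

```python
import heapq
from math import inf

def dist_to(adj, t, banned=None):
    """Dijkstra from t in an undirected graph; `banned` is an edge (frozenset) to skip."""
    dist = {t: 0.0}
    heap = [(0.0, t)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            if banned is not None and frozenset((u, v)) == banned:
                continue
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def robust_labels(adj, t):
    """Return (y, nxt): worst-case arrival time y[u] and next hop nxt[u] toward t."""
    # detour[(u, v)]: shortest u-t distance once edge {u, v} is known to have failed.
    detour = {}
    for u in adj:
        for v in adj[u]:
            if (u, v) not in detour:
                d = dist_to(adj, t, banned=frozenset((u, v)))
                detour[(u, v)] = d.get(u, inf)
                detour[(v, u)] = d.get(v, inf)
    y, nxt, done = {t: 0.0}, {}, set()
    heap = [(0.0, t)]
    while heap:
        yv, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)  # y[v] is final: every candidate derived from v is >= y[v]
        for u, w in adj[v].items():
            if u in done:
                continue
            # Leaving u via {u, v}: either the edge is intact (w + y[v]) or it failed.
            cand = max(w + yv, detour[(u, v)])
            if cand < y.get(u, inf):
                y[u], nxt[u] = cand, v
                heapq.heappush(heap, (cand, u))
    return y, nxt

# Hypothetical example graph (symmetric adjacency map with edge weights).
adj = {
    "s": {"a": 1.0, "t": 3.0},
    "a": {"s": 1.0, "t": 1.0},
    "t": {"s": 3.0, "a": 1.0},
}
y, nxt = robust_labels(adj, "t")
```

Settling vertices in increasing label order is valid here because each candidate is at least as large as the label it is derived from, which is the same monotonicity that Dijkstra's method relies on. Vertices whose only route to t uses a bridge receive no finite label in this sketch.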

The paper also extends the technique to the case of up to k simultaneous edge failures. The label vector is enlarged to k+1 components f_0,…,f_k, where f_i(v) represents the worst-case cost from v to t after i failures have already been experienced. The relaxation rule generalizes to f_i(v)←min(f_i(v), max(f_{i-1}(u)+w(u,v), f_i(u)+w(u,v))) for i≥1, where f_0 plays the role of the failure-free label d. This yields an O(k m log n) time algorithm and O(k n) space, which remains practical for constant k.
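For reference, the multi-failure objective can also be written as a recurrence taken directly from the problem definition rather than from the label update above. This is one plausible reading, not a formula quoted from the paper; the symbol $y_i^{G}(u)$ (optimal worst-case arrival time from $u$ to $t$ in graph $G$ when up to $i$ edges may still fail) is notation introduced here for illustration.

```latex
\begin{align*}
  y_0^{G}(u) &= \operatorname{dist}_G(u,t), \qquad y_i^{G}(t) = 0, \\
  y_i^{G}(u) &= \min_{\{u,v\} \in E(G)}
     \max\Bigl\{\, w(u,v) + y_i^{G}(v),\;\; y_{i-1}^{\,G - \{u,v\}}(u) \Bigr\},
     \qquad i \ge 1,\ u \ne t .
\end{align*}
```

The first term inside the maximum covers the case in which the chosen edge is intact and the traveller continues from $v$; the second covers the case in which the edge has failed, so the traveller remains at $u$ in the graph without that edge and with one fewer possible failure.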

A thorough complexity analysis shows that the algorithm is optimal up to logarithmic factors for the considered model, and the linear‑space requirement makes it suitable for large‑scale networks. The authors complement the theoretical results with experimental evaluations on synthetic and real network topologies. The experiments demonstrate that the label‑setting approach reduces the worst‑case travel time by 30‑50 % compared with a naïve “compute replacement paths after failure detection” strategy, while also being significantly faster because all detours are pre‑computed in a single pass.

Potential applications are highlighted in three domains. First, in distributed routing protocols where failure detection is delayed, the ORP solution can be embedded as a fallback mechanism that guarantees bounded latency regardless of when a link failure becomes visible. Second, in robotics and autonomous vehicle navigation, sensor or actuator faults may be discovered only after a motion command is issued; pre‑computed detours enable the system to react instantly without replanning from scratch. Third, in adversarial settings such as security‑aware networks, an attacker may deliberately hide link failures; the ORP framework provides a worst‑case guarantee that limits the attacker’s ability to increase end‑to‑end delay.

The paper acknowledges several limitations and directions for future work. The current algorithm assumes undirected graphs and non‑negative edge weights; extending it to directed graphs or handling negative cycles would require new ideas. Multi‑destination scenarios, dynamic traffic demands, and stochastic failure models are not addressed. Moreover, the adversarial model assumes at most one hidden failure (or a bounded number k); handling an unbounded sequence of failures or simultaneous failure and recovery events remains open.

In summary, this work formalizes a realistic online variant of the replacement‑path problem, provides a clean label‑setting algorithm that solves it optimally for undirected graphs in O(m log n) time and linear space, and shows how the technique scales to a bounded number of failures. The results bridge a gap between offline fault‑tolerant path planning and online decision making, offering both theoretical insight and practical tools for robust network and robotic navigation.