Approximation of Singular-Stopping Control Driven by Hawkes Processes via Rescaled MDPs


We investigate a singular-optimal stopping stochastic control problem driven by self-exciting dynamics governed by a Hawkes process. In the continuous-time setting, we show that the optimization problem reduces to solving a variational partial differential equation with gradient constraints. We then introduce its discrete-time counterpart, modeled as a Markov Decision Process. We prove that, under an appropriate rescaling procedure, the value function of the discrete-time problem converges to its continuous-time equivalent, implying that the discrete-time optimizers are asymptotically optimal for the continuous-time problem. Finally, we apply these results to an Ornstein-Uhlenbeck stochastic differential equation driven by a Hawkes process with singular control, motivated by optimal power plant investment under cyber threat, and we illustrate the theoretical findings through numerical simulations.


💡 Research Summary

This paper studies a mixed singular‑control and optimal‑stopping stochastic control problem in which the underlying dynamics are driven by a self‑exciting Hawkes process. The authors first formulate the continuous‑time problem, where the state follows a stochastic differential equation (SDE) with a diffusion term, a drift term, and jumps generated by a Hawkes process with an exponential kernel. The control consists of two components: a regular (absolutely continuous) control that continuously influences the drift, and a singular control of bounded variation that allows instantaneous interventions (e.g., sudden capacity expansions or defensive actions). The objective is to minimize the expected total cost, which includes a running cost, a cost proportional to the singular control increments, and a terminal reward received upon stopping.
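The dynamics and cost are only described in words above. A schematic formulation consistent with that description might look as follows; the symbols b, σ, γ, α, β, λ∞, ρ, f, c, and g are introduced here purely for illustration and may differ from the paper's notation:

```latex
% Controlled state: diffusion + regular control u_t, Hawkes jumps N_t,
% and a bounded-variation singular control \xi_t
dX_t = \big(b(X_t) + u_t\big)\,dt + \sigma(X_t)\,dW_t + \gamma\,dN_t + d\xi_t .

% Exponential-kernel Hawkes intensity: each past jump time T_k excites \lambda_t
\lambda_t = \lambda_\infty + \alpha \sum_{T_k \le t} e^{-\beta (t - T_k)} .

% Cost to minimize: running cost, proportional intervention cost,
% and a terminal reward g collected at the stopping time \tau
J(\tau, u, \xi) = \mathbb{E}\Big[ \int_0^{\tau} e^{-\rho t} f(X_t, u_t)\, dt
  + \int_0^{\tau} e^{-\rho t} c \, d|\xi|_t - e^{-\rho \tau} g(X_\tau) \Big] .
```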

Using variational analysis, the authors derive a Hamilton‑Jacobi‑Bellman (HJB) variational inequality with gradient constraints that characterizes the value function. The inequality simultaneously enforces (i) the standard HJB dynamics, (ii) a stopping condition (value function must dominate the stopping reward), and (iii) a gradient constraint reflecting the cost of singular interventions. They prove existence and uniqueness of a viscosity solution to this variational PDE, extending classical singular‑control theory to settings with endogenous, history‑dependent jump intensities.
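The three conditions (i)–(iii) can be collected into a single variational inequality. As a hedged sketch (the paper's exact sign and cost conventions may differ), note that with the exponential kernel the pair (X, λ) is jointly Markov, so the generator acts on both variables:

```latex
% Integro-differential generator of (X_t, \lambda_t) under control u:
% diffusion in x, mean reversion of the intensity, and a jump term in
% which an event shifts x by \gamma and excites \lambda by \alpha
\mathcal{L}^{u} V(x,\lambda) = \big(b(x) + u\big)\,\partial_x V
  + \tfrac{1}{2}\sigma^2(x)\,\partial_{xx} V
  + \beta(\lambda_\infty - \lambda)\,\partial_\lambda V
  + \lambda \big[ V(x + \gamma, \lambda + \alpha) - V(x,\lambda) \big] .

% Schematic HJB variational inequality: (i) dynamics, (ii) stopping
% (V dominates the stopping payoff \hat g), (iii) gradient constraint
% from the proportional intervention cost c
\min\Big\{ \rho V - \inf_{u}\big[\mathcal{L}^{u} V + f(x,u)\big],\;
  V(x,\lambda) - \hat g(x),\;
  c - |\partial_x V(x,\lambda)| \Big\} = 0 .
```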

The second major contribution is a discrete‑time approximation based on a Markov Decision Process (MDP). The continuous‑time SDE is discretized with step size Δt; the Hawkes intensity is updated using a thinning algorithm that respects the self‑exciting structure, while the diffusion term is approximated by an Euler‑Maruyama scheme. The resulting MDP has state space (x, λ) and actions (regular control, singular increment). The authors introduce a time‑rescaling argument: as Δt → 0, the controlled Markov chain converges weakly to the original controlled SDE. They then establish uniform convergence of the discrete‑time value functions to the continuous‑time value function, showing that any optimal policy for the MDP is an ε‑optimal policy for the original problem. The proof relies on tightness of the discretized processes, reflection of the state at the boundary of a bounded domain to control state excursions, and stability properties of the Hawkes intensity under the exponential kernel.
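The paragraph above describes the transition rule only in words. A minimal sketch of one MDP transition, assuming a linear (OU‑type) drift, an additive jump of fixed size, and the Markovian exponential‑kernel intensity update, is given below; all parameter names and default values are illustrative, not taken from the paper:

```python
import numpy as np

def mdp_step(x, lam, u, dz, dt, rng,
             mu=0.0, theta=1.0, sigma=0.2,
             lam_bar=0.5, beta=2.0, alpha=1.0, jump=-1.0):
    """One transition of the rescaled MDP with state (x, lam), actions (u, dz).

    u  : regular control, added to the drift over [t, t + dt]
    dz : singular-control increment, applied instantaneously
    """
    # Thinning step: accept a jump on [t, t + dt] with probability ~ lam * dt
    jumped = rng.random() < lam * dt
    # Euler-Maruyama for the diffusive part, plus jump and singular increment
    x_next = (x + (theta * (mu - x) + u) * dt
              + sigma * np.sqrt(dt) * rng.standard_normal()
              + (jump if jumped else 0.0) + dz)
    # Exponential-kernel Hawkes intensity: decay toward the baseline lam_bar,
    # self-excitation by alpha whenever a jump occurs
    lam_next = lam + beta * (lam_bar - lam) * dt + (alpha if jumped else 0.0)
    return x_next, lam_next, jumped
```

As Δt → 0, iterating this kernel yields a chain that converges weakly to the continuous‑time dynamics, which is the content of the rescaling argument summarized above.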

To illustrate the theory, the paper applies the framework to an Ornstein‑Uhlenbeck (OU) process driven by Hawkes jumps, modeling a power‑plant’s operational state subject to cyber‑attack shocks. The singular control represents rapid investment or defensive actions, while the stopping decision corresponds to abandoning the investment. Parameters are calibrated using real cyber‑attack data (e.g., WannaCry) to capture clustering behavior. Monte‑Carlo simulations demonstrate that the discrete‑time optimal policies closely match the continuous‑time optimal cost, and they visualize the optimal stopping boundary and singular‑control trigger surface as functions of the Hawkes intensity and system state. Sensitivity analyses show how changes in attack intensity, intervention cost, and discounting affect the optimal strategy.
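The OU‑with‑Hawkes‑jumps experiment can be mimicked with a short Monte‑Carlo sketch. The parameters below are illustrative placeholders, not the paper's cyber‑attack‑calibrated values, and no control or stopping is applied; the point is only to show how self‑excitation makes attacks cluster:

```python
import numpy as np

def simulate_ou_hawkes(T=10.0, dt=1e-3, x0=1.0,
                       mu=1.0, theta=1.0, sigma=0.2,
                       lam_bar=0.5, beta=2.0, alpha=1.5, jump=-0.5,
                       seed=42):
    """Uncontrolled OU state hit by self-exciting (Hawkes) shocks.

    A negative `jump` models a cyber-attack degrading the plant's state;
    each attack raises the intensity by `alpha`, so attacks cluster.
    Stability requires the subcritical regime alpha / beta < 1 (here 0.75).
    """
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    x, lam = x0, lam_bar
    xs, lams = [x], [lam]
    attacks = 0
    for _ in range(n):
        jumped = rng.random() < lam * dt          # thinning step
        x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        if jumped:
            x += jump
            attacks += 1
        # intensity decays toward its baseline, is excited by each attack
        lam += beta * (lam_bar - lam) * dt + (alpha if jumped else 0.0)
        xs.append(x)
        lams.append(lam)
    return np.array(xs), np.array(lams), attacks
```

Plotting `lams` against time makes the clustering visible: after each attack the intensity spikes by `alpha` and relaxes exponentially at rate `beta`, so further attacks are most likely immediately after an earlier one.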

Overall, the paper makes three substantive contributions: (1) a rigorous variational‑PDE formulation of singular‑optimal‑stopping control with self‑exciting jumps; (2) a novel MDP discretization that preserves the Hawkes dynamics and provably converges to the continuous‑time solution; (3) a concrete application to power‑plant investment under cyber risk, supported by numerical experiments. By bridging singular control theory, Hawkes processes, and Markov decision processes, the work opens avenues for robust decision‑making in domains where events cluster in time and instantaneous actions are possible, such as finance, network security, and infrastructure management.

