Asymptotically Optimal Sequential Testing with Markovian Data

Reading time: 5 minutes
...

📝 Original Info

  • Title: Asymptotically Optimal Sequential Testing with Markovian Data
  • ArXiv ID: 2602.17587
  • Date: 2026-02-19
  • Authors: Not provided (author information was not specified in the source)

📝 Abstract

We study one-sided and $α$-correct sequential hypothesis testing for data generated by an ergodic Markov chain. The null hypothesis is that the unknown transition matrix belongs to a prescribed set $P$ of stochastic matrices, and the alternative corresponds to a disjoint set $Q$. We establish a tight non-asymptotic instance-dependent lower bound on the expected stopping time of any valid sequential test under the alternative. Our novel analysis improves the existing lower bounds, which are either asymptotic or provably sub-optimal in this setting. Our lower bound incorporates both the stationary distribution and the transition structure induced by the unknown Markov chain. We further propose an optimal test whose expected stopping time matches this lower bound asymptotically as $α\to 0$. We illustrate the usefulness of our framework through applications to sequential detection of model misspecification in Markov Chain Monte Carlo and to testing structural properties, such as the linearity of transition dynamics, in Markov decision processes. Our findings yield a sharp and general characterization of optimal sequential testing procedures under Markovian dependence.

📄 Full Content

Hypothesis testing is a cornerstone of statistics and theoretical computer science: from data, one decides whether an unknown data-generating mechanism satisfies a prescribed property (Lehmann & Romano, 2005; Goldreich, 2017). Classical theory largely assumes i.i.d. samples, but many modern streams are temporally dependent, making hypothesis testing under dependence both practically important and theoretically subtle (Phatarfod, 1965; Gyori & Paulin, 2015; Fauß et al., 2020).

A widely used model of dependence is Markovianity, where the future is conditionally independent of the past given the present state (Bengio et al., 1999; Nagaraj et al., 2020). Markov dynamics arise in Markov Chain Monte Carlo (MCMC) (Roy, 2020), reinforcement learning and control (Sutton et al., 1998; Bertsekas, 2019; Beutler & Ross, 1985), and hidden Markov models (Rabiner & Juang, 2003). In these settings, the transition mechanism is typically unknown; theory often proceeds by imposing structural assumptions (e.g., Gaussianity or bilinearity) (Jin et al., 2020; Ouhamma et al., 2023), whose validity may be unclear. This motivates testing model classes under Markovian data (Natarajan, 2003; Fauß et al., 2020).

We study sequential hypothesis testing for finite-state Markov chains. Let $[m] := \{1, \dots, m\}$, let $\Delta_m$ denote the probability simplex in $\mathbb{R}^m$, and let $\mathcal{M}$ denote the set of $m \times m$ row-stochastic matrices. A time-homogeneous Markov chain is specified by a pair $(P, \mu)$ with $P \in \mathcal{M}$ and $\mu \in \Delta_m$, generating $X_1, X_2, \dots$ via

$$X_1 \sim \mu, \qquad \mathbb{P}(X_{t+1} = j \mid X_t = i) = P(i, j), \qquad i, j \in [m], \; t \ge 1.$$

We assume sequential access to data: at time $t$ we observe $X_t$. Given an unknown $(P, \mu)$, we test whether $P$ lies in a prescribed null class $\mathcal{P} \subset \mathcal{M}$ versus an alternative $\mathcal{Q} \subset \mathcal{M}$: $H_0 : P \in \mathcal{P}$ versus $H_1 : P \in \mathcal{Q}$, a composite-versus-composite problem. We assume $\mathcal{P} \cap \mathcal{Q} = \emptyset$ (and impose additional separation/identifiability conditions only when required for sharp characterizations).
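
To make the data-generating mechanism and the notion of sequential access concrete, here is a minimal Python sketch (illustrative only; the matrix `P_alt`, the initial distribution, and the horizon are arbitrary choices, not values from the paper) that samples a trajectory $X_1, \dots, X_T$ from a chain specified by $(P, \mu)$.

```python
import numpy as np

def sample_chain(P, mu, T, seed=None):
    """Sample a trajectory X_1, ..., X_T of the Markov chain specified by (P, mu)."""
    rng = np.random.default_rng(seed)
    m = len(mu)
    X = np.empty(T, dtype=int)
    X[0] = rng.choice(m, p=mu)               # X_1 ~ mu
    for t in range(1, T):
        X[t] = rng.choice(m, p=P[X[t - 1]])  # X_{t+1} | X_t = i  ~  P(i, .)
    return X

# Illustrative 3-state instance (arbitrary numbers).
P_alt = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])
mu = np.ones(3) / 3
X = sample_chain(P_alt, mu, T=10_000, seed=0)
```

A sequential test would process `X[0], X[1], ...` one observation at a time and, at a data-dependent stopping time, decide whether to reject $H_0$.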

We establish the first non-asymptotic, instance-dependent lower bound on $\mathbb{E}_Q[\tau_\alpha]$ for $\alpha$-correct, power-one sequential tests under an unknown alternative $Q \in \mathcal{Q}$ (Theorem 3.3). The leading term is $\log(1/\alpha)$ scaled by an information quantity $D^{M}_{\inf}(Q, \mathcal{P})$, an infimum of stationary-weighted KL divergences over $\mathcal{P}$, plus an $\alpha$-independent term depending on structural properties of $Q$. We then construct a sequential test that matches this bound to first order as $\alpha \to 0$ (Theorem 4.1).
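
The sketch below illustrates the kind of information quantity described here: a stationary-weighted KL divergence between the alternative $Q$ and a null matrix $P$, minimized over the null class. The paper's exact definition of $D^{M}_{\inf}$ is given in Theorem 3.3 and may differ in detail; the finite `null_class` used here is only a stand-in for $\mathcal{P}$, and all matrices are arbitrary illustrative choices.

```python
import numpy as np

def stationary_dist(Q):
    """Stationary distribution of an ergodic transition matrix (left eigenvector for eigenvalue 1)."""
    evals, evecs = np.linalg.eig(Q.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return pi / pi.sum()

def weighted_kl(Q, P):
    """Stationary-weighted KL divergence: sum_i pi_Q(i) * KL(Q(i, .) || P(i, .)).
    Assumes strictly positive entries so all logarithms are finite."""
    pi = stationary_dist(Q)
    return float(np.sum(pi[:, None] * Q * np.log(Q / P)))

def d_inf(Q, null_class):
    """Infimum of the stationary-weighted KL over a (finite stand-in for the) null class."""
    return min(weighted_kl(Q, P) for P in null_class)

# Illustrative alternative Q and a two-element stand-in for the null class.
Q = np.array([[0.8, 0.2],
              [0.3, 0.7]])
null_class = [np.array([[0.5, 0.5], [0.5, 0.5]]),
              np.array([[0.6, 0.4], [0.4, 0.6]])]
alpha = 1e-3
# A common first-order reading of such bounds: E_Q[tau_alpha] scales like log(1/alpha) / D_inf.
leading_term = np.log(1 / alpha) / d_inf(Q, null_class)
```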

  1. Composite null is fundamentally harder than a known singleton null. Fields et al. (2025) study a singleton null versus a composite alternative: $H_0 : P = P_0$ for known $P_0$, versus $H_1 : P \in \mathcal{Q}$. Our setting is composite versus composite: even when the data are generated by a specific $Q \in \mathcal{Q}$, the test must rule out every $P \in \mathcal{P}$ while maintaining uniform Type-I control. This necessity of certifying incompatibility with an entire null class (rather than a single known reference) drives both our information characterization (via an infimum over $\mathcal{P}$) and the technical analysis.

  2. Two technical tools. We state and prove two results that may be of independent interest: (a) a Pinsker-type inequality (Proposition 4.3) lower bounding a stationary-weighted divergence between two transition matrices by squared gaps between stationary means of suitable functions; and (b) a uniform control of solutions to the Poisson equation in terms of mixing properties (Proposition 3.1). A background sketch of the Poisson equation is given below.
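
As background for tool (b), the following sketch solves the Poisson equation numerically via the standard fundamental-matrix identity. It only illustrates the object that Proposition 3.1 controls; it is not the paper's result, and the example chain and function are arbitrary.

```python
import numpy as np

def poisson_solution(P, f):
    """Solve the Poisson equation (I - P) h = f - pi(f) * 1 for an ergodic transition
    matrix P, returning the particular solution with pi^T h = 0."""
    m = P.shape[0]
    # Stationary distribution pi: left eigenvector of P for eigenvalue 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()
    # Fundamental matrix Z = (I - P + 1 pi^T)^{-1}; then h = Z (f - pi(f) 1).
    Z = np.linalg.inv(np.eye(m) - P + np.outer(np.ones(m), pi))
    return Z @ (f - pi @ f)

# Sanity check on an arbitrary ergodic 3-state chain and test function f.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
f = np.array([1.0, -2.0, 0.5])
h = poisson_solution(P, f)   # (I - P) @ h equals f minus its stationary mean
```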

Applications. We instantiate our framework for (i) misspecification detection in MCMC by testing consistency with a target stationary distribution (Section 5.1), obtaining optimal detection guarantees (Corollary 5.1); and (ii) structural testing in RL by sequentially verifying linear transition dynamics in MDPs (Section 5.2).
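
For the MCMC application, a natural reading of "testing consistency with a target stationary distribution" is that the null class consists of transition kernels that leave the target $\pi^*$ invariant. The sketch below only checks that membership condition on illustrative 2-state matrices; it does not reproduce the sequential test or the guarantees of Corollary 5.1.

```python
import numpy as np

def leaves_invariant(P, pi_target, tol=1e-10):
    """Membership check for the null class: does P leave pi_target invariant,
    i.e. pi_target^T P = pi_target^T?"""
    return np.allclose(pi_target @ P, pi_target, atol=tol)

# Target distribution and two illustrative 2-state kernels (arbitrary numbers).
pi_target = np.array([0.25, 0.75])
P_consistent = np.array([[0.4, 0.6],
                         [0.2, 0.8]])    # leaves pi_target invariant
P_misspecified = np.array([[0.5, 0.5],
                           [0.5, 0.5]])  # its stationary distribution is (0.5, 0.5)
print(leaves_invariant(P_consistent, pi_target))    # True
print(leaves_invariant(P_misspecified, pi_target))  # False
```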

Markovian Sequential Testing. Sequential testing dates back to Wald (1945) and the SPRT for two simple hypotheses under i.i.d. data. Work on sequential testing under Markovian dependence is comparatively limited, but includes early contributions such as Phatarfod (1965, 1971); Schmitz & Süselbeck (1983); Dimitriadis & Kazakos (2007); Kiefer & Sistla (2016), which are largely SPRT-type procedures tailored to simple hypotheses (singleton null and singleton alternative). Fauß et al. (2020) study sequential and fixed-sample testing for Markov processes from a minimax/robust perspective, deriving tests with optimal worst-case guarantees. Their objective is complementary to ours: we consider one-sided, $\alpha$-correct, power-one tests and seek instance-dependent characterizations under the (unknown) alternative.

The closest work to ours is Fields et al. (2025), which analyzes one-sided sequential tests for Markov chains based on the plug-in likelihood estimator of Takeuchi et al. (2013). Their setting differs in two key respects: (i) they test a singleton null against its complement, whereas we treat composite null and composite alternative classes; and (ii) their analysis assumes uniformly bounded likelihood ratios, which we do not require. Moreover, the above works do not provide instance-dependent lower bounds on the expected stopping time.

Reference

This content is AI-processed based on open access ArXiv data.
