Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview
We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins’s ‘G-computation’ algorithm arises naturally from this decision-theoretic perspective. Careful attention is paid to the mathematical and substantive conditions required to justify the use of this formula. These conditions revolve around a property we term stability, which relates the probabilistic behaviours of observational and interventional regimes. We show how an assumption of ‘sequential randomization’ (or ‘no unmeasured confounders’), or an alternative assumption of ‘sequential irrelevance’, can be used to infer stability. Probabilistic influence diagrams are used to simplify manipulations, and their power and limitations are discussed. We compare our approach with alternative formulations based on causal DAGs or potential response models. We aim to show that formulating the problem of assessing dynamic treatment strategies as a problem of decision analysis brings clarity, simplicity and generality.
💡 Research Summary
The paper tackles the problem of estimating and comparing the consequences of dynamic treatment strategies (DTS) using only observational data. It frames the problem within a probabilistic decision‑theoretic paradigm, treating the selection of a treatment policy as a decision problem whose value is the expected patient outcome under that policy. Central to the analysis is the concept of “stability,” which asserts that the conditional probability laws governing the observed regime (the natural course of care) are identical to those that would obtain under an interventional regime that enforces a particular policy. When stability holds, the expected utility of any policy can be computed from the observed data without needing a separate experimental study.
The authors identify two sufficient sets of assumptions that guarantee stability. The first is sequential randomization (also called the “no unmeasured confounders” assumption). It requires that, at each decision point, the treatment assigned is conditionally independent of future potential outcomes given the past observed covariates and treatments. The second is sequential irrelevance, an alternative condition stipulating that past treatments affect the current state only through the observed covariates, not directly. Both assumptions can be expressed graphically using influence diagrams, although neither can in general be fully verified from the observational data alone.
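Under the same assumed notation, the two conditions might be written as follows (sequential randomization is stated here in potential-outcome language, matching the summary's phrasing; Y(ā) denotes the outcome that would result under treatment sequence ā):

```latex
% Sequential randomization ("no unmeasured confounders"),
% potential-outcome form (notation assumed for illustration):
Y(\bar{a}) \mathrel{\perp\!\!\!\perp} A_t
  \mid \bigl(\overline{L}_t,\ \overline{A}_{t-1} = \bar{a}_{t-1}\bigr)

% Sequential irrelevance: past treatments carry no further information
% about the next covariate once the covariate history is given:
L_t \mathrel{\perp\!\!\!\perp} \overline{A}_{t-1} \mid \overline{L}_{t-1}
```

Either condition, combined with the structure of the decision problem, delivers the stability property described above.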
With stability established, the paper derives Robins’s G‑computation formula naturally from the decision‑theoretic viewpoint. The G‑computation algorithm proceeds by (1) estimating the series of conditional distributions of covariates and treatment choices from the observational data, (2) “forcing” the treatment assignments to follow a candidate policy, and (3) recursively integrating these conditional distributions to obtain the expected outcome under the policy, denoted E
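The three steps above can be sketched numerically. The following is a minimal illustration, not the paper's method or notation: a hypothetical two-stage, all-binary problem (variables `l1, a1, l2, a2, y`) in which the covariate and outcome conditionals are estimated empirically, the observational treatment laws are replaced by a candidate policy, and the conditionals are integrated recursively.

```python
import numpy as np

# Hypothetical two-stage observational dataset (all variables binary).
# Causal order: l1 -> a1 -> l2 -> a2 -> y; treatments follow an
# observational law that depends on past covariates (confounding by l1, l2).
rng = np.random.default_rng(0)
n = 200_000
l1 = rng.binomial(1, 0.5, n)                      # baseline covariate
a1 = rng.binomial(1, 0.3 + 0.4 * l1)              # stage-1 treatment (observational)
l2 = rng.binomial(1, 0.2 + 0.3 * l1 + 0.3 * a1)   # intermediate covariate
a2 = rng.binomial(1, 0.3 + 0.4 * l2)              # stage-2 treatment (observational)
y  = rng.binomial(1, 0.1 + 0.2 * l2 + 0.3 * a2)   # outcome

def g_formula(policy1, policy2):
    """G-computation for a candidate policy, assuming stability:
    (1) use covariate/outcome conditionals estimated from the data,
    (2) force treatments to follow the policy instead of the observed law,
    (3) integrate recursively over the covariate distributions."""
    total = 0.0
    for v1 in (0, 1):                              # integrate over P(l1)
        p1 = np.mean(l1 == v1)
        t1 = policy1(v1)                           # treatment dictated by policy
        for v2 in (0, 1):                          # integrate over P(l2 | l1, a1)
            sel = (l1 == v1) & (a1 == t1)
            p2 = np.mean(l2[sel] == v2)
            t2 = policy2(v1, v2)
            # In this simulation y depends only on (l2, a2); in general one
            # would condition on the full history (l1, a1, l2, a2).
            sel_y = (l2 == v2) & (a2 == t2)
            total += p1 * p2 * np.mean(y[sel_y])
    return total

# Static "always treat" policy; the analytic interventional value here is 0.53.
print(g_formula(lambda l1: 1, lambda l1, l2: 1))
# A genuinely dynamic policy: treat at stage 2 only when l2 == 1.
print(g_formula(lambda l1: 1, lambda l1, l2: l2))
```

Note how the second call illustrates what makes the strategy dynamic: the stage-2 treatment is a function of the evolving covariate, yet the same observational conditionals suffice to evaluate it once stability is granted.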