Hidden Markov Individual-level Models of Infectious Disease Transmission

📝 Original Info

  • Title: Hidden Markov Individual-level Models of Infectious Disease Transmission
  • ArXiv ID: 2602.15007
  • Date: 2026-02-16
  • Authors: (Author information could not be confirmed from the source text; enter the author names exactly as listed in the paper.)

📝 Abstract

Individual-level epidemic models are increasingly being used to help understand the transmission dynamics of various infectious diseases. However, fitting such models to individual-level epidemic data is challenging, as we often only know when an individual's disease status was detected (e.g., when they showed symptoms) and not when they were infected or removed. We propose an autoregressive coupled hidden Markov model to infer unknown infection and removal times, as well as other model parameters, from a single observed detection time for each detected individual. Unlike more traditional data augmentation methods used in epidemic modelling, we do not assume that this detection time corresponds to infection or removal or that infected individuals must at some point be detected. Bayesian coupled hidden Markov models have been used previously for individual-level epidemic data. However, these approaches assumed each individual was continuously tested and that the tests were independent. In practice, individuals are often only tested until their first positive test, and even if they are continuously tested, only the initial detection times may be reported. In addition, multiple tests on the same individual may not be independent. We accommodate these scenarios by assuming that the probability of detecting the disease can depend on past observations, which allows us to fit a much wider range of practical applications. We illustrate the flexibility of our approach by fitting two examples: an experiment on the spread of tomato spotted wilt virus in pepper plants and an outbreak of norovirus among nurses in a hospital.

💡 Deep Analysis

📄 Full Content

Epidemiologists are often interested in questions related to the transmission of an infectious disease at the individual level. For example, whether susceptibility differs by individual-level characteristics such as age (Cohen et al., 1997;Davies et al., 2020), or how far an infected individual could realistically spread the disease (Hu et al., 2021;Lichtemberg et al., 2022). Individual-level models (ILMs) of infectious disease transmission (Deardon et al., 2010) can be valuable tools for helping answer these types of questions (Vynnycky and White, 2010). These approaches model each individual in the population moving through different disease states, such as susceptible, infectious, and removed (Ward et al., 2025). Transitions between states can occur in continuous (Almutiry et al., 2021) or discrete (Warriyar et al., 2020) time. We will focus on discrete time. For discrete-time models, the probability of infection at each time step may depend on the number of infectious individuals in the population, their distance from the susceptible individual, and the inherent susceptibility or infectivity of individuals, which may vary with covariates (Keeling et al., 2001;Mahsin et al., 2022). Therefore, these models can describe a wide range of complex mixing patterns.
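As a rough illustration of the discrete-time infection probability described above, the sketch below uses a common spatial ILM form in the style of Deardon et al. (2010), in which the probability that a susceptible individual is infected in one time step is 1 - exp(-(alpha * sum over infectious j of d_ij^(-beta)) - epsilon). The function name, the power-law distance kernel, and the parameter names `alpha`, `beta`, and `epsilon` are assumptions made for illustration; they are not taken from this paper, and covariate-dependent susceptibility or infectivity is omitted.

```python
import math


def infection_probability(i, infectious, positions, alpha=1.0, beta=2.0, epsilon=0.0):
    """Probability that susceptible individual i is infected in one time step.

    Assumed form (illustrative only):
        1 - exp(-(alpha * sum_j d_ij ** -beta + epsilon))
    where j ranges over currently infectious individuals, d_ij is the
    Euclidean distance between i and j, and epsilon is a background
    ("spark") infection rate.
    """
    pressure = 0.0
    for j in infectious:
        d = math.dist(positions[i], positions[j])  # Euclidean distance
        pressure += d ** (-beta)                   # closer sources contribute more
    return 1.0 - math.exp(-(alpha * pressure + epsilon))


# Hypothetical layout: individual 0 is susceptible, 1 and 2 are infectious.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (3.0, 4.0)}
p_one_source = infection_probability(0, [1], positions)
p_two_sources = infection_probability(0, [1, 2], positions)
```

Under these assumptions, adding a second infectious individual can only raise the infection probability, and a distant source adds much less pressure than a nearby one.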

However, a significant challenge with fitting ILMs to individual-level epidemic data is that we usually only know when an individual was detected (e.g., showed symptoms), not when they were infected or removed (Touloupou et al., 2020). For example, in Section 4.2, we look at an experiment on the spread of tomato spotted wilt virus (TSWV) in pepper plants, which was described in Hughes et al. (1997) and analyzed using ILMs by Almutiry et al. (2021). The experimenters only monitored a plant until symptoms of TSWV appeared, so that the data consist of a single detection time for each detected plant (corresponding to observed symptom onset). Since symptoms of TSWV take 2-4 weeks to appear in a plant, plants were likely infected before they were detected. Furthermore, signs of TSWV can be difficult to spot in a plant; therefore, it is possible that the experimenters never detected some infected plants. In this example, we do not know the infection or removal times or even how many plants were infected, which makes inference challenging.
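To make the structure of such data concrete, the following sketch simulates what an observer would record for one individual whose hidden infection and removal times are fixed. It is only a toy under stated assumptions, not the model proposed in the paper: the function names are hypothetical, detection is assumed possible only while the individual is infectious, there are no false positives, and monitoring stops after the first positive, which is expressed here by letting the detection probability depend on the past observations.

```python
import random


def simulate_observations(infection_time, removal_time, n_steps, p_detect, rng):
    """Return a 0/1 observation sequence for one individual.

    Hidden state (assumed): susceptible before infection_time, infectious on
    infection_time <= t < removal_time, removed afterwards. An individual
    who is never infected has infection_time=None.
    """
    observations = []
    detected_before = False
    for t in range(n_steps):
        infectious = infection_time is not None and infection_time <= t < removal_time
        # Detection probability depends on the past: zero once detected.
        p = p_detect if (infectious and not detected_before) else 0.0
        y = 1 if rng.random() < p else 0
        detected_before = detected_before or y == 1
        observations.append(y)
    return observations


def first_detection_time(observations):
    """The single detection time that would be reported, or None if never detected."""
    return observations.index(1) if 1 in observations else None


rng = random.Random(0)
obs = simulate_observations(infection_time=3, removal_time=7, n_steps=10,
                            p_detect=0.3, rng=rng)
reported = first_detection_time(obs)  # None means infected but never detected
```

With `p_detect` below 1, the reported time can fall after the true infection time or be missing altogether, which mirrors the situation described for the TSWV experiment.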

The most popular way to account for uncertain infection and removal times in epidemic modelling is to treat them as unknown parameters of the model within a Bayesian framework, which is known as data augmentation (DA) (O’Neill, 2002;O’Neill and Kypraios, 2019).

Most DA methods assume that infection times are unknown and removal times are known (O’Neill and Roberts, 1999;Deardon et al., 2010;O’Neill and Kypraios, 2019), or that infection times are known and removal times are unknown (Bu et al., 2022). However, in many applications, we do not observe when individuals were removed or infected (Neal and Roberts, 2004;Pokharel and Deardon, 2022). In such cases, a susceptible-infectious-notified-removed (SINR) model (Jewell et al., 2009) can be used to estimate unknown infection and removal times using observed notification (detection) times (Almutiry et al., 2021). In an SINR model, all infectious individuals must enter the notified state before transitioning to the removed state. That is, these models assume that all infectious individuals must be detected before removal. However, infectious individuals could recover without being detected if, for example, they show mild or no symptoms and testing is based on the appearance of symptoms (Mullis et al., 2009). In addition, studies that use SINR models have made much stronger assumptions. For example, Jewell et al. (2009) assumed that the notification and removal times were known. Almutiry et al. (2021) did not require the removal times to be known. However, they assumed that only those who showed symptoms were infected. It is possible for individuals who did not show symptoms to have been infected, if they had asymptomatic infections, hard-to-spot symptoms, or were infected late in the study and had not yet developed symptoms (Jewell et al., 2009). The assumption that no undetected individuals were infected during the observation period is common when using many existing DA methods (Britton and O’Neill, 2002).

This content is AI-processed based on open access ArXiv data.
