Bayesian hidden Markov models for latent variable labeling assignments in conflict research: application to the role ceasefires play in conflict dynamics
A crucial challenge for solving problems in conflict research is in leveraging the semi-supervised nature of the data that arise. Observed response data such as counts of battle deaths over time indicate latent processes of interest such as intensity…
Authors: Jonathan P Williams, Gudmund H Hermansen, Håvard Str
Submitted to the Annals of Applied Statistics

BAYESIAN HIDDEN MARKOV MODELS FOR LATENT VARIABLE LABELING ASSIGNMENTS IN CONFLICT RESEARCH: APPLICATION TO THE ROLE CEASEFIRES PLAY IN CONFLICT DYNAMICS

By Jonathan P Williams (1,5), Gudmund H Hermansen (2,3,5), Håvard Strand (2,3,5), Govinda Clayton (4), and Håvard Mokleiv Nygård (3)

(1) North Carolina State University, jwilli27@ncsu.edu
(2) University of Oslo
(3) Peace Research Institute Oslo (PRIO)
(4) ETH Zurich & Centre for Humanitarian Dialogue
(5) Centre for Advanced Study, Norwegian Academy of Science and Letters

A crucial challenge for solving problems in conflict research is in leveraging the semi-supervised nature of the data that arise. Observed response data such as counts of battle deaths over time indicate latent processes of interest such as intensity and duration of conflicts, but defining and labeling instances of these unobserved processes requires nuance and imprecision. The availability of such labels, however, would make it possible to study the effect of intervention-related predictors, such as ceasefires, directly on conflict dynamics (e.g., latent intensity) rather than through an intermediate proxy like observed counts of battle deaths. Motivated by this problem and the new availability of the ETH-PRIO Civil Conflict Ceasefires data set, we propose a Bayesian autoregressive (AR) hidden Markov model (HMM) framework as a sufficiently flexible machine learning approach for semi-supervised regime labeling with uncertainty quantification. We motivate our approach by illustrating the way it can be used to study the role that ceasefires play in shaping conflict dynamics.
This ceasefires data set is the first systematic and globally comprehensive data on ceasefires, and our work is the first to analyze this new data and to explore the effect of ceasefires on conflict dynamics in a comprehensive and cross-country manner.

Keywords and phrases: state space model, multistate model, discrete-time Markov process, discrete-valued time series, count-valued time series.

1. Introduction. Within the conflict research community, HMMs have been studied from a variety of different perspectives, and with varying degrees of sophistication. Early applications of HMMs in the literature were investigated as case studies for various countries. They were largely motivated by a perceived need for the conflict research community to explore its data beyond what can be provided by linear models, arguing that these data reflect complex, non-linear political systems (Petroff, Bond and Bond, 2013; Schrodt, 1997a, b, 2006). A few years after the preliminary investigations of HMMs in the conflict research literature, the book chapter Petroff, Bond and Bond (2013) summarized the best practices, with particular emphasis on forecasts/predictions of violence. Overall, the ideas about, and implementations of, HMM strategies for explaining/predicting conflict data are outdated and have not advanced beyond the ideas and strategies prescribed in the classical tutorial paper, Rabiner (1989). See Anders (2020), Besley, Fetzer and Mueller (2021), and Randahl and Vegelius (2022) for more recent accounts. Typically, there has been a proposed set of theoretically motivated (unobserved) conflict states (in the range of 3-6) linked by way of an HMM for pre-processing and organizing sequences of event-coded symbols from a large repository of international news summaries (provided by an agency such as Reuters).
Locally maximum likelihood estimates are obtained from a Baum-Welch algorithm, the most plausible (unobserved) state sequences are inferred via the Viterbi algorithm, and the fitted HMMs have been regarded as predominantly uninterpretable but useful for forecasts/predictions of future data.

Furthermore, it is not clear whether data have been properly discretized in the existing studies. For example, Petroff, Bond and Bond (2013) describes the ability to tailor the length/number of time intervals to the precision of the data available, and discourages aggregation of data (i.e., hours, days, weeks, etc.). While it is true that a discrete-time Markov process can be defined on any time grid, the a priori chosen grid must apply to all observed and future data sequences. The use of time stamps of observed data sequences, as they were actually recorded in time, requires adherence to a continuous-time Markov process. Such a process can be modeled with a continuous-time HMM, and has been studied extensively in the disease progression literature; e.g., Satten and Longini Jr (1996) and Williams et al. (2020).

Our contributions are the following. We propose a discrete-time Bayesian HMM to make inferences on how violence dynamics evolve in time over a latent, discrete, conflict-inferred state space, motivated by the new ETH-PRIO Civil Conflict Ceasefires data set (Clayton et al., 2021), combined with the violence data from the Uppsala Conflict Data Program's (UCDP) geo-referenced event data set (Sundberg and Melander, 2013). We use weekly battle death counts as the emitted response variable, combined with country-specific covariates motivated by conflict-domain theory.
In particular, we demonstrate how the semi-supervised defining and labeling of conflict-inferred states is methodologically essential to developing fundamental insights into some of the most challenging contemporary questions in conflict research, such as the effect of ceasefires on conflict dynamics. The utility of HMM frameworks for defining and assigning labels for latent variables has also been exhibited in the recent article Anders (2020) to identify territorial control during a civil war, which is intrinsically difficult (if not impossible or infeasible) to manually label. Arguably, due to a deficiency in meaningful labels, HMM-based semi-supervised data labeling strategies could pave the way for the next decade of conflict research progress.

Additionally, we offer a variety of inferential analyses and conclusions that can be drawn from fitting conflict data within this framework, as well as graphical tools and algorithms that could be used from a policy making perspective for predicting or characterizing intensity of violence. With sufficient data and intervention-relevant predictors it is possible to conduct analysis using the state space sampler to quantify the effect of changes in policy (e.g., how the risk or duration of a conflict would be predicted to change if interventions are implemented).

The weekly battle death count data we consider are modeled, conditional on the underlying latent state, using a negative-binomial distribution with an AR mean structure. This is a natural choice because conflict-related death counts are time series that are commonly characterized by both over-dispersion and zero-inflation, and many of the battle death series exhibit both features (several illustrations appear below). For a general introduction and overview of count time series, see for example the recent review paper Davis et al. (2021).
We implement a Markov chain Monte Carlo (MCMC) algorithm to fit the HMM, and the repeated sampling coverage of all HMM parameter estimates is evaluated via the construction of posterior credible sets.

Ceasefires are arrangements through which conflict parties commit to stop fighting, and all ceasefires share the same immediate objective: to stop violence (Clayton, Nathan and Wiehler, 2021). They are a common part of intra-state conflict, each year occurring in about a third of all conflicts (footnote 1). Between 1989 and 2020 there were at least 2202 ceasefires across 66 countries, in 109 civil conflicts (Clayton et al., 2021). Surprisingly, despite their frequency, it remains unclear to what extent ceasefires really work, i.e., it is not known to what extent they shift a conflict from a more violent to a less violent state. To illustrate this point, in early 2003 a ceasefire between the government of Sudan and the SPLA/M marked a transition from a long period of sustained violence into a relatively non-violent state that remained in place until the parties reached a comprehensive peace agreement in 2005. Yet in Syria, for example, where there have been more than 130 ceasefires in the ongoing civil conflict, many ceasefires seem to have produced an escalation rather than de-escalation in violence, or had no effect at all (Karakus and Svensson, 2020). From existing research it is not possible to determine whether the Syrian or the Sudanese case is indicative of the general effect of ceasefires on conflict violence (footnote 2).

The lacuna in understanding the role ceasefires play in conflict is a result of conflict researchers lacking not only the necessary data but also the statistical tools to properly model violence dynamics, and to study and understand the covariates that influence these dynamics.
To date, ceasefire research is largely limited to case studies (Palik, 2021; Pinaud, 2020; Åkebo, 2016), or analysis tailored for the policy and practice community (e.g., Brickhill, 2018; Buchanan, Clayton and Ramsbotham, 2021). The research in this area details a number of cases in which ceasefires have ultimately proved successful (e.g., De Soto, 1999), but also shows that in many cases violence does not end with the onset of a ceasefire (Kolås, 2011; Jarman, 2004; Höglund, 2005; Åkebo, 2016), and some ceasefires even make violence worse (Luttwak, 1999; Kolås, 2011; Mahieu, 2007). In-depth qualitative analysis has many advantages (see George and Bennet, 2005), but is ill-suited to systematically identifying broad trends, such as whether ceasefires in general produce a significant shift in violence dynamics. This instead requires comparative quantitative analysis which has, to date, been limited to questions surrounding ceasefire onset (Clayton et al., 2019) and design (Clayton and Sticher, 2021). There is some evidence that ceasefires stop violence (Fortna, 2003, 2004), but this is limited to inter-state conflict and the analysis suffers from a number of serious methodological limitations. Accordingly, perhaps the most fundamental question on ceasefires remains largely unanswered: Do ceasefires stop violence?

Furthermore, stopping violence can serve various purposes. First, ceasefires can help to support conflict management efforts: creating breaks in the fighting to facilitate humanitarian assistance (Aary, 1995); helping to contain conflict when resolution is not yet possible (Clayton et al., 2020a); and terminating violence in such a way that does not require the resolution of the incompatibility (Hanson, 2020; Kreutz, 2010).
Second, ceasefires can also help with conflict resolution efforts: helping to build trust (Åkebo, 2016); signalling control and cohesion (Höglund, 2011); and creating an environment more conducive to negotiations (Smith, 1995; Mahieu, 2007; Chounet-Cambas, 2011; Clayton and Sticher, 2021). Third, not all ceasefires are conceived for peaceful purposes. Ceasefires can be used to gain some strategic advantage, including to buy time to rearm and regroup (Clayton et al., 2020b; Smith, 1995), to support state building efforts (Woods, 2011; Sosnowski, 2020), or to undertake illicit activity (Kolås, 2011; Dukalskis, 2015) (footnote 3). Nevertheless, in almost all cases, it is logical to assume that in order to achieve their purpose, ceasefires must first achieve the immediate objective, i.e., shift conflict from a violent to a non-violent state.

Footnote 1. For simplicity we refer to ‘armed conflict’ as conflict or armed conflict. We follow the UCDP/PRIO Armed Conflict Database and define an armed conflict as: ‘a contested incompatibility that concerns government and/or territory where the use of armed force between two parties, of which at least one is the government of a state, results in at least 25 battle-related deaths in one calendar year’ (Gleditsch et al., 2002). In this article we only focus on internal armed conflicts.

Footnote 2. In contrast, a burgeoning body of literature explores the drivers of ceasefire onset, and the impact that ceasefires have on other outcomes such as peace processes, crime, and state-building; see Åkebo (2016); Waterman (2020); Bara, Clayton and Rustad (2021); Clayton et al. (2020a).

Building on existing conflict research literature, we discuss how our analyses and inferences are motivated from and translate to the existing theory for how ceasefires shape conflict violence.
Most notably, a major finding of ours is surprising evidence for an escalation in the state of violence in the pre-ceasefire period (i.e., the two weeks prior to a ceasefire) of a conflict.

The remainder of the paper is organized as follows. In the next section, we describe the data set that motivates our work, along with a discussion of why we focus on a weekly resolution of battle deaths. After commenting on limitations and challenges of using these data for modeling conflict dynamics, Section 3 motivates the specification of our count-valued time series model for weekly battle deaths. The technical details of the discrete-time HMM and its implementation are provided in Section 3.2. We illustrate, with various empirical studies in Section 4, why simpler statistical models are inadequate for our purposes, along with a simulation study on synthetic data for the model we propose. Results, analyses, and robustness considerations are provided in Section 5. The paper concludes in Section 6 by motivating a variety of open problems in conflict research to attract the attention of other statisticians. Documented R code for the use of our methods by other researchers and policy analysts on their data, along with the workflow for reproducing our results, are available in the Supplementary Material (Williams et al., 2024) and at https://jonathanpw.github.io/research.html .

2. Data. Our goal is to build a model that is able to capture and re-create the intensity of conflict, with its spikes and lulls, as well as more enduring patterns of violence. The violence data we utilize come from the UCDP geo-referenced event data set (Sundberg and Melander, 2013), which reports all events with at least one battle-related casualty. Each event record shows where and when an event took place, which actors were involved, and how many battle-related deaths ensued.
The response variable that we consider records the number of people killed due to intrastate conflict and/or internationalized intrastate conflicts per week in each country. This also means that the data for countries with several parallel on-going conflicts are collapsed into one country time series.

Footnote 3. A ceasefire might prove to be successful according to one purpose (e.g., humanitarian aid), but unsuccessful in another (e.g., promoting peace talks), or successful in the eyes of one conflict party, while representing an abject failure in the eyes of another. Ceasefires might also achieve their purpose, but produce other unintended effects (e.g., promoting the splintering of a non-state group (Plank, 2017)).

For ceasefires, we rely on the ETH-PRIO Civil Conflict Ceasefires data set (Clayton et al., 2021), which represents the first systematic and globally comprehensive data on ceasefires. Our work is the first to use this new data to explore the effect of ceasefires on conflict dynamics in a comprehensive and cross-country manner. The ceasefires data defines a ceasefire as ‘an arrangement that includes a statement by at least one conflict party to stop violence temporarily or permanently from a specific point in time’. This broad conceptualization of a ceasefire captures the full range of security arrangements through which belligerents might agree to temporarily suspend and/or terminate hostilities. We include unilateral and bi/multi-lateral ceasefires. A unilateral ceasefire occurs if one group alone declares the cessation of hostilities. For example, in December 2018 the Tatmadaw (army) in Myanmar declared a unilateral ceasefire towards a number of armed ethnic organizations that was not reciprocated. A bi/multi-lateral ceasefire occurs when two or more parties jointly declare a ceasefire towards one another.
For example, in November 2018, Israel and Hamas jointly agreed to simultaneously halt hostilities.

Focusing on the impact of ceasefires on violence dynamics, rather than peace agreements, we consider only non-definitive ceasefires (i.e., ceasefires that attempt to suspend rather than permanently terminate a conflict). Moreover, since we are focused on country level dynamics, we also exclude ceasefires that only cover a part of the conflict area (i.e., so-called local ceasefires), as these agreements seek to reshape violence in a limited area, and so it does not make sense in our context to assess their impact on the conflict as a whole. Finally, we exclude ceasefires that extend or renew prior agreements, based on the assumption that any shift in violence dynamics is likely to have occurred in response to the original agreement.

The ceasefires data includes the date on which the arrangement enters into effect. We define the week containing the start date of the ceasefire, together with the following four weeks (i.e., five weeks in total), as a ceasefire period. We do this to focus the analysis and the attention of the model on the dynamics that we are most confident relate to the ceasefire. This helps to mitigate the record-keeping uncertainty surrounding the effective duration of a ceasefire. Further, we define the two weeks prior to the start of a ceasefire (i.e., the two weeks before the week that contains the start date) as the pre-ceasefire period. Ceasefires tend to be negotiated fairly quickly, and once agreed to often take a few days to implement. Thus, we believe two weeks represents a sufficiently long period so as to capture the direct period in which the ceasefire is under consideration, but short enough to avoid picking up other conflict-related factors (footnote 4).

There are several countries in the data set that do not contain any battle deaths or ceasefires.
Out of the 170 countries, there are 74 without any battle deaths, and 124 countries with fewer than 1000 battle deaths throughout the entire time period 1989–2018. These countries are important for estimating the baseline state of ‘non-violent’, also referred to as ‘state 1’. We partially label the data using the following definition: a week is labelled as state 1, without error, if it is at least 60 days (in both directions in time) removed from an observation of at least one battle death, and if it is also part of a consecutive period of at least 2 years without any battle-related deaths. This is the only state labeling assignment used for model estimation.

2.1. Control covariates. There is a collection of standard covariate types that are known in the conflict research community to affect the likelihood of conflict, such as political regime, economic development, and population. For these covariate types we use the following. Political regime is measured with the polyarchy index from the V-Dem project (Coppedge et al., 2019), used to control for alternative conflict management opportunities within Dahl's (1971) conceptualization of democracy. It is understood that countries somewhere in the middle of the polyarchy spectrum are most vulnerable to violent conflict, and so we include polyarchy in nominal value, as well as its squared and cubed values, as covariates in our analysis. The V-Dem indices consist of a mix of variables measured either at the end of a year, at the maximum value throughout a year, or as the average over a year. Accordingly, we use lagged polyarchy values to prevent mixing cause and effect.

Economic development is measured as gross domestic product (GDP) per capita.
GDP relates directly to the feasibility of armed conflict because potential rebels in wealthy countries have more to lose from a rebellion, and wealthy governments have a larger capacity to co-opt a population through public goods and coerce a potential rebel through a strong security apparatus. Poor countries are more likely to have poor citizens, who are often more willing to engage in risky warfare, as well as less ability to co-opt and weaker militaries.

Footnote 4. Since we are aggregating the data at a country level, it is possible for a country to simultaneously be in a pre-ceasefire and a ceasefire period. This is relatively rare (only 244 out of a total 3872 weeks have multiple events); thus we leave modeling this challenge to future work (see Section 6 for additional discussion).

Population affects both the risk of conflict in the first place and the likelihood of a ceasefire in a specific conflict. Larger countries are more likely to have conflicts simply because of the larger number of people able to start a rebellion. Consequently, larger countries are more likely to have several parallel conflicts, which creates a complex situation where a rebel group might seek a ceasefire with the government to actively harm competing rebel groups.

The economic development and population variables are obtained from the World Development Indicators ( http://data.worldbank.org/indicator ), and are included as lagged log values in our analyses. Lastly, we include as covariates indicators for ceasefire and pre-ceasefire periods (as defined in Section 2) for each country/week.

2.2. Limitations of the data and weekly resolution. The UCDP geo-referenced event data set has a number of known limitations (see also Raleigh, Kishi and Linke, 2023). The main problem is missing data, as some events are undetected and some not found newsworthy or politically useful (Dawkins, 2021).
Significant biases arise in UCDP data sets because they are based on secondary sources. Weidmann (2015) is the best reference for this; the bias is substantial and correlated with cell phone coverage. Sources are sometimes uncertain, conflicting, or partially overlapping, and measurement error can occur on a number of dimensions. An “event size bias”, as investigated in Price and Ball (2014), is also an important artifact that arises when the probability of an event being reported is associated with the size of the event. Further, reports may conflict about the severity of a given event, or may partially overlap, raising questions about whether there were actually two different events or imprecise reports of a single event.

Methods and case studies for determining the unique reported events (i.e., “unique entity estimation”) and linking duplicate reported events (i.e., “duplicate detection” or “entity resolution”) have been published in the statistics literature (e.g., Sadinle, 2014, 2018; Chen, Shrivastava and Steorts, 2018). All three papers address a major problem in the collection of conflict data, namely the potential overlap between reports from different sources. See Brunborg, Lyngstad and Urdal (2003) for a more in-depth discussion of this problem in the context of Srebrenica and the requirements for legal documentation used in an international court of law. Yet that paper only addresses one dimension, as it focuses on individual, civilian casualties collected by individual traits.
The vast majority of casualties in conflict remain anonymous and are referred to in relation to the location and organizations involved, which introduces two dimensions not discussed by any of the papers Brunborg, Lyngstad and Urdal (2003), Sadinle (2014), Sadinle (2018), or Chen, Shrivastava and Steorts (2018).

Despite these limitations, UCDP has decades of experience in consistently coding such overlaps and has fairly good routines to handle data reporting challenges. As a result, their data set contains, in addition to the best estimate, a high and a low estimate. In about 80% of the cases these are the same or very close, but sometimes there is substantial variation. While this does not address missing data, it does offer an opportunity to assess the robustness of empirical findings. Moreover, it is reasonable to believe that larger events are more likely to be precisely reported than smaller events (Price and Ball, 2014), and that more violent periods in general are more likely to be reported. These presumptions suffice for reasoning that the UCDP data reflect, to some extent, true trends in conflict intensity, and our studies support our assumption that overall trends in conflict dynamics are consistently measured by UCDP. The same logic is used in Tai, Mehra and Blumenstock (2022) to support similar use of event data. Additional issues arise from conflicting reports about organizational affiliation, which is a major problem in many situations but not ours. Since we aggregate fatalities at the country level, the organizational aspect is not relevant (Lacina and Gleditsch, 2012).
Aggregation to a weekly resolution of data is commonly used for the study of conflict dynamics (Wood, 2014; Krtsch, 2021; Holtermann, 2021), as well as for studying other aspects of contentious behaviors such as electoral violence (Reeder and Seeberg, 2018) and denial of service attacks (Lutscher et al., 2020). Studying conflict data on a weekly resolution balances several concerns. The UCDP geo-referenced event data set is often coded on a daily precision level, but a significant number of observations have some level of imprecision. A country-day data structure would saturate the data set with a large number of artificial-zero observations, which would add more noise than information. On the other hand, a monthly aggregation would obscure much of the dynamics we aim to capture. In our aim to trace an effect of ceasefires on conflict dynamics, the weekly resolution provides enough precision and enough time between observations. The ceasefire data set is also reliably available at a weekly level of resolution.

There sometimes exist conflicting reports on the timing of a given event, but again, UCDP will inform about the uncertainty of temporal data. Since the definition of a week is somewhat arbitrary, one could imagine that our analysis would look slightly different if we defined a week as Thursday–Wednesday rather than Monday–Sunday, but based on a rudimentary analysis, we do not believe this particular problem to be very significant. In the UCDP geo-referenced event data, the event start and end dates cross over into the next week in less than 10% of the cases.

Finally, there is a limitation from only relying on incidence of battle deaths as a response variable associated with underlying conflict dynamics. Conflict dynamics surely exhibit multifaceted manifestations, including but not limited to deaths, e.g., injury and disability.
In fact, there has been much debate on the topic of conflict-related injuries; it is argued in Fazal (2014) that a decline in war casualties has been due to medical improvements rather than a decline in the number of wars, but adequate data does not exist to fully investigate the issue. Ghobarah, Huth and Russett (2003) did report long-term effects from civil wars on a measurement called disability adjusted life years (DALY), but DALY is a theoretically constructed index that is not regularly available (i.e., weekly, monthly, etc.). We do not consider counts of injuries and disabilities due to conflict simply because of a lack of adequate data on such incidences.

We admit that there are significant problems with the UCDP data, and UCDP uncertainty is sometimes, perhaps even often, present on several dimensions at once. They have sometimes resolved this by clustering a number of unclear events together into a super-event that may last for quite some time, raising obvious issues for our weekly model. We believe, however, that we use the best data available, and that the conceptual validity of focusing on fatalities as an indicator of conflict intensity is defensible as the best option available. Moreover, we trust that the experts at UCDP are handling these issues in a manner which we cannot improve on.

2.3. Challenges of modeling conflict dynamics. Figure 1 illustrates the statistical challenge at hand. It shows the weekly aggregated number of battle deaths in internal armed conflicts over the 1989 to 2018 period for the Democratic Republic of Congo (“Congo”) and Colombia. Once again, our goal is to build a model that is able to capture and re-create the intensity of conflict, with its spikes and lulls, as well as more enduring patterns of violence.
The typical statistical models employed by conflict researchers, overwhelmingly logistic or ordinary least squares regression, count models such as Poisson and negative-binomial, and time-to-event models such as Cox proportional hazard models, are not particularly well suited for such purposes (a recent review is Davenport et al., 2019). In contrast to the literature on conflict dynamics, the literature on the onset of armed conflict has benefited tremendously from having a, more or less, standard statistical model (a logistic cross-section time series model) and a standard set of covariates (in particular related to socio-economic development, political institutions, and demography). This standard model has allowed the literature to accumulate knowledge as more and more pieces of the puzzle have been added. Unfortunately, this has not been the case for the conflict dynamics literature, which has developed in a much less coordinated fashion.

[Figure 1: two panels, Congo and Colombia; horizontal axis: date (1992-01-01 to 2017-01-01); vertical axis: Number of Battle Deaths (Weekly), 0 to 300.]

FIG 1. Weekly number of battle deaths in the Democratic Republic of Congo (“Congo”) and Colombia, 1989–2018. Note that the Democratic Republic of Congo suffered 3,000 deaths the week of December 14, 1998, a value beyond the plot range chosen for illustration.

3. Statistical methodology.

3.1. Response variable model. Fitting a Poisson model is perhaps the simplest approach for inference on count-valued data, such as the number of battle deaths in a week, for a given country.
As suggested by Figure 1, however, weekly battle deaths in a given country exhibit obvious patterns of zero-inflation and over-dispersion, and so a more flexible (i.e., two parameter) negative-binomial model is a standard, pragmatic choice. Furthermore, beyond issues relating to zero-inflation and over-dispersion, temporal correlations in the week-to-week fatality counts cannot be ignored. For instance, in Jakobsen (2021), it is observed that several countries have battle death time series with fairly strong autocorrelation, motivating the application of a count time series model.

An introduction to count time series models can be found in Weiß (2018), and the recent review paper Davis et al. (2021) is a more general review. In short, count time series models are a rich class of models, well-suited for data that are zero-inflated, over-dispersed, and heteroscedastic. A variety of such models are explored in Jakobsen (2021) in the context of battle deaths time series. From this, there is not a clear “best” model for such time series; different types of models work well, provided they have sufficient capacity to model autocorrelation, zero-inflation, and over-dispersion. Indeed, a Poisson-based model is found to be lacking a sufficient degree of dispersion, while a negative-binomial is recognized as more appropriate. Moreover, it is reasoned that an AR type of (mean) structure is sufficient for capturing the relevant autocorrelation across time.

Accordingly, to model the battle death counts $Y_{i,k}$ for week $k$ of country $i$, we specify a simple negative-binomial model with an AR mean structure:

$$Y_{i,k} \mid \bar{Y}_{i,(k-4):(k-1)} \sim \text{negative-binomial}(r_{i,k}, p), \tag{1}$$

with $p = c(c+1)^{-1}$ and $r_{i,k} := a + \rho \cdot \bar{Y}_{i,(k-4):(k-1)}$, where $\bar{Y}_{i,(k-4):(k-1)} := \frac{1}{4}\sum_{l=1}^{4} Y_{i,k-l}$ and $c, a, \rho > 0$.
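To make model (1) concrete, the following is a minimal simulation sketch in Python, not the authors' R implementation. It assumes the usual parameterization in which a negative-binomial($r$, $p$) variable has probability mass $p^{r}$ at zero and mean $r(1-p)/p$, so that with $p = c/(c+1)$ the conditional mean is $r_{i,k}/c$. The function names (`nb_sample`, `simulate_series`, `conditional_mean`, `prob_violence_after_quiet`), the parameter values, and the choice to start each series from four weeks without deaths are illustrative assumptions.

```python
import random


def nb_sample(r, p, rng):
    """Draw one negative-binomial(r, p) count by inverting the CDF.

    Assumed pmf: P(Y = y) = Gamma(y + r) / (y! Gamma(r)) * p**r * (1 - p)**y,
    which allows a real-valued r > 0.
    """
    u = rng.random()
    y = 0
    pmf = p ** r          # P(Y = 0)
    cdf = pmf
    while u > cdf:
        # Recursion: P(Y = y + 1) = P(Y = y) * (y + r) / (y + 1) * (1 - p)
        pmf *= (y + r) / (y + 1) * (1.0 - p)
        y += 1
        cdf += pmf
        if pmf == 0.0:    # numerical guard against floating-point underflow
            break
    return y


def conditional_mean(a, rho, c, ybar):
    """Mean of Y given the previous four-week average ybar: (a + rho * ybar) / c."""
    return (a + rho * ybar) / c


def prob_violence_after_quiet(a, c):
    """P(Y > 0) after four weeks with no deaths: 1 - (c / (c + 1)) ** a."""
    return 1.0 - (c / (c + 1.0)) ** a


def simulate_series(a, rho, c, n_weeks, seed=0):
    """Simulate weekly counts from model (1), starting from four quiet weeks."""
    rng = random.Random(seed)
    p = c / (c + 1.0)
    history = [0, 0, 0, 0]             # assumed initial condition
    for _ in range(n_weeks):
        ybar = sum(history[-4:]) / 4.0
        r = a + rho * ybar             # AR rate driven by the four-week average
        history.append(nb_sample(r, p, rng))
    return history[4:]


if __name__ == "__main__":
    # Illustrative values only; rho / c = 0.5 keeps the AR feedback below one.
    series = simulate_series(a=1.0, rho=0.5, c=1.0, n_weeks=52, seed=1)
    print(series)
    print(conditional_mean(1.0, 0.5, 1.0, ybar=4.0))
    print(prob_violence_after_quiet(a=1.0, c=1.0))
```

Inverting the CDF is used here only because it needs nothing beyond the standard library and handles a non-integer rate; for long series or large counts a library sampler would be the more practical choice.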
This is actually a special case of a so-called “NB-DINARCH(4)” model, i.e., a negative-binomial dispersion integer AR conditional heteroscedasticity model, introduced in Xu et al. (2012). In experimenting, we determined that specifying the rate $r_{i,k}$ as an AR function of the average battle death count over the previous four weeks balances the noise in the weekly data values with their monthly composite. Further, this AR structure allows for meaningful interpretability of parameters with respect to the underlying conflict intensity; for example,
$$ E\{Y_{i,k} \mid \bar{Y}_{i,(k-4):(k-1)}\} = \frac{a}{c} + \frac{\rho}{c} \cdot \bar{Y}_{i,(k-4):(k-1)}, $$
meaning that we can view $a/c$ as a type of baseline intensity and $\rho/c$ as an escalation parameter. The time series is weakly stationary if $0 \le \rho/c < 1$, and explosive otherwise (Xu et al., 2012, see Theorem 4.1), which gives a natural characterization of the non-escalatory or escalatory dynamics of conflict violence. Assuming no battle deaths occurred in a given four consecutive weeks, small values of the $c$ parameter and large values of the $a$ parameter are associated with an increased likelihood of violence in the fifth week:
$$ \Pr\{Y_{i,k} > 0 \mid \bar{Y}_{i,(k-4):(k-1)} = 0\} = 1 - \left(\frac{c}{c+1}\right)^{a}. $$
Model (1), however, is not fully adequate for characterizing a time series of battle deaths over the entire time period of the data (1989-2018) because it assumes that the values for $c, a, \rho$ are fixed over time and across countries. This assumption is clearly violated for countries like Congo, where, as observed in Figure 1, the country transitions in and out of regimes of violence and peace. In fact, several countries in the data set do not exhibit any conflict violence for all of 1989-2018. What is necessary is for the response variable model (1) to be able to adapt to latent regime changes between violence and peace, as well as to incorporate country-specific covariates.
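The interpretive quantities discussed above (baseline intensity, escalation parameter, stationarity, and the probability of renewed violence after four quiet weeks) are simple functions of $a$, $\rho$ and $c$. A minimal sketch follows; the function names and the numerical values are hypothetical and serve only to show the direction of each effect.

```python
def baseline_intensity(a, c):
    """a / c: expected weekly deaths when the previous four weeks had none."""
    return a / c

def escalation(rho, c):
    """rho / c: weight on the four-week average in the conditional mean."""
    return rho / c

def is_weakly_stationary(rho, c):
    """Stationarity condition 0 <= rho / c < 1 stated in the text."""
    return 0.0 <= rho / c < 1.0

def prob_violence_after_quiet_month(a, c):
    """Pr{Y > 0 | four-week average = 0} = 1 - (c / (c + 1))**a."""
    return 1.0 - (c / (c + 1.0)) ** a

# Hypothetical values: a larger a or a smaller c both raise the chance of a violent week.
print(prob_violence_after_quiet_month(a=0.5, c=0.4))
print(prob_violence_after_quiet_month(a=2.0, c=0.4))
print(prob_violence_after_quiet_month(a=0.5, c=0.1))
print(is_weakly_stationary(rho=0.2, c=0.4), is_weakly_stationary(rho=0.6, c=0.4))
```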
These necessary extensions motivate our construction of an HMM, introduced next.

3.2. HMM likelihood function. In this section we develop our discrete-time Bayesian HMM. As described previously, we treat the data as observed on a weekly grid, but the details would be the same for any discretization over time. Note that if the data cannot be meaningfully organized on a grid of time points (no matter how precise), then a continuous-time HMM must be developed for the transition matrix to be properly computed (e.g., see Williams et al., 2020).

For a data set consisting of $N$ countries, denote each country by an index $i \in \{1, \ldots, N\}$, and let $y_{i,k}$ be the number of observed battle deaths for week $k$, for $k \in \{1, \ldots, n_i\}$, where $n_i$ is the number of weeks included in the data set for country $i$. The value of $y_{i,k}$ results from a multitude of circumstances, and we summarize these circumstances by a latent state $s_{i,k} \in \{1, 2, 3\}$, corresponding to the ‘state’ of conflict dynamics at week $k$. These three underlying conflict-related states are defined as ‘non-violent’, ‘stable violence’, and ‘intensified violence’, respectively; their names and interpretations are deduced from our inferences on the HMM fitted to the real data, as discussed in Section 5. We limit our focus to three states, as Petroff, Bond and Bond (2013) argue that beyond three states [in an HMM of conflict dynamics] it becomes exceedingly difficult to interpret the empirical results, and find the distinctions accounted for by including additional states to be vague at best.

The underlying state sequence, $s_{i,1}, \ldots, s_{i,n_i}$, defines a stochastic process, and we assume for computational feasibility, as required for an HMM, that it is a Markov chain in that the state of the process at week $k$ only depends on the state of the process (and possibly covariates) at week $k-1$.
That being so, the conditional distribution of the random variable $S_{i,k} \mid S_{i,k-1}$ is determined by a $3 \times 3$ transition probability matrix $P^{(i,k)}$, which we express as
$$ P^{(i,k)} := \begin{pmatrix} \dfrac{1}{1+e^{q_1^{(i,k)}}+e^{q_2^{(i,k)}}} & 0 & 0 \\ 0 & \dfrac{1}{1+e^{q_3^{(i,k)}}+e^{q_4^{(i,k)}}} & 0 \\ 0 & 0 & \dfrac{1}{1+e^{q_5^{(i,k)}}+e^{q_6^{(i,k)}}} \end{pmatrix} \begin{pmatrix} 1 & e^{q_1^{(i,k)}} & e^{q_2^{(i,k)}} \\ e^{q_3^{(i,k)}} & 1 & e^{q_4^{(i,k)}} \\ e^{q_5^{(i,k)}} & e^{q_6^{(i,k)}} & 1 \end{pmatrix}, $$
where $q_1^{(i,k)}, q_2^{(i,k)}, q_3^{(i,k)}, q_4^{(i,k)}, q_5^{(i,k)}, q_6^{(i,k)}$ are real-valued parameters that determine the rates of their respective state transitions. Note that the matrix on the left simply re-scales (as row-wise multivariate logistic function transformations) the matrix on the right to have unit row sums, making it a proper transition probability matrix. The column and row indices correspond to the state space $\{1, 2, 3\}$. For example, the (1,2) component of $P^{(i,k)}$ is the probability of a transition from state 1 to state 2 between two successive weeks. Furthermore, we express the transition probability parameters as
$$ \begin{pmatrix} q_1^{(i,k)} & q_2^{(i,k)} & q_3^{(i,k)} & q_4^{(i,k)} & q_5^{(i,k)} & q_6^{(i,k)} \end{pmatrix} := (x_k^{(i)})' \zeta, $$
where $x_k^{(i)}$ is a column vector of the geopolitical, region-specific control covariates from Section 2.1 as well as the indicators for ceasefire and pre-ceasefire periods, for country $i$ at week $k$, and $\zeta$ is a coefficient matrix. The feature-rich coefficient matrix $\zeta$ is a crucial component of the HMM for the purpose of studying dynamics of conflict. In theory, the transition rate parameters can be made arbitrarily conflict-specific by including as many features (i.e., covariates and coefficient parameters) as necessary. These features determine the rate at which the underlying state of conflict evolves or ceases to progress at all.

Accordingly, for weeks $1, \ldots, n_i$ the probability mass function of the latent state sequence $s_{i,1}, \ldots, s_{i,n_i}$ has the form
$$ \ell\big(\{s_{i,1}, \ldots, s_{i,n_i}\} \mid P^{(i,k)}\big) = \pi_{s_{i,1}} \cdot \prod_{k=2}^{n_i} P^{(i,k)}_{s_{i,k-1},\, s_{i,k}}, \tag{2} $$
where $P^{(i,k)}_{s_{i,k-1}, s_{i,k}}$ denotes the entry in row $s_{i,k-1}$ and column $s_{i,k}$ of the matrix $P^{(i,k)}$, and $\pi_{s_{i,1}}$ is the initial state probability for state $s_{i,1}$.

This Markov process defined for the latent conflict state sequences is then embedded as a structural component of the response variable model (1), such that the parameters $a$ and $\rho$ are allowed to depend on the latent state and to be country-specific, as follows. For week $k \in \{1, \ldots, n_i\}$ and country index $i \in \{1, \ldots, N\}$, conditional on the latent process $S_{i,k}$, the data-generating model is expressed as
$$ Y_{i,k} \mid \bar{Y}_{i,(k-4):(k-1)}, S_{i,k} \sim \text{Negative-Binomial}(r_{i,k}, p), \tag{3} $$
where $p := c(1+c)^{-1}$,
$$ r_{i,k} := a_{i,k} + \rho_{i,k} \cdot \bar{Y}_{i,(k-4):(k-1)}, \tag{4} $$
$$ a_{i,k} := a_1 1\{s_{i,k} = 1\} + a_2 1\{s_{i,k} = 2\} + a_3 1\{s_{i,k} = 3\}, $$
and
$$ \rho_{i,k} := 1\{s_{i,k} \neq 1\} \cdot \exp\big\{ (\beta_1 1\{s_{i,k} = 2\} + \beta_2 1\{s_{i,k} = 3\})' x_k^{(i)} \big\}, \tag{5} $$
where $a_1$, $a_2$, $a_3$, and $c$ are positive parameters, $\beta_1$ and $\beta_2$ are coefficient column vectors, and $1\{\cdot\}$ is the indicator function. Denote $a := (a_1, a_2, a_3)$ and $\beta := (\beta_1, \beta_2)$. Recall from Section 3.1 that for this model to be weakly stationary, it suffices that $0 \le \rho_{i,k}/c < 1$.

An implication of this data-generating process specification is that if $s_{i,k} = 1$, then $\rho_{i,k} = 0$ and $Y_{i,k} \sim \text{negative-binomial}(a_1, p)$. Accordingly, the parameters $a_1$, $a_2$, $a_3$, and $c$ describe the model for rare conflict-related deaths that may occur during weeks when a country is not experiencing a substantial conflict (e.g., isolated terrorist attacks). Moreover, this forces the rate parameter $\rho_{i,k}$ to be identified with an increased mortality rate relating specifically to a defined period of conflict (i.e., when the system is in state 2 or state 3).
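To make the construction concrete, the sketch below builds the row-normalized transition matrix from the six parameters $q_1, \ldots, q_6$, computes those parameters as $x'\zeta$, and evaluates the state-dependent rate from (4) and (5). It is only an illustration under stated assumptions: the covariate vector has three hypothetical entries (intercept, pre-ceasefire, ceasefire), all coefficient values are made up, and the function names are not from the paper.

```python
import math

def transition_matrix(q):
    """Map (q1, ..., q6) to the 3x3 row-stochastic matrix described in the text."""
    q1, q2, q3, q4, q5, q6 = q
    unnormalized = [
        [1.0, math.exp(q1), math.exp(q2)],
        [math.exp(q3), 1.0, math.exp(q4)],
        [math.exp(q5), math.exp(q6), 1.0],
    ]
    return [[v / sum(row) for v in row] for row in unnormalized]

def transition_params(x, zeta):
    """q = x' zeta, with x a length-d list and zeta a d x 6 nested list."""
    return [sum(x[j] * zeta[j][m] for j in range(len(x))) for m in range(6)]

def nb_rate(state, ybar, x, a, beta):
    """r from (4)-(5); state is 1, 2 or 3, beta = (beta_1, beta_2)."""
    if state == 1:
        rho = 0.0  # no AR term in the non-violent state
    else:
        coef = beta[state - 2]
        rho = math.exp(sum(b * xj for b, xj in zip(coef, x)))
    return a[state - 1] + rho * ybar

# Hypothetical covariates: intercept, pre-ceasefire indicator, ceasefire indicator.
x = [1.0, 0.0, 1.0]
zeta = [
    [-5.0, -6.0, -4.7, -6.0, -1.0, 1.0],  # made-up baseline coefficients
    [0.3, 0.5, -0.3, 1.7, -1.5, -0.3],    # made-up pre-ceasefire coefficients
    [0.2, 0.1, 1.2, 0.5, 0.4, 0.2],       # made-up ceasefire coefficients
]
a = [0.01, 0.5, 2.0]
beta = [[-4.0, 0.1, -0.1], [-3.8, 0.7, 0.1]]

q = transition_params(x, zeta)
P = transition_matrix(q)
r_state1 = nb_rate(1, ybar=2.5, x=x, a=a, beta=beta)
r_state3 = nb_rate(3, ybar=2.5, x=x, a=a, beta=beta)
print(P, r_state1, r_state3)
```

Fixing the diagonal entry of each unnormalized row at 1 makes "stay in the current state" the reference category, so each $q$ is a log-odds of leaving relative to staying.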
With the negative-binomial rate parameter defined as in (4),
$$ E\{Y_{i,k} \mid \bar{Y}_{i,(k-4):(k-1)}, S_{i,k}\} = \frac{a_{i,k}}{c} + \frac{\rho_{i,k}}{c} \cdot \bar{Y}_{i,(k-4):(k-1)}. \tag{6} $$
From the definition of $\rho_{i,k}$ in (5), this implies that the number of conflict deaths during state 1 has a mean of $a_1/c$, whereas in the conflict states the mean structure is AR.

To distinguish between states 2 and 3, the constraint that $\beta_{11} \le \beta_{12}$ is imposed (i.e., baseline $\rho_{i,k}$ for state 2 does not exceed that for state 3). This constraint helps to facilitate the identification of states 2 and 3, respectively, as associated with ‘stable’ versus ‘intensified’ conflict violence. Furthermore, we also impose the constraints that $a_1 \le a_2 \le a_3$ for the purpose of state space identification.

Finally, combining components (2) and (3) gives a full likelihood function for the HMM for each country $i \in \{1, \ldots, N\}$. For efficient estimation of the parameters, the likelihood can be expressed as a marginal mass function, resulting from integrating over all possible state space sequences. That is,
$$ \begin{aligned} p(y_{i,5}, \ldots, y_{i,n_i}) &= \sum_{s_{i,1}=1}^{3} \cdots \sum_{s_{i,n_i}=1}^{3} \; \sum_{y_{i,1} \ge 0} \cdots \sum_{y_{i,4} \ge 0} p(y_{i,1}, \ldots, y_{i,n_i}, s_{i,1}, \ldots, s_{i,n_i}) \\ &= \sum_{s_{i,1}=1}^{3} p(s_{i,1}) \cdots \sum_{s_{i,5}=1}^{3} p(s_{i,5} \mid s_{i,4}) \cdot p(y_{i,5} \mid \bar{y}_{i,1:4}, s_{i,5}) \cdots \\ &\qquad \times \sum_{s_{i,n_i}=1}^{3} p(s_{i,n_i} \mid s_{i,n_i-1}) \cdot p(y_{i,n_i} \mid \bar{y}_{i,(n_i-4):(n_i-1)}, s_{i,n_i}) \\ &= \pi' \cdot P^{(i,2)} \cdots P^{(i,4)} \cdot P^{(i,5)} D^{(i,5)} \cdots P^{(i,n_i)} D^{(i,n_i)} \mathbf{1}, \end{aligned} \tag{7} $$
where $\pi$ is the common initial state probability column vector, $\mathbf{1}$ is a column vector of ones,
$$ D^{(i,k)} := \operatorname{diag}\Big( p(y_{i,k} \mid \bar{y}_{i,(k-4):(k-1)}, s_{i,k} = 1),\; p(y_{i,k} \mid \bar{y}_{i,(k-4):(k-1)}, s_{i,k} = 2),\; p(y_{i,k} \mid \bar{y}_{i,(k-4):(k-1)}, s_{i,k} = 3) \Big), $$
and, using the negative-binomial mass function,
$$ p(y_{i,k} \mid \bar{y}_{i,(k-4):(k-1)}, s_{i,k}) = \binom{r_{i,k} + y_{i,k} - 1}{y_{i,k}} \, p^{r_{i,k}} (1-p)^{y_{i,k}}. $$
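The matrix product in (7) is the standard HMM forward recursion. The sketch below evaluates it on the log scale, rescaling the forward vector at each week to avoid numerical underflow, and optionally zeroes entries of $D^{(i,k)}$ for states ruled out by a partial label. This is a minimal illustration under assumptions, not the authors' implementation: the transition matrix is held fixed over weeks, the rate function ignores covariates, states are 0-indexed, and all numerical values are hypothetical.

```python
import math

def nb_pmf(y, r, p):
    """Negative-binomial mass function C(r+y-1, y) p^r (1-p)^y, valid for real r > 0."""
    log_pmf = (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log1p(-p))
    return math.exp(log_pmf)

def log_marginal_likelihood(y, pi, P, rate, p, allowed=None):
    """Scaled forward recursion for log p(y_5, ..., y_n) as in (7).

    y       : weekly counts, 0-indexed (y[t] is week t + 1), len(y) >= 5
    pi      : initial state probabilities (length 3)
    P       : P[t] is the 3x3 transition matrix into week t + 1 (P[0] is unused)
    rate    : rate(state, ybar, t) -> r for that state and week (states 0, 1, 2)
    allowed : optional list; allowed[t] is a set of permitted states, or None
    """
    alpha = list(pi)
    loglik = 0.0
    for t in range(1, len(y)):
        # Propagate the forward vector: alpha <- alpha P
        alpha = [sum(alpha[s] * P[t][s][u] for s in range(3)) for u in range(3)]
        if t >= 4:  # weeks 5, ..., n carry an emission term D
            ybar = sum(y[t - 4:t]) / 4.0
            for u in range(3):
                ruled_out = (allowed is not None and allowed[t] is not None
                             and u not in allowed[t])
                d = 0.0 if ruled_out else nb_pmf(y[t], rate(u, ybar, t), p)
                alpha[u] *= d
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [v / scale for v in alpha]
    return loglik

# Hypothetical inputs for a single short series.
c = 0.5
p = c / (c + 1.0)
a = [0.01, 0.5, 2.0]
rho = [0.0, 0.2, 0.45]

def rate(state, ybar, t):
    return a[state] + rho[state] * ybar

pi = [0.9, 0.07, 0.03]
P_fixed = [[0.95, 0.04, 0.01], [0.10, 0.80, 0.10], [0.05, 0.25, 0.70]]
y = [0, 2, 1, 0, 3, 7, 4]
P = [P_fixed] * len(y)

loglik = log_marginal_likelihood(y, pi, P, rate, p)
labels = [None] * len(y)
labels[5] = {1, 2}  # week 6 is known to be violent (state 2 or 3 in the paper's numbering)
loglik_labelled = log_marginal_likelihood(y, pi, P, rate, p, allowed=labels)
print(loglik, loglik_labelled)
```

The recursion costs a constant amount of work per week, whereas the explicit sum over state sequences grows as $3^{n_i}$; a partial label can only remove terms from that sum, so it never increases the likelihood value.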
Note that if further state space information is available, such as partial labels, then the number of state sequences that are integrated over is reduced. For example, if it is known that $s_{i,k} \in \{2, 3\}$ for some $k \in \{1, \ldots, n_i\}$, then $\sum_{s_{i,k}=1}^{3} \cdot$ reduces to $\sum_{s_{i,k}=2}^{3} \cdot$ within expression (7). Equivalently, the (1,1) component of $D^{(i,k)}$ is set to zero.

Finally, the joint posterior density including the data from all countries $i \in \{1, \ldots, N\}$ then has the form
$$ \pi(\zeta, \beta, a, c \mid \{y_{i,k}\}) \propto \prod_{i=1}^{N} p(y_{i,5}, \ldots, y_{i,n_i}) \cdot \pi(\zeta, \beta, a, c) \cdot 1\{\beta_{11} \le \beta_{12},\; a_1 \le a_2 \le a_3\}, \tag{8} $$
where $\pi(\zeta, \beta, a, c)$ is a prior density.

3.3. Remarks on implementation. We implement a Metropolis-within-Gibbs MCMC algorithm to draw posterior samples from (8). With these posterior samples, we then estimate the conditional posterior distribution of the latent state space for a given country in our data set; we refer to this algorithm as the state space sampler algorithm. The state space sampler algorithm is a tool for evaluating or predicting the most likely state (i.e., non-violent, stable violence, or intensified violence) at any given country/week, based on the HMM fit to the real data. In Section 5, we present a visual representation of the posterior distribution of the latent state sequence for countries both in our training data and in our held-out test data.

4. Empirical studies.

4.1. Illustration of why simpler statistical models are inadequate. As motivated by Sections 3.1 and 3.2, the negative-binomial response model for battle death counts with an AR mean structure, conditional on a latent state space process, is essential for capturing features of these data that are critical for pursuing conflict research questions. Moreover, assuming the latent state space process is Markovian, as in the HMM framework we proposed, is the simplest and most computationally pragmatic approach.
To exemplify the importance of these methodological features, in this section we examine the limitations of simpler approaches to analyzing these data. Such simpler approaches may lead to conclusions similar to, albeit more limited than, those that can be drawn from fitting our full HMM (e.g., as in Section 5). Simpler statistical approaches involve numerous subjective choices that are not necessarily easy to substantiate. Many of these subjective choices become data-driven in the context of our HMM approach.

The simplest approach is to compare the number of battle deaths before and after many recorded ceasefires. If ceasefires work as intended, we would expect, on average, a statistically and practically significant reduction in the magnitude of violence after a ceasefire is in effect. Figure 2 plots the weekly average number of battle deaths in the 8 weeks surrounding each ceasefire recorded in the data set, over all countries; it is seen that the average number of battle deaths tends to be higher in the weeks preceding a ceasefire than in those that follow. These data, however, exhibit a right skew due to the influence of a few countries with exceptionally high battle death counts, and so we instead reconstruct Figure 2 using a trimmed mean for the robustness of our subsequent analyses in this illustrative section; see Figure 3.

[Figure 2 plot omitted: x-axis: Week (−4 to 8); y-axis: Battle Deaths (Average), 15 to 30; title: Ceasefire − Mean Battle Deaths per Week (mean).]
FIG 2. Displayed is the mean number of battle deaths per week, of the 4 weeks before each ceasefire declaration and the 4 weeks after each ceasefire is in effect. All battle death series are centered at the week of the ceasefire, which is then defined as week 0. The two vertical dashed lines mark the period defined as declared and ceasefire, respectively.
[Figure 3 plot omitted: x-axis: Week (−4 to 8); y-axis: Battle Deaths (Average), 6 to 9; title: Ceasefire − Mean Battle Deaths per Week (10% trimmed mean).]
FIG 3. Displayed is the 10% trimmed mean number of battle deaths per week (i.e., the mean after removing the 10% highest and lowest values for each week surrounding all ceasefires, for all countries in the data set), of the 4 weeks before each ceasefire declaration and the 4 weeks after each ceasefire is in effect. Compare to Figure 2.

A further complication is that the number of weeks to include in the before and after ceasefire periods is indeterminate. Including too many weeks before or after the ceasefire will pull down the average battle death count, as illustrated by Figure 4, simply because conflicts typically expire, eventually. The diminishing battle death counts near the right and left boundaries reflect increasingly more weeks associated with periods of peace. Next, since there are sometimes fewer than, for example, 50 weeks between two ceasefires, [subjective] choices are required to avoid weeks that overlap between ±50 weeks of more than one ceasefire; e.g., keeping or removing ceasefires with overlapping intervals, or reducing the number of weeks to include before and after. All of these options will likely introduce a systematic bias into the analysis. Moreover, it is unrealistic to assume that a fixed number of weeks (e.g., be it 4, 50, etc.) would apply to all conflicts, across all countries and time. The utility of the HMM framework introduced in Section 3 is that the model determines the weeks most likely associated with conflict and peace, at the country-specific resolution, so that the effects of ceasefires on violent conflict can be estimated without conflating weeks of peacetime in the estimation.

[Figure 4 plot omitted: x-axis: Week (−30 to 30 tick labels); y-axis: Battle Deaths (Average), 4 to 8; title: Ceasefire − Mean Battle Deaths per Week (10% trimmed mean).]
FIG 4. The 10% trimmed mean number of battle deaths per week, of the 50 weeks before each ceasefire declaration and the 50 weeks after each ceasefire is in effect. Compare to Figure 3.

For the sake of more precise illustration, we will proceed to compare the 10% trimmed mean number of battle death counts before and after the ceasefires as in Figure 3, i.e., including the 4 weeks before each ceasefire declaration and the 4 weeks after each ceasefire is in effect. Let $W_{j,k}$ be the number of fatalities associated with week $k \in \{-6, \ldots, -3\} \cup \{5, \ldots, 8\}$ for ceasefire $j \in \{1, \ldots, J\}$, where $J$ is the total number of ceasefires observed in the data set, within and across all 170 countries. For the simple approach of investigating the role ceasefires play in conflict dynamics by comparing battle death counts before and after a ceasefire (again, for now, assuming exactly 4 weeks of counts before and after are appropriate), all that is necessary is to fit the parameters of an analysis of variance (ANOVA) model, such as: independently for $j \in \{1, \ldots, J\}$ and $k \in \{-6, \ldots, -3\} \cup \{5, \ldots, 8\}$,
$$ W_{j,k} = \delta \cdot 1\{k < -2\} + \mu_j + U_{j,k}, \tag{9} $$
where $\delta + \mu_j$ represents the expected number of fatalities before a ceasefire has been declared, and $U_{j,k}$ is a mean zero innovation. In this specification, $\delta$ is the theoretical effect of the ceasefire on the expected number of battle deaths. Next, denote
$$ \bar{W}_{j,\text{before}} := \frac{1}{4} \sum_{k=-6}^{-3} W_{j,k} \quad \text{and} \quad \bar{W}_{j,\text{after}} := \frac{1}{4} \sum_{k=5}^{8} W_{j,k}, \tag{10} $$
and
$$ V_j := \bar{W}_{j,\text{before}} - \bar{W}_{j,\text{after}} = \delta + (\bar{U}_{j,\text{before}} - \bar{U}_{j,\text{after}}), $$
with $\bar{U}_{j,\text{before}}$ and $\bar{U}_{j,\text{after}}$ defined analogously to the averages in (10). In this formulation, $E(V_j) = \delta$, and the 10% trimmed mean of the observed $v_j = \bar{w}_{j,\text{before}} - \bar{w}_{j,\text{after}}$ for $j \in \{1, \ldots, J\}$ is a seemingly robust estimate for $\delta$.
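The before/after comparison just described amounts to a trimmed mean of the differences $v_j$ together with a resampling-based interval. A minimal sketch follows, assuming a percentile bootstrap over ceasefires (the text does not specify which bootstrap variant was used) and entirely made-up counts; the function names are hypothetical.

```python
import random

def trimmed_mean(values, trim=0.10):
    """Mean after dropping the lowest and highest `trim` fraction of the values."""
    xs = sorted(values)
    k = int(trim * len(xs))
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

def ceasefire_differences(before, after):
    """v_j = (mean of the weekly counts before) - (mean of the weekly counts after)."""
    return [sum(b) / len(b) - sum(a) / len(a) for b, a in zip(before, after)]

def bootstrap_interval(values, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(values), resampling values with replacement."""
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    lower = stats[int((1.0 - level) / 2.0 * n_boot)]
    upper = stats[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

# Made-up weekly counts for five ceasefires (4 weeks before, 4 weeks after).
before = [[4, 6, 5, 9], [0, 1, 0, 2], [30, 22, 41, 35], [3, 3, 2, 4], [7, 5, 8, 6]]
after = [[2, 3, 1, 2], [0, 0, 1, 0], [12, 20, 9, 15], [4, 2, 3, 3], [1, 2, 0, 1]]
v = ceasefire_differences(before, after)
estimate = trimmed_mean(v)
interval = bootstrap_interval(v, trimmed_mean)
print(estimate, interval)
```

With only five differences the 10% trim removes nothing (`int(0.1 * 5)` is 0); trimming only takes effect once there are at least ten ceasefires.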
Accordingly, we find the estimated $\delta$ from the data is $\sum_{j=1}^{J} v_j / J = 2.36$ with a corresponding 0.95 bootstrap confidence interval of (1.08, 3.79), indicating a significant decrease in fatalities after a ceasefire is in effect. Note that removing ceasefires that have overlapping weeks in the 4 weeks before a ceasefire is declared, or the 4 weeks after a ceasefire is in effect, does not change these estimates significantly.

We have not yet, however, controlled for any ceasefire-specific covariates, such as GDP, population, or polyarchy (democracy score). To do so, a simple extension to model (9) is
$$ W_{j,k} = (\delta + x_j' \gamma) \cdot 1\{k < -2\} + \mu_j + U_{j,k}, \tag{11} $$
where $x_j$ is a vector of ceasefire-specific covariates consisting of GDP, population, and polyarchy, and $\gamma$ is a corresponding coefficient parameter vector with 3 components. In Table 1, we present the trimmed 10% least squares estimates (cf. Rousseeuw, 1984) of the parameters in model (11).

Parameter          Estimate    0.025    0.975
Ceasefire δ            2.26     1.09     3.87
GDP γ_1               -0.02    -1.57     1.76
population γ_2        -0.21    -2.08     1.21
polyarchy γ_3         -2.16    -3.98    -0.17

TABLE 1. Parameter estimates for model (11) are based on trimmed 10% least squares, after normalizing the explanatory variables, i.e., mean centered and scaled to have unit standard deviation. The 0.025 and 0.975 columns display the corresponding bootstrapped quantiles.

The estimates displayed in Table 1 for the $\delta$ parameter are more or less unchanged from those without including covariates, and polyarchy is the only explanatory variable with evidence of a significant association, at the 0.95 level. If we remove ceasefires that have overlapping time periods, however, then the effect of the democracy score polyarchy is no longer significant.
Moreover, Figures 5 and 6 demonstrate how the estimate of $\delta$ as in model (9) will change if more or fewer weeks are included in the periods before a ceasefire is declared or after it has gone into effect. These figures showcase how an investigation of the role ceasefires play in conflict dynamics depends on the number of weeks to include, and on whether ceasefires with overlapping time periods are included or not; it is clear that some choices will lead to the finding of a $\delta$ significantly different from zero while other choices will not. Beyond this finding, such an analysis does not allow for the number of before/after weeks to be ceasefire-specific. The HMM framework we propose in Section 3 avoids these limitations because the HMM itself will quantify the uncertainty, in a data-driven manner, for the number of before/after weeks to include for each observed ceasefire in estimating the effect of a ceasefire, and at the country-specific resolution (i.e., using country-specific covariates). Moreover, our HMM formulation estimates the effect of the presence of a ceasefire on a rolling basis, thus bypassing the need for justifications about dropping or including weeks with overlapping ceasefires that would be necessary in the simpler ANOVA approach with predefined before/after weeks. A problem with the ANOVA approach is that, unlike the HMM approach, it treats each ceasefire as an independent event, even if it overlaps a simultaneously occurring ceasefire.

[Figure 5 plot omitted: x-axis: Week Since Ceasefire (−200 to 200); y-axis: Estimated Change (−3 to 9); title: Estimated Effect of a Ceasefire on Battle Deaths.]
FIG 5. The estimated effect, $\delta$ as in model (9), is plotted as points, each corresponding to a different number of weeks included in the periods before a ceasefire is declared and after it has gone into effect. Zero “Weeks Since Ceasefire” corresponds to including 4 weeks before and after, as in the discussion above; the magnitude of negative (positive) values of “Weeks Since Ceasefire” corresponds to how many additional weeks are included in the period before (after) a ceasefire. The solid lines correspond to 0.025 and 0.975 bootstrapped quantiles. All ceasefires that have overlapping ceasefire time periods were removed. It is clear that the choice of the number of weeks to compare before and after a ceasefire will influence the significance of the effect of a ceasefire.

[Figure 6 plot omitted: x-axis: Week Since Ceasefire (−200 to 200); y-axis: Estimated Change (−3 to 9); title: Estimated Effect of a Ceasefire on Battle Deaths.]
FIG 6. Same as Figure 5, but without removing ceasefires that have overlapping ceasefire time periods.

4.2. Simulation study of synthetic battle death data. The purpose of this section is to verify that synthetic data generated by our HMM resemble important features of the real battle death count data, and that our Bayesian estimation procedure produces credible sets for the HMM parameters that achieve their corresponding repeated sampling coverage. The ‘true’ HMM parameter values used to generate the synthetic data are set as the posterior means estimated from the real data set; those values are presented in Table 3.

We generate synthetic data for each of $N = 167$ countries in our training data set (leaving data from 3 countries as test data), based on the following procedure. For each country/week with recorded covariates, a ‘true’ latent state $s_{i,k}$ is sampled from either the initial state probability vector $\pi$ if $k = 1$ or $P_{s_{i,k-1}, \cdot}$ (row $s_{i,k-1}$ of the transition matrix) if $k > 1$, and a count of battle deaths $y_{i,k}$ is sampled from (3). Note that the first four values, $y_{i,1:4}$, are sampled from (3) with $\rho_{i,k} = 0$.
Furthermore, the maximum number of battle deaths in any one week is restricted to not exceed the highest week-death-count to country-population proportion of any country in the data set. The highest such proportion is approximately 0.0006 for Congo. If a simulated death count $y_{i,k}$ exceeds 0.0006 times the country population, then the generated data for that country are discarded and re-simulated until the restriction is satisfied. This was not a problem for any of the 167 countries in the data set, with the notable exception of India. For India, it took an excessive amount of time to generate battle death data that satisfy the maximum weekly death restriction, and so India was omitted from our simulation study.

trans. rates     baseline  pre-ceasefire  ceasefire  v2x   v2x^2  v2x^3  GDP   pop
ζ'_1 (1→2)       .66       .83            .91        .92   .93    .94    .90   .91
ζ'_2 (1→3)       .94       .97            .93        .94   .92    .94    .95   .94
ζ'_3 (2→1)       .93       .96            .94        .97   .99    .94    .95   .92
ζ'_4 (2→3)       .95       .96            .97        .93   .91    .97    .95   .93
ζ'_5 (3→1)       .96       .96            .93        .93   .96    .95    .92   .95
ζ'_6 (3→2)       .95       .93            .97        .94   .91    .94    .97   .98

AR coef.         baseline  pre-ceasefire  ceasefire  v2x   v2x^2  v2x^3  GDP   pop
β'_1 (state 2)   .90       .97            .95        .96   .95    .96    .92   .94
β'_2 (state 3)   .96       .97            .97        .96   .95    .93    .97   .93

other            a_1   a_2   a_3   c     π_2   π_3
                 .91   .66   .95   .96   .90   .95

TABLE 2. Proportion of 100 .95 posterior credible sets that contain the true parameter value, for each of the HMM parameters (constructed by excluding the upper and lower .025 tails of each marginal posterior distribution). Synthetic data for $N = 166$ countries are generated for each of the 100 data sets in this simulation study. Note that parameter values reflect covariate values for lag v2x polyarchy (linear, quadratic, and cubic), lag and log GDP per capita, and lag and log population. These variables have all been centered and scaled to have unit variance. See the Supplementary Material (Williams et al., 2024) for box plots over the 100 posterior medians for each parameter.

India is an outlier country in our data set in the sense that it has an uncharacteristically large population size which, based on the fitted HMM parameters (see Table 3), is associated with markedly less time spent in state 1, and with battle death sequences represented by an explosive or non-stationary series (i.e., $\rho_{i,k}/c > 1$). Additionally, India is often in a state of ceasefire, which is associated with increased instability (i.e., states 2 and 3). One possible explanation for why conflict dynamics are not explained so well by our fitted HMM for countries with populations as large as India is the greater possibility of numerous unrelated conflicts ongoing at any point in time. For the average country, conflict dynamics are more likely limited to a single conflict at a time.

As with our real data set, for our synthetic data sets we apply a single rule-based partial label. That is, any week that is at least two months after or two months before a week with one or more battle deaths, and is within a two-year sequence of weeks with no battle deaths, is labeled as state 1, without error. A total of 100 synthetic data sets are generated based on the described procedure.

We implement a simple independent components Gaussian prior density with mean zero and an excessively diffuse standard deviation of 20 for all parameters. To enforce the constraint that $a_1$, $a_2$, $a_3$, and $c$ are positive-valued, we place the Gaussian prior on $\log(a_1)$, $\log(a_2)$, $\log(a_3)$, and $\log(c)$. Similarly, the Gaussian prior is placed on the logit transforms of the components of the initial state probability vector $\pi$. An MCMC algorithm is used to estimate the posterior distributions for all 70 HMM parameters, for each of the 100 synthetic data sets. The coverage of the 0.95 credible sets for each parameter is stated in Table 2.
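The data-generating procedure described earlier in this section (sample a state, then a count, and discard any realization that breaches the weekly cap) can be sketched as follows. This is a simplified illustration rather than the authors' simulation code: the transition matrix and AR coefficients are held fixed instead of depending on covariates, the cap is an arbitrary number rather than 0.0006 times a country's population, states are 0-indexed, and every numerical value is hypothetical. The inverse-CDF sampler also assumes the rate is small enough that `p ** r` does not underflow.

```python
import random

def sample_nb(r, p, rng):
    """Inverse-CDF draw from negative-binomial(r, p); assumes p**r does not underflow."""
    pmf = p ** r
    if pmf == 0.0:
        raise ValueError("p**r underflowed; a log-space sampler would be needed")
    u, y, cdf = rng.random(), 0, pmf
    while cdf < u and pmf > 0.0:
        # Recurrence: pmf(y + 1) = pmf(y) * (y + r) / (y + 1) * (1 - p)
        pmf *= (y + r) / (y + 1) * (1.0 - p)
        y += 1
        cdf += pmf
    return y

def sample_state(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    u, acc = rng.random(), 0.0
    for s, pr in enumerate(probs):
        acc += pr
        if u < acc:
            return s
    return len(probs) - 1

def simulate_country(n_weeks, pi, P, a, rho, c, cap, rng, max_tries=1000):
    """Simulate (states, counts); restart whenever a weekly count exceeds `cap`."""
    p = c / (c + 1.0)
    for _ in range(max_tries):
        states, y = [], []
        for k in range(n_weeks):
            s = sample_state(pi if k == 0 else P[states[-1]], rng)
            # The first four weeks have no AR term (rho = 0), as in the text.
            ar = rho[s] * sum(y[-4:]) / 4.0 if k >= 4 else 0.0
            count = sample_nb(a[s] + ar, p, rng)
            if count > cap:
                break  # discard this realization and start over
            states.append(s)
            y.append(count)
        else:
            return states, y
    raise RuntimeError("no realization satisfied the cap within max_tries")

rng = random.Random(1)
pi = [0.9, 0.07, 0.03]
P = [[0.98, 0.015, 0.005], [0.05, 0.90, 0.05], [0.05, 0.15, 0.80]]
states, y = simulate_country(200, pi, P, a=[0.01, 0.5, 2.0], rho=[0.0, 0.2, 0.45],
                             c=0.5, cap=500, rng=rng)
print(max(y), sum(1 for s in states if s > 0))
```

Stopping a realization as soon as one week breaches the cap is equivalent to discarding the finished series, and it keeps the rate bounded so that the simple sampler stays numerically safe. When the AR coefficient is explosive, as the text reports for India, most realizations are rejected and the loop can run for a long time.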
Box plots of the 100 posterior medians for each parameter are presented in our Supplementary Material (Williams et al., 2024). Figure 7 gives a visual representation of the synthetic data we generated for the held-out test set countries, Sudan and Afghanistan.

[Figure 7 plot omitted: panels for Sudan (1992 to 2017, 0 to 1500) and Afghanistan (2007 to 2017, 0 to 900); top panels: Synthetic Battle Deaths; bottom panels: Number of Battle Deaths (Weekly).]
FIG 7. The top panels for each country display 1000 synthetic realizations of the response variable sequence simulated from the HMM fit with the posterior means of all parameters presented in Table 3, the maximum a posteriori latent state sequence, and all covariate values observed in the real data set. The triangle points represent the upper 0.025 percentile while the circle points represent the lower 0.025 percentile. For reference, the bottom panels for each country display the real numbers of observed battle deaths.

5. Results and analyses. For our analysis of the real data, we focus on three inferential aspects of the fitted model. First, the posterior mean estimates of all 70 HMM parameters described in Section 3 are summarized in Table 3. The MCMC trace plots and histograms of the posterior samples are provided in the Supplementary Material (Williams et al., 2024). The posterior means presented in Table 3 quantify the effect of the various parameters/features in the model, but they are not dynamic in the sense that they represent the model at a weekly resolution, whereas the HMM is fitted to the data as a system that evolves over many years of accumulated weeks.
For this purpose, second, we present probability evolution curves for a small selection of countries in Figures 8 and 9. For these figures, the transition probabilities are computed and plotted over the same time periods observed in the training data, using the covariate values associated with each country/week. Furthermore, these probability evolution plots ignore the HMM response data (i.e., the counts of conflict deaths each week) to provide inference purely on the effects of the state transition probability covariates. In particular, they demonstrate the role that ceasefires have played in the de-escalation of violence for the countries in our data set. Third, we use the posterior mean estimates from Table 3 to sample state sequences via the state space sampler discussed in Section 3.3. These are displayed, for the same small selection of countries, in Figures 10 and 11. The probability evolution and state space sampler plots for all 170 countries included in our data set are available with our Supplementary Material (Williams et al., 2024).

trans. rates     baseline   pre-ceasefire  ceasefire  v2x       v2x^2     v2x^3     GDP       pop
ζ'_3 (2→1)       −4.714⋆    −0.322         1.243⋆     −0.938⋆   1.748⋆    −0.899⋆   −0.554⋆   −0.588⋆
ζ'_4 (2→3)       −5.965⋆    1.693⋆         0.524⋆     −0.375    −0.572⋆   0.421     −0.486⋆   −0.516⋆
ζ'_5 (3→1)       −0.986⋆    −1.550⋆        0.367      −1.617⋆   0.153     0.317     −0.554⋆   −0.005
ζ'_6 (3→2)       0.993⋆     −0.289         0.161      −0.734⋆   0.134     0.245     0.143     0.305

AR coef.         baseline   pre-ceasefire  ceasefire  v2x       v2x^2     v2x^3     GDP       pop
β'_1 (state 2)   −4.228⋆    0.078          −0.099⋆    1.784⋆    −3.945⋆   2.389⋆    0.238⋆    0.337⋆
β'_2 (state 3)   −3.849⋆    0.660⋆         0.061      −0.986⋆   −0.851⋆   0.392     0.724⋆    0.948⋆

other            a_1        a_2        a_3        c          π_2        π_3
                 0.0004⋆    0.0911⋆    5.8714⋆    0.0246⋆    0.0279⋆    0.0140⋆

TABLE 3. Posterior means of the HMM parameters. Note that parameter values reflect covariate values for lag v2x polyarchy (linear, quadratic, and cubic), lag and log GDP per capita, and lag and log population. These variables have all been centered and scaled to have unit standard deviation. A ⋆ indicates that the .95 credible region, formed by excluding posterior samples in the upper and lower .025 tails, excludes the value 0. Note that we omit the state 1 → 2 transitions from this table because the inferential focus of our application to conflict research is restricted to the other transitions. See the Supplementary Material (Williams et al., 2024) for the MCMC trace plots and histograms of the posterior samples.

The parameter estimates in Table 3 suggest a variety of interesting findings. First, we note that the state 2 (stable violence) to state 1 (non-violent) transition probability increases by about a factor of 3.5 when a ceasefire is in effect, taking all other covariates at mean value. Note that while there is also found to be a statistically significant, positive coefficient for the ceasefire indicator variable associated with the 2 → 3 transition, the larger magnitude of the 2 → 1 coefficient indicates that 2 → 1 transitions will occur with higher probability than 2 → 3 transitions. Nonetheless, the finding of positive coefficients for both the 2 → 1 and 2 → 3 transitions is evidence that ceasefires are directly associated with some change in the underlying conflict dynamics, most likely a cessation of violence. Over time, the factor of 3.5 effect of ceasefires is visually displayed in the top panels of Figures 8 and 9, within the vertical bars that indicate ceasefires are in effect. Note that it is also observed in the figures that this effect carries momentum for diminishing state 2 or 3 transition probabilities even after the ceasefire period.
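The "factor of about 3.5" can be traced directly from the Table 3 posterior means for the 2 → 1 and 2 → 3 transitions. The sketch below evaluates row 2 of the transition matrix with and without a ceasefire. It assumes that "all other covariates at mean value" means the standardized covariates (including the polynomial polyarchy terms) are set to zero and that the pre-ceasefire indicator is zero; those are interpretive assumptions, and the function name is hypothetical.

```python
import math

def row2_probs(q3, q4):
    """Row 2 of the transition matrix: (P(2->1), P(2->2), P(2->3))."""
    denom = 1.0 + math.exp(q3) + math.exp(q4)
    return math.exp(q3) / denom, 1.0 / denom, math.exp(q4) / denom

# Posterior means from Table 3 (baseline and ceasefire coefficients only).
zeta3 = {"baseline": -4.714, "ceasefire": 1.243}  # 2 -> 1
zeta4 = {"baseline": -5.965, "ceasefire": 0.524}  # 2 -> 3

no_ceasefire = row2_probs(zeta3["baseline"], zeta4["baseline"])
ceasefire = row2_probs(zeta3["baseline"] + zeta3["ceasefire"],
                       zeta4["baseline"] + zeta4["ceasefire"])

ratio_21 = ceasefire[0] / no_ceasefire[0]  # multiplicative change in P(2 -> 1)
ratio_23 = ceasefire[2] / no_ceasefire[2]  # multiplicative change in P(2 -> 3)
odds_multiplier_21 = math.exp(zeta3["ceasefire"])
print(ratio_21, ratio_23, odds_multiplier_21)
```

Under these assumptions the weekly 2 → 1 probability rises by a factor of roughly 3.4, slightly below the multiplier exp(1.243) ≈ 3.47 on the odds of leaving state 2 for state 1 relative to staying, because the row normalization changes as well. The 2 → 3 probability also rises, but by a smaller factor.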
Second, observe the statistically significant, positive pre-ceasefire indicator variable coefficient for the 2 → 3 transition, as well as the statistically significant, negative pre-ceasefire indicator variable coefficient for the 3 → 1 transition. Such findings suggest heightened or sustained levels of violence in the weeks associated with the pre-ceasefire period. This is a major finding of our analysis. It highlights the lag time between when a ceasefire is negotiated and when it actually begins, and that negotiating and preparing for a ceasefire is associated with an immediate short-term escalation in violence. Parties have incentives to fight harder to gain the strongest relative position before the ceasefire suspends the violence, and escalated violence in turn increases the incentives for, and the likelihood of, a ceasefire. Once a ceasefire enters into effect, we find that conflict dynamics tend to transition from a violent to a non-violent state in the weeks that follow. An explanation is that the benefits accrued from a ceasefire, whether peaceful or military/strategic, require some immediate shift in violence dynamics. Finally, we find evidence for three underlying states of conflict, which we describe as ‘non-violent’, ‘stable violence’, and ‘intensified violence’. We illustrate the utility of the constructed HMM both for inferential purposes and as a tool for predicting intensity and violence in conflict.

[Figure 8: South Sudan. Weekly transition probability (0 to 1), 2012 to 2017. Legend: State 2 or 3; State 3; Declared Ceasefire.]

FIG 8. Displayed is the evolution of the probabilities of transition at a week-by-week resolution. Counts of conflict-related deaths are omitted from the computation of these probabilities.
Instead, the probabilities exclusively reflect the effects, on the transition rates, of the covariates observed for the labelled country, over time.

[Figure 9: Israel. Weekly transition probability (0 to 1), 1992 to 2017. Legend: State 2 or 3; State 3; Declared Ceasefire.]

FIG 9. Displayed is the evolution of the probabilities of transition at a week-by-week resolution. Counts of conflict-related deaths are omitted from the computation of these probabilities. Instead, the probabilities exclusively reflect the effects, on the transition rates, of the covariates observed for the labelled country, over time.

Furthermore, in state 3 the estimated mean AR coefficient, ρ_{i,k}/c (recall equation (6)), increases from 0.8659 to 1.6753 for all pre-ceasefire weeks, taking all other covariates at mean value. In either case, however, this coefficient will be explosive (i.e., the AR process is not stationary) for countries with larger GDP per capita and/or larger population, as demonstrated by the significant coefficient estimates 0.724 and 0.948, respectively. The explosive value of this coefficient is consistent with our interpretation of state 3 as ‘intensified’ violence. Conversely, the ‘stable’ violence interpretation for state 2 comes from the fact that its AR coefficient, taking all other covariates at mean value, is estimated to be less than 1, which describes a weakly stationary process. Such processes revert to a stationary mean, and it is in this sense that the ‘stable’ violence state describes both non-escalating and de-escalating violence.
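The AR coefficients quoted above can be recovered from Table 3 with a short calculation. The sketch below assumes that ρ is the exponential of the linear predictor built from the β′ row (a log link). That link is our assumption rather than something restated in this section, but it reproduces the quoted values 0.8659 and 1.6753. The names used are hypothetical.

```python
import math

C = 0.0246  # posterior mean of c from Table 3

# Posterior means of the AR-coefficient rows in Table 3 (subset of columns).
BETA = {
    2: {"baseline": -4.228, "pre": 0.078, "gdp": 0.238, "pop": 0.337},
    3: {"baseline": -3.849, "pre": 0.660, "gdp": 0.724, "pop": 0.948},
}


def ar_coefficient(state, pre_ceasefire=0, gdp=0.0, pop=0.0):
    """Mean AR coefficient rho / c under an assumed log link for rho.
    gdp and pop are standardized covariates (0 means the sample mean)."""
    b = BETA[state]
    eta = (b["baseline"] + b["pre"] * pre_ceasefire
           + b["gdp"] * gdp + b["pop"] * pop)
    return math.exp(eta) / C


state3_base = ar_coefficient(3)
state3_pre = ar_coefficient(3, pre_ceasefire=1)
state3_rich = ar_coefficient(3, gdp=1.0)   # GDP one s.d. above the mean
state2_base = ar_coefficient(2)

results = [
    ("state 3, baseline", state3_base),
    ("state 3, pre-ceasefire", state3_pre),
    ("state 3, GDP +1 s.d.", state3_rich),
    ("state 2, baseline", state2_base),
]
for label, value in results:
    regime = "explosive" if value > 1 else "below 1"
    print(f"{label}: {value:.4f} ({regime})")
```

Setting `gdp=1.0` illustrates the remark about richer countries: under these assumptions, one standard deviation of additional GDP per capita is already enough to push the state 3 coefficient above 1, while the state 2 baseline stays below 1.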
[Figure 10: South Sudan, 2012 to 2017. Top panel: Number of Battle Deaths (Weekly), with declared ceasefires marked. Bottom panel: Posterior State Probability (0 to 1) for states 1, 2, and 3.]

FIG 10. The top panel displays the observed data for South Sudan, and the bottom panel displays the estimated posterior probability of each state for each week.

One final important finding is the statistically significant effect of the third-degree polynomial coefficient for the v2x_polyarchy variable for the state 2 to 1 transition probability (with a negative leading coefficient), and for the state 2 AR coefficient. Both democracies and autocracies are less likely than countries in the middle to enter into conflict, but when they do, their trajectories differ dramatically. Democracies are known to have less fatal conflicts than other regimes (Lacina, 2006), but also more durable conflicts (Crisman-Cox, 2022). Conflicts are more likely to erupt in the hybrid regimes between pure dictatorships and democracies (Hegre et al., 2001). The significant effect of a third-degree polynomial for the v2x_polyarchy variable is evidence for the hypothesized non-monotonic association between measures of democracy and violent conflict dynamics. Our results suggest that low-level conflicts are most likely to terminate for autocratic regimes, and most likely to escalate for hybrid regimes. Democracies are least likely to move in either direction, which is consistent with the literature. For instance, consider the Middle East. Israel, a democracy, has been involved in a series of armed conflicts with various non-state organizations. These conflicts usually remain active but do not escalate beyond a certain level because Israel puts a limit on its own use of force. Its most proximate neighbors, however, have not shown the same restraint.
Jordan used maximum force in September 1970 to expel the PLO from Jordan, and was successful. Four decades later, Syria tried the same strategy and failed, with the Syrian Civil War as a consequence.

[Figure 11: Israel, 1992 to 2017. Top panel: Number of Battle Deaths (Weekly), with declared ceasefires marked. Bottom panel: Posterior State Probability (0 to 1) for states 1, 2, and 3.]

FIG 11. The top panel displays the observed data for Israel, and the bottom panel displays the estimated posterior probability of each state for each week.

GDP per capita and population show very similar effects. It is important to keep in mind that conflicts tend to happen in poorer countries. Hence, the negative effect of GDP per capita on transitions and the positive effect on fatalities must be interpreted in the context of the sample of countries in the data set that had conflicts. The very poorest countries are more likely to have very intermittent conflicts, which causes the country to move often between states 1 and 2, and sometimes state 3. Richer countries are, once they find themselves in a state of conflict, more capable of persistently fighting these conflicts. Similarly, more populous countries are more likely to maintain a strong military and, as such, more likely to field a sustained campaign. This does not mean that the most fatal conflicts are necessarily in richer countries, but conflicts with one or two deaths per week, scattered over time, tend to be in the poorest and smallest countries.

The fact that the data exhibit patterns of both stable and unstable dynamics is a strong indication that conflicts pass through distinct phases, and that intensified phases marked by explosive violence are only sustained for short periods of time.
Namely, observe that, taking all other covariates at mean value and in the absence of ceasefires or negotiations, the baseline transition probabilities for 3 → 1, 3 → 2, and 3 → 3 are 0.0916, 0.6628, and 0.2456, respectively; once in state 3 (intensified violence), there is a probability of about 0.75 of transitioning to another state in the next week.

Our last remark on the estimates in Table 3 is that the fitted expected numbers of battle deaths per week, in the absence of battle deaths in previous weeks, during state 1, state 2, and state 3 are 0.0163, 3.7033, and 238.6748, respectively. Recall from the expected value of the negative-binomial distribution in (6) that these are computed as a_j/c, for j ∈ {1, 2, 3}. This statistic serves as a simple, rudimentary description of how the HMM has fitted the mean behavior of each of the three states.

Finally, we note that (although not displayed in Table 3) the baseline probabilities for transition from state 1 → 2 and 1 → 3, in one week, are estimated to be 0.0006276 and 6 × 10⁻⁷, respectively. This is consistent with the fact that periods of violent conflict are rare events, and are not part of the natural progression of the state of affairs for the average country (average in the sense of the covariates we consider). However, we find that the transition probability from state 1 → 2 increases by a factor of about 52 for a country with a declared ceasefire, and by a factor of about 18 for a country with a ceasefire in effect, taking all other covariates at mean value. Mostly, this reflects the fact that ceasefires occur in the midst of violent conflict, and so the likelihood of future violence is higher than if a country were in state 1 without having recently transitioned from a state of violent conflict.

5.1. Addressing limitations of the data: model re-estimation using low and high estimates of weekly battle death counts.

UCDP states that the fatality estimates in the UCDP geo-referenced event data set used in this analysis are fraught with uncertainty.⁵ To quantify the uncertainty expressed in the source material, they publish three estimates: best, high, and low. To clarify the UCDP geo-referenced event coding scheme (Högbladh, 2023), geo-referenced events are defined based on a subjective judgement of one or more sources with at least one trustworthy claim that at least one person was killed. The best, high, and low estimates reflect the variance reported in the number of casualties from sources deemed trustworthy (Högbladh, 2023). UCDP has a conservative policy, by which they will report the lowest number supported by a source. The variability between best, high, and low can arise either from a range reported by a source or from different estimates given by different sources. If a partially corresponding aggregate report contains a single fatality estimate, then the corresponding record for this event will have best = high = low. There are generally four different types of uncertainty:

1. Some events are reported by several independent sources. The September 11 terrorist attack in the United States, for instance, is reported by a large number of sources, which all report the same number of casualties (2,986). This is an example of a credible and precise number.
2. For some events, the best, high, and low estimates differ, sometimes by a lot. There might be conflicting reports that claim either 2 dead or 25 dead. Both of these are precise, but they cannot both be credible. Nonetheless, we have to choose between 2 and 25, not any random number in between.
3. There might be several sources that report, e.g., between 10 and 20 casualties. This is not precise, but can be credible. In this case any random number between 10 and 20 is as likely as any other.
4. There might be a single report of an imprecise number, such as “at least 50,000 persons are killed”. This is not very precise, and it is hard to adjudicate the credibility. We cannot report any number other than 50,000 without making a number of additional assumptions, but solely reporting 50,000 creates a false impression of precision.

⁵ See https://www.pcr.uu.se/research/ucdp/methodology/ for a discussion.

We use the best estimates in our main analysis in the previous section, but here we check the robustness of the fitted model using both the high and low estimates; see Tables 4 and 5. We additionally fit the model on a further 100 data sets, where the number of fatalities each week is taken as a mixture drawn from all three categories, to assess the robustness of the current model to the unknown variability across these categories; see Table 6. The HMM fit to the best–high–low mixture is largely consistent with the model fit using the UCDP best estimates of fatalities, even more so than the parameters fit using exclusively the UCDP high or low estimates.

| trans. rates | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| ζ′₃ (2 → 1) | −4.572⋆ | −0.354 | 0.947⋆ | −0.954⋆ | 1.924⋆ | −1.069⋆ | −0.620⋆ | −0.619⋆ |
| ζ′₄ (2 → 3) | −6.141⋆ | 1.458⋆ | 1.195⋆ | −0.894⋆ | −0.104 | 0.115 | −0.490⋆ | −0.472⋆ |
| ζ′₅ (3 → 1) | 2.427⋆ | −1.239⋆ | −0.641⋆ | 0.205 | −0.618 | 1.507⋆ | 0.040 | −0.385 |
| ζ′₆ (3 → 2) | 1.372⋆ | 0.046 | 0.201 | −0.143 | 0.145 | −0.352 | 0.501⋆ | 0.501⋆ |

| AR coef. | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| β′₁ (state 2) | −4.404⋆ | −0.008 | −0.219⋆ | 2.133⋆ | −4.543⋆ | 2.732⋆ | 0.239⋆ | 0.321⋆ |
| β′₂ (state 3) | −2.102⋆ | −0.753⋆ | −0.421⋆ | −1.004⋆ | 0.778 | 0.543 | 0.615⋆ | 0.655⋆ |

| other | a₁ | a₂ | a₃ | c | π₂ | π₃ |
|---|---|---|---|---|---|---|
| | 0.0004⋆ | 0.0937⋆ | 5.7008⋆ | 0.0201⋆ | 0.0396⋆ | 0.0208⋆ |

TABLE 4. Posterior means of the HMM parameters using high counts of battle deaths. Compare with Table 3.

| trans. rates | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| ζ′₃ (2 → 1) | −4.773⋆ | 0.079 | 1.228⋆ | −1.328⋆ | 2.618⋆ | −1.506⋆ | −0.508⋆ | −0.560⋆ |
| ζ′₄ (2 → 3) | −6.323⋆ | 1.078⋆ | 1.169⋆ | −1.030⋆ | 0.440 | −0.111 | −0.402⋆ | −0.384⋆ |
| ζ′₅ (3 → 1) | 3.289⋆ | −1.244⋆ | 0.742 | −0.807⋆ | 1.062⋆ | 0.357 | 0.129 | −1.519⋆ |
| ζ′₆ (3 → 2) | 1.898⋆ | −0.332 | 1.126⋆ | −0.004 | 0.091 | −1.035⋆ | 0.776 | −0.204 |

| AR coef. | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| β′₁ (state 2) | −4.160⋆ | 0.116⋆ | −0.102⋆ | 2.232⋆ | −5.035⋆ | 3.101⋆ | 0.223⋆ | 0.317⋆ |
| β′₂ (state 3) | −2.731⋆ | 1.675⋆ | −1.198⋆ | −1.072⋆ | −0.235 | 1.069⋆ | 0.673⋆ | 1.001⋆ |

| other | a₁ | a₂ | a₃ | c | π₂ | π₃ |
|---|---|---|---|---|---|---|
| | 0.0005⋆ | 0.0891⋆ | 6.4379⋆ | 0.0252⋆ | 0.0600⋆ | 0.0007⋆ |

TABLE 5. Posterior means of the HMM parameters using low counts of battle deaths. Compare with Table 3.

As pointed out above, the interpretation of isolated coefficients is difficult. The intended effect of a ceasefire can be both to reduce the fatalities in a given state of conflict and to increase the transition probability to a less intensified state of conflict. The AR coefficients for ceasefire are strongly negative in Tables 4, 5, and 6 for state 2, and as such provide stronger support than the main results. The finding that ceasefires are preceded by an increased intensity does not find similarly robust support. Using the UCDP low estimate counts, the pre-ceasefire period is associated with a strong increase in the number of fatalities, whereas the high estimate counts result in the opposite association. This finding is hard to explain as anything other than a reflection of the uncertainties in the data.
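The best–high–low mixture data sets behind Table 6 are described only briefly, so a minimal sketch of one way to build them may help. It assumes that, independently for each country-week, one of the three UCDP estimates is chosen with equal probability. The granularity of the draw (week rather than event) and all names below are our assumptions, and the toy counts are made up for illustration rather than taken from UCDP.

```python
import random


def sample_mixture(best, high, low, rng):
    """Draw one data set: for each week pick best, high, or low uniformly.
    Weeks where the three estimates agree are unchanged by construction."""
    return [rng.choice((b, h, l)) for b, h, l in zip(best, high, low)]


# Toy weekly counts for a single country (illustrative only, not UCDP data).
best = [0, 3, 12, 40, 7]
high = [0, 5, 12, 90, 7]
low = [0, 2, 12, 25, 6]

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
datasets = [sample_mixture(best, high, low, rng) for _ in range(100)]

print(datasets[0])
```

Each of the 100 series would then be passed to the same model-fitting routine, with the resulting posterior means averaged and their standard deviations reported, as summarized in Table 6.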
Although the pre-ceasefire association with fatalities is not robust across the low and high estimate counts, the models fit on the high counts, the low counts, and the best–high–low mixture all find support for an increased transition probability from state 2 to state 3 prior to ceasefires, as well as a reduced transition probability from state 3 to state 1, consistent with the findings in Table 3.

The UCDP high estimates have associated baseline transition rates consistent with less duration in state 3 than those of the best estimates, and while the estimate for a₃ based on the high estimates is similar to that based on the best estimates, the baseline AR coefficient for state 3 is markedly higher based on the UCDP high estimates. This reflects UCDP’s tendency towards conservative estimates. Using the UCDP low estimate counts instead of the best, the estimates for a₃ and the baseline AR coefficient for state 3 are both higher, but the low estimates have associated baseline transition rates consistent with even less duration in state 3. This suggests more pronounced differences between the distributions of battle death counts for states 2 and 3 under the UCDP low estimates (i.e., battle death counts that are flatter over time, with abrupt but short-lived spikes).

| trans. rates | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| ζ′₃ (2 → 1) | −4.73(.23) | −.37(.80) | 1.08(.33) | −1.02(.45) | 1.78(.66) | −.90(.58) | −.55(.13) | −.59(.15) |
| ζ′₄ (2 → 3) | −5.93(.45) | 1.60(.45) | .50(.51) | −.45(.57) | −.66(.76) | .52(.56) | −.44(.19) | −.59(.21) |
| ζ′₅ (3 → 1) | .30(.63) | −1.42(.75) | .13(.77) | −1.33(.56) | .44(.63) | .84(.76) | −.52(.53) | −.30(.59) |
| ζ′₆ (3 → 2) | .58(.45) | −.28(.76) | .13(.74) | −.68(.58) | .12(.80) | −.18(.69) | .16(.39) | .43(.47) |

| AR coef. | baseline | pre-ceasefire | ceasefire | v2x | v2x² | v2x³ | GDP | pop |
|---|---|---|---|---|---|---|---|---|
| β′₁ (state 2) | −4.31(.05) | .07(.05) | −.13(.05) | 1.93(.26) | −4.13(.63) | 2.48(.41) | .25(.02) | .34(.02) |
| β′₂ (state 3) | −3.83(.48) | .64(.85) | −.09(.59) | −.96(.56) | −.60(.78) | .46(.70) | .66(.25) | 1.08(.43) |

| other | a₁ | a₂ | a₃ | c | π₂ | π₃ |
|---|---|---|---|---|---|---|
| | .00(.00) | .09(.00) | 6.18(.32) | .02(.00) | .04(.03) | .04(.04) |

TABLE 6. Average of posterior means of each HMM parameter over 100 data sets, each data set constructed by uniformly sampling over the best, high, or low count of battle deaths, when variation exists. Standard deviations of the posterior means are given in parentheses. Compare with Table 3.

6. Concluding remarks.

There are many research directions we hope to investigate in furthering this work. In particular, we hope to distinguish between ceasefires of different types and scope, separating those ceasefires that have markedly different characteristics (e.g., cessation of hostility arrangements versus preliminary ceasefires). A complementary research direction is to include a covariate to account for the duration of a given conflict (i.e., time in state 2 and/or state 3). This would allow us to address questions relating to whether conflicts tend to have limited persistence. Finally, spatial correlations in conflict data (e.g., due to geopolitics) are well established (Gleditsch, 2007). In part, they stem from the fact that certain conflict-inducing factors are present in the same area. Poverty, for instance, is geographically clustered. But they also arise through conflict dynamics. Violent conflict can spill over from one country to another, such as between Rwanda and the Democratic Republic of Congo. This might be because ethnic groups cohabit both sides of a border, or it might be because a government intervenes in another country to prevent the organization of armed resistance, such as the Israeli intrusion into Lebanon.
In any case, spatial correlations are a feature of these data that we plan to address in future research, in a subsequent, more complex formulation of our HMM approach.

Further open research problems include the following:

1. We investigate the patterns that civil conflicts share across countries. However, if labels were on the dyad level (i.e., pairs of armed and opposing actors) rather than the country level, would the same patterns emerge? Are there additional patterns that would emerge? What about patterns on the resolution of conflict labeled data?
2. In our analysis, we are able to estimate the effect of ceasefires for static time periods immediately surrounding the agreement. Further methods are needed to determine the length of persistence of a given ceasefire.
3. What are the effects of peacekeeping efforts on the sustainability of ceasefires?
4. What are the effects of seasonality on the sustainability of ceasefires? Certain countries and regions simply cannot maintain violence during certain parts of the year (e.g., due to seasonal weather patterns or the necessity of farming).

SUPPLEMENTARY MATERIAL

Code and data to reproduce the empirical results presented in the manuscript. The saved MCMC output for the real data analysis and the simulation study are provided, but they can of course be reproduced by re-running the provided script files and data. The figures presented in the paper, along with the figures relegated to the Supplementary Material, are contained in these files. See workflow.sh for line-by-line Unix command line code for reproducing the results. Instructions for how the code should be run and parallelized are also provided in the workflow.sh file.

REFERENCES

Aary, V. (1995). Concluding Hostilities: Humanitarian Provisions in Cease-Fire Agreements. Military Law Review 148 186–273.
Akebo, M. (2016). Ceasefire agreements and peace processes: A comparative study. Taylor & Francis.
Anders, T.
(2020). Territorial control in civil wars: Theory and measurement using machine learning. Journal of Peace Research 57 701–714.
Bara, C., Clayton, G. and Rustad, S. A. (2021). Understanding Ceasefires. International Peacekeeping 28 329–340. https://doi.org/10.1080/13533312.2021.1926236.
Besley, T., Fetzer, T. and Mueller, H. (2021). How big is the media multiplier? Evidence from dyadic news data. CESifo Working Paper.
Brickhill, J. (2018). Mediating security arrangements in peace processes: critical perspectives from the field. ETH Press, Zurich. OCLC: 1043551307.
Brunborg, H., Lyngstad, T. H. and Urdal, H. (2003). Accounting for Genocide: How Many Were Killed in Srebrenica? European Journal of Population / Revue européenne de Démographie 19 229–248.
Buchanan, C., Clayton, G. and Ramsbotham, A. (2021). Ceasefire monitoring: developments and complexities. Accord Spotlight: Conciliation Resources London, UK.
Chen, B., Shrivastava, A. and Steorts, R. C. (2018). Unique entity estimation with application to the Syrian conflict. The Annals of Applied Statistics 12 1039–1067.
Chounet-Cambas, L. (2011). Negotiating ceasefires. Mediation Practice Series Centre for Humanitarian Dialogue Geneva.
Clayton, G., Nathan, L. and Wiehler, C. (2021). Ceasefire Success: A conceptual framework. International Peacekeeping 28 341–365.
Clayton, G. and Sticher, V. (2021). The logic of ceasefires in civil war. International Studies Quarterly Online First.
Clayton, G., Mason, S., Sticher, V. and Wiehler, C. (2019). Ceasefires in Intra-state Peace Processes. CSS Analysis in Security Policy 252.
Clayton, G., Nygård, H. M., Rustad, S. A. and Strand, H. (2020a). Ceasefires in civil wars: A new research agenda (Introduction to Special Section of the Journal of Conflict Resolution). Working Paper.
Clayton, G., Nygård, H. M., Rustad, S. A. and Strand, H. (2020b). Strategic Suspension: The correlates of ceasefires in civil conflict. this issue.
Clayton, G., Nygård, H. M., Strand, H., Rustad, S. A., Wiehler, C., Sagård, T., Landsverk, P., Ryland, R., Sticher, V., Wink, E. and Bara, C. (2021). Introducing the Civil Conflict Ceasefire Dataset. ETH Working Paper.
Coppedge, M., Gerring, J., Knutsen, C. H., Lindberg, S. I., Teorell, J., Altman, D., Bernhard, M., Fish, M. S., Glynn, A., Hicken, A., Lührmann, A., Marquardt, K. L., McMann, K., Paxton, P., Pemstein, D., Seim, B., Sigman, R., Skaaning, S.-E., Staton, J., Wilson, S., Cornell, A., Gastaldi, L., Gjerløw, H., Ichenko, N., Krusell, J., Maxwell, L., Mechkova, V., Medzihorsky, J., Pernes, J., von Römer, J., Stepanova, N., Sundström, A., Tzelgov, E., Wang, Y.-T., Wig, T. and Ziblatt, D. (2019). V-Dem [Country-Year/Country-Date] Dataset v9.
Crisman-Cox, C. (2022). Democracy, reputation for resolve, and civil conflict. Journal of Peace Research 59 382–394.
Dahl, R. A. (1971). Polyarchy: Political Participation and Opposition. Yale University Press, New Haven, CT.
Davenport, C., Nygård, H. M., Fjelde, H. and Armstrong, D. (2019). The Consequences of Contention: Understanding the Aftereffects of Political Conflict and Violence. Annual Review of Political Science 22 1–30.
Davis, R. A., Fokianos, K., Holan, S. H., Joe, H., Livsey, J., Lund, R., Pipiras, V. and Ravishanker, N. (2021). Count time series: A methodological review. Journal of the American Statistical Association 1–15.
Dawkins, S. (2021). The problem of the missing dead. Journal of Peace Research 58 1098–1116.
de Soto, A. (1999). Ending Violent Conflict in El Salvador. In Herding Cats: Multiparty Mediation in a Complex World. United States Institute of Peace Press, Washington D.C.
Dukalskis, A. (2015). Why Do Some Insurgent Groups Agree to Cease-Fires While Others Do Not? A Within-Case Analysis of Burma/Myanmar, 1948–2011. Studies in Conflict & Terrorism 38 841–863.
Fazal, T. M. (2014). Dead wrong?: Battle deaths, military medicine, and exaggerated reports of war’s demise. International Security 39 95–125.
Fortna, V. P. (2003). Scraps of paper? Agreements and the durability of peace. International Organization 57 337–372.
Fortna, V. P. (2004). Peace time: Cease-fire agreements and the durability of peace. Princeton University Press.
George, A. L. and Bennet, A. (2005). Case studies and theory development in the social sciences. Cambridge University Press, Cambridge.
Ghobarah, H. A., Huth, P. and Russett, B. (2003). Civil wars kill and maim people—long after the shooting stops. American Political Science Review 97 189–202.
Gleditsch, K. S. (2007). Transnational Dimensions of Civil War. Journal of Peace Research 44 293–309.
Gleditsch, N. P., Wallensteen, P., Eriksson, M., Sollenberg, M. and Strand, H. (2002). Armed conflict 1946-2001: A new dataset. Journal of Peace Research 39 615–637.
Hanson, K. (2020). Live and Let Live: Explaining Long-term Truces in Separatist Conflicts. International Peacekeeping Online First 1–23.
Hegre, H., Ellingsen, T., Gates, S. and Gleditsch, N. P. (2001). Toward a Democratic Civil Peace? Democracy, Political Change, and Civil War, 1816–1992. American Political Science Review 95 33–48.
Höglund, K. (2005). Violence and the peace process in Sri Lanka. Civil Wars 7 156–170.
Holtermann, H. (2021). Blinding the Elephant: Combat, Information, and Rebel Violence. Terrorism and Political Violence 33 1469–1491.
Högbladh, S. (2023). UCDP GED Codebook version 23.1. Department of Peace and Conflict Research, Uppsala University.
Höglund, K. (2011). Tactics in Negotiations between States and Extremists: The Role of Cease-Fires and Counterterrorist Measures. In Engaging Extremists: Trade-Offs, Timing, and Diplomacy (I. W. Zartman and G. O. Faure, eds.) United States Institute of Peace, Washington D.C.
Jakobsen, J. H. (2021). Application of count time series to battle deaths. Master’s Thesis, University of Oslo: http://urn.nb.no/URN:NBN:no-90532.
Jarman, N. (2004). From war to peace? Changing patterns of violence in Northern Ireland, 1990–2003. Terrorism and Political Violence 16 420–438.
Karakus, D. C. and Svensson, I. (2020). Between the Bombs: Exploring Partial Ceasefires in the Syrian Civil War, 2011–2017. Terrorism and Political Violence 32 681–700.
Kolås, Å. (2011). Naga militancy and violent politics in the shadow of ceasefire. Journal of Peace Research 48 781–792.
Kreutz, J. (2010). How and When Armed Conflicts End: Introducing the UCDP Conflict Termination Dataset. Journal of Peace Research 47 243–250.
Krtsch, R. (2021). The Tactical Use of Civil Resistance by Rebel Groups: Evidence from India’s Maoist Insurgency. Journal of Conflict Resolution 65 1251–1277.
Lacina, B. (2006). Explaining the Severity of Civil Wars. The Journal of Conflict Resolution 50 276–289.
Lacina, B. and Gleditsch, N. P. (2012). The waning of war is real: A response to Gohdes and Price. Journal of Conflict Resolution 57 1109–1127.
Lutscher, P. M., Weidmann, N. B., Roberts, M. E., Jonker, M., King, A. and Dainotti, A. (2020). At Home and Abroad: The Use of Denial-of-service Attacks during Elections in Nondemocratic Regimes. Journal of Conflict Resolution 64 373–401.
Luttwak, E. N. (1999). Give war a chance. Foreign Affairs 36–44.
Mahieu, S. (2007). When should mediators interrupt a civil war? The best timing for a ceasefire. International Negotiation 12 207–228.
Palik, J. (2021). Watchdogs of Pause: The Challenges of Ceasefire Monitoring in Yemen. International Peacekeeping 0 1–26.
Petroff, V. B., Bond, J. H. and Bond, D. H. (2013). Using hidden Markov models to predict terror before it hits (again). In Handbook of computational approaches to counterterrorism 163–180. Springer.
Pinaud, M. (2020). Home-Grown Peace: Civil Society Roles in Ceasefire Monitoring. International Peacekeeping 0 1–26.
Plank, F. (2017). When Peace Leads to Divorce: The Splintering of Rebel Groups in Powersharing Agreements. Civil Wars 19 176–197.
Price, M. and Ball, P. (2014). Big data, selection bias, and the statistical patterns of mortality in conflict. The SAIS Review of International Affairs 34 9–20.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77 257–286.
Raleigh, C., Kishi, R. and Linke, A. (2023). Political instability patterns are obscured by conflict dataset scope conditions, sources, and coding choices. Humanities and Social Sciences Communications 10 1–17.
Randahl, D. and Vegelius, J. (2022). Predicting escalating and de-escalating violence in Africa using Markov models. International Interactions 48 597–613.
Reeder, B. W. and Seeberg, M. B. (2018). Fighting your friends? A study of intra-party violence in sub-Saharan Africa. Democratization 25 1033–1051.
Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association 79 871–880.
Sadinle, M. (2014). Detecting duplicates in a homicide registry using a Bayesian partitioning approach. The Annals of Applied Statistics 2404–2434.
Sadinle, M. (2018). Bayesian propagation of record linkage uncertainty into population size estimation of human rights violations. The Annals of Applied Statistics 12 1013–1038.
Satten, G. A. and Longini Jr, I. M. (1996). Markov chains with measurement error: Estimating the ‘true’ course of a marker of the progression of human immunodeficiency virus disease. Journal of the Royal Statistical Society: Series C (Applied Statistics) 45 275–295.
Schrodt, P. A. (1997a). Pattern Recognition of International Crises Using Hidden Markov Models. Nonlinear Models and Methods in Political Science.
Schrodt, P. A. (1997b). Early warning of conflict in Southern Lebanon using hidden Markov models. In American Political Science Association.
Schrodt, P. A. (2006). Forecasting conflict in the Balkans using hidden Markov models. In Programming for Peace 161–184. Springer.
Smith (1995). Stopping Wars: Defining The Obstacles To Cease-fire. Westview Press, Boulder, Colorado.
Sosnowski, M. (2020). Ceasefires as violent state-building: local truce and reconciliation agreements in the Syrian civil war. Conflict, Security & Development 20 273–292.
Sundberg, R. and Melander, E. (2013). Introducing the UCDP Georeferenced Event Dataset. Journal of Peace Research 50 523–532.
Tai, X. H., Mehra, S. and Blumenstock, J. E. (2022). Mobile phone data reveal the effects of violence on internal displacement in Afghanistan. Nature Human Behaviour 6 624–634.
Waterman, A. (2020). Ceasefires and State Order-Making in Naga Northeast India. International Peacekeeping Online First 1–30.
Weidmann, N. B. (2015). On the Accuracy of Media-based Conflict Event Data. Journal of Conflict Resolution 59 1129–1149.
Weiss, C. H. (2018). An introduction to discrete-valued time series. John Wiley & Sons.
Williams, J. P., Storlie, C. B., Therneau, T. M., Jack Jr, C. R. and Hannig, J. (2020). A Bayesian approach to multistate hidden Markov models: application to dementia progression. Journal of the American Statistical Association 115 16–31.
Williams, J. P., Hermansen, G. H., Strand, H., Clayton, G. and Nygård, H. M. (2024). Supplement to “Bayesian hidden Markov models for latent variable labeling assignments in conflict research: application to the role ceasefires play in conflict dynamics”.
Wood, R. M. (2014). From Loss to Looting? Battlefield Costs and Rebel Incentives for Violence. International Organization 68 979–999.
Woods, K. (2011). Ceasefire capitalism: military–private partnerships, resource concessions and military–state building in the Burma–China borderlands. The Journal of Peasant Studies 38 747–770.
Xu, H.-Y., Xie, M., Goh, T. N. and Fu, X. (2012). A model for integer-valued time series with conditional overdispersion. Computational Statistics & Data Analysis 56 4229–4242.
Åkebo, M. (2016). Ceasefire Agreements and Peace Processes: A Comparative Study. Routledge, London.