Markov switching negative binomial models: an application to vehicle accident frequencies
In this paper, two-state Markov switching models are proposed to study accident frequencies. These models assume that there are two unobserved states of roadway safety, and that roadway entities (roadway segments) can switch between these states over…
Authors: Nataliya V. Malyshkina, Fred L. Mannering, Andrew P. Tarko
Mark o v switc hing negativ e bi nomial mo dels: an appl ication to v eh i cle accident frequencies Natal iya V. Malyshkin a ∗ , F red L. Manner ing, Andrew P . T ar k o Scho ol o f Civil Engine ering, 550 Stadium Mal l Drive, Pur due Univ ersity, West L afayette, IN 47907 , Unite d States Abstract In this p ap er, t wo-stat e Mark o v switc hing mod els are prop osed to study accident frequencies. These mo dels assum e th at th er e are t wo un observ ed states of road- w ay safet y , and that roadwa y en tities (roadw a y segmen ts) can sw itc h b et we en th ese states o v er time. The states are distinct, in the sense that in the differen t states acci- den t frequencies are generated b y separate count ing pro cesses (b y separate Poisson or negativ e binomial pro cesses). T o demonstrate the applicabilit y of the approac h present ed h erein, t w o-state Mark ov switc hing negativ e binomial mo dels are esti- mated using five- y ear acciden t frequencies on selected Ind iana in terstate high wa y segmen ts. Bay esian inference metho ds and Mark o v Chain Mon te Carlo (MCMC) sim u lations are used for mo d el estimat ion. The estimate d Ma rk ov switc hing mo d- els r esult in a sup erior statistica l fit relativ e to the standard (single-sta te) negativ e binomial mo del. It is foun d that the more fr equen t state is safer and it is correlated with b etter w eather conditions. The less f requen t state is found to b e less safe and to b e correlated with adv erse weat her conditions. Key wor ds: Acciden t frequen cy; n egativ e b in omial; coun t data mo del; Marko v switc hing; Ba y esian; MCMC 1 In tr o duction V ehicle acciden ts place an incredible so cial and economic burden on society . As a result, considerable researc h has b een conducted on understanding and pre- dicting a cciden t frequencies ( t he n umber of a cciden ts o ccurring on ro adw ay ∗ Corresp ond ing author. Email addr esses: nmal yshk@purd ue.edu (Nataliy a V. Malyshkina), flm@ecn. purdue.e du (F red L. Mannering), tarko@e cn.purdue .edu (Andrew P . T ark o). Preprint subm itted to Acciden t Analysis and Preven tion segmen ts o ve r a give n time perio d). Because acciden t freque ncies are non- negativ e inte gers, coun t data mo dels are a reasonable statistical mo deling ap- proac h (W ashington et al., 2003). Simple mo deling a pproac hes include P oisson mo dels and negativ e binomial (NB) mo dels (Hadi et al., 1995; Shank ar et al., 1995; Poc h and Mannering, 1996; Miaou and Lord, 2003). These mo dels a s- sume a single pro cess f o r acciden t data generation (a P oisson pro cess or a negativ e binomial pro cess) and inv olv e a nonlinear regression of the ob- serv ed acciden t frequenc ies on v arious roadwa y-segmen t characteristics (suc h as roadw ay geometric a nd en vironmen tal factors). Because a prep onderance of zero-acciden t observ ations is often observ ed in empirical data, Miaou (1994), Shank a r et al. (1997) and others ha ve applied zero-inflated P oisson (ZIP) and zero-inflated negativ e binomial (ZINB) mo dels for predicting a cciden t frequen- cies. Zero-inflated mo dels assume a t wo-state pro cess for acciden t dat a gener- ation – one state is assum ed to b e safe with z ero acciden ts (o v er the duration of time b eing conside r ed) a nd the other state is assumed to b e unsafe with a p ossibilit y of nonzero acciden t f requencies . In zero-inflated mo dels, individual roadw ay segmen ts are a ssumed to b e alw ays in the safe or unsafe state. While the a pplicatio n o f zero-inflated mo dels often pro vides a b etter statistical fit of observ ed acc iden t frequenc y da t a, the applicability of these mo dels has been questioned b y L o rd et al. (2005, 20 07). In particular, Lord et al. (2005, 2007) argue that it is unreasonable to exp ect some roadw ay segmen ts to b e alw ays p erfectly safe. In addition, they argue that zero-inflated mo dels do not accoun t for a lik ely p ossibility for r o adw ay segmen ts to c hange in time from one state to ano ther. In t his paper, t wo-state Marko v switc hing coun t data models are explored as a metho d for studying acciden t frequencies. These mo dels assume Mark ov switc hing (ov er time) b et w een tw o unobse rv ed states of ro a dw ay safety . 1 There are a n um b er of r easons to exp ect the existence of m ultiple states. First, the safet y of roadw ay segmen ts is lik ely to v a ry under differen t en viron- men tal conditions, driv er reactions and other factors that ma y not neces sar- ily b e a v ailable to t he analyst. F or an illustration, consider Fig ure 1, whic h sho ws w eekly time series of the n umber of acciden ts on selecte d Indiana in- terstate segme n ts during the 1 9 95-1999 time interv al. The figure shows t hat the n umber of a cciden ts p er w eek fluctuates widely o ver time. Th us, under differen t conditions, r o ads can b ecome considerably more or less safe. As a r e- sult, it is reasonable to assume that there exist tw o o r more states of ro adw ay safet y . These states can help accoun t for the existen ce of n umerous uniden- tified and/or unobs erv ed factors (unobserv ed heterogeneit y) that influenc e roadw ay safet y . Mar ko v switc hing mo dels are designed to a ccoun t for unob- 1 In fact, there ma y b e more than t w o states but, for illustration pur p oses, the tw o- state case will b e considered in this pap er. Extending the approac h to the p ossibility of additional stat es would significan tly complicate the model structure. Ho w ever, once this extension w ere done, add itional states could b e empirically tested. 2 Jan−95 Jul−95 Jan−96 Jul−96 Jan−97 Jul−97 Jan−98 Jul−98 Jan−99 Jul−99 0 20 40 60 80 100 Date Number of accidents per week Fig. 1. W eekly acciden t frequencies on the sample of I n diana interstate segmen ts from 1995 to 1999 (the horizonta l dashed line sh ows the av erage v alue). serv ed stat es in a con v enien t and feasible wa y . In fact, these mo dels ha v e b een success fully applied in other s cientific fields. F or ex a mple, t w o - state Mark ov switc hing auto regressiv e mo dels h a ve been used in economics, where the t w o states are usually iden tified as economic recession and expansion (Hamilton, 1989; Mc Cullo c h and Ts ay, 1 994; T say, 20 02). Another reason to exp ect the existence of multiple states is the empirical suc- cess of zero-inflated mo dels, whic h are predicated o n the existence of t w o -state pro cess – a safe and an unsafe state (see Shank ar et al., 1997; Carson and Mannering, 2001; Lee and M a nnering, 2002). Mark o v switc hing can b e view ed as an exten- sion of previous work on zero-inflated mo dels, in that it relaxes the assumption that a safe state exists (whic h has b een brought up as a concern, see L o rd et al. (2005, 200 7)) and instead considers t w o significan tly different unsafe states. In addition, in con trast t o zero-inflated mo dels, Mark ov sw itc hing explicitly considers the p ossibilit y that roadw ay segmen ts can change their state ov er time. 2 Mo del sp ecification Mark ov switc hing mo dels are pa rametric and can b e fully sp ecified b y a lik eli- ho o d function f ( Y | Θ , M ), whic h is t he conditional pro ba bilit y distribution of the ve ctor of all observ ations Y , giv en the v ector of all parameters Θ of mo del M . In our s tudy , w e o bserv e the n um b er of acciden ts A t,n that o ccur on t he n th roadw ay segmen t during time p erio d t . Th us Y = { A t,n } includes all acci- den ts observ ed on all roa dw ay segmen ts o v er all time p erio ds ( n = 1 , 2 , . . . , N t and t = 1 , 2 , . . . , T ). Mo del M = { M , X t,n } includes the mo del’s name M (for example, M = “negative binomial”) and the v ector X t,n of all roadw ay segmen t c ha r acteristic v ar ia bles (segmen t length, curv e characteris tics, grades, pa ve men t prop erties, and so on). T o define the like liho o d function, w e first in tro duce an unobserv ed (latent) 3 state v ariable s t , whic h determines the state of all roadw a y segmen ts during time p erio d t . At eac h t , the state v ariable s t can assume only t w o v alues: s t = 0 corresp onds t o one state and s t = 1 correspo nds to the other state. The state v a r ia ble s t is a ssumed to follow a stationary tw o-state M a rk ov c hain process in time, 2 whic h can b e specified b y time -indep enden t transition probabilities as P ( s t +1 = 1 | s t = 0) = p 0 → 1 , P ( s t +1 = 0 | s t = 1) = p 1 → 0 . (1) Here, for example, P ( s t +1 = 1 | s t = 0) is the conditional probabilit y of s t +1 = 1 at time t + 1, given that s t = 0 at time t . Note that P ( s t +1 = 0 | s t = 0) = 1 − p 0 → 1 and P ( s t +1 = 1 | s t = 1) = 1 − p 1 → 0 . T ra nsition probabilities p 0 → 1 and p 1 → 0 are unkno wn parameters to b e estimated from accid en t data. The stationary unconditional probabilities ¯ p 0 and ¯ p 1 of states s t = 0 a nd s t = 1 are 3 ¯ p 0 = p 1 → 0 / ( p 0 → 1 + p 1 → 0 ) for state s t = 0 , ¯ p 1 = p 0 → 1 / ( p 0 → 1 + p 1 → 0 ) for state s t = 1 . (2) Without loss of generalit y , w e assume that (on a v erage) state s t = 0 o ccurs more or equally frequen tly tha n state s t = 1. Therefore, ¯ p 0 ≥ ¯ p 1 , and from Eqs. (2) w e obtain restriction 4 p 0 → 1 ≤ p 1 → 0 . (3) W e refer to states s t = 0 and s t = 1 as “more frequen t ” and “less frequen t” states respectiv ely . Next, consider a t wo-state Mark ov switc hing negativ e binomial (MSNB) mo del that assumes ne gativ e binomial data- g enerating pro cesses in eac h of the t w o states. With this, the probability of A t,n acciden ts o ccurring on roadwa y se g- men t n during time p erio d t is 2 Mark o v prop ert y means that the probabilit y distr ib ution of s t +1 dep end s only on the v alue s t at time t , b ut not on th e previous history s t − 1 , s t − 2 , . . . (Breiman, 1969). Stationarit y of { s t } is in the statisti cal sense. Belo w w e will relax the assumption of stationarit y and discu ss a case of time-dep end ent transition probabilities. 3 These ca n b e found from stationarit y conditions ¯ p 0 = (1 − p 0 → 1 ) ¯ p 0 + p 1 → 0 ¯ p 1 , ¯ p 1 = p 0 → 1 ¯ p 0 + (1 − p 1 → 0 ) ¯ p 1 and ¯ p 0 + ¯ p 1 = 1 (Breiman, 1969). 4 Restriction (3) allo ws to a v oid the problem of switc hing of s tate labels, 0 ↔ 1. This problem would otherwise arise b ecause of the symmetry of the lik eliho o d fun c- tion (4)–(6) under the lab el switc hing. 4 P ( A ) t,n = Γ( A t,n + 1 /α t ) Γ(1 /α t ) A t,n ! 1 1 + α t λ t,n ! 1 /α t α t λ t,n 1 + α t λ t,n ! A t,n , (4) λ t,n = exp( β ′ (0) X t,n ) if s t = 0 exp( β ′ (1) X t,n ) if s t = 1 , (5) α t = α (0) if s t = 0 α (1) if s t = 1 , t = 1 , 2 , . . . , T , n = 1 , 2 , . . . , N t . Here, Eq. ( 4) is the standard negative binomial probability mass function (W ashington et al., 2003), Γ( ) is the g a mma function, pr ime means trans- p ose (so β ′ (0) is the transpo se of β (0) ), N t is the n um b er of ro a dw ay segmen ts observ ed during t ime p erio d t , and T is the total n umber of time p erio ds. P arameter v ectors β (0) and β (1) , and ov er- disp ersion parameters α (0) ≥ 0 and α (1) ≥ 0 are the unknown estimable para meters of n egativ e binomial mo dels in the t w o s t ates, s t = 0 and s t = 1 resp ectiv ely . 5 W e set the first comp o nen t of X t,n to unit y , and, therefore, the first comp onen ts of β (0) and β (1) are the in tercepts in the t w o states. If accid en t ev en ts a re assumed to b e indep enden t, the lik eliho o d function is f ( Y | Θ , M ) = T Y t =1 N t Y n =1 P ( A ) t,n . (6) Here, b ecause the state v ariables s t are unobserv able, the v ector of all es- timable parameters Θ must include all states, in addition to all mo del para m- eters ( β -s, α - s) and transition probabilities. Th us, Θ = [ β ′ (0) , α (0) , β ′ (1) , α (1) , p 0 → 1 , p 1 → 0 , S ′ ] ′ , S ′ = [ s 1 , . . . , s T ] . (7) V ector S ha s length T and contains all state v alues. Eqs. (1)-(7) define the t wo-state Mark o v switc hing negativ e binomial (MSNB) mo dels considered in this study . 5 T o ensu re that α (0) and α (1) are n on-negativ e, their logarithms are used in e sti- mation. 5 3 Mo del estimation methods Statistical estimation of Mark ov switc hing mo dels is complicated b y uno bserv- abilit y of the state v ariables s t . 6 As a res ult, the traditio nal maximum lik eli- ho o d estimation ( MLE) pro cedure is of v ery limited use for Mark ov switc hing mo dels. Instead, a Bay esian inference appro a c h is used. G iv en a mo del M with likelihoo d function f ( Y | Θ , M ), the Ba y es form ula is f ( Θ | Y , M ) = f ( Y , Θ |M ) f ( Y | M ) = f ( Y | Θ , M ) π ( Θ |M ) R f ( Y , Θ |M ) d Θ . (8) Here f ( Θ | Y , M ) is the p osterior probability distribution o f mo del parameters Θ c o nditional on t he o bserv ed data Y and model M . F unction f ( Y , Θ |M ) is the joint probabilit y distribution of Y and Θ g iv en mo del M . F unction f ( Y | M ) is the marg inal lik eliho o d function – the probabilit y distribution of data Y giv en mo del M . F unction π ( Θ |M ) is the pr io r probability distribu- tion of parameters that reflects prior kno wledge ab out Θ . The in tuition b ehind Eq. (8) is straig h tforward: given mo del M , the p o sterior distribution accoun ts for b oth the observ ations Y and o ur prior kno wledge of Θ . W e use the har- monic mean form ula to calculate the marginal lik eliho o d f ( Y |M ) of data Y (see Kass and Ra f tery, 1 9 95) a s, f ( Y | M ) − 1 = Z f ( Θ | Y , M ) f ( Y | Θ , M ) d Θ = E h f ( Y | Θ , M ) − 1 Y i , (9) where E ( . . . | Y ) is the p osterior expectation (whic h is calculated b y using the p osterior distribution). In our study (and in most practical studies), the direct application of Eq. (8) is not feasible b ecause the parameter v ector Θ con ta ins to o many comp o- nen ts, making in tegration o v er Θ in Eq. (8) extrem ely difficu lt. Ho w eve r, the p osterior distribution f ( Θ | Y , M ) in Eq. (8) is kno wn up to its normalization constan t, f ( Θ | Y , M ) ∝ f ( Y | Θ , M ) π ( Θ |M ). As a r esult, we use Mark ov Chain Monte Carlo (MCMC) simulations, whic h prov ide a con venie nt and practical c omputational metho dology for sampling from a probability distri- bution kno wn up to a constan t (the p osterior distribution in our cas e). Given a large enough p osterior sample of parameter v ector Θ , any p osterior exp ecta- tion and v ariance can b e found and Bay esian inference can b e readily applied. In the App endix w e describe our choice of prior distribution π ( Θ |M ) and the MCMC sim ulations. The prior distribution is ch osen to b e wide and essen tially noninformativ e. F or the MCMC sim ulations in this pap er, sp ecial n umerical 6 Belo w we will ha v e 260 time p erio d s ( T = 260). In this case, th er e are 2 260 p ossible com binations for v alue of v ector S = [ s 1 , . . . , s T ] ′ . 6 co de was written in the MA TLAB pro gramming language and tested on arti- ficial acciden t da ta sets. The test pro cedure included a generation of artificial data with a kno wn mo del. Then thes e data w ere used to estimate the unde r - lying mo del by means of o ur sim ulation code. With this pro cedure w e found that the MSNB models, used to generate the artificial data, were repro duced success fully with our estimation co de. F or comparison of differen t mo dels w e use the followin g Ba y esian approa c h. Let there b e tw o mo dels M 1 and M 2 with parameter v ectors Θ 1 and Θ 2 resp ectiv ely . Assu ming that w e ha v e equal preferences of these mo dels, the ir prior probabilities are π ( M 1 ) = π ( M 2 ) = 1 / 2. In this case, t he ratio of the mo dels’ p osterior probabilities, P ( M 1 | Y ) and P ( M 2 | Y ), is equal to t he Ba yes factor. The later is defined as the ra tio of the mo dels’ marginal lik eliho o ds (Kass a nd Raftery, 1995). Th us, w e ha v e P ( M 2 | Y ) P ( M 1 | Y ) = f ( M 2 , Y ) /f ( Y ) f ( M 1 , Y ) /f ( Y ) = f ( Y | M 2 ) π ( M 2 ) f ( Y | M 1 ) π ( M 1 ) = f ( Y | M 2 ) f ( Y | M 1 ) , (10) where f ( M 1 , Y ) and f ( M 2 , Y ) a r e the join t distributions of the mo dels and the data, f ( Y ) is the unconditional distribution of t he dat a , and the marginal lik eliho o ds f ( Y |M 1 ) and f ( Y |M 2 ) are giv en by Eq. (9). If the rat io in Eq. (10) is larger tha n one, then mo del M 2 is fa vored, if the ratio is less than one, then mo del M 1 is fa v o red. An adv an tag e of the use of Bay es factors is that it has an inheren t p enalt y for including to o man y parameters in the mo del and guards against o v erfitting. 4 Mo del estimation results Data are used from 5769 acciden ts t ha t we re observ ed on 3 3 5 in terstate high- w ay segmen t s in Indiana in 1995 -1999. W e use we ekly time p erio ds, t = 1 , 2 , 3 , . . . , T = 260 in tot al. 7 Th us, in the presen t study the state ( s t ) is the same fo r all roadw a y segmen ts and can c hange ev ery w eek. F our types of acciden t frequency models are estimated: • First, we estimate a standard (single-state) ne g ativ e binomial (NB) mo del without Mark o v switc hing by maxim um lik eliho o d estimation (MLE). W e refer to this mo del as “NB-b y-MLE”. • Second, w e estimate the same standard negativ e binomial mo del b y the Ba ye sian in ference approac h and the MCMC sim ulat io ns. W e refer to this 7 A week is from Sunda y to S aturda y , there are 260 full w eeks in the 199 5-1999 time in terv al. W e a lso consid ered d aily time p erio d s and obtained qualita tively similar results (not rep orted here). 7 mo del as “ NB-b y-MCMC”. As one exp ects, for our c hoice of a non-informativ e prior distribution, the estimated NB-b y-MCMC mo del turned out to b e v ery similar to the NB- b y-MLE mo del. • Third, w e es timate a restricte d t w o-state M ark ov switc hing negativ e bino- mial (MSNB) mo del. In this restricted switch ing mo del only the inte rcept in the mo del parameters vec tor β and the ov er-disp ersion parameter α are allo wed to switc h b etw een the t w o s tates of ro adw ay safet y . In other w ords, in Eq. (5) only the first comp onents of v ectors β (0) and β (1) ma y differ, while the remaining c o mp onen ts are restricted to be the same. In this case, the t wo states c an hav e differe n t a v erage acc iden t rates, giv en by Eq. (5) , but the rates ha v e the same dep endence on the explanatory v aria bles. W e refer to this mo del as “restricted MSNB”; it is estimated b y the Bay esian-MCMC metho ds. • F ourth, we estimate a f ull tw o-state Marko v switc hing negative binomial (MSNB) mo del. In this mo del all estimable mo del parameters ( β -s and α ) are allow ed to switc h betw een the t w o states of roadw a y safet y . T o obtain the final full MSNB mo del rep orted here, w e consecutiv ely construct and use 60%, 85% and 95% Bay esian credible in terv als fo r ev aluation of the statistical significance of eac h β -parameter. As a res ult, in the final model some comp onen ts of β (0) and β (1) are restricted to ze ro or restricted to b e the same in the tw o stat es. 8 W e do not imp ose a n y restrictions on o v er- disp ersion parameters ( α - s). W e refer to the final full MSNB mo del as “ full MSNB”; it is estimated b y the Bay esian-MCMC methods. Note that the t w o states, and th us the M SNB mo dels, do not ha ve to exis t . F or example, they will not exist if all estimated mo del parameters turn out to b e statistically the same in the tw o states, β (0) = β (1) , (whic h suggests the tw o states are identical and the MSNB mo dels reduce to the standard NB mo del). Also, the tw o states will not exist if all estimated s tate v ariables s t turn out to b e close to zero, resulting in p 0 → 1 ≪ p 1 → 0 (compare to Eq. (3)), then the less frequen t state s t = 1 is not realized and the pro cess s ta ys in state s t = 0. The mo del estimation results for acciden t frequencies are give n in T able 1 . P osterior (or MLE) estimates of all con tinuous mo del para meters ( β -s, α , p 0 → 1 and p 1 → 0 ) are g iven tog ether with the corresp o nding 95% confidence in terv als for MLE mo dels and 9 5% credible in terv als for Ba ye sian-MCMC mo dels (refer to the sup erscript and subscript num b ers a dj a cen t to parameter p osterior/MLE estimates in T able 1). 9 T able 2 gives summary statistics of all 8 A β -parameter is restricted to zero if it is statistically insignificant. A β -parameter is restricted to b e the same in the t wo states if the d ifference of its v alues in the t wo states is statistically in significan t. A (1 − a ) cred ib le in terv al is c hosen in such w ay that the p osterior probabilities of b eing b elo w and ab o ve it are b oth equal to a/ 2 (w e u se s ignificance lev els a = 40% , 15% , 5%). 9 Note that MLE estimation assumes asymptotic normalit y of the estimates, resu lt- ing in confid ence interv als b eing sym m etric around the means (a 95% confidence 8 Jan−95 Jul−95 Jan−96 Jul−96 Jan−97 Jul−97 Jan−98 Jul−98 Jan−99 Jul−99 0 20 40 60 80 100 Date Number of accidents per week Jan−95 Jul−95 Jan−96 Jul−96 Jan−97 Jul−97 Jan−98 Jul−98 Jan−99 Jul−99 0 0.2 0.4 0.6 0.8 1 Date P(S t =1|Y) Fig. 2. The top p lot is the same as Figure 1. Th e b ottom p lot sho ws w eekly p osterior probabilities P ( s t = 1 | Y ) for the full MSNB mo del. roadw ay segmen t ch aracteristic v ariables X t,n (except the in tercept). T o visually see how the mo del tr ac ks the data, consider Fig ure 2. The to p plot in Figure 2 sho ws the we ekly acciden t frequencies in our data as give n in Figure 1. The b otto m plot in Figure 2 sho ws corresp onding w eekly p osterior probabilities P ( s t = 1 | Y ) o f the less freque n t state s t = 1 for the full MSNB mo del. These probabilities are equal to the p osterior exp ectations of s t , P ( s t = 1 | Y ) = 1 × P ( s t = 1 | Y ) + 0 × P ( s t = 0 | Y ) = E ( s t | Y ). W eekly v a lues of P ( s t = 1 | Y ) for the restricted MSNB mo del are v ery similar to those giv en on the top plot in Figure 2, and, as a result, are not shown on a separate plot. Indeed, the time-cor r elat io n 10 b et wee n P ( s t = 1 | Y ) for the tw o MSNB mo dels is ab o ut 99 . 5 %. T urning to the estimation results, the findings show that t wo states exist and Mark ov switc hing mo dels are strongly fav ored b y the em pirical data. In par- ticular, in the restricted MSNB mo del we ov er 99 . 9% confident tha t the differ- in terv al is ± 1 . 96 standard deviations around the mean). In con trast, Ba yesia n esti- mation d o es not require this assumption, and p osterior distributions of parameters and Ba y esian credible in terv als are usually non-symmetric. 10 Here and b elo w w e calculate weigh ted correlation coefficien ts. F or v ariable P ( s t = 1 | Y ) ≡ E ( s t | Y ) we use w eigh ts w t in versely prop ortional to the p osterior standard deviations of s t . That is w t ∝ min { 1 / std( s t | Y ) , median[1 / std( s t | Y )] } . 9 ence in v alues of β -in tercept in the t w o states is non-zero. 11 T o compare the Mark ov switc hing mo dels (restricted and full MSN B) with the corresponding standard non-switc hing mo del (NB), w e calculate and use Ba yes factors giv en b y Eq. (10). W e use Eq. (9) and b o otstra p sim ulations 12 for calculation of the v a lues and the 95% confidence in terv als of the logarithms of the marginal like li- ho o ds giv en in T able 1. The log-marginal- lik eliho o ds are − 161 08 . 6, − 15850 . 2 and − 15809 . 4 for the NB, restricted MSNB and full MSNB mo dels respec- tiv ely . Therefore, the restricted a nd full MSNB models pro vide considerable (258 . 4 a nd 299 . 2 ) improv emen ts of the log- marginal-like liho o ds o f the data as compared to the corresp onding standard non-switc hing NB mo del. Th us, giv en the acciden t data, the po sterior probabilities of the restricted and full MSNB models are larger than the probability o f the standard NB mo del b y e 258 . 4 and e 299 . 2 resp ectiv ely . W e can also use a classical statistics a pproac h for mo del comparison, based on the maxim um likelihoo d estimation (MLE). Referring to T a ble 1 , the MLE giv es the maxim um log-like liho o d v alue − 16 0 81 . 2 for the standard NB mo del. The maxim um log-lik eliho o d v alues observ ed during our MC MC sim ulations for the restricted and full MSNB mo dels are − 1578 6 . 6 and − 15744 . 8 resp ec- tiv ely . An imag ina ry MLE, at its con v ergence, w ould giv e MSNB log-lik eliho o d v a lues that were ev en la rger than these observ ed v alues. Therefore, the MSNB mo dels provide v ery lar g e (at least 294 . 6 and 336 . 4) improv ements in the max- im um log- lik eliho o d v alue o ve r the standard NB mo del. These impro v emen t s come with only mo dest increases in the n um b er of free con tin uous mo del pa- rameters ( β -s and α -s) that enter the lik eliho o d function. Both the Ak aik e Information Criterion (AIC) and the Ba yes ia n Information Criterion (BIC) 13 strongly fa v or the MSNB mo dels o v er the NB mo del. 11 The difference of the inte r cept v alues is statistically non-zero despite the fact that the 95% credib le int er v als for these v alues o ve r lap (see the “Intercept” line and the “Restricted MSNB” co lu mns in T able 1). The reaso n is that the p osterior d ra ws of th e in tercepts are correlated. Th e statistical test of whether the in tercept v alues differ, m ust b e based on ev aluation of their difference. 12 During b o otstrap sim u lations we rep eatedly dr aw, with replacemen t, p osterior v alues of Θ to calculate the p osterior exp ectation in Eq. (9). In eac h of 10 5 b o otstrap dra w s th at we make , the num b er of Θ v alues dra w n is 1/100 of the total num b er of all p osterior Θ v alues av ailable fr om MCMC simulations. 13 Minimization o f AI C = 2 K − 2 LL and B I C = K ln( N ) − 2 LL ens ures an opti- mal choice o f explanatory v ariables in a mod el and a v oids ov erfitting (Tsa y, 2002; W ashington et al., 2003). Here K is the num b er of free cont in u ous model parame- ters that en ter the lik eliho o d function, N is the num b er of observ ations and LL is the log-lik eliho o d. Wh en N ≥ 8, BIC fav ors few er f ree p arameters than AIC do es. 10 T able 1 Estimation results for n egativ e b inomial mo dels of accident frequency (the sup erscrip t and subs cript num b ers to the righ t of individual parameter p osterior/MLE estimates are 95% confidence/credible in terv als – see text for furth er explanation) V ariable NB-b y-ML E a NB-b y-MC MC b Restricted MSNB c F ull MSNB d state s = 0 state s = 1 state s = 0 state s = 1 In tercept (constan t term) − 21 . 3 − 18 . 7 − 23 . 9 − 20 . 6 − 18 . 5 − 22 . 7 − 20 . 9 − 18 . 7 − 23 . 0 − 19 . 9 − 17 . 8 − 22 . 1 − 20 . 7 − 18 . 7 − 22 . 8 − 20 . 7 − 18 . 7 − 22 . 8 Acciden t o ccurring on inte r states I-70 or I-164 (dumm y) − . 655 − . 562 − . 748 − . 657 − . 565 − . 750 − . 656 − . 564 − . 748 − . 656 − . 564 − . 748 − . 660 − . 568 − . 752 − . 660 − . 568 − . 752 Pa v emen t quali t y index (PQI) av erage e − . 0132 − . 00581 − . 0205 − . 0189 − . 0134 − . 0244 − . 0195 − . 0141 − . 0248 − . 0195 − . 0141 − . 0248 − . 0220 − . 0166 − . 0273 − . 0125 − . 00700 − . 0180 Road segmen t length (in miles) . 0512 . 0809 . 0215 . 0546 . 0826 . 0266 . 0538 . 0812 . 0264 . 0538 . 0812 . 0264 . 0395 . 0625 . 0165 . 0395 . 0625 . 0165 Logarithm of road segment length (in miles) . 909 . 974 . 845 . 903 . 964 . 842 . 900 . 961 . 840 . 900 . 961 . 840 . 913 . 973 . 853 . 913 . 973 . 853 T otal num b er of ramps on the road viewing and opp osite sides − . 0172 − . 00174 − . 0327 − . 021 − . 00624 − . 0358 − . 0187 − . 00423 − . 0331 − . 0187 − . 00423 − . 0331 – − . 0264 − . 00656 − . 0464 Number of ramps on the viewing side p er l ane p er mi le . 394 . 479 . 309 . 400 . 479 . 319 . 397 . 475 . 317 . 397 . 475 . 317 . 359 . 429 . 289 . 359 . 429 . 289 Median configuration is depressed (dumm y) . 210 . 314 . 106 . 214 . 318 . 111 . 211 . 315 . 108 . 211 . 315 . 108 . 209 . 313 . 107 . 209 . 313 . 107 Median barr ier presence (dummy) − 3 . 02 − 2 . 38 − 3 . 67 − 2 . 99 − 2 . 40 − 3 . 67 − 3 . 01 − 2 . 42 − 3 . 69 − 3 . 01 − 2 . 42 − 3 . 69 − 3 . 01 − 2 . 42 − 3 . 69 − 3 . 01 − 2 . 42 − 3 . 69 In terior shoulder presence (dummy) − 1 . 15 − . 486 − 1 . 81 − 1 . 06 . 135 − 2 . 26 − 1 . 02 . 148 − 2 . 23 − 1 . 02 . 148 − 2 . 23 − 1 . 16 − . 523 − 1 . 87 − 1 . 16 − . 523 − 1 . 87 Width of the interior shoulder is less that 5 feet (dumm y) . 373 . 477 . 270 . 384 . 491 . 279 . 386 . 492 . 281 . 386 . 492 . 281 . 380 . 486 . 275 . 380 . 486 . 275 In terior rumble strips presence (dumm y) − . 166 − . 0382 − . 293 − . 142 . 857 − 1 . 16 − . 163 . 836 − 1 . 14 − . 163 . 836 − 1 . 14 – – Width of the outside shoulder is less that 12 feet (dummy) . 281 . 380 . 182 . 272 . 370 . 174 . 268 . 366 . 170 . 268 . 366 . 170 . 267 . 365 . 170 . 267 . 365 . 170 Outside barrier is absent (dummy) − . 249 − . 139 − . 358 − . 255 − . 142 − . 366 − . 255 − . 142 − . 366 − . 255 − . 142 − . 366 − . 251 − . 140 − . 362 − . 251 − . 140 − . 362 Ave r age ann ual daily traffic (AADT) − 4 . 09 − 3 . 04 − 5 . 15 × 10 − 5 − 4 . 09 − 3 . 24 − 4 . 95 × 10 − 5 − 4 . 07 − 3 . 22 − 4 . 94 × 10 − 5 − 4 . 07 − 3 . 22 − 4 . 94 × 10 − 5 − 3 . 90 − 3 . 11 − 4 . 72 × 10 − 5 − 4 . 53 − 3 . 61 − 5 . 48 × 10 − 5 Logarithm of av erage ann ual daily traffic 2 . 08 2 . 36 1 . 80 2 . 06 2 . 30 1 . 83 2 . 07 2 . 30 1 . 83 2 . 07 2 . 30 1 . 83 2 . 07 2 . 30 1 . 84 2 . 07 2 . 30 1 . 84 Po sted s peed limit (in mph) . 0154 . 0244 . 00643 . 0150 . 0241 . 00589 . 0161 . 0251 . 00697 . 0161 . 0251 . 00697 . 0161 . 0252 . 00712 . 0161 . 0252 . 00712 Number of bridges p er mile − . 0213 − . 00187 − . 0407 − . 0241 − . 00721 − . 0419 − . 0233 − . 00648 − . 0410 − . 0233 − . 00648 − . 0410 – − . 0607 − . 0232 − . 102 Maximum of recipro cal v alues of horizon tal curv e radii (in 1 / mile) − . 182 − . 122 − . 242 − . 179 − . 118 − . 241 − . 178 − . 117 − . 239 − . 178 − . 117 − . 239 − . 175 − . 114 − . 237 − . 175 − . 114 − . 237 Maximum of recipro cal v alues of v ertical curve r adii (in 1 / mile) . 0191 . 0285 . 00972 . 0177 . 027 . 00843 . 0183 . 0275 . 00917 . 0183 . 0275 . 00917 . 0184 . 0274 . 00925 . 0184 . 0274 . 00925 Number of v er tical curves p er mile − . 0535 − . 0180 − . 0889 − . 057 − . 0233 − . 0924 − . 0586 − . 0249 − . 0940 − . 0586 − . 0249 − . 0940 − . 0565 − . 0231 − . 0917 − . 0565 − . 0231 − . 0917 Pe r cen tage of single unit truc ks (daily av erage) 1 . 38 1 . 88 . 886 1 . 25 1 . 75 . 758 1 . 19 1 . 68 . 701 1 . 19 1 . 68 . 701 . 726 1 . 28 . 171 2 . 57 3 . 39 1 . 77 11 T able 1 (Con tinued) V ariable NB-by-MLE a NB-b y-MC MC b Restricted MSNB c F ull MSNB d state s = 0 state s = 1 sta te s = 0 state s = 1 Win ter season (dummy ) . 148 . 226 . 0698 . 148 . 226 . 0689 − . 116 . 0563 − . 261 − . 116 . 0563 − . 261 − . 159 − . 0494 − . 269 – Spring season (dumm y) − . 173 − . 0878 − . 258 − . 173 − . 0899 − . 257 − . 0932 . 0547 − . 209 − . 0932 . 0547 − . 209 – – Summer s eason (dummy) − . 179 − . 0921 − . 266 − . 180 − . 0963 − . 263 − . 0332 . 111 − . 146 − . 0332 . 111 − . 146 – − . 549 − . 293 − . 883 Ove r -disp ersion parameter α in NB mo dels . 957 1 . 07 . 845 . 968 1 . 09 . 849 . 537 . 677 . 392 1 . 24 1 . 51 . 986 . 443 . 595 . 300 1 . 16 1 . 39 . 945 Mean accident rate ( λ t,n for NB), av eraged o ver all v al ues of X t,n – . 0663 . 0558 . 1440 . 0533 . 1130 Standard deviation of acciden t rate ( p λ t,n (1 + αλ t,n ) for NB), a veraged o ver all v alues of explanat or y v ariables X t,n – . 2050 . 1810 . 3350 . 1760 . 2820 Marko v transition probability of jump 0 → 1 ( p 0 → 1 ) – – . 0933 . 147 . 0531 . 158 . 225 . 100 Marko v transition probability of jump 1 → 0 ( p 1 → 0 ) – – . 651 . 820 . 463 . 627 . 773 . 474 Unconditional pr obabilities of states 0 and 1 ( ¯ p 0 and ¯ p 1 ) – – . 873 . 929 . 797 and . 127 . 203 . 0713 . 798 . 868 . 718 and . 202 . 282 . 132 T otal num b er of free mo del parameters ( β -s and α -s ) 26 26 28 28 Po sterior a verage of the log-likelihoo d (LL) – − 16097 . 2 − 16091 . 3 − 16105 . 0 − 15821 . 8 − 15807 . 9 − 15835 . 2 − 15778 . 0 − 15672 . 9 − 15794 . 9 Max( LL ): estimated max. v alue of log-li k elihoo d (LL) for MLE; max. observed LL during MCM C simulations for Bay esian estim. − 16081 . 2 (MLE) − 16086 . 3 (observ . ) − 15786 . 6 (observed) − 15744 . 8 (observed) Logarithm of marginal li ke l ihoo d of data (ln[ f ( Y |M )]) – − 16108 . 6 − 16105 . 7 − 16110 . 7 − 15850 . 2 − 15840 . 1 − 15849 . 5 − 15809 . 4 − 15801 . 7 − 15811 . 9 Goo dness-of-fit p-v alue – 0 . 701 0 . 729 0 . 647 Maximum of the potential scale r eduction factors (PSRF) f – 1 . 00874 1 . 00754 1 . 00939 Multiv ariate poten tial scale reduction f actor (MPSRF) f – 1 . 00928 1 . 00925 1 . 01002 a Standard (con ven tional) negativ e binomial estimated b y maximum likelihoo d estimation (MLE). b Standard negative binomial estimated by Marko v Chain Monte Car l o (MCMC) simulat ions. c Restricted tw o-s tate Marko v switching negative binomial (MSNB) mo del with only the int er cept and o ver-disp ersion parameters allo wed to v ary b et we en states. d F ull t wo-state Marko v switch ing negative binomial (MSNB) mo del wi th all parameters allow ed to v ary betw een states. e The pav ement quality index (PQI) is a composi te measure of ov erall pa vemen t quality ev aluated on a 0 to 100 scale. f PSRF/MPSRF ar e calculated separately/jointly for all cont inuo us model parameters. PSRF and MPSRF are close to 1 for con verged MCMC c hains. 12 T able 2 Summary statistics of r oadw ay segmen t c haracteristic v ariables V ariable Mean Standard deviation Minimum Median Maximum Acciden t o ccurring on inte r states I-70 or I-164 (dumm y) . 155 . 363 0 0 1 . 00 Pa v emen t quali t y index (PQI) av erage a 88 . 6 5 . 96 69 . 0 90 . 3 98 . 5 Road segmen t length (in miles) . 886 1 . 48 . 00900 . 356 11 . 5 Logarithm of road segment length (in miles) − . 901 1 . 22 − 4 . 71 − 1 . 03 2 . 44 T otal num b er of ramps on the road viewing and opp osite sides . 725 1 . 79 0 0 16 Number of ramps on the viewing side p er l ane p er mi le . 138 . 408 0 0 3 . 27 Median configuration is depressed (dumm y) . 630 . 484 0 1 . 00 1 . 00 Median barr ier presence (dummy) . 161 . 368 0 0 1 In terior shoulder presence (dummy) . 928 . 258 0 1 1 Width of the interior shoulder is less that 5 feet (dumm y) . 696 . 461 0 1 . 00 1 . 00 In terior rumble strips presence (dumm y) . 722 . 448 0 1 . 00 1 . 00 Width of the outside shoulder is less that 12 feet (dummy) . 752 . 432 0 1 . 00 1 . 00 Outside barrier absence (dummy) . 830 . 376 0 1 . 00 1 . 00 Ave r age ann ual daily traffic (AADT) 3 . 03 × 10 4 2 . 89 × 10 4 . 944 × 10 4 1 . 65 × 10 4 14 . 3 × 10 4 Logarithm of av erage ann ual daily traffic 10 . 0 . 623 9 . 15 9 . 71 11 . 9 Po sted s peed limit (in mph) 63 . 1 3 . 89 50 . 0 65 . 0 65 . 0 Number of bridges p er mile 1 . 76 8 . 14 0 0 1 24 Maximum of recipro cal v alues of horizon tal curv e radii (in 1 / mile) . 650 . 632 0 . 589 2 . 26 Maximum of recipro cal v alues of v ertical curve r adii (in 1 / mile) 2 . 38 3 . 59 0 0 14 . 9 Number of v er tical curves p er mile 1 . 50 4 . 03 0 0 50 . 0 Pe r cen tage of single unit truc ks (daily av erage) . 0859 . 0678 . 00975 . 0683 . 322 Win ter season (dummy ) . 242 . 428 0 0 1 . 00 Spring season (dumm y) . 254 . 435 0 0 1 . 00 Summer s eason (dummy) . 254 . 435 0 0 1 . 00 a The pav ement quality index (PQI) is a composite measure of o verall pav ement quality ev aluated on a 0 to 100 scale. 13 F o cusing o n t he full MSNB mo del, whic h is statistically superior, its estimation results sho w that the less frequen t state s t = 1 is ab out four times as rare as the more frequen t state s t = 0 (refer to the es t imated v alues of the unconditional probabilities ¯ p 0 and ¯ p 1 of states 0 and 1, whic h ar e g iv en in t he “F ull MSNB” columns in T able 1). Also, the findings sho w that the les s f r equen t state s t = 1 is conside rably les s safe than the more frequen t stat e s t = 0 . This r esult fo llo ws f r om the v alues of the mean wee kly acciden t rate λ t,n [giv en b y Eq. (5) with mo del pa r a meters β - s set to their p o sterior means in the tw o states], av eraged ov er a ll v alues of the explanatory v ar ia bles X t,n observ ed in the data sample (see “mean acciden t rate” in T able 1). F or the full MSNB mo del, on a verage, state s t = 1 has ab out tw o time s mor e acciden ts p er w eek than state s t = 0 has. 14 Therefore, it is not a surprise, that in Figure 2 the w eekly n umber of acciden ts (s ho wn on the b ottom plot) is la rger when t he p osterior probability P ( s t = 1 | Y ) of t he state s t = 1 (show n on t he t o p plot) is higher. Note that t he long- term unconditional mean of the acciden t rates is equal to the av erage of the mean acciden t rate ov er the t wo states, t his av era g e is calculated by using the stationary probabilities ¯ p 0 and ¯ p 1 giv en b y Eq. (2) (see the “unconditional probabilities of states 0 and 1” in T able 1). It is also noteworth y tha t the num b er of a cciden ts is more volatile in the less frequen t and less-safe stat e ( s t = 1). This is reflected in the fact t ha t the standard deviation of the acciden t rate (std t,n = q λ t,n (1 + αλ t,n ) for NB distribution), a v erag ed o v er all v alues of explanatory v ariables X t,n , is higher in state s t = 1 t ha n in state s t = 0 ( r efer to T able 1). Moreo ver, for the full MSNB mo del the ov er- disp ersion parameter α is higher in state s t = 1 ( α = 0 . 4 43 in state s t = 0 and α = 1 . 1 6 in stat e s t = 1). Because state s t = 1 is relatively rare, this suggests t hat ov er-disp ersed v olatility of a cciden t frequencies, whic h is of t en observ ed in empirical data, could b e in part due to the laten t switc hing b etw een the states, and in part due to high acciden t v olatility in the less frequen t and less safe state s t = 1 . T o study the effect of w eather ( whic h is usually unobserv ed heterogeneit y in most data bases) o n states, T able 3 giv es time-correlation co efficien ts b et wee n p osterior probabilities P ( s t = 1 | Y ) for the full MSNB mo del a nd w eather- condition v ariables. These c o rrelations w ere found b y using daily and hourly historical w eather data in Indiana, a v ailable at the Indiana State Climate 14 Note th at acciden t frequency rates can easily b e con verted from on e time p er io d to another (for e x amp le, we ekly rates can b e conv erted to an nual rates). Because acciden t ev en ts are in dep end en t, th e conv ersion is done b y a s u mmation of moment- generating (or c haracteristic) functions. The sum of Po iss on v ariates is Poi sson. The sum of NB v ariates is a lso NB if all explanatory v ariables d o not dep end on t ime ( X t,n = X n ). 14 T able 3 Correlations of the p osterior probabilities P ( s t = 1 | Y ) with weather-co n dition v ari- ables for the full MSNB mo del All y ear Win ter Summer (No v emb er–Marc h) (Ma y–Septem b er) Precipitation (inc h) 0 . 03 1 – 0 . 144 T emp erature ( o F ) − 0 . 518 − 0 . 591 0 . 201 Sno w fall (inch) 0 . 602 0 . 577 – > 0 . 2 (dumm y) 0 . 651 0 . 638 – F og / F rost (dum m y) 0 . 223 (frost) 0 . 539 (fog ) 0 . 05 1 Visibilit y distance (mile) − 0 . 221 − 0 . 232 − 0 . 126 Office at Purdue Univ ersity ( www.agry .purdue.edu/climate). F or these corre- lations, the precipitation and sno wfall amoun ts are daily amounts in inc hes a ve raged o v er the w eek and across sev eral w eather observ ation stations that are lo cated close to the roadw ay segmen ts. 15 The temp erature v ariable is the mean daily air temp erature ( o F ) a v eraged o ver the w eek and across the w eather statio ns. The effect of fog/frost is captured b y a dumm y v ariable that is equal to one if a nd o nly if the difference b etw een air and dewp oint temp era- tures do es not excee d 5 o F (in this case frost can form if the dewp oint is b elo w the freezing point 32 o F , and fog can form otherwise). The fog/frost dummies are calculated for ev ery hour and are a veraged ov er the w eek and across the w eather stations. Finally , visibilit y distance v ariable is the harmonic mean of hourly visibilit y distances, whic h are measured in miles ev ery hour and are a ve raged o v er the w eek and across the w eather stations. 16 T able 3 show s tha t the less frequen t and less safe state s t = 1 is p ositiv ely correlated with extreme temp eratures (lo w during win ter and high during summer), rain precipitations and sno wfalls, fogs a nd f rosts, lo w vis ibility dis- tances. It is reasonable to expect that during bad w eather, roads can b ecome significan tly less safe, r esulting in a change of the state of roadwa y safet y . As a useful test of the switc hing b et wee n the t w o states, all w eather v ariables, listed in T able 3, were a dded in to our full MSNB mo del. Ho wev er, when doing t his, the t wo states did not disappear a nd the p o sterior probabilities P ( s t = 1 | Y ) did not c hanged substan tially (the correlatio n b et wee n the new and the old probabilities w as around 90%). 15 Sno w fall and precipitation amount s a re w eakly related with eac h other b ecause sno w dens it y ( g /cm 3 ) can v ary by more than a factor of ten. 16 The harmonic mean ¯ d of distances d n is calculated as ¯ d − 1 = (1 / N ) P N n =1 d − 1 n , assuming d n = 0 . 25 miles if d n ≤ 0 . 25 miles. 15 In addition to the MSNB mo dels, w e estimated tw o-state Mark o v switc hing P oisson (MSP) mo dels, whic h ha v e the P o isson likelih o o d function inste ad of the NB lik eliho o d function in Eq. (4). Our findings for the MSP mo dels are v ery similar to those for the MSNB mo dels (Malyshkina, 20 08). Also, b ecause the time series in Figure 2 s eems to exhibit a seasonal pattern (roads a pp ear to b e less safe and P ( s t = 1 | Y ) app ears to b e higher during win ters), w e esti- mated MSNB and MSP mo dels in whic h the transition pro babilities p 0 → 1 and p 1 → 0 are not constan t (allow ing eac h of them to assume tw o differen t v alues: one during win ters and the other during all remaining seasons). How ev er, these mo dels did not p erform as w ell as the MSNB and MSP mo dels with constant transition probabilities [as judged b y the Ba y es factors, see Eq. (10)]. 17 5 Summary and con clusions The empirical finding that t w o states exist and that these states are correlated with w eather conditions has imp ortan t implications. The findings suggest that m ultiple states of roa dw ay safet y can exist due to slow and/or inadequate adjustmen t b y driv ers (a nd p ossibly b y roadwa y main tenance se rvices) to ad- v erse conditions and other unpredictable, uniden tified, and/or unobserv able v a r ia bles that influence roadw ay safety . All these v ariables are lik ely to interact and c hange ov er time, resulting in transitions from one state to the ne xt. As disc ussed earlier, the empiric al findings s how that the less frequ en t state is significantly le ss safe than t he other, m ore freq uen t state. T he full M SNB mo del results show that explanatory v ariables X t,n , other than the interc ept, exert differen t influences on ro a dw ay safet y in differen t states as indicated by the fact that some of the para meter estimates for the tw o states of the full MSNB mo del are significantly differen t. 18 Th us, the states not only differ b y a ve rage acciden t frequencies, but also differ in the magnitude and/or direction of the effects that v arious v ariables e xert o n acc ident frequencies . T his again underscores the imp ortance of the t w o- state approac h. The Marko v switc hing mo dels presen ted in this study are similar to zero- inflated count data mo dels (whic h hav e b een previously applied in acciden t frequency researc h) in the sense that they are also tw o -state mo dels (see 17 W e ha ve only fiv e winte r p eriod s in our fi v e-y ear data. MSNB and MSP with seasonally changing tr an s ition probabilities could p erform b etter for an acciden t data that co v ers a longer time interv al. 18 T able 1 shows that parameter estimates for p a ve men t qualit y ind ex, tota l n u m b er of ramps on the road viewing and opp osite sid es, a verag e annual d aily traffic, num b er of bridges p er mile, and p ercen tage of sin gle u nit tr uc ks are all significan tly different b et ween the t wo states. 16 Shank a r et al. , 199 7; Lord et al., 2007). How ev er, in con trast to zero- inflated mo dels, the mo dels presen ted herein allow for switc hing b et w een the t w o states o ve r time. Also, in this study , a “safe” state is not assumed and acciden t fre- quencies can b e nonzero in b oth states. In terms of future w ork o n M a rk ov switc hing models for roadw a y safety , ad- ditional empirical studies (for other acciden t data samples), and m ulti-state mo dels (with more than t w o states of roadw ay safety ) a r e t wo areas that w ould further dem onstrate the p oten tial of the approac h. Ac knowledgm en ts W e thank Dominique Lord f o r his in terest in this study and f or useful discus- sions. References Breiman L., 1 969. Probability and sto c hastic pro cesse s with a view tow ard applications. Hough ton Miffl in Co., Boston. Bro oks, S.P ., Gelman, A., 199 8. General metho ds for monitoring con ve rgence of iterat ive sim ulations. J. Comput. G raphic. Statist. 7(4), 434-455. Carson, J., Mannering, F.L., 2001. The effect of ice warning signs o n ice- acciden t frequencies and sev erities. Accid. Anal. Prev. 33(1), 99-1 0 9. Hadi, M.A., Aruldhas, J., Lee-F ang Cho w, W attlew or t h, J.A., 1995. Estimat- ing safet y effects of cross-section design for v arious high wa y types using negativ e binomial regression. T ranspor t . Res . Rec. 1500, 169- 1 77. Hamilton, J.D ., 1 9 89. A new approach to the economic analysis of nonstation- ary time series and the business cycle. Econometrica 57, 357-384. Kass, R.E., Raftery , A.E., 1995 . Ba ye s F actors. J. Americ. Sta t ist. Asso c. 90(430), 7 73-795. Lee, J., Mannering, F.L., 2 0 02. Impact of roadside features on the freq uency and sev erit y of run-o ff - roadw ay acciden ts: an empirical analysis. Accid. Anal. Prev. 34(2), 149-161. Lord, D., W ashington, S., Iv an, J.N., 2 005. P o isson, P oisson-gamma and zero- inflated regres sion mo dels of motor v ehicle crashes : balancing statistical fit and theory . Acc id. Anal. Prev. 37(1), 35-46. Lord, D ., W ashington, S., Iv an, J.N., 2007. F urther notes o n the application of zero-infla t ed mo dels in high w ay safety . Accid. Anal. Prev. 39(1), 53- 5 5. Malyshkina, N.V., 2008. Mark o v switc hing mo dels: an application of to roadw ay safet y . PhD thesis in prepara tion, Purdue Univ ersit y . h ttp://a rxiv.org/abs/0808.144 8 17 McCulloch, R.E., Tsa y , R.S., 1 994. Statistical analysis of economic time series via Mark o v switc hing mo dels. J. Time Series Analys . 15(5), 523-539. Miaou, S.P ., 1994. The r elat io nship b et w een truc k acciden ts and geometric design o f road sections: P oisson v ersus negative binomial regressions. Accid. Anal. Prev. 26(4), 471-482. Miaou, S.P ., Lord, D., 2003. Mo deling traffic crash-flow relationships for in ter- sections: disp ersion parameter, functional fo r m, a nd Bay es v ersus empirical Ba ye s methods. T ra nsp ort. Res . Rec. 1840, 31-40 . P o ch, M., Mannering, F.L., 1996 . Negative binomial analysis of inte rsection acciden t frequency . J. T ransp ort. Eng. 122(2), 105-113 . SAS Institute Inc., 20 06. Preliminary Capabilities for Bay esian Analysis in SAS/ST A T Soft ware. Cary , NC: SAS Institute Inc. h ttp://supp o rt.sas.com/rnd/app/pap ers/ba y esian.p df Shank a r , V.N., Mannering, F.L., Barfield, W., 1995. Effect of ro a dw ay geomet- rics and environme ntal facto r s on rural freew a y acciden t frequencies. Accid. Anal. Prev. 27(3), 371-389. Shank a r , V., Milton, J., Mannering, F ., 19 97. Mo deling acciden t frequencies as zero-altered probabilit y pro cesses: an empirical inquiry . Accid. Anal. Prev. 29(6), 829 -837. Tsa y , R. S., 2002. Analysis of financial time series: financial econometrics. John Wiley & Sons , Inc . W ashington, S.P ., Ka rlaftis, M.G., Mannering, F.L., 2003. Statistical a nd econometric metho ds for transp orta t io n data analysis. Chapman & Hall/CR C. A pp endix: MCMC sim ulation a lgorithm F or brevit y , in this app endix we o mit mo del notation M in all equations. F or example, here we write the p osterior distribution, giv en b y Eq. (8 ), a s f ( Θ | Y ). W e c ho ose the prior distribution π ( Θ ) of parameter v ector Θ , give n by Eq. ( 7 ), as π ( Θ ) = f ( S | p 0 → 1 , p 1 → 0 ) π ( p 0 → 1 , p 1 → 0 ) 1 Y s =0 " π ( α ( s ) ) Y k π ( β ( s ) ,k ) # , (A.1) π ( p 0 → 1 , p 1 → 0 ) ∝ π ( p 0 → 1 ) π ( p 1 → 0 ) I ( p 0 → 1 ≤ p 1 → 0 ) , (A.2) f ( S | p 0 → 1 , p 1 → 0 ) = P ( s 1 ) T Y t =2 P ( s t | s t − 1 ) ∝ T Y t =2 P ( s t | s t − 1 ) = = ( p 0 → 1 ) n 0 → 1 (1 − p 0 → 1 ) n 0 → 0 ( p 1 → 0 ) n 1 → 0 (1 − p 1 → 0 ) n 1 → 1 (A.3) Here β ( s ) ,k is the k th comp onen t of vec t o r β ( s ) . The priors of α ( s ) and β ( s ) ,k are c hosen to b e normal distributions, π ( α ( s ) ) = N ( µ α , Σ α ) and π ( β ( s ) ,k ) = 18 N ( µ k , Σ k ). The joint prior of p 0 → 1 and p 1 → 0 is giv en b y Eq. (A.2), where π ( p 0 → 1 ) = B eta ( υ 0 , ν 0 ) and π ( p 1 → 0 ) = B eta ( υ 1 , ν 1 ) are b eta distributions, and function I ( p 0 → 1 ≤ p 1 → 0 ) is equal to unity if the restriction giv en b y Eq . (3) is satisfied and to zero otherwis e. W e disregard distribution P ( s 1 ) in Eq. (A.3) b ecause its con tribution is negligible when T is large (alternatively , we can assume P ( s 1 = 0) = P ( s 1 = 1) = 1 / 2). Num b er n i → j is the total n um b er of state transitions i → j (where i, j = 0 , 1). Parameters t ha t en ter t he prior distributions a r e called h yp er-parameters. F or these, the means µ α and µ k are equal to the ma ximum lik eliho o d estimation (MLE) v alues of α and β k for the corr espo nding standard single-state mo del (NB model in this study). The v ariances Σ α and Σ k are ten times larg er than the maxim um b etw een the MLE v alues of α and β k squared and the MLE v ariances of α and β k for the standard mo del. W e c ho o se υ 0 = ν 0 = υ 1 = ν 1 = 1 (in this case the b eta distributions become the unif o rm dis t r ibution betw een zero and one). T o obtain dra ws from a posterior distribution, w e use the h ybrid G ibbs sam- pler, whic h is an MCMC sim ula tion algorithm that inv olv es bot h Gibbs and Metrop olis-Hasting sampling (McCullo c h and Tsa y, 1994; Tsa y , 2 0 02; SAS Institute Inc . , 2006). Assume that Θ is comp osed of K comp onen ts: Θ = [ θ ′ 1 , θ ′ 2 , . . . , θ ′ K ] ′ , where θ k can be scalars or v ectors, k = 1 , 2 , . . . , K . Then, the h ybrid Gibbs sampler w orks as follo ws: (1) Cho ose an ar bitrary initial v alue of the parameter v ector, Θ = Θ (0) , suc h that f ( Y , Θ (0) ) > 0. (2) F or eac h g = 1 , 2 , 3 , . . . , parameter v ector Θ ( g ) is generated comp onen t- b y-comp onent f r o m Θ ( g − 1) b y the follo wing procedure: (a) Fir st, dra w θ ( g ) 1 from the conditional p osterior probabilit y distribu- tion f ( θ ( g ) 1 | Y , θ ( g − 1) 2 , . . . , θ ( g − 1) K ). If this distribution is exactly known in a closed analytical form, then w e draw θ ( g ) 1 directly from it. This is G ibbs sampling. If the c onditional po sterior distribution is know n up to an unknow n normalization constan t, then we dra w θ ( g ) 1 b y us- ing the Metrop olis-Hasting (M-H) algorithm describ ed b elow. This is M-H sampling. (b) Second, for all k = 2 , 3 , . . . , K − 1, draw θ ( g ) k from the conditional p os- terior distribution f ( θ ( g ) k | Y , θ ( g ) 1 , . . . , θ ( g ) k − 1 , θ ( g − 1) k +1 , . . . , θ ( g − 1) K ) by us- ing either Gibbs sampling (if the distribution is kno wn exactly) or M-H sampling (if t he distribution is kno wn up to a constan t). (c) Finally , dra w θ ( g ) K from conditional p osterior probability distribution f ( θ ( g ) K | Y , θ ( g ) 1 , . . . , θ ( g ) K − 1 ) by using either Gibbs or M- H sampling. (3) The resulting Mark ov c hain { Θ ( g ) } con v erges to the true p osterior dis- tribution f ( Θ | Y ) as g → ∞ . Note that all conditional p osterior distributions are prop ortional to the join t distribution f ( Y , Θ ) = f ( Y | Θ ) π ( Θ ), where the lik eliho o d f ( Y | Θ ) is given 19 b y Eq. (6 ) and the prior π ( Θ ) is giv en b y Eq. (A.1 ) . By using the h ybrid G ibbs sampler algorithm describ ed ab ov e, w e obtain a Mark ov c hain { Θ ( g ) } , where g = 1 , 2 , . . . , G bi , G bi + 1 , . . . , G . W e discard the first G bi “burn-in” dra ws b ecause they can depend on the initial c ho ice Θ (0) . Of the remaining G − G bi dra ws, w e t ypically store eve ry third or ev ery ten th dra w in t he computer memory . W e use these draws for Bay esian inference. Our typic a l c hoice is G bi = 3 × 10 5 and G = 3 × 10 6 . In our study , one MCMC sim ulation run t ak es f ew days on a single computer CPU. W e usually con- sider eight c hoices of the initial parameter v ector Θ (0) . Th us, w e obta in eight Mark ov chains of Θ , and use them for the Bro oks-Gelman-Rubin diagnos- tic o f con ve rgence of our MCMC sim ulations (Bro o ks and Gelman, 1998). W e also c hec k con vergenc e b y monitoring t he lik eliho o d f ( Y | Θ ( g ) ) and the j o in t distribution f ( Y , Θ ( g ) ). The Metrop olis-Hasting (M-H) algorithm is used to sample fr o m conditional p osterior distributions kno wn up to their normalization constants . 19 F or brevit y , w e use notation f g ( θ k | Y , Θ \ θ k ) ≡ f ( θ k | Y , θ ( g ) 1 , . . . , θ ( g ) k − 1 , θ ( g − 1) k +1 , . . . , θ ( g − 1) K ), where Θ \ θ k means all comp onents of Θ exce pt θ k . T he M-H algo rithm is as follo ws: • Cho ose a jumping probability distribution J ( ˆ θ k | θ k ) of ˆ θ k . It m ust stay the same fo r all dr aws g = G bi + 1 , . . . , G , and we discuss its c hoice b elow . • Draw a candidate ˆ θ k from J ( ˆ θ k | θ ( g − 1) k ). • Calculate ratio ˆ p = f g ( ˆ θ k | Y , Θ \ θ k ) f g ( θ ( g − 1) k | Y , Θ \ θ k ) × J ( θ ( g − 1) k | ˆ θ k ) J ( ˆ θ k | θ ( g − 1) k ) . (A.4) • Set θ ( g ) k = ˆ θ k with pro babilit y min( ˆ p, 1 ) , θ ( g − 1) k otherwise . (A.5) Note t hat the unknown normalization constant of f g ( . . . ) cancels out in Eq. (A.4). In this study Θ is giv en by Eq. (7), and the h ybrid Gibbs sampler gen erates dra ws Θ ( g ) from Θ ( g − 1) as follows (for brevit y , below w e drop g indexing): (a) W e dra w v ector β (0) comp onen t- by-component b y using the M-H a lgo- rithm. F or eac h comp onen t β (0) ,k of β (0) w e use a nor ma l jumping dis- tribution J ( ˆ β (0) ,k | β (0) ,k ) = N ( β (0) ,k , σ 2 (0) ,k ). V ariances σ 2 (0) ,k are adjusted 19 In general, the M-H algorithm allo ws to mak e draws from any pr obabilit y distri- bution known up to a constan t. The algorithm co nv erges as the n umber of d r a ws go es t o infinity . 20 during the burn-in sampling ( g = 1 , 2 , . . . , G bi ) to ha ve approxim a tely 30% acceptance rate in Eq. (A.5). 20 The conditional p osterior distribu- tion of β (0) ,k is f ( β (0) ,k | Y , Θ \ β (0) ,k ) ∝ f ( Y , Θ ) = f ( Y | Θ ) π ( Θ ) ∝ f ( Y | Θ ) π ( β (0) ,k ) . (b) W e draw α (0) first, all comp onen ts of β (1) second, and α (1) third, from their conditio nal p osterior distributions b y using the M-H algorithm in a w ay very sim ila r to the drawing the comp onen ts of β (0) . In all cases, w e use normal jumping distributions with v ariances c hosen to hav e ≈ 30% acceptance rates. (c) By using Gibbs sampling, w e draw , first, p 0 → 1 and, second, p 1 → 0 from their conditio nal p osterior distributions, whic h are truncated b eta distri- butions, f ( p 0 → 1 | Y , Θ \ p 0 → 1 ) ∝ f ( Y , Θ ) ∝ f ( S | p 0 → 1 , p 1 → 0 ) π ( p 0 → 1 , p 1 → 0 ) ∝ ∝ B eta ( υ 0 + n 0 → 1 , ν 0 + n 0 → 0 ) I ( p 0 → 1 ≤ p 1 → 0 ) , f ( p 1 → 0 | Y , Θ \ p 1 → 0 ) ∝ B eta ( υ 1 + n 1 → 0 , ν 1 + n 1 → 1 ) I ( p 0 → 1 ≤ p 1 → 0 ) . (A.6) (d) Finally , we dra w comp onents of S = [ s 1 , s 2 , . . . , s T ] ′ b y Gibbs sampling. Neigh b or ing comp onen ts of S can b e strongly (anti-)correlated. There- fore, to speed up MCMC con v ergence in this case, we dra w subse ctions S t,τ = [ s t , s t +1 , . . . , s t + τ − 1 ] ′ of S at a time. The conditional p osterior dis- tribution of S t,τ is f ( S t,τ | Y , Θ \ S t,τ ) ∝ f ( Y , Θ ) ∝ f ( Y | Θ ) f ( S | p 0 → 1 , p 1 → 0 ) . (A.7) V ector S t,τ has length τ and can assume 2 τ p ossible v alues. By ch o osing τ small enough, w e can compute the right-hand-side of Eq. (A.7) for eac h of these v alues and find t he normalization constant of f ( S t,τ | Y , Θ \ S t,τ ). This allo ws us to ma ke Gibbs sampling o f S t,τ . Our ty pical c hoice o f τ is from 10 to 14. W e draw all subsections S t,τ one after another. Additional details on MCMC sim ulat io n algor it hms and their implemen ta tion in the con text of acc ident mo deling can b e fo und in Malyshkina (2008). 20 W e also tr ied Cauc hy ju mping d istributions and obtained similar results. 21
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment