Estimation of linear autoregressive models with Markov-switching, the E.M. algorithm revisited.
This work concerns the estimation of linear autoregressive models with Markov-switching using the expectation-maximisation (E.M.) algorithm. Our method generalises the method introduced by Elliott for general hidden Markov models and avoids the use of backward recursions.
In the present paper we consider an extension of the basic hidden Markov model (HMM). Let $(X_t, Y_t)_{t\in\mathbb{Z}}$ be the process such that:
1. $(X_t)_{t\in\mathbb{Z}}$ is a Markov chain with finite state space $E = \{e_1, \dots, e_N\}$, which can be identified, without loss of generality, with the simplex of $\mathbb{R}^N$, where the $e_i$ are the unit vectors of $\mathbb{R}^N$, with unity as the $i$th element and zeros elsewhere.
2. Given $(X_t)_{t\in\mathbb{Z}}$, the process $(Y_t)_{t\in\mathbb{Z}}$ is a sequence of linear autoregressive models in $\mathbb{R}$, and the distribution of $Y_n$ depends only on $X_n$ and on the previous observations $Y_{n-1}, \dots, Y_{n-p}$.
Hence, for a fixed $t$, the dynamics of the model are:
$$Y_{t+1} = F_{X_{t+1}}(Y_t, \dots, Y_{t-p+1}) + \sigma_{X_{t+1}}\,\varepsilon_{t+1},$$
with $F_{X_{t+1}} \in \{F_{e_1}, \dots, F_{e_N}\}$ linear functions, $\sigma_{X_{t+1}} \in \{\sigma_{e_1}, \dots, \sigma_{e_N}\}$ strictly positive numbers and $(\varepsilon_t)_{t\in\mathbb{N}^*}$ an i.i.d. sequence of $\mathcal{N}(0,1)$ Gaussian random variables.
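To make the switching dynamics concrete, here is a minimal simulation sketch, assuming an intercept-plus-lags parameterisation of the linear functions $F_{e_i}$; the function name, array names and the two-regime example values are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_ms_ar(n, A, theta, sigma, p, rng):
    """Simulate a Markov-switching AR(p) path.

    A     : (N, N) transition matrix, A[i, j] = P(X_{t+1} = e_i | X_t = e_j)
    theta : (N, p + 1) coefficients of the linear functions F_{e_i}, intercept first
    sigma : (N,) regime standard deviations
    Returns the regime indices X_1..X_n and the observations y_{-p+1}..y_n.
    """
    N = len(sigma)
    x = np.zeros(n + 1, dtype=int)      # regime index; X_0 drawn uniformly
    y = np.zeros(n + p)                 # y[k] stores y_{k - p + 1}; pre-sample values set to 0
    x[0] = rng.integers(N)
    for t in range(1, n + 1):
        x[t] = rng.choice(N, p=A[:, x[t - 1]])                    # jump according to column x[t-1]
        psi = np.concatenate(([1.0], y[t - 1:t + p - 1][::-1]))   # (1, y_{t-1}, ..., y_{t-p})
        y[t + p - 1] = theta[x[t]] @ psi + sigma[x[t]] * rng.standard_normal()
    return x[1:], y

# Illustrative two-regime AR(1) example
rng = np.random.default_rng(0)
A = np.array([[0.95, 0.10],
              [0.05, 0.90]])            # columns sum to one: a_ij = P(e_i | e_j)
theta = np.array([[0.0, 0.6],           # F_{e_1}(y) = 0.0 + 0.6 y
                  [1.0, -0.3]])         # F_{e_2}(y) = 1.0 - 0.3 y
sigma = np.array([0.5, 1.5])
states, obs = simulate_ms_ar(500, A, theta, sigma, p=1, rng=rng)
```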
The Markov property implies here that $P(X_{t+1} = e_i \mid \mathcal{F}_t) = P(X_{t+1} = e_i \mid X_t)$. Write $a_{ij} = P(X_{t+1} = e_i \mid X_t = e_j)$ and $A = (a_{ij}) \in \mathbb{R}^{N\times N}$, and define:
With the previous notation, we obtain the general equation of the model, for $t \in \mathbb{N}$:
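The equation referred to here is presumably the semimartingale representation of the chain used in Elliott's framework; a hedged reconstruction, consistent with the convention $a_{ij} = P(X_{t+1}=e_i \mid X_t=e_j)$ above:
$$X_{t+1} = A\,X_t + V_{t+1}, \qquad V_{t+1} := X_{t+1} - E\big[X_{t+1} \mid \mathcal{F}_t\big] = X_{t+1} - A X_t,$$
so that $(V_t)$ is a martingale-difference sequence.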
The parameters of the model are the transition probabilities of the matrix $A$, the coefficients of the linear functions $F_{e_i}$ and the variances $\sigma_{e_i}$. A successful method for estimating such models is to compute the maximum likelihood estimator with the E.M. algorithm introduced by Dempster, Laird and Rubin (1977). Generally, this algorithm requires the computation of the conditional expectation of the hidden states given the observations (the E.-step); this can be done with the Baum-Welch forward-backward algorithm (see Baum et al. (1970)). The derivation of the M-step of the E.M. algorithm is then immediate, since we can compute the optimal parameters of the regression functions by weighted linear regression.
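For comparison with what follows, here is a minimal sketch of this classical E-step (smoothing by forward-backward recursions) for the Markov-switching AR($p$) model above; it is not the forward-only method of this paper, and the function and array names are illustrative (SciPy is assumed for the Gaussian density).

```python
import numpy as np
from scipy.stats import norm

def e_step_forward_backward(ys, p, A, theta, sigma, pi0):
    """Smoothing probabilities for a Markov-switching AR(p) model via the
    classical forward-backward (Baum-Welch) recursions.

    ys    : observations (y_{-p+1}, ..., y_n), length n + p
    A     : (N, N) transition matrix, A[i, j] = P(X_{t+1} = e_i | X_t = e_j)
    theta : (N, p + 1) regression coefficients per regime, intercept first
    sigma : (N,) regime standard deviations
    pi0   : (N,) distribution of X_1
    Returns gamma with gamma[t - 1, i] = P(X_t = e_i | y_{-p+1}, ..., y_n).
    """
    ys = np.asarray(ys)
    n = len(ys) - p
    N = len(sigma)

    # Regime-conditional densities b[t - 1, i] = p(y_t | X_t = e_i, y_{t-1}, ..., y_{t-p})
    b = np.empty((n, N))
    for t in range(1, n + 1):
        psi = np.concatenate(([1.0], ys[t - 1:t + p - 1][::-1]))  # (1, y_{t-1}, ..., y_{t-p})
        b[t - 1] = norm.pdf(ys[t + p - 1], loc=theta @ psi, scale=sigma)

    # Forward pass, normalised at each step to avoid numerical underflow
    alpha = np.empty((n, N))
    alpha[0] = pi0 * b[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):
        alpha[t] = b[t] * (A @ alpha[t - 1])
        alpha[t] /= alpha[t].sum()

    # Backward pass
    beta = np.ones((n, N))
    for t in range(n - 2, -1, -1):
        beta[t] = A.T @ (b[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

The backward pass in this sketch is precisely what the method developed in this paper avoids.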
However, we show here that we can embed these two steps into a single one. Namely, at each step of the E.M. algorithm we can directly compute the optimal coefficients of the regression functions, as well as the variances and the transition matrix, thanks to a generalisation of the method introduced by Elliott (1994).
The fundamental technique employed throughout this paper is the discrete-time change of measure. Write $\sigma$ for the vector $(\sigma_{e_1}, \dots, \sigma_{e_N})$, $\phi(\cdot)$ for the density of $\mathcal{N}(0,1)$ and $\langle\cdot,\cdot\rangle$ for the inner product in $\mathbb{R}^N$.
We wish to introduce a new probability measure $\bar P$, using a density $\Lambda$, so that $\frac{d\bar P}{dP} = \Lambda$ and, under $\bar P$, the random variables $y_t$ are i.i.d. $\mathcal{N}(0,1)$ random variables.
Define
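A plausible form of this density, following Elliott's discrete-time change of measure and stated here as an assumption rather than quoted from the text:
$$\lambda_l = \frac{\langle \sigma, X_l\rangle\,\phi(Y_l)}{\phi\!\left(\dfrac{Y_l - F_{X_l}(Y_{l-1},\dots,Y_{l-p})}{\langle \sigma, X_l\rangle}\right)}, \qquad \Lambda_t = \prod_{l=1}^{t} \lambda_l,$$
i.e. the ratio of the $\mathcal{N}(0,1)$ density of $Y_l$ to its conditional density under the model, so that multiplying by $\Lambda_t$ turns the observations into standard Gaussian noise.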
and construct a new probability measure $\bar P$ by setting the restriction of the Radon-Nikodym derivative to $\mathcal{G}_t$ equal to $\Lambda_t$. Then the following lemma is a straightforward adaptation of Lemma 4.1 of Elliott (1994) (see the appendix).
Lemma 1 Under $\bar P$, the $Y_t$ are i.i.d. $\mathcal{N}(0,1)$ random variables.
Conversely, suppose we start with a probability measure $\bar P$ such that, under $\bar P$:
1. $(X_t)_{t\in\mathbb{N}}$ is a Markov chain with transition matrix $A$;
2. $(Y_t)_{t\in\mathbb{N}}$ is a sequence of i.i.d. $\mathcal{N}(0,1)$ random variables.
We construct a new probability measure $P$ such that, under $P$, the dynamics of the model above hold.
To construct $P$ from $\bar P$, we introduce $\bar\lambda_l := (\lambda_l)^{-1}$ and $\bar\Lambda_t := (\Lambda_t)^{-1}$, and we define $P$ by putting $\left.\frac{dP}{d\bar P}\right|_{\mathcal{G}_t} = \bar\Lambda_t$.
Definition 2 Let $(H_t)_{t\in\mathbb{N}}$ be a sequence adapted to $(\mathcal{G}_t)$. We shall write:
$$\gamma_t(H_t) := \bar E\big[\bar\Lambda_t H_t \mid \mathcal{Y}_t\big],$$
where $\mathcal{Y}_t$ denotes the $\sigma$-field generated by the observations up to time $t$.
The proof of the following theorem is a detailed adaptation of the proof of Theorem 5.3 of Elliott (1994) (see the appendix).
Theorem 1 Suppose $H_t$ is a scalar $\mathcal{G}$-adapted process of the form:
where $f$ is a scalar-valued function and $\alpha$, $\beta$, $\delta$ are $\mathcal{G}$-predictable processes ($\beta$ being an $N$-dimensional vector process). Then:
where $a_i := Ae_i$, $a_i^T$ is the transpose of $a_i$ and $\mathrm{diag}(a_i)$ is the matrix with the vector $a_i$ on its diagonal and zeros elsewhere.
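The form of $H$ referred to in Theorem 1 is presumably the one used by Elliott (1994); stated here as an assumption consistent with the notation of the theorem:
$$H_{t+1} = H_t + \alpha_{t+1} + \langle \beta_{t+1},\, V_{t+1}\rangle + \delta_{t+1}\, f(Y_{t+1}),$$
with $V_{t+1} := X_{t+1} - A X_t$; the theorem then provides a forward recursion for the vector quantity $\gamma_{t,t}(H_t) := \bar E[\bar\Lambda_t H_t X_t \mid \mathcal{Y}_t]$ in terms of the columns $a_i = Ae_i$ and the matrices $\mathrm{diag}(a_i)$.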
We will now consider special cases of the process $H$. In all cases, we will calculate the quantity $\gamma_{t,t}(H_t)$ and deduce $\gamma_t(H_t)$ by summing the components of $\gamma_{t,t}(H_t)$, since the components of $X_t$ sum to one. Then, we deduce from the conditional Bayes theorem the conditional expectation of $H_t$:
$$E[H_t \mid \mathcal{Y}_t] = \frac{\gamma_t(H_t)}{\gamma_t(1)}.$$
3 Application to the Expectation step (E.-step) of the E.M. algorithm
We will use the previous theorem in order to compute conditional quantities needed by the E.M. algorithm.
Let $\sum_{l=1}^{t} \langle X_{l-1}, e_r\rangle\,\langle X_l, e_s\rangle$ be the number of jumps from state $e_r$ to state $e_s$ up to time $t$; we obtain:
Write now $O^r_t = \sum_{n=1}^{t+1} \langle X_n, e_r\rangle$ for the number of times, up to $t$, that $X$ occupies the state $e_r$. We obtain
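Once the conditional expectations of these jump counts and occupation times are available, the transition matrix is re-estimated in the M-step as a ratio of expected jumps to expected visits; a minimal sketch, with hypothetical arrays J_hat and O_hat holding those smoothed quantities:

```python
import numpy as np

def update_transition_matrix(J_hat, O_hat):
    """M-step update of A from smoothed jump counts and occupation times.

    J_hat[r, s] : conditional expectation of the number of jumps e_r -> e_s
    O_hat[r]    : conditional expectation of the occupation time of e_r
    Returns A_new with A_new[s, r] = estimated P(X_{t+1} = e_s | X_t = e_r).
    """
    A_new = (J_hat / O_hat[:, None]).T
    # Guard against rounding: renormalise so each column sums to one
    return A_new / A_new.sum(axis=0, keepdims=True)
```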
For the regression functions, the M-Step of the E.M. algorithm is achieved by finding the parameters minimising the weighted sum of squares :
where $\gamma_i(t)$ is the conditional expectation of the hidden state indicator $\langle X_t, e_i\rangle$ given the observations $y_{-p+1}, \dots, y_n$. Write $\psi^T(t) = (1, y_{t-1}, \dots, y_{t-p})$ and $\theta_i = (a^i_0, \dots, a^i_p)$, and suppose that the matrix
Hence, in order to compute $\theta_i(n)$, we need to estimate the conditional expectations of the following processes:
$TA^r_{t+1}(j) = \sum_{l=1}^{t+1} Y_{l-j}\,\langle X_l, e_r\rangle\, Y_{l+1}$ for $-1 \le j \le p$ and $1 \le r \le N$;
$TD^r_{t+1}(j) = \sum_{l=1}^{t+1} \langle X_l, e_r\rangle\, Y_{l-j}$ for $0 \le j \le p$ and $1 \le r \le N$.
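The conditional expectations of these processes supply the entries of the normal equations of the weighted regression above. Equivalently, once the smoothing weights are available, the M-step update can be written directly as a weighted least-squares fit; a minimal sketch, assuming gamma[t - 1, i] estimates $\gamma_i(t)$ and with all names illustrative:

```python
import numpy as np

def weighted_ar_update(ys, gamma, p):
    """Weighted least-squares M-step for the regression coefficients and variances.

    ys    : observations (y_{-p+1}, ..., y_n), length n + p
    gamma : (n, N) smoothing weights, gamma[t - 1, i] = E[<X_t, e_i> | observations]
    Returns theta_hat (N, p + 1) and sigma_hat (N,).
    """
    ys = np.asarray(ys)
    n, N = gamma.shape
    # Design matrix: row t - 1 is psi(t) = (1, y_{t-1}, ..., y_{t-p})
    Psi = np.array([np.concatenate(([1.0], ys[t - 1:t + p - 1][::-1]))
                    for t in range(1, n + 1)])
    y = ys[p:]                                      # responses y_1, ..., y_n
    theta_hat = np.empty((N, p + 1))
    sigma_hat = np.empty(N)
    for i in range(N):
        w = gamma[:, i]
        G = Psi.T @ (w[:, None] * Psi)              # weighted Gram matrix
        theta_hat[i] = np.linalg.solve(G, Psi.T @ (w * y))
        resid = y - Psi @ theta_hat[i]
        sigma_hat[i] = np.sqrt((w * resid ** 2).sum() / w.sum())  # weighted residual spread
    return theta_hat, sigma_hat
```

The variance update in this sketch simply reuses the same weights on the squared residuals.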
Applying theorem (2) with $H_{t+1}(j) = TA^r_{t+1}(j)$, $H_0 = 0$, $\alpha_{t+1} = 0$,
where $a_r$ is the $r$-th column of $A$.
Then, applying t
…(Full text truncated)…