Modelling multivariate ordinal time series using pairwise likelihood


We assume that we have multiple ordinal time series and would like to specify their joint distribution. In general, it is difficult to construct a multivariate distribution that can easily be used to model ordinal variables jointly, and the problem becomes even more complex for time series, since we have to take into account not only the autocorrelation of each series and the dependence between series, but also their cross-correlation. Starting from the simplest case of two ordinal time series, we propose using copulas to specify their joint distribution. We extend the approach to higher dimensions by approximating the full likelihood with a composite likelihood, specifically a conditional pairwise likelihood in which each bivariate model is specified by a copula. We suggest maximizing each bivariate model independently to avoid computational issues and combining the individual estimates using a weighted mean, with weights related to the Hessian matrix of each bivariate model. Simulation studies showed that the model fits well under different sample sizes. A forecasting approach is also discussed. A small real-data application on the unemployment state of different European Union countries is presented to illustrate our approach.


💡 Research Summary

The paper addresses the challenging problem of jointly modelling multiple ordinal time‑series, where each series takes values in a finite ordered set. Traditional multivariate distributions are ill‑suited for such data, and the added temporal dimension requires handling both within‑series autocorrelation and cross‑series dependence. The authors propose a two‑layer construction. First, each series is modelled marginally by an ordinal autoregressive logit model of order p, which incorporates lagged values of all series, thereby embedding both serial and cross‑lag effects in the marginal probabilities. Second, contemporaneous (lag‑0) dependence across series is captured by a bivariate copula for each pair of series. The copula can be any member of a flexible family (Clayton, Gumbel, Gaussian, etc.), allowing the practitioner to tailor the dependence structure to the data (e.g., asymmetric tail dependence).
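As a rough illustration of the first layer, the sketch below computes conditional category probabilities from a cumulative (proportional-odds) logit model whose linear predictor is built from lagged values. The exact parameterisation used in the paper is not given in this summary, so the sign convention, the use of raw lagged category codes as predictors, and the names `cumulative_logit_probs`, `cutpoints`, `beta` and `lagged` are all assumptions made for illustration.

```python
import numpy as np

def cumulative_logit_probs(cutpoints, beta, lagged):
    """Conditional CDF and PMF over ordered categories 0..J-1 at one time point.

    Assumed form: P(Y_t <= j | past) = logistic(cutpoints[j] - beta . lagged),
    where `lagged` stacks lagged values of all series (hypothetical coding).
    """
    cutpoints = np.asarray(cutpoints, dtype=float)   # J-1 increasing thresholds
    eta = float(np.dot(beta, lagged))                # linear predictor from the lags
    cdf_inner = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))
    cdf = np.append(cdf_inner, 1.0)                  # last category closes the CDF at 1
    pmf = np.diff(np.concatenate(([0.0], cdf)))      # successive differences of the CDF
    return cdf, pmf

# Three categories, two lagged predictors (own lag and one cross-lag, illustrative values).
cdf, pmf = cumulative_logit_probs(cutpoints=[-0.5, 1.0], beta=[0.8, 0.3], lagged=[2, 1])
```

The returned `cdf` is what a copula layer would consume, since the copula is evaluated at the marginal cumulative probabilities.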

Because evaluating the joint probability mass function under a full K‑dimensional copula requires 2^K copula evaluations for each observation, the authors replace the full likelihood with a conditional pairwise (composite) likelihood. This likelihood is the product, over all K(K‑1)/2 pairs of series, of the bivariate joint pmf derived from the copula and the marginal cumulative distribution functions. The number of pairwise terms grows only as O(K^2), so the resulting log‑likelihood is computationally tractable, and each pairwise model can be estimated independently.
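The bivariate pmf mentioned above can be obtained from a copula by the standard rectangle (finite-difference) formula, which needs four copula evaluations per cell. The following is a minimal sketch using a Gumbel copula; the function names, the array layout of `cdfs` (time × series × category), and the dictionary of pair-specific parameters `thetas` are hypothetical choices, not the authors' implementation.

```python
from itertools import combinations
import numpy as np

def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for theta >= 1 (theta = 1 gives independence)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = abs(np.log(u)) ** theta + abs(np.log(v)) ** theta
    return float(np.exp(-s ** (1.0 / theta)))

def bivariate_pmf(cdf_r, cdf_s, j, k, theta):
    """P(Y_r = j, Y_s = k) via the rectangle formula on the marginal CDFs."""
    lo_r = cdf_r[j - 1] if j > 0 else 0.0
    lo_s = cdf_s[k - 1] if k > 0 else 0.0
    hi_r, hi_s = cdf_r[j], cdf_s[k]
    return (gumbel_copula(hi_r, hi_s, theta) - gumbel_copula(lo_r, hi_s, theta)
            - gumbel_copula(hi_r, lo_s, theta) + gumbel_copula(lo_r, lo_s, theta))

def pairwise_loglik(y, cdfs, thetas):
    """Sum of bivariate log-pmfs over all K(K-1)/2 pairs and all time points."""
    T, K = y.shape
    total = 0.0
    for r, s in combinations(range(K), 2):
        for t in range(T):
            p = bivariate_pmf(cdfs[t, r], cdfs[t, s], y[t, r], y[t, s], thetas[(r, s)])
            total += np.log(p)
    return total

# Toy data: T = 2 time points, K = 3 series, 3 categories, identical CDFs for brevity.
cdf = np.array([0.3, 0.7, 1.0])
y = np.array([[0, 1, 2], [2, 2, 1]])
cdfs = np.tile(cdf, (2, 3, 1))
thetas = {pair: 2.0 for pair in combinations(range(3), 2)}
ll = pairwise_loglik(y, cdfs, thetas)
```

In a real fit the conditional CDFs would vary over time through the lagged predictors; here they are held fixed only to keep the example short.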

Estimation proceeds in two steps. In the first step, each pairwise log‑likelihood ℓ_{r,s}(θ_{rs}) is maximised separately, yielding pair‑specific parameter estimates θ̂_{rs} and their observed Hessian matrices H(θ̂_{rs}). Because many model parameters (e.g., marginal regression coefficients) appear in several pairs, a second aggregation step is required to obtain a single coherent estimate for each global parameter. The authors adopt a Hessian‑based weighted averaging scheme: the final estimate is θ̂_{wm} = (∑ H_{rs})^{-1} ∑ H_{rs} θ̂_{rs}, where the sums run over all pairs (r, s). This approach is shown to be asymptotically equivalent to maximising the full composite likelihood, delivering efficiency gains over a naïve simple average. Standard errors are derived from the sandwich covariance matrix J^{-1} K J^{-1}, which is the inverse of the Godambe information matrix G = J K^{-1} J. Here J is the block‑diagonal matrix of expected Hessians and K is the covariance matrix of the pairwise score functions (not to be confused with the number of series K). In practice, expectations are replaced by their empirical counterparts.
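The aggregation formula can be sketched in a few lines. Since each pair only involves a subset of the global parameters, the version below embeds every pairwise estimate into the global vector through an index set before summing; that embedding, the function name `hessian_weighted_mean`, and the toy numbers are assumptions for illustration, and the weights are taken to be positive-definite (for example, negated Hessians of the pairwise log-likelihoods).

```python
import numpy as np

def hessian_weighted_mean(estimates, hessians, index_sets, n_params):
    """Solve (sum H_rs) theta = sum H_rs theta_rs in the global parameter space."""
    A = np.zeros((n_params, n_params))
    b = np.zeros(n_params)
    for theta, H, idx in zip(estimates, hessians, index_sets):
        idx = np.asarray(idx)
        theta = np.asarray(theta, dtype=float)
        H = np.asarray(H, dtype=float)
        A[np.ix_(idx, idx)] += H          # accumulate weights at the global positions
        b[idx] += H @ theta               # accumulate weighted estimates
    return np.linalg.solve(A, b)

# Global vector: [shared marginal coefficient, copula parameter of pair A, of pair B].
estimates = [[1.0, 2.0], [2.0, 1.5]]                  # pair A and pair B estimates
hessians = [np.diag([3.0, 1.0]), np.diag([1.0, 1.0])]  # pair A is more informative
index_sets = [[0, 1], [0, 2]]
theta_wm = hessian_weighted_mean(estimates, hessians, index_sets, n_params=3)
```

In this toy case the shared coefficient is pulled towards the estimate with the larger curvature, (3·1.0 + 1·2.0)/4 = 1.25, while each copula parameter, appearing in only one pair, is left unchanged.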

A simulation study validates the methodology. The authors generate data from a trivariate Gumbel copula (ϕ = 2) combined with three ordinal series each following an AR(1) logit model with 3 categories. Samples of size T = 100, 500, 1 000 are replicated 100 times. Results show that both the regression coefficients and the copula dependence parameter are unbiased, with variance decreasing as T grows, confirming consistency. The weighted‑mean estimator consistently yields lower mean‑squared error for the dependence parameter than the simple mean, illustrating the benefit of Hessian weighting.

The approach is applied to a real‑world dataset: monthly unemployment status (low, medium, high) for six European Union countries over several years. Marginal models are fitted with AR(1) logits; for each country pair, the best‑fitting copula family is selected via AIC. The multivariate model improves one‑step‑ahead forecasts relative to independent univariate models, reducing mean absolute error by roughly 10–15 %. The estimated copula parameters reveal strong contemporaneous dependence between certain country pairs (e.g., Germany–France), providing substantive insights for policy coordination.

In the discussion, the authors highlight the main contributions: (1) a scalable framework for multivariate ordinal time series that avoids the combinatorial explosion of full copula models; (2) a theoretically justified weighted‑average aggregation that preserves the efficiency of composite likelihood estimation; (3) empirical evidence of unbiasedness, consistency, and forecasting gains. Limitations include the restriction to logit marginal models (extension to probit or non‑linear margins would require additional work) and the O(K^2) growth of pairwise terms, which may still be burdensome for very high‑dimensional systems. Future research directions suggested are dynamic copula parameters, incorporation of covariates in the copula, and dimensionality‑reduction techniques such as graphical model selection to prune insignificant pairs.

Overall, the paper delivers a practical, flexible, and statistically sound solution for jointly modelling multiple ordinal time series, with clear advantages in both inference and prediction.

