arXiv:0809.4615v1 [q-fin.ST] 26 Sep 2008
Correlation, hierarchies, and networks in
financial markets
Michele Tumminello (a), Fabrizio Lillo (a,b), Rosario N. Mantegna (a)
(a) Dipartimento di Fisica e Tecnologie Relative, Università di Palermo, Viale delle
Scienze, I-90128 Palermo, Italy
(b) Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, U.S.A.
Abstract
We discuss some methods to quantitatively investigate the properties of correlation
matrices. Correlation matrices play an important role in portfolio optimization and
in several other quantitative descriptions of asset price dynamics in financial
markets. Specifically, we discuss how to define and obtain hierarchical trees,
correlation based trees and networks from a correlation matrix. The hierarchical
clustering and other procedures performed on the correlation matrix to detect
statistically reliable aspects of the correlation matrix are seen as filtering
procedures of the correlation matrix. We also discuss a method to associate a
hierarchically nested factor model to a hierarchical tree obtained from a
correlation matrix. The information retained in filtering procedures and its
stability with respect to statistical fluctuations are quantified by using the
Kullback-Leibler distance.
Key words: multivariate analysis, hierarchical clustering, correlation based
networks, bootstrap validation, factor models, Kullback-Leibler distance.
JEL classification: C32, G10
1 Introduction
Many complex systems observed in the physical, biological and social sciences
are organized in a nested hierarchical structure, i.e. the elements of the system
can be partitioned into clusters, which in turn can be partitioned into subclusters,
and so on up to a certain level (Simon, 1962). The hierarchical structure of
interactions among elements strongly affects the dynamics of complex systems.
Therefore a quantitative description of the hierarchies of the system is a
key step in the modeling of complex systems (Anderson, 1972). The analysis
of multivariate data provides crucial information in the investigation of
Preprint submitted to Elsevier
26 November 2024
a wide variety of systems. Multivariate analysis methods are designed to
extract information both on the number of main factors characterizing the
dynamics of the investigated system and on the composition of the groups
(clusters) in which the system is intrinsically organized. Recently, physicists
have started to contribute to the development of new techniques to investigate
multivariate data (Blatt et al., 1996; Hutt et al., 1999; Mantegna, 1999; Giada
and Marsili, 2001; Kraskov et al., 2005; Tumminello et al., 2005; Tsafrir et al.,
2005; Slonim, 2005). Among multivariate techniques, natural candidates for
detecting the hierarchical structure of a set of data are hierarchical clustering
methods (Anderberg, 1973).
The modeling of the correlation matrix of a complex system with tools of
hierarchical clustering has been useful in the multivariate characterization of
stock return time series (Mantegna, 1999; Bonanno et al., 2001; Bonanno et
al., 2003), market index returns of worldwide stock exchanges (Bonanno et
al., 2000), and volatility increments of stock return time series (Miccichè et
al., 2003). In these applications the estimation of statistically reliable
properties of the correlation matrix is crucial for several financial decision
processes such as asset allocation, portfolio optimization (Tola et al., 2008),
and derivative pricing. We have termed the selection of statistically reliable
information of the correlation matrix a "filtering procedure" (Tumminello et
al., 2007a). Hierarchical clustering procedures are filtering procedures. Other
filtering procedures which are popular within the econophysics community are
procedures based on random matrix theory (Laloux et al., 1999; Plerou et
al., 1999; Rosenow et al., 2002; Coronnello et al., 2005; Potters et al., 2005;
Tumminello et al., 2007a), and procedures using the concept of shrinkage of a
correlation matrix (Ledoit and Wolf, 2003; Schäfer and Strimmer, 2005;
Tumminello et al., 2007b). Many others might be devised and their effectiveness
tested.
The correlation matrix of the time series of a multivariate complex system
can be used to extract information about aspects of the hierarchical organization
of such a system. The clustering procedure is done by using the correlation
between pairs of elements as a similarity measure and by applying a clustering
algorithm to the correlation matrix. As a result of the clustering procedure,
a hierarchical tree of the elements of the system is obtained. The correlation
based clustering procedure also allows one to associate a correlation based
network with the correlation matrix. For example, it is natural to select the
minimum spanning tree, i.e. the shortest tree connecting all the elements in a
graph, as the correlation based network associated with the single linkage
cluster analysis. Different correlation based networks can be ass
…(Full text truncated)…
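The link between single linkage clustering and the minimum spanning tree described in the Introduction can be illustrated with a minimal sketch. It is not the authors' implementation. Several ingredients are assumptions made only for illustration: the synthetic two-group return series, the distance d_ij = sqrt(2(1 - rho_ij)) used to turn a correlation coefficient into a dissimilarity (one common choice, not stated in the text above), and the helper names `correlation_distance` and `mst_edges`. The sketch relies on NumPy and SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import squareform


def correlation_distance(returns):
    """Map a (T x N) array of returns to an N x N dissimilarity matrix.

    Assumption: d_ij = sqrt(2 * (1 - rho_ij)), so that highly correlated
    pairs are close and anticorrelated pairs are far apart.
    """
    rho = np.corrcoef(returns, rowvar=False)
    dist = np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    return dist


def mst_edges(dist):
    """Return the minimum spanning tree as a list of (i, j, weight) tuples."""
    mst = minimum_spanning_tree(dist).tocoo()
    return [(int(i), int(j), float(w))
            for i, j, w in zip(mst.row, mst.col, mst.data)]


# Synthetic data (hypothetical): two groups of three series,
# each group driven by its own common factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
T = 500
factor_a = rng.standard_normal(T)
factor_b = rng.standard_normal(T)
returns = np.column_stack(
    [factor_a + 0.5 * rng.standard_normal(T) for _ in range(3)]
    + [factor_b + 0.5 * rng.standard_normal(T) for _ in range(3)]
)

dist = correlation_distance(returns)

# Hierarchical tree: single linkage applied to the condensed distance matrix.
tree = linkage(squareform(dist, checks=False), method="single")

# Correlation based network: minimum spanning tree of the complete graph
# whose edge weights are the same distances.
edges = mst_edges(dist)

# Cutting the hierarchical tree into two clusters should recover the groups.
labels = fcluster(tree, t=2, criterion="maxclust")

print("cluster labels:", labels)
print("MST edges:", edges)
```

In this sketch the N - 1 edge weights of the minimum spanning tree coincide, as a set, with the N - 1 merge heights of the single linkage tree, which is the sense in which the two objects are associated.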