A quantile-copula approach to conditional density estimation

Reading time: 6 minutes

📝 Original Info

  • Title: A quantile-copula approach to conditional density estimation
  • ArXiv ID: 0709.3192
  • Date: 2008-06-13
  • Authors: Olivier P. Faugeras (L.S.T.A, Université Paris 6)

📝 Abstract

We present a new non-parametric estimator of the conditional density of the kernel type. It is based on an efficient transformation of the data by quantile transform. By use of the copula representation, it turns out to have a remarkable product form. We study its asymptotic properties and compare its bias and variance to competitors based on nonparametric regression.

📄 Full Content

arXiv:0709.3192v3 [stat.ME] 12 Jun 2008

A quantile-copula approach to conditional density estimation.

Olivier P. Faugeras
L.S.T.A, Université Paris 6
175, rue du Chevaleret, 75013 Paris, France
Tel: +(33) 1 44 27 85 62
Fax: +(33) 1 44 27 33 42
Email address: olivier.faugeras@gmail.com (Olivier P. Faugeras)

Preprint submitted to Elsevier, November 1, 2018

Abstract

We present a new non-parametric estimator of the conditional density of the kernel type. It is based on an efficient transformation of the data by quantile transform. By use of the copula representation, it turns out to have a remarkable product form. We study its asymptotic properties and compare its bias and variance to competitors based on nonparametric regression. A comparative numerical simulation is provided.

Key words: conditional density, kernel estimation, copula, quantile transform, nonparametric regression

1991 MSC: 62G007, 62M20, 62M10

1 Introduction

1.1 Motivation

Let $((X_i, Y_i);\ i = 1, \ldots, n)$ be an independent identically distributed sample from real-valued random variables $(X, Y)$ sitting on a given probability space. For predicting the response $Y$ of the input variable $X$ at a given location $x$, it is of great interest to estimate not only the conditional mean or regression function $E(Y \mid X = x)$, but the full conditional density $f(y \mid x)$. Indeed, estimating the conditional density is much more informative, since it allows one not only to recalculate the conditional expected value $E(Y \mid X)$ and the conditional variance from the density, but also to provide the general shape of the conditional density. This is especially important for multi-modal or skewed densities, which often arise from nonlinear or non-Gaussian phenomena, where the expected value might be nowhere near a mode, i.e. the most likely value to appear. Moreover, for situations in which confidence intervals are preferred to point estimates, the estimated conditional density is an object of obvious interest.

1.2 Estimation by kernel smoothing

A natural approach to estimate the conditional density $f(y \mid x)$ of $Y$ given $X = x$ would be to exploit the identity

$$f(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)} \tag{1}$$

where $f_{XY}$ and $f_X$ denote the joint density of $(X, Y)$ and the density of $X$, respectively. By introducing Parzen-Rosenblatt kernel estimators of these densities, namely

$$\hat f_{n,XY}(x, y) := \frac{1}{n} \sum_{i=1}^{n} K'_{h'}(X_i - x)\, K_h(Y_i - y)$$

$$\hat f_{n,X}(x) := \frac{1}{n} \sum_{i=1}^{n} K'_{h'}(X_i - x)$$

where $K_h(\cdot) = \frac{1}{h} K(\cdot / h)$ and $K'_{h'}(\cdot) = \frac{1}{h'} K'(\cdot / h')$ are (rescaled) kernels with their associated sequences of bandwidths $h = h_n$ and $h' = h'_n$ going to zero as $n \to \infty$, one can construct the quotient

$$\hat f^{R}_{n}(y \mid x) := \frac{\hat f_{n,XY}(x, y)}{\hat f_{n,X}(x)}$$

and obtain an estimator of the conditional density. Such an estimator was first studied by Rosenblatt [26], and more recently by Hyndman et al. [17], who slightly improved on Rosenblatt's kernel based estimator.

1.3 Estimation by regression techniques

As pointed out by numerous authors, see e.g. Fan and Yao [7], chapter 6, this approach is equivalent to the one arising from considering this conditional density estimation problem in a regression framework. Indeed, let $F(y \mid x)$ be the cumulative conditional distribution function of $Y$ given $X = x$. It stems from the fact that

$$E\left( 1_{|Y - y| \le h} \mid X = x \right) = F(y + h \mid x) - F(y - h \mid x) \approx 2h\, f(y \mid x)$$

as $h \to 0$, that, if one replaces the expectation in the above expression by its empirical counterpart, one can apply the usual local averaging methods and perform a regression estimation on the synthetic data $\left( (1/2h)\, 1_{|Y_i - y| \le h};\ i = 1, \ldots, n \right)$. By a Bochner type theorem, one can even replace the transformed data by its smoothed version

$$Y'_i := K_h(Y_i - y) := \frac{1}{h} K\!\left( \frac{Y_i - y}{h} \right).$$

In particular, the popular Nadaraya-Watson regression estimator

$$\hat f^{NW}_{n}(y \mid x) := \frac{\sum_{i=1}^{n} Y'_i\, K'_{h'}(X_i - x)}{\sum_{i=1}^{n} K'_{h'}(X_i - x)}$$

reduces itself to the same estimator of the conditional density of the double kernel type as before

$$\hat f^{NW}_{n}(y \mid x) := \frac{\sum_{i=1}^{n} K_h(Y_i - y)\, K'_{h'}(X_i - x)}{\sum_{i=1}^{n} K'_{h'}(X_i - x)} = \hat f^{R}_{n}(y \mid x).$$

Taking advantage of this regression formulation, Fan, Yao and Tong [8] proposed a conditional density estimator which generalizes the kernel one by use of the local polynomial techniques. In particular, it makes it possible to tackle the bias issues of the kernel smoothing. However, and unlike the former, it is no longer guaranteed to have positive value nor to integrate to 1 with respect to $y$. With these issues in mind, Hyndman and Yao [18] built on local polynomial techniques and suggested two improved methods, the first one based on locally fitting a log-linear model and the second one on constrained local polynomial modeling. An overview can be found in Fan and Yao [7] (chapters 6 and 10). Very recently, Györfi and Kohler [15] studied a partitioning type estimate and studied its properties in to
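The equivalence stated in section 1.3 between the quotient estimator and the Nadaraya-Watson regression on the smoothed data $Y'_i = K_h(Y_i - y)$ can be checked numerically. The following is a minimal sketch, not the author's implementation: the Gaussian choice for both $K$ and $K'$, the bandwidth values, the simulated sample, and all function names are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) (an assumed choice of kernel)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def rescaled(kernel, u, h):
    """Rescaled kernel K_h(u) = (1/h) K(u/h)."""
    return kernel(u / h) / h

def f_hat_quotient(x, y, X, Y, h, h_prime):
    """Quotient form: joint kernel estimate divided by marginal estimate."""
    kx = rescaled(gaussian_kernel, X - x, h_prime)
    ky = rescaled(gaussian_kernel, Y - y, h)
    joint = np.mean(kx * ky)      # estimate of f_XY(x, y)
    marginal = np.mean(kx)        # estimate of f_X(x)
    return joint / marginal

def f_hat_nw(x, y, X, Y, h, h_prime):
    """Nadaraya-Watson regression of Y'_i = K_h(Y_i - y) on X_i at x."""
    kx = rescaled(gaussian_kernel, X - x, h_prime)
    y_smoothed = rescaled(gaussian_kernel, Y - y, h)
    return np.sum(y_smoothed * kx) / np.sum(kx)

# Hypothetical simulated sample: Y = sin(X) + noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=n)
Y = np.sin(X) + 0.5 * rng.normal(size=n)
h = h_prime = 0.3                 # illustrative bandwidths, not tuned

x0 = 0.5
y_grid = np.linspace(-6.0, 6.0, 1201)
dens_q = np.array([f_hat_quotient(x0, y, X, Y, h, h_prime) for y in y_grid])
dens_nw = np.array([f_hat_nw(x0, y, X, Y, h, h_prime) for y in y_grid])

# Largest pointwise difference between the two forms on the grid.
max_gap = float(np.max(np.abs(dens_q - dens_nw)))

# Riemann-sum approximation of the integral of the estimate over y.
step = y_grid[1] - y_grid[0]
total_mass = float(np.sum(dens_q) * step)

print("max |quotient - NW| =", max_gap)
print("approximate integral over y =", total_mass)
```

With a nonnegative kernel that integrates to one, the estimate at a fixed $x$ is a weighted mixture of rescaled kernels centred at the $Y_i$, so it stays nonnegative and integrates to one in $y$. These are the two properties that, according to the text, the local polynomial generalization no longer guarantees.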
