On an Auxiliary Function for Log-Density Estimation

In this note we provide explicit expressions and expansions for a special function which appears in nonparametric estimation of log-densities. This function returns the integral of a log-linear function on a simplex of arbitrary dimension. In particular it is used in the R-package “LogConcDEAD” by Cule et al. (2007).
defines a maximum likelihood estimator f̂ := exp(ψ̂) of a probability density on S, based on P.
For existence and uniqueness of this estimator see, for instance, Cule et al. (2008).
To compute ψ̂ explicitly, note that ψ ∈ G is uniquely determined by its values at the corners (extremal points) of all simplices S_j, and ∫ψ dP is a linear function of these values. The second integral in (1) may be represented as follows: Let S_j be the convex hull of x_{0j}, x_{1j}, ..., x_{dj} ∈ R^d, and set
where
while J(•) is an auxiliary function defined and analyzed subsequently.
2 The special function J(•)
For d ∈ N let
T_d := {u ∈ [0, ∞)^d : u_+ ≤ 1},  where u_+ := u_1 + u_2 + ··· + u_d,
be the unit simplex in R^d. Then for y_0, y_1, ..., y_d ∈ R we define
J(y_0, y_1, ..., y_d) := ∫_{T_d} exp((1 − u_+) y_0 + u_1 y_1 + ··· + u_d y_d) du.
Standard considerations in connection with beta and gamma distributions as described in Section 6 reveal the following alternative representation:
J(y_0, y_1, ..., y_d) = (1/d!) E exp( Σ_{i=0}^d y_i E_i / E_+ )
with E_+ := Σ_{s=0}^d E_s and stochastically independent, standard exponential random variables E_0, E_1, ..., E_d. This representation shows clearly that J(•) is symmetric in its arguments.
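As a sanity check on this representation, one can compare a Monte Carlo average with the explicit d = 1 value (a minimal sketch, assuming the representation J(y) = (1/d!) E exp(Σ_i y_i E_i/E_+); the function name, sample size, and seed are our own choices):

```python
import math
import random

def J_mc(y, n=200_000, seed=1):
    # Monte Carlo estimate of J(y_0, ..., y_d) via the representation
    # J(y) = (1/d!) E exp(sum_i y_i E_i / E_+),  E_+ := E_0 + ... + E_d,
    # with independent standard exponential E_0, ..., E_d.
    rng = random.Random(seed)
    d = len(y) - 1
    total = 0.0
    for _ in range(n):
        e = [rng.expovariate(1.0) for _ in y]
        e_plus = sum(e)
        total += math.exp(sum(yi * ei / e_plus for yi, ei in zip(y, e)))
    return total / (n * math.factorial(d))

# For d = 1 the exact value is (e^{y_1} - e^{y_0})/(y_1 - y_0):
exact = (math.exp(1.0) - 1.0) / 1.0
print(J_mc([0.0, 1.0]), exact)
```

The two printed values should agree to roughly two decimals for this sample size.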
An often useful identity is
J(y_0, y_1, ..., y_d) = exp(y_*) J(y_0 − y_*, y_1 − y_*, ..., y_d − y_*)   (2)
for any y_* ∈ R.
For d = 1 one can compute J(y_0, y_1) explicitly:
J(y_0, y_1) = (exp(y_1) − exp(y_0))/(y_1 − y_0)  if y_0 ≠ y_1,  and  J(y_0, y_0) = exp(y_0).
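This closed form is easy to check against a direct numerical evaluation of the defining integral J(y_0, y_1) = ∫_0^1 exp((1 − u) y_0 + u y_1) du (a small illustrative sketch; function names and grid size are our own choices):

```python
import math

def J1_closed(y0, y1):
    # d = 1 closed form: (e^{y1} - e^{y0})/(y1 - y0),
    # continuously extended by J(y0, y0) = e^{y0}.
    if y0 == y1:
        return math.exp(y0)
    return (math.exp(y1) - math.exp(y0)) / (y1 - y0)

def J1_riemann(y0, y1, n=10_000):
    # Midpoint rule for the defining integral over u in (0, 1).
    return sum(math.exp((1 - (k + 0.5) / n) * y0 + (k + 0.5) / n * y1)
               for k in range(n)) / n

print(J1_closed(0.0, 1.0), J1_riemann(0.0, 1.0))
```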
For d ≥ 2 one may use the following recursion formula:
J(y_0, y_1, ..., y_d) = (J(y_1, y_2, ..., y_d) − J(y_0, y_1, ..., y_{d−1}))/(y_d − y_0)  if y_0 ≠ y_d.   (3)
Since J(y_0, y_1, ..., y_d) is continuous in (y_0, y_1, ..., y_d), it suffices to verify (3) in the case of pairwise different y_0, y_1, ..., y_d.
We may identify T_d with the set {(v, u) : u ∈ T_{d−1}, v ∈ (0, 1 − u_+)}. Then it follows from Fubini’s theorem that
It is well-known that for any integer 0 ≤ j < d,
are stochastically independent with B ∼ Beta(j + 1, d -j); see also Section 6. Hence we end up with the following recursive identity:
Here we used the well-known identity ∫_0^1 t^{a−1} (1 − t)^{b−1} dt = Γ(a)Γ(b)/Γ(a + b) for a, b > 0.
Plugging in j = d − 1 into the previous recursive equation leads to equation (5).
With ȳ := (d + 1)^{−1} Σ_{i=0}^d y_i and z_i := y_i − ȳ one may write J(y_0, y_1, ..., y_d) = exp(ȳ) J(z_0, z_1, ..., z_d) by virtue of (2). Note that Σ_{i=0}^d z_i = 0.
It follows from Lemma 6.1 that
J(z_0, z_1, ..., z_d) = Σ_{k=0}^∞ h_k(z_0, z_1, ..., z_d)/(d + k)!,   (6)
where h_k(z) denotes the sum of all monomials z_0^{a_0} z_1^{a_1} ··· z_d^{a_d} with a_0 + a_1 + ··· + a_d = k. In particular,
h_0(z) = 1,  h_1(z) = Σ_{i=0}^d z_i = 0  and  h_2(z) = (1/2) Σ_{i=0}^d z_i²,
and
J(z_0, z_1, ..., z_d) = (1/d!) (1 + Σ_{i=0}^d z_i² / (2(d + 1)(d + 2))) + O(max_i |z_i|³).
Consequently, (3) holds whenever y_0 ≠ y_d. This formula is okay numerically if y_d − y_0 is not too small. Otherwise one should use (6). This leads to the pseudo-code in Table 1.
To avoid messy formulae, one can express partial derivatives of J(•) in terms of higher order versions of J(•) by means of the recursion (3). For instance,
∂J(y_{0:d})/∂y_0 = lim_{ε→0} (J(y_0 + ε, y_{1:d}) − J(y_0, y_{1:d}))/ε = lim_{ε→0} J(y_0, y_0 + ε, y_{1:d}) = J(y_0, y_0, y_{1:d}).
Table 1: Pseudo-code for J(y) with ordered input vector y.
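The strategy behind Table 1 can be sketched as follows, assuming the divided-difference form of the recursion (3) and the series expansion (6) around the mean; the tolerance and the truncation order are our own, illustrative choices:

```python
import math

def J(y, tol=1e-4, terms=8):
    # Evaluate J(y_0, ..., y_d): sort the arguments; if the spread
    # y_d - y_0 is not too small, apply the recursion (3),
    #   J(y_0..y_d) = (J(y_1..y_d) - J(y_0..y_{d-1})) / (y_d - y_0);
    # otherwise use the series expansion (6) around the mean.
    y = sorted(y)
    d = len(y) - 1
    if y[d] - y[0] > tol:
        return (J(y[1:], tol, terms) - J(y[:-1], tol, terms)) / (y[d] - y[0])
    # Expansion (6): J(y) = exp(ybar) * sum_k h_k(z) / (d + k)!,
    # where h_k is the complete homogeneous symmetric polynomial in
    # z_i = y_i - ybar, computed via k*h_k = sum_{j=1}^k p_j*h_{k-j}
    # with power sums p_j = sum_i z_i^j.
    ybar = sum(y) / (d + 1)
    z = [yi - ybar for yi in y]
    p = [sum(zi ** j for zi in z) for j in range(terms)]
    h = [1.0]
    for k in range(1, terms):
        h.append(sum(p[j] * h[k - j] for j in range(1, k + 1)) / k)
    return math.exp(ybar) * sum(h[k] / math.factorial(d + k) for k in range(terms))
```

Sorting the input first means the spread y_d − y_0 bounds all pairwise differences, so the series branch is only taken when every z_i is small and the truncated expansion is accurate.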
Similarly,
∂²J(y_{0:d})/∂y_0² = 2 J(y_0, y_0, y_0, y_{1:d})  and  ∂²J(y_{0:d})/(∂y_0 ∂y_1) = J(y_0, y_0, y_1, y_1, y_{2:d}).
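For d = 1 the identity ∂J(y_0, y_1)/∂y_0 = J(y_0, y_0, y_1) can be verified numerically, evaluating J(y_0, y_0, y_1) by one divided-difference step of the recursion (3) (a self-contained sketch with our own function names):

```python
import math

def J1(y0, y1):
    # d = 1 closed form, continuously extended by J(y0, y0) = e^{y0}.
    if y0 == y1:
        return math.exp(y0)
    return (math.exp(y1) - math.exp(y0)) / (y1 - y0)

def J2_repeated(y0, y1):
    # J(y0, y0, y1) via one divided-difference step (recursion (3)),
    # after reordering the symmetric arguments as (y0, y0, y1).
    return (J1(y0, y1) - J1(y0, y0)) / (y1 - y0)

# Central finite difference for the partial derivative in y0:
y0, y1, eps = 0.3, 1.7, 1e-6
fd = (J1(y0 + eps, y1) - J1(y0 - eps, y1)) / (2 * eps)
print(fd, J2_repeated(y0, y1))
```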
In view of (3) we consider an arbitrary function f : R → R which is infinitely often differentiable.
Then
h(r, s) := ∫_0^1 f'((1 − u) r + u s) du
defines a smooth and symmetric function h : R² → R such that
h(r, s) = (f(s) − f(r))/(s − r)  if r ≠ s,  and  h(r, r) = f'(r).
Its partial derivatives of order one and two are given by
The other partial derivatives of order one and two follow via symmetry considerations.
Recall that J(r, s) = (exp(s) − exp(r))/(s − r) for r ≠ s. Note first that
Thus it suffices to derive formulae for (r, s) = (0, y) and b ≤ a. It follows from (4) that
Let G_0, G_1, ..., G_m be stochastically independent random variables with G_i ∼ Gamma(a_i) for given numbers a_0, a_1, ..., a_m > 0, and define G_+ := Σ_{i=0}^m G_i and B := (G_i/G_+)_{i=1}^m ∈ T_m.

Lemma 6.1. The random vector B and the random variable G_+ are stochastically independent.
Moreover, G_+ ∼ Gamma(a_+) with a_+ := Σ_{i=0}^m a_i, while B is distributed according to the Lebesgue density
f(u) := (Γ(a_+)/Π_{i=0}^m Γ(a_i)) (1 − u_+)^{a_0 − 1} Π_{i=1}^m u_i^{a_i − 1}
on T_m, i.e. the Dirichlet distribution with parameters a_0, a_1, ..., a_m.
As a by-product of this lemma we obtain the following formula:
Corollary 6.2. For arbitrary numbers a_0, a_1, ..., a_m > 0,
∫_{T_m} (1 − u_+)^{a_0 − 1} Π_{i=1}^m u_i^{a_i − 1} du = Π_{i=0}^m Γ(a_i) / Γ(a_0 + a_1 + ··· + a_m).
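For m = 1 this reduces to the classical beta integral ∫_0^1 u^{a_1−1} (1 − u)^{a_0−1} du = Γ(a_0)Γ(a_1)/Γ(a_0 + a_1), which can be checked numerically (a small sketch; parameters and grid size are our own, illustrative choices):

```python
import math

def beta_integral(a0, a1, n=200_000):
    # Midpoint-rule approximation of the m = 1 case of the corollary:
    # the integral of u^{a1-1} * (1-u)^{a0-1} over T_1 = (0, 1).
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        total += u ** (a1 - 1) * (1 - u) ** (a0 - 1)
    return total / n

a0, a1 = 2.5, 3.5
lhs = beta_integral(a0, a1)
rhs = math.gamma(a0) * math.gamma(a1) / math.gamma(a0 + a1)
print(lhs, rhs)
```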
Proof of Lemma 6.1. Note that G = (G_i)_{i=0}^m may be written as Ξ(G_+, B) with the bijective mapping
Ξ(s, u) := (s(1 − u_+), s u_1, ..., s u_m)
from (0, ∞) times the interior of T_m to (0, ∞)^{m+1}.
Note also that, with u_0 := 1 − u_+,
s^m Π_{i=0}^m [Γ(a_i)^{−1} (s u_i)^{a_i − 1} exp(−s u_i)] = Γ(a_+)^{−1} s^{a_+ − 1} exp(−s) · f(u),
where s^m is the Jacobian determinant of Ξ at (s, u). Since this is the density of Gamma(a_+) at s times f(u), we see that G_+ and B are stochastically independent, where G_+ has distribution Gamma(a_+), and that f is indeed a probability density on T_m describing the distribution of B.
The fact that f integrates to one over T_m entails Corollary 6.2.