Hessian and concavity of mutual information, differential entropy, and entropy power in linear vector Gaussian channels

Reading time: 6 minutes

📝 Original Info

  • Title: Hessian and concavity of mutual information, differential entropy, and entropy power in linear vector Gaussian channels
  • ArXiv ID: 0903.1945
  • Date: 2009-03-12
  • Authors: Miquel Payaró, Daniel P. Palomar

📝 Abstract

Within the framework of linear vector Gaussian channels with arbitrary signaling, closed-form expressions for the Jacobian of the minimum mean square error and Fisher information matrices with respect to arbitrary parameters of the system are calculated in this paper. Capitalizing on prior research where the minimum mean square error and Fisher information matrices were linked to information-theoretic quantities through differentiation, closed-form expressions for the Hessian of the mutual information and the differential entropy are derived. These expressions are then used to assess the concavity properties of mutual information and differential entropy under different channel conditions and also to derive a multivariate version of the entropy power inequality due to Costa.

📄 Full Content

…generalized to the abstract Wiener space by Zakai in [6] and by Palomar and Verdú in two different directions: in [1] they calculated the partial derivatives of the mutual information and differential entropy with respect to arbitrary parameters of the system, rather than with respect to the SNR alone, and in [7] they represented the derivative of mutual information as a function of the conditional marginal distribution of the input given the output for channels where the noise is not constrained to be Gaussian.

In this paper we build upon the setting of [1] where, loosely speaking, it was proved that, for the linear vector Gaussian channel:

i) the gradients of the differential entropy h(Y) and the mutual information I(S; Y) with respect to functions of the linear transformation undergone by the input, G, are linear functions of the MMSE matrix E_S, and ii) the gradient of the differential entropy h(Y) with respect to the linear transformation undergone by the noise, C, is a linear function of the Fisher information matrix J_Y.
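To give these relations a concrete shape, a representative rendering in the notation of this summary is shown below; this is our reconstruction of the flavor of the results in [1], not a quote of them, and the exact parameterization and constants should be taken from [1] itself:

```latex
\nabla_{\mathbf{G}}\, I(S;Y) \;=\; \mathbf{R}_Z^{-1}\,\mathbf{G}\,\mathbf{E}_S,
\qquad
\nabla_{\mathbf{C}}\, h(Y) \;=\; \mathbf{J}_Y\,\mathbf{C}.
```

That is, both gradients are linear in E_S and J_Y respectively, which is exactly the first-order structure that the Hessian results of this paper refine.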

In this work, we show that the previous two key quantities E_S and J_Y, which completely characterize the first-order derivatives, are not enough to describe the second-order derivatives. For that purpose, we introduce the more refined conditional MMSE matrix Φ_S(y) and conditional Fisher information matrix Γ_Y(y) (note that when these quantities are averaged with respect to the distribution of the output y, we recover E_S = E{Φ_S(Y)} and J_Y = E{Γ_Y(Y)}). In particular, the second-order derivatives depend on Φ_S(y) and Γ_Y(y) through the following terms: E{Φ_S(Y) ⊗ Φ_S(Y)} and E{Γ_Y(Y) ⊗ Γ_Y(Y)}. See Fig. 1 for a schematic representation of these relations.
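To make these objects concrete, here is a minimal Monte Carlo sketch (our illustration, not code from the paper; the white-noise model and helper names such as `phi_S` are assumptions) that computes the conditional MMSE matrix Φ_S(y) for a small discrete-input channel and checks the averaging relation E_S = E{Φ_S(Y)} along with the Kronecker term E{Φ_S(Y) ⊗ Φ_S(Y)}:

```python
# Minimal sketch (illustrative, not from the paper): conditional MMSE matrix
# Phi_S(y) = Cov(S | Y = y) for a discrete input over Y = G S + Z with white
# Gaussian noise, plus the averages E_S = E{Phi_S(Y)} and E{Phi_S ⊗ Phi_S}.
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, m = 3, 2
G = rng.standard_normal((n, m))

# Equiprobable ±1 alphabet on each of the m input components.
alphabet = np.array(list(product([-1.0, 1.0], repeat=m)))    # shape (2^m, m)

def phi_S(y):
    """Conditional MMSE matrix Phi_S(y) = E[S S^T | y] - E[S | y] E[S | y]^T."""
    d = y - alphabet @ G.T                                   # residuals, (2^m, n)
    logw = -0.5 * np.sum(d * d, axis=1)                      # log p(y|s), R_Z = I
    w = np.exp(logw - logw.max())
    w /= w.sum()                                             # posterior p(s|y)
    mean = w @ alphabet                                      # E[S | y]
    second = (alphabet * w[:, None]).T @ alphabet            # E[S S^T | y]
    return second - np.outer(mean, mean)

# Monte Carlo average over the output distribution.
N = 20000
S = alphabet[rng.integers(len(alphabet), size=N)]
Y = S @ G.T + rng.standard_normal((N, n))
phis = np.array([phi_S(y) for y in Y])

E_S = phis.mean(axis=0)                                      # E{Phi_S(Y)}
kron_avg = np.mean([np.kron(P, P) for P in phis], axis=0)    # E{Phi_S ⊗ Phi_S}
print("E_S ≈\n", E_S)
print("E{Phi_S ⊗ Phi_S} shape:", kron_avg.shape)             # (m^2, m^2)
```

The same recipe, with score functions of the output density in place of posterior moments, would estimate Γ_Y(y) and its average J_Y.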

Results analogous to some of the expressions presented in this paper, particularized to the scalar Gaussian channel, were simultaneously derived in [8], [9], where the second and third derivatives of the mutual information with respect to the SNR were calculated.
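For orientation, the scalar-channel identities behind that line of work are standard (stated here from the I-MMSE literature, not quoted from [8], [9]): for Y = √snr · X + N with standard Gaussian N and entropies in nats,

```latex
\frac{\mathrm{d}I}{\mathrm{d}\,\mathsf{snr}} = \frac{1}{2}\,\mathrm{mmse}(\mathsf{snr}),
\qquad
\frac{\mathrm{d}^{2}I}{\mathrm{d}\,\mathsf{snr}^{2}}
  = -\frac{1}{2}\,\mathbb{E}\!\left[\operatorname{Var}^{2}(X \mid Y)\right].
```

Note that Var(X | Y) is the scalar analogue of Φ_S(y), so the squared conditional variance mirrors the Kronecker terms E{Φ_S(Y) ⊗ Φ_S(Y)} above.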

As an application of the obtained expressions, we show concavity properties of the mutual information and the differential entropy, and derive a multivariate generalization of the entropy power inequality (EPI) due to Costa [10]. Our multivariate EPI has already found an application in [11], where it is used to derive outer bounds on the capacity region of multiuser channels with feedback.
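As background (these are the standard definitions and Costa's original statement, not the paper's multivariate version, which is given in Section V), the entropy power of an n-dimensional random vector and Costa's EPI [10] read:

```latex
N(\mathbf{X}) = \frac{1}{2\pi e}\, e^{\frac{2}{n} h(\mathbf{X})},
\qquad
t \;\mapsto\; N\!\left(\mathbf{X} + \sqrt{t}\,\mathbf{Z}\right)
\ \text{ is concave in } t \ge 0,
```

where Z is a standard Gaussian vector independent of X; the paper extends this concavity statement to the vector parameterization of the linear model above.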

This paper is organized as follows. In Section II, the model for the linear vector Gaussian channel is given, and the differential entropy, mutual information, minimum mean-square error, and Fisher information quantities, as well as the relationships among them, are introduced. The main results of the paper are given in Section III, where we present the expressions for the Jacobian matrix of the MMSE and Fisher information and also for the Hessian matrix of the mutual information and differential entropy. In Section IV the concavity properties of the mutual information are studied, and in Section V a multivariate generalization of Costa's EPI in [10] is given. Finally, an extension of some of the obtained results to the complex-valued case is considered in Section VI.

Notation: Straight boldface denotes multivariate quantities such as vectors (lowercase) and matrices (uppercase). Uppercase italics denote random variables, and their realizations are represented by lowercase italics. The sets of q-dimensional symmetric, positive semidefinite, and positive definite matrices are denoted by S^q, S^q_+, and S^q_++, respectively. The elements of a matrix A are represented by A_ij or [A]_ij interchangeably, whereas the elements of a vector a are represented by a_i. The operator diag(A) represents a column vector with the diagonal entries of matrix A; Diag(A) and Diag(a) represent a diagonal matrix whose non-zero elements are given by the diagonal elements of matrix A and by the elements of vector a, respectively; and vec A represents the vector obtained by stacking the columns of A. For symmetric matrices, vech A is obtained from vec A by eliminating the repeated elements located above the main diagonal of A. The Kronecker matrix product is represented by A ⊗ B and the Schur (or Hadamard) element-wise matrix product is denoted by A • B. The superscripts (·)^T, (·)^†, and (·)^+ denote transpose, Hermitian, and Moore-Penrose pseudo-inverse operations, respectively. With a slight abuse of notation, we consider that when the square root or the multiplicative inverse is applied to a vector, it acts upon the entries of the vector; we thus have [√a]_i = √a_i and [1/a]_i = 1/a_i.
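Since several of these operators recur throughout the results, a quick sanity check of the notation in code may help (an illustration using NumPy built-ins; the variable names are ours):

```python
# Illustration of the notation with NumPy built-ins (variable names are ours).
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])            # a symmetric 2x2 matrix
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

diag_A = np.diag(A)                   # diag(A): column vector of diagonal entries
Diag_A = np.diag(diag_A)              # Diag(A): diagonal matrix from those entries
vec_A  = A.reshape(-1, order="F")     # vec A: stack the columns of A

# vech A: drop the repeated above-diagonal elements of the symmetric A,
# keeping the column-major (vec) ordering of the remaining entries.
mask   = np.tril(np.ones(A.shape, dtype=bool))
vech_A = vec_A[mask.reshape(-1, order="F")]

kron_AB  = np.kron(A, B)              # Kronecker product A ⊗ B, shape (4, 4)
schur_AB = A * B                      # Schur/Hadamard product A • B

print(vec_A)                          # [1. 2. 2. 4.]
print(vech_A)                         # [1. 2. 4.]
print(kron_AB.shape, schur_AB)
```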

We consider a general discrete-time linear vector Gaussian channel, whose output Y ∈ R^n is represented by the following signal model

Y = G S + Z,

where S ∈ R^m is the zero-mean channel input vector with covariance matrix R_S, the matrix G ∈ R^(n×m) specifies the linear transformation undergone by the input vector, and Z ∈ R^n represents zero-mean Gaussian noise with non-singular covariance matrix R_Z.

The channel transition probability density function…
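The text breaks off here; under the signal model above, the transition density necessarily takes the standard multivariate Gaussian form (a reconstruction from the stated model, not a quote from the truncated text):

```latex
P_{Y\mid S}(\mathbf{y}\mid\mathbf{s})
  = \frac{1}{(2\pi)^{n/2}\,\det(\mathbf{R}_Z)^{1/2}}
    \exp\!\Big(-\tfrac{1}{2}\,(\mathbf{y}-\mathbf{G}\mathbf{s})^{\mathsf T}
    \mathbf{R}_Z^{-1}(\mathbf{y}-\mathbf{G}\mathbf{s})\Big).
```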

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
