A Lower Bound on the Bayesian MSE Based on the Optimal Bias Function

Zvika Ben-Haim, Student Member, IEEE, and Yonina C. Eldar, Senior Member, IEEE

Abstract—A lower bound on the minimum mean-squared error (MSE) in a Bayesian estimation problem is proposed in this paper. This bound utilizes a well-known connection to the deterministic estimation setting. Using the prior distribution, the bias function which minimizes the Cramér–Rao bound can be determined, resulting in a lower bound on the Bayesian MSE. The bound is developed for the general case of a vector parameter with an arbitrary probability distribution, and is shown to be asymptotically tight in both the high and low signal-to-noise ratio regimes. A numerical study demonstrates several cases in which the proposed technique is both simpler to compute and tighter than alternative methods.

Index Terms—Bayesian bounds, Bayesian estimation, minimum mean-squared error estimation, optimal bias, performance bounds.

I. INTRODUCTION

The goal of estimation theory is to infer the value of an unknown parameter based on observations. A common approach to this problem is the Bayesian framework, in which the estimate is constructed by combining the measurements with prior information about the parameter [1]. In this setting, the parameter $\theta$ is random, and its distribution describes the a priori knowledge of the unknown value. In addition, measurements $x$ are obtained, whose conditional distribution, given $\theta$, provides further information about the parameter. The objective is to construct an estimator $\hat{\theta}$, which is a function of the measurements, so that $\hat{\theta}$ is close to $\theta$ in some sense. A common measure of the quality of an estimator is its mean-squared error (MSE), given by $E\{\|\theta - \hat{\theta}\|^2\}$. It is well known that the posterior mean $E\{\theta \mid x\}$ is the estimator minimizing the MSE.
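The MSE-optimality of the posterior mean can be checked numerically. The following sketch uses an illustrative conjugate Gaussian model (the variances and the shrinkage form of $E\{\theta|x\}$ are assumptions of this example, not taken from the paper) and compares the empirical MSE of the posterior mean with that of the raw measurement:

```python
import numpy as np

rng = np.random.default_rng(0)
sig0, sig = 2.0, 1.0      # prior std and noise std (illustrative values)
n = 200_000

theta = rng.normal(0.0, sig0, n)        # theta ~ N(0, sig0^2)
x = theta + rng.normal(0.0, sig, n)     # x | theta ~ N(theta, sig^2)

# In this conjugate model the posterior mean is linear in x:
#   E{theta | x} = sig0^2 / (sig0^2 + sig^2) * x
post_mean = sig0**2 / (sig0**2 + sig**2) * x

mse_post = np.mean((post_mean - theta)**2)  # approaches sig0^2*sig^2/(sig0^2+sig^2)
mse_ml = np.mean((x - theta)**2)            # approaches sig^2
print(mse_post, mse_ml)
```

Here the posterior mean achieves MSE close to $0.8$, versus roughly $1.0$ for the raw measurement; no other estimator can improve on the posterior mean.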
Thus, from a theoretical perspective, there is no difficulty in finding the minimum MSE (MMSE) estimator in any given problem. In practice, however, the complexity of computing the posterior mean is often prohibitive. As a result, various alternatives, such as the maximum a posteriori (MAP) technique, have been developed [2]. The purpose of such methods is to approach the performance of the MMSE estimator with a computationally efficient algorithm.

An important goal is to quantify the performance degradation resulting from the use of these suboptimal techniques. One way to do this is to compare the MSE of the method used in practice with the MMSE. Unfortunately, computation of the MMSE is itself infeasible in many cases. This has led to a large body of work seeking to find simple lower bounds on the MMSE in various estimation problems [3]–[12].

(The authors are with the Department of Electrical Engineering, Technion—Israel Institute of Technology, Haifa 32000, Israel; e-mail: zvikabh@technion.ac.il; yonina@ee.technion.ac.il. This work was supported in part by the Israel Science Foundation under Grant no. 1081/07 and by the European Commission in the framework of the FP7 Network of Excellence in Wireless COMmunications NEWCOM++, contract no. 216715.)

Generally speaking, previous bounds can be divided into two categories. The Weiss–Weinstein family is based on a covariance inequality and includes the Bayesian Cramér–Rao bound [3], the Bobrovski–Zakai bound [8], and the Weiss–Weinstein bound [9], [10]. The Ziv–Zakai family of bounds is based on comparing the estimation problem to a related detection scenario. This family includes the Ziv–Zakai bound [4] and its improvements, notably the Bellini–Tartara bound [6], the Chazan–Zakai–Ziv bound [7], and the generalization of Bell et al. [11]. Recently, Renaux et al. have combined both approaches [12].
The accuracy of the bounds described above is usually tested numerically in particular estimation settings. Few of the previous results provide any sort of analytical proof of accuracy, even under asymptotic conditions. Bellini and Tartara [6] briefly discuss the performance of their bound at high signal-to-noise ratio (SNR), and Bell et al. [11] prove that their bound converges to the true value at low SNR for a particular family of Gaussian-like probability distributions. To the best of our knowledge, there are no other results concerning the asymptotic performance of Bayesian bounds.

A different estimation setting arises when one considers $\theta$ as a deterministic unknown parameter. In this case, too, a common goal is to construct an estimator having low MSE. However, the term MSE has a very different meaning in the deterministic setting, since in this case the expectation is taken only over the random variable $x$. One elementary difference with far-reaching implications is that in the Bayesian case, the MSE is a single real number, whereas the deterministic MSE is a function of the unknown parameter $\theta$ [13]–[15].

Many lower bounds have been developed for the deterministic setting as well. These include classical results such as the Cramér–Rao [16], [17], Hammersley–Chapman–Robbins [18], [19], Bhattacharyya [20], and Barankin [21] bounds, as well as more recent results [22]–[27]. By far the simplest and most commonly used of these approaches is the Cramér–Rao bound (CRB). Like most other deterministic bounds, the CRB deals explicitly with unbiased estimators or, equivalently, with estimators having a specific, pre-specified bias function. Two exceptions are the uniform CRB [23], [25] and the minimax linear-bias bound [26], [27]. The CRB is known to be asymptotically tight in many cases, even though many later bounds are sharper than it [14], [25], [28].
Although the deterministic and Bayesian settings stem from different points of view, there exist insightful relations between the two approaches. The basis for this connection is the fact that, by adding a prior distribution for $\theta$, any deterministic problem can be transformed into a corresponding Bayesian setting. Several theorems relate the performance of corresponding Bayesian and deterministic scenarios [13]. As a consequence, numerous bounds have both a deterministic and a Bayesian version [3], [10], [12], [29].

The simplicity and asymptotic tightness of the deterministic CRB motivate its use in problems in which $\theta$ is random. Such an application was described by Young and Westerberg [5], who considered the case of a scalar $\theta$ constrained to the interval $[\theta_0, \theta_1]$. They used the prior distribution of $\theta$ to determine the optimal bias function for use in the biased CRB, and thus obtained a Bayesian bound. It should be noted that this result differs from the Bayesian CRB of Van Trees [3]; the two bounds are compared in Section II-C. We refer to the result of Young and Westerberg as the optimal-bias bound (OBB), since it is based on choosing the bias function which optimizes the CRB using the given prior distribution.

This paper provides an extension and a deeper analysis of the OBB. Specifically, we generalize the bound to an arbitrary $n$-dimensional estimation setting [30]. The bound is determined by finding the solution to a certain partial differential equation. Using tools from functional analysis, we demonstrate that a unique solution exists for this differential equation. Under suitable symmetry conditions, it is shown that the method can be reduced to the solution of an ordinary differential equation and, in some cases, presented in closed form. The mathematical tools employed in this paper are also used for characterizing the performance of the OBB.
Specifically, it is demonstrated analytically that the proposed bound is asymptotically tight for both high and low SNR values. Furthermore, the OBB is compared with several other bounds; in the examples considered, the OBB is both simpler computationally and more accurate than all relevant alternatives.

The remainder of this paper is organized as follows. In Section II, we derive the OBB for a vector parameter. Section III discusses some mathematical concepts required to ensure the existence of the OBB. In Section IV, a practical technique for calculating the bound is developed using variational calculus. In Section V, we demonstrate some properties of the OBB, including its asymptotic tightness. Finally, in Section VI, we compare the performance of the bound with that of other relevant techniques.

II. THE OPTIMAL-BIAS BOUND

In this section, we derive the OBB for the general vector case. To this end, we first examine the relation between the Bayesian and deterministic estimation settings (Section II-A). Next, we focus on the deterministic case and review the basic properties of the CRB (Section II-B). Finally, the OBB is derived from the CRB (Section II-C).

The focus of this paper is the Bayesian estimation problem, but the bound we propose stems from the theory of deterministic estimation. To avoid confusion, we will indicate that a particular quantity refers to the deterministic setting by appending the symbol $;\theta$ to it. For example, the notation $E\{\cdot\}$ denotes expectation over both $\theta$ and $x$, i.e., expectation in the Bayesian sense, while expectation solely over $x$ (in the deterministic setting) is denoted by $E\{\cdot\,;\theta\}$. The notation $E\{\cdot \mid \theta\}$ indicates Bayesian expectation conditioned on $\theta$.

Some further notation used throughout the paper is as follows. Lowercase boldface letters signify vectors and uppercase boldface letters indicate matrices.
The $i$th component of a vector $v$ is denoted $v_i$, while $v^{(1)}, v^{(2)}, \ldots$ signifies a sequence of vectors. The derivative $\partial f/\partial v$ of a function $f(v)$ is a vector function whose $i$th element is $\partial f/\partial v_i$. Similarly, given a vector function $b(\theta)$, the derivative $\partial b/\partial \theta$ is defined as the matrix function whose $(i,j)$th entry is $\partial b_i/\partial \theta_j$. The squared Euclidean norm $v^T v$ of a vector $v$ is denoted $\|v\|^2$, while the squared Frobenius norm $\mathrm{Tr}(M M^T)$ of a matrix $M$ is denoted $\|M\|_F^2$. In Section III, we will also define some functional norms, which will be of use later in the paper.

A. The Bayesian–Deterministic Connection

We now review a fundamental relation between the Bayesian and deterministic estimation settings. Let $\theta$ be an unknown random vector in $\mathbb{R}^n$ and let $x$ be a measurement vector. The joint probability density function (pdf) of $\theta$ and $x$ is $p_{x,\theta}(x,\theta) = p_{x|\theta}(x|\theta)\, p_\theta(\theta)$, where $p_\theta$ is the prior distribution of $\theta$ and $p_{x|\theta}$ is the conditional distribution of $x$ given $\theta$. For later use, define the set $\Theta$ of feasible parameter values by

$$\Theta = \{\theta \in \mathbb{R}^n : p_\theta(\theta) > 0\}. \quad (1)$$

Suppose $\hat{\theta} = \hat{\theta}(x)$ is an estimator of $\theta$. Its (Bayesian) MSE is given by

$$\mathrm{MSE} = E\{\|\hat{\theta} - \theta\|^2\} = \int \|\hat{\theta} - \theta\|^2\, p_{x,\theta}(x,\theta)\, dx\, d\theta. \quad (2)$$

By the law of total expectation, we have

$$\mathrm{MSE} = \int\!\!\int \|\hat{\theta} - \theta\|^2\, p_{x|\theta}(x|\theta)\, dx\; p_\theta(\theta)\, d\theta = E\{ E\{\|\hat{\theta} - \theta\|^2 \mid \theta\} \}. \quad (3)$$

Now consider a deterministic estimation setting, i.e., suppose $\theta$ is a deterministic unknown which is to be estimated from random measurements $x$. Let the distribution $p_{x;\theta}$ of $x$ (as a function of $\theta$) be given by $p_{x;\theta}(x;\theta) = p_{x|\theta}(x|\theta)$, i.e., the distribution of $x$ in the deterministic case equals the conditional distribution in the corresponding Bayesian problem.
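The decomposition (3) can be illustrated numerically: averaging the conditional MSE at each fixed $\theta$ over the prior reproduces the Bayesian MSE. The sketch below uses an illustrative scalar Gaussian model and a shrinkage estimator $\hat{\theta} = a x$ (the model and the value of $a$ are assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(1)
sig0, sig, a = 2.0, 1.0, 0.7   # prior std, noise std, shrinkage factor (all assumed)
n = 200_000

# Route 1: Bayesian MSE, averaging over the joint distribution of (theta, x).
theta = rng.normal(0.0, sig0, n)
x = theta + rng.normal(0.0, sig, n)
bayes_mse = np.mean((a * x - theta)**2)

# Route 2: the deterministic-style MSE at fixed theta,
#   E{(a*x - theta)^2 ; theta} = a^2*sig^2 + (a - 1)^2*theta^2,
# averaged over the prior, as in (3).
avg_det_mse = np.mean(a**2 * sig**2 + (a - 1)**2 * theta**2)

print(bayes_mse, avg_det_mse)  # both approach a^2*sig^2 + (a-1)^2*sig0^2 = 0.85
```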
The estimator $\hat{\theta}$ defined above is simply a function of the measurements, and can therefore be applied in the deterministic case as well. Its deterministic MSE is given by

$$E\{\|\hat{\theta} - \theta\|^2 ; \theta\} = \int \|\hat{\theta} - \theta\|^2\, p_{x;\theta}(x;\theta)\, dx. \quad (4)$$

Since $p_{x;\theta}(x;\theta) = p_{x|\theta}(x|\theta)$, we have

$$E\{\|\hat{\theta} - \theta\|^2 ; \theta\} = E\{\|\hat{\theta} - \theta\|^2 \mid \theta\}. \quad (5)$$

Combining this fact with (3), we find that the Bayesian MSE equals the expectation of the MSE of the corresponding deterministic problem, i.e.,

$$E\{\|\hat{\theta} - \theta\|^2\} = E\{ E\{\|\hat{\theta} - \theta\|^2 ; \theta\} \}. \quad (6)$$

This relation will be used to construct the OBB in Section II-C.

B. The Deterministic Cramér–Rao Bound

Before developing the OBB, we review some basic results in the deterministic estimation setting. Suppose $\theta$ is a deterministic parameter vector and let $x$ be a measurement vector having pdf $p_{x;\theta}(x;\theta)$. Denote by $\Theta \subseteq \mathbb{R}^n$ the set of all possible values of $\theta$. We assume for technical reasons that $\Theta$ is an open set.¹ Let $\hat{\theta}$ be an estimator of $\theta$ from the measurements $x$. We require the following regularity conditions to ensure that the CRB holds [31, §3.1.3].

1) $p_{x;\theta}(x;\theta)$ is continuously differentiable with respect to $\theta$. This condition is required to ensure the existence of the Fisher information.

2) The Fisher information matrix $J(\theta)$, defined by

$$[J(\theta)]_{ij} = E\left\{ \frac{\partial \log p_{x;\theta}}{\partial \theta_i} \frac{\partial \log p_{x;\theta}}{\partial \theta_j} ; \theta \right\}, \quad (7)$$

is bounded and positive definite for all $\theta \in \Theta$. This ensures that the measurements contain data about the unknown parameter.

3) Exchanging the integral and derivative in the equation

$$\int t(x) \frac{\partial}{\partial \theta_i} p_{x;\theta}(x;\theta)\, dx = \frac{\partial}{\partial \theta_i} \int t(x)\, p_{x;\theta}(x;\theta)\, dx \quad (8)$$

is justified for any measurable function $t(x)$, in the sense that, if one side exists, then the other exists and the two sides are equal. A sufficient condition for this to hold is that the support of $p_{x;\theta}$ does not depend on $\theta$.
4) All estimators $\hat{\theta}$ are Borel measurable functions which satisfy

$$\left\| \frac{\partial p_{x;\theta}}{\partial \theta} \hat{\theta}^T \right\|_F \le g(x) \quad \text{for all } \theta \quad (9)$$

for some integrable function $g(x)$. This technical requirement is needed in order to exclude certain pathological estimators whose statistical behavior is insufficiently smooth to allow the application of the CRB.

The bias of an estimator $\hat{\theta}$ is defined as

$$b(\theta) = E\{\hat{\theta} ; \theta\} - \theta. \quad (10)$$

Under the above assumptions, it can be shown that the bias of any estimator is continuously differentiable [5, Lemma 2]. Furthermore, under these assumptions, the CRB holds, and thus, for any estimator having bias $b(\theta)$, we have

$$E\{\|\theta - \hat{\theta}\|^2 ; \theta\} \ge \mathrm{CRB}[b, \theta] \triangleq \mathrm{Tr}\left[ \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right] + \|b(\theta)\|^2. \quad (11)$$

A more common form of the CRB is obtained by restricting attention to unbiased estimators (i.e., techniques for which $b(\theta) = 0$). Under the unbiasedness assumption, the bound simplifies to $\mathrm{MSE} \ge \mathrm{Tr}(J^{-1}(\theta))$. However, in the sequel we will make use of the general form (11).

C. A Bayesian Bound from the CRB

The OBB of Young and Westerberg [5] is based on applying the Bayesian–deterministic connection described in Section II-A to the deterministic CRB (11). Specifically, returning now to the Bayesian setting, one can combine (6) and (11) to obtain that, for any estimator $\hat{\theta}$ with bias function $b(\theta)$,

$$E\{\|\theta - \hat{\theta}\|^2\} \ge Z[b] \triangleq \int_\Theta \mathrm{CRB}[b, \theta]\, p_\theta(d\theta) \quad (12)$$

where the expectation is now performed over both $\theta$ and $x$.

¹This is required in order to ensure that one can discuss differentiability of $p_{x;\theta}$ with respect to $\theta$ at any point $\theta \in \Theta$. In the Bayesian setting to which we will return in Section II-C, $\Theta$ is defined by (1); in this case, adding a boundary to $\Theta$ essentially leaves the setting unchanged, as long as the prior probability for $\theta$ to be on the boundary of $\Theta$ is zero. Therefore, this requirement is of little practical relevance.
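In scalar form, (11) reads $\mathrm{CRB}[b,\theta] = (1 + b'(\theta))^2 / J(\theta) + b^2(\theta)$. As a sanity check, in an assumed Gaussian location model (an illustration, not an example from the paper), the linear estimator $\hat{\theta} = a x$ has bias $b(\theta) = (a-1)\theta$ and attains this biased CRB exactly:

```python
import numpy as np

rng = np.random.default_rng(2)
sig, a = 1.0, 0.7          # noise std and shrinkage factor (illustrative)
J = 1.0 / sig**2           # Fisher information of the N(theta, sig^2) model

def crb(theta):
    # Scalar form of (11) for the bias b(theta) = (a - 1)*theta of a*x:
    #   CRB[b, theta] = (1 + b'(theta))^2 / J + b(theta)^2,  with b' = a - 1.
    return a**2 / J + ((a - 1) * theta)**2

for theta in [0.0, 1.0, 3.0]:
    x = theta + rng.normal(0.0, sig, 100_000)
    mse = np.mean((a * x - theta)**2)   # deterministic MSE of a*x at this theta
    print(theta, mse, crb(theta))       # the two columns agree: a*x meets its CRB
```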
Note that (12) describes the Bayesian MSE as a function of a deterministic property (the bias) of $\hat{\theta}$. Since any estimator has some bias function, and since all bias functions are continuously differentiable in our setting, minimizing $Z[b]$ over all continuously differentiable functions $b$ yields a lower bound on the MSE of any Bayesian estimator. Thus, under the regularity conditions of Section II-B, a lower bound on the Bayesian MSE is given by

$$s = \inf_{b \in C^1} \int_\Theta \left[ \|b(\theta)\|^2 + \mathrm{Tr}\left( \left(I + \frac{\partial b}{\partial \theta}\right) J^{-1}(\theta) \left(I + \frac{\partial b}{\partial \theta}\right)^T \right) \right] p_\theta(d\theta) \quad (13)$$

where $C^1$ is the space of continuously differentiable functions $f : \Theta \to \mathbb{R}^n$.

Note that the OBB differs from the Bayesian CRB of Van Trees [3]. Van Trees' result is based on applying the Cauchy–Schwarz inequality to the joint pdf $p_{x,\theta}$, whereas the deterministic CRB is based on applying a similar procedure to $p_{x;\theta}$. As a consequence, the regularity conditions required for the Bayesian CRB are stricter, requiring that $p_{x,\theta}$ be twice differentiable with respect to $\theta$. By contrast, the OBB requires differentiability only of the conditional pdf $p_{x|\theta}$. An example in which this difference is important is the case in which the prior distribution $p_\theta$ is discontinuous, e.g., when $p_\theta$ is uniform. The performance of the OBB in this setting will be examined in Section VI.

In the next section, we will see that it is advantageous to perform the minimization (13) over a somewhat modified class of functions. This will allow us to prove the unique existence of a solution to the optimization problem, a result which will be of use when examining the properties of the bound later in the paper.

III. MATHEMATICAL SAFEGUARDS

In the previous section, we saw that a lower bound on the MMSE can be obtained by solving the minimization problem (13).
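To get a feel for the objective in (13), one can evaluate $Z[b]$ over a restricted family of biases. In an assumed scalar Gaussian location model (illustrative values, not from the paper), restricting to linear biases $b(\theta) = c\theta$ reduces (13) to a one-dimensional scan over $c$; in this particular conjugate model, the restricted minimum already coincides with the exact MMSE, showing how close $Z[b]$ can come to the true Bayesian MSE:

```python
import numpy as np

sig0, sig = 2.0, 1.0       # prior std and noise std (illustrative)
var0, var = sig0**2, sig**2

# Restrict (13) to linear biases b(theta) = c*theta in a scalar Gaussian
# location model with J = 1/sig^2.  The objective then reduces to
#   Z(c) = E{ c^2*theta^2 + (1 + c)^2*sig^2 } = c^2*var0 + (1 + c)^2*var.
c_grid = np.linspace(-1.0, 0.0, 100_001)
Z = c_grid**2 * var0 + (1 + c_grid)**2 * var

c_best = c_grid[np.argmin(Z)]
c_star = -var / (var + var0)         # stationary point of Z(c)
mmse = var0 * var / (var0 + var)     # exact MMSE of this conjugate model
print(c_best, c_star, Z.min(), mmse) # restricted minimum equals the MMSE here
```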
However, at this point, we have no guarantee that the solution $s$ of (13) is anywhere near the true value of the MMSE. Indeed, at first sight, it may appear that $s = 0$ for any estimation setting. To see this, note that $Z[b]$ is a sum of two components, a bias gradient part and a squared bias part. Both parts are nonnegative, but the former is zero when the bias gradient is $-I$, while the latter is zero when the bias is zero. No differentiable function $b$ satisfies these two constraints simultaneously for all $\theta$, since if the squared bias is everywhere zero, then the bias gradient is also zero. However, it is possible to construct a sequence of functions $b^{(i)}$ for which both the bias gradient part and the squared bias norm tend to zero for almost every value of $\theta$. An example of such a sequence in a one-dimensional setting is plotted in Fig. 1. Here, a sequence $b^{(i)}$ of smooth, periodic functions is presented. The function period tends to zero, and the percentage of the cycle in which the derivative equals $-1$ increases as $i$ increases. Thus, the pointwise limit of the function sequence is zero almost everywhere, and the pointwise limit of the derivative is $-1$ almost everywhere.

Fig. 1. A sequence of continuous functions for which both $|b(\theta)|^2$ and $|1 + b'(\theta)|^2$ tend to zero for almost every value of $\theta$.

In the specific case shown in Fig. 1, it can be shown that the value of $Z[b^{(i)}]$ does not tend to zero; in fact, $Z[b^{(i)}]$ tends to infinity in this situation. However, our example illustrates that care must be taken when applying concepts from finite-dimensional optimization problems to variational calculus. The purpose of this section is to show that $s > 0$, so that the bound is meaningful, for any problem setting satisfying the regularity conditions of Section II-B. (This question was not addressed by Young and Westerberg [5].)
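A sequence with the qualitative behavior of Fig. 1 can be built explicitly. The sketch below is one such construction (an assumption of this illustration: $\Theta = (0,1)$, a uniform prior, $J \equiv 1$, and piecewise-linear rather than smooth cycles): the functions shrink to zero and their derivative equals $-1$ on an ever larger fraction of the domain, yet $Z[b^{(i)}]$ diverges because of the steep return segments.

```python
import numpy as np

def Z_of_sawtooth(i, m=1_000_000):
    """Periodic piecewise-linear bias on Theta = (0, 1), uniform prior, J = 1.

    Each period of length 1/i descends with slope -1 on a (1 - 1/i) fraction
    of the cycle and climbs steeply back to zero on the remaining fraction."""
    eps = 1.0 / i
    T = 1.0 / i
    t = np.linspace(0.0, 1.0, m, endpoint=False)
    phase = (t * i) % 1.0                        # position within the current period
    down = phase < 1 - eps
    b = np.where(down, -phase * T, -(1 - phase) * T * (1 - eps) / eps)
    db = np.where(down, -1.0, (1 - eps) / eps)   # derivative of b
    Z = np.mean(b**2 + (1 + db)**2)              # Z[b] for J = 1 and a uniform prior
    return b, db, Z

for i in [4, 16, 64]:
    b, db, Z = Z_of_sawtooth(i)
    print(i, np.abs(b).max(), np.mean(db == -1.0), Z)
# sup|b| shrinks to 0 and b' = -1 on a fraction tending to 1, yet Z[b] grows like i
```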
While doing so, we develop some abstract concepts which will also be used when analyzing the asymptotic properties of the OBB in Section V.

As often happens with variational problems, it turns out that the minimum of (13) is not necessarily achieved by any continuously differentiable function. In order to guarantee an achievable minimum, one must instead minimize (13) over a slightly modified space, which is defined below. As explained in Section II-B, all bias functions are continuously differentiable, so that the minimizing function ultimately obtained, if it is not differentiable, will not be the bias of any estimator. However, as we will see, the minimum value of our new optimization problem is identical to the infimum of (13). Furthermore, this approach allows us to demonstrate several important theoretical properties of the OBB.

Let $L^2$ be the space of $p_\theta$-measurable functions $b : \Theta \to \mathbb{R}^n$ such that

$$\int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) < \infty. \quad (14)$$

Define the associated inner product

$$\langle b^{(1)}, b^{(2)} \rangle_{L^2} \triangleq \sum_{i=1}^n \int_\Theta b_i^{(1)}(\theta)\, b_i^{(2)}(\theta)\, p_\theta(d\theta) \quad (15)$$

and the corresponding norm $\|b\|_{L^2}^2 \triangleq \langle b, b \rangle_{L^2}$. Any function $b \in L^2$ has a derivative in the distributional sense, but this derivative might not be a function. For example, discontinuous functions have distributional derivatives which contain a Dirac delta. If, for every $i$, the distributional derivative $\partial b_i/\partial \theta$ of $b$ is a function in $L^2$, then $b$ is said to be weakly differentiable [32], and its weak derivative is the matrix function $\partial b/\partial \theta$. Roughly speaking, a function is weakly differentiable if it is continuous and its derivative exists almost everywhere. The space of all weakly differentiable functions in $L^2$ is called the first-order Sobolev space [32], and is denoted $H^1$. Define an inner product on $H^1$ as

$$\langle b^{(1)}, b^{(2)} \rangle_{H^1} \triangleq \langle b^{(1)}, b^{(2)} \rangle_{L^2} + \sum_{j=1}^n \left\langle \frac{\partial b_j^{(1)}}{\partial \theta}, \frac{\partial b_j^{(2)}}{\partial \theta} \right\rangle_{L^2}. \quad (16)$$
The associated norm is $\|b\|_{H^1}^2 \triangleq \langle b, b \rangle_{H^1}$. An important property which will be used extensively in our analysis is that $H^1$ is a Hilbert space.

Note that since $\Theta$ is an open set, not all functions in $C^1$ are in $H^1$. For example, in the case $\Theta = \mathbb{R}^n$, the function $b(\theta) = k$, for some nonzero constant $k$, is continuously differentiable but not integrable. Thus $b$ is in $C^1$ but not in $H^1$, nor even in $L^2$. However, any measurable function which is not in $H^1$ has $\|b\|_{H^1} = \infty$, meaning that either $b$ or $\partial b/\partial \theta$ has infinite $L^2$ norm. Consequently, either the bias norm part or the bias gradient part of $Z[b]$ is infinite. It follows that performing the minimization (13) over $C^1 \cap H^1$, rather than over $C^1$, does not change the minimum value. On the other hand, $C^1 \cap H^1$ is dense in $H^1$, and $Z[b]$ is continuous, so that minimizing (13) over $H^1$ rather than $C^1 \cap H^1$ also does not alter the minimum. Consequently, we will henceforth consider the problem

$$s = \inf_{b \in H^1} Z[b]. \quad (17)$$

The advantage of including weakly differentiable functions in the minimization is that a unique minimizer can now be guaranteed, as demonstrated by the following result.

Proposition 1: Consider the problem

$$\bar{b} = \arg\min_{b \in H^1} Z[b] \quad (18)$$

where $Z[b]$ is given by (12) and $J(\theta)$ is positive definite and bounded with probability 1. This problem is well-defined, i.e., there exists a unique $\bar{b} \in H^1$ which minimizes $Z[b]$. Furthermore, the minimum value $s = Z[\bar{b}]$ is finite and nonzero.

Proving the unique existence of a minimizer for (17) is a technical exercise in functional analysis which can be found in Appendix II. However, once the existence of such a minimizer is demonstrated, it is not difficult to see that $0 < s < \infty$. To see that $s < \infty$, we must find a function $b$ for which $Z[b] < \infty$. One such function is $b = 0$, for which $Z[b]$ is finite since $J(\theta)$ is bounded.
Now suppose by contradiction that $s = 0$, which implies that there exists a function $\bar{b} \in H^1$ such that $Z[\bar{b}] = 0$. Therefore, both the bias gradient and the squared bias parts of $Z[\bar{b}]$ are zero. In particular, since the squared bias part equals zero, we have $\|\bar{b}\|_{L^2} = 0$. Hence, $\bar{b} = 0$, because $L^2$ is a normed space. But then, by the definition (12) of $Z[\cdot]$,

$$Z[\bar{b}] = \int_\Theta \mathrm{Tr}(J^{-1}(\theta))\, p_\theta(d\theta), \quad (19)$$

which is positive; this is a contradiction.

Note that functions in $H^1$ are defined up to changes on a set having zero measure. In particular, the fact that $\bar{b}$ is unique does not preclude functions which are identical to $\bar{b}$ almost everywhere (which obviously have the same value $Z[b]$).

Summarizing the discussion of the last two sections, we have the following theorem.

Theorem 1: Let $\theta$ be an unknown random vector with pdf $p_\theta(\theta) > 0$ over the open set $\Theta \subseteq \mathbb{R}^n$, and let $x$ be a measurement vector whose pdf, conditioned on $\theta$, is given by $p_{x|\theta}(x|\theta)$. Assume the regularity conditions of Section II-B hold. Then, for any estimator $\hat{\theta}$,

$$E\{\|\theta - \hat{\theta}\|^2\} \ge \min_{b \in H^1} \int_\Theta \mathrm{CRB}[b, \theta]\, p_\theta(\theta)\, d\theta. \quad (20)$$

The minimum in (20) is nonzero and finite. Furthermore, this minimum is achieved by a function $\bar{b} \in H^1$, which is unique up to changes having zero probability.

Two remarks are in order concerning Theorem 1. First, the function $\bar{b}$ solving (20) might not be the bias of any estimator; indeed, under our assumptions, all bias functions are continuously differentiable, whereas $\bar{b}$ need only be weakly differentiable. Nevertheless, (20) is still a lower bound on the MMSE. Another important observation is that Theorem 1 arises from the deterministic CRB; hence, there are no requirements on the prior distribution $p_\theta(\theta)$. In particular, $p_\theta(\theta)$ can be discontinuous or have bounded support.
By contrast, many previous Bayesian bounds do not apply in such circumstances.

IV. CALCULATING THE BOUND

In finite-dimensional convex optimization problems, the requirement of a vanishing first derivative results in a set of equations, whose solution is the global minimum. Analogously, in the case of convex functional optimization problems such as (20), the optimum is given by the solution of a set of differential equations. The following theorem, whose proof can be found in Appendix III, specifies the differential equation relevant to our optimization problem. In this section and in the remainder of the paper, we will consider the case in which the set $\Theta = \{\theta : p_\theta(\theta) > 0\}$ is bounded. From a practical point of view, even when $\Theta$ consists of the entire set $\mathbb{R}^n$, it can be approximated by a bounded set containing only those values of $\theta$ for which $p_\theta(\theta) > \epsilon$.

Theorem 2: Under the conditions of Theorem 1, suppose $\Theta$ is a bounded subset of $\mathbb{R}^n$ with a smooth boundary $\Lambda$. Then, the optimal $b(\theta)$ of (20) is given by the solution to the system of partial differential equations

$$p_\theta(\theta)\, b_i(\theta) = p_\theta(\theta) \sum_{j,k} \frac{\partial^2 b_i}{\partial \theta_j \partial \theta_k} (J^{-1})_{jk} + \sum_{j,k} \left( \delta_{ik} + \frac{\partial b_i}{\partial \theta_k} \right) \left[ (J^{-1})_{jk} \frac{\partial p_\theta}{\partial \theta_j} + p_\theta(\theta) \frac{\partial (J^{-1})_{jk}}{\partial \theta_j} \right] \quad (21)$$

for $i = 1, \ldots, n$, within the range $\theta \in \Theta$, which satisfies the Neumann boundary condition

$$\left( I + \frac{\partial b}{\partial \theta} \right) J^{-1} \nu(\theta) = 0 \quad (22)$$

for all points $\theta \in \Lambda$. Here, $\nu(\theta)$ is a normal to the boundary at $\theta$. All derivatives in this system of equations are to be interpreted in the weak sense.

Note that Theorem 1 guarantees the existence of a unique solution in $H^1$ to the differential equation (21) with the boundary conditions (22). The bound of Young and Westerberg [5] is a special case of Theorem 2, and is given here for completeness.
Corollary 1: Under the settings of Theorem 1, suppose $\Theta = (\theta_0, \theta_1)$ is a bounded interval in $\mathbb{R}$. Then, the bias function $b(\theta)$ minimizing (20) is a solution to the second-order ordinary differential equation

$$J(\theta)\, b(\theta) = b''(\theta) + (1 + b'(\theta)) \left( \frac{d \log p_\theta}{d\theta} - \frac{d \log J}{d\theta} \right) \quad (23)$$

within the range $\theta \in \Theta$, subject to the boundary conditions $b'(\theta_0) = b'(\theta_1) = -1$.

Theorem 2 can be solved numerically, thus obtaining a bound for any problem satisfying the regularity conditions. However, directly solving (21) becomes increasingly complex as the dimension of the problem increases. Instead, in many cases, symmetry relations in the problem can be used to simplify the solution. As an example, the following spherically symmetric case can be reduced to a one-dimensional setting equivalent to that of Corollary 1. The proof of this theorem can be found in Appendix IV.

Theorem 3: Under the setting of Theorem 1, suppose that $\Theta = \{\theta : \|\theta\| < r\}$ is a sphere centered on the origin, $p_\theta(\theta) = q(\|\theta\|)$ is spherically symmetric, and $J(\theta) = J(\|\theta\|) I$, where $J : \mathbb{R} \to \mathbb{R}$ is a scalar function. Then, the optimal-bias bound (20) is given by

$$E\{\|\theta - \hat{\theta}\|^2\} \ge \frac{2\pi^{n/2}}{\Gamma(n/2)} \int_0^r \left[ b^2(\rho) + \frac{(1 + b'(\rho))^2}{J(\rho)} + \frac{n-1}{J(\rho)} \left( 1 + \frac{b(\rho)}{\rho} \right)^2 \right] q(\rho)\, \rho^{n-1}\, d\rho. \quad (24)$$

Here, $\Gamma(\cdot)$ is the Gamma function, and $b(\rho)$ is a solution to the ordinary differential equation

$$J(\rho)\, b(\rho) = b''(\rho) + (n-1) \left( \frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2} \right) + (1 + b'(\rho)) \left( \frac{d \log q}{d\rho} - \frac{d \log J}{d\rho} \right) \quad (25)$$

subject to the boundary conditions $b(0) = 0$, $b'(r) = -1$. The bias function for which the bound is achieved is given by

$$b(\theta) = b(\|\theta\|) \frac{\theta}{\|\theta\|}. \quad (26)$$

In this theorem, the requirement $J(\theta) = J(\|\theta\|) I$ indicates that the Fisher information matrix is diagonal and that its components are spherically symmetric. Parameters having a diagonal matrix $J$ are sometimes referred to as orthogonal.
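For a concrete instance of Corollary 1 (an assumed special case chosen for illustration: uniform prior on $(-a, a)$ and constant $J$, so that (23) reduces to $b'' = J b$ with $b'(\pm a) = -1$), the ODE can be solved by finite differences and checked against the closed-form solution $b(\theta) = -\sinh(\sqrt{J}\,\theta)/(\sqrt{J}\cosh(\sqrt{J}\,a))$ of this special case:

```python
import numpy as np

a, J = 1.0, 4.0            # Theta = (-a, a), constant Fisher information (assumed)
m = 2001
th = np.linspace(-a, a, m)
h = th[1] - th[0]

# Finite-difference discretization of (23) for a uniform prior and constant J:
#   b'' - J*b = 0 on (-a, a),  b'(-a) = b'(a) = -1.
A = np.zeros((m, m))
rhs = np.zeros(m)
for k in range(1, m - 1):
    A[k, k - 1] = A[k, k + 1] = 1.0 / h**2
    A[k, k] = -2.0 / h**2 - J
A[0, 0], A[0, 1], rhs[0] = -1.0 / h, 1.0 / h, -1.0        # forward diff: b'(-a) = -1
A[-1, -2], A[-1, -1], rhs[-1] = -1.0 / h, 1.0 / h, -1.0   # backward diff: b'(a) = -1
b = np.linalg.solve(A, rhs)

# Closed form for this special case.
sJ = np.sqrt(J)
b_exact = -np.sinh(sJ * th) / (sJ * np.cosh(sJ * a))

# Plug b into the scalar integrand of (20): CRB[b, theta] = b^2 + (1 + b')^2 / J.
db = np.gradient(b, th)
f = (b**2 + (1 + db)**2 / J) / (2 * a)           # p_theta = 1/(2a) on (-a, a)
bound = float(np.sum((f[1:] + f[:-1]) * h / 2))  # trapezoidal rule
print(np.abs(b - b_exact).max(), bound)          # bound ~ 0.129
```

For these values the resulting bound (about $0.129$) is below both the averaged unbiased CRB, $1/J = 0.25$, and the prior variance $a^2/3 \approx 0.33$, consistent with its role as a lower bound on the MMSE, which here cannot exceed either quantity.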
The simplest case of orthogonality occurs when, to each parameter $\theta_i$, there corresponds a measurement $x_i$, in such a way that the random variables $x_i | \theta$ are independent. Other orthogonal scenarios can often be constructed by an appropriate parametrization [33].

The requirement that $J$ have spherically symmetric components occurs, for example, in location problems, i.e., situations in which the measurements have the form $x = \theta + w$, where $w$ is additive noise which is independent of $\theta$. Indeed, under such conditions, $J$ is constant in $\theta$ [31, §3.1.3]. If, in addition, the noise components are independent, then this setting also satisfies the orthogonality requirement, and thus application of Theorem 3 is appropriate. Note that this estimation problem is not separable, since the components of $\theta$ are correlated; thus, the MMSE in this situation is lower than the sum of the components' MMSEs. An example of such a setting is presented in Section VI.

V. PROPERTIES

In this section, we examine several properties of the OBB. We first demonstrate that the optimal bias function has zero mean, a property which also characterizes the bias function of the MMSE estimator. Next, we prove that, under very general conditions, the resulting bound is tight at both low and high SNR values. This is an important result, since a desirable property of a Bayesian bound is that it provides an accurate estimate of the ambiguity region between high and low SNR [11]. Reliable estimation at the two extremes increases the likelihood that the transition between these two regimes will be correctly identified.

A. Optimal Bias Has Zero Mean

In any Bayesian estimation problem, the bias of the MMSE estimator $\hat{\theta}_{\mathrm{opt}} = E\{\theta|x\}$ has zero mean:

$$E\{\hat{\theta}_{\mathrm{opt}}\} = E\{E\{\theta|x\}\} = E\{\theta\} \quad (27)$$

so that

$$E\{b(\hat{\theta}_{\mathrm{opt}})\} = E\{E\{\theta|x\} - \theta\} = 0. \quad (28)$$
Thus, it is interesting to ask whether the optimal bias which minimizes (20) also has zero mean. This is indeed the case, as shown by the following theorem.

Theorem 4: Let $b(\theta)$ be the solution to (20). Then,

$$E\{b(\theta)\} = 0. \quad (29)$$

Proof: Assume by contradiction that $b(\theta)$ has nonzero mean $E\{b(\theta)\} = \mu \ne 0$. Define $b_0(\theta) \triangleq b(\theta) - \mu$. From (11), we then have

$$\mathrm{CRB}[b_0, \theta] - \mathrm{CRB}[b, \theta] = \|b_0(\theta)\|^2 - \|b(\theta)\|^2 = \|\mu\|^2 - 2\mu^T b(\theta). \quad (30)$$

Using the functional $Z[\cdot]$ defined in (12), we obtain

$$Z[b_0] - Z[b] = E\{\|\mu\|^2 - 2\mu^T b(\theta)\} = \|\mu\|^2 - 2\mu^T E\{b(\theta)\} = -\|\mu\|^2 < 0. \quad (31)$$

Thus $Z[b_0] < Z[b]$, contradicting the fact that $b(\theta)$ minimizes (20).

B. Tightness at Low SNR

Bell et al. [11] examined the performance of the extended Ziv–Zakai bound at low SNR and demonstrated that, for a particular family of distributions, the extended Ziv–Zakai bound achieves the MSE of the optimal estimator as the SNR tends to 0. We now examine the low-SNR performance of the OBB, and demonstrate tightness for a much wider range of problem settings.

Bell et al. did not define the general meaning of a low SNR value, and only stated that "[a]s observation time and/or SNR become very small, the observations become useless . . . [and] the minimum MSE estimator converges to the a priori mean." This statement clearly does not apply to all estimation problems, since it is not always clear what parameter corresponds to the observation time or the SNR. We propose to define the zero-SNR case more generally as any situation in which $J(\theta) = 0$ with probability 1. This definition implies that the measurements do not contain information about the unknown parameter, which is the usual informal meaning of zero SNR. In the case $J(\theta) = 0$, it can be shown that the MMSE estimator is the prior mean, so that our definition implies the statement of Bell et al.
The OBB is inapplicable when J(θ) = 0, since the CRB is based on the assumption that J(θ) is positive definite. To avoid this singularity, we consider a sequence of estimation settings which converge to zero SNR. More specifically, we require all eigenvalues of J(θ) to decrease monotonically to zero for p_θ-almost all θ. The following theorem, the proof of which can be found in Appendix V, demonstrates the tightness of the OBB in this low-SNR setting.

Theorem 5: Let θ be a random vector whose pdf p_θ(θ) is nonzero over an open set Θ ⊆ ℝⁿ. Let $x^{(1)}, x^{(2)}, \ldots$ be a sequence of observation vectors having finite Fisher information matrices $J^{(1)}(\theta), J^{(2)}(\theta), \ldots$, respectively. Suppose that, for all N, the matrix $J^{(N)}(\theta)$ is positive definite for p_θ-almost all θ, and that all eigenvalues of $J^{(N)}(\theta)$ decrease monotonically to zero as N → ∞ for p_θ-almost all θ. Let $\beta_N$ denote the optimal-bias bound for estimating θ from $x^{(N)}$. Then,

$$ \lim_{N \to \infty} \beta_N = E\left\{ \|\theta - E\{\theta\}\|^2 \right\}. \tag{32} $$

C. Tightness at High SNR

We now examine the performance of the OBB for high SNR values. To formally define the high-SNR regime, we consider a sequence of measurements $x^{(1)}, x^{(2)}, \ldots$ of a single parameter vector θ. It is assumed that, when conditioned on θ, all measurements $x^{(i)}$ are independent and identically distributed (IID). Furthermore, we assume that the Fisher information matrix of a single observation, J(θ), is well-defined, positive definite and finite for p_θ-almost all θ. We consider the problem of estimating θ from the set of measurements $\{x^{(1)}, \ldots, x^{(N)}\}$, for a given value of N. The high-SNR regime is obtained when N is large. When N tends to infinity, the MSE of the optimal estimator tends to zero. An important question, however, concerns the rate of convergence of the minimum MSE.
More precisely, given the optimal estimator $\hat\theta^{(N)}$ of θ from $\{x^{(1)}, \ldots, x^{(N)}\}$, one would like to determine the asymptotic distribution of $\sqrt{N}(\hat\theta^{(N)} - \theta)$, conditioned on θ. A fundamental result of asymptotic estimation theory can be loosely stated as follows [28, §III.3], [13, §6.8]. Under some fairly mild regularity conditions, the asymptotic distribution of $\sqrt{N}(\hat\theta^{(N)} - \theta)$, conditioned on θ, does not depend on the prior distribution p_θ; rather, $\sqrt{N}(\hat\theta^{(N)} - \theta) \mid \theta$ converges in distribution to a Gaussian random vector with mean zero and covariance $J^{-1}(\theta)$. It follows that

$$ \lim_{N \to \infty} N\, E\left\{ \|\hat\theta^{(N)} - \theta\|^2 \right\} = E\left\{ \mathrm{Tr}[J^{-1}(\theta)] \right\}. \tag{33} $$

Since the minimum MSE tends to zero at high SNR, any lower bound on the minimum MSE must also tend to zero as N → ∞. However, one would further expect a good lower bound to follow the behavior of (33). In other words, if $\beta_N$ represents the lower bound for estimating θ from $\{x^{(1)}, \ldots, x^{(N)}\}$, a desirable property is $N\beta_N \to E\{\mathrm{Tr}[J^{-1}(\theta)]\}$. The following theorem, whose proof is found in Appendix V, demonstrates that this is indeed the case for the OBB. Except for a very brief treatment by Bellini and Tartara [6], no previous Bayesian bound has shown such a result. Although it appears that the Ziv–Zakai and Weiss–Weinstein bounds may also satisfy this property, this has not been proven formally. It is also known that the Bayesian CRB is not asymptotically tight in this sense [34, Eqs. (37)–(39)].

Theorem 6: Let θ be a random vector whose pdf p_θ(θ) is nonzero over an open set Θ ⊆ ℝⁿ. Let $x^{(1)}, x^{(2)}, \ldots$ be a sequence of measurement vectors, such that $x^{(1)} \mid \theta,\; x^{(2)} \mid \theta, \ldots$ are IID. Let J(θ) be the Fisher information matrix for estimating θ from $x^{(1)}$, and suppose J(θ) is finite and positive definite for p_θ-almost all θ.
Let $\beta_N$ be the optimal-bias bound (20) for estimating θ from the observation sequence $\{x^{(1)}, \ldots, x^{(N)}\}$. Then,

$$ \lim_{N \to \infty} N \beta_N = E\left\{ \mathrm{Tr}(J^{-1}(\theta)) \right\}. \tag{34} $$

Note that for Theorem 6 to hold, we require only that J(θ) be finite and positive definite. By contrast, the various theorems guaranteeing asymptotic efficiency of Bayesian estimators all require substantially stronger regularity conditions [28, §III.3], [13, §6.8]. One reason for this is that asymptotic efficiency describes the behavior of θ̂ conditioned on each possible value of θ, and is thus a stronger result than the asymptotic Bayesian MSE of (33).

VI. EXAMPLE: UNIFORM PRIOR

The original bound of Young and Westerberg [5] predates most Bayesian bounds, and, surprisingly, it has never been cited by or compared with later results. In this section, we measure the performance of the original bound and of its extension to the vector case against that of various other techniques.

We consider the case in which θ is uniformly distributed over an n-dimensional open ball Θ = {θ : ‖θ‖ < r} ⊆ ℝⁿ, so that

$$ p_\theta(\theta) = \frac{1}{V_n(r)} \mathbf{1}_\Theta \tag{35} $$

where $\mathbf{1}_S$ equals 1 when θ ∈ S and 0 otherwise, and

$$ V_n(r) = \frac{\pi^{n/2} r^n}{\Gamma(1 + n/2)} \tag{36} $$

is the volume of an n-ball of radius r [35]. We further assume that

$$ x = \theta + w \tag{37} $$

where w is zero-mean Gaussian noise, independent of θ, having covariance σ²I. We are interested in lower bounds on the MSE achievable by an estimator of θ from x. We begin by developing the OBB for this setting, as well as some alternative bounds. We then compare the different approaches in a one-dimensional and a three-dimensional setting.

The Fisher information matrix for the given estimation problem is given by J(θ) = σ⁻²I, so that the conditions of Theorem 3 hold.
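As a quick sanity check of (36) (a minimal sketch; the function name is ours), the formula reproduces the familiar low-dimensional volumes $V_1(r) = 2r$, $V_2(r) = \pi r^2$, and $V_3(r) = \tfrac{4}{3}\pi r^3$:

```python
import math

def ball_volume(n, r):
    """Volume of an n-dimensional ball of radius r, as in (36)."""
    return math.pi ** (n / 2) * r ** n / math.gamma(1 + n / 2)

# Check against the familiar interval, disk, and sphere volumes at r = 2.
print(ball_volume(1, 2.0), ball_volume(2, 2.0), ball_volume(3, 2.0))
```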
It follows that the optimal bias function is given by b(θ) = b(‖θ‖) θ/‖θ‖, where b(·) is a solution to the differential equation

$$ \frac{b}{\sigma^2} = b'' + (n-1)\left( \frac{b'}{\theta} - \frac{b}{\theta^2} \right) \tag{38} $$

with boundary conditions b(0) = 0, b′(r) = −1. The general solution to this differential equation is given by

$$ b(\theta) = C_1\, \theta^{1-n/2} I_{n/2}\!\left(\frac{\theta}{\sigma}\right) + C_2\, \theta^{1-n/2} K_{n/2}\!\left(\frac{\theta}{\sigma}\right) \tag{39} $$

where $I_\alpha(z)$ and $K_\alpha(z)$ are the modified Bessel functions of the first and second kinds, respectively [36]. Since $K_\alpha(z)$ is singular at the origin, the requirement b(0) = 0 leads to C₂ = 0. Differentiating (39) with respect to θ, we obtain

$$ b'(\theta) = C_1\, \theta^{-n/2} \left[ I_{n/2}\!\left(\frac{\theta}{\sigma}\right) + \frac{\theta}{\sigma} I_{1+n/2}\!\left(\frac{\theta}{\sigma}\right) \right] \tag{40} $$

so that the requirement b′(r) = −1 leads to

$$ C_1 = -\frac{r^{n/2}}{I_{n/2}(r/\sigma) + (r/\sigma)\, I_{1+n/2}(r/\sigma)}. \tag{41} $$

Substituting this value of b(·) into (24) yields the OBB, which can be computed by evaluating a single one-dimensional integral. Alternatively, in the one-dimensional case, the integral can be computed analytically, as will be shown below.

Despite the widespread use of finite-support prior distributions [4], [10], the regularity conditions of many bounds are violated by such prior pdf functions. Indeed, the Bayesian CRB of Van Trees [3], the Bobrovski–Zakai bound [8], and the Bayesian Abel bound [12] all assume that p_θ(θ) has infinite support, and thus cannot be applied in this scenario. Techniques from the Ziv–Zakai family are applicable to constrained problems. An extension of the Ziv–Zakai bound for vector parameter estimation was developed by Bell et al. [11].
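The optimal bias (39)–(41) is straightforward to evaluate numerically. The sketch below (function name is ours; SciPy's `iv` provides $I_\alpha$) checks the boundary condition b′(r) = −1 by a central difference and, for n = 1, the reduction to a hyperbolic form via $I_{1/2}(t) = \sqrt{2/\pi}\,\sinh(t)/\sqrt{t}$:

```python
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind

def optimal_bias(theta, r, sigma, n):
    """Radial optimal bias b(theta) from (39), with C2 = 0 and C1 from (41)."""
    c1 = -r ** (n / 2) / (iv(n / 2, r / sigma)
                          + (r / sigma) * iv(1 + n / 2, r / sigma))
    return c1 * theta ** (1 - n / 2) * iv(n / 2, theta / sigma)

r, sigma = 1.0, 0.5
# n = 1: (39) collapses to -sigma * sinh(theta/sigma) / cosh(r/sigma)
th = 0.37
closed_form = -sigma * np.sinh(th / sigma) / np.cosh(r / sigma)
# boundary condition b'(r) = -1, checked by a central difference (n = 3)
h = 1e-6
deriv = (optimal_bias(r + h, r, sigma, 3)
         - optimal_bias(r - h, r, sigma, 3)) / (2 * h)
print(optimal_bias(th, r, sigma, 1), closed_form, deriv)
```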
From [11, Property 4], the MSE of the i-th component of θ is bounded by

$$ E\{(\theta_i - \hat\theta_i)^2\} \ge \int_0^\infty V\left\{ \max_{\delta : e_i^T \delta = h} A(\delta) P_{\min}(\delta) \right\} h \, dh \tag{42} $$

where $e_i$ is a unit vector in the direction of the i-th component, V{·} is the valley-filling function defined by

$$ V\{f(h)\} = \max_{\eta \ge 0} f(h + \eta), \tag{43} $$

$$ A(\delta) \triangleq \int_{\mathbb{R}^n} \min\left( p_\theta(\theta), p_\theta(\theta + \delta) \right) d\theta, \tag{44} $$

and $P_{\min}(\delta)$ is the minimum probability of error for the problem of testing hypothesis $H_0 : \theta = \theta_0$ vs. $H_1 : \theta = \theta_0 + \delta$. In the current setting, $P_{\min}(\delta)$ is given by $P_{\min}(\delta) = Q(\|\delta\|/2\sigma)$, where $Q(z) = (2\pi)^{-1/2} \int_z^\infty e^{-t^2/2}\,dt$ is the tail function of the normal distribution. Also, we have

$$ A(\delta) = \frac{VC_n(r, \|\delta\|)}{V_n(r)} \tag{45} $$

where

$$ VC_n(r, h) = \int_{\mathbb{R}^n} \mathbf{1}_\Theta\, \mathbf{1}_{\Theta + h e_1} \, d\theta \tag{46} $$

and $\Theta + h e_1 = \{\theta + h e_1 : \theta \in \Theta\}$. Thus, $VC_n(r, h)$ is the volume of the intersection of two n-balls whose centers are at a distance of h units from one another. Substituting these results into (42), we have

$$ E\{(\theta_i - \hat\theta_i)^2\} \ge \int_0^\infty V\left\{ \max_{\delta : e_i^T \delta = h} \frac{VC_n(r, \|\delta\|)}{V_n(r)}\, Q\!\left( \frac{\|\delta\|}{2\sigma} \right) \right\} h \, dh. \tag{47} $$

Note that both $VC_n(r, \|\delta\|)$ and $Q(\|\delta\|/2\sigma)$ decrease with ‖δ‖. Therefore, the maximum in (47) is obtained for $\delta = h e_i$. Also, since the argument of V{·} is monotonically decreasing, the valley-filling function has no effect and can be removed. Finally, since $VC_n(r, h) = 0$ for h > 2r, the integration can be limited to the range [0, 2r]. Thus, the extended Ziv–Zakai bound is given by

$$ E\{\|\theta - \hat\theta\|^2\} \ge \int_0^{2r} n\, \frac{VC_n(r, h)}{V_n(r)}\, Q\!\left( \frac{h}{2\sigma} \right) h \, dh. \tag{48} $$

We now compute the Weiss–Weinstein bound for the setting at hand. This bound is given by

$$ E\{\|\theta - \hat\theta\|^2\} \ge \mathrm{Tr}(H G^{-1} H^T) \tag{49} $$

where $H = [h_1, \ldots, h_m]$ is a matrix containing an arbitrary number m of test vectors, and G is a matrix whose elements are given by

$$ G_{ij} = \frac{E\{r(x, \theta; h_i, s_i)\, r(x, \theta; h_j, s_j)\}}{E\{L^{s_i}(x; \theta + h_i, \theta)\}\; E\{L^{s_j}(x; \theta + h_j, \theta)\}} \tag{50} $$

in which

$$ r(x, \theta; h_i, s_i) \triangleq L^{s_i}(x; \theta + h_i, \theta) - L^{1-s_i}(x; \theta - h_i, \theta) \tag{51} $$

and

$$ L(x; \theta_1, \theta_2) \triangleq \frac{p_\theta(\theta_1)\, p_{x|\theta}(x|\theta_1)}{p_\theta(\theta_2)\, p_{x|\theta}(x|\theta_2)}. \tag{52} $$

The vectors $h_1, \ldots, h_m$ and the scalars $s_1, \ldots, s_m$ are arbitrary, and can be optimized to maximize the bound (49). To avoid a multidimensional nonconvex optimization problem, we restrict attention to m = n, $h_i = h e_i$, and $s_i = 1/2$, as suggested by [10]. This results in a dependency on a single scalar parameter h. Under these conditions, $G_{ij}$ can be written as

$$ G_{ij} = \frac{\tilde M(h_i - h_j, -h_j) + \tilde M(h_i - h_j, h_i) - \tilde M(h_i + h_j, h_j) - \tilde M(h_i + h_j, h_i)}{M(h_i)\, M(h_j)} \tag{53} $$

where

$$ M(h) \triangleq E\left\{ L^{1/2}(x; \theta + h, \theta) \right\} \tag{54} $$

and

$$ \tilde M(h_1, h_2) \triangleq E\left\{ L^{1/2}(x; \theta + h_1, \theta)\, \mathbf{1}_{\Theta + h_2} \right\}. \tag{55} $$

Note that we have used the corrected version of the Weiss–Weinstein bound [37]. Substituting the probability distribution of x and θ into the definitions of M(h) and $\tilde M(h_1, h_2)$, we have

$$ M(h) = E\left\{ e^{-\|\theta + h - x\|^2 / 4\sigma^2}\, e^{\|\theta - x\|^2 / 4\sigma^2}\, \mathbf{1}_{\Theta + h} \right\} = \frac{VC_n(r, \|h\|)}{V_n(r)}\, e^{-\|h\|^2 / 8\sigma^2} \tag{56} $$

and, similarly,

$$ \tilde M(h_1, h_2) = \frac{e^{-\|h_1\|^2 / 8\sigma^2}}{V_n(r)} \int \mathbf{1}_\Theta\, \mathbf{1}_{\Theta + h_1}\, \mathbf{1}_{\Theta + h_2} \, d\theta. \tag{57} $$

Thus, M(h) is a function only of ‖h‖, and $\tilde M(h_1, h_2)$ is a function only of ‖h₁‖, ‖h₂‖, and ‖h₁ − h₂‖. Since $h_i = h e_i$, it follows that, for i ≠ j, the numerator of (53) vanishes. Thus, G is a diagonal matrix, whose diagonal elements equal

$$ G_{ii} = \frac{2\left[ \tilde M(0, h e_1) - \tilde M(2h e_1, h e_1) \right]}{M^2(h e_1)}. $$
(58)

The Weiss–Weinstein bound is given by substituting this result into (49) and maximizing over h, i.e.,

$$ E\{\|\theta - \hat\theta\|^2\} \ge \max_{h \in [0, 2r]} \frac{n h^2 M^2(h e_1)}{2\left[ \tilde M(0, h e_1) - \tilde M(2h e_1, h e_1) \right]}. \tag{59} $$

The value of h yielding the tightest bound can be determined by performing a grid search.

[Fig. 2. Comparison of the MSE bounds and the minimum achievable MSE in a one-dimensional setting for which θ ∼ U[−r, r] and x|θ ∼ N(θ, σ²): (a) MSE vs. SNR (dB); (b) ratio between each bound and the actual MMSE.]

To compare the OBB with the alternative approaches developed above, we first consider the one-dimensional case in which θ is uniformly distributed in the range Θ = (−r, r). Let x = θ + w be a single noisy observation, where w is zero-mean Gaussian noise, independent of θ, with variance σ². We wish to bound the MSE of an estimator of θ from x.

The optimal bias function is given by (39). Using the fact that $I_{1/2}(t) = \sqrt{2/\pi}\, \sinh(t)/\sqrt{t}$, we obtain

$$ b(\theta) = -\sigma\, \frac{\sinh(\theta/\sigma)}{\cosh(r/\sigma)} \tag{60} $$

which also follows [5] from Corollary 1. Substituting this expression into (20), we have that, for any estimator θ̂,

$$ E\{(\theta - \hat\theta)^2\} \ge \sigma^2 \left( 1 - \frac{\tanh(r/\sigma)}{r/\sigma} \right). \tag{61} $$

Apart from the reduction in computational complexity, the simplicity of (61) also emphasizes several features of the estimation problem. First, the dependence of the problem on the dimensionless quantity r/σ, rather than on r and σ separately, is clear. This is to be expected, as a change in units of measurement would multiply both r and σ by a constant. Second, the asymptotic properties demonstrated in Theorems 5 and 6 can be easily verified.
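The two asymptotic regimes of (61) are easy to check numerically (a minimal sketch; the function name is ours):

```python
import math

def obb_1d(r, sigma):
    """One-dimensional optimal-bias bound (61)."""
    z = r / sigma
    return sigma ** 2 * (1 - math.tanh(z) / z)

# r >> sigma: the bound approaches the noise variance sigma^2
print(obb_1d(1000.0, 1.0))
# sigma >> r: the bound approaches the prior variance r^2 / 3
print(obb_1d(1.0, 1000.0))
```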
For r ≫ σ, the bound converges to the noise variance σ², corresponding to an uninformative prior whose optimal estimator is θ̂ = x; whereas, for σ ≫ r, a Taylor expansion of tanh(z)/z immediately shows that the bound converges to r²/3, corresponding to the case of uninformative measurements, where the optimal estimator is θ̂ = 0. Thus, the bound (61) is tight both for very low and for very high SNR, as expected.

In the one-dimensional case, we have V₁(r) = 2r and $VC_1(r, h) = \max(2r - h, 0)$, so that the extended Ziv–Zakai bound (48) and the Weiss–Weinstein bound (59) can also be simplified somewhat. In particular, the extended Ziv–Zakai bound (48) can be written as

$$ E\{\|\theta - \hat\theta\|^2\} \ge \int_0^{2r} \left( 1 - \frac{h}{2r} \right) h\, Q\!\left( \frac{h}{2\sigma} \right) dh. \tag{62} $$

Using integration by parts, (62) becomes

$$ E\{\|\theta - \hat\theta\|^2\} \ge \frac{2r^2}{3}\, Q\!\left( \frac{r}{\sigma} \right) + \sigma^2\, \Gamma_{3/2}\!\left( \frac{r^2}{2\sigma^2} \right) - \frac{8}{3\sqrt{2\pi}}\, \frac{\sigma^3}{r}\, \Gamma_2\!\left( \frac{r^2}{2\sigma^2} \right) \tag{63} $$

where $\Gamma_a(z) = \frac{1}{\Gamma(a)} \int_0^z e^{-t} t^{a-1}\, dt$ is the incomplete Gamma function. Like the expression (61) for the OBB, this bound can be shown to converge to the noise variance σ² when r ≫ σ and to the prior variance r²/3 when σ ≫ r. However, while the convergence of the OBB to these asymptotic values has been demonstrated in general in Theorems 5 and 6, the asymptotic tightness of the Ziv–Zakai bound in the general case remains an open question.

The Weiss–Weinstein bound (59) can likewise be simplified further in the one-dimensional case, yielding

$$ E\{\|\theta - \hat\theta\|^2\} \ge \max_{h \in [0, 2r]} \frac{h^2\, e^{-h^2/4\sigma^2} \left( 1 - \frac{h}{2r} \right)^2}{2\left[ \left( 1 - \frac{h}{2r} \right) - \max\!\left( 0, 1 - \frac{h}{r} \right) e^{-h^2/2\sigma^2} \right]}. \tag{64} $$

However, calculating this bound still requires a numerical search for the optimal value of h.

These bounds are compared with the exact value of the MMSE in Fig. 2. In this figure, the SNR is defined as

$$ \mathrm{SNR(dB)} = 10 \log_{10} \frac{\mathrm{Var}(\theta)}{\mathrm{Var}(w)} = 10 \log_{10} \frac{r^2}{3\sigma^2}. $$
(65)

The MMSE was computed by Monte Carlo approximation of the error of the optimal estimator E{θ|x}, which was itself computed by numerical integration. Fig. 2(a) plots the MMSE and the values obtained by the aforementioned bounds, while Fig. 2(b) plots the ratio between each of the bounds and the actual MMSE in order to emphasize the difference in accuracy between the various bounds. As can be seen from this figure, the OBB is closer to the true MSE than all other bounds, for all tested SNR values.

[Fig. 3. Comparison of the MSE bounds and the minimum achievable MSE in a three-dimensional setting for which θ is uniformly distributed over a ball of radius r and x|θ ∼ N(θ, σ²I): (a) MSE vs. SNR (dB); (b) ratio between each bound and the optimal MSE.]

The improvements provided by the OBB continue to hold in higher dimensions as well, although in this case it is not possible to provide a closed form for any of the bounds. For example, Fig. 3 compares the aforementioned bounds with the true MMSE in the three-dimensional case. In this case, the SNR is given by

$$ \mathrm{SNR(dB)} = 10 \log_{10} \frac{\mathrm{Var}(\theta)}{\mathrm{Var}(w)} = 10 \log_{10} \frac{r^2}{5\sigma^2}. \tag{66} $$

Here, computation of the minimum MSE requires multidimensional numerical integration, and is by far more computationally complex than the calculation of the bounds. Again, it is evident from this figure that the OBB is a very tight bound in all ranges of operation, and is considerably closer to the true value than either of the alternative approaches.

VII. CONCLUSION

Although often considered distinct settings, there are insightful connections between the Bayesian and deterministic estimation problems.
One such relation is the use of the deterministic CRB in a Bayesian problem. The application of this deterministic bound to the problem of estimating the minimum Bayesian MSE results in a Bayesian bound which is provably tight at both high and low SNR values. Numerical simulation of the location estimation problem demonstrates that the technique is both simpler and tighter than alternative approaches.

ACKNOWLEDGEMENT

The authors are grateful to Dr. Volker Pohl for fruitful discussions concerning many of the mathematical aspects of the paper. The authors would also like to thank the anonymous reviewers for their many constructive comments.

APPENDIX I
SOME TECHNICAL LEMMAS

The proofs of several theorems in the paper rely on the following technical results.

Lemma 1: Consider the minimization problems

$$ M_\ell = \inf_{b \in S} Z_\ell[b], \quad \ell = 1, 2, 3 \tag{67} $$

where J(θ) is positive definite and bounded a.e. ($p_\theta$),

$$ Z_1[b] \triangleq \int_\Theta \|b(\theta)\|^2 \, p_\theta(d\theta), \qquad Z_2[b] \triangleq \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial b}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial b}{\partial \theta} \right)^T \right] p_\theta(d\theta), \qquad Z_3[b] \triangleq Z_1[b] + Z_2[b] \tag{68} $$

and S ⊂ H¹ is convex, closed, and bounded under the H¹ norm (16). Then, for each ℓ, there exists a function $b^{(0)} \in S$ such that $Z_\ell[b^{(0)}] = M_\ell$. If ℓ = 1 or ℓ = 3, then the minimizer of (67) is unique.

Note that Z₃[b] equals Z[b] of (12); the notation Z₃[b] is introduced for simplicity. Also note that, under mild regularity assumptions on J(θ), uniqueness can be demonstrated for ℓ = 2 as well, but this is not necessary for our purposes.

Proof: The space H¹ is a Cartesian product of n Sobolev spaces H¹(Θ), each of which is a separable Hilbert space [38, §3.7.1]. Therefore, H¹ is also a separable Hilbert space. It follows from the Banach–Alaoglu theorem [39, §3.17] that all bounded sequences in H¹ have weakly convergent subsequences [32, §2.18].
Recall that a sequence $f^{(1)}, f^{(2)}, \ldots \in H^1$ is said to converge weakly to $f^{(0)} \in H^1$ (denoted $f^{(i)} \rightharpoonup f^{(0)}$) if

$$ L[f^{(j)}] \to L[f^{(0)}] \tag{69} $$

for all continuous linear functionals L[·] [32, §2.9].

Given a particular value ℓ ∈ {1, 2, 3}, let $b^{(i)}$ be a sequence of functions in S such that $Z_\ell[b^{(i)}] \to M_\ell$. This is a bounded sequence since S is bounded, and therefore there exists a subsequence $b^{(i_k)}$ which converges weakly to some $b^{(\ell)}_{\mathrm{opt}} \in H^1$. Furthermore, since S is closed,² we have $b^{(\ell)}_{\mathrm{opt}} \in S$.

We will now show that $Z_\ell[b^{(\ell)}_{\mathrm{opt}}] = M_\ell$. To this end, it suffices to show that $Z_\ell[\cdot]$ is weakly lower semicontinuous, i.e., for any sequence $f^{(i)} \in H^1$ which converges weakly to $f^{(0)} \in H^1$, we must show that

$$ Z_\ell[f^{(0)}] \le \liminf_{i \to \infty} Z_\ell[f^{(i)}]. \tag{70} $$

Consider a weakly convergent sequence $f^{(j)} \rightharpoonup f^{(0)}$. Then, (69) holds for any continuous linear functional L[·]. Specifically, choose the continuous linear functional

$$ L_1[f] = \int_\Theta f^{(0)}(\theta)^T f(\theta) \, p_\theta(d\theta). \tag{71} $$

We then have

$$ Z_1[f^{(0)}] = L_1[f^{(0)}] = \lim_{j \to \infty} L_1[f^{(j)}] = \lim_{j \to \infty} \int_\Theta \sum_{i=1}^n f_i^{(0)}(\theta) f_i^{(j)}(\theta) \, p_\theta(d\theta) \le \liminf_{j \to \infty} \sqrt{ \int_\Theta \|f^{(0)}(\theta)\|^2 p_\theta(d\theta) \int_\Theta \|f^{(j)}(\theta)\|^2 p_\theta(d\theta) } = \sqrt{Z_1[f^{(0)}]}\, \liminf_{j \to \infty} \sqrt{Z_1[f^{(j)}]} \tag{72} $$

where we have used the Cauchy–Schwarz inequality. It follows that

$$ \sqrt{Z_1[f^{(0)}]} \le \liminf_{j \to \infty} \sqrt{Z_1[f^{(j)}]} \tag{73} $$

and therefore $Z_1[f^{(0)}] \le \liminf_{j \to \infty} Z_1[f^{(j)}]$, so that Z₁[·] is weakly lower semicontinuous.

Similarly, consider the continuous linear functional

$$ L_2[f] = \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial f^{(0)}}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial f}{\partial \theta} \right)^T \right] p_\theta(d\theta) \tag{74} $$

for which we have

$$ Z_2[f^{(0)}] = L_2[f^{(0)}] = \lim_{j \to \infty} L_2[f^{(j)}] = \lim_{j \to \infty} \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial f^{(0)}}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial f^{(j)}}{\partial \theta} \right)^T \right] p_\theta(d\theta). $$
(75)

² In fact, we require that S be "weakly closed" in the sense that weakly convergent sequences in S converge to an element of S. However, since S is convex, this notion is equivalent to the ordinary definition of closure [39, §3.13].

Note that, for any positive definite matrix W, Tr(AWBᵀ) is an inner product of the two matrices A and B. Therefore, by the Cauchy–Schwarz inequality,

$$ \mathrm{Tr}(A W B^T) \le \sqrt{ \mathrm{Tr}(A W A^T)\, \mathrm{Tr}(B W B^T) }. \tag{76} $$

Applying this to (75), we have

$$ Z_2[f^{(0)}] \le \liminf_{j \to \infty} \int_\Theta \sqrt{ \mathrm{Tr}\!\left[ \left( I + \frac{\partial f^{(0)}}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial f^{(0)}}{\partial \theta} \right)^T \right] } \cdot \sqrt{ \mathrm{Tr}\!\left[ \left( I + \frac{\partial f^{(j)}}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial f^{(j)}}{\partial \theta} \right)^T \right] } \; p_\theta(d\theta). \tag{77} $$

Once again using the Cauchy–Schwarz inequality results in

$$ Z_2[f^{(0)}] \le \liminf_{j \to \infty} \sqrt{ Z_2[f^{(0)}]\, Z_2[f^{(j)}] } \tag{78} $$

and therefore $Z_2[f^{(0)}] \le \liminf_{j \to \infty} Z_2[f^{(j)}]$, so that Z₂[·] is weakly lower semicontinuous. Since Z₃[f] = Z₁[f] + Z₂[f], it follows that Z₃[·] is also weakly lower semicontinuous.

Now recall that $b^{(i_k)} \rightharpoonup b^{(\ell)}_{\mathrm{opt}}$ and $Z_\ell[b^{(i_k)}] \to M_\ell$. By the definition (70) of lower semicontinuity, it follows that

$$ Z_\ell[b^{(\ell)}_{\mathrm{opt}}] \le \liminf_{k \to \infty} Z_\ell[b^{(i_k)}] = M_\ell \tag{79} $$

and since $M_\ell$ is the infimum of $Z_\ell[b]$, we obtain $Z_\ell[b^{(\ell)}_{\mathrm{opt}}] = M_\ell$. Thus $b^{(\ell)}_{\mathrm{opt}}$ is a minimizer of (67).

It remains to show that, for ℓ ∈ {1, 3}, the minimizer of (67) is unique. To this end, we first show that Z₁[·] is strictly convex. Let $b^{(0)}, b^{(1)} \in S$ be two essentially different functions, i.e.,

$$ p_\theta\left\{ \theta \in \Theta : b^{(0)}(\theta) \ne b^{(1)}(\theta) \right\} > 0. \tag{80} $$

Let $b^{(2)}(\theta) = \lambda b^{(0)}(\theta) + (1 - \lambda) b^{(1)}(\theta)$ for some 0 < λ < 1, so that $b^{(2)} \in S$ by convexity.
We then have

$$ Z_1[b^{(2)}] = \int_Q \left\| \lambda b^{(0)}(\theta) + (1-\lambda) b^{(1)}(\theta) \right\|^2 p_\theta(d\theta) + \int_{\Theta \setminus Q} \left\| \lambda b^{(0)}(\theta) + (1-\lambda) b^{(1)}(\theta) \right\|^2 p_\theta(d\theta) < \int_Q \left[ \lambda \|b^{(0)}(\theta)\|^2 + (1-\lambda) \|b^{(1)}(\theta)\|^2 \right] p_\theta(d\theta) + \int_{\Theta \setminus Q} \left[ \lambda \|b^{(0)}(\theta)\|^2 + (1-\lambda) \|b^{(1)}(\theta)\|^2 \right] p_\theta(d\theta) = \lambda Z_1[b^{(0)}] + (1-\lambda) Z_1[b^{(1)}] \tag{81} $$

where Q denotes the set $\{\theta \in \Theta : b^{(0)}(\theta) \ne b^{(1)}(\theta)\}$ of (80), and the strict inequality follows from strict convexity of the squared Euclidean norm ‖x‖² on Q, which has positive measure. Thus Z₁[·] is strictly convex, and hence has a unique minimum.

Note that Z₃[b] = Z₁[b] + Z₂[b]. Since Z₁[·] is strictly convex and Z₂[·] is convex, it follows that Z₃[·] is strictly convex, and thus also has a unique minimum. This completes the proof. ∎

The following lemma can be thought of as a triangle inequality for a normed space of matrix functions over Θ.

Lemma 2: Let p_θ be a probability measure over Θ, and let M : Θ → ℝ^{n×n} be a matrix function. Suppose

$$ \int_\Theta \|I + M(\theta)\|_F^2 \, p_\theta(d\theta) \le \alpha \tag{82} $$

for some constant α. It follows that

$$ \int_\Theta \|M(\theta)\|_F^2 \, p_\theta(d\theta) \le \left( \sqrt{\alpha} + \sqrt{n} \right)^2. \tag{83} $$

Proof: By the triangle inequality,

$$ \|M(\theta)\|_F = \|M(\theta) + I - I\|_F \le \|M(\theta) + I\|_F + \|I\|_F. \tag{84} $$

Since $\|I\|_F^2 = n$, we have

$$ \int_\Theta \|M(\theta)\|_F^2 \, p_\theta(d\theta) \le \int_\Theta \left[ \|I + M(\theta)\|_F^2 + n + 2\sqrt{n}\, \|I + M(\theta)\|_F \right] p_\theta(d\theta). \tag{85} $$

Using the fact that

$$ \int_\Theta \|I + M(\theta)\|_F \, p_\theta(d\theta) \le \sqrt{ \int_\Theta \|I + M(\theta)\|_F^2 \, p_\theta(d\theta) } \tag{86} $$

and combining with (82), it follows that

$$ \int_\Theta \|M(\theta)\|_F^2 \, p_\theta(d\theta) \le \alpha + n + 2\sqrt{n\alpha} \tag{87} $$

which completes the proof. ∎

APPENDIX II
PROOF OF PROPOSITION 1

The following proof of Proposition 1 makes use of the results developed in Appendix I.

Proof: [Proof of Proposition 1] Recall that Z₃[b] of (68) equals Z[b]. Thus, we would like to apply Lemma 1 (with ℓ = 3) to prove the unique existence of a minimizer of (17).
However, Lemma 1 requires that the minimization be performed over a closed, bounded, and convex set S, whereas (17) is performed over the unbounded set H¹. To resolve this issue, we must show that the minimization (17) can be reformulated as a minimization over a closed, bounded, and convex set S. To this end, note that

$$ Z[\mathbf{0}] = \int_\Theta \mathrm{Tr}(J^{-1}(\theta)) \, p_\theta(d\theta) \triangleq U \tag{88} $$

and therefore M ≤ U < ∞. Thus, it suffices to perform the minimization (17) over those functions for which Z[b] ≤ U. We now show that this can be achieved by minimizing over a closed, bounded, and convex set S.

First, note that $Z[b] \ge \|b\|_{L^2}^2$, so that one may choose to minimize (17) only over functions b for which

$$ \|b\|_{L^2}^2 \le U. \tag{89} $$

Similarly, we have

$$ Z[b] \ge \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial b}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial b}{\partial \theta} \right)^T \right] p_\theta(d\theta) \tag{90} $$

so that it suffices to minimize (17) over functions b for which

$$ \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial b}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial b}{\partial \theta} \right)^T \right] p_\theta(d\theta) \le U. \tag{91} $$

Note that J(θ) is bounded a.e., and therefore $\lambda_{\min}(J^{-1}) \ge 1/K$ a.e., for some constant K. It follows that

$$ \mathrm{Tr}\!\left[ \left( I + \frac{\partial b}{\partial \theta} \right) J^{-1}(\theta) \left( I + \frac{\partial b}{\partial \theta} \right)^T \right] \ge \frac{1}{K} \left\| I + \frac{\partial b}{\partial \theta} \right\|_F^2 \quad \text{a.e. } (p_\theta). \tag{92} $$

Combining with (91) yields

$$ \int_\Theta \left\| I + \frac{\partial b}{\partial \theta} \right\|_F^2 p_\theta(d\theta) \le K U. \tag{93} $$

From Lemma 2, we then have

$$ \int_\Theta \left\| \frac{\partial b}{\partial \theta} \right\|_F^2 p_\theta(d\theta) \le \left( \sqrt{n} + \sqrt{K U} \right)^2. \tag{94} $$

From (89) and (94), it follows that the minimization (17) can be limited to the closed, bounded, convex set

$$ S = \left\{ b \in H^1 : \|b\|_{H^1}^2 \le U + \left( \sqrt{K U} + \sqrt{n} \right)^2 \right\}. \tag{95} $$

Applying Lemma 1 proves the unique existence of a minimizer of (17). The proof that 0 < s < ∞ appears immediately after the statement of Proposition 1. ∎

APPENDIX III
PROOF OF THEOREM 2

The following is the proof of Theorem 2 concerning the calculation of the OBB.
Proof: [Proof of Theorem 2] Consider the more general problem of minimizing the functional

$$ Z[b] = \int_\Theta F[b, \theta] \, d\theta \tag{96} $$

where F[b, θ] is smooth and convex in b : Θ → ℝⁿ, and Θ ⊂ ℝⁿ is a bounded set with a smooth boundary Λ. Then, Z[b] is also smooth and convex in b, so that b is a global minimum of Z[b] if and only if the differential δZ[h] equals zero at b for all admissible functions h : Θ → ℝⁿ [40]. By a standard technique [40, §35], it can be shown that

$$ \delta Z[h] = \epsilon \sum_i \int_\Theta \left( \frac{\partial F}{\partial b_i} - \sum_j \frac{\partial}{\partial \theta_j} \frac{\partial F}{\partial b_i^{(j)}} \right) h_i(\theta) \, d\theta + \epsilon \sum_i \int_\Lambda \left( \frac{\partial F}{\partial b_i^{(1)}}, \ldots, \frac{\partial F}{\partial b_i^{(n)}} \right)^T \nu(\theta)\, h_i(\theta) \, d\sigma \tag{97} $$

where ε is an infinitesimal quantity, $b_i^{(j)} = \partial b_i / \partial \theta_j$, and ν(θ) is an outward-pointing normal at the boundary point θ ∈ Λ.

We now seek conditions for which δZ[h] = 0 for all h(θ). Consider first functions h(θ) which equal zero on the boundary Λ. In this case, the second integral vanishes, and we obtain the Euler–Lagrange equations

$$ \forall i, \quad \frac{\partial F}{\partial b_i} - \sum_j \frac{\partial}{\partial \theta_j} \frac{\partial F}{\partial b_i^{(j)}} = 0. \tag{98} $$

Substituting this result back into (97), and again using the fact that δZ[h] = 0 for all h, we obtain the boundary condition

$$ \forall i, \; \forall \theta \in \Lambda, \quad \left( \frac{\partial F}{\partial b_i^{(1)}}, \ldots, \frac{\partial F}{\partial b_i^{(n)}} \right)^T \nu(\theta) = 0. \tag{99} $$

Plugging $F[b, \theta] = \mathrm{CRB}[b, \theta]\, p_\theta(\theta)$ into (98) and (99) provides the required result. ∎

APPENDIX IV
PROOF OF THEOREM 3

Before proving Theorem 3, we provide the following two lemmas, which demonstrate some symmetry properties of the CRB.

Lemma 3: Under the conditions of Theorem 3, the functional Z[b] of (12) is rotation and reflection invariant, i.e., Z[b] = Z[Ub] for any unitary matrix U.

Proof: We first demonstrate that Z[b] is rotation invariant.
From the definitions of Z[b] and CRB[b, θ], we have

$$ Z[b] = \int_\Theta \mathrm{Tr}\!\left[ \left( I + \frac{\partial b}{\partial \theta} \right) \left( I + \frac{\partial b}{\partial \theta} \right)^T \right] \frac{q(\|\theta\|)}{J(\|\theta\|)} \, d\theta + \int_\Theta \|b(\theta)\|^2\, q(\|\theta\|) \, d\theta. \tag{100} $$

The second integral is clearly rotation invariant, since a rotation of b does not alter its norm. It remains to show that the first integral, which we denote by I₁[b], does not change when b is rotated. To this end, we begin by considering a rotation about the first two coordinates, such that b is transformed to $\tilde b \triangleq R_\phi b$, where the rotation matrix $R_\phi$ is defined such that

$$ R_\phi b = \left( b_1 \cos\phi + b_2 \sin\phi,\; -b_1 \sin\phi + b_2 \cos\phi,\; b_3, \ldots, b_n \right)^T. \tag{101} $$

We must thus show that $I_1[b] = I_1[\tilde b]$. Let us perform the change of variables $\theta \mapsto \tilde\theta$, where $\tilde\theta = R_{-\phi}\, \theta$. Rewriting the trace in (100) as a sum, we have

$$ I_1[\tilde b] = \int_\Theta \sum_{i,j} \left( \delta_{ij} + \frac{\partial \tilde b_i}{\partial \theta_j} \right)^2 \frac{q(\|\tilde\theta\|)}{J(\|\tilde\theta\|)} \, d\tilde\theta \tag{102} $$

where we have used the facts that ‖θ‖ = ‖θ̃‖ and that Θ does not change under the change of variables.

We now demonstrate some properties of the transformation of b and θ. First, we have, for any j,

$$ \left( \frac{\partial \tilde b_1}{\partial \theta_j} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \theta_j} \right)^2 = \left( \frac{\partial b_1}{\partial \theta_j} \cos\phi + \frac{\partial b_2}{\partial \theta_j} \sin\phi \right)^2 + \left( -\frac{\partial b_1}{\partial \theta_j} \sin\phi + \frac{\partial b_2}{\partial \theta_j} \cos\phi \right)^2 = \left( \frac{\partial b_1}{\partial \theta_j} \right)^2 + \left( \frac{\partial b_2}{\partial \theta_j} \right)^2. \tag{103} $$

Also, for any i,

$$ \left( \frac{\partial b_i}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \tilde\theta_2} \right)^2 = \left( \frac{\partial b_i}{\partial \theta_1} \frac{\partial \theta_1}{\partial \tilde\theta_1} + \frac{\partial b_i}{\partial \theta_2} \frac{\partial \theta_2}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \theta_1} \frac{\partial \theta_1}{\partial \tilde\theta_2} + \frac{\partial b_i}{\partial \theta_2} \frac{\partial \theta_2}{\partial \tilde\theta_2} \right)^2 = \left( \frac{\partial b_i}{\partial \theta_1} \right)^2 + \left( \frac{\partial b_i}{\partial \theta_2} \right)^2 \tag{104} $$

where we used the fact that $\theta = R_\phi \tilde\theta$. Third, we have

$$ \frac{\partial \tilde b_1}{\partial \theta_1} = \frac{\partial b_1}{\partial \tilde\theta_1} \cos^2\phi + \frac{\partial b_1}{\partial \tilde\theta_2} \sin\phi\cos\phi + \frac{\partial b_2}{\partial \tilde\theta_1} \sin\phi\cos\phi + \frac{\partial b_2}{\partial \tilde\theta_2} \sin^2\phi, $$
$$ \frac{\partial \tilde b_2}{\partial \theta_2} = \frac{\partial b_1}{\partial \tilde\theta_1} \sin^2\phi - \frac{\partial b_1}{\partial \tilde\theta_2} \sin\phi\cos\phi - \frac{\partial b_2}{\partial \tilde\theta_1} \sin\phi\cos\phi + \frac{\partial b_2}{\partial \tilde\theta_2} \cos^2\phi, \tag{105} $$

so that

$$ \frac{\partial \tilde b_1}{\partial \theta_1} + \frac{\partial \tilde b_2}{\partial \theta_2} = \frac{\partial b_1}{\partial \tilde\theta_1} + \frac{\partial b_2}{\partial \tilde\theta_2}. \tag{106} $$

We now show that

$$ \sum_{i,j} \left( \delta_{ij} + \frac{\partial \tilde b_i}{\partial \theta_j} \right)^2 = \sum_{i,j} \left( \delta_{ij} + \frac{\partial b_i}{\partial \tilde\theta_j} \right)^2. $$
(107)

For terms with i, j ≥ 3, we have $\tilde b_i = b_i$ and $\tilde\theta_j = \theta_j$, so that replacing b̃ with b and θ with θ̃ does not change the result. The terms with i = 1, 2 and j ≥ 3 do not change because of (103), while the terms with i ≥ 3 and j = 1, 2 do not change because of (104). It remains to show that the terms i, j = 1, 2 do not modify the sum. To this end, we write out these four terms as

$$ \left( 1 + \frac{\partial \tilde b_1}{\partial \theta_1} \right)^2 + \left( 1 + \frac{\partial \tilde b_2}{\partial \theta_2} \right)^2 + \left( \frac{\partial \tilde b_1}{\partial \theta_2} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \theta_1} \right)^2 = 2 + 2\frac{\partial \tilde b_1}{\partial \theta_1} + 2\frac{\partial \tilde b_2}{\partial \theta_2} + \left( \frac{\partial \tilde b_1}{\partial \theta_1} \right)^2 + \left( \frac{\partial \tilde b_1}{\partial \theta_2} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \theta_1} \right)^2 + \left( \frac{\partial \tilde b_2}{\partial \theta_2} \right)^2 $$
$$ = 2 + 2\frac{\partial b_1}{\partial \tilde\theta_1} + 2\frac{\partial b_2}{\partial \tilde\theta_2} + \left( \frac{\partial b_1}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_1}{\partial \tilde\theta_2} \right)^2 + \left( \frac{\partial b_2}{\partial \tilde\theta_1} \right)^2 + \left( \frac{\partial b_2}{\partial \tilde\theta_2} \right)^2 = \left( 1 + \frac{\partial b_1}{\partial \tilde\theta_1} \right)^2 + \left( 1 + \frac{\partial b_2}{\partial \tilde\theta_2} \right)^2 + \left( \frac{\partial b_1}{\partial \tilde\theta_2} \right)^2 + \left( \frac{\partial b_2}{\partial \tilde\theta_1} \right)^2 \tag{108} $$

where, in the second transition, we have used (103), (104), and (106). It follows that $I_1[\tilde b]$ of (102) is equal to $I_1[b]$, and hence $Z[b] = Z[\tilde b]$. The result similarly holds for rotations about any other two coordinates. Since any rotation can be decomposed into a sequence of two-coordinate rotations, we conclude that Z[b] is rotation invariant.

Next, we prove that Z[b] is invariant to reflections through hyperplanes containing the origin. Since Z[b] is invariant to rotations, it suffices to choose a single hyperplane, say {θ : θ₁ = 0}. Let

$$ \tilde b \triangleq \left( -b_1(\theta), b_2(\theta), \ldots, b_n(\theta) \right)^T \tag{109} $$

be the reflection of b, and consider the corresponding change of variables

$$ \tilde\theta \triangleq \left( -\theta_1, \theta_2, \ldots, \theta_n \right)^T. \tag{110} $$

By the symmetry assumptions, p_θ and J are unaffected by the change of variables; furthermore, $\partial \tilde b / \partial \tilde\theta = \partial b / \partial \theta$. It follows that $\mathrm{CRB}[\tilde b, \tilde\theta] = \mathrm{CRB}[b, \theta]$, and therefore $Z[b] = Z[\tilde b]$. ∎

Lemma 4: Suppose b(θ) is radial and rotation invariant, i.e., b(θ) = t(‖θ‖²)θ for some function t ∈ H¹.
Also suppose that $J(\theta) = J(\|\theta\|)I$, where $J(\cdot)$ is a scalar function. Then, $\mathrm{CRB}[b,\theta]$ of (11) is rotation invariant in $\theta$, i.e., $\mathrm{CRB}[b,R\theta] = \mathrm{CRB}[b,\theta]$ for any rotation matrix $R$.

Proof: We will show that $\mathrm{CRB}[b,\theta]$ depends on $\theta$ only through $\|\theta\|^2$, and is therefore rotation invariant. For the given values of $b(\theta)$ and $J(\theta)$, we have
$$\mathrm{CRB}[b,\theta] = \|b(\theta)\|^2 + \mathrm{Tr}\!\left[\left(I + \frac{\partial b}{\partial\theta}\right)J^{-1}(\theta)\left(I + \frac{\partial b}{\partial\theta}\right)^T\right] = t^2\|\theta\|^2 + \frac{1}{J(\|\theta\|)}\,\mathrm{Tr}\!\left[\left(I + \frac{\partial (t\theta)}{\partial\theta}\right)\left(I + \frac{\partial (t\theta)}{\partial\theta}\right)^T\right] \qquad (111)$$
where, for notational convenience, we have omitted the dependence of $t$ on $\|\theta\|^2$. It remains to show that the trace in the above expression is a function of $\theta$ only through $\|\theta\|^2$. To this end, we note that
$$\frac{\partial b_i}{\partial\theta_j} = t\delta_{ij} + t'\theta_i\frac{\partial\|\theta\|^2}{\partial\theta_j} = t\delta_{ij} + 2t'\theta_i\theta_j \qquad (112)$$
where $\delta_{ij}$ is the Kronecker delta. It follows that
$$\left(\delta_{ij} + \frac{\partial b_i}{\partial\theta_j}\right)^2 = (1+t)^2\delta_{ij} + 4(1+t)t'\theta_i\theta_j\delta_{ij} + 4t'^2\theta_i^2\theta_j^2. \qquad (113)$$
Therefore,
$$\mathrm{Tr}\!\left[\left(I + \frac{\partial b}{\partial\theta}\right)\left(I + \frac{\partial b}{\partial\theta}\right)^T\right] = \sum_{i,j}\left(\delta_{ij} + \frac{\partial b_i}{\partial\theta_j}\right)^2 = n(1+t)^2 + 4t'^2\sum_{i,j}\theta_i^2\theta_j^2 + 4(1+t)t'\sum_i\theta_i^2 = n(1+t)^2 + 4t'^2\|\theta\|^4 + 4(1+t)t'\|\theta\|^2. \qquad (114)$$
Thus, $\mathrm{CRB}[b,\theta]$ depends on $\theta$ only through $\|\theta\|^2$, completing the proof.

Proof of Theorem 3: We have seen in Theorem 2 that the solution of (20) is unique. Now suppose that the optimal $b$ is not rotation invariant, i.e., there exists a rotation matrix $R$ such that $Rb(\theta)$ is not identical to $b(\theta)$. By Lemma 3, $Rb(\theta)$ is also optimal, which is a contradiction. Furthermore, suppose that $b$ is not radial, i.e., for some value of $\theta$, $b(\theta)$ contains a component perpendicular to the vector $\theta$. Consider a hyperplane passing through the origin whose normal is the aforementioned perpendicular component. By Lemma 3, the reflection of $b$ through this hyperplane is also an optimal solution of (20), which is again a contradiction.
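As an aside (not part of the original proof), the trace identity (114) can be checked symbolically. The sketch below assumes an arbitrary choice of dimension $n=3$ and profile $t(u) = 1/(1+u)$; the identity should hold for any such choice.

```python
import sympy as sp

# Symbolic sanity check of the trace identity (114) for a radial bias
# b(theta) = t(||theta||^2) * theta, with n = 3 and t(u) = 1/(1+u) chosen
# arbitrarily for illustration.
th = sp.Matrix(sp.symbols('th1 th2 th3', real=True))
u = (th.T * th)[0, 0]                    # u = ||theta||^2
w = sp.Symbol('w', nonnegative=True)     # placeholder variable for ||theta||^2
t_profile = 1 / (1 + w)                  # arbitrary profile t(.)

b = t_profile.subs(w, u) * th            # b(theta) = t(||theta||^2) theta
J = b.jacobian(th)                       # partial b / partial theta
lhs = ((sp.eye(3) + J) * (sp.eye(3) + J).T).trace()

t = t_profile.subs(w, u)
tp = sp.diff(t_profile, w).subs(w, u)    # t'(||theta||^2)
rhs = 3*(1 + t)**2 + 4*tp**2*u**2 + 4*(1 + t)*tp*u   # right-hand side of (114)
print(sp.simplify(lhs - rhs))            # 0
```

The difference simplifies to zero, confirming that the trace depends on $\theta$ only through $\|\theta\|^2$ for this choice of $t$.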
Therefore, the optimal $b$ is spherically symmetric and radial, so that it can be written as
$$b(\theta) = b(\|\theta\|)\frac{\theta}{\|\theta\|} \qquad (115)$$
where $b(\cdot)$ is a scalar function. To determine the value of $b(\cdot)$, it suffices to analyze the differential equation (21) along a straight line from the origin to the boundary. We choose a line along the $\theta_1$ axis, and begin by calculating the derivatives of $b_1(\theta)$, $q(\|\theta\|)$, and $J(\|\theta\|)$ along this axis. The derivative of $q(\|\theta\|)$ is given by
$$\frac{\partial q}{\partial\theta_j} = q'(\rho)\frac{\theta_j}{\rho} \qquad (116)$$
where we have denoted $\rho = \|\theta\|$, so that $\rho$ is weakly differentiable and
$$\frac{\partial\rho}{\partial\theta_j} = \frac{\theta_j}{\rho}. \qquad (117)$$
Along the $\theta_1$ axis, we have $\theta_1 = \rho$ while $\theta_2 = \cdots = \theta_n = 0$, so that
$$\left.\frac{\partial q}{\partial\theta_j}\right|_{\theta = \rho e_1} = q'(\rho)\delta_{j1}. \qquad (118)$$
Similarly, since $J(\theta) = J(\rho)I$,
$$\frac{\partial (J^{-1})_{jk}}{\partial\theta_j} = -\frac{J'(\rho)}{J^2(\rho)}\frac{\theta_j}{\rho}\delta_{jk} \qquad (119)$$
so that along the $\theta_1$ axis
$$\left.\frac{\partial (J^{-1})_{jk}}{\partial\theta_j}\right|_{\theta = \rho e_1} = -\frac{J'(\rho)}{J^2(\rho)}\delta_{jk}\delta_{j1}. \qquad (120)$$
From (115), we have
$$\frac{\partial b_i}{\partial\theta_j} = b'(\rho)\frac{\theta_i\theta_j}{\rho^2} + \frac{b(\rho)}{\rho}\left(\delta_{ij} - \frac{\theta_i\theta_j}{\rho^2}\right). \qquad (121)$$
Thus, on the $\theta_1$ axis, we have
$$\left.\frac{\partial b_1}{\partial\theta_j}\right|_{\theta = \rho e_1} = b'(\rho)\delta_{j1}. \qquad (122)$$
The second derivative of $b_i(\theta)$ can be shown to equal
$$\frac{\partial^2 b_i}{\partial\theta_j\partial\theta_k} = b''(\rho)\frac{\theta_i\theta_j\theta_k}{\rho^3} + \left(\frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2}\right)\left(\frac{\theta_i}{\rho}\delta_{jk} + \frac{\theta_j}{\rho}\delta_{ik} + \frac{\theta_k}{\rho}\delta_{ij} - 3\frac{\theta_i\theta_j\theta_k}{\rho^3}\right). \qquad (123)$$
Therefore, on the $\theta_1$ axis,
$$\left.\frac{\partial^2 b_1}{\partial\theta_1^2}\right|_{\theta=\rho e_1} = b''(\rho), \qquad \left.\frac{\partial^2 b_1}{\partial\theta_j^2}\right|_{\theta=\rho e_1} = \frac{b'(\rho)}{\rho} - \frac{b(\rho)}{\rho^2} \quad (j \ne 1), \qquad \left.\frac{\partial^2 b_1}{\partial\theta_j\partial\theta_k}\right|_{\theta=\rho e_1} = 0 \quad (j,k \ne 1). \qquad (124)$$
Substituting these derivatives into (21), we obtain
$$q(\rho)b(\rho) = \frac{q(\rho)}{J(\rho)}\left(b''(\rho) + (n-1)\frac{b'(\rho)}{\rho} - (n-1)\frac{b(\rho)}{\rho^2}\right) + (1 + b'(\rho))\,\frac{q'(\rho)J(\rho) - q(\rho)J'(\rho)}{J^2(\rho)} \qquad (125)$$
which is equivalent to (25). To obtain the boundary conditions, observe that Lemma 3 implies $b(\mathbf{0}) = \mathbf{0}$, whence we conclude that $b(0) = 0$.
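The axis derivatives (122) and (124) can be verified symbolically for a concrete profile. The sketch below (added here; not part of the original derivation) assumes an arbitrary choice $b(\rho) = \sin\rho$ and $n = 3$.

```python
import sympy as sp

# Check (122) and (124) for the radial bias b(theta) = b(rho) theta / rho,
# with the arbitrary concrete profile b(rho) = sin(rho), in dimension n = 3.
th1, th2, th3 = sp.symbols('th1 th2 th3', real=True)
r = sp.Symbol('r', positive=True)
rho = sp.sqrt(th1**2 + th2**2 + th3**2)
b1 = sp.sin(rho) * th1 / rho             # first component of b(theta)

on_axis = {th1: r, th2: 0, th3: 0}       # evaluate at theta = r e_1

d1  = sp.diff(b1, th1).subs(on_axis)         # expect b'(r) = cos(r)      [eq. (122)]
d11 = sp.diff(b1, th1, 2).subs(on_axis)      # expect b''(r) = -sin(r)    [eq. (124)]
d22 = sp.diff(b1, th2, 2).subs(on_axis)      # expect b'(r)/r - b(r)/r^2  [eq. (124)]
d23 = sp.diff(b1, th2, th3).subs(on_axis)    # mixed term, expect 0       [eq. (124)]

checks = [sp.simplify(d1 - sp.cos(r)),
          sp.simplify(d11 + sp.sin(r)),
          sp.simplify(d22 - (sp.cos(r)/r - sp.sin(r)/r**2)),
          sp.simplify(d23)]
print(checks)   # [0, 0, 0, 0]
```

All four differences simplify to zero, matching the stated axis formulas.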
Next, evaluate the boundary condition (22) at the boundary point $\theta = r e_1$, where the surface normal $\nu(\theta)$ equals $e_1$, so that
$$1 + b'(\rho) = 1 + \frac{\partial b_1}{\partial\theta_1} = 0, \qquad \theta = r e_1 \qquad (126)$$
which is equivalent to the boundary condition $b'(r) = -1$.

To find the OBB (24), we must now calculate $Z[b]$ for the obtained bias function (115). To this end, note that, by Lemma 4, $\mathrm{CRB}[b,\theta]$ is rotation invariant in $\theta$ for the required $b(\theta)$. Thus, the integrand $\mathrm{CRB}[b,\theta]\,q(\|\theta\|)$ is constant on any $(n-1)$-sphere centered on the origin, so that
$$Z[b] = \int_0^r \mathrm{CRB}[b,\rho e_1]\, q(\rho)\, S_n(\rho)\, d\rho \qquad (127)$$
where
$$S_n(\rho) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\rho^{n-1} \qquad (128)$$
is the hypersurface area of an $(n-1)$-sphere of radius $\rho$ [35]. It thus suffices to calculate the value of $\mathrm{CRB}[b,\theta]$ at points along the $\theta_1$ axis. From (121), it follows that
$$\left.\frac{\partial b}{\partial\theta}\right|_{\theta=\rho e_1} = \mathrm{diag}\!\left(b'(\rho), \frac{b(\rho)}{\rho}, \ldots, \frac{b(\rho)}{\rho}\right). \qquad (129)$$
Substituting this into the definition of $\mathrm{CRB}[b,\theta]$, we obtain
$$\mathrm{CRB}[b,\rho e_1] = b^2(\rho) + \frac{1}{J(\rho)}(1 + b'(\rho))^2 + \frac{n-1}{J(\rho)}\left(1 + \frac{b(\rho)}{\rho}\right)^2. \qquad (130)$$
Combining (130) with (127) yields (24), as required.

APPENDIX V
PROOFS OF ASYMPTOTIC PROPERTIES

Theorems 5 and 6 demonstrate asymptotic tightness of the OBB. The proofs of these two theorems follow.

Proof of Theorem 5: We begin the proof by studying a certain optimization problem, whose relevance will be demonstrated shortly. Let $t \ge 0$ be a constant and consider the problem
$$u(t) = \inf_{b \in H^1} \int_\Theta \left\|I + \frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \quad \text{s.t.} \quad \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) \le t. \qquad (131)$$
Notice that $u(t) \le n$ for all $t$, since an objective value of $n$ is achieved by the function $b(\theta) = \mathbf{0}$. Thus, it suffices to perform the minimization (131) over functions $b \in H^1$ satisfying
$$\int_\Theta \left\|I + \frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le n. \qquad (132)$$
It follows from Lemma 2 that such functions also satisfy
$$\int_\Theta \left\|\frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le (2\sqrt{n})^2 = 4n. \qquad (133)$$
Therefore, (131) is equivalent to the minimization
$$u(t) = \inf_{b \in S_t} \int_\Theta \left\|I + \frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \qquad (134)$$
where
$$S_t = \left\{b \in H^1 : \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) \le t,\ \int_\Theta \left\|\frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le 4n\right\}. \qquad (135)$$
The set $S_t$ is convex, closed, and bounded in $H^1$. Applying Lemma 1 (with $\ell = 2$) implies that there exists a function $b_{\mathrm{opt}} \in S_t$ which minimizes (134), and hence also minimizes (131). Note that the objective in (131) is zero if and only if
$$\frac{\partial b_{\mathrm{opt}}}{\partial\theta} = -I \quad \text{a.e. } (p_\theta). \qquad (136)$$
The only functions in $H^1$ satisfying this requirement are the functions
$$b(\theta) = k - \theta \quad \text{a.e. } (p_\theta) \qquad (137)$$
for some constant $k \in \mathbb{R}^n$. Let $\mu \triangleq E\{\theta\}$ and define
$$v \triangleq E\left\{\|\theta - E\{\theta\}\|^2\right\}. \qquad (138)$$
For functions of the form (137), the constraint of (131) is given by
$$\int_\Theta \|k - \theta\|^2\, p_\theta(d\theta) = \int_\Theta \|k - \mu + \mu - \theta\|^2\, p_\theta(d\theta) = \|k - \mu\|^2 + v \ge v. \qquad (139)$$
In (139), equality is obtained if and only if $k = \mu$. Therefore, if $t < v$, no functions satisfying (136) are feasible, and thus
$$u(t) = 0 \ \text{ if } t \ge v, \qquad u(t) > 0 \ \text{ if } t < v. \qquad (140)$$
We now return to the setting of Theorem 5. We must show that $\beta_N \to v$ as $N \to \infty$. We denote functions corresponding to the problem of estimating $\theta$ from $x^{(N)}$ with a superscript $(N)$. Thus, for example, $Z^{(N)}[b]$ denotes the functional $Z[b]$ of (12) for the problem corresponding to the measurement vector $x^{(N)}$. Since all eigenvalues of $J^{(N)}(\theta)$ decrease monotonically with $N$ for $p_\theta$-almost all $\theta$, we have
$$\mathrm{CRB}^{(N)}[b,\theta] \le \mathrm{CRB}^{(N+1)}[b,\theta] \qquad (141)$$
for any $b \in H^1$, for $p_\theta$-almost all $\theta$, and for all $N$. Therefore,
$$Z^{(N)}[b] \le Z^{(N+1)}[b] \qquad (142)$$
for any $b \in H^1$ and for all $N$. It follows that, for all $N$,
$$\beta_N = \min_{b \in H^1} Z^{(N)}[b] \le \min_{b \in H^1} Z^{(N+1)}[b] = \beta_{N+1} \qquad (143)$$
so that $\beta_N$ is a non-decreasing sequence.
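The decomposition (139) is an exact algebraic identity, and it can be illustrated numerically. The sketch below (an aside, not from the paper) uses an arbitrary Gaussian stand-in for the prior $p_\theta$ and an arbitrary constant $k$; the identity holds exactly for the empirical distribution of the samples, so only floating-point error remains.

```python
import numpy as np

# Empirical check of (139): E||k - theta||^2 = ||k - mu||^2 + v, where
# mu = E{theta} and v = E||theta - mu||^2 as in (138). Prior and k are
# arbitrary illustrative choices.
rng = np.random.default_rng(0)
theta = rng.normal(loc=[1.0, -1.0, 0.0], scale=[1.0, 2.0, 0.5],
                   size=(200_000, 3))
k = np.array([0.5, 0.3, -0.2])

mu = theta.mean(axis=0)                        # empirical mean
v = ((theta - mu) ** 2).sum(axis=1).mean()     # empirical v of (138)

lhs = ((k - theta) ** 2).sum(axis=1).mean()    # E||k - theta||^2
rhs = ((k - mu) ** 2).sum() + v                # ||k - mu||^2 + v
print(abs(lhs - rhs) < 1e-8)                   # True
```

In particular, the left-hand side is minimized over $k$ exactly at $k = \mu$, which is the equality case noted after (139).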
Furthermore, note that
$$Z^{(N)}[\mu - \theta] = v \ \text{ for all } N \qquad (144)$$
where $v$ is given by (138). Therefore, $\beta_N \le v$ for all $N$. Thus $\beta_N$ converges to some value $q$, and we have
$$\beta_N \le q \le v \ \text{ for all } N. \qquad (145)$$
To prove the theorem, it remains to show that $q = v$. Let $b^{(N)}$ be the minimizer of (17) when $\theta$ is estimated from $x^{(N)}$; this minimizer exists by virtue of Proposition 1. We then have
$$\beta_N = Z^{(N)}[b^{(N)}] \le q \qquad (146)$$
and therefore
$$\int_\Theta \|b^{(N)}(\theta)\|^2\, p_\theta(d\theta) \le q. \qquad (147)$$
It follows that $b^{(N)}$ satisfies the constraint of the optimization problem (131) with $t = q$. As a consequence, we have
$$\int_\Theta \left\|I + \frac{\partial b^{(N)}}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \ge u(q). \qquad (148)$$
Define
$$\lambda_N \triangleq \operatorname*{ess\,sup}_{\theta \in \Theta} \lambda_{\max}(J^{(N)}(\theta)) \qquad (149)$$
and note that $\lambda_N > 0$ for all $N$, since $J^{(N)}(\theta)$ is positive definite. Thus,
$$Z^{(N)}[b^{(N)}] \ge \int_\Theta \mathrm{Tr}\!\left[\left(I + \frac{\partial b^{(N)}}{\partial\theta}\right)\left(J^{(N)}(\theta)\right)^{-1}\left(I + \frac{\partial b^{(N)}}{\partial\theta}\right)^T\right] p_\theta(d\theta) \ge \frac{1}{\lambda_N}\int_\Theta \left\|I + \frac{\partial b^{(N)}}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \ge \frac{u(q)}{\lambda_N}. \qquad (150)$$
Assume by contradiction that $q < v$. From (140), it then follows that $u(q) > 0$. Since all eigenvalues of $J^{(N)}(\theta)$ decrease to zero, we have $\lambda_N \to 0$, and thus
$$\beta_N \ge \frac{u(q)}{\lambda_N} \to \infty. \qquad (151)$$
This contradicts the fact (145) that $\beta_N \le v$. We conclude that $q = v$, as required.

Proof of Theorem 6: The proof is analogous to that of Theorem 5. We begin by considering the optimization problem
$$\inf_{b \in H^1} \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) \quad \text{s.t.} \quad \int_\Theta \mathrm{Tr}\!\left[\left(I + \frac{\partial b}{\partial\theta}\right)J^{-1}(\theta)\left(I + \frac{\partial b}{\partial\theta}\right)^T\right] p_\theta(d\theta) \le t \qquad (152)$$
for some constant $t \ge 0$. Denote the minimum value of (152) by $w(t)$. Let $\mu = E\{\theta\}$ and note that $b(\theta) = \mu - \theta$ satisfies the constraint in (152) for any $t \ge 0$, and has an objective equal to $v$ of (138). Thus, to determine $w(t)$, it suffices to minimize (152) over the set
$$S_t = \left\{b \in H^1 : \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) \le v,\ \int_\Theta \mathrm{Tr}\!\left[\left(I + \frac{\partial b}{\partial\theta}\right)J^{-1}(\theta)\left(I + \frac{\partial b}{\partial\theta}\right)^T\right] p_\theta(d\theta) \le t\right\}.$$
Define
$$\lambda \triangleq \operatorname*{ess\,sup}_{\theta \in \Theta} \lambda_{\max}(J(\theta)). \qquad (153)$$
Since $J(\theta)$ is positive definite almost everywhere, we have $\lambda > 0$. For any $b \in S_t$, we have
$$\frac{1}{\lambda}\int_\Theta \left\|I + \frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le t \qquad (154)$$
and therefore, by Lemma 2,
$$\int_\Theta \left\|\frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le \left(\sqrt{t\lambda} + \sqrt{n}\right)^2. \qquad (155)$$
Hence, for any $b \in S_t$,
$$\|b\|_{H^1}^2 = \int_\Theta \|b(\theta)\|^2\, p_\theta(d\theta) + \int_\Theta \left\|\frac{\partial b}{\partial\theta}\right\|_F^2 p_\theta(d\theta) \le v + \left(\sqrt{t\lambda} + \sqrt{n}\right)^2. \qquad (156)$$
Thus $S_t$ is bounded for all $t$. It is straightforward to show that $S_t$ is also closed and convex. Therefore, employing Lemma 1 (with $\ell = 1$) ensures that there exists a (unique) $b_{\mathrm{opt}} \in S_t$ minimizing (152). Note that the objective in (152) is zero if and only if $b_{\mathrm{opt}}(\theta) = \mathbf{0}$ almost everywhere. So, if $\mathbf{0} \in S_t$, we have $w(t) = 0$, and otherwise $w(t) > 0$. Let us define
$$s \triangleq E\left\{\mathrm{Tr}(J^{-1}(\theta))\right\} \qquad (157)$$
and note that $\mathbf{0} \in S_t$ if and only if $t \ge s$. Thus,
$$w(t) = 0 \ \text{ for } t \ge s, \qquad w(t) > 0 \ \text{ otherwise.} \qquad (158)$$
Let us now return to the setting of Theorem 6. For simplicity, we denote functions corresponding to the problem of estimating $\theta$ from $\{x^{(1)}, \ldots, x^{(N)}\}$ with a superscript $(N)$. For example, from the additive property of the Fisher information [2, §3.4], we have
$$J^{(N)}(\theta) = N J(\theta). \qquad (159)$$
It follows that
$$(N+1)\,\mathrm{CRB}^{(N+1)}[b,\theta] \ge N\,\mathrm{CRB}^{(N)}[b,\theta] \qquad (160)$$
for all $b \in H^1$, all $\theta \in \Theta$, and all $N$. Therefore,
$$(N+1)\,Z^{(N+1)}[b] \ge N\,Z^{(N)}[b] \qquad (161)$$
for all $b \in H^1$, and hence
$$(N+1)\beta_{N+1} = \min_{b \in H^1}\,(N+1)\,Z^{(N+1)}[b] \ge \min_{b \in H^1}\,N\,Z^{(N)}[b] = N\beta_N. \qquad (162)$$
Thus $\{N\beta_N\}$ is a non-decreasing sequence. Furthermore, we have
$$N\,Z^{(N)}[\mathbf{0}] = s \qquad (163)$$
so that $N\beta_N \le s$ for all $N$. It follows that $\{N\beta_N\}$ is non-decreasing and bounded, and therefore converges to some value $r$ such that
$$N\beta_N \le r \le s \ \text{ for all } N. \qquad (164)$$
To prove the theorem, we must show that $r = s$. Let $b^{(N)} \in H^1$ denote the minimizer of (17) when $\theta$ is estimated from $\{x^{(1)}, \ldots, x^{(N)}\}$ (the existence of $b^{(N)}$ is guaranteed by Proposition 1). We then have $N\beta_N = N\,Z^{(N)}[b^{(N)}] \le r$, so that
$$\int_\Theta \mathrm{Tr}\!\left[\left(I + \frac{\partial b^{(N)}}{\partial\theta}\right)J^{-1}(\theta)\left(I + \frac{\partial b^{(N)}}{\partial\theta}\right)^T\right] p_\theta(d\theta) \le r. \qquad (165)$$
Thus, $b^{(N)}$ satisfies the constraint of (152) with $t = r$. As a consequence, we have
$$\int_\Theta \|b^{(N)}(\theta)\|^2\, p_\theta(d\theta) \ge w(r) \qquad (166)$$
and therefore
$$N\beta_N = N\,Z^{(N)}[b^{(N)}] \ge N\int_\Theta \|b^{(N)}(\theta)\|^2\, p_\theta(d\theta) \ge N\,w(r). \qquad (167)$$
Now suppose by contradiction that $r < s$. It follows from (158) that $w(r) > 0$. Hence, by (167), $N\beta_N \to \infty$, which contradicts the fact that $N\beta_N$ is bounded. We conclude that $r = s$, as required.

REFERENCES

[1] J. O. Berger, Statistical Decision Theory and Bayesian Analysis, 2nd ed. New York, NY: Springer-Verlag, 1985.
[2] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs, NJ: Prentice Hall, 1993.
[3] H. L. Van Trees, Detection, Estimation, and Modulation Theory. New York: Wiley, 1968, vol. 1.
[4] J. Ziv and M. Zakai, "Some lower bounds on signal parameter estimation," IEEE Trans. Inf. Theory, vol. 15, no. 3, pp. 386–391, May 1969.
[5] T. Y. Young and R. A. Westerberg, "Error bounds for stochastic estimation of signal parameters," IEEE Trans. Inf. Theory, vol. 17, no. 5, pp. 549–557, Sep. 1971.
[6] S. Bellini and G. Tartara, "Bounds on error in signal parameter estimation," IEEE Trans. Commun., vol. 22, no. 3, pp. 340–342, 1974.
[7] D. Chazan, M. Zakai, and J. Ziv, "Improved lower bounds on signal parameter estimation," IEEE Trans. Inf. Theory, vol. 21, no. 1, pp. 90–93, 1975.
[8] B. Z. Bobrovski and M. Zakai, "A lower bound on the estimation error for certain diffusion problems," IEEE Trans. Inf. Theory, vol. 22, no. 1, pp. 45–52, Jan. 1976.
[9] A. J. Weiss and E. Weinstein, "A lower bound on the mean-square error in random parameter estimation," IEEE Trans. Inf. Theory, vol. 31, no. 5, pp. 680–682, Sep. 1985.
[10] E. Weinstein and A. J. Weiss, "A general class of lower bounds in parameter estimation," IEEE Trans. Inf. Theory, vol. 34, no. 2, pp. 338–342, Mar. 1988.
[11] K. L. Bell, Y. Steinberg, Y. Ephraim, and H. L. Van Trees, "Extended Ziv–Zakai lower bound for vector parameter estimation," IEEE Trans. Inf. Theory, vol. 43, no. 2, pp. 624–637, 1997.
[12] A. Renaux, P. Forster, P. Larzabal, and C. Richmond, "The Bayesian Abel bound on the mean square error," in Proc. Int. Conf. Acoust., Speech and Signal Processing (ICASSP 2006), vol. III, Toulouse, France, May 2006, pp. 9–12.
[13] E. L. Lehmann and G. Casella, Theory of Point Estimation, 2nd ed. New York: Springer, 1998.
[14] Y. C. Eldar, "Rethinking biased estimation: Improving maximum likelihood and the Cramér–Rao bound," Foundations and Trends in Signal Processing, vol. 1, no. 4, pp. 305–449, 2008.
[15] S. M. Kay and Y. C. Eldar, "Rethinking biased estimation," IEEE Signal Process. Mag., vol. 25, no. 3, pp. 133–136, May 2008.
[16] H. Cramér, "A contribution to the theory of statistical estimation," Skand. Akt. Tidskr., vol. 29, pp. 85–94, 1945.
[17] C. R. Rao, "Information and accuracy attainable in the estimation of statistical parameters," Bull. Calcutta Math. Soc., vol. 37, pp. 81–91, 1945.
[18] J. M. Hammersley, "On estimating restricted parameters," J. Roy. Statist. Soc. B, vol. 12, no. 2, pp. 192–240, 1950.
[19] D. G. Chapman and H. Robbins, "Minimum variance estimation without regularity assumptions," Ann. Math. Statist., vol. 22, no. 4, pp. 581–586, Dec. 1951.
[20] P. K. Bhattacharya, "Estimating the mean of a multivariate normal population with general quadratic loss function," Ann. Math. Statist., vol. 37, no. 6, pp. 1819–1824, Dec. 1966.
[21] E. W. Barankin, "Locally best unbiased estimates," Ann. Math. Statist., vol. 20, no. 4, pp. 477–501, Dec. 1949.
[22] J. S. Abel, "A bound on mean-square-estimate error," IEEE Trans. Inf. Theory, vol. 39, no. 5, pp. 1675–1680, 1993.
[23] A. O. Hero, J. A. Fessler, and M. Usman, "Exploring estimator bias-variance tradeoffs using the uniform CR bound," IEEE Trans. Signal Process., vol. 44, no. 8, pp. 2026–2041, 1996.
[24] P. Forster and P. Larzabal, "On lower bounds for deterministic parameter estimation," in Proc. Int. Conf. Acoust., Speech and Signal Processing (ICASSP 2002), vol. 2, Orlando, FL, May 2002, pp. 1137–1140.
[25] Y. C. Eldar, "Minimum variance in biased estimation: Bounds and asymptotically optimal estimators," IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1915–1930, 2004.
[26] ——, "Uniformly improving the Cramér–Rao bound and maximum-likelihood estimation," IEEE Trans. Signal Process., vol. 54, no. 8, pp. 2943–2956, 2006.
[27] ——, "MSE bounds with affine bias dominating the Cramér–Rao bound," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3824–3836, Aug. 2008.
[28] I. A. Ibragimov and R. Z. Has'minskii, Statistical Estimation: Asymptotic Theory. New York: Springer, 1981.
[29] A. Renaux, "Contribution à l'analyse des performances d'estimation en traitement statistique du signal," Ph.D. dissertation, École Normale Supérieure de Cachan, 2006. [Online]. Available: http://tel.archives-ouvertes.fr/tel-00129527/
[30] Z. Ben-Haim and Y. C. Eldar, "A Bayesian estimation bound based on the optimal bias function," in Proc. 2nd Int. Workshop on Computational Adv. in Multi-Sensor Adapt. Process. (CAMSAP 2007), St. Thomas, U.S. Virgin Islands, Dec. 2007.
[31] J. Shao, Mathematical Statistics, 2nd ed. New York: Springer, 2003.
[32] E. H. Lieb and M. Loss, Analysis, 2nd ed. American Mathematical Society, 2001.
[33] D. R. Cox and N. Reid, "Parameter orthogonality and approximate conditional inference," J. Roy. Statist. Soc. B, vol. 49, no. 1, pp. 1–39, 1987.
[34] H. L. Van Trees and K. L. Bell, Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking. New York: Wiley, 2007.
[35] I. M. Vinogradov, Ed., Encyclopaedia of Mathematics. Dordrecht, The Netherlands: Kluwer, 1995.
[36] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. New York: Dover, 1964.
[37] Z. Ben-Haim and Y. C. Eldar, "A comment on the use of the Weiss–Weinstein bound with constrained parameter sets," IEEE Trans. Inf. Theory, vol. 54, no. 10, pp. 4682–4684, Oct. 2008.
[38] L. P. Lebedev and M. J. Cloud, The Calculus of Variations and Functional Analysis. New Jersey: World Scientific, 2003.
[39] W. Rudin, Functional Analysis. New York: McGraw-Hill, 1973.
[40] I. M. Gelfand and S. V. Fomin, Calculus of Variations. Mineola, NY: Dover, 2000.