Information field theory for cosmological perturbation reconstruction and non-linear signal analysis

Torsten A. Enßlin, Mona Frommert, and Francisco S. Kitaura
Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1, 85741 Garching, Germany
(Dated: October 29, 2018)

We develop information field theory (IFT) as a means of Bayesian inference on spatially distributed signals, the information fields. A didactical approach is attempted. Starting from general considerations on the nature of measurements, signals, noise, and their relation to a physical reality, we derive the information Hamiltonian, the source field, propagator, and interaction terms. Free IFT reproduces the well known Wiener-filter theory. Interacting IFT can be diagrammatically expanded, for which we provide the Feynman rules in position-, Fourier-, and spherical harmonics space, and the Boltzmann-Shannon information measure. The theory should be applicable in many fields. However, here, two cosmological signal recovery problems are discussed in their IFT-formulation. 1) Reconstruction of the cosmic large-scale structure matter distribution from discrete galaxy counts in incomplete galaxy surveys within a simple model of galaxy formation. We show that a Gaussian signal, which should resemble the initial density perturbations of the Universe, observed with a strongly non-linear, incomplete and Poissonian-noise affected response, as the processes of structure and galaxy formation and observations provide, can be reconstructed thanks to the virtue of a response-renormalization flow equation. 2) We design a filter to detect local non-linearities in the cosmic microwave background, which are predicted from some Early-Universe inflationary scenarios, and expected due to measurement imperfections.
This filter is the optimal Bayes estimator up to linear order in the non-linearity parameter and can even be used to construct sky maps of non-linearities in the data.

I. INTRODUCTION

A. Motivation

The optimal extraction and restoration of information from data on spatially distributed quantities like the cosmic large-scale structure (LSS) or the cosmic microwave background (CMB) temperature fluctuations in cosmology, but also on many other signals in physics and related fields, is essential for any quantitative, data-driven scientific inference. The problem of how to design such methods poses many technical and even conceptual difficulties, which have led to a large number of recipes and methodologies.

Here, we address such problems from a strictly information theoretical point of view. We show, as others have done before, that information theory for distributed quantities leads to a statistical field theory, which we name information field theory (IFT). In contrast to the previous works, which mostly treat such problems on a classical field level, as will be detailed later, here, we take full advantage of the existing field theoretical apparatus to treat interacting and non-classical fields. Thus, we show how to use diagrammatic perturbation theory and renormalization flows in order to construct optimal signal recovering algorithms and to calculate moments of their uncertainties. Non-classicality manifests itself as quantum and statistical fluctuations in quantum and statistical field theory (QFT & SFT), and very similarly as uncertainty in IFT.

The information theoretical perspective on signal inference problems has technical advantages, since it permits the design of information-yield optimized algorithms and experimental setups.
However, it also provides deeper insight into the mechanisms of knowledge accumulation, its underlying information flows, and its dependence on data models, prior knowledge and assumptions than pure empirical evaluations of ad-hoc algorithms alone could provide.

We therefore hope that our work is of interest for two types of readers. The first are applied scientists, who are mainly interested in the practical aspect of IFT since they are facing a concrete inverse problem for a spatially distributed quantity, especially but not exclusively in cosmology. The second are more philosophical or theoretically inclined scientists, for whom IFT may serve as a framework to understand and classify many of the existing methods of signal extraction and reception. Since we expect that many interested readers are not very familiar with field theoretical formalisms, we introduce some of their basic mathematical concepts. Due to this anticipated non-uniform readership, not everything in this article might be of interest to everyone, and therefore we provide in the following a short overview on the structure and content of the article.

B. Overview of the work

The remainder of this introduction section contains a detailed discussion of the previous work on signal inference theory as well as a very brief introduction into the works on the cosmic LSS and the CMB that are relevant here.

The main part of this article falls into two categories: abstract IFT and its application. The concepts of IFT are introduced in Sec. II, where Bayesian methodology, the distinction of physical and information fields, the definition of signal response and noise, as well as the design of signal spaces are discussed. The basic IFT formalism including the free theory is introduced in Sec. III, which, according to our judgement, summarizes and unifies the previous knowledge on IFT before this paper.
An impatient reader, only interested in applying IFT and not worrying about concepts, may start reading in Sec. III.

From Sec. IV on the new results of this work are presented, starting with the discussion of interacting information fields, their Hamiltonians and Feynman rules, and the Boltzmann-Shannon information measure. The normalisability of sensibly constructed IFTs is shown, and the classical information field equation is presented there as well. A step-by-step recipe of how to derive and implement an IFT algorithm is also provided. Details of the notation can be found, if not defined in the main text, in Appendix A.

Applications of the theory are provided in the following two sections, which can be skipped by a reader interested only in the general theoretical framework. Although specific inference problems are addressed, they should serve as a blueprint for the tackling of similar problems. In Sec. V the problem of the reconstruction of the cosmic matter distribution from galaxy surveys is analyzed in terms of a Poissonian data model. In Sec. VI we derive an optimal estimator for non-Gaussianity in the CMB, and show how it can be generalized to map potential non-Gaussianities in the CMB sky.

Our summary and outlook can be found in Sec. VII.

C. Previous works

The work presented here tries to unify information theory and statistical field theory in order to provide a conceptual framework in which optimal tools for cosmological signal analysis can be derived, as well as for inference problems in other disciplines. Below, we provide very brief introductions into each of the required fields¹ (information theory, image reconstruction, statistical field theory, cosmological large-scale structure, and cosmic microwave background), for the orientation of non-expert readers. An expert in any of these fields might decide to skip the corresponding sections.
¹ This work has tremendously benefitted in a direct and indirect way from a large number of previous publications in those fields. We, the authors, have to apologize for being unable to give full credit to all relevant former works in those fields and for only concentrating on a brief summary of the papers more or less directly influencing this work. This collection is obviously highly biased towards the cosmological literature due to our main scientific interests and expertise, and definitely incomplete.

1. Information theory and Bayesian inference

The foundation of information theory was laid by the work of Bayes [1] on probability theory, in which the celebrated Bayes theorem was presented. The theorem itself (see Eq. 7) is a simple rule for conditional probabilities. It only unfolds its power for inference problems if used with belief or knowledge states, described by conditional probabilities.

The advent of modern information theory is probably best dated by the work of Shannon [2, 3] on the concept of information measure, being the negative Boltzmann-entropy, and the work of Jaynes, combining the language of statistical mechanics and Bayes probability theory and applying it to knowledge uncertainties [4, 5, 6, 7, 8, 9, 10].

The required numerical evaluation of Bayesian probability integrals often suffered from the curse of high dimensionality. The standard recipe against this, still in massive use today, is importance sampling via Markov-Chain Monte-Carlo Methods (MCMC), following the ideas of Metropolis et al. [11], Hastings [12], and Geman and Geman [13], where the latter authors already had image reconstruction applications in mind. The Hamiltonian MCMC methods [14], in which the phase-space sampling is partly following Hamiltonian dynamics, are also of relevance here.
There the Hamiltonian is introduced as the negative logarithm of the probability, as we do in this work. With such tools, higher dimensional problems, as present in signal restoration, could and can be tackled, however, at the price of getting stochastic uncertainty into the computational results. For a recent review on image restoration MCMC techniques, see [15].

The applications and extensions of these pioneering works are too numerous to be listed here. Good monographs exist and the necessary references can be found there [16, 17, 18, 19, 20, 21].

2. Image reconstruction in astronomy and elsewhere

The problem of image reconstruction from incomplete, noisy data is especially important in astronomy, where the experimental conditions are largely set by the nature of distant objects, weather conditions, etc., all mainly out of the control of the observer, as well as in other disciplines like medicine and geology, with similar limitations to arrange the object of observations for an optimal measurement. Some of the most prominent methods of image reconstruction, which are based on a Bayesian implementation of an assumed data model, are the Wiener-filter [22], the Richardson-Lucy algorithm [23, 24], and the maximum-entropy image restoration [25] (see also [26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]).

The Wiener filter can be regarded as a full Bayesian image inference method in case of Gaussian signal and noise statistics, as we will show in Sect. III B. It will be the workhorse of the IFT formalism, since the Wiener filter represents the algorithm to construct the exact field theoretical expectation value given the data for an interaction-free information Hamiltonian.
The filter can be decomposed into two essential information processing steps, first building the information source by response-over-noise weighting the data, and then propagating this information through the signal space, by applying the so-called Wiener variance.

The Richardson-Lucy algorithm is a maximum-likelihood method to reconstruct from Poissonian data and therefore is also of Bayesian origin. This method usually has to be regularized by hand, by truncation of the iterative calculations, against an over-fitting instability due to the missing (or implicitly flat) signal prior. A Gaussian-prior based regularization was recently proposed by Kitaura and Enßlin [38], and the implementation of a variant of this is presented here in Sect. V D.

Maximum entropy algorithms will not be the topic here, nor will a number of other existing methods, which are partly within and partly outside the Bayesian framework. They may be found in existing reviews on this topic [e.g. 39, 40].

3. Statistical and Bayesian field theory

The relation of signal reconstruction problems and field theory was discovered independently by several authors. In cosmology, a prominent work in this direction is Bertschinger [41], in which the path integral approach was proposed to sample primordial density perturbations with Gaussian statistics under the constraint of existing information on the large scale structure. The work presented here can be regarded as a non-linear, non-Gaussian extension of this. Many methods from statistics and from statistical mechanics were of course used even earlier, e.g. the usage of moment generating functions for cosmic density fields can already be found in Fry [42].
Simultaneously to Bertschinger's work, Bialek and Zee [43, 44] argued that visual perception can be modeled as a field theory for the true image, being distorted by noise and other data transformations, which are summarized by a nuisance field. A probabilistic language was used, but no direct reference to information theory was made, since not the optimal information reconstruction was the aim, but a model for the human visual reception system. However, this work actually triggered our research.

Bialek et al. [45] applied a field theoretical approach to recover a probability distribution from data. Here, a Bayesian prior was used to regularize the solution, which was set up ad-hoc to enforce smoothness of the reconstruction, obtained from the classical (or saddlepoint, or maximum a posteriori) solution of the problem. However, an "optimal" value for the smoothness controlling parameter was derived from the data itself, a topic also addressed by Stoica et al. [46] and by a follow up publication to ours [47]. Bialek et al. [45] also recognized, as we do, that an IFT can easily be non-local.

Finally, the work of Lemm and coworkers [48, 49, 50, 51, 52, 53, 54, 55] established a tight connection between statistical field theory and Bayesian inference, and proposed the term Bayesian field theory (BFT) for this. However, we prefer the term information field theory since it puts the emphasis on the relevant object, the information, whereas BFT refers to a method, Bayesian inference. The term information field is rather self-explaining, whereas the meaning of a Bayesian field is not that obvious.

The applications considered by Lemm concentrate on the reconstruction of probability fields over parameter spaces and quantum mechanical potentials by means of the maximum a posteriori equation.
The extensive book summarizing the essential insights of these papers, [48], clearly states the possibility of perturbative expansions of the field theory. However, this is not followed up by these authors, probably for reasons of the computational complexity of the required algorithms.

In contrast to many of the previous works on IFT, which deal with ad-hoc priors, the publication by Lemm [56] is remarkable, since it provides explicit recipes of how to implement a priori information in various circumstances more rigorously.

The mathematical tools required to tackle IFT problems come from SFT and QFT, which have a vast literature. We have especially made use of the books of Binney et al. [57], Peskin and Schroeder [58], and Zee [59].

4. Cosmological large-scale structure

Our first IFT example in Sec. V is geared towards improving galaxy-survey based cosmography, the reconstruction of the large-scale structure matter distribution. We provide here a short overview on the relevant background and works.

The LSS of the matter distribution of the Universe is traced by the spatial distribution of galaxies, and therefore well observable. This structure is believed to have emerged from tiny, mostly Gaussian initial density fluctuations of a relative strength of $10^{-5}$ via a self-gravitational instability, partly counteracted by the expansion of the Universe. The initial density fluctuations are believed to be produced during an early inflationary epoch of the Universe, and to carry valuable information about the inflaton, the field which drove inflation, in their $N$-point correlation functions, to be extracted from the observational data.
The onset of the structure formation process is well described by linear perturbation theory and therefore conserves Gaussianity; however, the later evolution, the structures on smaller scales, and especially the galaxy formation require non-linear descriptions. The observational situation is complicated by the fact that the most important galaxy distance indicator, their redshift, is also sensitive to the galaxy peculiar velocity, which causes the observational data on the three-dimensional LSS to be partially degenerate. There are analytical methods to describe these effects², and also extensive work on $N$-body simulations of the structure formation, the latter probably providing us with the most detailed and accurate statistical data on the properties of the matter density field [e.g. 75].

In recent years, it was recognized that the evolution of the cosmic density field and its statistical properties can be addressed with field theoretical methods by virtue of renormalization flow equations. Detailed semi-analytical calculations for the density field time propagator, the two- and three-point correlation functions are now possible due to this, which are expected to play an important role in future approaches to reconstruct the initial fluctuations from the observational data [76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90].

It was recognized early on that the primordial density fluctuations can in principle be reconstructed from galaxy observations [41]. This has led to a large development of various numerical techniques for an optimal reconstruction [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131].
Many of them are based on a Bayesian approach, since they are implementations and extensions of the Wiener filter. However, other principles are also used, e.g. the least action approach, or Voronoi tessellation techniques [e.g. 132, 133, 134, 135, 136, 137, 138]. A discussion and classification of the various methods can be found in [38].

Especially the Wiener filter methods were extensively applied to galaxy survey data³ and partly permitted to extrapolate the matter distribution into the zone of avoidance behind the galactic disk and to close the data-gap there, cf. [157, 158, 159], a topic we also address in Sect. V.

Another cosmologically relevant information field to be extracted from galaxy catalogues is the LSS power spectrum [e.g. 160, 161, 162, 163, 164]. This power is also measurable in the CMB, and for a long time the CMB provided the best spectrum normalization [165, 166].

² Of special interest in this context may be [60], which already applies path-integrals, [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74], and the papers they refer to.

³ Survey based reconstructions of the cosmic matter fields can be found in [139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156].

5. Cosmic Microwave Background

Since our second example deals with the CMB, we give a brief overview on it and on related inference methods.

The CMB reveals the statistical properties of the matter field at a time when the Universe was about 1100 times smaller in linear size than it is today. The photon-baryon fluid, which decouples at that epoch into neutral hydrogen and free streaming photons, has responded to the gravitational pull of the then already forming dark matter structures.
The photons from that epoch have since cooled due to the cosmic expansion into the CMB radiation we observe today, and carry information on the physical properties of the photon-baryon fluid of that time like density, temperature and velocity. To very high accuracy, the spectrum of the photons from any direction is that of a blackbody, with a mean temperature of 2.7 Kelvin and fluctuations of the order of $10^{-5}$ Kelvin, imprinted by the primordial gravitational potentials at decoupling.

Therefore, mapping these temperature fluctuations permits a precise study of many cosmological parameters simultaneously, like the amount of dark matter producing the gravitational potentials, the ratio of photons to baryons, balancing the pressure and weight of the fluid, and geometrical and dynamical parameters of space-time itself.

The observations are technically challenging, and therefore require sophisticated algorithms to extract the tiny signal of temperature fluctuations against the instrument noise, but also to separate it from other astrophysical foreground emission with the best possible accuracy. A number of such algorithms were developed [e.g. 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184], which in many cases implement the Wiener filter. Thus, the required numerical tools for an IFT treatment of CMB data are essentially available.

The expected temperature fluctuation spectrum can be calculated from a linear perturbative treatment of the Boltzmann equations of all dynamically active particle species at this epoch, and fast computational implementations exist that permit predicting it for a given set of cosmological parameters.
Well known codes for this task are publicly available⁴ and permit the extraction of information on cosmological parameters from the measured CMB temperature fluctuation spectrum via comparison to their predictions for a given parameter set. It was recognized early on that this should happen in an information theoretically optimal way, and Bayesian methods were therefore adopted in that area well before they were in other astrophysical disciplines [e.g. 188, 189, 190, 191].

The initial metric and density fluctuations, from which the CMB fluctuations and the LSS emerged, are believed to be initially seeded by quantum fluctuations of a hypothetical inflaton field, which should have driven an inflationary expansion phase in the very early Universe [192, 193, 194, 195, 196, 197]. The inflaton-induced fluctuations have a very Gaussian probability distribution, however, some non-Gaussian features seem to be unavoidable in most scenarios and can serve as a fingerprint to discriminate among them [e.g. 198, 199, 200, 201]. Observational tests on such non-Gaussianities based on the three-point correlation function of the CMB data [e.g. 202, 203, 204, 205, 206] were so far mostly negative, however not sensitive enough to seriously constrain the possible theoretical parameter space of inflationary scenarios, see e.g. [207, 208]. Recently, there has been the claim of a detection of such non-Gaussianities by Yadav and Wandelt [209] and a confirmation of this with better data and improved algorithms is therefore highly desirable. In Sect. VI we make a proposal for improving the algorithmic side of this challenge. A recent review on the current status of CMB-Gaussianity can be found in [210].

⁴ E.g. cmbfast (http://cmbfast.org, http://ascl.net/cmbfast.html, [185]), camb (http://camb.info/, [186]), and cmbeasy (http://www.cmbeasy.org/, [187]).

II. CONCEPTS OF INFORMATION FIELD THEORY

A. Information on physical fields

In our attempts to infer the properties of our Universe from astronomical observations we are faced with the problem of how to interpret incomplete, imperfect and noisy data, draw our conclusions based on them and quantify the uncertainties of our results. This is true for using galaxy surveys to map the cosmic LSS, for the interpretation of the CMB, as well as for many experiments in physical laboratories and compilations of geological, economical, sociological, and biological data about our planet. Information theory, which is based on probability theory and the Bayesian interpretation of missing knowledge as probabilistic uncertainty, offers an ideal framework to handle such problems. It permits us to describe all relevant processes involved in the measurement probabilistically, provided a model for the Universe or the system under consideration is adopted.

The states of such a model, denoted by the state variable $\psi$, are identified with the possible physical realities. They can have probabilities $P(\psi)$ assigned to them, the so-called prior information. This prior contains our knowledge about the Universe as we model it before any other data is taken. For a given cosmological model, the prior may be the probability distribution of the different initial conditions of the Universe, which determine the subsequent evolution completely. Since our Universe is spatially extended, the state variable will in general contain one or several fields, which are functions over some coordinates $x$.

The measurement process is also described by a data model which defines the so-called likelihood, the probability $P(d|\psi)$ to obtain a specific dataset $d$ given the physical condition $\psi$. In case the outcome $d$ of the measurement is deterministic, $P(d|\psi) = \delta(d - d[\psi])$, where $d[\psi]$ is the functional dependence of the data on the state.
In any case, the probability distribution function of the data,

$$P(d) = \int \mathcal{D}\psi \, P(d|\psi) \, P(\psi), \tag{1}$$

is given in terms of a phase-space or path integral over all possible realizations of $\psi$, to be defined more precisely later (Sect. II E 1).

A scientist is not actually interested in the total state of the Universe, but only in some specific aspects of it, which we call the signal $s = s[\psi]$. The signal is a very reduced description of the physical reality, and can be any function of its state $\psi$, freely chosen according to the needs and interests of the scientist or the ability and capacity of the measurement and computational devices used. Since the signal does not contain the full physical state, any physical degree of freedom which is not present in the signal but influences the data will be received as probabilistic uncertainty, or shortly noise.

The probability distribution function of the signal, its prior

$$P(s) = \int \mathcal{D}\psi \, \delta(s - s[\psi]) \, P(\psi), \tag{2}$$

is related to that of the data via the joint probability

$$P(d, s) = \int \mathcal{D}\psi \, \delta(s - s[\psi]) \, P(d|\psi) \, P(\psi), \tag{3}$$

from which the conditional signal likelihood

$$P(d|s) = P(d, s)/P(s) \tag{4}$$

and signal posterior

$$P(s|d) = P(d, s)/P(d) \tag{5}$$

can be derived.

Before the data is available, the phase-space of interest is spanned by the direct product of all possible signals $s$ and data $d$, and all regions with non-zero $P(d, s)$ are of potential relevance. Once the actual data $d_\mathrm{obs}$ have been taken, only a sub-manifold of this space, as fixed by the data, is of further relevance. The probability function over this sub-space is proportional to $P(d = d_\mathrm{obs}, s)$, and needs just to be renormalized by dividing by

$$\int \mathcal{D}s \, P(d_\mathrm{obs}, s) = \int \mathcal{D}s \int \mathcal{D}\psi \, \delta(s - s[\psi]) \, P(d_\mathrm{obs}|\psi) \, P(\psi) = \int \mathcal{D}\psi \, P(d_\mathrm{obs}|\psi) \, P(\psi) = P(d_\mathrm{obs}), \tag{6}$$

which is the unconditioned probability (or evidence) of that data.
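To make Eqs. 1 to 6 concrete, the following is a minimal Python sketch of a discrete toy model in which the path integrals reduce to finite sums. The model is a hypothetical illustration and not part of the analysis of this work: the state values, the prior, the signal choice $s[\psi] = \psi^2$, the three-valued measurement error, and all variable names are assumptions made only for this example.

```python
import numpy as np

# Hypothetical toy model (illustrative assumptions, not from the paper).
psi_values = np.array([-2, -1, 0, 1, 2])         # possible physical states psi
prior_psi = np.array([0.1, 0.2, 0.4, 0.2, 0.1])  # prior P(psi), sums to one
d_values = np.arange(-3, 4)                      # possible data values d

def signal_of(psi):
    """Signal s[psi]: a reduced description of the state (assumed: psi squared)."""
    return psi ** 2

def likelihood(d, psi):
    """Likelihood P(d|psi): the data equal psi plus an error of -1, 0, or +1."""
    return {-1: 0.25, 0: 0.5, 1: 0.25}.get(int(d - psi), 0.0)

s_values = np.unique(signal_of(psi_values))      # possible signal values

# Joint probability P(d, s): sum over psi of delta(s - s[psi]) P(d|psi) P(psi), Eq. 3.
joint = np.zeros((len(d_values), len(s_values)))
for psi, p_psi in zip(psi_values, prior_psi):
    j = np.searchsorted(s_values, signal_of(psi))
    for i, d in enumerate(d_values):
        joint[i, j] += likelihood(d, psi) * p_psi

prior_s = joint.sum(axis=0)             # signal prior P(s), Eq. 2
evidence = joint.sum(axis=1)            # evidence P(d) via the signal sum, Eq. 6
evidence_direct = np.array([            # evidence P(d) via the state sum, Eq. 1
    sum(likelihood(d, psi) * p for psi, p in zip(psi_values, prior_psi))
    for d in d_values
])
signal_likelihood = joint / prior_s     # P(d|s), Eq. 4 (each column sums to one)
posterior = joint / evidence[:, None]   # P(s|d), Eq. 5 (each row sums to one)

d_obs = 1                               # a hypothetical observed datum
i_obs = int(np.where(d_values == d_obs)[0][0])
print("P(s | d_obs) over s =", s_values, "is", posterior[i_obs])
```

In this sketch the row `posterior[i_obs]` is the slice of the joint probability fixed by the observed data, divided by the evidence, which is the renormalization described by Eq. 6. The arrays `evidence` and `evidence_direct` agree because summing the joint probability over all signal values removes the delta function.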
Thus, we ﬁnd the resulting information o f the data to be the poster ior distribution P ( s | d obs ) = P ( d obs , s ) /P ( d obs ). This p oster io r is the fundamental mathematical ob ject from which all our deductio ns have to b e made. It is related via Ba yes’s theor em [1] to the usually b etter accessible s ig nal likelihoo d, P ( s | d ) = P ( d | s ) P ( s ) /P ( d ) , (7) which follows fro m Eqs. 4 and 5. The normaliza tion term in Bayes’s theorem, the evi- dence P ( d ), is now also fully ex pressed in terms o f the joint pro babilit y of data a nd signal, P ( d ) = Z D s P ( d, s ) , (8) 6 and the underlying physical ﬁeld ψ basically b ecomes in- visible at this s tage in the formalis m. The evidence plays a ce n tr al role in Bayes infer ence, since it is the likeli- ho od of all the assumed mo del para meters. C o m bining this parameter - lik eliho od with parameter- priors one can start Bayesian infer ence on the mo del classes. B. Signal resp onse and noise If signa l a nd da ta depend o n the same under lying phys- ical pr ope r ties, there may b e correlations b etw een the t wo, whic h can b e expressed in terms of signa l res p onse R a nd noise n of the data as d = R [ s ] + n s . (9) W e hav e chosen tw o diﬀerent wa ys of denoting the de- pendenc e of r espons e and noise on the signal s , in o rder to highlight that the res ponse should embrace most of the reaction of the data to the s ig nal, wher eas the noise should b e as indep enden t as p ossible . W e ensure this b y putting the linear correla tion of the da ta with the signa l fully int o the r e spons e . The r espo nse is therefore the part of the data which co rrelates with the s ignal R [ s ] ≡ h d i ( d | s ) ≡ Z D d d P ( d | s ) , (10) and the noise is just deﬁned a s the remaining pa rt which do es not: n s ≡ d − R [ s ] = d − h d i ( d | s ) . 
(11) Although the noise mig h t dep end on the sig nal, as it is well known for example for Poissonian pro cesses, it is – per deﬁnition – linearly uncor related to it, h n s s † i ( d | s ) = ( h d i ( d | s ) − R [ s ]) s † = 0 s † = 0 , (12) whereas higher or der co rrelation might well exis t and may be further exploited for their information conten t. The dagger denotes co mplex conjugation and trans p osing of a vector or ma trix. These deﬁnitions were c hosen to b e close to the usua l language in sig nal pro cessing and data ana lysis. They per mit to deﬁne signal res p onse and no is e for an arbitrary choice o f the signal s [ ψ ]. No direct ca us al connection betw een signal and data is needed in or der to have a non-trivial res ponse , since b oth v a riables just need to exhibit some couplings to a co mmon sub-asp ect of ψ . The ab ov e deﬁnition of r espo nse and noise is how ever not unique, even for a ﬁxed signal deﬁnition, s ince any data transformatio n d ′ = T [ d ] ca n lead to diﬀerent deﬁnitions, as se e n from R ′ [ s ] ≡ h d ′ i ( d | s ) = h T [ d ] i ( d | s ) 6 = T [ h d i ( d | s ) ] = T [ R [ s ]] . (13) Exceptions are so me unique relations b etw een signa l and state, P ( ψ | s ) = δ ( ψ − ψ [ s ]), and maybe a few o ther very sp ecial cases. Th us, the co ncepts of sig nal r e spons e and therewith deﬁned noise dep end on the adopted co ordi- nate system in the data space. This co ordinate system can b e changed via a data transformation T , and the transformed da ta may exhibit b etter or worse resp onse to the sig nal. Informatio n theory a ids in designing a suit- able data transformation, so that the signal respo nse is maximal, and the signa l nois e is minimal, per mitting the signal to be b est r ecov ere d. Thu s, we may aim for an optimal T , which yields T [ d ] = h s i ( s | d ) . 
(14)

We define the posterior average of the signal, $m_d = \langle s \rangle_{(s|d)}$, to be the map of the signal given the data $d$, and call $T$ a map-making algorithm if it fulfills Eq. 14 at least approximately. As a criterion for this one may require that the signal response of a map-making algorithm,

$$R_T[s] \equiv \langle T[d] \rangle_{(d|s)},$$ (15)

is positive definite with respect to signal variations, as stated by

$$\frac{\delta R_T[s]}{\delta s} \geq 0.$$ (16)

This ensures that a map-making algorithm will respond with a non-negative correlation of the map to any signal feature, with respect to the noise ensemble. In general, $T$ will be a non-linear operation on the data, to be constructed from information theory if it should be optimal in the sense of Eq. 14. In any case, the fidelity of a signal reconstruction can be characterized by the quadratic signal uncertainty,

$$\sigma^2_{T,d} = \langle (s - T[d])\,(s - T[d])^\dagger \rangle_{(s|d)},$$ (17)

averaged over typical realizations of signal and noise. Of special interest is the trace of this,

$$\mathrm{Tr}(\sigma^2_{T,d}) = \int dx\; \langle |s_x - T_x[d]|^2 \rangle_{(s|d)},$$ (18)

since it is the expectation value of the squared Lebesgue $L^2$-space distance between a signal reconstruction and the underlying signal. Requiring a map-making algorithm to be optimal with respect to Eq. 18 implies $T[d] = \langle s \rangle_{(s|d)}$, and therefore that it is optimal in an information-theoretical sense according to Eq. 14.

The uncertainty $\sigma^2_{T,d}$ depends on $d$, since in Bayesian inference one averages over the posterior, which is conditional on the data. The frequentist uncertainty estimate, which is the expected uncertainty of any estimator before the data is obtained, is given by an average over the joint probability function:

$$\sigma^2_T = \langle (s - T[d])\,(s - T[d])^\dagger \rangle_{(d,s)}.$$
(19)

The latter is a good quantity to characterize the overall performance of an estimator, whereas $\mathrm{Tr}(\sigma^2_{T,d})$ is a more precise indicator of the actual estimator performance for a given dataset. As we will see in our IFT applications, data dependence of the uncertainty is a common feature of non-linear inference problems.

An illustrative example is in order. Suppose our data is an exact copy of a physical field, $d = \psi$, our signal is the square of the latter, $s = \psi^2$, and the physical field obeys even statistics, $P(\psi) = P(-\psi)$. Then the signal response is exactly zero, $R[s] = 0$, and the data contains only noise with respect to the chosen signal, $d = n_s$. Thus, we have chosen a bad representation of our data to reveal the signal. If we, however, introduce the transformation $d' = T[d] = d^2$, we find a perfect response, $R'[s] = s$, and zero noise, $n'_s = 0$. In this case, finding the optimal map-making algorithm was trivial, but in more complicated situations it cannot be guessed that easily.

Since the response and noise definitions depend on the signal definition, some thought should be given to how to choose the signal in such a way that it can be well reconstructed.

C. Signal design

For practical reasons one will usually choose $s$ according to a few guidelines, which should simplify the information induction process:

1. The functional form of $s[\psi]$ should best be simple, steady, analytic, and if possible linear in $\psi$, permitting the signal $s$ to be used to reason about the state of reality $\psi$.

2. The degrees of freedom of $s$ should be related to the ones of the data $d$ in the sense that cross-correlations exist which permit one to deduce properties of $s$ from $d$. Signal degrees of freedom which are insensitive to the data will only be constrained by the prior and therefore just contain a large amount of uncertainty.
This adds to the error budget and should be avoided as far as possible.

3. The choice of $s[\psi]$ should also be led by mathematical convenience and practicality. In the examples presented in this work, simple signals are chosen which permit one to guess good approximations for the signal likelihood $P(d|s)$ and prior $P(s)$ without the need to develop the full physical theory starting with $P(\psi)$.

To give a more specific example, we assume a cosmological model in which reality is thought to be solely characterized by the primordial dark matter density distribution $\psi(x)$, from which all observable cosmological phenomena, like galaxies, derive in a deterministic way. The coordinate $x$ may refer to the comoving coordinates at some early epoch of the Universe. Although the LSS of the matter distribution at a later time may predominantly depend on the initial large-scale modes, and is reflected in the galaxy distribution, the actual positions of the individual galaxies also depend in a non-trivial way on the small-scale modes. Due to the discreteness of our observable, the galaxy positions, it may be impossible to reconstruct these small-scale modes. Therefore it could be sensible to define a signal $s[\psi] = F\,\psi$, with $F$ being a linear low-pass filter which suppresses all small-scale structures. This signal may be reconstructible with high precision, whereas any attempt to reconstruct $\psi$ directly would be plagued by a larger error budget, since all the data-unconstrained small-scale modes represent uncertainties to a reconstruction of $\psi$, but not to one of $s$, which is defined as a low-pass filtered version of $\psi$.

D.
Signal moment calculation

The information of some data $d$ on a signal $s$ defined over some set $\Omega$, which in most applications will be a manifold like a sub-volume of $\mathbb{R}^n$, or the sphere in the case of a CMB signal, is completely contained in the posterior $P(s|d)$ of the signal given the data.^5 The expectation value of $s$ at some location $x \in \Omega$, and higher correlation functions of $s$, can all be obtained from the posterior by taking the appropriate average:

$$\langle s(x_1) \cdots s(x_n) \rangle_d \equiv \langle s(x_1) \cdots s(x_n) \rangle_{(s|d)} \equiv \int \mathcal{D}s\; s(x_1) \cdots s(x_n)\, P(s|d).$$ (20)

The problem is that often neither the expectation values nor even the posterior are easily calculated analytically, even for fairly simple data models. Fortunately, there is at least one class of data models for which the posterior and all its moments can be calculated exactly, namely the case in which the posterior turns out to be a multivariate Gaussian in $s$. In this case analytical formulae for all moments of the signal are known and are in principle computable. Technically, one is still often facing a huge, but linear, inverse problem. However, in the last decades a number of high-performance computational map-making techniques were developed to tackle such problems either on the sphere, for CMB research, or in flat spaces with one, two or three dimensions, for example for the reconstruction of the cosmic LSS (detailed references are given in Sect. I C). The purpose of this work is to show how to expand other posterior distributions around the Gaussian ones in a perturbative manner, which then permits the use of existing map-making codes for the computation of the resulting diagrammatic perturbation series.
Since diagrammatic perturbation series in Feynman diagrams are well known and understood in QFT and SFT, the most economical way is to reformulate the information-theoretical problem in a language which is as close as possible to that of the former two theories. Thereby, many of the results and concepts become directly available for signal inference problems. Moreover, it seems that expressing the optimal signal estimator in terms of Feynman diagrams immediately provides computationally efficient algorithms, since the diagrams encode the skeleton of the minimal necessary computational information flow.

Footnote 5: We are mostly dealing with scalar fields; however, multi-component, vector or tensor fields can be treated analogously, and many of the equations just have to be re-interpreted for such fields and stay valid.

E. Signal and data spaces

1. Discretisation and continuous limit

Both the signal and the data space may be continuous; in practice, however, they will most often be discrete, since digital data processing only permits a discretized representation of the distributed information. The space in which the data and signal discretisation happens can be chosen freely, and can of course just as well be a Fourier, wavelet or spherical harmonics space. Even if we would like to analyze a continuous signal, the computationally required discretisation will force an implicit redefinition of our actual signal to be the discretely sampled version of that continuous signal, and this discretisation step should also be part of the data model if it has the potential to significantly affect the analysis [e.g. see 211].

Although discretisation implies some information loss, it also has an advantage. We can just assume discretisation and therefore read all scalar and tensor products as being the usual, component-wise ones, now just in high-, but finite-dimensional vector spaces.
To be concrete, let $\{x_i\} \subset \Omega$ be a discrete set of $N_{\rm pix}$ pixel positions, each of which has a volume-size $V_i$ attributed to it. Then the scalar product of two discretized function-vectors $f = (f_i)$ and $g = (g_i)$, sampled at these points via $f_i = f(x_i)$ and $g_i = g(x_i)$, could be defined by

$$g^\dagger f \equiv \sum_{i=1}^{N_{\rm pix}} V_i\, g_i^*\, f_i.$$ (21)

The asterisk denotes complex conjugation. This scalar product has the continuous limit

$$g^\dagger f \longrightarrow \int dx\; g^*(x)\, f(x).$$ (22)

In many cases the actual volume normalization in Eq. 21 does not matter for final results, since it usually cancels out, and therefore $V_i$ is often dropped completely for equidistant sampling of signal and data spaces. The volume terms also disappear for a scalar product involving a function which is discretized via volume integration, $f_i = \int_{V_i} dx\, f(x)$, e.g. the number of counts within the cell $i$. Higher-order tensor products are defined analogously.

The path integral of a functional $F[f] \equiv F(f_1, \ldots, f_{N_{\rm pix}})$ over all realizations of such a discretized field $f$ is then just a high-dimensional volume integral, with as many dimensions as pixels:

$$\int \mathcal{D}f\; F[f] \equiv \left( \prod_{i=1}^{N_{\rm pix}} \int df_i \right) F(f_1, \ldots, f_{N_{\rm pix}}).$$ (23)

This definition of a finite-dimensional path integral is well normalized, since if we integrate over a probability distribution over $f$ which is separable for all pixels, $P(f) = \prod_{i=1}^{N_{\rm pix}} P_i(f_i)$, as e.g. for white and Poissonian noise, we find

$$\langle 1 \rangle_{(f)} = \int \mathcal{D}f\; P(f) = \prod_{i=1}^{N_{\rm pix}} \underbrace{\int df_i\; P_i(f_i)}_{=1} = 1.$$ (24)

Although in real data-analysis applications it is practically never required to perform the continuous limit $N_{\rm pix} \to \infty$ with $V_i \to 0$ for all $i$, we stress that this limit can formally be taken and is well defined even for the path integral, as we argue in more detail in Sec. IV B.
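The volume-weighted scalar product of Eq. 21 and its continuous limit, Eq. 22, can be illustrated numerically. The following is only a minimal sketch: the one-dimensional interval, the equal pixel volumes, and the two test functions are hypothetical choices made for illustration and do not come from the text.

```python
import numpy as np

def scalar_product(g, f, volumes):
    """Volume-weighted scalar product of Eq. 21: sum_i V_i * conj(g_i) * f_i."""
    return np.sum(volumes * np.conj(g) * f)

def sampled_product(n_pix, length=1.0):
    """Sample two illustrative real functions on n_pix equal pixels of [0, length]."""
    volumes = np.full(n_pix, length / n_pix)   # V_i, identical for all pixels here
    x = (np.arange(n_pix) + 0.5) * volumes     # pixel centres x_i
    f = x**2                                   # f_i = f(x_i), hypothetical choice
    g = np.exp(x)                              # g_i = g(x_i), hypothetical choice
    return scalar_product(g, f, volumes)

# Continuous limit (Eq. 22) for these functions: int_0^1 x^2 exp(x) dx = e - 2
exact = np.e - 2.0
coarse = sampled_product(8)
fine = sampled_product(4096)
print(abs(coarse - exact), abs(fine - exact))
```

With these choices the discretisation error shrinks as the pixel number grows, which is the behaviour the continuous limit describes.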
The basic argument is that suitable signals could and should be defined in such a way that path-integral divergences, which sometimes plague QFT, can easily be avoided by sensible signal design. Practically, the existence of a well-defined continuous limit of a well-posed IFT implies that two numerical implementations of a signal reconstruction problem which differ in their space discretisation on scales smaller than the structures of the signal can be expected to provide identical results up to a small discretisation difference, which vanishes with higher discretisation resolution.

2. Parameter spaces

In many applications, the signal space is identified with the physical space or with the sphere of the sky. However, IFT can also be done over parameter spaces. In Sec. VI, a field theory over the sphere will implicitly define the knowledge state for an unknown parameter of that theory, which can again be regarded as defining an information theory for that parameter. The latter is an IFT if the parameter has spatial variations.

However, there are also functions defined over a parameter space, $\Omega_{\rm parameter} = \{p\}$ for some parameter $p$, about which one might want to obtain knowledge from incomplete data. A very important one is the probability distribution of the parameter given the observational data, $P(p|d)$, which defines our parameter-knowledge state. This function may only be incompletely known and therefore require an IFT approach for its reconstruction and interpolation. Such incomplete knowledge of the function could be due to incomplete numerical sampling of its function values because of large computational costs and the huge volumes of multi-dimensional parameter spaces.
Or, there might be another unknown nuisance parameter $q$ in the problem, which induces an uncertainty in $P(p|d) = P(p|d)$ and therefore an IFT over all possible realizations of this knowledge state field function via

$$P[P(p|d)] = \int \mathcal{D}P(p|d)\; \delta\!\left( P(p|d) - \int dq\; P(p, q|d) \right).$$ (25)

If $q$ is a field, the marginalisation integral in the delta functional also becomes a path integral. Probabilistic decision theory, based on knowledge states as expressed by probability functions on parameters, has to deal with such complications. For inference directly on $p$, and not on the knowledge state $P(p|d)$, the marginalized probability

$$P(p|d) = \int dq\; P(p, q|d)$$ (26)

contains all relevant information, and that will be sufficient for most inference applications, and especially for the ones in this work.

III. BASIC FORMALISM

A. Information Hamiltonian

We argued that the posterior $P(s|d)$ contains all available information on the signal. Although the posterior might not be easily accessible mathematically, we assume in the following that the prior $P(s)$ of the signal before the data is taken, as well as the likelihood of the data given a signal, $P(d|s)$, are known or at least can be Taylor-Fréchet-expanded around some reference field configuration $t$. Then Bayes's theorem permits us to express the posterior as

$$P(s|d) = \frac{P(d, s)}{P(d)} = \frac{P(d|s)\,P(s)}{P(d)} \equiv \frac{1}{Z}\, e^{-H[s]}.$$ (27)

Here, the Hamiltonian

$$H[s] \equiv H_d[s] \equiv -\log\left[ P(d, s) \right] = -\log\left[ P(d|s)\, P(s) \right],$$ (28)

the evidence of the data

$$P(d) \equiv \int \mathcal{D}s\; P(d|s)\, P(s) = \int \mathcal{D}s\; e^{-H[s]} \equiv Z,$$ (29)

and the partition function $Z \equiv Z_d$ were introduced. It is extremely convenient to include a moment generating function into the definition of the partition function,

$$Z_d[J] \equiv \int \mathcal{D}s\; e^{-H[s] + J^\dagger s}.$$
(30)

This means $P(d) = Z = Z[0]$, but also permits one to calculate any moment of the signal field via Fréchet differentiation of Eq. 30,

$$\langle s(x_1) \cdots s(x_n) \rangle_d = \frac{1}{Z} \left. \frac{\delta^n Z_d[J]}{\delta J(x_1) \cdots \delta J(x_n)} \right|_{J=0}.$$ (31)

Of special importance are the so-called connected correlation functions or cumulants,

$$\langle s(x_1) \cdots s(x_n) \rangle^c_d \equiv \left. \frac{\delta^n \log Z_d[J]}{\delta J(x_1) \cdots \delta J(x_n)} \right|_{J=0},$$ (32)

which are corrected for the contribution of lower moments to a correlator of order $n$. For example, the connected mean and dispersion are expressed in terms of their unconnected counterparts as

$$\langle s(x) \rangle^c_d = \langle s(x) \rangle_d, \qquad \langle s(x)\, s(y) \rangle^c_d = \langle s(x)\, s(y) \rangle_d - \langle s(x) \rangle_d\, \langle s(y) \rangle_d,$$ (33)

where the last term represents such a correction. For Gaussian random fields all higher-order connected correlators vanish:

$$\langle s(x_1) \cdots s(x_n) \rangle^c_d = 0$$ (34)

for $n > 2$. For non-Gaussian random fields they are in general non-zero, and for later use we provide the connected three- and four-point functions,

$$\begin{aligned}
\langle s_x s_y s_z \rangle^c_d &= \langle (s_x - \bar{s}_x)(s_y - \bar{s}_y)(s_z - \bar{s}_z) \rangle_d, \\
\langle s_x s_y s_z s_u \rangle^c_d &= \langle (s_x - \bar{s}_x)(s_y - \bar{s}_y)(s_z - \bar{s}_z)(s_u - \bar{s}_u) \rangle_d \\
&\quad - \langle s_x s_y \rangle^c_d \langle s_z s_u \rangle^c_d - \langle s_x s_z \rangle^c_d \langle s_y s_u \rangle^c_d - \langle s_x s_u \rangle^c_d \langle s_y s_z \rangle^c_d,
\end{aligned}$$ (35)

where we used $s_x = s(x)$ and defined $\bar{s}_x = \langle s(x) \rangle_d$.

The assumption that the Hamiltonian can be Taylor-Fréchet expanded in the signal field permits us to write

$$H[s] = \frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H_0 + \sum_{n=3}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n}.$$ (36)

Repeated coordinates are thought to be integrated over. The first three Taylor coefficients have special roles. The constant $H_0$ is fixed by the normalization condition of the joint probability density of signal and data.
If $H'_d[s]$ denotes some unnormalised Hamiltonian, its normalization constant is given by

$$H_0 = \log \int \mathcal{D}s \int \mathcal{D}d\; e^{-H'_d[s]}.$$ (37)

Often $H_0$ is irrelevant unless different models or hyperparameters are to be compared.

We call the linear coefficient $j$ the information source. This term is usually directly and linearly related to the data. The quadratic coefficient, $D^{-1}$, defines the information propagator $D(x, y)$, which propagates information on the signal at $y$ to location $x$, and thereby permits, e.g., a partial reconstruction of the signal at locations where no data was taken. Finally, the anharmonic tensors $\Lambda^{(n)}$ create interactions between the modes of the free, harmonic theory. Since this free theory will be the basis for the full interaction theory, we first investigate the case $\Lambda^{(n)} = 0$.

B. Free theory

1. Gaussian data model

For our simplest data model we assume a Gaussian signal with prior

$$P(s) = \mathcal{G}(s, S) \equiv \frac{1}{|2\pi S|^{1/2}} \exp\left( -\frac{1}{2}\, s^\dagger S^{-1} s \right),$$ (38)

where $S = \langle s\, s^\dagger \rangle$ is the signal covariance. The signal is assumed here to be processed by nature and our measurement device according to a linear data model

$$d = R\, s + n.$$ (39)

Here, the response $R[s] = R\, s$ is linear in the signal $s$, and the noise $n_s = n$ is independent of it. The linear response matrix $R$ of our instrument can contain window and selection functions, blurring effects, and even a Fourier transformation of the signal space, if our instrument is an interferometer. Typically, the data space is discrete, whereas the signal space may be continuous. In that case the $i$-th data point is given by

$$d_i = \int dx\; R_i(x)\, s(x) + n_i.$$ (40)

We assume, for the moment but not in general, the noise to be signal-independent and Gaussian, and therefore distributed as

$$P(n|s) = \mathcal{G}(n, N),$$ (41)

where $N = \langle n\, n^\dagger \rangle$ is the noise covariance matrix.
Since the noise is just the difference between the data and the signal response, $n = d - R\, s$, the likelihood of the data is given by

$$P(d|s) = P(n = d - R\, s\,|\,s) = \mathcal{G}(d - R\, s, N),$$ (42)

and thus the Hamiltonian of the Gaussian theory is

$$H_G[s] = -\log\left[ P(d|s)\, P(s) \right] = -\log\left[ \mathcal{G}(d - R\, s, N)\; \mathcal{G}(s, S) \right] = \frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H^G_0.$$ (43)

Here

$$D = \left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1}$$ (44)

is the propagator of the free theory. The information source,

$$j = R^\dagger N^{-1} d,$$ (45)

depends linearly on the data in a response-over-noise weighted fashion and reads

$$j(x) = \sum_{ij} R^*_i(x)\, N^{-1}_{ij}\, d_j$$ (46)

in the case of discrete data but continuous signal spaces. Finally,

$$H^G_0 = \frac{1}{2}\, d^\dagger N^{-1} d + \frac{1}{2} \log\left( |2\pi S|\, |2\pi N| \right)$$ (47)

has absorbed all $s$-independent normalization constants.

The partition function of the free field theory,

$$Z_G[J] = \int \mathcal{D}s\; e^{-H_G[s] + J^\dagger s} = \int \mathcal{D}s\; \exp\left\{ -\frac{1}{2}\, s^\dagger D^{-1} s + (J + j)^\dagger s - H^G_0 \right\},$$ (48)

is a Gaussian path integral, which can be calculated exactly, yielding

$$Z_G[J] = \sqrt{|2\pi D|}\; \exp\left\{ +\frac{1}{2}\, (J + j)^\dagger D\, (J + j) - H^G_0 \right\}.$$ (49)

The explicit partition function permits us to calculate via Eq. 32 the expectation of the signal given the data, in the following called the map $m_d$ generated by the data $d$:

$$m_d = \langle s \rangle_d = \left. \frac{\delta \log Z_G}{\delta J} \right|_{J=0} = D\, j = \underbrace{\left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1} R^\dagger N^{-1}}_{F_{\rm WF}}\; d.$$ (50)

The last expression shows that the map is given by the data after applying a generalized Wiener filter, $m_d = F_{\rm WF}\, d$. The propagator $D(x, y)$ describes how the information on the density field contained in the data at location $x$ propagates to position $y$: $m(y) = \int dx\, D(y, x)\, j(x)$.

The connected autocorrelation of the signal given the data,

$$\langle s\, s^\dagger \rangle^c_d = D = \left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1},$$ (51)

is the propagator itself. All higher connected correlation functions are zero.
Therefore, the signal given the data is a Gaussian random field around the mean $m_d$, with the variance of the residual error

$$r = s - m_d$$ (52)

provided by the propagator itself, as a straightforward calculation shows:

$$\langle r\, r^\dagger \rangle_d = \langle s\, s^\dagger \rangle_d - \langle s \rangle_d\, \langle s^\dagger \rangle_d = \langle s\, s^\dagger \rangle^c_d = D.$$ (53)

Thus, the posterior should simply be a Gaussian given by

$$P(s|d) = \mathcal{G}(s - m_d, D).$$ (54)

As a test of the latter equation, we calculate the evidence of the free theory via

$$P(d) = \frac{P(d|s)\, P(s)}{P(s|d)} = \frac{\mathcal{G}(d - R\, s, N)\; \mathcal{G}(s, S)}{\mathcal{G}(s - D\, j, D)} = \left( \frac{|D|}{|S|\, |2\pi N|} \right)^{1/2} \exp\left\{ \frac{1}{2} \left( j^\dagger D\, j - d^\dagger N^{-1} d \right) \right\},$$ (55)

which is indeed independent of $s$ and also identical to $Z_G[0]$, as it should be.

2. Free classical theory

The Hamiltonian permits us to ask for classical equations derived from an extremal principle. This is justified, on the one hand, as being just the result of the saddle-point approximation of the exponential in the partition function. On the other hand, the extremal principle is equivalent to the maximum a posteriori (MAP) estimator, which is quite commonly used for the construction of signal filters. An exhaustive introduction to and discussion of the MAP approximation to Gaussian and non-Gaussian signal fields is provided by Lemm [48].

The classical theory is expected to capture essential features of the field theory. However, if the field fluctuations are able to probe phase-space regions away from the maximum in which the Hamiltonian (or posterior) has a more complex structure, deviations between classical and field theory should become apparent.

Extremizing the Hamiltonian of the free theory (Eq. 43),

$$\left. \frac{\delta H_G}{\delta s} \right|_{s=m} = D^{-1} m - j \equiv 0,$$ (56)

we get the classical mapping equation $m = D\, j$, which is identical to the field-theoretical result (Eq. 50).
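As a numerical illustration of Eqs. 44, 45, 50 and 56, the following minimal sketch builds the propagator, the information source and the Wiener-filter map for a small real-valued toy problem, so that the dagger reduces to a transpose. The covariance kernel, the noise level, the masking response and the random seed are assumptions made only for this example. The comparison with the data-space form $S R^\dagger (R S R^\dagger + N)^{-1} d$ relies on a standard matrix identity rather than on anything specific to this text.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, purely for reproducibility
n_sig, n_dat = 6, 4

# Assumed toy covariances: exponential signal kernel, white noise.
x = np.arange(n_sig)
S = np.exp(-np.abs(x[:, None] - x[None, :]) / 2.0)   # signal covariance S
N = 0.1 * np.eye(n_dat)                               # noise covariance N
R = np.eye(n_dat, n_sig)                              # response: only the first 4 pixels are observed

# Draw one realisation of the linear data model d = R s + n (Eq. 39).
s = rng.multivariate_normal(np.zeros(n_sig), S)
n = rng.multivariate_normal(np.zeros(n_dat), N)
d = R @ s + n

N_inv = np.linalg.inv(N)
D = np.linalg.inv(np.linalg.inv(S) + R.T @ N_inv @ R)   # propagator, Eq. 44
j = R.T @ N_inv @ d                                     # information source, Eq. 45
m = D @ j                                               # map / posterior mean, Eq. 50

# Equivalent data-space form of the Wiener filter.
m_alt = S @ R.T @ np.linalg.solve(R @ S @ R.T + N, d)

# Gradient of the free Hamiltonian at the map, Eq. 56; should vanish.
grad = np.linalg.solve(D, m) - j

print(np.max(np.abs(m - m_alt)), np.max(np.abs(grad)))
```

The unobserved pixels still receive a non-zero map value, because the propagator $D$ carries information from the observed pixels to them, and the diagonal of $D$ never exceeds that of $S$.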
It is also possible to measure the sharpness of the maximum of the posterior by calculating the Hessian curvature matrix

$$\mathcal{H}_G[m] = \left. \frac{\delta^2 H[s]}{\delta s^2} \right|_{s=m} = D^{-1}.$$ (57)

In the Gaussian approximation of the maximum of the posterior, the inverse of the Hessian is identical to the covariance of the residual,

$$\langle r\, r^\dagger \rangle = \mathcal{H}^{-1}[m] = D,$$ (58)

which for the pure Gaussian model is of course identical to the exact result, as given by the field theory (Eq. 53).

IV. INTERACTING INFORMATION FIELDS

A. Interaction Hamiltonian

1. General form

All results of the free theory presented so far are well known within the field of signal reconstruction. IFT reproduces them elegantly, and is therefore of pedagogical value. However, the new results presented in the rest of this paper arise as soon as one leaves the free theory. Non-Gaussian signal or noise, a non-linear response, or a signal-dependent noise create anharmonic terms in the Hamiltonian. These describe interactions between the eigenmodes of the free Hamiltonian.

We assume the Hamiltonian can be Taylor expanded in the signal fields, which permits us to write

$$H[s] = \underbrace{\frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H^G_0}_{H_G[s]} + \underbrace{\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n}}_{H_{\rm int}[s]}.$$ (59)

Repeated coordinates are thought to be integrated over. In contrast to Eq. 36, we have now included perturbations which are constant, linear and quadratic in the signal field, because we are summing from $n = 0$. This permits us to treat certain non-ideal effects perturbatively. For example, if a mostly position-independent propagator gets a small position-dependent contamination, it might be more convenient to treat the latter perturbatively and not to include it in the propagator used in the calculation. Note further that all coefficients can be assumed to be symmetric with respect to their coordinate indices.
Footnote 6: This means $D_{xy} = D_{yx}$ and $\Lambda^{(n)}_{x_{\pi(1)} \ldots x_{\pi(n)}} = \Lambda^{(n)}_{x_1 \ldots x_n}$ with $\pi$ any permutation of $\{1, \ldots, n\}$, since even non-symmetric coefficients would automatically be symmetrized by the integration over all repeated coordinates. Therefore, we assume in the following that such a symmetrization operation has already been done, or we impose it by hand before we continue with any perturbative calculation by applying
$$\Lambda^{(n)}_{x_1 \ldots x_n} \longmapsto \frac{1}{n!} \sum_{\pi \in P_n} \Lambda^{(n)}_{x_{\pi(1)} \ldots x_{\pi(n)}}.$$
This clearly leaves any symmetric tensor invariant if $P_n$ is the space of all permutations of $\{1, \ldots, n\}$.

Often, it is more convenient to work with a shifted field $\phi = s - t$, where some (e.g. background) field $t$ is removed from $s$. The Hamiltonian of $\phi$ reads

$$H'[\phi] = \underbrace{\frac{1}{2}\, \phi^\dagger D^{-1} \phi - j'^\dagger \phi + H'_0}_{H'_G[\phi]} + \underbrace{\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda'^{(n)}_{x_1 \ldots x_n}\, \phi_{x_1} \cdots \phi_{x_n}}_{H'_{\rm int}[\phi]},$$

with

$$H'_0 = H^G_0 - j^\dagger t + \frac{1}{2}\, t^\dagger D^{-1} t, \qquad j' = j - D^{-1} t, \qquad \Lambda'^{(m)}_{x_1 \ldots x_m} = \sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(m+n)}_{x_1 \ldots x_{m+n}}\, t_{x_{m+1}} \cdots t_{x_{m+n}}.$$ (60)

2. Feynman rules

Since all the information on any correlation functions of the fields is contained in the partition sum and can be extracted from it, only the latter needs to be calculated:

$$\begin{aligned}
Z[J] &= \int \mathcal{D}s\; e^{-H[s] + J^\dagger s} \\
&= \int \mathcal{D}s\; \exp\left[ -\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n} \right] e^{-H_G[s] + J^\dagger s} \\
&= \exp\left[ -\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, \frac{\delta}{\delta J_{x_1}} \cdots \frac{\delta}{\delta J_{x_n}} \right] \int \mathcal{D}s\; e^{-H_G[s] + J^\dagger s} \\
&= \exp\left[ -H_{\rm int}\!\left[ \frac{\delta}{\delta J} \right] \right] Z_G[J].
\end{aligned}$$ (61)

There exist well-known diagrammatic expansion techniques for such expressions [e.g. 57]. The expansion terms of the logarithm of the partition sum, from which any connected moments can be calculated, are represented by all possible connected diagrams built out of lines, vertices (with a number of legs connecting to lines),
and without any external line ends (any line ends in a vertex). These diagrams are interpreted according to the following Feynman rules:

1. Open ends of lines in diagrams correspond to external coordinates and are labeled by such. Since the partition sum in particular does not depend on any external coordinate, it is calculated only from summing up closed diagrams. However, the field expectation value $m(x) = \langle s(x) \rangle_{(s|d)} = \left. d \log Z[J] / dJ(x) \right|_{J=0}$ and higher-order correlation functions depend on coordinates and therefore are calculated from diagrams with one or more open ends, respectively.

2. A line with coordinates $x'$ and $y'$ at its ends represents the propagator $D_{x'y'}$ connecting these locations.

3. Vertices with one leg get an individual internal, integrated coordinate $x'$ and represent the term $j_{x'} + J_{x'} - \Lambda^{(1)}_{x'}$.

4. Vertices with $n$ legs represent the term $-\Lambda^{(n)}_{x'_1 \ldots x'_n}$, where each individual leg is labeled by one of the internal coordinates $x'_1, \ldots, x'_n$. This more complex vertex structure, as compared to QFT, is a consequence of non-locality in IFT.

5. All internal (and therefore repeatedly occurring) coordinates are integrated over, whereas external coordinates are not.

6. Every diagram is divided by its symmetry factor, the number of permutations of vertex legs leaving the topology invariant, as described in any book on field theory [e.g. 57].

The $n$-th connected moment of $s$ is generated by taking the $n$-th derivative of $\log Z[J]$ with respect to $J$, and then setting $J = 0$. This corresponds to removing $n$ end-vertices from all diagrams.
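The statement that derivatives of $\log Z[J]$ at $J = 0$ generate the connected moments can be checked numerically in the simplest setting. The sketch below is restricted to the free theory, where Eq. 49 gives $\log Z_G[J] = \frac{1}{2}(J + j)^\dagger D (J + j)$ up to $J$-independent constants, and uses central finite differences in place of functional derivatives. The two-pixel propagator and source are hypothetical values chosen only for illustration.

```python
import numpy as np

def log_Z_G(J, D, j):
    """log Z_G[J] of Eq. 49, dropping all J-independent constants (real-valued case)."""
    v = J + j
    return 0.5 * v @ D @ v

# Hypothetical two-pixel propagator (symmetric, positive definite) and source.
D = np.array([[2.0, 0.5],
              [0.5, 1.0]])
j = np.array([1.0, -3.0])

eps = 1e-3          # finite-difference step
e = np.eye(2)       # unit vectors in J-space

# First derivative at J = 0: connected one-point function, expected to equal D j (Eq. 50).
mean_fd = np.array([
    (log_Z_G(eps * e[a], D, j) - log_Z_G(-eps * e[a], D, j)) / (2 * eps)
    for a in range(2)
])

# Second derivative at J = 0: connected two-point function, expected to equal D (Eq. 51).
cov_fd = np.array([
    [
        (log_Z_G(eps * (e[a] + e[b]), D, j)
         - log_Z_G(eps * (e[a] - e[b]), D, j)
         - log_Z_G(eps * (e[b] - e[a]), D, j)
         + log_Z_G(-eps * (e[a] + e[b]), D, j)) / (4 * eps**2)
        for b in range(2)
    ]
    for a in range(2)
])

print(mean_fd, D @ j)
print(cov_fd)
```

Because $\log Z_G$ is quadratic in $J$, all third and higher derivatives vanish, which is the finite-dimensional counterpart of Eq. 34.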
For example, the first four diagrams contributing to a map ($m = \langle s \rangle_{(s|d)}$) are

$$\begin{aligned}
\text{[diagram 1]} &= D\, j = D_{xy}\, j_y \equiv \int dy\; D(x, y)\, j(y), \\
\text{[diagram 2]} &= -\frac{1}{2}\, D\, \Lambda^{(3)}[\cdot, D] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(3)}_{yzu}\, D_{zu} \equiv -\frac{1}{2} \int dy\; D_{xy} \int dz \int du\; \Lambda^{(3)}_{yzu}\, D_{zu}, \\
\text{[diagram 3]} &= -\frac{1}{2}\, D\, \Lambda^{(3)}[\cdot, D j, D j] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(3)}_{yuz}\, D_{zz'}\, j_{z'}\, D_{uu'}\, j_{u'} \\
&\equiv -\frac{1}{2} \int dy\; D_{xy} \int dz \int du\; \Lambda^{(3)}_{yzu} \int dz'\, D_{zz'}\, j_{z'} \int du'\, D_{uu'}\, j_{u'}, \quad \text{and} \\
\text{[diagram 4]} &= -\frac{1}{2}\, D\, \Lambda^{(4)}[\cdot, D, D j] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(4)}_{yzuv}\, D_{zu}\, D_{vv'}\, j_{v'} \\
&\equiv -\frac{1}{2} \int dy\; D_{xy} \int dz \int du \int dv\; \Lambda^{(4)}_{yzuv}\, D_{zu} \int dv'\, D_{vv'}\, j_{v'}.
\end{aligned}$$ (62)

Here we have assumed that any first- and second-order perturbation was absorbed into the data source and the propagator, thus $\Lambda^{(1)} = \Lambda^{(2)} = 0$. Repeated indices are assumed to be integrated (or summed) over.

3. Local interactions and Fourier space rules

In the case of purely local interactions,

$$\Lambda^{(n)}_{x_1 \ldots x_n} = \lambda_n(x_1)\, \delta(x_1 - x_2) \cdots \delta(x_1 - x_n),$$ (63)

the interaction Hamiltonian reads

$$H_{\rm int} = \sum_{m=0}^{\infty} \frac{1}{m!}\, \lambda_m^\dagger\, s^m$$ (64)

and the expressions of the Feynman diagrams simplify considerably. The fourth Feynman rule can be replaced by

4. Vertices with $n$ lines connected to them are associated with a single internal coordinate $x'$ and represent the term $-\lambda_n(x')$.

For example, the last loop diagram in Eq. 62 becomes

$$\text{[diagram 4]} = -\frac{1}{2} \int dy\; D_{xy}\, \lambda_4(y)\, D_{yy} \int dz\; D_{yz}\, j_z.$$ (65)

In the case of local interactions, it can be helpful to do the calculations in Fourier space, for which the Feynman rules can be obtained by inserting a real-space identity operator $1 = F^\dagger F$ in between any scalar product and assigning the inverse Fourier transformation $F^\dagger$ to the left and the forward transform $F$ to the right term, e.g.

$$D\, j = F^\dagger\, \underbrace{F D F^\dagger}_{D'}\; \underbrace{F j}_{j'} = F^\dagger D' j'.$$

This yields:

1. An open end of a line has an external momentum coordinate $k$, and gets an $\int dk\, e^{-ikx} / (2\pi)^n$ applied to it if real-space functions are to be evaluated.
2. A line connecting momentum $k$ with momentum $k'$ corresponds to a directed propagator between these momenta: $D_{kk'} = D(k, k')$.

3. A data source vertex is $(j + J - \lambda_1)(k'')$, where $k''$ is the momentum at the data end of the line.

4. A vertex with $m > 1$ lines with momentum labels $k_1, \ldots, k_m$ is $-\lambda_m(k_0)\, (2\pi)^n\, \delta\!\left( \sum_{i=0}^{m} k_i \right)$.

5. An internal end of a line has an internal (integrated) momentum coordinate $k'$. Integration means a term $\int dk' / (2\pi)^n$ in front of the expression.

6. The expression gets divided by the symmetry factor of its diagram.

Here, $j(k) = (F j)(k) = \int dx\, j(x)\, e^{ikx}$, $D(k, k') = (F D F^\dagger)(k, k') = \int dx \int dx'\, D(x, x')\, e^{i(kx - k'x')}$, etc. are the Fourier-transformed information source, propagator, etc., respectively. Note that momentum directions have to be taken into account. The momenta that go into a vertex, data source or open end get a positive sign in the delta function of momentum conservation; the ones that go out of a vertex get a minus sign.

4. Simplistic interaction Hamiltonians

In order to have a toy case which permits analytic calculations, we introduce a simplistic Hamiltonian by requiring the data model to be translationally invariant and all interaction terms to be local. This is the case whenever the signal and noise covariances are fully characterized by power spectra over the same spatial space,

$$S(k, q) = (2\pi)^n\, \delta(k - q)\, P_S(k),$$ (66)

$$N(k, q) = (2\pi)^n\, \delta(k - q)\, P_N(k),$$ (67)

with $P_S(k) = \langle |s(k)|^2 \rangle / V$ and $P_N(k) = \langle |n(k)|^2 \rangle / V$, where $V$ is the volume of the system.
We assume further that the signal processing can be completely described by a convolution with an instrumental beam,

$$
d(x) = \int dy\, R(x - y)\, s(y) + n(x), \tag{68}
$$

where the response-convolution kernel has a Fourier power spectrum $P_R(k) = |R(k)|^2$ (no factor $1/V$). In this case $D$ can be fully described by a power spectrum:

$$
D(k, q) = (2\pi)^n\, \delta(k - q)\, P_D(k), \tag{69}
$$

with $P_D(k) = \left( P_S^{-1}(k) + P_R(k)\, P_N^{-1}(k) \right)^{-1}$. The locality of the interaction terms requires $\lambda_m = \mathrm{const}$ in addition to translational invariance, and therefore the interaction Hamiltonian reads

$$
\begin{aligned}
H_{\rm int}[s] &= \sum_{m=1}^{\infty} \frac{\lambda_m}{m!} \int dx\, s^m(x) \\
 &= \sum_{m=1}^{\infty} \frac{\lambda_m}{m!} \left( \prod_{i=1}^{m} \int \frac{dk_i}{(2\pi)^n}\, s_{k_i} \right) (2\pi)^n\, \delta\!\left( \sum_{j=1}^{m} k_j \right).
\end{aligned}
\tag{70}
$$

In that case, the Feynman rules simplify considerably. For the interaction Hamiltonian of Eq. 70, the Feynman rules are now:

1. unintegrated $x$-coordinate: $\exp(-i k x)$ (if real space functions are to be evaluated),
2. propagator: $P_D(k)$,
3. data source vertex: $(j + J - \lambda_1)(k)$,
4. vertex with $m > 1$ lines: $-\lambda_m$,
5. impose momentum conservation at each vertex: $(2\pi)^n\, \delta\!\left(\sum_{i=1}^{m} k_i\right)$, and integrate over every internal momentum: $\int \frac{dk}{(2\pi)^n}$,
6. and divide by the symmetry factor.

5. Feynman rules on the sphere

For CMB reconstruction and analysis, but presumably also for terrestrial applications, the Feynman rules on the sphere $\Omega = S^2$ are needed and therefore provided in Appendix B.

B. Normalisability of the theory

In contrast to QFT, IFT should be properly normalized and not necessarily require any renormalization procedure. The reason is that IFT is not a low-energy limit of some unknown high-energy theory, but can be set up as the full (high-energy) theory.
The Hamiltonian is just the negative logarithm of the joint probability function of data and signal, $H_d[s] \equiv -\log[P(d, s)]$, and therefore well defined and properly normalized if the latter is. Normalization should be an issue only if ad-hoc Hamiltonians are set up, or if approximations lead to ill-normalized theories.

However, since we are trying a perturbative expansion of the theory, there is no guarantee that all individual terms provide finite results. For example in QFT, simple loop diagrams are known to be divergent and require renormalization. In the following we investigate a simplistic, but representative case of IFT, which shows that such problems are generally not to be expected.

Let us adopt the simplistic situation described in IV A 4 and estimate a simple loop diagram for which we assume for notational convenience $\lambda_3 = -2\,(2\pi)^n\, \lambda'$ (with $\lambda' > 0$):

$$
\begin{aligned}
\text{[loop diagram]} &= -\tfrac{1}{2}\, D\, \lambda_3\, \widehat{D} \\
 &= \lambda' \int dk \int dk'\, \delta(k + k' - k')\, P_D(k)\, P_D(k')\, e^{i k x} \\
 &\le \lambda'\, P_D(0) \int dk'\, P_S(k') = \lambda'\, V\, P_D(0)\, \langle s^2(x) \rangle,
\end{aligned}
\tag{71}
$$

where $V$ is the volume of the system. Here and in the following, $\widehat{C}$ denotes the diagonal of the matrix $C$. Thus, as long as the signal field is of bounded variance, the loop diagram is convergent due to $P_D \le P_S$ for all $k$. Even a signal of unbounded variance would not lead to a divergent loop diagram if $\int dk\, (P_N / P_R)(k)$ is finite, since we also have $P_D \le P_N / P_R$. A bounded variance signal is very natural, especially in a cosmological setting.⁷

Finally, since a signal as an information field can be chosen freely, we can define it to be the filtered version of the physical field (e.g. dark matter distribution or CMB fluctuations), so that only modes of sufficiently bounded variance are present in it. Since we have the freedom to choose information fields which are mathematically well behaved, we can therefore ensure convergence of expressions.
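The two inequalities used in Eq. 71, $P_D \le P_S$ and $P_D \le P_N / P_R$, can be checked numerically. The following is a minimal sketch in one dimension. The Lorentzian signal spectrum, the Gaussian beam, the white noise level, and all parameter values are assumptions chosen for illustration only; the function name `propagator_spectrum` is likewise hypothetical.

```python
import numpy as np

def propagator_spectrum(P_S, P_R, P_N):
    """P_D(k) = (1/P_S + P_R/P_N)^(-1), the expression given below Eq. 69."""
    return 1.0 / (1.0 / P_S + P_R / P_N)

# Illustrative (assumed) spectra on a symmetric 1D momentum grid.
k = np.linspace(-200.0, 200.0, 4001)
q = 25.0                                  # inverse correlation length (assumed)
P_S = 2.0 * q / (k**2 + q**2)             # Lorentzian signal spectrum
P_R = np.exp(-(0.02 * k) ** 2)            # Gaussian beam power spectrum (assumed)
P_N = np.full_like(k, 0.1)                # white noise (assumed)

P_D = propagator_spectrum(P_S, P_R, P_N)

dk = k[1] - k[0]
loop_integral = np.sum(P_D) * dk / (2.0 * np.pi)    # ~ integral of P_D
signal_variance = np.sum(P_S) * dk / (2.0 * np.pi)  # ~ integral of P_S

# The propagator is bounded by both the prior and the noise-to-response spectrum.
print("P_D <= P_S everywhere:      ", bool(np.all(P_D <= P_S * (1 + 1e-12))))
print("P_D <= P_N / P_R everywhere:", bool(np.all(P_D <= (P_N / P_R) * (1 + 1e-12))))
print("loop integral <= variance:  ", bool(loop_integral <= signal_variance))
```

Both bounds follow directly from the harmonic-sum form of $P_D$, so the check is a consistency test of the discretised spectra rather than a proof of convergence.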
Although this is not a general proof of normalisability of the theory, which is beyond the scope of this paper, it should provide confidence in the well-behavedness of the formalism in sensible applications. The price to be paid for this well-behavedness is the more complex structure of the propagator, which, in comparison to QFT, even in simplistic cases can be non-analytical and require numerical evaluation.

⁷ The cosmological signal of primary interest, the initial density fluctuations as revealed by the large-scale structure and the CMB, is expected to exhibit a suppression of small-scale power due to the free-streaming of dark matter particles before they became non-relativistic. Also the CMB temperature fluctuations are damped on small scales, due to free streaming of photons around the time of recombination.

C. Expansion around the classical solution

1. General case

The classical solution of the Hamiltonian in Eq. 59 is provided by its minimum,

$$
\frac{\delta H}{\delta s_x} = D^{-1}_{xy}\, s_y - j_x + \sum_{m=1}^{\infty} \frac{1}{m!}\, \Lambda^{(m+1)}_{x x_1 \ldots x_m}\, s_{x_1} \cdots s_{x_m} = 0 . \tag{72}
$$

This leads to the equation for the classical field

$$
s^{\rm cl}_y = D_{yx} \left( j_x - \sum_{m=1}^{\infty} \frac{1}{m!}\, \Lambda^{(m+1)}_{x x_1 \ldots x_m}\, s^{\rm cl}_{x_1} \cdots s^{\rm cl}_{x_m} \right), \tag{73}
$$

which one can try to solve iteratively.

2. Local interactions

For simplicity, we concentrate for a moment on the case of purely local interactions, for which the equation for the classical field $s_{\rm cl}$ is

$$
s_{\rm cl} = D \left( j - \sum_{m=1}^{\infty} \frac{\lambda^{\dagger}_{m+1}}{m!}\, s_{\rm cl}^{m} \right). \tag{74}
$$

Iterating this equation and rewriting the resulting terms as Feynman diagrams shows that the classical solution contains the tree-diagrams. The loop diagrams can be added by investigation of the non-classical uncertainty field $\phi = s - s_{\rm cl}$. A non-classical expansion of the information field around the classical field is possible by inserting $s = s_{\rm cl} + \phi$ into the Hamiltonian (Eq. 64).
Reordering terms according to the powers of the field $\phi$ leads to its Hamiltonian

$$
\begin{aligned}
H'[\phi] &\equiv H[s_{\rm cl} + \phi] = H'_0 + \tfrac{1}{2}\, \phi^{\dagger} D'^{-1} \phi - j'^{\dagger} \phi + \sum_{m=3}^{\infty} \frac{1}{m!}\, {\lambda'_m}^{\dagger} \phi^m, \quad \text{with} \\
\lambda'_n &\equiv \sum_{m=0}^{\infty} \frac{\lambda_{n+m}}{m!}\, s_{\rm cl}^{m}, \\
H'_0 &\equiv H[s_{\rm cl}] = H_0 + \tfrac{1}{2}\, s_{\rm cl}^{\dagger} D^{-1} s_{\rm cl} + \lambda'_0, \\
j' &\equiv j - \lambda'_1 - D^{-1} s_{\rm cl}, \quad \text{and} \\
D' &\equiv \left( D^{-1} + \widehat{\lambda'_2} \right)^{-1} .
\end{aligned}
\tag{75}
$$

In case $s_{\rm cl}$ is exactly the classical solution, Eqs. 74 and 75 imply that $j' = 0$. Thus, there are no one-line internal vertices in any Feynman graphs of the $\phi$-theory, and only loop diagrams contribute uncertainty-corrections⁸ to any information theoretical estimator. For example, the uncertainty-corrections to the classical map estimator are given by

$$
\delta m = m_d - s_{\rm cl} = \langle \phi \rangle_d = \text{[sum of loop diagrams]} + \ldots \tag{76}
$$

However, in case $s_{\rm cl}$ is not (exactly) the classical solution, be it due to a truncation error of an iteration scheme used to solve for the classical field, or because $s_{\rm cl}$ was chosen for a completely different purpose, Eq. 75 provides the correct field theory for $\phi = s - s_{\rm cl}$ independent of the nature of $s_{\rm cl}$. In case of a truncation error, incorporating diagrams with data-source terms $j'$ into any computation will permit correcting the inaccuracy of $s_{\rm cl}$ in a systematic way.

D. Boltzmann-Shannon Information

1. Helmholtz free energy

Information fields carry information on distributed physical quantities. The amount of signal-information should be measurable in information units like bits and bytes. This is possible by adopting the Boltzmann-Shannon information measure of negative entropy. The entropy of a signal probability function measures the phase-space volume available for signal uncertainties, and therefore how tightly the remaining uncertainties are constrained. Thus we define

$$
I_d \equiv \int \mathcal{D}s\, P(s|d)\, \log P(s|d) = -\int \mathcal{D}s\, \frac{1}{Z}\, e^{-H[s]} \left( H[s] + \log Z \right) = -\langle H[s] \rangle_d - \log Z \tag{77}
$$

as the information measure.
Introducing

$$
Z_\beta[d, J] = \int \mathcal{D}s\, \exp\!\left( -\beta\, ( H[s] - J^{\dagger} s ) \right), \quad \text{and} \quad
F_\beta[d, J] = -\frac{1}{\beta} \log Z_\beta[d, J], \tag{78}
$$

of which the latter is the Helmholtz free energy as a function of the inverse temperature $\beta$, we can write

$$
I_d = -\log Z_1[d, 0] - \langle H[s] \rangle_d = -\left. \frac{\partial F_\beta[d, J]}{\partial \beta} \right|_{\beta = 1,\, J = 0}, \tag{79}
$$

as can be verified by a direct calculation. The first expression for $I_d$ in Eq. 79 is equivalent to the well known thermodynamic relation $F = E - T S_B$ with the internal energy $E = \langle H[s] \rangle_d$, the Boltzmann entropy $S_B = -I_d$ and the temperature, which is set here to $T = 1$. The second expression actually holds even if the Hamiltonian is improperly normalized, e.g. $H_0$ can be chosen arbitrarily if $Z_\beta[d, J]$ is calculated consistently with this choice.

⁸ We propose the term uncertainty-corrections in order to describe the influence of the spread of the probability distribution function around its maximum. The uncertainty-corrections are the information field theoretical equivalent to quantum-corrections in quantum field theories.

The Helmholtz free energy $F_\beta[J]$ is also the generator of all connected correlation functions of the signal,

$$
\langle s_{x_1} \cdots s_{x_n} \rangle^{c}_{(s|d)} = -\left. \frac{\delta^n F_\beta[d, J]}{\delta J_{x_1} \cdots \delta J_{x_n}} \right|_{\beta = 1,\, J = 0} .
$$

It can be calculated as follows:

$$
\begin{aligned}
F_\beta &= -\frac{1}{\beta} \log \left( \frac{Z^{G}_\beta[J]}{Z^{G}_\beta[J]} \int \mathcal{D}s\, e^{-\beta H_{\rm int}[s]}\, e^{-\beta ( H_G[s] - J^{\dagger} s )} \right) \\
 &= -\frac{1}{\beta} \log Z^{G}_\beta[J] - \frac{1}{\beta} \log \left\langle e^{-\beta H_{\rm int}[s]} \right\rangle_{(s|J+j,\, G)},
\end{aligned}
\tag{80}
$$

where the average in the last term is over the Gaussian probability function $P^{G}_{J,\beta}[s] = \exp( -\beta ( H_G[s] - J^{\dagger} s ) ) / Z^{G}_\beta[J]$. This term can be calculated by using the well-known fact that the logarithm of the sum of all possible connected and unconnected diagrams with only internal coordinates (or without free ends), as generated by the exponential function of the interaction terms, is given by the sum of all connected diagrams [57].
For example, a free theory, perturbed by small, up-to-fourth-order interaction terms (all being proportional to some small parameter $\gamma$), has

$$
F_\beta[J] = \underbrace{H^{G}_0 + \Lambda^{(0)}}_{H_0} - \beta^{-1} \left( \text{[sum of connected diagrams]} \right) + \mathcal{O}(\gamma^2), \tag{81}
$$

where an information source vertex reads $\beta\, (J + j - \Lambda^{(1)})$, an internal vertex with $n$ lines $\beta\, \Lambda^{(n)}$, and the propagator $\beta^{-1} D$. Finally, we have defined

$$
\text{[loop diagram]} = \tfrac{1}{2} \log | 2\pi D \beta^{-1} | = \tfrac{1}{2}\, \mathrm{Tr}( \log( 2\pi D \beta^{-1} ) ) .
$$

Thus, we have

$$
\begin{aligned}
F_\beta[J] = {} & H_0 - \frac{1}{2\beta}\, \mathrm{Tr}( \log( 2\pi D \beta^{-1} ) ) + \frac{1}{2\beta}\, \Lambda^{(2)}[D] \\
 & + \frac{1}{2}\, ( J + j - \Lambda^{(1)} )^{\dagger} ( D + \Lambda^{(2)} ) ( J + j - \Lambda^{(1)} ) \\
 & + \frac{1}{2\beta}\, \Lambda^{(3)}[D, m_J] + \frac{1}{3!}\, \Lambda^{(3)}[m_J, m_J, m_J] \\
 & + \frac{1}{8\beta^2}\, \Lambda^{(4)}[D, D] + \frac{1}{4\beta}\, \Lambda^{(4)}[D, m_J, m_J] + \frac{1}{4!}\, \Lambda^{(4)}[m_J, m_J, m_J, m_J] + \mathcal{O}(\gamma^2),
\end{aligned}
\tag{82}
$$

where we introduced the zero-order map $m_J = D\, (J + j)$ for notational convenience. The power of $\beta$ associated with the different diagrams in Eq. 81 is given by the number of vertices minus the number of propagators minus one. Thus, all tree diagrams are of order $\beta^0$, the one-loop diagrams are of order $\beta^{-1}$ and the two-loop diagram of order $\beta^{-2}$, and only the latter two affect the information content:

$$
\begin{aligned}
I_d &= -\left[ \tfrac{1}{2}\, \mathrm{Tr}(1) + \text{[sum of loop diagrams]} \right] + \mathcal{O}(\gamma^2) \\
 &= \tfrac{1}{2} \left[ -\mathrm{Tr}( 1 + \log( 2\pi D ) ) + \Lambda^{(2)}[D] + \Lambda^{(3)}[D, m_0] + \tfrac{1}{2}\, \Lambda^{(4)}[ D \otimes ( D + m_0 m_0^{\dagger} ) ] \right] + \mathcal{O}(\gamma^2),
\end{aligned}
\tag{83}
$$

where $\beta = 1$, $J = 0$, and thus $m_0 = D j$.⁹

⁹ Here, we introduced the symmetrized tensor product $A \otimes B$ of an $n$-rank tensor $A$ and an $m$-rank tensor $B$, which has the property
$$
( A \otimes B )_{x_1 \ldots x_{n+m}} = \frac{1}{m!} \sum_{\pi \in P_{n+m}} A_{x_{\pi(1)} \ldots x_{\pi(n)}}\, B_{x_{\pi(n+1)} \ldots x_{\pi(n+m)}},
$$
with $P_l$ being the set of permutations of $\{ 1, \ldots, l \}$.

2. Free theory

To obtain the information content of the free theory, we can set $\gamma = 0$ in Eqs. 82 and 83 or use Eq. 49 with the replacements $J \to \beta J$, $j \to \beta j$, $D \to \beta^{-1} D$, and $H_0 \to \beta H_0$. In both cases we find identically

$$
F_\beta[J] = H^{G}_0 - \tfrac{1}{2}\, ( J + j )^{\dagger} D\, ( J + j ) - \frac{1}{2\beta}\, \mathrm{Tr} \log\!\left( \frac{2\pi}{\beta} D \right), \quad \text{and} \quad
I_d = -\tfrac{1}{2}\, \mathrm{Tr}\left( 1 + \log( 2\pi D ) \right). \tag{84}
$$

Very similarly, one can calculate the information prior to the data, which turns out to be

$$
I_0 = -\tfrac{1}{2}\, \mathrm{Tr}\left( 1 + \log( 2\pi S ) \right). \tag{85}
$$

Thus, the data-induced information gain is

$$
\Delta I_d = I_d - I_0 = \tfrac{1}{2}\, \mathrm{Tr}\left( \log( S D^{-1} ) \right) = \tfrac{1}{2}\, \mathrm{Tr}\left( \log( 1 + S R^{\dagger} N^{-1} R ) \right). \tag{86}
$$

The information gain depends on the signal-response-to-noise ratio $Q \equiv R S R^{\dagger} N^{-1}$, also shortly denoted by the measurement fidelity or quality. The information increases linearly with $Q$ as long as $Q \ll 1$, but levels off to a logarithmic increase for $Q \gg 1$. We note that it is only for the free theory that the information gain does not depend on the actual data realization.

E. IFT Recipe

A typical IFT application will aim at calculating a model evidence $P(d)$, the expectation value of a signal given the data, i.e. the map $m(x) = \langle s(x) \rangle_{(s|d)}$ of the signal, or its variance $\sigma_s^2(x, y) = \langle ( s(x) - m(x) )\, ( s(y) - m(y) ) \rangle_{(s|d)}$ as a measure of the signal uncertainty. The general recipe for such applications can be summarized as follows:

• Specify the signal $s$ and its prior probability distribution $P(s)$. If the signal is derived from a physical field $\psi$, of which a prior statistic is known, the distribution of $s = s[\psi]$ is induced according to Eq. 2.
• Specify the data model in terms of a likelihood $P(d|s)$ conditioned on $s$. Again, if the data are related to an underlying physical field $\psi$, the likelihood is given by Eq. 4.
• Calculate the Hamiltonian $H_d[s] = -\log( P(d, s) )$, where $P(d, s) = P(d|s)\, P(s)$ is the joint probability, and expand it in a Taylor-Fréchet series for all degrees of freedom of $s$.
  Identify the coefficients of the constant, linear, quadratic, and $n$th-order terms with the normalization $H_0$, information source $j$, inverse propagator $D^{-1}$, and $n$th-order interaction term $\Lambda^{(n)}$, respectively, as shown in Eq. 36 or 59.
• Draw all diagrams which contribute to the quantity of interest, consisting of vertices, lines, and open ends, up to some order in complexity or some small ordering parameter. The log-evidence is given by the sum of all connected diagrams without open ends, the expectation value of the signal by all connected diagrams with one open end, and the signal variance around this mean by all connected diagrams with two open ends.
• Read the diagrams as computational algorithms specified by the Feynman rules in Sect. IV, and implement them by using linear algebra packages or existing map-making codes for the information propagator and vertices. The required discretisation is outlined in Sect. II E 1. Information on how to implement the required matrix inversions efficiently can be found in the literature given in Secs. I C 2, I C 4, and I C 5 and especially in [38].
• If the resulting non-linear data transformation (or filter) has the required accuracy, e.g. to be verified via Monte-Carlo simulations using signal and data realizations drawn from the prior and likelihood, respectively, an IFT algorithm is established.
• In case that too large interaction terms in the Hamiltonian prevent a finite number of diagrams from forming a well performing algorithm, a re-summation of high order terms is due. This can be achieved by the saddle point approximation (classical solution, maximum a posteriori estimator), or even better by a detailed renormalization-flow analysis along the lines outlined in Sect. V F.

V. COSMIC LARGE-SCALE STRUCTURE VIA GALAXY SURVEYS

A. Poissonian data model and Hamiltonian

Many datasets suffer from Poisson noise, which is non-Gaussian and signal dependent, and therefore well suited to test IFT in the non-linear regime. For example, the cosmological LSS is traced by galaxies, which may be assumed to be generated by a Poisson process. On large scales, the expectation value of the galaxy density follows that of the underlying (dark) matter distribution. The aim of cosmography is to recover the initial density field from the shot-noise contaminated galaxy data. Currently, large galaxy surveys are conducted in order to chart the cosmic matter distribution in three dimensions. Improving the galaxy-based LSS reconstruction techniques and understanding their uncertainties better is therefore an imminent and important goal. Optimal techniques to reconstruct Poissonian-noise affected signals are also crucial for other problems, since e.g. imaging with photon detectors plays an important role in astronomy and other fields.

Here, we outline how such problems can be treated, by discussing a specific data model motivated by the problem of large-scale-structure reconstruction from galaxies. For this problem we work out the optimal estimator and show its superiority numerically. A more general discussion of models of galaxy and structure formation and references to relevant works was given in Sect. I C 4.

In order to treat the Poissonian case in a convenient fashion, we subdivide the physical space into small cells with volumes $\Delta V$, and assume that a cell located at $x_i$ has an expected number of observed galaxies

$$
\mu_i \approx \kappa\, ( 1 + b\, s(x_i) ) \tag{87}
$$

with $\kappa = \bar{n}_g\, \Delta V$ being the cosmic average number of galaxies per cell and $b$ being the bias of the galaxy overdensity with respect to the dark matter overdensity $s$, still assumed to be a Gaussian random field (Eq. 38).
However, this data model has two shortcomings. First, too negative fluctuations of the Gaussian random field with $b\,s < -1$ lead to negative expectation values, for which the Poissonian statistics is not defined. Second, the mean density of observable galaxies $\kappa$ and their bias parameter $b$ are constant everywhere, whereas in reality both exhibit spatial variations.¹⁰ Although they are now spatially inhomogeneous, we assume $\kappa$ and $b$ to be known for the moment and to incorporate all the above observational effects. To cure the above mentioned shortcomings we replace Eq. 87 by a non-linear and non-translationally invariant model:

$$
\mu_i = \kappa(x_i)\, \exp( b(x_i)\, s(x_i) ), \tag{88}
$$

where $\kappa$ and $b$ may depend on position in a known way, and the unknown Gaussian field $s$, the log-matter density, may exhibit unrestricted negative fluctuations. Note that $\mu$ is the signal response, by our definition in Eq. 10, since $\mu[s] = \langle d \rangle_{(d|s)}$. We call $\kappa$ the zero-response, since $\mu[0] = \kappa$.

It should be stressed that the data model in Eq. 88 is just a convenient choice for illustration and proof-of-concept purposes, and is easily exchangeable with more realistic, and even non-local data models. However, this log-normal data model was originally proposed by Coles and Jones [212], investigated for constrained realizations by Sheth [107] and Vio et al. [213], and seems to reproduce the statistics of LSS simulations much better than the often used normal distribution of the overdensity [214].

Having chosen a Poissonian process to populate the Universe and our observational data with galaxies according to the underlying log-density field $s$, the likelihood is

$$
P(d|s) = \prod_i \frac{\mu_i^{d_i}}{d_i!}\, e^{-\mu_i} = \exp\left\{ \sum_i \left[ d_i \log \mu_i - \mu_i - \log( d_i! ) \right] \right\}, \tag{89}
$$

where $d_i$ is the actual number of galaxies observed in cell $i$.
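The data model of Eqs. 88 and 89 can be written down in a few lines. The sketch below is illustrative only: the function names `response` and `log_likelihood`, the uncorrelated stand-in signal, and all parameter values are assumptions, and $\kappa > 0$ is assumed in every cell so that $\log \mu$ is finite.

```python
import numpy as np
from math import lgamma

def response(s, kappa, b):
    """Signal response mu[s] = kappa * exp(b * s) of Eq. 88."""
    return kappa * np.exp(b * s)

def log_likelihood(d, s, kappa, b):
    """log P(d|s) of Eq. 89 for independent Poisson counts per cell.

    Assumes kappa > 0 everywhere (no fully masked cells).
    """
    mu = response(s, kappa, b)
    log_factorial = np.array([lgamma(n + 1.0) for n in d])  # log(d_i!)
    return float(np.sum(d * np.log(mu) - mu - log_factorial))

# Illustrative usage with assumed values; the signal here is an uncorrelated
# stand-in, not a realisation with the power spectrum used in the text.
rng = np.random.default_rng(0)
n_cells = 8
s = rng.standard_normal(n_cells)
kappa = np.full(n_cells, 5.0)     # zero-response: mean counts per cell (assumed)
b = 0.5                           # bias (assumed)
d = rng.poisson(response(s, kappa, b))

print("counts:", d)
print("log P(d|s):", log_likelihood(d, s, kappa, b))
```

Working with the logarithm of Eq. 89 avoids overflow in $\mu_i^{d_i}$ and in $d_i!$ for cells with many galaxies.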
Since $P(s) = \mathcal{G}(s, S)$, the Hamiltonian is given by

$$
\begin{aligned}
H_d[s] &= -\log P(d, s) = -\log P(d|s) - \log P(s) \\
 &= -d^{\dagger} b\, s + \kappa^{\dagger} \exp( b\, s ) + H'_0 + \tfrac{1}{2}\, s^{\dagger} S^{-1} s \\
 &= \tfrac{1}{2}\, s^{\dagger} D^{-1} s - j^{\dagger} s + H_0 + \sum_{n=3}^{\infty} \frac{1}{n!}\, \lambda_n^{\dagger} s^n, \quad \text{with} \\
D^{-1} &= S^{-1} + \widehat{\kappa b^2}, \\
j &= b\, ( d - \kappa ), \\
H_0 &= \tfrac{1}{2} \log( | 2\pi S | ) + ( \kappa + \log( d! ) )^{\dagger} 1 - d^{\dagger} \log \kappa, \quad \text{and} \\
\lambda_n &= \kappa\, b^n .
\end{aligned}
\tag{90}
$$

The hat on a scalar field denotes that it should be read as a matrix which is diagonal in position space (see Appendix A).

¹⁰ Such variations are due to the geometry of the observational survey sky coverage, due to a galaxy selection function which decreases with distance from the observer, and due to a changing composition of the galaxy population. The latter distance effects are caused by the cosmic evolution of galaxies and by the changing observational detectability of the different types with distance. We note that an observed sample of galaxies, which was selected deterministically or stochastically from a complete sample, e.g. by their luminosity due to instrumental sensitivity, still possesses Poissonian statistics if the original distribution does.

A few remarks are in order. Comparing the propagator to the one of our Gaussian theory one can read off an inverse noise term $M = R^{\dagger} N^{-1} R = \widehat{\kappa b^2}$. Thus the effective (inversely response weighted) noise decreases with increasing mean galaxy number and bias, and seems to be infinite in regions without data ($\kappa = 0$), without causing any problem for the formalism.

The information source $j$ increases with increasing response (bias) of the data (galaxies) to the signal (density fluctuations). However, it certainly vanishes for zero response ($b = 0$) or in case that the observed galaxy counts match the expected mean at a given location exactly. Finally, the interaction terms $\lambda_n$ are local in position space, and vanish with decreasing $b$ and $\kappa$.
The latter parameter is under the control of the data analyst, since it is proportional to the volume of the individual pixels, and therefore can be made arbitrarily small by choosing a more fine-grained resolution in signal space. However, this would not change the convergence properties of the series, since any interaction vertex then has to be summed over a correspondingly larger number of pixels within a coherence patch of the signal, which exactly compensates for the smaller coefficient.¹¹ The bias, in contrast, is set by nature and can be regarded as a power counting parameter, which naturally provides a numerical hierarchy among the higher order vertices and diagrams for $b^2 S < 1$. Note that $j = \mathcal{O}(b)$.

¹¹ $\kappa$ seems to control the stiffness of the later introduced response renormalization flow equation and its value is therefore numerically relevant. A lower $\kappa$, due to a finer space pixelisation, results in a less stiff and better behaved equation.

B. Galaxy types and bias variations

Real galaxies can be cast into different classes, which all differ in terms of their luminosities, bias factors, and the frequencies with which they are found in the Universe. Although we are not going to investigate this complication in the following, it should be explained here how all the formulae in this section can easily be reinterpreted in order to incorporate also the different classes of galaxies.

The galaxies can be characterized by a type variable $L \in \Omega_{\rm type}$, which may be the intrinsic luminosity, the morphological galaxy type, or a multi-dimensional combination of all properties which determine the galaxy type's spatial distribution via an $L$-dependent bias $b_L$, and their detectability as encoded in $\kappa_L$. The data space is now spanned by $\Omega_{\rm data} = \Omega_{\rm space} \times \Omega_{\rm type}$, and also $\mu$, $\kappa$ and $b$ can be regarded as functions over this space.
Performing the same algebra as in the previous section, just taking the larger data space into account, we arrive at exactly the same Hamiltonian as in Eq. 90, if we interpret any term containing $d$, $\kappa$ and $b$ to be summed or integrated over the type variable $L$. Thus, we read

$$
\begin{aligned}
j(x) &= ( b\, ( d - \kappa ) )(x) \equiv \int dL\, b_L(x)\, ( d_L(x) - \kappa_L(x) ), \\
D^{-1}_{xy} &= \left( S^{-1} + \widehat{\kappa b^2} \right)_{xy} \equiv S^{-1}_{xy} + 1_{xy} \int dL\, \kappa_L(x)\, b_L^2(x), \\
\lambda_n(x) &= ( \kappa\, b^n )(x) \equiv \int dL\, \kappa_L(x)\, b_L^n(x), \quad \text{and} \\
\mu[s](x) &= \left( \kappa\, e^{b s} \right)(x) \equiv \int dL\, \kappa_L(x)\, e^{b_L(x)\, s(x)} = \int dL\, \mu_L[s](x),
\end{aligned}
\tag{91}
$$

which all live in $\Omega_{\rm space}$ solely, so that the computational complexity of the matter distribution reconstruction problem is not affected at all, and only a bit more book-keeping is required in its setup.

A few observations are in order. In case all galaxies have the same bias factor, Eq. 91 is simply a marginalization over the type variable $L$, and any differentiation of the various galaxy types is not necessary. Since all known galaxy types seem to have $b \sim \mathcal{O}(1)$, such a marginalization seems to be justified, and explains why LSS reconstructions which applied this simplification are relatively successful, although the different galaxy masses, luminosities, and frequencies vary by orders of magnitude. As our numerical experiments below reveal, the data, and therefore the reconstructability of the density field, exhibit a sensitive dependence on the bias for $s$-fluctuations with unit variance.¹² Such a variance is indeed observed on scales below $\sim 10$ Mpc in the galaxy distribution, and therefore the galaxy type-dependent bias variation does indeed matter.
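The book-keeping implied by Eq. 91 amounts to summing over the type axis before any field-theoretic computation is done. The following sketch assumes a discrete set of galaxy types stored along the first array axis; the function name `marginalise_types` and the array layout are hypothetical choices for illustration.

```python
import numpy as np

def marginalise_types(d_L, kappa_L, b_L, n_max=4):
    """Sum over the discrete type axis (axis 0) as in Eq. 91.

    d_L, kappa_L, b_L: arrays of shape (n_types, n_pixels).
    Returns the information source j, the diagonal M of the inverse
    noise term, and the local couplings lambda_n for n = 3..n_max.
    """
    j = np.sum(b_L * (d_L - kappa_L), axis=0)
    M = np.sum(kappa_L * b_L**2, axis=0)
    lam = {n: np.sum(kappa_L * b_L**n, axis=0) for n in range(3, n_max + 1)}
    return j, M, lam

# Two types sharing one bias value behave like a single pooled sample.
d_L = np.array([[3.0, 0.0, 5.0], [1.0, 2.0, 0.0]])
kappa_L = np.array([[2.0, 2.0, 2.0], [1.0, 1.0, 1.0]])
b_L = np.full((2, 3), 0.8)

j, M, lam = marginalise_types(d_L, kappa_L, b_L)
j_pooled = 0.8 * (d_L.sum(axis=0) - kappa_L.sum(axis=0))
print(np.allclose(j, j_pooled))
```

All returned quantities have only a pixel axis, which reflects the statement that the reconstruction itself still lives in $\Omega_{\rm space}$.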
Larger galaxies, which have larger biases, therefore provide per galaxy a slightly larger information source ($j \propto b$), less shot noise ($R^{\dagger} N^{-1} R \propto b^2$), and increasingly larger higher-order interaction terms ($\lambda_n \propto b^n$) in comparison to smaller galaxies. However, smaller galaxies are more numerous by orders of magnitude, and therefore provide the largest total contribution to the information source, noise reduction and most low-order interaction terms. Thus, the latter will dominate and therefore permit a reasonably accurate matter reconstruction from an inhomogeneous galaxy survey using a single bias value. Nevertheless, improvements of the bias treatment are possible by applying the recipes described here.

¹² This is found for our specific data model $\mu \propto \exp( b\, s )$; however, it should also apply for other models, which somehow have to keep $\mu \ge 0$ even for $b\, s < -1$.

C. Non-linear map making

The map, the expectation of our information field $s$ given the data, is to the lowest order in interaction

$$
\begin{aligned}
m_1 &= \text{[sum of four diagrams]} + \mathcal{O}(b^6) \\
 &= D_{xy}\, j_y - \tfrac{1}{2}\, D_{xy}\, b_y^3\, \kappa_y\, D_{yy} - \tfrac{1}{2}\, D_{xy}\, b_y^3\, \kappa_y\, ( D_{yz}\, j_z )^2 - \tfrac{1}{2}\, D_{xy}\, b_y^4\, \kappa_y\, D_{yy}\, D_{yz}\, j_z + \mathcal{O}(b^6)
\end{aligned}
\tag{92}
$$

or in compact notation

$$
m_1 = m_0 - \tfrac{1}{2}\, D\, \widehat{b^3 \kappa} \left( \widehat{D} + m_0^2 + \widehat{\widehat{D}}\, b\, m_0 \right) + \mathcal{O}(b^6) . \tag{93}
$$

It is apparent that the non-linear map making formula contains corrections to the linear map $m_0 = D j$. The first two correction terms are always negative, reflecting the fact that our non-linear data model has non-symmetric fluctuations in the data with respect to the mean. The last correction term is oppositely directed to the linear map, thereby correcting for the curvature in the signal response.

A one-dimensional numerical example is displayed in Fig. 1. There, the signal was realized to have a power spectrum $P_S(k) \propto ( k^2 + q^2 )^{-1}$, with a correlation length $q^{-1} = 0.04$.
The normalization was chosen such that the auto-correlation function is $\langle s(x)\, s(x + r) \rangle_{(s)} = \exp( -| q\, r | )$ and therefore the signal dispersion is unity, $\langle s^2 \rangle_{(s)} = 1$. The data are generated by a Poissonian process from $\kappa_s = \kappa \exp( b\, s )$ with $b = 0.5$. All three displayed reconstructions exhibit less power than the original signal, as is expected since the reconstruction is conservative, and therefore biased towards zero.

The non-linear correction to the naive map $m_0$ should not be too large, otherwise higher order diagrams have to be included. In the case displayed in Fig. 1, $b = 0.5$ ensured that the linear corrections were mostly going in the right direction. However, in case $b \approx 1$ there is no obvious ordering of the importance of the different interaction vertices, and numerical experiments reveal that the first order corrections strongly overcorrect the linear map $m_0 = D j$. In such a case interaction re-summation techniques should be used to incorporate as many higher order interaction terms as possible. One very powerful re-summation is provided by the classical solution, as developed below, which contains all tree diagrams simultaneously. This solution, also shown in Fig. 1, is very close to $m_1$ in this case.

FIG. 1: Poissonian reconstruction of a signal with unit variance and correlation length $q^{-1} = 0.04$, observed with slightly non-linear response ($b = 0.5$, resolution: 513 pixels per unit length, zero-signal galaxy density: 1000 galaxies per unit length). Top: data $d$, signal response $\mu$, and zero-response $\kappa$. Middle: signal $s$, linear Wiener-filter reconstruction $m_0 = D j$, its one-sigma error interval $m_0 \pm \widehat{D}^{1/2}$, next order reconstruction $m_1$ according to Eq. 92, and classical solution $s_{\rm cl}$ according to Eq. 94. Although the linear Wiener filter reconstructs well at most locations, the non-linear response requires the perturbative corrections present in $m_1$ or the classical solution in regions of high signal strength. Bottom: The residuals, i.e. the deviations of $m_0$, $m_1$, $s_{\rm cl}$ from the signal, and the Wiener variance $\pm \widehat{D}^{1/2}$.

D. Classical solution

The classical signal field or MAP solution is given by Eq. 74, which reads in this case

$$
\begin{aligned}
s_{\rm cl} &= D \left( j - \sum_{m=2}^{\infty} \frac{b^{m+1}}{m!}\, \kappa\, s_{\rm cl}^{m} \right)
 = D\, b \left( d - \kappa \left( e^{b s_{\rm cl}} - b\, s_{\rm cl} \right) \right) \\
 &= S\, b\, ( d - \underbrace{\kappa\, e^{b s_{\rm cl}}}_{\kappa_{s_{\rm cl}}} ) .
\end{aligned}
\tag{94}
$$

The last expression motivates introducing the expected number of galaxies given the signal $s$:

$$
\kappa_s = \kappa\, e^{b s} . \tag{95}
$$

Alternative forms of the MAP equation can also be derived, for example one which is especially suitable for large $j$ and follows from the last line of Eq. 94 together with $j = b\, ( d - \kappa )$:

$$
s_{\rm cl} = \frac{1}{b} \log\!\left( 1 + \frac{j - S^{-1} s_{\rm cl}}{\kappa\, b} \right) = \frac{1}{b} \log\!\left( \frac{d}{\kappa} - \frac{S^{-1} s_{\rm cl}}{\kappa\, b} \right). \tag{96}
$$

This may be solved iteratively, while ensuring that $s^{(i)}_{\rm cl} \le S j$ at all iterations $i$, with equality only where $\kappa = 0$.

This form of the classical field equation has some similarities to the naive inversion of the response formula, $\langle d \rangle_{(d|s)} = \kappa \exp( b\, s )$, which yields

$$
s_{\rm naive} = \frac{1}{b} \log\!\left( \frac{d}{\kappa} \right), \tag{97}
$$

a formula one can only dare to use in regimes of large $d$. Since $s_{\rm naive}$ contains the full noise of the data, a suitable naive map may be given by $m_{\rm naive} = S\, s_{\rm naive}$, after some fix for the locations without galaxy counts. The classical solution, however, is more conservative than this naive data inversion, in that there is a damping term, $S^{-1} s_{\rm cl} / ( \kappa\, b )$, which partly compensates the influence of too large data points.

Those equations permit calculating the classical solution if suitable numerical regularization schemes are applied, since naive iterations can easily lead to numerical divergences in the non-linear case.

One way of doing this is by turning the classical equation (Eq. 94) into a dynamical system. Its initial conditions are given by a well solvable linear or even trivial problem to which non-linear complications are added successively during an interval of some pseudo-time. The endpoint of this dynamics is then the required solution. The meaning of the pseudo-time depends on the way it was set up. In any case, it can just be regarded as a mathematical trick to generate a differential equation, which might be easier to solve numerically than the original problem.

For example, a pseudo-time $\tau$ can be introduced by setting $j(\tau) = \tau\, j$. Thus, the information source is successively injected into an initially trivial field state, $s_{\rm cl}(0) = 0$. This allows setting up a differential equation for $s_{\rm cl}(\tau)$ by taking the time derivative of Eq. 94,

$$
\dot{s}_{\rm cl} = D_{s_{\rm cl}}\, j \quad \text{with} \quad D_{s_{\rm cl}} = \left( S^{-1} + \widehat{\kappa_{s_{\rm cl}} b^2} \right)^{-1}, \tag{98}
$$

which has to be solved for $s_{\rm cl}(1)$ starting from $s_{\rm cl}(0) = 0$. This equation is very appealing, since it looks like Wiener-filtering an incoming information stream $j$ and accumulating the filtered data, while simultaneously tuning the filter $D_{s_{\rm cl}(\tau)}$ to the accumulated knowledge on the signal $s_{\rm cl}(\tau)$ and the Poissonian noise structure implied thereby. Thus, it is a nice example system for continuous Bayesian learning and also illustrates how different datasets can successively be fused into a single knowledge basis.
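The pseudo-time integration of Eq. 98 can be sketched on a small one-dimensional grid. The code below is illustrative, not the implementation behind the figures: the grid size, the values of $\kappa$, $b$ and $q$, the fixed-step fourth-order Runge-Kutta integrator, and the function name `classical_solution` are all assumptions. It reads Eq. 94 with $j \to \tau j$ as $S^{-1} s + b\, \kappa\, ( e^{b s} - 1 ) = \tau j$, and it rewrites $D_s\, j = ( 1 + S\, \widehat{\kappa_s b^2} )^{-1} S\, j$ so that $S$ never has to be inverted.

```python
import numpy as np

def classical_solution(S, d, kappa, b, n_steps=200):
    """Integrate ds/dtau = D_s j (Eq. 98) from s(0) = 0 to tau = 1 (RK4)."""
    j = b * (d - kappa)                       # information source, Eq. 90
    Sj = S @ j
    identity = np.eye(len(j))

    def rhs(s):
        m = b**2 * kappa * np.exp(b * s)      # diagonal of kappa_s b^2
        # (S^-1 + diag(m))^-1 j  ==  (1 + S diag(m))^-1 S j
        return np.linalg.solve(identity + S * m[None, :], Sj)

    s = np.zeros(len(j))
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * h * k1)
        k3 = rhs(s + 0.5 * h * k2)
        k4 = rhs(s + h * k3)
        s = s + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return s

# Assumed toy setup: exponential prior covariance exp(-q |x - y|).
rng = np.random.default_rng(1)
n_pix, q, b = 64, 25.0, 0.5
x = np.arange(n_pix) / n_pix
S = np.exp(-q * np.abs(x[:, None] - x[None, :]))
kappa = np.full(n_pix, 10.0)

s_true = np.linalg.cholesky(S) @ rng.standard_normal(n_pix)
d = rng.poisson(kappa * np.exp(b * s_true))

s_cl = classical_solution(S, d, kappa, b)

# Residual of the last form of Eq. 94: s_cl = S b (d - kappa exp(b s_cl)).
residual = s_cl - S @ (b * (d - kappa * np.exp(b * s_cl)))
print("max |residual| of Eq. 94:", np.max(np.abs(residual)))
```

A fixed-step integrator has no built-in correction for accumulated error, so in practice one might follow it with a few Newton iterations on Eq. 94, or use an adaptive solver when $\kappa$ is large and the equation becomes stiff.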
Map-making algorithms with a higher fidelity are possible by not only investigating the maximum of the posterior, but by averaging the signal $s$ over the full support of $P(s|d)$. In any case, we can assume that a good approximation $t \approx s_{\rm cl}$ to the classical solution can be achieved.

Figs. 1 and 2 display classical solutions for slightly and strongly non-linear Poissonian inference problems. Especially the second example shows that the classical solution can be improved in regions of large uncertainty (see the region between $x = 0.2$ and $0.5$ in Fig. 2, where apparently better estimators exist) by accounting for the missing uncertainty-loop diagrams, which contain information about the non-Gaussian structure of the posterior $P(s|d)$ away from $s_{\rm cl}$.

FIG. 2: Poissonian reconstruction of the same signal realization as in Fig. 1 (unit variance and correlation length $q^{-1} = 0.04$), observed now with a strongly non-linear response ($b = 2.5$, resolution: 512 pixels per unit length, zero-signal galaxy density: 100 galaxies per unit length where the mask is one) through a complicated mask. Top: data $d$, signal response $\mu$, and zero-response $\kappa$. Middle: signal $s$, classical solution $s_{\rm cl} = m_{T=0}$, intermediate solution $m_{T=0.5}$, and renormalization-based reconstruction $m_{T=1}$ with uncertainty interval $m_{T=1} \pm \widehat{D}^{1/2}_{T=1}$, and mask $\kappa/(n_g \Delta V)$. The linear Wiener-filter reconstruction $m_0$ as well as its next order corrected version $m_1$ are not displayed, since they are partly far outside the displayed area.
Bottom: Deviations of the three reconstructions from the signal, and the original and the renormalized uncertainty estimates $\pm \widehat{D}_0^{1/2}$ and $\pm \widehat{D}_{T=1}^{1/2}$, respectively. Note that in the regions with many observed galaxies, the high signal-to-noise ratio can be seen in the narrowness of $\widehat{D}^{1/2}_{T=1}$, which is significantly smaller than the data-unaffected $\widehat{D}^{1/2}_0$ at these locations.

E. Uncertainty-loop corrections

We now show how the missing uncertainty-loop corrections can be added to the classical solution. These corrections can be derived from the Hamiltonian of the uncertainty field $\phi = s - t$,

$$
\begin{aligned}
H_t[\phi] &= \frac{1}{2} \phi^\dagger D_t^{-1} \phi - j_t^\dagger \phi + \kappa_t^\dagger\, g(b\,\phi) + H_{0,t}, \quad \mbox{where} \\
D_t^{-1} &= S^{-1} + b^2\, \widehat{\kappa_t}, \\
j_t &= b\,(d - \kappa_t) - S^{-1} t, \\
g(x) &= e^x - 1 - x - \frac{1}{2} x^2 = \sum_{m=3}^{\infty} \frac{x^m}{m!},
\end{aligned}
\tag{99}
$$

and $H_{0,t}$ is a momentarily irrelevant normalization constant. Again, we have permitted a non-zero $j_t$, since $t$ might not be exactly the classical solution.

It is interesting to note that the interaction coefficients in this Hamiltonian, $\lambda_t^{(m)} = \kappa_t\, b^m$, all reflect the expected number of galaxies given the reference field $t$. Thus, the replacement $\kappa_0 \rightarrow \kappa_t$ would provide us with the shifted field Hamiltonian, as defined in Eq. 60, except for the term $-S^{-1} t$ in $j_t$. It turns out that this term is some sort of counter-term, which accumulates the effect of the non-linear interactions.

We see that effective interaction terms arise when relevant parts of the solution are absorbed in the background field $t$. A similar approach is desirable for the loop diagrams. Instead of drawing and calculating all possible loop diagrams, we want to absorb several of them simultaneously into effective coefficients. For each vertex of the Poissonian Hamiltonian with $m$ legs, there exist diagrams in any Feynman expansion in which a number $n$ of simple loops are added to this vertex.
Such an $n$-loop enhanced $m$-vertex is given by

$$ \mbox{[$m$-vertex with $n$ loops]} = -\frac{1}{2^n\, n!}\, \lambda_t^{(m+2n)}\, \widehat{D}_t^{\,n} = -\frac{1}{2^n\, n!}\, \kappa_t\, b^{m+2n}\, \widehat{D}_t^{\,n}. \tag{100} $$

All these diagrams can be re-summed into an effective interaction vertex, via

$$
\begin{aligned}
\lambda_t^{(m)} \rightarrow \lambda_t^{\prime (m)} &= \kappa_t\, b^m \sum_{n=0}^{\infty} \frac{1}{2^n\, n!}\, b^{2n}\, \widehat{D}^n = \kappa_t\, \exp\left( \frac{b^2}{2} \widehat{D} \right) b^m \\
&= \kappa_{t + \frac{b}{2} \widehat{D}}\; b^m = \lambda^{(m)}_{t + \frac{b}{2} \widehat{D}}.
\end{aligned}
\tag{101}
$$

Thus, this re-summation is effectively equivalent to the replacement

$$ \kappa_t \rightarrow \kappa_{t + b \widehat{D}/2}, \tag{102} $$

which reflects the larger expected response to a reference field $t$ due to the uncertainty fluctuations around it. Those fluctuations pick up the asymmetric shape of the exponential term in the Hamiltonian, where the larger response to positive fluctuations is not fully compensated by the lower response to negative fluctuations. One might wonder whether the simple replacement rule in Eq. 102 could supplement the classical solution with the missing uncertainty-loop corrections. Thus we ask whether the modified classical equation

$$ m = b\, S\, \left( d - \kappa_{m + b \widehat{D}/2} \right) \tag{103} $$

together with a self-consistently determined propagator

$$ D^{-1} = S^{-1} + b^2\, \widehat{\kappa_{m + b \widehat{D}/2}} \tag{104} $$

could provide the mean field given the data. A more rigorous renormalization calculation will show that this is indeed the case, within some approximation.

The loop-corrected density and propagator permit the construction of estimators for the dark matter density itself,

$$ \varrho = \varrho_0\, e^{c\, s}, \tag{105} $$

instead of its logarithm, $s$. Here $c$ fixes the relation between $s$ and $\varrho$, and $\varrho_0$ is the cosmic median dark matter density. Translating our log-density map into the density results in the naive density estimator

$$ m^{\rm naive}_\varrho = \varrho_0\, e^{c\, m}, \tag{106} $$

which is not optimal in the sense of minimal rms deviations.
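Both the vertex re-summation of Eq. 101 and the bias of the naive density estimator of Eq. 106 rest on the same Gaussian identity, $\langle e^{a \phi} \rangle = e^{a^2 \widehat{D}/2}$ for a zero-mean Gaussian fluctuation $\phi$ of variance $\widehat{D}$. The short sketch below checks this numerically at a single pixel. The values of `b` and `D_hat` are arbitrary illustrations, and the helper names `loop_series` and `gaussian_mean_exp` are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def loop_series(b, D_hat, n_max=30):
    """Partial sum of sum_n (b^2 D_hat / 2)^n / n!, the loop factor in Eq. 101."""
    a = 0.5 * b**2 * D_hat
    return sum(a**n / math.factorial(n) for n in range(n_max + 1))

def gaussian_mean_exp(b, D_hat, deg=40):
    """<exp(b phi)> for phi ~ N(0, D_hat), by Gauss-Hermite quadrature."""
    x, w = hermegauss(deg)              # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)        # normalise to a unit Gaussian density
    return float(np.sum(w * np.exp(b * np.sqrt(D_hat) * x)))

b, D_hat = 2.5, 0.3                     # illustrative values only
closed_form = math.exp(0.5 * b**2 * D_hat)
series_value = loop_series(b, D_hat)
quadrature_value = gaussian_mean_exp(b, D_hat)
print(series_value, quadrature_value, closed_form)
```

All three numbers agree, which is the single-pixel content of the replacement $\kappa_t \rightarrow \kappa_{t + b \widehat{D}/2}$. With $b$ replaced by $c$, the same factor $e^{c^2 \widehat{D}/2}$ is the amount by which the posterior mean of $\varrho_0 e^{c s}$ exceeds $\varrho_0 e^{c m}$.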
The proper estimator would be

$$ m_\varrho = \langle \varrho_0\, e^{c\, s} \rangle_{(s|d)} = \varrho_0\, e^{c\, m + c^2 \widehat{D}/2}, \tag{107} $$

which contains uncertainty-loop corrections accounting for the shift of the mean under the non-linear transformation between log-density and density.

F. Response renormalization

Since we are dealing with a $\phi^\infty$-field theory, the zoo of loop diagrams is quite complex, and forms something like a Feynman foam. In order not to get stuck in the multitude of this foam, we urgently require a trick to keep either the maximal order of the diagrams low, or to limit the number of vertices per diagram, or both.

We have basically two handles on any interaction term $\lambda_n = \kappa\, b^n$: the bias $b$ and the zero-response $\kappa$. We concentrate on the response, since it enters the Hamiltonian in a linear way and the data can also be regarded as proportional to $\kappa$. Thus, the full Hamiltonian

$$ H[s] = \frac{1}{2} s^\dagger S^{-1} s - b\, d^\dagger s + \kappa_0^\dagger\, e^{b\, s} \tag{108} $$

can be regarded as proportional to the response, except for the prior term and constant terms, which we drop immediately here and in the following.

Let us assume that prior to any data analysis we have an initial guess $m_0$ for the signal with some Gaussian uncertainty characterized by the covariance $D_0$. This can be expressed via a Hamiltonian of the form

$$ H_0[s] = \frac{1}{2} (s - m_0)^\dagger D_0^{-1} (s - m_0), \tag{109} $$

which defines a probability density via $P_0(s) \propto \exp(-H_0[s])$. In case the prior is our initial guess, we have $m_0 = 0$ and $D_0 = S$, but we need not restrict ourselves to this case. Now, we want to take in the information of the full problem step by step, and forget our initial guess at the same rate. This can be modeled by adopting an affine parameter $\tau$, which measures how much we have exposed ourselves to the full problem. For each $\tau$, which we regard as a pseudo-time, our knowledge state is described by a Hamiltonian $H_\tau$.
Increasing $\tau$ by some small amount $\varepsilon$ should therefore lead to the next knowledge state, characterized by

$$ H_{\tau + \varepsilon}[s] = H_\tau[s] + \varepsilon\, \left( H[s] - H_\tau[s] \right). \tag{110} $$

This equation just models an asymptotic approach to the correct Hamiltonian. If the initial guess was the prior, one sees that for infinitesimal steps $\varepsilon$ the knowledge flow corresponds to tuning up all terms proportional to $\kappa$,

$$ H_\tau[s] = \frac{1}{2} s^\dagger S^{-1} s + \left( 1 - e^{-\tau} \right) \left( -b\, d^\dagger s + \kappa_0^\dagger\, e^{b\, s} \right) \rightarrow H[s]. $$

This motivates the term response renormalization for this kind of continuous learning system, into which the information source as well as the interactions are fed at the same rate.

FIG. 3: The original propagator $D_0 = (S^{-1} + \widehat{\kappa_0}\, b^2)^{-1}$ (left) and the final propagator $D$ of the renormalization flow (Eq. 117, right) in logarithmic grey scaling for the data displayed in Fig. 2. The values on the diagonals show the local uncertainty variance (in Gaussian approximation) before ($\widehat{D}_0$) and after ($\widehat{D}$) the data is analyzed, respectively. The bottom left and top right corners exhibit non-vanishing propagator values due to the assumed periodic spatial coordinate, which puts these corners close to the two others on the matrix diagonal.

The trick for the renormalization procedure is to approximate the knowledge state at each moment $\tau$ to be of Gaussian shape and therefore the Hamiltonian to be free (quadratic in the signal). Thus we set

$$ H_\tau[s] = \frac{1}{2} (s - m_\tau)^\dagger D_\tau^{-1} (s - m_\tau), \tag{111} $$

where $m_\tau$ and $D_\tau = (S^{-1} + M_\tau)^{-1}$ are the mean and dispersion of the field given the acquired knowledge at time $\tau$, respectively. These have to be updated when the next learning step is to be performed. The next Hamiltonian, before it is again replaced by a free one, is

$$
\begin{aligned}
H_{\tau + \varepsilon}[\phi] &= \frac{1}{2} \phi^\dagger D_\tau^{-1} \phi + \varepsilon \left[ (S^{-1} m_\tau - b\, d)^\dagger \phi - \frac{1}{2} \phi^\dagger M_\tau\, \phi + \kappa_{m_\tau}^\dagger\, e^{b\, \phi} \right] \\
&= \frac{1}{2} \phi^\dagger D_\tau^{-1} \phi + \varepsilon \sum_{n=1}^{\infty} \frac{1}{n!}\, \lambda_n^\dagger \phi^n,
\end{aligned}
\tag{112}
$$

if expressed for the momentary uncertainty field $\phi = s - m_\tau$. Here, the perturbative expansion coefficients are given by

$$
\begin{aligned}
\lambda_1 &= \kappa_{m_\tau}\, b + S^{-1} m_\tau - b\, d, \\
\lambda_2 &= \kappa_{m_\tau}\, b^2 - \widehat{M}_\tau, \quad \mbox{and} \\
\lambda_n &= \kappa_{m_\tau}\, b^n \quad \mbox{for } n > 2,
\end{aligned}
$$

assuming for simplicity that $M_\tau$ is diagonal. This is a safe restriction, since we will see that for $\tau \rightarrow \infty$ this is the case asymptotically, even for a non-diagonal initial $M_0$. Thus we can require that our initial guess was also of this form.

In order to approximate this Hamiltonian by a free one, we have to calculate the shifted mean field and its connected two-point correlation function, the full propagator. To first order in $\varepsilon$, only leaf diagrams with a single perturbative interaction vertex contribute to the perturbed expectation value of $\phi$:

$$ \langle \phi \rangle^{(\tau + \varepsilon)}_{(s|d)} = \mbox{[sum of single-vertex leaf diagrams]} = \varepsilon\, D_\tau \left[ b\, d - S^{-1} m_\tau - b\, \kappa_{m_\tau}\, e^{b^2 \widehat{D}_\tau / 2} \right]. \tag{113} $$

Note that only odd interaction terms shift the expectation value $m_{\tau + \varepsilon} = m_\tau + \langle \phi \rangle^{(\tau + \varepsilon)}_{(s|d)}$. The even ones do not exert any net forces in the vicinity of $\phi_\tau = 0$, since they represent a potential which is mirror symmetric about this point.

The renormalized propagator $D_{\tau + \varepsilon}$ is given by the connected two-point correlation function $\langle \phi \phi^\dagger \rangle^{(\tau + \varepsilon)}_{(s|d)}$, which is, up to linear order in $\varepsilon$,

$$ \langle \phi \phi^\dagger \rangle^{(\tau + \varepsilon)}_{(s|d)} = \mbox{[sum of single-vertex two-leg diagrams]} = D_\tau + \varepsilon\, D_\tau \left( M_\tau - b^2\, \widehat{\kappa_{m_\tau}\, e^{b^2 \widehat{D}_\tau / 2}} \right) D_\tau. \tag{114} $$

Rewriting this as an update of $M_\tau$, using $D_{\tau + \varepsilon}^{-1} = S^{-1} + M_{\tau + \varepsilon}$, we find up to linear order in $\varepsilon$

$$ M_{\tau + \varepsilon} = (1 - \varepsilon)\, M_\tau + \varepsilon\, b^2\, \kappa_{m_\tau}\, e^{b^2 \widehat{D}_\tau / 2}. \tag{115} $$

Taking the limit $\varepsilon \rightarrow 0$ yields the integro-differential system

$$
\begin{aligned}
\dot{m} &= D \left( b\, d - b\, \kappa_0\, e^{b\, m + b^2 \widehat{D}/2} - S^{-1} m \right), \\
\dot{M} &= b^2\, \kappa_0\, e^{b\, m + b^2 \widehat{D}/2} - M, \quad \mbox{and} \\
D &= \left( S^{-1} + \widehat{M} \right)^{-1}.
\end{aligned}
\tag{116}
$$

This converges to a fixed point, which we previously guessed in Eqs. 103 and 104 for our uncertainty-loop enhanced classical equation.
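The flow of Eq. 116 can be explored with a few lines of code. The sketch below is a toy illustration under explicit assumptions, not the scheme used for the figures: it uses a non-periodic exponential prior covariance, a uniform $\kappa$, explicit Euler steps in $\tau$, and it starts from $m = 0$ with the Wiener propagator ($M = b^2 \kappa$) rather than from the prior, a choice made here only to keep the explicit integration well behaved. All function names are hypothetical.

```python
import numpy as np

def make_toy_problem(n=64, corr_len=0.1, b=0.5, kappa0=20.0, seed=2):
    """Toy 1-d log-normal Poisson problem (all numbers are illustrative)."""
    rng = np.random.default_rng(seed)
    x = (np.arange(n) + 0.5) / n
    S = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)  # assumed prior
    s_true = np.linalg.cholesky(S) @ rng.standard_normal(n)
    kappa = np.full(n, kappa0)
    d = rng.poisson(kappa * np.exp(b * s_true)).astype(float)
    return S, kappa, b, d

def propagator(S, M):
    """D = (S^-1 + diag(M))^-1, written as (1 + S diag(M))^-1 S."""
    return np.linalg.solve(np.eye(len(M)) + S * M[None, :], S)

def effective_response(m, D, kappa, b):
    """kappa * exp(b m + b^2 D_hat / 2), with D_hat the diagonal of D."""
    return kappa * np.exp(b * m + 0.5 * b**2 * np.diag(D))

def renormalization_flow(S, kappa, b, d, eps=0.05, tau_max=40.0):
    """Explicit Euler integration of Eq. 116 up to pseudo-time tau_max."""
    m = np.zeros_like(d)
    M = b**2 * kappa            # assumed start: Wiener-filter propagator
    for _ in range(int(round(tau_max / eps))):
        D = propagator(S, M)
        k_eff = effective_response(m, D, kappa, b)
        m_dot = D @ (b * d - b * k_eff - np.linalg.solve(S, m))
        M_dot = b**2 * k_eff - M
        m, M = m + eps * m_dot, M + eps * M_dot
    return m, M

S, kappa, b, d = make_toy_problem()
m, M = renormalization_flow(S, kappa, b, d)
D = propagator(S, M)
k_eff = effective_response(m, D, kappa, b)
res_m = m - b * S @ (d - k_eff)     # residual of Eq. 103
res_M = M - b**2 * k_eff            # residual of Eq. 104
print(np.abs(res_m).max(), np.abs(res_M).max())
```

Both residuals should be small at the end of the flow, which is the numerical counterpart of the statement that the fixed point of Eq. 116 reproduces Eqs. 103 and 104. Other integrators or starting points may be preferable in strongly non-linear or heavily masked cases.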
The classical and the renormalization flow fixed point equations can be unified:

$$
\begin{aligned}
m &= b\, S \left( d - \kappa_{m + T\, b\, \widehat{D}/2} \right), \\
D &= \left( S^{-1} + b^2\, \widehat{\kappa_{m + T\, b\, \widehat{D}/2}} \right)^{-1},
\end{aligned}
\tag{117}
$$

with $T = 0$ and $T = 1$ for the classical and the renormalization result, respectively.

The parameter $T$ is more than a pure convenience. If we had introduced a temperature $T$ at the beginning, via $P(d, s|T) = \exp(-H_d[s]/T)$, Eq. 117 would have been the result of the renormalization flow calculation. The classical limit naturally corresponds to the zero-temperature regime, in which the field expectation value is not affected by any uncertainty fluctuations, since the system is at its absolute energy minimum.

An example of such reconstructions can be seen in Fig. 2, and its uncertainty structures in Fig. 3. Here, the renormalization equation indeed seems to provide a better result compared to the classical one. However, a statistical comparison of the two reconstructions using 1000 realizations of the signal and data in Fig. 4 shows that there is at most a marginal difference. This may be surprising, since the classical and renormalization solutions are quite distinct, and the latter is always lower than the former. One might therefore ask whether the two are bracketing the correct solution. And indeed, intermediate solutions constructed using $T = 1/2$ perform better than the ones for $T = 0$ and $T = 1$, as can be seen in Fig. 4.

If neither $T = 0$ nor $T = 1$ provides the optimal reconstruction, what would be the right choice? We have to remember that we replaced the probability density function at each step of the renormalization scheme by a Gaussian with the correct mean and dispersion. However, the real probability is not a Gaussian, and therefore our mean field estimator is not optimal.
Reconstructions with different $T$ probe the non-Gaussian probability structure with a differently wide Gaussian kernel in phase space, and therefore result in slightly different signal means due to the anharmonic nature of our Hamiltonian.

G. Uncertainty structure

The remaining uncertainties at the end of the renormalization flow can mainly be read off the renormalized propagator $D$, which we display in the top part of Fig. 3 in comparison to the original, un-renormalized one, $D_0$. The renormalized propagator is a much better approximation to the uncertainty dispersion of the signal posterior distribution around the mean map than the original one. One can clearly see that the data imprinted a highly non-uniform structure into the uncertainty pattern visible in the renormalized propagator, with small uncertainties where there were many galaxy counts. The density estimator in Eq. 107 also benefits from the knowledge of the uncertainty structure contained in the renormalized propagator, as the lower panel of Fig. 4 shows.

The propagators also visualize the effect any additional data would have at different locations. The height and width of the propagator values define, respectively, the strength of the response to an information source and the distance over which its information propagates.

The structure of $D_0$ is imprinted by the prior and the mask. At the widest locations of $D_0$ the mask blocks any information source and the structure of the signal prior $S$ becomes visible. At locations where the mask is transparent, the reconstruction response per information source is lower, as plenty of information can be expected there. The propagator width is also smaller, since the individual pieces of information do not need to be propagated that far, thanks to the richer information source density in such regions.
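The imprint of the mask on $D_0$ described above can be reproduced in a small toy setup. This is a sketch under assumptions: the exponential, non-periodic prior covariance, the single rectangular gap in the mask, and the numerical values are all illustrative and not those of Fig. 3.

```python
import numpy as np

n, corr_len, b = 100, 0.1, 0.5                  # illustrative values
x = (np.arange(n) + 0.5) / n
# Assumed unit-variance prior covariance S (exponential, non-periodic).
S = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)

mask = np.where((x > 0.3) & (x < 0.6), 0.0, 1.0)   # 0 = unobserved gap
kappa = 20.0 * mask                                 # zero-response per pixel

# D_0 = (S^-1 + b^2 diag(kappa))^-1 = (1 + S diag(b^2 kappa))^-1 S
D0 = np.linalg.solve(np.eye(n) + S * (b**2 * kappa)[None, :], S)
var0 = np.diag(D0)                                  # local uncertainty variance

centre = int(np.argmin(np.abs(x - 0.45)))           # middle of the gap
print("prior variance:              ", S[centre, centre])
print("D_0 variance in gap centre:  ", var0[centre])
print("largest D_0 variance outside:", var0[mask == 1].max())
```

Inside the gap the diagonal of $D_0$ stays close to the prior variance, whereas outside it is bounded by $1/(b^2 \kappa)$, the inverse of the local information source strength. Plotting rows of `D0` shows the corresponding change in propagator width.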
The structure of $D_m$ additionally carries the imprint of the expected information source density structure given the reconstruction $m$. The strongly non-linear signal response has led to regions with very high galaxy count rates, which have larger information densities, and therefore lower and narrower information propagators. This implies that any additional galaxy detection in the regions with high galaxy counts will have little impact on the updated map, whereas any additional detected galaxies in low density regions will change it more strongly. However, the number of additional galaxies per invested observing time will be larger in high density regions, which may compensate the lower information-per-galaxy ratio there. It is therefore interesting to look at the observational information content and how it depends on the actual data realization.

FIG. 4: Top: Statistical reconstruction error from 1000 signal and data realizations. Curves are, roughly in order from top (bad performance) to bottom (good performance): error $\delta m_{\rm naive} = \langle (s - m_{\rm naive})^2 \rangle^{1/2}_{(d,s)}$ of the signal-covariance-convolved naive map $m_{\rm naive} = S s_{\rm naive}$ (see Eq. 97), expected Wiener uncertainty $\delta \widehat{D}_0 = \widehat{D}_0^{1/2}$, averaged renormalized uncertainty $\delta \widehat{D}_{T=1} = \langle \widehat{D}_{T=1} \rangle^{1/2}_{(d,s)}$, error of the classical map $\delta m_{T=0} = \langle (s - m_{T=0})^2 \rangle^{1/2}_{(d,s)}$, error of the renormalized map $\delta m_{T=1} = \langle (s - m_{T=1})^2 \rangle^{1/2}_{(d,s)}$, and error of the intermediate map $\delta m_{T=0.5} = \langle (s - m_{T=0.5})^2 \rangle^{1/2}_{(d,s)}$. The lowest curve without label is $\kappa$. Bottom: Error variance of estimators for the density, $\varrho = e^s$, namely $\delta m^{\rm very\ naive}_\varrho = \langle (\varrho - e^{m_{\rm naive}})^2 \rangle^{1/2}_{(d,s)}$, $\delta m^{\rm naive}_\varrho = \langle (\varrho - m^{\rm naive}_\varrho)^2 \rangle^{1/2}_{(d,s)}$, and $\delta m_\varrho = \langle (\varrho - m_\varrho)^2 \rangle^{1/2}_{(d,s)}$ (see Eqs. 106 and 107).

H. Information gain

In case of a free theory, the amount of information depends on the experimental setup and on the prior, but is independent of the data obtained, as we have shown in Sect. IV D 2. This changes if one wants to harvest information in a situation described by a non-linear IFT. There, the amount of information can strongly depend on the actual data.

This is well illustrated by our LSS reconstruction problem. A perturbative calculation of the non-linear information gain is possible if either the bias factor or the signal amplitude, which both control the strength of the non-linear interactions, is small compared to unity.$^{13}$

$^{13}$ The signal amplitude can, for example, be made small by defining the signal of interest to be the cosmic density field, smoothed on a sufficiently large scale ($> 10$ Mpc) so that $\langle s^2 \rangle_{(s)} < 1$.

The information gain, as given by Eq. 83, expanded to the first few orders in $b$,

$$ \Delta I_1 = \underbrace{\frac{1}{2} {\rm Tr} \log \left( 1 + S\, \widehat{\kappa\, b^2} \right)}_{\Delta I_0} + \frac{1}{2} \left( \kappa\, b^3\, \widehat{D}_0 \right)^\dagger \left( m_0 + \frac{1}{2}\, b\, (\widehat{D}_0 + m_0^2) \right) + O(b^5), \tag{118} $$

clearly depends on the actual realization of the data. The different fluctuations in the Wiener map $m_0 = D_0\, j$, with $D_0 = (S^{-1} + \widehat{b^2 \kappa})^{-1}$ and $j = b\, (d - \kappa)$, imply positive and negative information density fluctuations.

To conveniently calculate the information gain of the observation in case of a large bias factor, we use the Gaussian approximation of the joint probability function, as provided by the renormalization scheme. Due to the Gaussianity of this approximate solution, we can simply use the formula for the information gain of a free theory, as given by Eq. 86. This yields

$$ \Delta I_d = \frac{1}{2} {\rm Tr} \left[ \log \left( 1 + S\, \widehat{b^2 \kappa\, \eta} \right) \right], \tag{119} $$

with $\eta = e^{b\, m + \frac{1}{2} b^2 \widehat{D}_{T=1}}$ being proportional to the expected number density of galaxies in this region (see Eq. 107). Here, too, it is obvious that the information gain depends on the data. In regions with higher observed galaxy numbers $\eta$ is larger, and more information is expected to be harvested by further observations. This is illustrated in Fig. 5, where the information gain density, i.e. the individual contributions to the trace in Eq. 119, as well as the first and all terms of Eq. 118 are shown for the cases displayed in Figs. 1 and 2. The approximate Eq. 118 seems to be adequate for $b \ll 1$, but not for our cases of $b = 0.5$ and $2.5$.

FIG. 5: Information gain density (the integrands of Eqs. 118 and 119) for the two reconstruction examples presented, the only weakly non-linear one (top, and Fig. 1) and the strongly non-linear one (bottom, and Fig. 2). The renormalization result for $T = 1$ (Eq. 119) and the zeroth- and first-order perturbative results (Eq. 118) are shown. The information gain depends on the observational sensitivity as well as the actual data. The latter influence is stronger in the non-linear regime, and disappears in linear inference problems.

The expected benefit of additional observations at location $x$ can also be calculated by differentiating Eq. 119 with respect to $\kappa(x)$. Using Eqs. 117 and 88 we find

$$ \left\langle \frac{\delta I_d}{\delta \kappa_0} \right\rangle_{({\rm new\ data}|d)} = \frac{1}{2}\, b^2\, \eta \left( 1 + \frac{1}{2}\, \widehat{\kappa_0\, b^2\, \eta}\; D^2\, b^2 \right)^{-1} \widehat{D}. \tag{120} $$

The expected information gain is especially large for observations at locations where the uncertainty $\widehat{D}$ is large, where a large number density of galaxies ($\propto \eta$) can be expected, and where strong non-linearities are present ($\propto b^2$). The inverse term caps the maximally available information gain at some level. For the two reconstruction examples given in Figs. 1 and 2 we display the expected information gain as a function of the observing position in Fig. 6.

FIG. 6: Differential information gain density for the two reconstruction examples presented, the only weakly non-linear one (top, and Fig. 1) and the strongly non-linear one (bottom, and Fig. 2).

It is apparent from the top panel, showing the case of uniform observation coverage, that additional observations are more advantageous at locations where an increased matter density has already been identified. The bottom panel, showing the case of a very inhomogeneous observation of strongly non-linear data, demonstrates that filling observational gaps should have the highest priority. But there again, regions where the extrapolated galaxy density seems to be larger should be preferred, as can be seen from the asymmetric shape of the expected information gain for observations in the gap around $x = 0.2$.
In this example, the information harvest of high galaxy density regions can be so large that further observations of the already well observed regions at the boundary of the domain seem to be more advantageous than improving the poorly observed regions around $x = 0.4$, where a low galaxy density is already apparent from the existing data.

Of course, in order to plan observations in a real case, the dependence of the observational costs on the location $x$ and on the zero-response $\kappa(x)$ already achieved there has to be folded into the considerations.

VI. NON-GAUSSIAN CMB FLUCTUATIONS VIA $f_{\rm nl}$-THEORY

A. Data model

As an IFT example on the sphere $\Omega = S^2$, involving two interacting uncertainty fields, we investigate the so-called $f_{\rm nl}$-theory of local non-Gaussianities in the CMB temperature fluctuations. This problem currently has a high scientific relevance due to the strongly increasing availability of high fidelity CMB measurements, which permit constraints on the physical conditions at very early epochs of the Universe. The relevant references for this topic were provided in Sect. I C 5.

On top of the very uniform CMB sky with a mean temperature $T_{\rm CMB}$, small temperature fluctuations on the level of $\delta T^{\{I,E,B\}}_{\rm obs}/T_{\rm CMB} \sim 10^{-\{5,6,7\}}$ are observed or expected in total intensity (Stokes I) and in polarization E- and B-modes, respectively. The weak B-modes are mainly due to lensing of E-modes and some unknown level of gravity waves. We will disregard them in the following. These CMB temperature fluctuations are believed and observed to follow mostly a Gaussian distribution. However, inflation predicts some level of non-Gaussianity. Some of the secondary anisotropies imprinted by the LSS of the Universe via CMB lensing, the Integrated Sachs-Wolfe and the Rees-Sciama effects should also have imprinted non-Gaussian signatures [215, 216].
The primordial, as well as some of the secondary CMB temperature fluctuations are a response to the gravitational potential initially seeded during inflation. Since we are interested in primordial fluctuations, we write

$$ d \equiv \delta T^{\{I,E\}}_{\rm obs}/T_{\rm CMB} = R\, \varphi + n, \tag{121} $$

where $\varphi$ is the 3-dimensional, primordial gravitational potential, and $R$ is the response to it of a CMB instrument, observing the induced CMB temperature fluctuations in intensity and E-mode polarization. These are imprinted by a number of effects, like gravitational redshifting, the Doppler effect, and anisotropic Thomson scattering. In case that the data of the instrument are foreground-cleaned and deconvolved all-sky maps (assuming the data processing to be part of the instrument), the response, which translates the 3-d gravitational field into temperature maps, is well known from CMB theory and can be calculated with publicly available codes like cmbfast, camb, and cmbeasy (see Sect. I C 5). The precise form of the response does not matter for a development of the basic concept, and can be inserted later.

Finally, the noise $n$ subsumes all deviations of the measurement from the signal response due to instrumental and physical effects which are not linearly correlated with the primordial gravitational potential, such as detector noise, remnants of foreground signals, but also primordial gravitational wave contributions to the CMB fluctuations.

The small level of non-Gaussianity expected in the CMB temperature fluctuations is a consequence of some non-Gaussianity in the primordial gravitational potential.
Despite the lack of a generic non-Gaussian probability function, many of the inflationary non-Gaussianities seem to be well described by a local process, which taints an initially Gaussian random field, $\phi \hookleftarrow P(\phi) = G(\phi, \Phi)$ (with the $\phi$-covariance $\Phi = \langle \phi\, \phi^\dagger \rangle_{(\phi)}$), with some level of non-Gaussianity. A well controllable realization of such a tarnishing operation is provided by a slightly non-linear transformation of $\phi$ into the primordial gravitational potential $\varphi$ via

$$ \varphi(x) = \phi(x) + f_{\rm nl} \left( \phi^2(x) - \langle \phi^2(x) \rangle_{(\phi)} \right) \tag{122} $$

for any $x$. The parameter $f_{\rm nl}$ controls the level and nature of non-Gaussianity via its absolute value and sign, respectively. This means that our data model reads

$$ d = R\, \left( \phi + f\, (\phi^2 - \widehat{\Phi}) \right) + n, \tag{123} $$

where we dropped the subscript of $f_{\rm nl}$. In the following we assume the noise $n$ to be Gaussian with covariance $N = \langle n\, n^\dagger \rangle_{(n)}$ and define as usual $M = R^\dagger N^{-1} R$ for notational convenience.$^{14}$

$^{14}$ Non-Gaussian noise components are in fact expected, and would need to be included in the construction of an optimal $f_{\rm nl}$-reconstruction. However, currently we aim only at outlining the principles, and we are furthermore not aware of a traditional $f_{\rm nl}$-estimator constructed while taking such noise into account. Finally, we show at the end how to identify some of such non-Gaussian noise sources by producing $f_{\rm nl}$-maps on the sphere, which can be compared morphologically to known foreground structures, like our Galaxy.

B. Spectrum, bispectrum, and trispectrum

The non-linearity of the relation between the hidden Gaussian random field $\phi$ and the observable gravitational potential $\varphi$ (Eq. 122) imprints non-Gaussianity into the latter.
In order to be able to extract the value of the non-Gaussianity parameter $f$ from any data containing information on $\varphi$, we need to know its statistics at least up to the four-point function, the trispectrum, which we briefly derive with IFT methods. To that end, it is convenient to define a $\varphi$-moment generating function $Z[J]$ and its logarithm

$$
\begin{aligned}
\log Z[J] &= \log \int \mathcal{D}\phi\; P(\phi)\, e^{J^\dagger \varphi(\phi)} \\
&= \frac{1}{2} J^\dagger \left( \Phi^{-1} - 2\, \widehat{f J} \right)^{-1} J - (f J)^\dagger \widehat{\Phi} - \frac{1}{2} {\rm Tr} \left[ \log \left( 1 - 2\, \Phi\, \widehat{f J} \right) \right].
\end{aligned}
\tag{124}
$$

This permits the calculation, via $J$-derivatives (see Eqs. 32-35), of the mean

$$ \bar{\varphi} = \langle \varphi \rangle_{(\phi)} = 0, \tag{125} $$

the spectrum (or covariance)

$$ C^{(\varphi)}_{xy} = \langle \varphi_x \varphi_y \rangle^{\rm c}_{(\phi)} = \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y \rangle_{(\phi)} = \Phi_{xy} + 2\, f_x\, \Phi^2_{xy}\, f_y, \tag{126} $$

the bispectrum$^{15}$

$$
\begin{aligned}
B^{(\varphi)}_{xyz} &= \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y\, (\varphi - \bar{\varphi})_z \rangle_{(\phi)} = \langle \varphi_x \varphi_y \varphi_z \rangle^{\rm c}_{(\phi)} \\
&= 2\, \left[ \Phi_{xy} f_y \Phi_{yz} + \Phi_{yz} f_z \Phi_{zx} + \Phi_{zx} f_x \Phi_{xy} \right] + 8\, \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zx} f_x,
\end{aligned}
\tag{127}
$$

and the trispectrum

$$
\begin{aligned}
T^{(\varphi)}_{xyzu} &= \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y\, (\varphi - \bar{\varphi})_z\, (\varphi - \bar{\varphi})_u \rangle_{(\phi)} \\
&= \Phi_{xy} \Phi_{zu} + \Phi_{xz} \Phi_{yu} + \Phi_{xu} \Phi_{yz} + \langle \varphi_x \varphi_y \varphi_z \varphi_u \rangle^{\rm c}_{(\phi)} \\
&= \left[ \frac{1}{8} \Phi_{xy} \Phi_{zu} + 2\, \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zu} + \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zu} f_u \Phi_{ux} f_x \right] + 23\ {\rm perm.}
\end{aligned}
\tag{128}
$$

of the gravitational potential.

$^{15}$ Since the bispectrum contains most of the non-Gaussianity signature, we also provide its Fourier-space version, which is well known for the $f_{\rm nl}$-model [e.g. 217]. The bispectrum for $f = {\rm const}$, expressed in terms of the $\varphi$-covariance, reads $B^{(\varphi)}_{xyz} = 2 f\, [ C^{(\varphi)}_{xy} C^{(\varphi)}_{yz} + C^{(\varphi)}_{xz} C^{(\varphi)}_{zy} + C^{(\varphi)}_{yx} C^{(\varphi)}_{xz} ] + O(f^3)$. Fourier transforming this yields $B^{(\varphi)}_{k_1 k_2 k_3} = 2 f\, (2\pi)^3\, \delta(k_1 + k_2 + k_3) \times [ P(k_1) P(k_2) + P(k_2) P(k_3) + P(k_3) P(k_1) ] + O(f^3)$, where $P(k)$ is the power spectrum of $\varphi$, which is identical to that of $\phi$ up to $O(f^2)$.
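Eqs. 125 to 127 can be checked at a single location, where they reduce to $\langle \varphi \rangle = 0$, $\langle \varphi^2 \rangle^{\rm c} = \Phi + 2 f^2 \Phi^2$, and $\langle \varphi^3 \rangle^{\rm c} = 6 f \Phi^2 + 8 f^3 \Phi^3$ (setting $x = y = z$). The sketch below evaluates the moments of $\varphi = \phi + f (\phi^2 - \Phi)$ by Gauss-Hermite quadrature. The values of `f` and `Phi` are arbitrary illustrations (far larger than realistic CMB values, to make the non-Gaussian terms visible), and `local_moments` is a hypothetical helper name.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def local_moments(f, Phi, deg=20):
    """Mean, variance and third central moment of phi + f (phi^2 - Phi)
    for a single-point Gaussian phi ~ N(0, Phi)."""
    x, w = hermegauss(deg)              # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)        # normalise to a unit Gaussian density
    phi = np.sqrt(Phi) * x
    varphi = phi + f * (phi**2 - Phi)   # local f_nl transformation, Eq. 122
    mean = np.sum(w * varphi)
    var = np.sum(w * (varphi - mean) ** 2)
    third = np.sum(w * (varphi - mean) ** 3)
    return mean, var, third

f, Phi = 0.3, 0.5                       # illustrative values only
mean, var, third = local_moments(f, Phi)
print("mean     :", mean, "expected", 0.0)
print("variance :", var, "expected", Phi + 2 * f**2 * Phi**2)
print("3rd mom. :", third, "expected", 6 * f * Phi**2 + 8 * f**3 * Phi**3)
```

Since the integrands are polynomials of low degree, the quadrature is exact up to rounding, so the agreement is a direct check of the coincident-point limits of Eqs. 125 to 127. The sign of the third moment follows the sign of $f$, which is why the bispectrum carries most of the information on $f_{\rm nl}$.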
Since we will investigate the possibility of a spatially varying non-Gaussianity parameter at the end of this section, we keep track of the spatial coordinate of $f$, but for the time being read $f_x = f$.

The spectrum, bispectrum, and trispectrum of our CMB measurement can easily be calculated from the gravitational spectrum, bispectrum, and trispectrum, respectively:

$$
\begin{aligned}
C^{(d)} &= R\, C^{(\varphi)} R^\dagger + N, \\
B^{(d)}_{\hat{n}_1 \hat{n}_2 \hat{n}_3} &= R_{\hat{n}_1 x}\, R_{\hat{n}_2 y}\, R_{\hat{n}_3 z}\, B^{(\varphi)}_{xyz}, \\
T^{(d)}_{\hat{n}_1 \hat{n}_2 \hat{n}_3 \hat{n}_4} &= R_{\hat{n}_1 x}\, R_{\hat{n}_2 y}\, R_{\hat{n}_3 z}\, R_{\hat{n}_4 u}\, T^{(\varphi)}_{xyzu} + \left[ \left( R\, C^{(\varphi)} R^\dagger + \frac{1}{8} N \right)_{\hat{n}_1 \hat{n}_2} N_{\hat{n}_3 \hat{n}_4} + 23\ {\rm permutations} \right],
\end{aligned}
\tag{129}
$$

where $\hat{n}$ denotes the unit vector on the sphere, and we have made use of the assumption of the noise being Gaussian and independent of the signal. In case the noise itself has a bi- or trispectrum, or there is a signal dependent noise, e.g. due to an incorrect instrument calibration, more terms have to be added to the expressions. The usually quoted formulae [e.g. 204, 217, 218, 219] can be obtained from Eq. 129 by applying spherical harmonic transformations.

C. CMB-Hamiltonian

Although we are not interested in the auxiliary field $\phi$, it is nevertheless very useful for its marginalization to define its Hamiltonian, which is

$$
\begin{aligned}
H_f[d, \phi] &= -\log \left( G(\phi, \Phi)\; G(d - R\, (\phi + f\, (\phi^2 - \widehat{\Phi})),\, N) \right) \\
&= \frac{1}{2} \phi^\dagger D^{-1} \phi + H_0 - j^\dagger \phi + \sum_{n=0}^{4} \frac{1}{n!}\, \Lambda^{(n)} [\phi, \ldots, \phi], \quad \mbox{with} \\
D^{-1} &= \Phi^{-1} + R^\dagger N^{-1} R \equiv \Phi^{-1} + M, \\
j &= R^\dagger N^{-1} d, \\
\Lambda^{(0)} &= j^\dagger (f\, \widehat{\Phi}) + \frac{1}{2} (f\, \widehat{\Phi})^\dagger M\, (f\, \widehat{\Phi}), \\
\Lambda^{(1)} &= -(f\, \widehat{\Phi})^\dagger M \quad \mbox{and} \quad j' = j - \Lambda^{(1) \dagger}, \\
\Lambda^{(2)} &= -2\, \widehat{f j'}, \\
\Lambda^{(3)}_{xyz} &= \left( M_{xy}\, f_y\, \delta_{yz} + 5\ {\rm permutations} \right), \\
\Lambda^{(4)}_{xyzu} &= \frac{1}{2} \left( f_x\, \delta_{xy}\, M_{yz}\, \delta_{zu}\, f_u + 23\ {\rm permutations} \right),
\end{aligned}
\tag{130}
$$

and $H_0$ collects all terms independent of $\phi$ and $f$. The last two tensors should be read without the Einstein sum convention, but with all possible index permutations.
Note that this is a non-local theory for $\phi$ in case either the noise covariance or the response matrix is non-diagonal, yielding a non-local $M$ and therefore non-local interactions $\Lambda^{(3)}$ and $\Lambda^{(4)}$.

We should note that Babich [220] derived the now traditional $f_\mathrm{nl}$-estimator from a very similar starting point, the log-probability for $\varphi$. The difference between the resulting estimators is not due to the slightly different approaches ($H_f[d, \varphi]$ versus $H_f[d, \phi]$), but to the frequentist and Bayesian statistics he and we use, respectively.

In case the noise as well as the response is diagonal in position space, as is often assumed for the instrument response of properly cleaned CMB maps, and is also approximately valid on large angular scales, where the Sachs-Wolfe effect dominates, we have $N_{xy} = \sigma_n^2(x)\, \delta(x - y)$, $R = -3$ [215] for the total intensity fluctuations, and thus $M_{xy} = 9\, \sigma_n^{-2}(x)\, \delta(x - y)$, if we restrict the signal space to the last-scattering surface, which we identify with $S^2$. This permits us to simplify the Hamiltonian to

$$H_f[d, \phi] = \frac{1}{2} \phi^\dagger D^{-1} \phi + H_0 - j^\dagger \phi + \sum_{n=0}^{4} \frac{1}{n!} \lambda_n^\dagger \phi^n,$$

with

$$\begin{aligned} D^{-1} &= \Phi^{-1} + 9\, \widehat{\sigma_n^{-2}}, \\ j' &= j - \lambda_1 = 3\, (3\, \widehat{\Phi} f - d)/\sigma_n^2, \\ \lambda_0 &= 3\, (\widehat{\Phi}/\sigma_n^2)^\dagger \left(\tfrac{3}{2} f^2 \widehat{\Phi} - f d\right), \\ \lambda_2 &= -2 f j', \quad \lambda_3 = 54 f/\sigma_n^2, \quad \mathrm{and} \quad \lambda_4 = 108 f^2/\sigma_n^2. \end{aligned} \tag{131}$$

The numerical coefficients of the last two terms may look large. However, these coefficients stand in front of terms of typically $\phi^3 \sim 10^{-15}$ and $\phi^4 \sim 10^{-20}$, which ensures that they are well behaved in any diagrammatic expansion series.

For later usage, we define the Wiener-filter reconstruction of the gravitational potential as $m_0 = D j$.

D. $f_\mathrm{nl}$-evidence and map making

Since we are momentarily not interested in reconstructing the primordial fluctuations, but in extracting knowledge on $f_\mathrm{nl}$, we marginalize the former by calculating the log-evidence $\log P(d|f)$ up to quadratic order in $f$:

$$\log Z_f[d] = \log \int \mathcal{D}\phi\, P(d, \phi | f) = \log \int \mathcal{D}\phi\, e^{-H_f[d, \phi]} = -H_0 - \Lambda^{(0)} + [\text{sum of all connected diagrams up to } \mathcal{O}(f^2)\text{; diagrams not reproduced here}] + \mathcal{O}(f^3). \tag{132}$$

We have made use of the fact that the logarithm of the partition sum is provided by all connected diagrams, and that $j'$ contains a term of the order $\mathcal{O}(f^0)$, $\Lambda^{(2)}$ and $\Lambda^{(3)}$ contain terms of the order $\mathcal{O}(f^1)$, and $\Lambda^{(4)}$ one of the order $\mathcal{O}(f^2)$, so that they can appear an unrestricted number of times, twice, and once in diagrams of order up to $\mathcal{O}(f^2)$, respectively. Since interactions of at most fourth order are involved, an implementation in spherical harmonics space may be feasible using only the $C$-coefficients up to fourth order (Eq. B3), which can be calculated by computer algebra. Finally, we recall that the vacuum loop diagram is given by

$$[\text{vacuum loop diagram}] = \frac{1}{2} \log |2\pi D| = \frac{1}{2}\, \mathrm{Tr}(\log(2\pi D)). \tag{133}$$

Although $f$ is not known, the expressions in Eq. 132 proportional to $f$ and $f^2$ can be calculated separately, which permits us to write down the Hamiltonian of $f$ if a suitable prior $P(f)$ is chosen,

$$H_d[f] \equiv -\log(P(d|f)\, P(f)) = \tilde{H}_0 + \frac{1}{2} f^\dagger \tilde{D}^{-1} f + \tilde{j}^\dagger f + \mathcal{O}(f^3), \tag{134}$$

where we collected the linear and quadratic coefficients into $\tilde{j}$ and $\tilde{D}^{-1}$. The optimal $f$-estimator to lowest order is therefore

$$m_f = \langle f \rangle_{(s, f | d)} = \tilde{D}\, \tilde{j}, \tag{135}$$

and its uncertainty variance is just

$$\langle (f - m_f)\, (f - m_f)^\dagger \rangle_{(s, f | d)} = \tilde{D}. \tag{136}$$

So far, we have assumed $f$ to have a single universal value. However, we can also permit $f$ to vary spatially, or on the sphere of the sky. In the latter case one would expand $f$ as

$$f(x) = \sum_{l=0}^{l_\mathrm{max}} \sum_{m=-l}^{l} f_{lm}\, Y_{lm}(\hat{x}) \tag{137}$$

up to some finite $l_\mathrm{max}$.
Then one would recalculate the partition sum, now separately for terms proportional to $f_{lm}$ and $f_{lm} f_{l'm'}$, which are then sorted into the vector and matrix coefficients of $\tilde{j}$ and $\tilde{D}^{-1}$, respectively, according to

$$\tilde{j}_{(lm)} = \left.\frac{dH_d[f]}{df_{lm}}\right|_{f=0}, \quad \mathrm{and} \quad \tilde{D}^{-1}_{(lm)(l'm')} = \left.\frac{d^2 H_d[f]}{df_{lm}\, df_{l'm'}}\right|_{f=0}. \tag{138}$$

$f$-map making can then proceed as described above in spherical harmonics space. By comparing the resulting map in angular space to known foreground sources, such as our Galaxy, the level of non-Gaussian contamination due to their imperfect removal from the data may be assessed.

E. Comparison to traditional estimator

We conclude this chapter with a short comparison to traditional $f_\mathrm{nl}$-estimators. To our knowledge, the most developed estimator in the literature is based on the CMB bispectrum, which is the third-order correlation function of the data [e.g. 220, 221, and references in Sect. I C 5]. The IFT-based filter presented here contains terms which are up to fourth order in the data, and can therefore be expected to be of higher accuracy, since both methods are supposed to be optimal. Kogo and Komatsu [219] note that the CMB trispectrum should contain significant information on $f_\mathrm{nl}^2$, and may be superior to the bispectrum for non-Gaussianity detection on small angular scales. However, since the trispectrum is insensitive to the sign of $f_\mathrm{nl}$, its actual usage as a proxy is a bit more subtle. In the IFT estimator, any term proportional to $f_\mathrm{nl}^2$ enters the inverse of the propagator $\tilde{D}$, and therefore the trispectrum seems to unfold its $f_\mathrm{nl}$-estimation power mostly in combination with the bispectrum, which drives $\tilde{j}$.

Under which conditions does the traditional estimator emerge from the IFT one?
There are three conceptual differences between the estimators, in that the IFT filter can handle inhomogeneous non-Gaussianity, corrects for chance coupling between the CMB sky and the exposure, and is unbiased with respect to the posterior. The traditional estimator is usually written as

$$\varepsilon = \frac{1}{N} \int dx\, A(x)\, B^2(x) = \frac{1}{N}\, m_0^\dagger \Phi^{-1} m_0^2, \tag{139}$$

where $B = D j = m_0$ is the Wiener-filter reconstruction of the gravitational potential, $A = \Phi^{-1} B$ is the same, just additionally filtered by the inverse power spectrum, and $N$ is a normalization constant [e.g. 202]. This is fixed by the condition that the estimator should be unbiased with respect to all signal and noise realizations,

$$\begin{aligned} N &= \langle m_0^\dagger \Phi^{-1} m_0^2 \rangle_{(d, s | f=1)} = \left.B^{(\varphi)}_{xyz}\right|_{f=1} \left[(M D)_{xu}\, \Phi^{-1}_{uv}\, (D M)_{vy}\, (D M)_{vz}\right] \\ &= 2 \left[\Phi_{xy}\Phi_{yz} + \Phi_{yz}\Phi_{zx} + \Phi_{zx}\Phi_{xy}\right] \times \left[(M D)_{xu}\, \Phi^{-1}_{uv}\, (D M)_{vy}\, (D M)_{vz}\right]. \end{aligned} \tag{140}$$

The first difference between the estimators is obvious, in that the IFT estimator can handle a spatially varying $f(x)$. Therefore, we will only regard spatially constant non-linearity parameters in the following.

Since no CMB experiment is able to measure the monopole temperature fluctuation, the response to any spatially homogeneous signal is zero. This means, in the Fourier basis, that $R_{\hat{n}, k=0} = 0$ and therefore $j_{k=0} = M_{k=0, k'} = 0$. Thus, we find for a Universe with homogeneous statistics ($\widehat{\Phi}_{k \neq 0} = 0$) that $\Lambda^{(0)} = \Lambda^{(1)} = 0$, $j' = j$, and $\Lambda^{(2)} = -2 f\, \widehat{j}$, which reduces the number of diagrams we have to calculate.

The IFT estimator is driven by the $f$-information source $\tilde{j}$, which is given by all diagrams which contain terms linear in $f$. There are four of them, yielding

$$\tilde{j} = \frac{1}{f} \left[\text{two tree diagrams} + \text{two loop diagrams; not reproduced here}\right] = m_0^\dagger \Phi^{-1} m_0^2 + m_0^\dagger \left[\Phi^{-1} \widehat{D} - 2\, \widehat{M D}\right], \tag{141}$$

where we used $M = D^{-1} - \Phi^{-1}$ in order to combine the two tree and the two loop diagrams into the first and second term, respectively.
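The combination step behind Eq. 141 can be made concrete with a small numerical sketch. The uncombined expressions used below are our own reading of the two tree and two loop contributions for spatially constant $f$ (obtained by contracting $\Lambda^{(2)} = -2f\,\widehat{j}$ and $\Lambda^{(3)}$ with $m_0$ and $D$); they are not written out in the text and should be treated as an assumption, as should the toy matrices and the function name. The check only verifies that $M = D^{-1} - \Phi^{-1}$ turns these expressions into the two terms of Eq. 141; it does not enforce the monopole conditions discussed above.

```python
import numpy as np

def f_source_terms(d, R, N, Phi):
    """Tree and loop parts of the f-information source, combined and uncombined."""
    Ninv = np.linalg.inv(N)
    Phi_inv = np.linalg.inv(Phi)
    M = R.T @ Ninv @ R
    D = np.linalg.inv(Phi_inv + M)
    j = R.T @ Ninv @ d
    m0 = D @ j                                  # Wiener-filter reconstruction
    D_hat = np.diag(D)                          # vector of diagonal elements of D
    MD_hat = np.diag(M @ D)                     # vector of diagonal elements of M D

    # Combined form, as in Eq. 141.
    tree = m0 @ Phi_inv @ m0**2
    loop = m0 @ (Phi_inv @ D_hat - 2.0 * MD_hat)

    # Uncombined form (assumed reading of the individual diagrams).
    tree_raw = j @ m0**2 - m0 @ M @ m0**2
    loop_raw = j @ D_hat - m0 @ M @ D_hat - 2.0 * (m0 @ MD_hat)
    return tree, loop, tree_raw, loop_raw

# Toy setup with inhomogeneous noise (assumptions for illustration only).
rng = np.random.default_rng(2)
n_pix = 5
A = rng.normal(size=(n_pix, n_pix))
Phi = A @ A.T + n_pix * np.eye(n_pix)
R = -3.0 * np.eye(n_pix)                        # Sachs-Wolfe-like local response
N = np.diag(rng.uniform(0.5, 2.0, size=n_pix))
d = rng.normal(size=n_pix)

tree, loop, tree_raw, loop_raw = f_source_terms(d, R, N, Phi)
print(np.isclose(tree, tree_raw), np.isclose(loop, loop_raw))
```

Both identities follow from $j - M m_0 = D^{-1} m_0 - M m_0 = \Phi^{-1} m_0$, so the agreement is algebraic and holds for any positive-definite `Phi` and `N`.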
The term resulting from the tree diagrams is actually identical to the unnormalised traditional estimator $\varepsilon$ (Eq. 139). The terms resulting from the loop diagrams vanish for a homogeneous $M$, which a CMB experiment with uniform exposure and constant noise could produce. In case of an inhomogeneous $M$, which is the more realistic case, the loop term does not vanish and corrects for chance correlations between the CMB realization (as seen through $j$) and the noise and response structure of the experiment (as encoded in $M$ and $D$). Creminelli et al. [222] already pointed out that such a linear correction term is necessary in case of an inhomogeneous sky coverage.

Thus, the second difference between the estimators is that the IFT-based one applies a correction for chance correlations of CMB sky and sky exposure, whereas the traditional one does not. This term is absent in the traditional estimator since the latter was constructed as the optimal estimator which is third order in the data. This excluded the loop term, which is linear in the data. Including this term in the traditional estimator is straightforward and is actually done in the more recent $f_\mathrm{nl}$ measurements [e.g. 223]. The normalization constant $N$ is unaffected by this, since the expectation value of the loop term averaged over all possible signal realizations is zero.

This brings us to the third difference between the estimators, the different normalization. The traditional estimator is normalized by a data-independent constant $N$, whereas the IFT estimator is normalized by a data-dependent term

$$\tilde{D}^{-1} = \frac{1}{\sigma_f^2} + \frac{2}{f^2} \left[\text{sum of thirteen diagrams; not reproduced here, see Eqs. C1-C13}\right], \tag{142}$$

where only the first three diagrams are data independent and $\sigma_f^2$ is the variance of the prior, which we assume to be $P(f) = G(f, \sigma_f^2)$. The detailed expressions for the different diagrams can be found in Appendix C.
For both estimators, the traditional and the IFT one, the normalization is supposed to guarantee unbiasedness, however, with respect to different probability distributions. The traditional estimator is unbiased in the frequentist sense, for an average over all signal $f$ and data realizations. The IFT estimator, in contrast, is unbiased in the Bayesian sense, with respect to the posterior, the probability distribution of all signals given the data. Since the data are given, and not assumed to vary any more after the observation is performed, they can and should affect the normalization constant, which encodes the sensitivity of our non-Gaussianity estimation.

The reason for the IFT normalization constant (or $f$-propagator) to be data dependent can be understood as follows. There are data realizations which are better suited to reveal the presence of non-Gaussianities than others, even if they have identical $\tilde{j}$. Such a dependence of the detectability of an effect on the concrete data realization is common in non-linear Bayesian inference, and was even more prominent in the example of the reconstruction of a log-normal density field in Sect. V.

VII. SUMMARY AND OUTLOOK

Starting with fundamental information theoretical considerations about the nature of measurements, signals, noise and their relation to a physical reality given a model of the Universe or the system under consideration, we reformulated the inference problem in the language of information field theory (IFT). IFT is actually a statistical field theory. The information field is identified with a spatially distributed signal, which can freely be chosen by the scientist according to needs and technical constraints. The mathematical apparatus of field theory permits us to deal with the ensemble of all possible field configurations given the data and prior information in a consistent way.
With this conceptual framework, we derived the Hamiltonian of the theory, showed that the free theory reproduces the well-known results of Wiener-filter theory, and presented the Feynman rules for non-linear, interacting Hamiltonians in general, and in particular cases. The latter are information fields over Fourier and spherical harmonics spaces for inference problems in $\mathbb{R}^n$ and $S^2$, respectively. Our "philosophical" considerations permitted us to argue why the resulting IFTs are usually well normalized, but often non-local. Since the propagator of the theory is closely related to the Wiener filter, for which efficient numerical algorithms nowadays exist as image reconstruction and map-making codes, and the information source term is usually a noise-weighted version of the data, the necessary computational tools are at hand to convert the diagrammatic expressions into well-performing algorithms.

Furthermore, we provided the Boltzmann-Shannon information measure of IFT based on the Helmholtz free energy, thereby highlighting the embedding of IFT in the framework of statistical mechanics.

As examples of the IFT recipe, two concrete IFT problems with cosmological motivation were discussed, which are also intended as blueprints for other inference problems. The first was targeted at the problem of reconstructing the spatially continuous cosmic LSS matter distribution from discrete galaxy counts in incomplete galaxy surveys. The resulting algorithm can also be used for image reconstruction with low-number photon statistics, e.g. in low-dose X-ray imaging.

The second example was the design of an optimal method to measure or constrain any possible local non-linearities in the CMB temperature fluctuations. This may serve as a blueprint for statistical monitoring of the linearity of a signal amplifier.
We conclude here with a short outlook on some problems that are accessible to the presented theory.

Many signal inference problems involve the reconstruction of fields without precisely known statistics. Some coefficients in the IFT Hamiltonians may only be phenomenological in nature, and therefore have to be derived from the same data used for the reconstruction itself. This more intricate interplay of parameter and information field can also be incorporated into the IFT framework, as we will show in a subsequent work.

For cosmological applications, along the lines started in this work, clearly more realistic data models need to be investigated. For example, to understand the response in galaxy formation to the underlying dark matter distribution in terms of a realistic, statistical model, to be used in constructing the corresponding IFT Hamiltonian for a dark-matter information field, detailed higher-order correlation coefficients have to be distilled from numerical simulations or semi-analytic descriptions. Also the CMB Hamiltonian may benefit from the inclusion of remnants from the CMB foreground subtraction process, permitting us to gather more solid evidence on fundamental parameters which are hidden in the CMB fluctuations, like the amplitude of non-Gaussianities.

Furthermore, there exist a number of more or less heuristic algorithms for inverse problems, which have proven to serve well under certain circumstances. Reverse engineering of their implicitly assumed priors and data models may permit us to understand better for which conditions they are best suited, as well as how to improve them in case these conditions are not exactly met.

Finally, we are very curious to see whether and how the presented framework may be suitable for inference problems in other scientific fields.
Acknowledgements

It is a pleasure to thank the following people for helpful scientific discussions on various aspects of this work: Simon White on the dangers of perturbation theory, Benjamin Wandelt on the prospects of large-scale structure reconstruction, Jens Jasche on the pleasures and pains of signal processing, Jörg Rachen on the philosophy of science, and André Waelkens on the invariant, but vertiginous theory of isotropic tensors. We thank Cornelius Weig and Henrik Junklewitz for debates on the connection between IFT and QFT. We gratefully acknowledge helpful comments on the manuscript by Marcus Brüggen and Thomas Riller and by three very constructive referees.

APPENDIX A: NOTATION

We briefly summarize our notation of functions in position and Fourier space.

A function $f(x)$ over the $n$-dimensional space, here usually real but in principle also complex, is regarded as a vector $f$ in a discrete and finite-dimensional, or continuous and infinite-dimensional Hilbert space. $f$ will denote this vector, independently of the momentarily chosen function basis, be it the real space $f(x) = \langle x | f \rangle$ or the Fourier basis

$$f(k) = \langle k | f \rangle = \int dx\, f(x)\, e^{i k \cdot x}. \tag{A1}$$

Here, the volume integration is usually performed only over a finite domain with volume $V$. This leads to the convention for the origin of the delta function in $k$-space,

$$\delta(0) = \frac{V}{(2\pi)^n}, \tag{A2}$$

and also to a Fourier transformation operator $F = |k\rangle\langle x|$, with $F_{kx} = e^{i k x}$, and its inverse $F^\dagger = |x\rangle\langle k|$, with $F^\dagger_{xk} = e^{-i k x}$. The dagger is used to denote transposed and complex conjugated objects.
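The sign and normalization conventions of Eqs. A1 and A2 differ from the defaults of common FFT libraries, so a short illustration may help. The sketch below assumes a one-dimensional periodic grid of `n` points in a box of volume `V` (both arbitrary illustrative choices, as are the function names) and maps the conventions onto NumPy's FFT: the forward transform with $e^{+ikx}$ and measure $dx$ corresponds to `V * ifft`, and the inverse with $e^{-ikx}$ and measure $dk/(2\pi)$, i.e. mode spacing $2\pi/V$, corresponds to `fft / V`.

```python
import numpy as np

def to_fourier(fx, V):
    """Discrete version of Eq. A1: f(k) = int dx f(x) exp(+i k x) on a periodic grid."""
    return V * np.fft.ifft(fx)

def to_position(fk, V):
    """Inverse transform: f(x) = int dk/(2 pi) f(k) exp(-i k x), mode spacing 2 pi / V."""
    return np.fft.fft(fk) / V

# Illustrative grid and test function (assumptions, not from the paper).
n, V = 16, 2.0
x = np.arange(n) * V / n
k = 2.0 * np.pi * np.fft.fftfreq(n, d=V / n)    # wave numbers in FFT ordering
fx = np.exp(-((x - 1.0) ** 2) / 0.1)

fk = to_fourier(fx, V)
# Direct Riemann-sum evaluation of Eq. A1 for comparison.
fk_direct = np.array([np.sum(fx * np.exp(1j * kk * x)) * (V / n) for kk in k])
roundtrip = to_position(fk, V)

print(np.allclose(fk, fk_direct), np.allclose(roundtrip, fx))
```

With these definitions the zero mode `fk[0]` equals the integral of `fx` over the box, which is a quick way to confirm that the volume factor has been placed on the forward transform.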
We have $(F^\dagger F)_{xy} = 1_{xy}$ as well as $(F F^\dagger)_{kk'} = 1_{kk'}$ for the following definition of the scalar product of two functions $f$ and $g$ in real and Fourier space:

$$f^\dagger g = \langle f | g \rangle = \int dx\, f^*(x)\, g(x) = \int \frac{dk}{(2\pi)^n}\, f^*(k)\, g(k), \tag{A3}$$

where the asterisk denotes complex conjugation. The statistical power spectrum of $f$ is denoted by $P_f(k) = \langle |f(k)|^2 \rangle_{(f)} / V$. We also introduce for convenience the position-space component-wise product of two functions

$$(f g)(x) \equiv f(x)\, g(x), \tag{A4}$$

which also permits compact notations like

$$(\log f)(x) = \log(f(x)), \quad (f/g)(x) = f(x)/g(x), \tag{A5}$$

and the like. The component-wise product should not be confused with the tensor product of two vectors $(f g^\dagger)(x, y) = f(x)\, g^*(y)$.

The diagonal components of a matrix $M$ in position-space representation form a vector which we denote by

$$\widehat{M} = \mathrm{diag}_x M, \quad \mathrm{with} \quad \widehat{M}_x = M_{xx}. \tag{A6}$$

Similarly, a diagonal matrix in position-space representation, whose diagonal components are given by a vector $f$, will be denoted by

$$\widehat{f} = \mathrm{diag}_x f \quad \mathrm{with} \quad \widehat{f}_{xy} = f_x\, 1_{xy}. \tag{A7}$$

Thus, $\widehat{\widehat{M}} = M$ if and only if $M$ is diagonal, and $\widehat{\widehat{f}} = f$ always.

In our notation a multivariate Gaussian reads:

$$G(s, S) = \frac{1}{|2\pi S|^{1/2}} \exp\left(-\frac{1}{2} s^\dagger S^{-1} s\right). \tag{A8}$$

Here, $S = \langle s\, s^\dagger \rangle_{(s)}$ denotes the covariance tensor of the Gaussian field $s$, which is drawn from $P(s) = G(s, S)$. If $s$ is statistically homogeneous, $S$ is fully described by the power spectrum $P_s(k)$:

$$S_{k k'} = (2\pi)^n\, \delta(k - k')\, P_s(k), \quad S^{-1}_{k k'} = (2\pi)^n\, \delta(k - k')\, (P_s(k))^{-1}. \tag{A9}$$

The Fourier representation of the trace of a Fourier-diagonal operator,

$$\mathrm{Tr}(A) = \int dx\, A_{xx} = V \int \frac{dk}{(2\pi)^n}\, P_A(k), \tag{A10}$$

is very useful in combination with the following expression for the determinant of a Hermitian matrix,

$$\log |A| = \mathrm{Tr}(\log A). \tag{A11}$$

Furthermore, we usually suppress the dependency of probabilities on the underlying model $I$ and its parameters $\theta$ in our notation. That is, instead of $P(s | \theta, I)$ we just write $P(s)$ or $P(s | \theta)$, depending on our focus. Here $\theta = (S, N, R, \ldots)$ contains all the parameters of the model, which are assumed to be known within this work.

APPENDIX B: FEYNMAN RULES ON THE SPHERE

Here, we provide the Feynman rules on the sphere. The real-space rules are identical to those of flat spaces, with just the scalar product replaced by the integral over the sphere, etc. In case the problem at hand has an isotropic propagator, which only depends on the distance of two points on the sphere, but not on their location or orientation, the propagator is diagonal if expressed in spherical harmonics $Y_{lm}(x)$. Thanks to the orthogonality relation of spherical harmonics, we have for $x, y \in S^2$

$$(Y Y^\dagger)_{xy} = \sum_{lm} Y_{lm}(x)\, Y^*_{lm}(y) = \delta(x - y) = (1)_{xy} \tag{B1}$$

and

$$(Y^\dagger Y)_{(l,m)(l',m')} = \int dx\, Y^*_{lm}(x)\, Y_{l'm'}(x) = \delta_{ll'}\, \delta_{mm'} = (1)_{(l,m)(l',m')}. \tag{B2}$$

Therefore, we can just insert real-space identity matrices $1 = Y Y^\dagger$ in between any terms of a real-space diagrammatic expression and assign $Y^\dagger$ to the right, and $Y$ to the left term of it. This way we find the spherical-harmonics Feynman rules, which are very similar to the Fourier-space ones, in that they also require directed propagator lines for proper angular-momentum conservation. For a theory with only local interactions, these read:

1. An open end of a line has external (not summed) angular-momentum quantum numbers $(l, m)$.

2. A line connecting momentum $(l, m)$ with momentum $(l', m')$ corresponds to a propagator between these momenta: $D_{(l,m)(l',m')} = C_D(l)\, \delta_{ll'}\, \delta_{mm'}$, where $C_D(l)$ is the angular power spectrum of the propagator.

3. A data source vertex is $(j + J - \lambda_1)(l, m)$, where $(l, m)$ is the angular momentum at the data end of the line.

4. A vertex with quantum number $(l_0, m_0)$ with $n_\mathrm{in}$ incoming and $n_\mathrm{out}$ outgoing lines ($n_\mathrm{in} + n_\mathrm{out} > 1$) with momentum labels $(l_1, m_1) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})$ and $(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})$, respectively, is given by $-\lambda_m(l_0, m_0)\, C^{(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})}_{(l_0, m_0) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})}$, where $C$ will be defined in Eq. B3.

5. An internal vertex has internal (summed) angular-momentum quantum numbers $(l', m')$. Summation means a term $\sum_{l'=0}^{\infty} \sum_{m'=-l'}^{l'}$ in front of the expression.

6. The expression gets divided by the symmetry factor of its diagram.

The interaction structure in spherical harmonics space is complicated due to the non-orthogonality of powers and products of the spherical harmonic functions, compared to the Fourier-space case, where any power or product of Fourier-basis functions is again a single Fourier-basis function.

The spherical structure is encapsulated in the coefficients

$$C^{(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})}_{(l_0, m_0) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})} \equiv \int dx \left(\prod_{i=0}^{n_\mathrm{in}} Y_{l_i m_i}(x)\right) \left(\prod_{i=1}^{n_\mathrm{out}} Y^*_{l'_i m'_i}(x)\right), \tag{B3}$$

which can be expressed in terms of sums and products of Wigner coefficients, thanks to the relations $Y^*_{lm}(x) = Y_{l,-m}(x)$,

$$Y_{l_1 m_1}(x)\, Y_{l_2 m_2}(x) = \sum_{lm} \sqrt{\frac{(2 l_1 + 1)\, (2 l_2 + 1)\, (2 l + 1)}{4\pi}} \begin{pmatrix} l_1 & l_2 & l \\ m_1 & m_2 & m \end{pmatrix} Y_{lm}(x) \begin{pmatrix} l_1 & l_2 & l \\ 0 & 0 & 0 \end{pmatrix}, \tag{B4}$$

and the orthogonality relation in Eq. B2, to be applied successively in this order. Due to this complication, it is probably most efficient to calculate propagation in spherical harmonics space, but to change back to real space for any interaction vertex of high order.

APPENDIX C: $f_\mathrm{nl}$-PROPAGATOR

We provide in the following the individual terms of the $f_\mathrm{nl}$-propagator in Eq. 142.
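The individual diagrams are all $\mathcal{O}(f^2)$ and are given here for the case $f = 1$. The diagrams themselves are not reproduced here; the expressions are listed in the order in which the diagrams appear in Eq. 142:

$$[\text{diagram 1}] = -\mathrm{Tr}\left(D^2 M\right) - \frac{1}{2}\, \widehat{D}^\dagger M \widehat{D} \tag{C1}$$

$$[\text{diagram 2}] = \frac{1}{2} \left[2\, \widehat{D M} + \widehat{D} M\right]^\dagger D \left[2\, \widehat{M D} + M \widehat{D}\right] \tag{C2}$$

$$[\text{diagram 3}] = \mathrm{Tr}\left(D^2 M D M\right) + 2\, M_{xy} D_{yy'} M_{y'x'} D_{x'y} D_{xx'} \tag{C3}$$

$$[\text{diagram 4}] = j^\dagger D^2 j \tag{C4}$$

$$[\text{diagram 5}] = -2\, m^\dagger M D^2 j - 4\, \mathrm{Tr}\left[\widehat{m}\, D\, \widehat{j}\, D M\right] \tag{C5}$$

$$[\text{diagram 6}] = m^\dagger M D^2 M m + 4\, \mathrm{Tr}\left[\widehat{m}\, D\, \widehat{M m}\, D M\right] + 2\, \mathrm{Tr}\left[\widehat{m}\, D\, (\widehat{m} M + M \widehat{m})\, D M\right] \tag{C6}$$

$$[\text{diagram 7}] = -2 \left[2\, \widehat{D M} + \widehat{D} M\right]^\dagger D\, \widehat{j}\, m \tag{C7}$$

$$[\text{diagram 8}] = -m^{2\,\dagger} M \widehat{D} - 2\, \mathrm{Tr}\left[\widehat{m} M \widehat{m} D\right] \tag{C8}$$

$$[\text{diagram 9}] = \left[2\, \widehat{D M} + \widehat{D} M\right]^\dagger D \left[2\, \widehat{m} M m + M m^2\right] \tag{C9}$$

$$[\text{diagram 10}] = \frac{1}{2} \left[2\, \widehat{m} M m + M m^2\right]^\dagger D \left[2\, \widehat{m} M m + M m^2\right] \tag{C10}$$

$$[\text{diagram 11}] = -2\, (m j)^\dagger D \left(M m^2 + 2\, \widehat{m} M m\right) \tag{C11}$$

$$[\text{diagram 12}] = -\frac{1}{2}\, m^{2\,\dagger} M m^2 \tag{C12}$$

$$[\text{diagram 13}] = 2\, (m j)^\dagger D\, (j m) \tag{C13}$$

We used here the conventions $m = D j$ and $(D^2)_{xy} = (D_{xy})^2$, and recall that $\Lambda^{(0)} = \Lambda^{(1)} = 0$, $j' = j$, $\Lambda^{(2)} = -2 f\, \widehat{j}$, $\Lambda^{(3)}_{xyz} = [M_{xy} \delta_{yz} + 5\ \mathrm{perm.}]$, and $\Lambda^{(4)}_{xyzu} = \frac{1}{2} [\delta_{xy} M_{yz} \delta_{zu} + 23\ \mathrm{perm.}]$.

[1] T. Bayes, Phil. Trans. Roy. Soc. 53, 370 (1763).
[2] C. E. Shannon, Bell System Technical Journal 27, 379 (1948).
[3] C. E. Shannon and W. Weaver, The mathematical theory of communication (University of Illinois Press, Urbana, 1949).
[4] E. T. Jaynes, Physical Review 106, 620 (1957).
[5] E. T. Jaynes, Physical Review 108, 171 (1957).
[6] E. T. Jaynes, in Statistical Physics 3 (1963), p. 181.
[7] E. T. Jaynes, American Journal of Physics 33, 391 (1965).
[8] E. T. Jaynes, IEEE Trans. on Systems Science and Cybernetics SSC-4, 227 (1968).
[9] E. T. Jaynes, in Proc. IEEE, Volume 70 (1982), pp. 939–952.
[10] E. T. Jaynes and R. Baierlein, Physics Today 57, 76 (2004).
[11] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Journal of Chemical Physics 21, 1087 (1953).
[12] W. K. Hastings, Biometrika 57, 97 (1970).
[13] S. Geman and D. Geman, IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721 (1984).
[14] S. Duane, A.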
Kennedy, B. Pendleton, and D. Roweth, Phys. Lett. B 195, 216 (1987).
[15] K. P. N. Murthy, M. Janani, and B. Shenbga Priya, ArXiv Computer Science e-prints (2005), arXiv:cs/0504037.
[16] M. A. Tanner, Tools for statistical inference (Springer-Verlag, New York, 1996).
[17] R. M. Neal, in Technical Report CRG-TR-93-1 (Dept. of Computer Science, University of Toronto, 1993).
[18] C. P. Robert, The Bayesian choice (Springer-Verlag, New York, 2001).
[19] A. Gelman, J. B. Carlin, H. S. Stern, and D. Rubin, Bayesian data analysis (Chapman & Hall/CRC, Boca Raton, Florida, 2004).
[20] R. A. Aster, B. Borchers, and C. H. Thurber, Parameter estimation and inverse problems (Elsevier Academic Press, London, 2005).
[21] R. Trotta, ArXiv e-prints (2008), 0803.4089.
[22] N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series (Wiley, New York, 1949).
[23] W. H. Richardson, Journal of the Optical Society of America (1917-1983) 62, 55 (1972).
[24] L. B. Lucy, AJ 79, 745 (1974).
[25] B. R. Frieden, Journal of the Optical Society of America (1917-1983) 62, 511 (1972).
[26] S. F. Gull and G. J. Daniell, Nature (London) 272, 686 (1978).
[27] J. Skilling, A. W. Strong, and K. Bennett, MNRAS 187, 145 (1979).
[28] R. K. Bryan and J. Skilling, MNRAS 191, 69 (1980).
[29] S. F. Burch, S. F. Gull, and J. Skilling, Computer Vision Graphics and Image Processing 23, 113 (1983).
[30] S. F. Gull and J. Skilling, in Indirect Imaging: Measurement and Processing for Indirect Imaging, Proceedings of an International Symposium held in Sydney, Australia, August 30-September 2, 1983, edited by J. A. Roberts (Cambridge University Press, Cambridge, 1984), ISBN 0-521-26282-8, p. 267.
[31] S. Sibisi, J. Skilling, R. G. Brereton, E. D. Laue, and J.
Staunton, Nature (London) 311, 446 (1984).
[32] D. M. Titterington and J. Skilling, Nature (London) 312, 381 (1984).
[33] J. Skilling and R. K. Bryan, MNRAS 211, 111 (1984).
[34] R. K. Bryan and J. Skilling, Journal of Modern Optics 33, 287 (1986).
[35] S. F. Gull, in Maximum Entropy and Bayesian Methods, edited by J. Skilling (Kluwer Academic Publishers, Dordrecht, 1989), pp. 53–71.
[36] S. F. Gull and J. Skilling, The MEMSYS5 User's Manual (Maximum Entropy Data Consultants Ltd, Royston, 1990).
[37] J. Skilling, in Maximum Entropy and Bayesian Methods, edited by G. J. Erickson, J. T. Rychert, and C. R. Smith (1998), p. 1.
[38] F. S. Kitaura and T. A. Enßlin, MNRAS 389, 497 (2008), 0705.0429.
[39] R. Narayan and R. Nityananda, ARAA 24, 127 (1986).
[40] R. Molina, J. Nunez, F. J. Cortijo, and J. Mateos, Signal Processing Magazine, IEEE 18, 11 (2001).
[41] E. Bertschinger, ApJL 323, L103 (1987).
[42] J. N. Fry, Astrophys. J. 289, 10 (1985).
[43] W. Bialek and A. Zee, Physical Review Letters 58, 741 (1987).
[44] W. Bialek and A. Zee, Physical Review Letters 61, 1512 (1988).
[45] W. Bialek, C. G. Callan, and S. P. Strong, Physical Review Letters 77, 4693 (1996), arXiv:cond-mat/9607180.
[46] P. Stoica, E. G. Larsson, and J. Li, AJ 120, 2163 (2000).
[47] T. Enßlin and M. Frommert, in preparation (2009).
[48] J. C. Lemm, ArXiv Physics e-prints (1999), physics/9912005.
[49] J. C. Lemm and J. Uhlig, Few-Body Systems 29, 25 (2000), arXiv:quant-ph/0006027.
[50] J. C. Lemm, J. Uhlig, and A. Weiguny, Physical Review Letters 84, 2068 (2000), arXiv:cond-mat/9907013.
[51] J. C. Lemm and J. Uhlig, Physical Review Letters 84, 4517 (2000), arXiv:nucl-th/9908056.
[52] J. C. Lemm, Physics Letters A 276, 19 (2000).
[53] J. C. Lemm, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering, edited by A. Mohammad-Djafari (2001), vol.
568 of American Institute of Physics Conference Series, pp. 425–436.
[54] J. C. Lemm, J. Uhlig, and A. Weiguny, European Physical Journal B 20, 349 (2001), arXiv:quant-ph/0005122.
[55] J. C. Lemm, J. Uhlig, and A. Weiguny, European Physical Journal B 46, 41 (2005).
[56] J. C. Lemm, ArXiv Condensed Matter e-prints (1998), cond-mat/9808039.
[57] J. Binney, N. Dowrick, A. Fisher, and M. Newman, The theory of critical phenomena (Oxford University Press, Oxford, UK, 1992), ISBN 0-19-851394-1.
[58] M. E. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory (Westview Press, Boulder, Colorado, 1995), ISBN-13 978-0-201-50397-5.
[59] A. Zee, Quantum field theory in a nutshell (Princeton University Press, Princeton, NJ, 2003), ISBN 0691010196.
[60] S. Matarrese, F. Lucchin, and S. A. Bonometto, ApJL 310, L21 (1986).
[61] Y. B. Zel'dovich, A&A 5, 84 (1970).
[62] J. M. Bardeen, J. R. Bond, N. Kaiser, and A. S. Szalay, Astrophys. J. 304, 15 (1986).
[63] P. J. E. Peebles, The large-scale structure of the universe (Princeton University Press, Princeton, N.J., 1980), 435 p.
[64] N. Kaiser, MNRAS 227, 1 (1987).
[65] P. J. E. Peebles, Astrophys. J. 362, 1 (1990).
[66] F. Bernardeau, ApJL 390, L61 (1992).
[67] S. Zaroubi and Y. Hoffman, Astrophys. J. 462, 25 (1996).
[68] A. J. S. Hamilton, in The Evolving Universe, edited by D. Hamilton (Kluwer Academic Publishers, Dordrecht, 1998), vol. 231 of Astrophysics and Space Science Library, p. 185.
[69] F. Bernardeau, M. J. Chodorowski, E. L. Lokas, R. Stompor, and A. Kudlicki, MNRAS 309, 543 (1999), astro-ph/9901057.
[70] E. Branchini, L. Teodoro, C. S. Frenk, I. Schmoldt, G. Efstathiou, S. D. M. White, W. Saunders, W. Sutherland, M. Rowan-Robinson, O.
Keeble, et al., MNRAS 308, 1 (1999), astro-ph/9901366.
[71] A. Dekel and O. Lahav, Astrophys. J. 520, 24 (1999), astro-ph/9806193.
[72] S. Zaroubi, ArXiv Astrophysics e-prints (2002), astro-ph/0206052.
[73] R. E. Smith, J. A. Peacock, A. Jenkins, S. D. M. White, C. S. Frenk, F. R. Pearce, P. A. Thomas, G. Efstathiou, and H. M. P. Couchman, MNRAS 341, 1311 (2003), arXiv:astro-ph/0207664.
[74] R. Scoccimarro, Phys. Rev. D 70, 083007 (2004), astro-ph/0407214.
[75] V. Springel, S. D. M. White, A. Jenkins, C. S. Frenk, N. Yoshida, L. Gao, J. Navarro, R. Thacker, D. Croton, J. Helly, et al., Nature (London) 435, 629 (2005), arXiv:astro-ph/0504097.
[76] P. Valageas, A&A 421, 23 (2004), arXiv:astro-ph/0307008.
[77] P. Valageas, A&A 476, 31 (2007), arXiv:0706.2593.
[78] P. Valageas, A&A 484, 79 (2008), arXiv:0711.3407.
[79] M. Crocce and R. Scoccimarro, Physical Review D 73, 063519 (2006), arXiv:astro-ph/0509418.
[80] M. Crocce and R. Scoccimarro, Physical Review D 73, 063520 (2006), arXiv:astro-ph/0509419.
[81] P. McDonald, Phys. Rev. D 74, 103512 (2006), arXiv:astro-ph/0609413.
[82] P. McDonald, Phys. Rev. D 74, 129901(E) (2006).
[83] P. McDonald, Phys. Rev. D 75, 043514 (2007), arXiv:astro-ph/0606028.
[84] D. Jeong and E. Komatsu, Astrophys. J. 651, 619 (2006), arXiv:astro-ph/0604075.
[85] D. Jeong and E. Komatsu, ArXiv e-prints (2008), 0805.2632.
[86] S. Matarrese and M. Pietroni, Journal of Cosmology and Astro-Particle Physics 6, 26 (2007), arXiv:astro-ph/0703563.
[87] J. Gaite and A. Domínguez, Journal of Physics A Mathematical General 40, 6849 (2007), arXiv:astro-ph/0610886.
[88] S. Matarrese and M. Pietroni, Modern Physics Letters A 23, 25 (2008), arXiv:astro-ph/0702653.
[89] T. Matsubara, Phys. Rev. D 77, 063530 (2008), arXiv:0711.2521.
[90] M. Pietroni, ArXiv e-prints (2008), 0806.0971.
[91] E. Bertschinger and A. Dekel, ApJL 336, L5 (1989).
[92] E. Bertschinger and A. Dekel, in ASP Conf. Ser. 15: Large-Scale Structures and Peculiar Motions in the Universe, edited by D. W. Latham and L. A. N. da Costa (1991), p. 67.
[93] P. J. E. Peebles, ApJL 344, L53 (1989).
[94] A. Dekel, E. Bertschinger, and S. M. Faber, Astrophys. J. 364, 349 (1990).
[95] N. Kaiser and A. Stebbins, in ASP Conf. Ser. 15: Large-Scale Structures and Peculiar Motions in the Universe, edited by D. W. Latham and L. A. N. da Costa (1991), p. 111.
[96] Y. Hoffman and E. Ribak, ApJL 380, L5 (1991).
[97] D. H. Weinberg, MNRAS 254, 315 (1992).
[98] A. Nusser and A. Dekel, Astrophys. J. 391, 443 (1992).
[99] G. B. Rybicki and W. H. Press, Astrophys. J. 398, 169 (1992).
[100] M. Gramann, Astrophys. J. 405, 449 (1993).
[101] G. Ganon and Y. Hoffman, ApJL 415, L5 (1993).
[102] F. Bernardeau, A&A 291, 697 (1994), astro-ph/9403020.
[103] A. Nusser and M. Davis, ApJL 421, L1 (1994), astro-ph/9309009.
[104] O. Lahav, in ASP Conf. Ser. 67: Unveiling Large-Scale Structures Behind the Milky Way, edited by C. Balkowski and R. C. Kraan-Korteweg (1994), p. 171.
[105] O. Lahav, K. B. Fisher, Y. Hoffman, C. A. Scharf, and S. Zaroubi, ApJL 423, L93 (1994), astro-ph/9311059.
[106] K. B. Fisher, O. Lahav, Y. Hoffman, D. Lynden-Bell, and S. Zaroubi, MNRAS 272, 885 (1995), astro-ph/9406009.
[107] R. K. Sheth, MNRAS 277, 933 (1995), astro-ph/9511096.
[108] S. Zaroubi, Y. Hoffman, K. B. Fisher, and O. Lahav, Astrophys. J. 449, 446 (1995), astro-ph/9410080.
[109] M. Tegmark and B. C. Bromley, Astrophys. J. 453, 533 (1995), astro-ph/9409038.
[110] R. A. C. Croft and E. Gaztanaga, MNRAS 285, 793 (1997), astro-ph/9602100.
[111] V. K. Narayanan and D. H. Weinberg, Astrophys. J. 508, 440 (1998), astro-ph/9806238.
[112] U.-L. Pen, Astrophys. J. 504, 601 (1998), astro-ph/9711180.
[113] U. Seljak, Astrophys. J. 503, 492 (1998), astro-ph/9710269.
[114] U. Seljak, Astrophys. J.
506, 64 (1998), astro-ph/9711124.
[115] V. Bistolas and Y. Hoffman, Astrophys. J. 492, 439 (1998), astro-ph/9707243.
[116] A. Taylor and H. Valentine, MNRAS 306, 491 (1999), astro-ph/9901171.
[117] V. K. Narayanan and R. A. C. Croft, Astrophys. J. 515, 471 (1999), astro-ph/9806255.
[118] S. Zaroubi, Y. Hoffman, and A. Dekel, Astrophys. J. 520, 413 (1999), astro-ph/9810279.
[119] D. M. Goldberg and D. N. Spergel, in ASP Conf. Ser. 201: Cosmic Flows Workshop, edited by S. Courteau and J. Willick (2000), p. 282.
[120] D. M. Goldberg and D. N. Spergel, Astrophys. J. 544, 21 (2000), astro-ph/9912408.
[121] A. Kudlicki, M. Chodorowski, T. Plewa, and M. Różyczka, MNRAS 316, 464 (2000), astro-ph/9910018.
[122] S. Basilakos and M. Plionis, Astrophys. J. 550, 522 (2001), astro-ph/0011265.
[123] D. M. Goldberg, Astrophys. J. 552, 413 (2001), astro-ph/0008266.
[124] U. Frisch, S. Matarrese, R. Mohayaee, and A. Sobolevski, Nature (London) 417, 260 (2002), arXiv:astro-ph/0109483.
[125] S. Zaroubi, MNRAS 331, 901 (2002), astro-ph/0010561.
[126] Y. Brenier, U. Frisch, M. Hénon, G. Loeper, S. Matarrese, R. Mohayaee, and A. Sobolevskiĭ, MNRAS 346, 501 (2003), astro-ph/0304214.
[127] R. Mohayaee, U. Frisch, S. Matarrese, and A. Sobolevskii, A&A 406, 393 (2003), arXiv:astro-ph/0301641.
[128] R. Mohayaee, B. Tully, and U. Frisch, ArXiv Astrophysics e-prints (2004), astro-ph/0410063.
[129] C. S. Botzler, J. Snigula, R. Bender, and U. Hopp, MNRAS 349, 425 (2004), arXiv:astro-ph/0312018.
[130] R. Mohayaee and R. B. Tully, ApJL 635, L113 (2005), astro-ph/0509313.
[131] R. Mohayaee, H. Mathis, S. Colombi, and J. Silk, MNRAS 365, 939 (2006), astro-ph/0501217.
[132] V. Icke and R. van de Weygaert, QJRAS 32, 85 (1991).
[133] S. Ikeuchi and E. L. Turner, MNRAS 250, 519 (1991).
[134] F. Bernardeau and R. van de Weygaert, MNRAS 279, 693 (1996).
[135] W. E. Schaap and R.
van de Weygaert, A&A 363, L29 (2000), astro-ph/0011007.
[136] R. van de Weygaert and W. Schaap, in Mining the Sky, edited by A. J. Banday, S. Zaroubi, and M. Bartelmann (2001), p. 268.
[137] M. Ramella, W. Boschin, D. Fadda, and M. Nonino, A&A 368, 776 (2001), arXiv:astro-ph/0101411.
[138] L. Zaninetti, Chinese Journal of Astronomy and Astrophysics 6, 387 (2006), arXiv:astro-ph/0602431.
[139] E. Bertschinger, A. Dekel, S. M. Faber, A. Dressler, and D. Burstein, Astrophys. J. 364, 370 (1990).
[140] A. Yahil, M. A. Strauss, M. Davis, and J. P. Huchra, Astrophys. J. 372, 380 (1991).
[141] K. B. Fisher, C. A. Scharf, and O. Lahav, MNRAS 266, 219 (1994), astro-ph/9309027.
[142] E. J. Shaya, P. J. E. Peebles, and R. B. Tully, Astrophys. J. 454, 15 (1995), astro-ph/9506144.
[143] E. Branchini, M. Plionis, and D. W. Sciama, ApJL 461, L17 (1996), astro-ph/9512055.
[144] M. Webster, O. Lahav, and K. Fisher, MNRAS 287, 425 (1997), astro-ph/9608021.
[145] C. Yess, S. F. Shandarin, and K. B. Fisher, Astrophys. J. 474, 553 (1997), astro-ph/9605041.
[146] I. M. Schmoldt, V. Saar, P. Saha, E. Branchini, G. P. Efstathiou, C. S. Frenk, O. Keeble, S. Maddox, R. McMahon, S. Oliver, et al., Astrophys. J. 118, 1146 (1999), astro-ph/9906035.
[147] A. Nusser and M. Haehnelt, MNRAS 303, 179 (1999), astro-ph/9806109.
[148] M. Tegmark and B. C. Bromley, The Astrophysical Journal 518, L69 (1999), URL http://www.citebase.org/abstract?id=oai:arXiv.org:astro-ph/9809324.
[149] Y. Hoffman and S. Zaroubi, ApJL 535, L5 (2000), astro-ph/0003306.
[150] D. M. Goldberg, Astrophys. J. 550, 87 (2001), astro-ph/0009046.
[151] H. Mathis, G. Lemson, V. Springel, G. Kauffmann, S. D. M. White, A. Eldar, and A. Dekel, MNRAS 333, 739 (2002), astro-ph/0111099.
[152] P. Erdoğdu, O. Lahav, S. Zaroubi, et al., MNRAS 352, 939 (2004), astro-ph/0312546.
[153] M. S. Vogeley, F. Hoyle, R. R. Rojas, and D.
M. Goldberg, in IAU Colloq. 195: Outskirts of Galaxy Clusters: Intense Life in the Suburbs, edited by A. Diaferio (2004), pp. 5–11.
[154] J. Huchra, T. Jarrett, M. Skrutskie, R. Cutri, S. Schneider, L. Macri, R. Steining, J. Mader, N. Martimbeau, and T. George, in ASP Conf. Ser. 329: Nearby Large-Scale Structures and the Zone of Avoidance, edited by A. P. Fairall and P. A. Woudt (2005), p. 135.
[155] W. J. Percival, MNRAS 356, 1168 (2005), astro-ph/0410631.
[156] P. Erdoğdu, O. Lahav, J. Huchra, et al., MNRAS 373, 45 (2006), astro-ph/0610005.
[157] Y. Hoffman, in ASP Conf. Ser. 67: Unveiling Large-Scale Structures Behind the Milky Way, edited by C. Balkowski and R. C. Kraan-Korteweg (1994), p. 185.
[158] S. Zaroubi, in ASP Conf. Ser. 218: Mapping the Hidden Universe: The Universe behind the Milky Way - The Universe in HI, edited by R. C. Kraan-Korteweg, P. A. Henning, and H. Andernach (2000), p. 173.
[159] R. C. Kraan-Korteweg and O. Lahav, A&A Rev. 10, 211 (2000), astro-ph/0005501.
[160] J. A. Peacock and S. J. Dodds, MNRAS 267, 1020 (1994), astro-ph/9311057.
[161] M. S. Vogeley and A. S. Szalay, Astrophys. J. 465, 34 (1996), astro-ph/9601185.
[162] S. Zaroubi, I. Zehavi, A. Dekel, Y. Hoffman, and T. Kolatt, Astrophys. J. 486, 21 (1997), astro-ph/9610226.
[163] M. Tegmark, Physical Review Letters 79, 3806 (1997), astro-ph/9706198.
[164] D. J. Eisenstein and W. Hu, Astrophys. J. 511, 5 (1999), astro-ph/9710252.
[165] G. Efstathiou, J. R. Bond, and S. D. M. White, MNRAS 258, 1P (1992).
[166] E. F. Bunn, D. Scott, and M. White, ApJL 441, L9 (1995), astro-ph/9409003.
[167] M. A. Janssen and S. Gulkis, in NATO ASIC Proc. 359: The Infrared and Submillimetre Sky after COBE, edited by M. Signore and C. Dupraz (Kluwer Academic Publishers, Dordrecht, 1992), pp. 391–408.
[168] E. F. Bunn, K. B. Fisher, Y. Hoffman, O. Lahav, J. Silk, and S.
Zaroubi, ApJL 432, L75 (1994), astro-ph/9404007.
[169] K. Maisinger, M. P. Hobson, and A. N. Lasenby, MNRAS 290, 313 (1997).
[170] M. Tegmark, Phys. Rev. D 56, 4514 (1997), astro-ph/9705188.
[171] M. Tegmark, ApJL 480, L87 (1997), astro-ph/9611130.
[172] S. Dodelson, Astrophys. J. 482, 577 (1997), astro-ph/9512021.
[173] M. P. Hobson, A. W. Jones, A. N. Lasenby, and F. R. Bouchet, MNRAS 300, 1 (1998), astro-ph/9806387.
[174] P. Natoli, G. de Gasperis, C. Gheller, and N. Vittorio, A&A 372, 346 (2001), astro-ph/0101252.
[175] O. Doré, R. Teyssier, F. R. Bouchet, D. Vibert, and S. Prunet, A&A 374, 358 (2001), astro-ph/0101112.
[176] R. Stompor, A. Balbi, J. D. Borrill, P. G. Ferreira, S. Hanany, A. H. Jaffe, A. T. Lee, S. Oh, B. Rabii, P. L. Richards, et al., Phys. Rev. D 65, 022003 (2001), astro-ph/0106451.
[177] B. D. Wandelt, D. L. Larson, and A. Lakshminarayanan, Phys. Rev. D 70, 083511 (2004), astro-ph/0310080.
[178] H. K. Eriksen, I. J. O’Dwyer, J. B. Jewell, B. D. Wandelt, D. L. Larson, K. M. Górski, S. Levin, A. J. Banday, and P. B. Lilje, ApJS 155, 227 (2004), astro-ph/0407028.
[179] J. Jewell, S. Levin, and C. H. Anderson, Astrophys. J. 609, 1 (2004), astro-ph/0209560.
[180] D. Yvon and F. Mayet, A&A 436, 729 (2005), astro-ph/0401505.
[181] E. Keihänen, H. Kurki-Suonio, and T. Poutanen, MNRAS 360, 390 (2005), astro-ph/0412517.
[182] E. C. Sutton and B. D. Wandelt, ApJS 162, 401 (2006).
[183] D. L. Larson, H. K. Eriksen, B. D. Wandelt, K. M. Górski, G. Huey, J. B. Jewell, and I. J. O’Dwyer, Astrophys. J. 656, 653 (2007), astro-ph/0608007.
[184] G. Hinshaw et al. (WMAP), arXiv:0803.0732 (2008).
[185] U. Seljak and M. Zaldarriaga, Astrophys. J. 469, 437 (1996), arXiv:astro-ph/9603033.
[186] A. Lewis, A. Challinor, and A. Lasenby, Astrophys. J. 538, 473 (2000), astro-ph/9911177.
[187] M.
Doran, Journal of Cosmology and Astro-Particle Physics 10, 11 (2005), arXiv:astro-ph/0302138.
[188] E. F. Bunn and N. Sugiyama, Astrophys. J. 446, 49 (1995), astro-ph/9407069.
[189] M. Tegmark, A. N. Taylor, and A. F. Heavens, Astrophys. J. 480, 22 (1997), astro-ph/9603021.
[190] M. Tegmark, Phys. Rev. D 55, 5895 (1997), astro-ph/9611174.
[191] M. R. Nolta et al. (WMAP), arXiv:0803.0593 (2008).
[192] A. H. Guth, Phys. Rev. D 23, 347 (1981).
[193] A. D. Linde, Physics Letters B 108, 389 (1982).
[194] A. Albrecht and P. J. Steinhardt, Physical Review Letters 48, 1220 (1982).
[195] A. H. Guth and S.-Y. Pi, Physical Review Letters 49, 1110 (1982).
[196] A. A. Starobinsky, Physics Letters B 117, 175 (1982).
[197] J. M. Bardeen, P. J. Steinhardt, and M. S. Turner, Phys. Rev. D 28, 679 (1983).
[198] W. Hu, Phys. Rev. D 64, 083005 (2001), astro-ph/0105117.
[199] F. Bernardeau and J.-P. Uzan, Phys. Rev. D 66, 103506 (2002), hep-ph/0207295.
[200] N. Bartolo, E. Komatsu, S. Matarrese, and A. Riotto, Phys. Rep. 402, 103 (2004), astro-ph/0406398.
[201] D. Babich, P. Creminelli, and M. Zaldarriaga, Journal of Cosmology and Astro-Particle Physics 8, 9 (2004), arXiv:astro-ph/0405356.
[202] E. Komatsu, B. D. Wandelt, D. N. Spergel, A. J. Banday, and K. M. Górski, Astrophys. J. 566, 19 (2002), arXiv:astro-ph/0107605.
[203] D. Babich and M. Zaldarriaga, Phys. Rev. D 70, 083005 (2004), arXiv:astro-ph/0408455.
[204] E. Komatsu, D. N. Spergel, and B. D. Wandelt, Astrophys. J. 634, 14 (2005), arXiv:astro-ph/0305189.
[205] A. P. S. Yadav, E. Komatsu, and B. D. Wandelt, Astrophys. J. 664, 680 (2007), arXiv:astro-ph/0701921.
[206] A. P. S. Yadav, E. Komatsu, B. D. Wandelt, M. Liguori, F. K. Hansen, and S. Matarrese, Astrophys. J. 678, 578 (2008), arXiv:0711.4933.
[207] E. Komatsu, A. Kogut, M. R. Nolta, C. L. Bennett, M. Halpern, G. Hinshaw, N. Jarosik, M. Limon, S. S. Meyer, L.
Page, et al., ApJS 148, 119 (2003), astro-ph/0302223.
[208] A. Curto, J. F. Macias-Perez, E. Martinez-Gonzalez, R. B. Barreiro, D. Santos, F. K. Hansen, M. Liguori, and S. Matarrese, ArXiv e-prints (2008), 0804.0136.
[209] A. P. S. Yadav and B. D. Wandelt, Physical Review Letters 100, 181301 (2008).
[210] E. Martinez-Gonzalez, ArXiv e-prints (2008), 0805.4157.
[211] J. Jasche, F. S. Kitaura, and T. A. Ensslin, ArXiv e-prints (2009), 0901.3043.
[212] P. Coles and B. Jones, MNRAS 248, 1 (1991).
[213] R. Vio, P. Andreani, and W. Wamsteker, PASP 113, 1009 (2001), arXiv:astro-ph/0105107.
[214] M. C. Neyrinck, I. Szapudi, and A. S. Szalay, ArXiv e-prints (2009), 0903.4693.
[215] R. K. Sachs and A. M. Wolfe, Astrophys. J. 147, 73 (1967).
[216] M. J. Rees and D. W. Sciama, Nature (London) 217, 511 (1968).
[217] J. R. Fergusson and E. P. S. Shellard, ArXiv e-prints (2008), 0812.3413.
[218] E. Komatsu and D. N. Spergel, Phys. Rev. D 63, 063002 (2001), arXiv:astro-ph/0005036.
[219] N. Kogo and E. Komatsu, Phys. Rev. D 73, 083007 (2006), arXiv:astro-ph/0602099.
[220] D. Babich, Phys. Rev. D 72, 043003 (2005), arXiv:astro-ph/0503375.
[221] A. F. Heavens, MNRAS 299, 805 (1998), arXiv:astro-ph/9804222.
[222] P. Creminelli, A. Nicolis, L. Senatore, M. Tegmark, and M. Zaldarriaga, Journal of Cosmology and Astro-Particle Physics 5, 4 (2006), arXiv:astro-ph/0509029.
[223] A. P. Yadav and B. D. Wandelt, Phys. Rev. D 71, 123004 (2005), arXiv:astro-ph/0505386.
