Information field theory for cosmological perturbation reconstruction and non-linear signal analysis
We develop information field theory (IFT) as a means of Bayesian inference on spatially distributed signals, the information fields. A didactical approach is attempted. Starting from general considerations on the nature of measurements, signals, nois…
Authors: Torsten A. Ensslin, Mona Frommert, Francisco S. Kitaura
Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1, 85741 Garching, Germany
(Dated: October 29, 2018)

We develop information field theory (IFT) as a means of Bayesian inference on spatially distributed signals, the information fields. A didactical approach is attempted. Starting from general considerations on the nature of measurements, signals, noise, and their relation to a physical reality, we derive the information Hamiltonian, the source field, propagator, and interaction terms. Free IFT reproduces the well known Wiener-filter theory. Interacting IFT can be diagrammatically expanded, for which we provide the Feynman rules in position-, Fourier-, and spherical harmonics space, and the Boltzmann-Shannon information measure. The theory should be applicable in many fields. However, here, two cosmological signal recovery problems are discussed in their IFT-formulation. 1) Reconstruction of the cosmic large-scale structure matter distribution from discrete galaxy counts in incomplete galaxy surveys within a simple model of galaxy formation. We show that a Gaussian signal, which should resemble the initial density perturbations of the Universe, observed with a strongly non-linear, incomplete and Poissonian-noise affected response, as the processes of structure and galaxy formation and observations provide, can be reconstructed thanks to the virtue of a response-renormalization flow equation. 2) We design a filter to detect local non-linearities in the cosmic microwave background, which are predicted from some Early-Universe inflationary scenarios, and expected due to measurement imperfections.
This filter is the optimal Bayes' estimator up to linear order in the non-linearity parameter and can be used even to construct sky maps of non-linearities in the data.

I. INTRODUCTION

A. Motivation

The optimal extraction and restoration of information from data on spatially distributed quantities like the cosmic large-scale structure (LSS) or the cosmic microwave background (CMB) temperature fluctuations in cosmology, but also on many other signals in physics and related fields, is essential for any quantitative, data-driven scientific inference. The problem of how to design such methods poses many technical and even conceptual difficulties, which have led to a large number of recipes and methodologies.

Here, we address such problems from a strictly information theoretical point of view. We show, as others have done before, that information theory for distributed quantities leads to a statistical field theory, which we name information field theory (IFT). In contrast to the previous works, which mostly treat such problems on a classical field level, as will be detailed later, here, we take full advantage of the existing field theoretical apparatus to treat interacting and non-classical fields. Thus, we show how to use diagrammatic perturbation theory and renormalization flows in order to construct optimal signal recovering algorithms and to calculate moments of their uncertainties. Non-classicality manifests itself as quantum and statistical fluctuations in quantum and statistical field theory (QFT & SFT), and very similarly as uncertainty in IFT.

The information theoretical perspective on signal inference problems has technical advantages, since it permits the design of information-yield optimized algorithms and experimental setups.
However, it also provides deeper insight into the mechanisms of knowledge accumulation, its underlying information flows, and its dependence on data models, prior knowledge and assumptions than pure empirical evaluations of ad-hoc algorithms alone could provide.

We therefore hope that our work is of interest for two types of readers. The first are applied scientists, who are mainly interested in the practical aspect of IFT since they are facing a concrete inverse problem for a spatially distributed quantity, especially but not exclusively in cosmology. The second are more philosophically or theoretically inclined scientists, for whom IFT may serve as a framework to understand and classify many of the existing methods of signal extraction and reception. Since we expect that many interested readers are not very familiar with field theoretical formalisms, we introduce some of their basic mathematical concepts. Due to this anticipated non-uniform readership, not everything in this article might be of everyone's interest, and therefore we provide in the following a short overview on the structure and content of the article.

B. Overview of the work

The remainder of this introduction section contains a detailed discussion of the previous work on signal inference theory as well as a very brief introduction into the works on the cosmic LSS and the CMB that are relevant here.

The main part of this article falls into two categories: abstract IFT and its application. The concepts of IFT are introduced in Sec. II, where Bayesian methodology, the distinction of physical and information fields, the definition of signal response and noise, as well as the design of signal spaces are discussed. The basic IFT formalism including the free theory is introduced in Sec.
III, which, according to our judgement, summarizes and unifies the previous knowledge on IFT before this paper. An impatient reader, only interested in applying IFT and not worrying about concepts, may start reading in Sec. III.

From Sec. IV on, the new results of this work are presented, starting with the discussion of interacting information fields, their Hamiltonians and Feynman rules, and the Boltzmann-Shannon information measure. The normalisability of sensibly constructed IFTs is shown, and the classical information field equation is presented there as well. A step-by-step recipe of how to derive and implement an IFT algorithm is also provided. Details of the notation can be found, if not defined in the main text, in Appendix A.

Applications of the theory are provided in the following two sections, which can be skipped by a reader interested only in the general theoretical framework. Although specific inference problems are addressed, they should serve as a blueprint for the tackling of similar problems. In Sec. V the problem of the reconstruction of the cosmic matter distribution from galaxy surveys is analyzed in terms of a Poissonian data model. In Sec. VI we derive an optimal estimator for non-Gaussianity in the CMB, and show how it can be generalized to map potential non-Gaussianities in the CMB sky. Our summary and outlook can be found in Sec. VII.

C. Previous works

The work presented here tries to unify information theory and statistical field theory in order to provide a conceptual framework in which optimal tools for cosmological signal analysis can be derived, as well as for inference problems in other disciplines.
Below, we provide very brief introductions into each of the required fields (footnote 1) (information theory, image reconstruction, statistical field theory, cosmological large-scale structure, and cosmic microwave background), for the orientation of non-expert readers. An expert in any of these fields might decide to skip the corresponding sections.

Footnote 1: This work has tremendously benefitted in a direct and indirect way from a large number of previous publications in those fields. We, the authors, have to apologize for being unable to give full credit to all relevant former works in those fields and for only concentrating on a brief summary of the papers more or less directly influencing this work. This collection is obviously highly biased towards the cosmological literature due to our main scientific interests and expertise, and definitely incomplete.

1. Information theory and Bayesian inference

The foundation of information theory was laid by the work of Bayes [1] on probability theory, in which the celebrated Bayes theorem was presented. The theorem itself (see Eq. 7) is a simple rule for conditional probabilities. It only unfolds its power for inference problems if used with belief or knowledge states, described by conditional probabilities.

The advent of modern information theory is probably best dated by the work of Shannon [2, 3] on the concept of information measure, being the negative Boltzmann-entropy, and the work of Jaynes, combining the language of statistical mechanics and Bayes probability theory and applying it to knowledge uncertainties [4, 5, 6, 7, 8, 9, 10].

The required numerical evaluation of Bayesian probability integrals suffered often from the curse of high dimensionality.
The standard recipe against this, still in massive use today, is importance sampling via Markov-Chain Monte-Carlo Methods (MCMC), following the ideas of Metropolis et al. [11], Hastings [12], and Geman and Geman [13], where the latter authors already had image reconstruction applications in mind. The Hamiltonian MCMC methods [14], in which the phase-space sampling is partly following Hamiltonian dynamics, are also of relevance here. There the Hamiltonian is introduced as the negative logarithm of the probability, as we do in this work. With such tools, higher dimensional problems, as present in signal restoration, could and can be tackled, however, for the price of getting stochastic uncertainty into the computational results. For a recent review on image restoration MCMC techniques, see [15].

The applications and extensions of these pioneering works are too numerous to be listed here. Good monographs exist and the necessary references can be found there [16, 17, 18, 19, 20, 21].

2. Image reconstruction in astronomy and elsewhere

The problem of image reconstruction from incomplete, noisy data is especially important in astronomy, where the experimental conditions are largely set by the nature of distant objects, weather conditions, etc., all mainly out of the control of the observer, as well as in other disciplines like medicine and geology, with similar limitations to arrange the object of observations for an optimal measurement. Some of the most prominent methods of image reconstruction, which are based on a Bayesian implementation of an assumed data model, are the Wiener-filter [22], the Richardson-Lucy algorithm [23, 24], and the maximum-entropy image restoration [25] (see also [26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]).
The Wiener filter can be regarded as a full Bayesian image inference method in case of Gaussian signal and noise statistics, as we will show in Sect. III B. It will be the workhorse of the IFT formalism, since the Wiener filter represents the algorithm to construct the exact field theoretical expectation value given the data for an interaction-free information Hamiltonian. The filter can be decomposed into two essential information processing steps, first building the information source by response-over-noise weighting the data, and then propagating this information through the signal space, by applying the so called Wiener variance.

The Richardson-Lucy algorithm is a maximum-likelihood method to reconstruct from Poissonian data and therefore is also of Bayesian origin. This method has usually to be regularized by hand, by truncation of the iterative calculations, against an over-fitting instability due to the missing (or implicitly flat) signal prior. A Gaussian-prior based regularization was recently proposed by Kitaura and Enßlin [38], and the implementation of a variant of this is presented here in Sect. V D.

Maximum entropy algorithms will not be the topic here, nor will a number of other existing methods, which are partly within and partly outside the Bayesian framework. They may be found in existing reviews on this topic [e.g. 39, 40].

3. Statistical and Bayesian field theory

The relation of signal reconstruction problems and field theory was discovered independently by several authors. In cosmology, a prominent work in this direction is Bertschinger [41], in which the path integral approach was proposed to sample primordial density perturbations with a Gaussian statistics under the constraint of existing information on the large scale structure.
The work presented here can be regarded as a non-linear, non-Gaussian extension of this. Many methods from statistics and from statistical mechanics were of course used even earlier, e.g. the usage of moment generating function for cosmic density fields can already be found in Fry [42].

Simultaneously to Bertschinger's work, Bialek and Zee [43, 44] argued that visual perception can be modeled as a field theory for the true image, being distorted by noise and other data transformations, which are summarized by a nuisance field. A probabilistic language was used, but no direct reference to information theory was made, since not the optimal information reconstruction was the aim, but a model for the human visual reception system. However, this work actually triggered our research.

Bialek et al. [45] applied a field theoretical approach to recover a probability distribution from data. Here, a Bayesian prior was used to regularize the solution, which was set up ad-hoc to enforce smoothness of the reconstruction, obtained from the classical (or saddlepoint, or maximum a posteriori) solution of the problem. However, an "optimal" value for the smoothness controlling parameter was derived from the data itself, a topic also addressed by Stoica et al. [46] and by a follow up publication to ours [47]. Bialek et al. [45] also recognized, as we do, that an IFT can easily be non-local.

Finally, the work of Lemm and coworkers [48, 49, 50, 51, 52, 53, 54, 55] established a tight connection between statistical field theory and Bayesian inference, and proposed the term Bayesian field theory (BFT) for this. However, we prefer the term information field theory since it puts the emphasis on the relevant object, the information, whereas BFT refers to a method, Bayesian inference.
The term information field is rather self-explaining, whereas the meaning of a Bayesian field is not that obvious.

The applications considered by Lemm concentrate on the reconstruction of probability fields over parameter spaces and quantum mechanical potentials by means of the maximum a posteriori equation. The extensive book summarizing the essential insights of these papers, [48], clearly states the possibility of perturbative expansions of the field theory. However, this is not followed up by these authors probably for reasons of the computational complexity of the required algorithms.

In contrast to many of the previous works on IFT, which deal with ad-hoc priors, the publication by Lemm [56] is remarkable, since it provides explicit recipes of how to implement a priori information in various circumstances more rigorously.

The mathematical tools required to tackle IFT problems come from SFT and QFT, which have a vast literature. We have especially made use of the books of Binney et al. [57], Peskin and Schroeder [58], and Zee [59].

4. Cosmological large-scale structure

Our first IFT example in Sec. V is geared towards improving galaxy-survey based cosmography, the reconstruction of the large-scale structure matter distribution. We provide here a short overview on the relevant background and works.

The LSS of the matter distribution of the Universe is traced by the spatial distribution of galaxies, and therefore well observable. This structure is believed to have emerged from tiny, mostly Gaussian initial density fluctuations of a relative strength of $10^{-5}$ via a self-gravitational instability, partly counteracted by the expansion of the Universe.
The initial density fluctuations are believed to be produced during an early inflationary epoch of the Universe, and to carry valuable information about the inflaton, the field which drove inflation, in their $N$-point correlation functions, to be extracted from the observational data.

The onset of the structure formation process is well described by linear perturbation theory and therefore conserves Gaussianity; however, the later evolution, the structures on smaller scales, and especially the galaxy formation require non-linear descriptions. The observational situation is complicated by the fact that the most important galaxy distance indicator, their redshift, is also sensitive to the galaxy peculiar velocity, which causes the observational data on the three-dimensional LSS to be partially degenerate. There are analytical methods to describe these effects (footnote 2), and also extensive work on $N$-body simulations of the structure formation, the latter probably providing us with the most detailed and accurate statistical data on the properties of the matter density field [e.g. 75].

In recent years, it was recognized that the evolution of the cosmic density field and its statistical properties can be addressed with field theoretical methods by virtue of renormalization flow equations. Detailed semi-analytical calculations for the density field time propagator, the two- and three-point correlation functions are now possible due to this, which are expected to play an important role in future approaches to reconstruct the initial fluctuations from the observational data [76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90].

It was recognized early on that the primordial density fluctuations can in principle be reconstructed from galaxy observations [41].
This has led to a large development of various numerical techniques for an optimal reconstruction [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131]. Many of them are based on a Bayesian approach, since they are implementations and extensions of the Wiener filter. However, also other principles are used, like, e.g. the least action approach, or Voronoi tessellation techniques [e.g. 132, 133, 134, 135, 136, 137, 138]. A discussion and classification of the various methods can be found in [38].

Especially the Wiener filter methods were extensively applied to galaxy survey data (footnote 3) and permitted partly to extrapolate the matter distribution into the zone of avoidance behind the galactic disk and to close the data-gap there, c.f. [157, 158, 159], a topic we also address in Sect. V.

Another cosmologically relevant information field to be extracted from galaxy catalogues is the LSS power spectrum [e.g. 160, 161, 162, 163, 164]. This power is also measurable in the CMB, and for a long time the CMB provided the best spectrum normalization [165, 166].

Footnote 2: Of special interest in this context may be [60], which already applies path-integrals, [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74], and the papers they refer to.

Footnote 3: Survey based reconstructions of the cosmic matter fields can be found in [139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156].

5. Cosmic Microwave Background

Since our second example deals with the CMB, we give a brief overview on it and on related inference methods.

The CMB reveals the statistical properties of the matter field at a time when the Universe was about 1100 times smaller in linear size than it is today. The photon-baryon fluid, which decouples at that epoch into neutral hydrogen and free streaming photons, has responded to the gravitational pull of the then already forming dark matter structures. The photons from that epoch cooled due to the cosmic expansion since then into the CMB radiation we observe today, and carry information on the physical properties of the photon-baryon fluid of that time like density, temperature and velocity. To very high accuracy, the spectrum of the photons from any direction is that of a blackbody, with a mean temperature of 2.7 Kelvin and fluctuations of the order of $10^{-5}$ Kelvin, imprinted by the primordial gravitational potentials at decoupling.

Therefore, mapping these temperature fluctuations permits one to study many cosmological parameters precisely and simultaneously, like the amount of dark matter producing the gravitational potentials, the ratio of photons to baryons, balancing the pressure and weight of the fluid, and geometrical and dynamical parameters of space-time itself.

The observations are technically challenging, and therefore require sophisticated algorithms to extract the tiny signal of temperature fluctuations against the instrument noise, but also to separate it from other astrophysical foreground emission with the best possible accuracy. A number of such algorithms were developed [e.g. 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184], which in many cases implement the Wiener filter. Thus, the required numerical tools for an IFT treatment of CMB data are essentially available.
The expected temperature fluctuation spectrum can be calculated from a linear perturbative treatment of the Boltzmann equations of all dynamically active particle species at this epoch, and fast computational implementations exist that permit predicting it for a given set of cosmological parameters. Well known codes for this task are publicly available (footnote 4) and permit the extraction of information on cosmological parameters from the measured CMB temperature fluctuation spectrum via comparison to their predictions for a given parameter set. It was recognized early on that this should happen in an information theoretically optimal way, and Bayesian methods were therefore adopted in that area well before other astrophysical disciplines [e.g. 188, 189, 190, 191].

Footnote 4: E.g. cmbfast (http://cmbfast.org, http://ascl.net/cmbfast.html, [185]), camb (http://camb.info/, [186]), and cmbeasy (http://www.cmbeasy.org/, [187]).

The initial metric and density fluctuations, from which the CMB fluctuations and the LSS emerged, are believed to be initially seeded by quantum fluctuations of a hypothetical inflaton field, which should have driven an inflationary expansion phase in the very early Universe [192, 193, 194, 195, 196, 197]. The inflaton-induced fluctuations have a very Gaussian probability distribution, however, some non-Gaussian features seem to be unavoidable in most scenarios and can serve as a fingerprint to discriminate among them [e.g. 198, 199, 200, 201]. Observational tests on such non-Gaussianities based on the three-point correlation function of the CMB data [e.g. 202, 203, 204, 205, 206] were so far mostly negative, however not sensitive enough to seriously constrain the possible theoretical parameter space of inflationary scenarios, see e.g. [207, 208].
Recently, there has been the claim of a detection of such non-Gaussianities by Yadav and Wandelt [209] and a confirmation of this with better data and improved algorithms is therefore highly desirable. In Sect. VI we make a proposal for improving the algorithmic side of this challenge. A recent review on the current status of CMB-Gaussianity can be found in [210].

II. CONCEPTS OF INFORMATION FIELD THEORY

A. Information on physical fields

In our attempts to infer the properties of our Universe from astronomical observations we are faced with the problem of how to interpret incomplete, imperfect and noisy data, draw our conclusions based on them and quantify the uncertainties of our results. This is true for using galaxy surveys to map the cosmic LSS, for the interpretation of the CMB, as well as for many experiments in physical laboratories and compilations of geological, economical, sociological, and biological data about our planet. Information theory, which is based on probability theory and the Bayesian interpretation of missing knowledge as probabilistic uncertainty, offers an ideal framework to handle such problems. It permits a probabilistic description of all relevant processes involved in the measurement, provided a model for the Universe or the system under consideration is adopted.

The states of such a model, denoted by the state variable $\psi$, are identified with the possible physical realities. They can have probabilities $P(\psi)$ assigned to them, the so-called prior information. This prior contains our knowledge about the Universe as we model it before any other data is taken. For a given cosmological model, the prior may be the probability distribution of the different initial conditions of the Universe, which determine the subsequent evolution completely.
Since our Universe is spatially extended, the state variable will in general contain one or several fields, which are functions over some coordinates $x$.

Also the measurement process is described by a data model which defines the so-called likelihood, the probability $P(d|\psi)$ to obtain a specific dataset $d$ given the physical condition $\psi$. In case the outcome $d$ of the measurement is deterministic, $P(d|\psi) = \delta(d - d[\psi])$, where $d[\psi]$ is the functional dependence of the data on the state. In any case, the probability distribution function of the data,

$$ P(d) = \int \mathcal{D}\psi \, P(d|\psi)\, P(\psi), \tag{1} $$

is given in terms of a phase-space or path integral over all possible realizations of $\psi$, to be defined more precisely later (Sect. II E 1).

A scientist is not actually interested in the total state of the Universe, but only in some specific aspects of it, which we call the signal $s = s[\psi]$. The signal is a very reduced description of the physical reality, and can be any function of its state $\psi$, freely chosen according to the needs and interests of the scientist or the ability and capacity of the measurement and computational devices used. Since the signal does not contain the full physical state, any physical degree of freedom which is not present in the signal but influences the data will be received as probabilistic uncertainty, or shortly noise. The probability distribution function of the signal, its prior

$$ P(s) = \int \mathcal{D}\psi \, \delta(s - s[\psi])\, P(\psi), \tag{2} $$

is related to that of the data via the joint probability

$$ P(d, s) = \int \mathcal{D}\psi \, \delta(s - s[\psi])\, P(d|\psi)\, P(\psi), \tag{3} $$

from which the conditional signal likelihood

$$ P(d|s) = P(d, s)/P(s) \tag{4} $$

and signal posterior

$$ P(s|d) = P(d, s)/P(d) \tag{5} $$

can be derived.
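
Equations (1) to (5) can be checked numerically on a toy model in which the state $\psi$ takes only a few discrete values, so that the path integrals reduce to finite sums. The following is a minimal sketch under that assumption. The four state values, the prior, the signal map $s[\psi] = \psi^2$, and the likelihood table are all invented for illustration and are not taken from this work.

```python
import numpy as np

# Hypothetical toy model: four discrete physical states psi with prior P(psi).
psi = np.array([-2, -1, 1, 2])
prior_psi = np.array([0.1, 0.4, 0.4, 0.1])

# Illustrative signal definition s[psi] = psi**2, a reduced description of psi.
signal_of_psi = psi**2
signals = np.unique(signal_of_psi)            # possible signal values: 1 and 4

# Invented data model P(d | psi): rows index psi, columns index d in {0, 1}.
like_d_given_psi = np.array([[0.9, 0.1],
                             [0.3, 0.7],
                             [0.6, 0.4],
                             [0.2, 0.8]])

# Eq. (1): P(d) = sum_psi P(d | psi) P(psi)
p_d = prior_psi @ like_d_given_psi

# Discrete stand-in for delta(s - s[psi]): indicator matrix (n_signals, n_states).
delta = (signals[:, None] == signal_of_psi[None, :]).astype(float)

# Eq. (2): signal prior P(s); Eq. (3): joint P(d, s), indexed as [s, d].
p_s = delta @ prior_psi
p_ds = delta @ (prior_psi[:, None] * like_d_given_psi)

# Eq. (4): signal likelihood P(d | s); Eq. (5): signal posterior P(s | d).
p_d_given_s = p_ds / p_s[:, None]
p_s_given_d = p_ds / p_d[None, :]

print("P(d)   =", p_d)
print("P(s)   =", p_s)
print("P(s|d) =\n", p_s_given_d)
```

In this sketch the state $\psi$ enters only through the sums that build `p_s` and `p_ds`; once the joint table is known, the likelihood and posterior of the signal follow from it alone.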
Before the data is available, the phase-space of interest is spanned by the direct product of all possible signals $s$ and data $d$, and all regions with non-zero $P(d, s)$ are of potential relevance. Once the actual data $d_\mathrm{obs}$ have been taken, only a sub-manifold of this space, as fixed by the data, is of further relevance. The probability function over this sub-space is proportional to $P(d = d_\mathrm{obs}, s)$, and needs just to be renormalized by dividing by

$$ \int \mathcal{D}s \, P(d_\mathrm{obs}, s) = \int \mathcal{D}s \int \mathcal{D}\psi \, \delta(s - s[\psi])\, P(d_\mathrm{obs}|\psi)\, P(\psi) = \int \mathcal{D}\psi \, P(d_\mathrm{obs}|\psi)\, P(\psi) = P(d_\mathrm{obs}), \tag{6} $$

which is the unconditioned probability (or evidence) of that data. Thus, we find the resulting information of the data to be the posterior distribution $P(s|d_\mathrm{obs}) = P(d_\mathrm{obs}, s)/P(d_\mathrm{obs})$. This posterior is the fundamental mathematical object from which all our deductions have to be made. It is related via Bayes's theorem [1] to the usually better accessible signal likelihood,

$$ P(s|d) = P(d|s)\, P(s)/P(d), \tag{7} $$

which follows from Eqs. 4 and 5.

The normalization term in Bayes's theorem, the evidence $P(d)$, is now also fully expressed in terms of the joint probability of data and signal,

$$ P(d) = \int \mathcal{D}s \, P(d, s), \tag{8} $$

and the underlying physical field $\psi$ basically becomes invisible at this stage in the formalism. The evidence plays a central role in Bayes inference, since it is the likelihood of all the assumed model parameters. Combining this parameter-likelihood with parameter-priors one can start Bayesian inference on the model classes.

B. Signal response and noise

If signal and data depend on the same underlying physical properties, there may be correlations between the two, which can be expressed in terms of signal response $R$ and noise $n$ of the data as

$$ d = R[s] + n_s . \tag{9} $$

We have chosen two different ways of denoting the dependence of response and noise on the signal $s$, in order to highlight that the response should embrace most of the reaction of the data to the signal, whereas the noise should be as independent as possible. We ensure this by putting the linear correlation of the data with the signal fully into the response. The response is therefore the part of the data which correlates with the signal

$$ R[s] \equiv \langle d \rangle_{(d|s)} \equiv \int \mathcal{D}d \; d \, P(d|s), \tag{10} $$

and the noise is just defined as the remaining part which does not:

$$ n_s \equiv d - R[s] = d - \langle d \rangle_{(d|s)} . \tag{11} $$

Although the noise might depend on the signal, as it is well known for example for Poissonian processes, it is, per definition, linearly uncorrelated to it,

$$ \langle n_s \, s^\dagger \rangle_{(d|s)} = \left( \langle d \rangle_{(d|s)} - R[s] \right) s^\dagger = 0 \, s^\dagger = 0, \tag{12} $$

whereas higher order correlation might well exist and may be further exploited for their information content. The dagger denotes complex conjugation and transposing of a vector or matrix.

These definitions were chosen to be close to the usual language in signal processing and data analysis. They permit to define signal response and noise for an arbitrary choice of the signal $s[\psi]$. No direct causal connection between signal and data is needed in order to have a non-trivial response, since both variables just need to exhibit some couplings to a common sub-aspect of $\psi$. The above definition of response and noise is however not unique, even for a fixed signal definition, since any data transformation $d' = T[d]$ can lead to different definitions, as seen from

$$ R'[s] \equiv \langle d' \rangle_{(d|s)} = \langle T[d] \rangle_{(d|s)} \neq T[\langle d \rangle_{(d|s)}] = T[R[s]] . \tag{13} $$

Exceptions are some unique relations between signal and state, $P(\psi|s) = \delta(\psi - \psi[s])$, and maybe a few other very special cases.
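
The definitions in Eqs. (10), (11) and (13) can be illustrated with a short Monte Carlo experiment. The sketch below assumes, purely for illustration, a scalar signal $s$ and a Poissonian data model $d \sim \mathrm{Poisson}(s)$; the function name `response_and_noise`, the sample size, and the choice $T[d] = d^2$ are hypothetical and not part of this work. For this model the response is $R[s] = s$, the noise has zero mean but a variance that grows with $s$, and the response of the squared data is $s + s^2$ rather than $T[R[s]] = s^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

def response_and_noise(s, transform=lambda d: d, n_samples=200_000):
    """Monte Carlo estimate of R[s] = <T[d]>_(d|s) and of the noise T[d] - R[s].

    Assumed data model (illustrative only): d ~ Poisson(s) for a scalar signal s.
    """
    d = rng.poisson(lam=s, size=n_samples)
    t = transform(d).astype(float)
    response = t.mean()            # Eq. (10), estimated by a sample average
    noise = t - response           # Eq. (11)
    return response, noise

s = 3.0

# Response and noise of the untransformed data.
R, n = response_and_noise(s)

# The noise has zero mean for every s, yet its variance depends on the signal.
noise_var = {sig: response_and_noise(sig)[1].var() for sig in (1.0, 3.0, 10.0)}

# Response of the transformed data d' = d**2, to be compared with T[R[s]] = R**2.
R_sq, _ = response_and_noise(s, transform=lambda d: d**2)

print("R[s]            ~", R)
print("noise variances ~", noise_var)
print("<d^2>           ~", R_sq, " versus  R[s]^2 ~", R**2)
```

The gap between `R_sq` and `R**2` is the Poisson variance, which is one concrete way in which a non-linear data transformation changes what counts as response and what counts as noise.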
Thus, the concepts of signal response and therewith defined noise depend on the adopted coordinate system in the data space. This coordinate system can be changed via a data transformation $T$, and the transformed data may exhibit better or worse response to the signal. Information theory aids in designing a suitable data transformation, so that the signal response is maximal and the signal noise is minimal, permitting the signal to be best recovered. Thus, we may aim for an optimal $T$, which yields

$$ T[d] = \langle s \rangle_{(s|d)}. \tag{14} $$

We define the posterior average of the signal, $m_d = \langle s \rangle_{(s|d)}$, to be the map of the signal given the data $d$, and call $T$ a map-making algorithm if it fulfills Eq. 14 at least approximately. As a criterion for this one may require that the signal response of a map-making algorithm,

$$ R_T[s] \equiv \langle T[d] \rangle_{(d|s)}, \tag{15} $$

is positive definite with respect to signal variations, as stated by

$$ \frac{\delta R_T[s]}{\delta s} \geq 0. \tag{16} $$

This ensures that a map-making algorithm will respond with a non-negative correlation of the map to any signal feature, with respect to the noise ensemble. In general, $T$ will be a non-linear operation on the data, to be constructed from information theory if it should be optimal in the sense of Eq. 14. In any case, the fidelity of a signal reconstruction can be characterized by the quadratic signal uncertainty,

$$ \sigma^2_{T,d} = \langle (s - T[d])\, (s - T[d])^\dagger \rangle_{(s|d)}, \tag{17} $$

averaged over typical realizations of signal and noise. Of special interest is the trace of this,

$$ \mathrm{Tr}(\sigma^2_{T,d}) = \int dx\, \langle |s_x - T_x[d]|^2 \rangle_{(s|d)}, \tag{18} $$

since it is the expectation value of the squared Lebesgue-$L^2$-space distance between a signal reconstruction and the underlying signal. Requesting a map-making algorithm to be optimal with respect to Eq. 18 implies $T[d] = \langle s \rangle_{(s|d)}$, and therefore that it is optimal in an information theoretical sense according to Eq. 14.

The uncertainty $\sigma^2_{T,d}$ depends on $d$, since in Bayesian inference one averages over the posterior, which is conditional on the data. The frequentist uncertainty estimate, which is the expected uncertainty of any estimator before the data is obtained, is given by an average over the joint probability function:

$$ \sigma^2_T = \langle (s - T[d])\, (s - T[d])^\dagger \rangle_{(d,s)}. \tag{19} $$

The latter is a good quantity to characterize the overall performance of an estimator, whereas $\mathrm{Tr}(\sigma^2_{T,d})$ is a more precise indicator of the actual estimator performance for a given dataset. As we will see in our IFT applications, data dependence of the uncertainty is a common feature of non-linear inference problems.

An illustrative example should be in order. Suppose our data is an exact copy of a physical field, $d = \psi$, our signal the square of the latter, $s = \psi^2$, and the physical field obeys an even statistics, $P(\psi) = P(-\psi)$. Then, the signal response is exactly zero, $R[s] = 0$, and the data contains only noise with respect to the chosen signal, $d = n_s$. Thus, we have chosen a bad representation of our data to reveal the signal. If we, however, introduce the transformation $d' = T[d] = d^2$, we find a perfect response, $R'[s] = s$, and zero noise, $n'_s = 0$. In this case, finding the optimal map-making algorithm was trivial, but in more complicated situations it cannot be guessed that easily.

Since the response and noise definitions depend on the signal definition, some thought should be given to how to choose the signal in a way that it can be well reconstructed.

C. Signal design

For practical reasons one will usually choose $s$ according to a few guidelines, which should simplify the information induction process:

1. The functional form of $s[\psi]$ should best be simple, steady, analytic, and if possible linear in $\psi$, permitting one to use the signal $s$ to reason about the state of reality $\psi$.

2. The degrees of freedom of $s$ should be related to the ones of the data $d$ in the sense that cross-correlations exist which permit deducing properties of $s$ from $d$. Signal degrees of freedom which are insensitive to the data will only be constrained by the prior and therefore just contain a large amount of uncertainty. This adds to the error budget, and should be avoided as far as possible.

3. The choice of $s[\psi]$ should also be led by mathematical convenience and practicality. In the examples presented in this work, simple signals are chosen which permit guessing good approximations for the signal likelihood $P(d|s)$ and prior $P(s)$ without the need to develop the full physical theory starting with $P(\psi)$.

To give a more specific example, we assume a cosmological model in which the reality is thought to be solely characterized by the primordial dark matter density distribution $\psi(x)$, from which all observable cosmological phenomena like galaxies derive in a deterministic way. The coordinate $x$ may refer to the comoving coordinates at some early epoch of the Universe. Although the LSS of the matter distribution at a later time may predominantly depend on the initial large-scale modes, and is reflected in the galaxy distribution, the actual positions of the individual galaxies also depend in a non-trivial way on the small-scale modes. Due to the discreteness of our observable, the galaxy positions, it may be impossible to reconstruct these small-scale modes. Therefore it could be sensible to define a signal $s[\psi] = F \psi$, with $F$ being a linear low-pass filter, which suppresses all small-scale structures.
This signal may be reconstructible with high precision, whereas any attempt to reconstruct $\psi$ directly would be plagued by a larger error budget, since all the data-unconstrained small-scale modes represent uncertainties to a reconstruction of $\psi$, but not to one of $s$ being defined as a low-pass filtered version of $\psi$.

D. Signal moment calculation

The information of some data $d$ on a signal $s$ defined over some set $\Omega$, which in most applications will be a manifold like a sub-volume of $\mathbb{R}^n$, or the sphere in case of a CMB signal, is completely contained in the posterior $P(s|d)$ of the signal given the data (footnote 5). The expectation value of $s$ at some location $x \in \Omega$, and higher correlation functions of $s$, can all be obtained from the posterior by taking the appropriate average:

$$ \langle s(x_1) \cdots s(x_n) \rangle_d \equiv \langle s(x_1) \cdots s(x_n) \rangle_{(s|d)} \equiv \int \mathcal{D}s\; s(x_1) \cdots s(x_n)\, P(s|d). \tag{20} $$

The problem is that often neither the expectation values nor even the posterior are easily calculated analytically, even for fairly simple data models. Fortunately, there is at least one class of data models for which the posterior and all its moments can be calculated exactly, namely in case the posterior turns out to be a multivariate Gaussian in $s$. In this case analytical formulae for all moments of the signal are known and are in principle computable. Technically, one is still often facing a huge, but linear inverse problem. However, in the last decades a couple of computational high-performance map-making techniques were developed to tackle such problems either on the sphere, for CMB research, or in flat spaces with one, two or three dimensions, for example for the reconstruction of the cosmic LSS (detailed references are given in Sect. I C).
The purpose of this work is to show how to expand other posterior distributions around the Gaussian ones in a perturbative manner, which then permits using the existing map-making codes for the computation of the resulting diagrammatic perturbation series. Since the diagrammatic perturbation series in Feynman diagrams are well known and understood in QFT and SFT, the most economical way is to reformulate the information theoretical problem in a language which is as close as possible to the former two theories. Thereby, many of the results and concepts become directly available for signal inference problems. Moreover, it seems that expressing the optimal signal estimator in terms of Feynman diagrams immediately provides computationally efficient algorithms, since the diagrams encode the skeleton of the minimal necessary computational information flow.

Footnote 5: We are mostly dealing with scalar fields; however, multi-component, vector or tensor fields can be treated analogously, and many of the equations just have to be re-interpreted for such fields and stay valid.

E. Signal and data spaces

1. Discretisation and continuous limit

Both the signal and the data space may be continuous; in practice, however, they will most often be discrete, since digital data processing only permits choosing a discretized representation of the distributed information. The space in which the data and signal discretisation happens can be chosen freely, and can of course just as well be a Fourier, wavelet or spherical harmonics space. Even if we would like to analyze a continuous signal, the computationally required discretisation will force an implicit redefinition of our actual signal to be the discretely sampled version of that continuous signal, and this discretisation step should also be part of the data model if it has the potential to significantly affect the analysis [e.g. see 211].

Although discretisation implies some information loss, it also has an advantage. We can just assume discretisation and therefore read all scalar and tensor products as being the usual, component-wise ones, now just in high-, but finite-dimensional vector spaces.

To be concrete, let $\{x_i\} \subset \Omega$ be a discrete set of $N_\mathrm{pix}$ pixel positions, each of which has a volume-size $V_i$ attributed to it. Then the scalar product of two discretized function-vectors $f = (f_i)$ and $g = (g_i)$, sampled at these points via $f_i = f(x_i)$ and $g_i = g(x_i)$, could be defined by

$$ g^\dagger f \equiv \sum_{i=1}^{N_\mathrm{pix}} V_i\, g_i^*\, f_i. \tag{21} $$

The asterisk denotes complex conjugation. This scalar product has the continuous limit

$$ g^\dagger f \longrightarrow \int dx\, g^*(x)\, f(x). \tag{22} $$

In many cases the actual volume normalization in Eq. 21 does not matter for final results, since it usually cancels out, and therefore $V_i$ is often dropped completely for equidistant sampling of signal and data spaces. The volume terms also disappear for a scalar product involving a function which is discretized via volume integration, $f_i = \int_{V_i} dx\, f(x)$, e.g. the number of counts within the cell $i$. Higher order tensor products are defined analogously.

The path integral of a functional $F[f] \equiv F(f_1, \ldots, f_{N_\mathrm{pix}})$ over all realizations of such a discretized field $f$ is then just a high-dimensional volume integral, with as many dimensions as pixels:

$$ \int \mathcal{D}f\, F[f] \equiv \left( \prod_{i=1}^{N_\mathrm{pix}} \int df_i \right) F(f_1, \ldots, f_{N_\mathrm{pix}}). \tag{23} $$

This definition of a finite-dimensional path integral is well normalized: if we integrate over a probability distribution over $f$ which is separable for all pixels, $P(f) = \prod_{i=1}^{N_\mathrm{pix}} P_i(f_i)$, as e.g. for white and Poissonian noise, we find

$$ \langle 1 \rangle_{(f)} = \int \mathcal{D}f\, P(f) = \prod_{i=1}^{N_\mathrm{pix}} \underbrace{\int df_i\, P_i(f_i)}_{=1} = 1. \tag{24} $$

Although, in real data-analysis applications, it is practically never required to perform the continuous limit $N_\mathrm{pix} \to \infty$ with $V_i \to 0$ for all $i$, we stress that this limit can formally be taken and is well defined even for the path integral, as we argue in more detail in Sec. IV B. The basic argument is that suitable signals could and should be defined in such a way that path-integral divergences, which sometimes plague QFT, can easily be avoided by sensible signal design. Practically, the existence of a well-defined continuous limit of a well-posed IFT implies that two numerical implementations of a signal reconstruction problem, which differ in their space discretisation on scales smaller than the structures of the signal, can be expected to provide identical results up to a small discretisation difference, which vanishes with higher discretisation-resolution.

2. Parameter spaces

In many applications, the signal space is identified with the physical space or with the sphere of the sky. However, IFT can also be done over parameter spaces. In Sec. VI, a field theory over the sphere will implicitly define the knowledge state for an unknown parameter of that theory, which can be regarded again as defining an information theory for that parameter. The latter is an IFT in case the parameter has spatial variations.

However, there are also functions defined over a parameter space, $\Omega_\mathrm{parameter} = \{p\}$ for some parameter $p$, which one might want to obtain knowledge on from incomplete data. A very important one is the probability distribution of the parameter given the observational data, $P(p|d)$, which defines our parameter-knowledge state. This function may only be incompletely known and therefore require an IFT approach for its reconstruction and interpolation. Such incomplete knowledge on the function could be due to incomplete numerical sampling of its function values because of large computational costs and the huge volumes of multi-dimensional parameter spaces. Or, there might be another unknown nuisance parameter $q$ in the problem, which induces an uncertainty in $P(p|d)$ and therefore an IFT over all possible realizations of this knowledge state field function via

$$ P[P(p|d)] = \int \mathcal{D}P(p|d)\; \delta\!\left( P(p|d) - \int dq\, P(p, q|d) \right). \tag{25} $$

In case $q$ is a field, the marginalisation integral in the delta functional also becomes a path integral. Probabilistic decision theory, based on knowledge states as expressed by probability functions on parameters, has to deal with such complications. For inference directly on $p$, and not on the knowledge state $P(p|d)$, the marginalized probability

$$ P(p|d) = \int dq\, P(p, q|d) \tag{26} $$

contains all relevant information, and that will be sufficient for most inference applications, and especially for the ones in this work.

III. BASIC FORMALISM

A. Information Hamiltonian

We argued that the posterior $P(s|d)$ contains all available information on the signal. Although the posterior might not be easily accessible mathematically, we assume in the following that the prior $P(s)$ of the signal before the data is taken, as well as the likelihood of the data given a signal, $P(d|s)$, are known or at least can be Taylor-Fréchet-expanded around some reference field configuration $t$. Then Bayes's theorem permits expressing the posterior as

$$ P(s|d) = \frac{P(d, s)}{P(d)} = \frac{P(d|s)\, P(s)}{P(d)} \equiv \frac{1}{Z}\, e^{-H[s]}. \tag{27} $$

Here, the Hamiltonian

$$ H[s] \equiv H_d[s] \equiv -\log[P(d, s)] = -\log[P(d|s)\, P(s)], \tag{28} $$

the evidence of the data

$$ P(d) \equiv \int \mathcal{D}s\, P(d|s)\, P(s) = \int \mathcal{D}s\, e^{-H[s]} \equiv Z, \tag{29} $$

and the partition function $Z \equiv Z_d$ were introduced. It is extremely convenient to include a moment generating function into the definition of the partition function,

$$ Z_d[J] \equiv \int \mathcal{D}s\, e^{-H[s] + J^\dagger s}. \tag{30} $$

This means $P(d) = Z = Z[0]$, but also permits calculating any moment of the signal field via Fréchet-differentiation of Eq. 30,

$$ \langle s(x_1) \cdots s(x_n) \rangle_d = \frac{1}{Z} \left. \frac{\delta^n Z_d[J]}{\delta J(x_1) \cdots \delta J(x_n)} \right|_{J=0}. \tag{31} $$

Of special importance are the so-called connected correlation functions or cumulants,

$$ \langle s(x_1) \cdots s(x_n) \rangle^\mathrm{c}_d \equiv \left. \frac{\delta^n \log Z_d[J]}{\delta J(x_1) \cdots \delta J(x_n)} \right|_{J=0}, \tag{32} $$

which are corrected for the contribution of lower moments to a correlator of order $n$. For example, the connected mean and dispersion are expressed in terms of their unconnected counterparts as

$$ \begin{aligned} \langle s(x) \rangle^\mathrm{c}_d &= \langle s(x) \rangle_d, \\ \langle s(x)\, s(y) \rangle^\mathrm{c}_d &= \langle s(x)\, s(y) \rangle_d - \langle s(x) \rangle_d\, \langle s(y) \rangle_d, \end{aligned} \tag{33} $$

where the last term represents such a correction. For Gaussian random fields all higher order connected correlators vanish:

$$ \langle s(x_1) \cdots s(x_n) \rangle^\mathrm{c}_d = 0 \tag{34} $$

for $n > 2$. For non-Gaussian random fields, they are in general non-zero, and for later usage we provide the connected three- and four-point functions,

$$ \begin{aligned} \langle s_x s_y s_z \rangle^\mathrm{c}_d &= \langle (s_x - \bar{s}_x)(s_y - \bar{s}_y)(s_z - \bar{s}_z) \rangle_d, \\ \langle s_x s_y s_z s_u \rangle^\mathrm{c}_d &= \langle (s_x - \bar{s}_x)(s_y - \bar{s}_y)(s_z - \bar{s}_z)(s_u - \bar{s}_u) \rangle_d \\ &\quad - \langle s_x s_y \rangle^\mathrm{c}_d \langle s_z s_u \rangle^\mathrm{c}_d - \langle s_x s_z \rangle^\mathrm{c}_d \langle s_y s_u \rangle^\mathrm{c}_d - \langle s_x s_u \rangle^\mathrm{c}_d \langle s_y s_z \rangle^\mathrm{c}_d, \end{aligned} \tag{35} $$

where we used $s_x = s(x)$ and defined $\bar{s}_x = \langle s(x) \rangle_d$.
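As a hedged illustration of Eqs. 33 to 35, the following sketch evaluates the connected correlators for a single pixel (all coordinates set equal), where the four-point expression of Eq. 35 reduces to the fourth central moment minus three times the squared variance. The function name `connected_correlators` and the two test distributions (a Gaussian with mean 0.7 and variance 2, and a Poisson distribution with mean 3) are assumptions chosen only for this example.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def connected_correlators(values, weights):
    """Single-pixel versions of Eqs. 33 and 35 for a weighted point set."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize to a probability
    x = np.asarray(values, dtype=float)
    mean = np.sum(w * x)
    delta = x - mean
    mu2, mu3, mu4 = (np.sum(w * delta**n) for n in (2, 3, 4))
    # Eq. 35 with x = y = z = u: three identical pairings are subtracted
    return mean, mu2, mu3, mu4 - 3.0 * mu2**2


# Gaussian case: Gauss-Hermite nodes integrate polynomial moments exactly
nodes, wts = hermegauss(20)
gauss_c = connected_correlators(0.7 + math.sqrt(2.0) * nodes, wts)

# Non-Gaussian case: Poisson distribution with mean 3, truncated at 150 counts
counts = np.arange(150)
log_pmf = counts * math.log(3.0) - 3.0 - np.array(
    [math.lgamma(c + 1.0) for c in counts]
)
poisson_c = connected_correlators(counts, np.exp(log_pmf))

print("Gaussian:", gauss_c)
print("Poisson: ", poisson_c)
```

For the Gaussian the third and fourth connected correlators vanish up to rounding, in line with Eq. 34, whereas for the Poisson distribution all four cumulants equal the mean.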
The assumption that the Hamiltonian can be Taylor-Fréchet expanded in the signal field permits writing

$$ H[s] = \frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H_0 + \sum_{n=3}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n}. \tag{36} $$

Repeated coordinates are thought to be integrated over. The first three Taylor coefficients have special roles. The constant $H_0$ is fixed by the normalization condition of the joint probability density of signal and data. If $H'_d[s]$ denotes some unnormalised Hamiltonian, its normalization constant is given by

$$ H_0 = \log \int \mathcal{D}s \int \mathcal{D}d\; e^{-H'_d[s]}. \tag{37} $$

Often $H_0$ is irrelevant unless different models or hyperparameters are to be compared.

We call the linear coefficient $j$ the information source. This term is usually directly and linearly related to the data. The quadratic coefficient, $D^{-1}$, defines the information propagator $D(x, y)$, which propagates information on the signal at $y$ to location $x$, and thereby permits, e.g., partially reconstructing the signal at locations where no data was taken. Finally, the anharmonic tensors $\Lambda^{(n)}$ create interactions between the modes of the free, harmonic theory. Since this free theory will be the basis for the full interaction theory, we first investigate the case $\Lambda^{(n)} = 0$.

B. Free theory

1. Gaussian data model

For our simplest data model we assume a Gaussian signal with prior

$$ P(s) = G(s, S) \equiv \frac{1}{|2\pi S|^{1/2}} \exp\!\left( -\frac{1}{2}\, s^\dagger S^{-1} s \right), \tag{38} $$

where $S = \langle s\, s^\dagger \rangle$ is the signal covariance. The signal is assumed here to be processed by nature and our measurement device according to a linear data model

$$ d = R\, s + n. \tag{39} $$

Here, the response $R[s] = R\, s$ is linear in, and the noise $n_s = n$ is independent of, the signal $s$.
The linear response matrix $R$ of our instrument can contain window and selection functions, blurring effects, and even a Fourier transformation of the signal space, if our instrument is an interferometer. Typically, the data space is discrete, whereas the signal space may be continuous. In that case the $i$-th data point is given by

$$ d_i = \int dx\, R_i(x)\, s(x) + n_i. \tag{40} $$

We assume, for the moment, but not in general, the noise to be signal-independent and Gaussian, and therefore distributed as

$$ P(n|s) = G(n, N), \tag{41} $$

where $N = \langle n\, n^\dagger \rangle$ is the noise covariance matrix. Since the noise is just the difference of the data to the signal response, $n = d - R\, s$, the likelihood of the data is given by

$$ P(d|s) = P(n = d - R\, s\,|\,s) = G(d - R\, s, N), \tag{42} $$

and thus the Hamiltonian of the Gaussian theory is

$$ \begin{aligned} H_G[s] &= -\log[P(d|s)\, P(s)] = -\log[G(d - R\, s, N)\, G(s, S)] \\ &= \frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H^G_0. \end{aligned} \tag{43} $$

Here

$$ D = \left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1} \tag{44} $$

is the propagator of the free theory. The information source,

$$ j = R^\dagger N^{-1} d, \tag{45} $$

depends linearly on the data in a response-over-noise weighted fashion and reads

$$ j(x) = \sum_{ij} R^*_i(x)\, N^{-1}_{ij}\, d_j \tag{46} $$

in case of discrete data but continuous signal spaces. Finally,

$$ H^G_0 = \frac{1}{2}\, d^\dagger N^{-1} d + \frac{1}{2} \log\!\left( |2\pi S|\, |2\pi N| \right) \tag{47} $$

has absorbed all $s$-independent normalization constants.

The partition function of the free field theory,

$$ \begin{aligned} Z_G[J] &= \int \mathcal{D}s\; e^{-H_G[s] + J^\dagger s} \\ &= \int \mathcal{D}s\; \exp\!\left\{ -\frac{1}{2}\, s^\dagger D^{-1} s + (J + j)^\dagger s - H^G_0 \right\}, \end{aligned} \tag{48} $$

is a Gaussian path integral, which can be calculated exactly, yielding

$$ Z_G[J] = \sqrt{|2\pi D|}\; \exp\!\left\{ +\frac{1}{2}\, (J + j)^\dagger D\, (J + j) - H^G_0 \right\}. \tag{49} $$

The explicit partition function permits calculating via Eq. 32 the expectation of the signal given the data, in the following called the map $m_d$ generated by the data $d$:

$$ m_d = \langle s \rangle_d = \left. \frac{\delta \log Z_G}{\delta J} \right|_{J=0} = D\, j = \underbrace{\left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1} R^\dagger N^{-1}}_{F_\mathrm{WF}}\, d. \tag{50} $$

The last expression shows that the map is given by the data after applying a generalized Wiener filter, $m_d = F_\mathrm{WF}\, d$. The propagator $D(x, y)$ describes how the information on the density field contained in the data at location $x$ propagates to position $y$: $m(y) = \int dx\, D(y, x)\, j(x)$.

The connected autocorrelation of the signal given the data,

$$ \langle s\, s^\dagger \rangle^\mathrm{c}_d = D = \left[ S^{-1} + R^\dagger N^{-1} R \right]^{-1}, \tag{51} $$

is the propagator itself. All higher connected correlation functions are zero. Therefore, the signal given the data is a Gaussian random field around the mean $m_d$ and with a variance of the residual error

$$ r = s - m_d \tag{52} $$

provided by the propagator itself, as a straightforward calculation shows:

$$ \langle r\, r^\dagger \rangle_d = \langle s\, s^\dagger \rangle_d - \langle s \rangle_d \langle s^\dagger \rangle_d = \langle s\, s^\dagger \rangle^\mathrm{c}_d = D. \tag{53} $$

Thus, the posterior should be simply a Gaussian given by

$$ P(s|d) = G(s - m_d, D). \tag{54} $$

As a test for the latter equation, we calculate the evidence of the free theory via

$$ \begin{aligned} P(d) &= \frac{P(d|s)\, P(s)}{P(s|d)} = \frac{G(d - R\, s, N)\, G(s, S)}{G(s - D j, D)} \\ &= \left( \frac{|D|}{|S|\, |2\pi N|} \right)^{1/2} \exp\!\left\{ \frac{1}{2} \left( j^\dagger D\, j - d^\dagger N^{-1} d \right) \right\}, \end{aligned} \tag{55} $$

which is indeed independent of $s$ and also identical to $Z_G[0]$, as it should be.

2. Free classical theory

The Hamiltonian permits asking for classical equations derived from an extremal principle. This is justified, on the one hand, as being just the result of a saddle-point approximation of the exponential in the partition function. On the other hand, the extremal principle is equivalent to the maximum a posteriori (MAP) estimator, which is quite commonly used for the construction of signal filters.
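The free-theory results of Eqs. 44, 45, 50 and 55 can be checked numerically on a small pixelized problem. The sketch below is only an illustration under assumed inputs: a 16-pixel signal with an exponential prior covariance, a response that observes every second pixel, and white noise. None of these choices come from the text. It compares the map $m = D j$ with the equivalent data-space form of the Wiener filter, $S R^\dagger (R S R^\dagger + N)^{-1} d$, and the evidence of Eq. 55 with a direct evaluation of the Gaussian $G(d, R S R^\dagger + N)$.

```python
import numpy as np

rng = np.random.default_rng(0)

npix, sigma_n = 16, 0.3
x = np.arange(npix)
S = np.exp(-np.abs(x[:, None] - x[None, :]) / 3.0)  # assumed prior covariance
R = np.eye(npix)[::2]                                # assumed mask: every 2nd pixel
N = sigma_n**2 * np.eye(R.shape[0])                  # assumed white noise

# Draw one signal and data realization from the linear model d = R s + n (Eq. 39)
s = rng.multivariate_normal(np.zeros(npix), S)
d = R @ s + sigma_n * rng.standard_normal(R.shape[0])

Ninv = np.linalg.inv(N)
D = np.linalg.inv(np.linalg.inv(S) + R.T @ Ninv @ R)  # propagator, Eq. 44
j = R.T @ Ninv @ d                                    # information source, Eq. 45
m = D @ j                                             # map, Eq. 50

# Same filter written in data space
m_data_space = S @ R.T @ np.linalg.solve(R @ S @ R.T + N, d)


def logdet(A):
    # log |A| for a positive definite matrix
    return np.linalg.slogdet(A)[1]


# Log-evidence from Eq. 55 and from the marginal Gaussian of the data
log_evidence = 0.5 * (logdet(D) - logdet(S) - logdet(2 * np.pi * N)) \
    + 0.5 * (j @ D @ j - d @ Ninv @ d)
C_d = R @ S @ R.T + N
log_evidence_direct = -0.5 * d @ np.linalg.solve(C_d, d) \
    - 0.5 * logdet(2 * np.pi * C_d)

print("max |m - m_data_space| =", np.max(np.abs(m - m_data_space)))
print("log P(d):", log_evidence, log_evidence_direct)
```

The explicit matrix inverses are acceptable only at this toy size; for realistic pixel numbers one would presumably solve $D^{-1} m = j$ iteratively instead of forming $D$.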
An exhaustive introduction to and discussion of the MAP approximation to Gaussian and non-Gaussian signal fields is provided by Lemm [48].

The classical theory is expected to capture essential features of the field theory. However, if the field fluctuations are able to probe phase space regions away from the maximum in which the Hamiltonian (or posterior) has a more complex structure, deviations between classical and field theory should become apparent.

Extremizing the Hamiltonian of the free theory (Eq. 43),

$$ \left. \frac{\delta H_G}{\delta s} \right|_{s=m} = D^{-1} m - j \equiv 0, \tag{56} $$

we get the classical mapping equation $m = D\, j$, which is identical to the field theoretical result (Eq. 50).

It is also possible to measure the sharpness of the maximum of the posterior by calculating the Hessian curvature matrix

$$ \mathcal{H}_G[m] = \left. \frac{\delta^2 H[s]}{\delta s^2} \right|_{s=m} = D^{-1}. \tag{57} $$

In the Gaussian approximation of the maximum of the posterior, the inverse of the Hessian is identical to the covariance of the residual,

$$ \langle r\, r^\dagger \rangle = \mathcal{H}^{-1}[m] = D, \tag{58} $$

which for the pure Gaussian model is of course identical to the exact result, as given by the field theory (Eq. 53).

IV. INTERACTING INFORMATION FIELDS

A. Interaction Hamiltonian

1. General Form

All results of the free theory presented so far are well known within the field of signal reconstruction. IFT reproduces them elegantly, and is therefore of pedagogical value. However, the new results presented in the rest of this paper arise as soon as one leaves the free theory. Non-Gaussian signal or noise, a non-linear response, or a signal-dependent noise create anharmonic terms in the Hamiltonian. These describe interactions between the eigenmodes of the free Hamiltonian.

We assume the Hamiltonian can be Taylor expanded in the signal fields, which permits writing

$$ H[s] = \underbrace{\frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H^G_0}_{H_G[s]} + \underbrace{\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n}}_{H_\mathrm{int}[s]}. \tag{59} $$

Repeated coordinates are thought to be integrated over. In contrast to Eq. 36 we have now included perturbations which are constant, linear and quadratic in the signal field, because we are summing from $n = 0$. This permits treating certain non-ideal effects perturbatively. For example, if a mostly position-independent propagator gets a small position-dependent contamination, it might be more convenient to treat the latter perturbatively and not to include it in the propagator used in the calculation. Note further that all coefficients can be assumed to be symmetric with respect to their coordinate-indices (footnote 6).

Footnote 6: This means $D_{xy} = D_{yx}$ and $\Lambda^{(n)}_{x_{\pi(1)} \ldots x_{\pi(n)}} = \Lambda^{(n)}_{x_1 \ldots x_n}$ with $\pi$ any permutation of $\{1, \ldots, n\}$, since even non-symmetric coefficients would automatically be symmetrized by the integration over all repeated coordinates. Therefore, we assume in the following that such a symmetrization operation has already been done, or we impose it by hand before we continue with any perturbative calculation by applying

$$ \Lambda^{(n)}_{x_1 \ldots x_n} \longmapsto \frac{1}{n!} \sum_{\pi \in P_n} \Lambda^{(n)}_{x_{\pi(1)} \ldots x_{\pi(n)}}. $$

This clearly leaves any symmetric tensor invariant if $P_n$ is the space of all permutations of $\{1, \ldots, n\}$.

Often, it is more convenient to work with a shifted field $\phi = s - t$, where some (e.g. background) field $t$ is removed from $s$. The Hamiltonian of $\phi$ reads

$$ H'[\phi] = \underbrace{\frac{1}{2}\, \phi^\dagger D^{-1} \phi - j'^\dagger \phi + H'_0}_{H'_G[\phi]} + \underbrace{\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda'^{(n)}_{x_1 \ldots x_n}\, \phi_{x_1} \cdots \phi_{x_n}}_{H'_\mathrm{int}[\phi]}, $$

with

$$ \begin{aligned} H'_0 &= H^G_0 - j^\dagger t + \frac{1}{2}\, t^\dagger D^{-1} t, \\ j' &= j - D^{-1} t, \quad \text{and} \\ \Lambda'^{(m)}_{x_1 \ldots x_m} &= \sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(m+n)}_{x_1 \ldots x_{m+n}}\, t_{x_{m+1}} \cdots t_{x_{m+n}}. \end{aligned} \tag{60} $$

2. Feynman rules

Since all the information on any correlation functions of the fields is contained in the partition sum and can be extracted from it, only the latter needs to be calculated:

$$ \begin{aligned} Z[J] &= \int \mathcal{D}s\; e^{-H[s] + J^\dagger s} \\ &= \int \mathcal{D}s\; \exp\!\left[ -\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, s_{x_1} \cdots s_{x_n} \right] e^{-H_G[s] + J^\dagger s} \\ &= \exp\!\left[ -\sum_{n=0}^{\infty} \frac{1}{n!}\, \Lambda^{(n)}_{x_1 \ldots x_n}\, \frac{\delta}{\delta J_{x_1}} \cdots \frac{\delta}{\delta J_{x_n}} \right] \int \mathcal{D}s\; e^{-H_G[s] + J^\dagger s} \\ &= \exp\!\left( -H_\mathrm{int}\!\left[ \frac{\delta}{\delta J} \right] \right) Z_G[J]. \end{aligned} \tag{61} $$

There exist well-known diagrammatic expansion techniques for such expressions [e.g. 57]. The expansion terms of the logarithm of the partition sum, from which any connected moments can be calculated, are represented by all possible connected diagrams built out of lines and vertices (with a number of legs connecting to lines) and without any external line-ends (any line ends in a vertex). These diagrams are interpreted according to the following Feynman rules:

1. Open ends of lines in diagrams correspond to external coordinates and are labeled by such. Since the partition sum in particular does not depend on any external coordinate, it is calculated only from summing up closed diagrams. However, the field expectation value $m(x) = \langle s(x) \rangle_{(s|d)} = \delta \log Z[J] / \delta J(x)|_{J=0}$ and higher order correlation functions depend on coordinates and therefore are calculated from diagrams with one or more open ends, respectively.

2. A line with coordinates $x'$ and $y'$ at its ends represents the propagator $D_{x'y'}$ connecting these locations.

3. Vertices with one leg get an individual internal, integrated coordinate $x'$ and represent the term $j_{x'} + J_{x'} - \Lambda^{(1)}_{x'}$.

4. Vertices with $n$ legs represent the term $-\Lambda^{(n)}_{x'_1 \ldots x'_n}$, where each individual leg is labeled by one of the internal coordinates $x'_1, \ldots, x'_n$. This more complex vertex-structure, as compared to QFT, is a consequence of non-locality in IFT.

5. All internal (and therefore repeatedly occurring) coordinates are integrated over, whereas external coordinates are not.

6. Every diagram is divided by its symmetry factor, the number of permutations of vertex legs leaving the topology invariant, as described in any book on field theory [e.g. 57].

The $n$-th moment of $s$ is generated by taking the $n$-th derivative of $\log Z[J]$ with respect to $J$, and then setting $J = 0$. This corresponds to removing $n$ end-vertices from all diagrams. For example, the first four diagrams contributing to a map ($m = \langle s \rangle_{(s|d)}$) are

$$ \begin{aligned} \text{[diagram 1]} &= D\, j = D_{xy}\, j_y \equiv \int dy\, D(x, y)\, j(y), \\ \text{[diagram 2]} &= -\frac{1}{2}\, D\, \Lambda^{(3)}[\cdot, D] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(3)}_{yzu}\, D_{zu} \equiv -\frac{1}{2} \int dy\, D_{xy} \int dz \int du\, \Lambda^{(3)}_{yzu}\, D_{zu}, \\ \text{[diagram 3]} &= -\frac{1}{2}\, D\, \Lambda^{(3)}[\cdot, D j, D j] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(3)}_{yzu}\, D_{zz'}\, j_{z'}\, D_{uu'}\, j_{u'} \\ &\equiv -\frac{1}{2} \int dy\, D_{xy} \int dz \int du\, \Lambda^{(3)}_{yzu} \int dz'\, D_{zz'}\, j_{z'} \int du'\, D_{uu'}\, j_{u'}, \quad \text{and} \\ \text{[diagram 4]} &= -\frac{1}{2}\, D\, \Lambda^{(4)}[\cdot, D, D j] = -\frac{1}{2}\, D_{xy}\, \Lambda^{(4)}_{yzuv}\, D_{zu}\, D_{vv'}\, j_{v'} \\ &\equiv -\frac{1}{2} \int dy\, D_{xy} \int dz \int du \int dv\, \Lambda^{(4)}_{yzuv}\, D_{zu} \int dv'\, D_{vv'}\, j_{v'}. \end{aligned} \tag{62} $$

Here we have assumed that any first and second order perturbation was absorbed into the data source and the propagator, thus $\Lambda^{(1)} = \Lambda^{(2)} = 0$. Repeated indices are assumed to be integrated (or summed) over.

3. Local interactions and Fourier space rules

In case of purely local interactions,

$$ \Lambda^{(n)}_{x_1 \ldots x_n} = \lambda_n(x_1)\, \delta(x_1 - x_2) \cdots \delta(x_1 - x_n), \tag{63} $$

the interaction Hamiltonian reads

$$ H_\mathrm{int} = \sum_{m=0}^{\infty} \frac{1}{m!}\, \lambda_m^\dagger\, s^m \tag{64} $$

and the expressions of the Feynman diagrams simplify considerably. The fourth Feynman rule can be replaced by

4. Vertices with $n$ lines connected to them are associated with a single internal coordinate $x'$ and represent the term $-\lambda_n(x')$.
F or ex a mple, the last lo op diagra m in Eq 62 b ecomes = − 1 2 Z dy D xy λ 4 ( y ) D y y Z dz D y z j z . (65) In case of lo cal interactions, it can be helpful to do the calcula tions in F o ur ier space, for which the F eynman rules can be obtaine d by inserting a real-spa ce identit y op erator 1 = F † F in b etw een an y scalar pro duct and assigning the inv er se F ourier tra nsformation F † to the left and the forward transform F to the right term, e.g. D j = F † F D F † | {z } D ′ F j |{z} j ′ = F † D ′ j ′ . This yields: 1. An open end o f a line ha s an exter nal momentum co ordinate k , a nd gets an R dk e − i k x / (2 π ) n applied to it, if r eal space functions are to b e ev a lua ted. 2. A line connecting momentum k with momentum k ′ corres p onds to a dir ected propa gator b etw een these momenta: D kk ′ = D ( k , k ′ ). 3. A data source vertex is ( j + J − λ 1 )( k ′′ ), wher e k ′′ is the momentum at the data-end of the line. 4. A v e rtex with m > 1 lines with momen tum lab e ls k 1 , . . . , k m is − λ m ( k 0 )(2 π ) n δ ( P m i =0 k i ). 5. An in ternal end of a line ha s an internal (in- tegrated) momentum coo rdinate k ′ . In tegr ation means a term R dk ′ / (2 π ) n in front of the ex pres- sion. 6. The expr e ssion g ets divided by the symmetry factor of its diagra m. Here, j ( k ) = ( F j )( k ) = R dx j ( x ) e i k x , D ( k , k ′ ) = ( F D F † )( k , k ′ ) = R dx R dx ′ D ( x, x ′ ) e i ( k x − k ′ x ′ ) , etc. ar e the F our ier-transfor med information source, pro pagator, etc., r espec tiv e ly Note, that momentum dir ections hav e to b e taken into account. The momenta that go in to a vertex, data source or op en end get a po sitiv e sign in the delta-function o f momentum conser v a tion, the ones that go out of a v ertex get a minus sign. 4. 
Simplistic interaction Hamiltonians

In order to have a toy case which permits analytic calculations, we introduce a simplistic Hamiltonian by requiring the data model to be translationally invariant and all interaction terms to be local. This is the case whenever the signal and noise covariances are fully characterized by power spectra over the same spatial space,

\[
S(k, q) = (2\pi)^n\, \delta(k - q)\, P_S(k),
\tag{66}
\]

\[
N(k, q) = (2\pi)^n\, \delta(k - q)\, P_N(k),
\tag{67}
\]

with $P_S(k) = \langle |s(k)|^2 \rangle / V$ and $P_N(k) = \langle |n(k)|^2 \rangle / V$, where $V$ is the volume of the system. We assume further that the signal processing can be completely described by a convolution with an instrumental beam,

\[
d(x) = \int\! dy\, R(x - y)\, s(y) + n(x),
\tag{68}
\]

where the response-convolution kernel has a Fourier power spectrum $P_R(k) = |R(k)|^2$ (no factor $1/V$). In this case $D$ can be fully described by a power spectrum:

\[
D(k, q) = (2\pi)^n\, \delta(k - q)\, P_D(k),
\tag{69}
\]

with $P_D(k) = \left(P_S^{-1}(k) + P_R(k)\, P_N^{-1}(k)\right)^{-1}$. Translational invariance, together with the locality of the interaction terms, requires $\lambda_m = \mathrm{const}$, and therefore the interaction Hamiltonian reads

\[
\begin{aligned}
H_{\rm int}[s] &= \sum_{m=1}^{\infty} \frac{\lambda_m}{m!} \int\! dx\, s^m(x) \\
&= \sum_{m=1}^{\infty} \frac{\lambda_m}{m!} \left( \prod_{i=1}^{m} \int\! \frac{dk_i}{(2\pi)^n}\, s_{k_i} \right) (2\pi)^n\, \delta\!\left( \sum_{j=1}^{m} k_j \right).
\end{aligned}
\tag{70}
\]

In that case, the Feynman rules simplify considerably. For the interaction Hamiltonian of Eq. 70, the Feynman rules are now:

1. unintegrated $x$-coordinate: $\exp(-ikx)$ (if real space functions are to be evaluated),

2. propagator: $P_D(k)$,

3. data source vertex: $(j + J - \lambda_1)(k)$,

4. vertex with $m > 1$ lines: $-\lambda_m$,

5. impose momentum conservation at each vertex: $(2\pi)^n\, \delta\!\left(\sum_{i=1}^{m} k_i\right)$, and integrate over every internal momentum: $\int \frac{dk}{(2\pi)^n}$,

6. and divide by the symmetry factor.

5.
F eynman rules on the spher e F or CMB reconstructio n and analysis, but pres umably also for terrestria l applications, the F eynman rules on the sphere Ω = S 2 are needed and therefore provided in Appendix B. 14 B. Normalisabil it y of the theory In contrast to QFT, IFT should b e pro p erly nor mal- ized and not nece s sarily require a n y reno r malization pr o- cedure. T he reaso n is that IFT is not a low-energy limit of so me unknown high-ener gy theo ry , but can b e set up as the full (high-energ y) theory . T he Hamiltonian is just the logar ithm of the joint proba bilit y function of da ta and signal, H d [ s ] ≡ − log [ P ( d, s )], a nd therefor e well defined and proper ly norma liz e d if the la tter is. Only if a d-ho c Hamiltonians are set up, or if a pproximations lead to ill- normalized theories, norma lization should be an issue. How ever, since we are trying a p erturbative expansion of the theory , there is no guara n tee that all individual terms are pro viding finite res ults. F o r example in QFT, simple lo op diagrams are k no wn to be divergent and re- quire renormalization. In the following we inv estigate a simplistic, but r epresentativ e case of IFT, which shows that s uc h problems are g enerally not to b e expe c ted. Let us adopt the simplistic situation described in IV A 4 a nd e stimate a s imple loo p dia gram for which we assume for no tational conv enience λ 3 = − 2 (2 π ) n λ ′ (with λ ′ > 0): = − 1 2 D λ 3 b D (71) = λ ′ Z dk Z dk ′ δ ( k + k ′ − k ′ ) P D ( k ) P D ( k ′ ) e ikx ≤ λ ′ P D (0) Z dk ′ P S ( k ′ ) = λ ′ V P D (0) h s 2 ( x ) i , where V is the volume of the system. Her e and in the following, b C denotes the dia gonal of the matrix C . Thu s, a s long the signal field is of b ounded v ar iance, the lo op diagram is conv ergent due to P D ≤ P S for all k . 
Even a signal of unbounded variance would not lead to a divergent loop diagram if $\int dk\, (P_N/P_R)(k)$ is finite, since we also have $P_D \le P_N/P_R$. A bounded variance signal is very natural, especially in a cosmological setting.$^{7}$ Finally, since a signal as an information field can be chosen freely, we can define it to be the filtered version of the physical field (e.g. dark matter distribution or CMB fluctuations), so that only modes of sufficiently bounded variance are present in it. Since we have the freedom to choose information fields which are mathematically well behaved, we can therefore ensure convergence of expressions.

Although this is not a general proof of normalisability of the theory, which is beyond the scope of this paper, it should provide confidence in the well-behavedness of the formalism in sensible applications. The price to be paid for this well-behavedness is the more complex structure of the propagator, which, in comparison to QFT, even in simplistic cases can be non-analytical and require numerical evaluation.

$^{7}$ The cosmological signal of primary interest, the initial density fluctuations as revealed by the large-scale structure and the CMB, is expected to exhibit a suppression of small-scale power due to the free-streaming of dark matter particles before they became non-relativistic. Also the CMB temperature fluctuations are damped on small scales, due to free streaming of photons around the time of recombination.

C. Expansion around the classical solution

1. General case

The classical solution of the Hamiltonian in Eq. 59 is provided by its minimum,

\[
\frac{\delta H}{\delta s_x} = D^{-1}_{xy}\, s_y - j_x + \sum_{m=1}^{\infty} \frac{1}{m!}\, \Lambda^{(m+1)}_{x x_1 \ldots x_m}\, s_{x_1} \cdots s_{x_m} = 0 .
\tag{72}
\]

This leads to the equation for the classical field

\[
s^{\rm cl}_y = D_{yx} \left( j_x - \sum_{m=1}^{\infty} \frac{1}{m!}\, \Lambda^{(m+1)}_{x x_1 \ldots x_m}\, s^{\rm cl}_{x_1} \cdots s^{\rm cl}_{x_m} \right),
\tag{73}
\]

which one can try to solve iteratively.

2.
Local interactions

For simplicity, we concentrate for a moment on the case of purely local interactions, for which the equation for the classical field $s_{\rm cl}$ is

\[
s_{\rm cl} = D \left( j - \sum_{m=1}^{\infty} \frac{\lambda_{m+1}}{m!}\, s_{\rm cl}^m \right).
\tag{74}
\]

Iterating this equation and rewriting the resulting terms as Feynman diagrams shows that the classical solution contains the tree-diagrams. The loop diagrams can be added by investigation of the non-classical uncertainty field $\phi = s - s_{\rm cl}$.

A non-classical expansion of the information field around the classical field is possible by inserting $s = s_{\rm cl} + \phi$ into the Hamiltonian (Eq. 64). Reordering terms according to the powers of the field $\phi$ leads to its Hamiltonian

\[
\begin{aligned}
H'[\phi] &\equiv H[s_{\rm cl} + \phi] = H'_0 + \frac{1}{2}\, \phi^\dagger D'^{-1} \phi - j'^\dagger \phi + \sum_{m=3}^{\infty} \frac{1}{m!}\, {\lambda'_m}^\dagger \phi^m, \quad\text{with} \\
\lambda'_n &\equiv \sum_{m=0}^{\infty} \frac{\lambda_{n+m}}{m!}\, s_{\rm cl}^m, \\
H'_0 &\equiv H[s_{\rm cl}] = H_0 + \frac{1}{2}\, s_{\rm cl}^\dagger D^{-1} s_{\rm cl} + \lambda'_0, \\
j' &\equiv j - \lambda'_1 - D^{-1} s_{\rm cl}, \quad\text{and} \\
D' &\equiv \left( D^{-1} + \widehat{\lambda'_2} \right)^{-1}.
\end{aligned}
\tag{75}
\]

In case $s_{\rm cl}$ is exactly the classical solution, Eqs. 74 and 75 imply that $j' = 0$. Thus, there are no one-line internal vertices in any Feynman graphs of the $\phi$-theory, and only loop diagrams contribute uncertainty-corrections$^{8}$ to any information theoretical estimator. For example, the uncertainty-corrections to the classical map estimator are given by

\[
\delta m = m_d - s_{\rm cl} = \langle \phi \rangle_d = \text{[sum of loop diagrams]} + \ldots
\tag{76}
\]

However, in case $s_{\rm cl}$ is not (exactly) the classical solution, be it due to a truncation error of an iteration scheme used to solve for the classical field, or because $s_{\rm cl}$ was chosen for a completely different purpose, Eq. 75 provides the correct field theory for $\phi = s - s_{\rm cl}$ independent of the nature of $s_{\rm cl}$. In case of a truncation error, incorporating diagrams with data-source terms $j'$ into any computation permits correcting the inaccuracy of $s_{\rm cl}$ in a systematic way.

D. Boltzmann-Shannon Information

1.
Helmholtz free energy

Information fields carry information on distributed physical quantities. The amount of signal information should be measurable in information units like bits and bytes. This is possible by adopting the Boltzmann-Shannon information measure of negative entropy. The entropy of a signal probability function measures the phase-space volume available for signal uncertainties, and therefore how tightly the remaining uncertainties are constrained. Thus we define

\[
I_d \equiv \int\! \mathcal{D}s\, P(s|d)\, \log P(s|d) = -\int\! \mathcal{D}s\, \frac{1}{Z}\, e^{-H[s]} \left( H[s] + \log Z \right) = -\langle H[s] \rangle_d - \log Z
\tag{77}
\]

as the information measure. Introducing

\[
Z_\beta[d, J] = \int\! \mathcal{D}s\, \exp\!\left( -\beta\, ( H[s] - J^\dagger s ) \right), \quad\text{and}\quad F_\beta[d, J] = -\frac{1}{\beta}\, \log Z_\beta[d, J],
\tag{78}
\]

of which the latter is the Helmholtz free energy as a function of the inverse temperature $\beta$, we can write

\[
I_d = -\log Z_1[d, 0] - \langle H[s] \rangle_d = -\left. \frac{\partial F_\beta[d, J]}{\partial \beta} \right|_{\beta = 1,\, J = 0},
\tag{79}
\]

as can be verified by a direct calculation. The first expression for $I_d$ in Eq. 79 is equivalent to the well known thermodynamic relation $F = E - T S_B$ with the internal energy $E = \langle H[s] \rangle_d$, the Boltzmann entropy $S_B = -I_d$ and the temperature, which is set here to $T = 1$. The second expression actually holds even if the Hamiltonian is improperly normalized, e.g. $H_0$ can be chosen arbitrarily if $Z_\beta[d, J]$ is calculated consistently with this choice.

$^{8}$ We propose the term uncertainty-corrections in order to describe the influence of the spread of the probability distribution function around its maximum. The uncertainty-corrections are the information field theoretical equivalent to quantum-corrections in quantum field theories.

The Helmholtz free energy $F_\beta[J]$ is also the generator of all connected correlation functions of the signal, $\langle s_{x_1} \cdots s_{x_n} \rangle^{c}_{(s|d)} = -\left. \delta^n F_\beta[d, J] / \delta J_{x_1} \cdots \delta J_{x_n} \right|_{\beta = 1,\, J = 0}$.
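The direct calculation behind Eq. 79 can be illustrated numerically for a toy problem with a single degree of freedom. This is only a sketch under stated assumptions: the anharmonic Hamiltonian, its coefficients, the integration grid, and the helper names `log_partition` and `free_energy` are all hypothetical choices made for illustration, and the functional integral is replaced by a Riemann sum.

```python
import numpy as np

# Single degree of freedom s on a fine grid. The Hamiltonian and all of its
# coefficients are hypothetical and chosen for illustration only.
s = np.linspace(-12.0, 12.0, 48001)
ds = s[1] - s[0]
H = 1.3 + 0.5 * s**2 / 0.7 - 0.4 * s + 0.3 * s**4 / 24.0

def log_partition(beta):
    """log Z_beta at J = 0, with the integral replaced by a Riemann sum."""
    return np.log(np.sum(np.exp(-beta * H)) * ds)

def free_energy(beta):
    """Helmholtz free energy F_beta = -log(Z_beta) / beta (Eq. 78)."""
    return -log_partition(beta) / beta

# Direct route (Eq. 77): I_d = integral of P log P with P = exp(-H) / Z_1.
log_P = -H - log_partition(1.0)
I_direct = np.sum(np.exp(log_P) * log_P) * ds

# Thermodynamic route (Eq. 79): I_d = -dF/dbeta at beta = 1,
# approximated by a central finite difference.
eps = 1e-4
I_thermo = -(free_energy(1.0 + eps) - free_energy(1.0 - eps)) / (2.0 * eps)

print(I_direct, I_thermo)
```

Both routes agree up to the finite-difference error. Shifting the constant term of `H` changes neither value, in line with the remark that $H_0$ may be chosen arbitrarily as long as $Z_\beta$ is computed consistently.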
The free energy can be calculated as follows:

\[
\begin{aligned}
F_\beta &= -\frac{1}{\beta}\, \log\!\left( Z^G_\beta[J]\; \frac{1}{Z^G_\beta[J]} \int\! \mathcal{D}s\, e^{-\beta H_{\rm int}[s]}\, e^{-\beta ( H_G[s] - J^\dagger s )} \right) \\
&= -\frac{1}{\beta}\, \log Z^G_\beta[J] - \frac{1}{\beta}\, \log \left\langle e^{-\beta H_{\rm int}[s]} \right\rangle_{(s | J + j,\, G)},
\end{aligned}
\tag{80}
\]

where the average in the last term is over the Gaussian probability function $P^G_{J,\beta}[s] = \exp( -\beta ( H_G[s] - J^\dagger s ) ) / Z^G_\beta[J]$. This term can be calculated by using the well-known fact that the logarithm of the sum of all possible connected and unconnected diagrams with only internal coordinates (or without free ends), as generated by the exponential function of the interaction terms, is given by the sum of all connected diagrams [57].

For example, a free theory, perturbed by small, up-to-fourth-order interaction terms (all being proportional to some small parameter $\gamma$), has

\[
F_\beta[J] = \underbrace{H^G_0 + \Lambda^{(0)}}_{H_0} - \beta^{-1} \left( \text{[sum of connected diagrams]} \right) + O(\gamma^2),
\tag{81}
\]

where an information source vertex reads $\beta\, (J + j - \Lambda^{(1)})$, an internal vertex with $n$ lines $\beta\, \Lambda^{(n)}$, and the propagator $\beta^{-1} D$. Finally, we have defined

\[
\text{[diagram]} = \frac{1}{2}\, \log | 2\pi D \beta^{-1} | = \frac{1}{2}\, \mathrm{Tr}( \log( 2\pi D \beta^{-1} ) ).
\]

Thus, we have

\[
\begin{aligned}
F_\beta[J] = {} & H_0 - \frac{1}{2\beta}\, \mathrm{Tr}( \log( 2\pi D \beta^{-1} ) ) + \frac{1}{2\beta}\, \Lambda^{(2)}[D] \\
& + \frac{1}{2}\, ( J + j - \Lambda^{(1)} )^\dagger\, ( D + \Lambda^{(2)} )\, ( J + j - \Lambda^{(1)} ) \\
& + \frac{1}{2\beta}\, \Lambda^{(3)}[D, m_J] + \frac{1}{3!}\, \Lambda^{(3)}[m_J, m_J, m_J] \\
& + \frac{1}{8\beta^2}\, \Lambda^{(4)}[D, D] + \frac{1}{4\beta}\, \Lambda^{(4)}[D, m_J, m_J] + \frac{1}{4!}\, \Lambda^{(4)}[m_J, m_J, m_J, m_J] + O(\gamma^2),
\end{aligned}
\tag{82}
\]

where we introduced the zero-order map $m_J = D\,(J + j)$ for notational convenience. The power of $\beta$ associated with the different diagrams in Eq. 81 is given by the number of vertices minus the number of propagators minus one.
Thus, all tree-diagrams are of order $\beta^0$, the one-loop diagrams are of order $\beta^{-1}$ and the two-loop diagram of order $\beta^{-2}$, and only the latter two affect the information content:

\[
\begin{aligned}
I_d &= -\left[ \text{one- and two-loop diagrams} \right] + O(\gamma^2) \\
&= \frac{1}{2} \left[ -\mathrm{Tr}( 1 + \log( 2\pi D ) ) + \Lambda^{(2)}[D] + \Lambda^{(3)}[D, m_0] + \frac{1}{2}\, \Lambda^{(4)}[ D \otimes ( D + m_0 m_0^\dagger ) ] \right] + O(\gamma^2),
\end{aligned}
\tag{83}
\]

where $\text{[diagram]} = \mathrm{Tr}(1)$, $\beta = 1$, $J = 0$, and thus $m_0 = D j$.$^{9}$

$^{9}$ Here, we introduced the symmetrized tensor product $A \otimes B$ of an $n$-rank tensor $A$ and an $m$-rank tensor $B$, which has the property
\[
( A \otimes B )_{x_1 \ldots x_{n+m}} = \frac{1}{m!} \sum_{\pi \in P_{n+m}} A_{x_{\pi(1)} \ldots x_{\pi(n)}}\, B_{x_{\pi(n+1)} \ldots x_{\pi(n+m)}},
\]
with $P_l$ being the set of permutations of $\{1, \ldots, l\}$.

2. Free theory

To obtain the information content of the free theory, we can set $\gamma = 0$ in Eqs. 82 and 83 or use Eq. 49 with the replacements $J \to \beta J$, $j \to \beta j$, $D \to \beta^{-1} D$, and $H_0 \to \beta H_0$. In both cases we find identically

\[
F_\beta[J] = H^G_0 - \frac{1}{2}\, ( J + j )^\dagger D\, ( J + j ) - \frac{1}{2\beta}\, \mathrm{Tr} \log\!\left( \frac{2\pi}{\beta}\, D \right), \quad\text{and}\quad I_d = -\frac{1}{2}\, \mathrm{Tr}\!\left( 1 + \log( 2\pi D ) \right).
\tag{84}
\]

Very similarly, one can calculate the information prior to the data, which turns out to be

\[
I_0 = -\frac{1}{2}\, \mathrm{Tr}\!\left( 1 + \log( 2\pi S ) \right).
\tag{85}
\]

Thus, the data-induced information gain is

\[
\Delta I_d = I_d - I_0 = \frac{1}{2}\, \mathrm{Tr} \log\!\left( S D^{-1} \right) = \frac{1}{2}\, \mathrm{Tr} \log\!\left( 1 + S R^\dagger N^{-1} R \right).
\tag{86}
\]

The information gain depends on the signal-response-to-noise ratio $Q \equiv R S R^\dagger N^{-1}$, also shortly denoted by the measurement fidelity or quality. The information increases linearly with $Q$ as long as $Q \ll 1$, but levels off to a logarithmic increase for $Q \gg 1$. We note that only for the free theory is the information gain independent of the actual data realization.

E.
IFT Recipe

A typical IFT application will aim at calculating a model evidence $P(d)$, the expectation value of a signal given the data, the map $m(x) = \langle s(x) \rangle_{(s|d)}$ of the signal, or its variance $\sigma_s^2(x, y) = \langle ( s(x) - m(x) )\, ( s(y) - m(y) ) \rangle_{(s|d)}$ as a measure of the signal uncertainty. The general recipe for such applications can be summarized as follows:

• Specify the signal $s$ and its prior probability distribution $P(s)$. If the signal is derived from a physical field $\psi$, of which a prior statistic is known, the distribution of $s = s[\psi]$ is induced according to Eq. 2.

• Specify the data model in terms of a likelihood $P(d|s)$ conditioned on $s$. Again, if the data are related to an underlying physical field $\psi$, the likelihood is given by Eq. 4.

• Calculate the Hamiltonian $H_d[s] = -\log( P(d, s) )$, where $P(d, s) = P(d|s)\, P(s)$ is the joint probability, and expand it in a Taylor-Fréchet series for all degrees of freedom of $s$. Identify the coefficients of the constant, linear, quadratic, and $n$th-order terms with the normalization $H_0$, information source $j$, inverse propagator $D^{-1}$, and $n$th-order interaction term $\Lambda^{(n)}$, respectively, as shown in Eq. 36 or 59.

• Draw all diagrams which contribute to the quantity of interest, consisting of vertices, lines, and open ends, up to some order in complexity or some small ordering parameter. The log-evidence is given by the sum of all connected diagrams without open ends, the expectation value of the signal by all connected diagrams with one open end, and the signal variance around this mean by all connected diagrams with two open ends.

• Read the diagrams as computational algorithms specified by the Feynman rules in Sect.
IV, and implement them by using linear algebra packages or existing map-making codes for the information propagator and vertices. The required discretisation is outlined in Sect. II E 1. Information on how to implement the required matrix inversions efficiently can be found in the literature given in Secs. I C 2, I C 4, and I C 5 and especially in [38].

• If the resulting non-linear data transformation (or filter) has the required accuracy, e.g. to be verified via Monte-Carlo simulations using signal and data realizations drawn from the prior and likelihood, respectively, an IFT algorithm is established.

• In case interaction terms in the Hamiltonian are too large for a finite number of diagrams to form a well performing algorithm, a re-summation of high order terms is due. This can be achieved by the saddle point approximation (classical solution, maximum a posteriori estimator), or even better by a detailed renormalization-flow analysis along the lines outlined in Sect. V F.

V. COSMIC LARGE-SCALE STRUCTURE VIA GALAXY SURVEYS

A. Poissonian data model and Hamiltonian

Many datasets suffer from Poisson noise, which is non-Gaussian and signal dependent, and therefore well suited to test IFT in the non-linear regime. For example, the cosmological LSS is traced by galaxies, which may be assumed to be generated by a Poisson process. On large scales, the expectation value of the galaxy density follows that of the underlying (dark) matter distribution. The aim of cosmography is to recover the initial density field from the shot-noise contaminated galaxy data. Currently, large galaxy surveys are conducted in order to chart the cosmic matter distribution in three dimensions. Improving the galaxy based LSS reconstruction techniques and understanding their uncertainties better is therefore an imminent and important goal.
Optimal techniques to reconstruct signals affected by Poissonian noise are also crucial for other problems, since e.g. imaging with photon detectors plays an important role in astronomy and other fields.

Here, we outline how such problems can be treated, by discussing a specific data model motivated by the problem of large-scale-structure reconstruction from galaxies. For this problem we work out the optimal estimator and show its superiority numerically. A more general discussion of models of galaxy and structure formation and references to relevant works was given in Sect. I C 4.

In order to treat the Poissonian case in a convenient fashion, we subdivide the physical space into small cells with volumes $\Delta V$, and assume that a cell located at $x_i$ has an expected number of observed galaxies

\[
\mu_i \approx \kappa\, ( 1 + b\, s(x_i) )
\tag{87}
\]

with $\kappa = \bar{n}_g\, \Delta V$ being the cosmic average number of galaxies per cell and $b$ being the bias of the galaxy overdensity with respect to the dark matter overdensity $s$, still assumed to be a Gaussian random field (Eq. 38).

However, this data model has two shortcomings. First, too negative fluctuations of the Gaussian random field with $b\,s < -1$ lead to negative expectation values, for which the Poissonian statistics is not defined. Second, the mean density of observable galaxies $\kappa$ and their bias parameter $b$ are constant everywhere, whereas in reality both exhibit spatial variations.$^{10}$ Although they are then spatially inhomogeneous, we assume $\kappa$ and $b$ to be known for the moment and to incorporate all the above observational effects.

To cure the above mentioned shortcomings we replace Eq.
87 by a non-linear and non-translationally invariant model:

\[
\mu_i = \kappa(x_i)\, \exp( b(x_i)\, s(x_i) ),
\tag{88}
\]

where $\kappa$ and $b$ may depend on position in a known way, and the unknown Gaussian field $s$, the log-matter density, may exhibit unrestricted negative fluctuations. Note that $\mu$ is the signal response, by our definition in Eq. 10, since $\mu[s] = \langle d \rangle_{(d|s)}$. We call $\kappa$ the zero-response, since $\mu[0] = \kappa$.

It should be stressed that the data model in Eq. 88 is just a convenient choice for illustration and proof-of-concept purposes, and is easily exchangeable with more realistic, and even non-local data models. However, this log-normal data model was originally proposed by Coles and Jones [212], investigated for constrained realizations by Sheth [107] and Vio et al. [213], and seems to reproduce the statistics of LSS simulations much better than the often used normal distribution of the overdensity [214].

Having chosen a Poissonian process to populate the Universe and our observational data with galaxies according to the underlying log-density field $s$, the likelihood is

\[
P(d|s) = \prod_i \frac{\mu_i^{d_i}}{d_i!}\, e^{-\mu_i} = \exp\!\left\{ \sum_i \left[ d_i \log \mu_i - \mu_i - \log( d_i! ) \right] \right\},
\tag{89}
\]

where $d_i$ is the actual number of galaxies observed in cell $i$. Since $P(s) = \mathcal{G}(s, S)$, the Hamiltonian is given by

\[
\begin{aligned}
H_d[s] &= -\log P(d, s) = -\log P(d|s) - \log P(s) \\
&= -d^\dagger b\, s + \kappa^\dagger \exp( b\, s ) + H'_0 + \frac{1}{2}\, s^\dagger S^{-1} s \\
&= \frac{1}{2}\, s^\dagger D^{-1} s - j^\dagger s + H_0 + \sum_{n=3}^{\infty} \frac{1}{n!}\, \lambda_n^\dagger s^n, \\
\text{with}\quad D^{-1} &= S^{-1} + \widehat{\kappa b^2},
\end{aligned}
\tag{90}
\]

$^{10}$ Such variations are due to the geometry of the observational survey sky coverage, due to a galaxy selection function which decreases with distance from the observer, and due to a changing composition of the galaxy population.
The latter distance effects are caused by the cosmic evolution of galaxies and by the changing observational detectability of the different types with distance. We note that an observed sample of galaxies, which was selected deterministically or stochastically from a complete sample, e.g. by their luminosity due to instrumental sensitivity, still possesses Poissonian statistics if the original distribution does.

\[
j = b\, ( d - \kappa ), \quad H_0 = \frac{1}{2}\, \log( | 2\pi S | ) + ( \kappa + \log( d! ) )^\dagger 1 - d^\dagger \log \kappa, \quad\text{and}\quad \lambda_n = \kappa\, b^n .
\]

The hat on a scalar field denotes that it should be read as a matrix which is diagonal in position space (see Appendix A).

A few remarks are in order. Comparing the propagator to the one of our Gaussian theory, one can read off an inverse noise term $M = R^\dagger N^{-1} R = \widehat{\kappa b^2}$. Thus the effective (inversely response weighted) noise decreases with increasing mean galaxy number and bias, and becomes infinite in regions without data ($\kappa = 0$) without causing any problem for the formalism.

The information source $j$ increases with increasing response (bias) of the data (galaxies) to the signal (density fluctuations). However, it vanishes for zero response ($b = 0$) or in case the observed galaxy counts match the expected mean at a given location exactly. Finally, the interaction terms $\lambda_n$ are local in position space, and vanish with decreasing $b$ and $\kappa$. The latter parameter is under the control of the data analyst, since it is proportional to the volume of the individual pixels, and therefore can be made arbitrarily small by choosing a more fine grained resolution in signal space.
However, this would not change the convergence properties of the series, since any interaction vertex then has to be summed over a correspondingly larger number of pixels within a coherence patch of the signal, which exactly compensates for the smaller coefficient.$^{11}$ The bias, in contrast, is set by nature and can be regarded as a power counting parameter, which naturally provides a numerical hierarchy among the higher order vertices and diagrams for $b^2 S < 1$. Note that $j = O(b)$.

$^{11}$ $\kappa$ seems to control the stiffness of the later introduced response renormalization flow equation and its value is therefore numerically relevant. A lower $\kappa$, due to a finer space pixelisation, results in a less stiff and better behaved equation.

B. Galaxy types and bias variations

Real galaxies can be cast into different classes, which all differ in terms of their luminosities, bias factors, and the frequencies with which they are found in the Universe. Although we are not going to investigate this complication in the following, it should be explained here how all the formulae in this section can easily be reinterpreted in order to incorporate also the different classes of galaxies.

The galaxies can be characterized by a type variable $L \in \Omega_{\rm type}$, which may be the intrinsic luminosity, the morphological galaxy type, or a multi-dimensional combination of all properties which determine the galaxy type's spatial distribution via an $L$-dependent bias $b_L$, and their detectability as encoded in $\kappa_L$. The data space is now spanned by $\Omega_{\rm data} = \Omega_{\rm space} \times \Omega_{\rm type}$, and also $\mu$, $\kappa$ and $b$ can be regarded as functions over this space.

Performing the same algebra as in the previous section, just taking the larger data space into account, we get exactly the same Hamiltonian as in Eq. 90, if we interpret any term containing $d$, $\kappa$ and $b$ to be summed or integrated over the type variable $L$.
Thus, we read

\[
\begin{aligned}
j(x) &= ( b\, ( d - \kappa ) )(x) \equiv \int\! dL\; b_L(x)\, ( d_L(x) - \kappa_L(x) ), \\
D^{-1}_{xy} &= \left( S^{-1} + \widehat{\kappa b^2} \right)_{xy} \equiv S^{-1}_{xy} + 1_{xy} \int\! dL\; \kappa_L(x)\, b_L^2(x), \\
\lambda_n(x) &= ( \kappa\, b^n )(x) \equiv \int\! dL\; \kappa_L(x)\, b_L^n(x), \quad\text{and} \\
\mu[s](x) &= \left( \kappa\, e^{b s} \right)(x) \equiv \int\! dL\; \kappa_L(x)\, e^{b_L(x)\, s(x)} = \int\! dL\; \mu_L[s](x),
\end{aligned}
\tag{91}
\]

which all live in $\Omega_{\rm space}$ solely, so that the computational complexity of the matter distribution reconstruction problem is not affected at all, and only a bit more book-keeping is required in its setup.

A few observations are in order. In case all galaxies have the same bias factor, Eq. 91 is simply a marginalization over the type variable $L$, and any differentiation of the various galaxy types is not necessary. Since all known galaxy types seem to have $b \sim O(1)$, such a marginalization seems to be justified, and explains why LSS reconstructions which applied this simplification are relatively successful, although the different galaxy masses, luminosities, and frequencies vary by orders of magnitude. As our numerical experiments below reveal, the data, and therefore the reconstructability of the density field, exhibit a sensitive dependence on the bias for $s$-fluctuations with unit variance.$^{12}$ Such a variance is indeed observed on scales below $\sim 10$ Mpc in the galaxy distribution, and therefore the galaxy type-dependent bias variation does indeed matter.

Larger galaxies, which have larger biases, therefore provide per galaxy a slightly larger information source ($j \propto b$), less shot noise ($R^\dagger N^{-1} R \propto b^2$), and increasingly larger higher-order interaction terms ($\lambda_n \propto b^n$) in comparison to smaller galaxies. However, smaller galaxies are more numerous by orders of magnitude, and therefore provide the largest total contribution to the information source, noise reduction and most low-order interaction terms.
Thus, the latter will dominate and therefore permit a reasonably accurate matter reconstruction from an inhomogeneous galaxy survey using a single bias value. Nevertheless, improvements of the bias treatment are possible by applying the recipes described here.

$^{12}$ This is found for our specific data model $\mu \propto \exp( b\, s )$; however, it should also apply for other models, which somehow have to keep $\mu \ge 0$ even for $b\, s < -1$.

C. Non-linear map making

The map, the expectation of our information field $s$ given the data, is to the lowest order in interaction

\[
\begin{aligned}
m_1 &= \text{[four diagrams]} + O(b^6) \\
&= D_{xy}\, j_y - \frac{1}{2}\, D_{xy}\, b_y^3\, \kappa_y\, D_{yy} - \frac{1}{2}\, D_{xy}\, b_y^3\, \kappa_y\, ( D_{yz}\, j_z )^2 - \frac{1}{2}\, D_{xy}\, b_y^4\, \kappa_y\, D_{yy}\, D_{yz}\, j_z + O(b^6)
\end{aligned}
\tag{92}
\]

or in compact notation

\[
m_1 = m_0 - \frac{1}{2}\, D\, \widehat{b^3 \kappa} \left( \widehat{D} + m_0^2 + \widehat{b \widehat{D}}\, m_0 \right) + O(b^6).
\tag{93}
\]

It is apparent that the non-linear map making formula contains corrections to the linear map $m_0 = D j$. The first two correction terms are always negative, reflecting the fact that our non-linear data model has non-symmetric fluctuations in the data with respect to the mean. The last correction term is oppositely directed to the linear map, thereby correcting for the curvature in the signal response.

A one-dimensional numerical example is displayed in Fig. 1. There, the signal was realized to have a power spectrum $P_S(k) \propto ( k^2 + q^2 )^{-1}$, with a correlation length $q^{-1} = 0.04$. The normalization was chosen such that the auto-correlation function is $\langle s(x)\, s(x + r) \rangle_{(s)} = \exp( -| q\, r | )$ and therefore the signal dispersion is unity, $\langle s^2 \rangle_{(s)} = 1$. The data are generated by a Poissonian process from $\kappa_s = \kappa \exp( b\, s )$ with $b = 0.5$. All three displayed reconstructions exhibit less power than the original signal, as is expected since the reconstruction is conservative, and therefore biased towards zero.
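The map-making formula of Eq. 93 can be sketched with dense linear algebra for a small one-dimensional problem. The setup below only loosely follows the example of Fig. 1: the coarser resolution, the constant $\kappa$, the random seed, and all variable names are assumptions made for illustration, and the propagator is built by explicit matrix inversion, which would not scale to realistic problem sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid and model parameters (the resolution n is an assumption chosen to
# keep the example fast; kappa is taken constant for simplicity).
n, q, b = 256, 25.0, 0.5
x = (np.arange(n) + 0.5) / n
kappa = np.full(n, 1000.0 / n)          # zero-response: galaxies per pixel

# Signal covariance with <s(x) s(x+r)> = exp(-|q r|), i.e. unit variance.
S = np.exp(-q * np.abs(x[:, None] - x[None, :]))

# Draw a signal from the Gaussian prior and Poissonian data from Eq. 88.
s = np.linalg.cholesky(S + 1e-10 * np.eye(n)) @ rng.standard_normal(n)
d = rng.poisson(kappa * np.exp(b * s))

# Information source and propagator of Eq. 90:
# j = b (d - kappa),  D = (S^-1 + M)^-1 = S (1 + M S)^-1,  M = diag(kappa b^2).
j = b * (d - kappa)
M = np.diag(kappa * b**2)
D = S @ np.linalg.inv(np.eye(n) + M @ S)
D_hat = np.diag(D)                       # diagonal of the propagator

# Linear (Wiener filter) map and the corrections of Eqs. 92 and 93.
m0 = D @ j
first_two = -0.5 * D @ (b**3 * kappa * (D_hat + m0**2))   # always negative
last = -0.5 * D @ (b**4 * kappa * D_hat * m0)             # opposes m0 locally
m1 = m0 + first_two + last

print(np.sqrt(np.mean((m0 - s) ** 2)), np.sqrt(np.mean((m1 - s) ** 2)))
```

Writing $D = S\,(1 + M S)^{-1}$ avoids inverting $S$ explicitly; this is an implementation choice of the sketch, not a prescription of the text.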
The non-linear correction to the naive map $m_0$ should not be too large, otherwise higher order diagrams have to be included. In the case displayed in Fig. 1, $b = 0.5$ ensured that the corrections to the linear map were mostly going in the right direction. However, in case $b \approx 1$ there is no obvious ordering of the importance of the different interaction vertices, and numerical experiments reveal that the first order corrections strongly overcorrect the linear map $m_0 = D j$. In such a case interaction re-summation techniques should be used to incorporate as many higher order interaction terms as possible. One very powerful re-summation is provided by the classical solution, as developed below, which contains all tree-diagrams simultaneously. This solution, also shown in Fig. 1, is very close to $m_1$ in this case.

D. Classical solution

The classical signal field or MAP solution is given by Eq. 74, which reads in this case

\[
\begin{aligned}
s_{\rm cl} &= D \left( j - \sum_{m=2}^{\infty} \frac{b^{m+1}}{m!}\, \kappa\, s_{\rm cl}^m \right) = D\, b \left( d - \kappa \left( e^{b s_{\rm cl}} - b\, s_{\rm cl} \right) \right) \\
&= S\, b\, ( d - \underbrace{\kappa\, e^{b s_{\rm cl}}}_{\kappa_{s_{\rm cl}}} ).
\end{aligned}
\tag{94}
\]

The last expression motivates the introduction of the expected number of galaxies given the signal $s$:

\[
\kappa_s = \kappa\, e^{b s}.
\tag{95}
\]

Alternative forms of the MAP equation can also be derived, for example one which is especially suitable for large $j$:

\[
s_{\rm cl} = \frac{1}{b}\, \log\!\left( \frac{ j - S^{-1} s_{\rm cl} }{ \kappa\, b } \right) = \frac{1}{b}\, \log\!\left( \frac{d}{\kappa} - 1 - \frac{ S^{-1} s_{\rm cl} }{ \kappa\, b } \right).
\tag{96}
\]

This may be solved iteratively, while ensuring that $s^{(i)}_{\rm cl} \le S j$ at all iterations $i$, with equality only where $\kappa = 0$. This form of the classical field equation has some similarities to the naive inversion of the response formula, $\langle d \rangle_{(d|s)} = \kappa \exp( b\, s )$, which yields

\[
s_{\rm naive} = \frac{1}{b}\, \log\!\left( \frac{d}{\kappa} \right),
\tag{97}
\]

a formula one can only dare to use in regimes of large $d$. Since $s_{\rm naive}$ contains the full noise of the data, a suitable naive map may be given by $m_{\rm naive} = S\, s_{\rm naive}$, after some fix for the locations without galaxy counts.
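To make the comparison between the MAP equation and the naive inversion concrete, the stationarity condition in the last form of Eq. 94 can be solved numerically and set against Eq. 97. The sketch below uses a damped Newton iteration, which is a standard optimization technique swapped in here for illustration and not a scheme prescribed by the text. The grid size, the constant $\kappa$, the random seed, and the helper names `hamiltonian` and `newton_target` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy setup: 1-D grid, exponential correlation, constant kappa and b.
n, q, b = 128, 25.0, 0.5
x = (np.arange(n) + 0.5) / n
kappa = np.full(n, 1000.0 / n)
S = np.exp(-q * np.abs(x[:, None] - x[None, :]))
s_true = np.linalg.cholesky(S + 1e-10 * np.eye(n)) @ rng.standard_normal(n)
d = rng.poisson(kappa * np.exp(b * s_true))

def hamiltonian(s):
    """H_d[s] of Eq. 90 before Taylor expansion, up to s-independent terms."""
    return 0.5 * s @ np.linalg.solve(S, s) - b * (d @ s) + kappa @ np.exp(b * s)

def newton_target(s):
    """Full Newton update D_s (kappa_s b^2 s + b (d - kappa_s))."""
    kappa_s = kappa * np.exp(b * s)                       # Eq. 95
    rhs = kappa_s * b**2 * s + b * (d - kappa_s)
    # D_s = (S^-1 + diag(kappa_s b^2))^-1 = S (1 + diag(kappa_s b^2) S)^-1
    return S @ np.linalg.solve(np.eye(n) + (kappa_s * b**2)[:, None] * S, rhs)

s_cl = np.zeros(n)
for _ in range(100):
    step = newton_target(s_cl) - s_cl
    if np.max(np.abs(step)) < 1e-10:
        break
    h_old, t = hamiltonian(s_cl), 1.0
    # Damping: halve the step while it would increase the Hamiltonian.
    while hamiltonian(s_cl + t * step) > h_old + 1e-9 * (1.0 + abs(h_old)) and t > 1e-6:
        t *= 0.5
    s_cl = s_cl + t * step

# Residual of the last form of Eq. 94: s_cl = S b (d - kappa exp(b s_cl)).
residual = s_cl - S @ (b * (d - kappa * np.exp(b * s_cl)))

# Naive inversion of Eq. 97, defined only where galaxies were counted.
has_counts = d > 0
s_naive = np.full(n, np.nan)
s_naive[has_counts] = np.log(d[has_counts] / kappa[has_counts]) / b

print(np.max(np.abs(residual)), np.nanstd(s_naive), np.std(s_cl))
```

The damping step reflects the caution that undamped iterations of a non-linear equation may diverge; for the mild bias assumed here full Newton steps are typically accepted.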
The classical solution, however, is more conservative than this naive data inversion, in that there is a damping term, $S^{-1} s_{\rm cl} / ( \kappa\, b )$, which partly compensates the influence of too large data points.

Those equations permit the calculation of the classical solution if suitable numerical regularization schemes are applied, since naive iterations can easily lead to numerical divergences in the non-linear case.

One way of doing this is by turning the classical equation (Eq. 94) into a dynamical system. Its initial conditions are given by a well solvable linear or even trivial problem, to which non-linear complications are added successively during an interval of some pseudo-time. The endpoint of this dynamics is then the required solution. The meaning of the pseudo-time depends on the way it was set up. In any case, it can just be regarded as a mathematical trick to generate a differential equation, which might be easier to solve numerically than the original problem.

For example, a pseudo-time $\tau$ can be introduced by setting $j(\tau) = \tau\, j$. Thus, the information source is successively injected into an initially trivial field state, $s_{\rm cl}(0) = 0$. This allows us to set up a differential equation for $s_{\rm cl}(\tau)$ by taking the time derivative of Eq. 94,

\[
\dot{s}_{\rm cl} = D_{s_{\rm cl}}\, j \quad\text{with}\quad D_{s_{\rm cl}} = \left( S^{-1} + \widehat{ \kappa_{s_{\rm cl}}\, b^2 } \right)^{-1},
\tag{98}
\]

which has to be solved for $s_{\rm cl}(1)$ starting from $s_{\rm cl}(0) = 0$. This equation is very appealing, since it looks like Wiener-filtering an incoming information stream $j$ and

FIG.
1: Poissonian reconstruction of a signal with unit variance and correlation length $q^{-1} = 0.04$, observed with slightly non-linear response ($b = 0.5$, resolution: 513 pixels per unit length, zero-signal galaxy density: 1000 galaxies per unit length). Top: data $d$, signal response $\mu$, and zero-response $\kappa$. Middle: signal $s$, linear Wiener-filter reconstruction $m_0 = D\, j$, its one-sigma error interval $m_0 \pm \hat{D}^{1/2}$, next-order reconstruction $m_1$ according to Eq. 92, and classical solution $s_{\rm cl}$ according to Eq. 94. Although the linear Wiener filter is reconstructing well at most locations, the nonlinear response requires the perturbative corrections present in $m_1$ or the classical solution in regions of high signal strength. Bottom: The residuals, the deviations of $m_0$, $m_1$, $s_{\rm cl}$ from the signal, and the Wiener variance $\pm \hat{D}^{1/2}$.

accumulating the filtered data, while simultaneously tuning the filter $D_{s_{\rm cl}(\tau)}$ to the accumulated knowledge on the signal $s_{\rm cl}(\tau)$ and the thereby implied Poissonian-noise structure. Thus, it is a nice example system for continuous Bayesian learning and also illustrates how different datasets can successively be fused into a single knowledge basis.

Map-making algorithms with a higher fidelity are possible by not only investigating the maximum of the posterior, but by averaging the signal $s$ over the full support of $P(s|d)$. Anyhow, we can assume that a good approximation $t \approx s_{\rm cl}$ to the classical solution can be achieved. Figs. 1 and 2 display classical solutions for slightly and strongly non-linear Poissonian inference problems. Especially the second example shows that the classical solution can be improved in regions of large uncertainty (see region between $x = 0.2$ and $0.5$ in Fig.
2, where apparently better estimators exist) by adding the missing uncertainty-loop diagrams, which contain information about the non-Gaussian structure of the posterior $P(s|d)$ away from $s_{\rm cl}$.

FIG. 2: Poissonian reconstruction of the same signal realization as in Fig. 1 (unit variance and correlation length $q^{-1} = 0.04$), observed now with a strongly non-linear response ($b = 2.5$, resolution: 512 pixels per unit length, zero-signal galaxy density: 100 galaxies per unit length where mask is one) through a complicated mask. Top: data $d$, signal response $\mu$, and zero-response $\kappa$. Middle: signal $s$, classical solution $s_{\rm cl} = m_{T=0}$, intermediate solution $m_{T=0.5}$ and renormalization-based reconstruction $m_{T=1}$ with uncertainty interval $m_{T=1} \pm \hat{D}^{1/2}_{T=1}$, and mask $\kappa / (n_g\, \Delta V)$. The linear Wiener-filter reconstruction $m_0$ as well as its next-order corrected version $m_1$ are not displayed, since they are partly far outside the displayed area. Bottom: Deviations of the three reconstructions from the signal, and the original and the renormalized uncertainty estimates $\pm \hat{D}^{1/2}_{0}$ and $\pm \hat{D}^{1/2}_{T=1}$, respectively. Note that in the regions with many observed galaxies, the high signal-to-noise ratio can be seen in the narrowness of $\hat{D}^{1/2}_{T=1}$, which is significantly smaller than the data-unaffected $\hat{D}^{1/2}_{0}$ at these locations.

E. Uncertainty-loop corrections

Now, we see how the missing uncertainty-loop corrections can be added to the classical solution.
These corrections can be derived from the Hamiltonian of the uncertainty field $\phi = s - t$,
$$ H_t[\phi] = \frac{1}{2}\, \phi^\dagger D_t^{-1} \phi - j_t^\dagger \phi + \kappa_t^\dagger\, g(b\, \phi) + H_{0,t}, \quad \text{where} $$
$$ D_t^{-1} = S^{-1} + b^2\, \hat{\kappa}_t, $$
$$ j_t = b\, (d - \kappa_t) - S^{-1} t, \tag{99} $$
$$ g(x) = e^{x} - 1 - x - \frac{1}{2} x^2 = \sum_{m=3}^{\infty} \frac{x^m}{m!}, $$
and $H_{0,t}$ is a momentarily irrelevant normalization constant. Again, we have permitted a non-zero $j_t$, since $t$ might not be exactly the classical solution.

It is interesting to note that the interaction coefficients in this Hamiltonian, $\lambda^{(m)}_t = \kappa_t\, b^m$, all reflect the expected number of galaxies given the reference field $t$. Thus, the replacement $\kappa_0 \rightarrow \kappa_t$ would provide us with the shifted field Hamiltonian, as defined in Eq. 60, except for the term $-S^{-1} t$ in $j_t$. It turns out that this term is some sort of counter-term, which accumulates the effect of the non-linear interactions.

We see that effective interaction terms arise when relevant parts of the solution are absorbed in the background field $t$. A similar approach is desirable for the loop diagrams. Instead of drawing and calculating all possible loop diagrams, we want to absorb several of them simultaneously into effective coefficients. For each vertex of the Poissonian Hamiltonian with $m$ legs, there exist diagrams in any Feynman expansion in which a number of $n$ simple loops are added to this vertex. Such an $n$-loop enhanced $m$-vertex is given by
$$ [\text{diagram: } m\text{-vertex with } n \text{ loops}] = -\frac{1}{2^n\, n!}\, \lambda^{(m+2n)}_t\, \hat{D}_t^{\,n} = -\frac{1}{2^n\, n!}\, \kappa_t\, b^{m+2n}\, \hat{D}_t^{\,n}. \tag{100} $$
All these diagrams can be re-summed into an effective interaction vertex, via
$$ \lambda^{(m)}_t \rightarrow \lambda'^{(m)}_t = \kappa_t\, b^m \sum_{n=0}^{\infty} \frac{1}{2^n\, n!}\, b^{2n}\, \hat{D}^{n} = \kappa_t\, \exp\!\left( \frac{b^2}{2} \hat{D} \right) b^m = \kappa_{t + \frac{b}{2} \hat{D}}\; b^m = \lambda^{(m)}_{t + \frac{b}{2} \hat{D}}. \tag{101} $$
Thus, this re-summation is effectively equivalent to the replacement
$$ \kappa_t \rightarrow \kappa_{t + b \hat{D}/2}, \tag{102} $$
which reflects the larger expected response to a reference field $t$ due to the uncertainty fluctuations around it. Those fluctuations pick up the asymmetric shape of the exponential term in the Hamiltonian, where the larger response to positive fluctuations is not fully compensated by the lower response to negative fluctuations. One might wonder whether the simple replacement rule in Eq. 102 could supplement the classical solution with the missing uncertainty-loop corrections. Thus we ask whether the modified classical equation
$$ m = b\, S \left( d - \kappa_{m + b \hat{D}/2} \right) \tag{103} $$
together with a self-consistently determined propagator
$$ D^{-1} = S^{-1} + b^2\, \hat{\kappa}_{m + b \hat{D}/2} \tag{104} $$
could provide the mean field given the data. A more rigorous renormalization calculation will show that this is indeed the case, within some approximation.

The loop-corrected density and propagator permit the construction of estimators for the dark matter density itself,
$$ \varrho = \varrho_0\, e^{c\, s}, \tag{105} $$
instead of its logarithm, $s$. Here $c$ fixes the relation between $s$ and $\varrho$, and $\varrho_0$ is the cosmic median dark matter density. Translating our log-density map into the density results in the naive density estimator
$$ m^{\rm naive}_{\varrho} = \varrho_0\, e^{c\, m}, \tag{106} $$
which is not optimal in the sense of minimal rms deviations. The proper estimator would be
$$ m_{\varrho} = \langle \varrho_0\, e^{c\, s} \rangle_{(s|d)} = \varrho_0\, e^{c\, m + c^2 \hat{D}/2}, \tag{107} $$
which contains uncertainty-loop corrections accounting for the shift of the mean under the non-linear transformation between log-density and density.

F. Response renormalization

Since we are dealing with a $\phi^{\infty}$-field theory, the zoo of loop diagrams is quite complex, and forms something like a Feynman foam.
In order not to get stuck in the multitude of this foam, we urgently require a trick to keep either the maximal order of the diagrams low, or to limit the number of vertices per diagram, or both. We have basically two handles on any interaction term $\lambda_n = \kappa\, b^n$: the bias $b$ and the zero-response $\kappa$. We concentrate on the response, since it enters the Hamiltonian in a linear way and also the data can be regarded to be proportional to $\kappa$. Thus, the full Hamiltonian
$$ H[s] = \frac{1}{2}\, s^\dagger S^{-1} s - b\, d^\dagger s + \kappa_0^\dagger\, e^{b\, s} \tag{108} $$
can be regarded to be proportional to the response, except for the prior term and also constant terms, which we immediately drop here and in the following.

Let us assume that prior to any data analysis we have an initial guess $m_0$ for the signal with some Gaussian uncertainty characterized by the covariance $D_0$. This can be expressed via a Hamiltonian of the form
$$ H_0[s] = \frac{1}{2}\, (s - m_0)^\dagger D_0^{-1} (s - m_0), \tag{109} $$
which defines a probability density via $P_0(s) \propto \exp(-H_0[s])$. In case the prior should be our initial guess, we have $m_0 = 0$ and $D_0 = S$, but we need not restrict ourselves to this case. Now, we want to anticipate step by step the information of the full problem, and forget our initial guess with the same rate. This can be modeled by adopting an affine parameter $\tau$, which measures how much we exposed ourselves to the full problem. For each $\tau$, which we regard as a pseudo-time, our knowledge state is described by a Hamiltonian $H_\tau$. Increasing $\tau$ by some small amount $\varepsilon$ should therefore lead to the next knowledge state characterized by
$$ H_{\tau + \varepsilon} = H_\tau[s] + \varepsilon \left( H[s] - H_\tau[s] \right). \tag{110} $$

FIG. 3: The original propagator $D_0 = (S^{-1} + \widehat{\kappa_0 b^2})^{-1}$ (left) and the final propagator of the renormalization flow $D$ (Eq. 117, right) in logarithmic grey scaling for the data displayed in Fig. 2.
The values of the diagonals show the local uncertainty variance (in Gaussian approximation) before ($\hat{D}_0$) and after ($\hat{D}$) the data is analyzed, respectively. The bottom left and top right corners exhibit non-vanishing propagator values due to the assumed periodic spatial coordinate, which puts these corners close to the two others on the matrix diagonal.

Eq. 110 just models an asymptotic approach to the correct Hamiltonian. If the initial guess was the prior, one sees that for infinitesimal steps $\varepsilon$ the knowledge flow corresponds to tuning up all terms proportional to $\kappa$,
$$ H_\tau[s] = \frac{1}{2}\, s^\dagger S^{-1} s + \left( 1 - e^{-\tau} \right) \left( -b\, d^\dagger s + \kappa_0^\dagger\, e^{b\, s} \right) \rightarrow H[s]. $$
This motivates the term response renormalization for this kind of continuous learning system, into which the information source as well as the interactions are fed with the same rate.

The trick for the renormalization procedure is to approximate the knowledge state at each moment $\tau$ to be of Gaussian shape and therefore the Hamiltonian to be free (quadratic in the signal). Thus we set
$$ H_\tau[s] = \frac{1}{2}\, (s - m_\tau)^\dagger D_\tau^{-1} (s - m_\tau), \tag{111} $$
where $m_\tau$ and $D_\tau = (S^{-1} + M_\tau)^{-1}$ are the mean and dispersion of the field given the acquired knowledge at time $\tau$, respectively. These have to be updated when the next learning step is to be performed.

The next Hamiltonian, before it is again replaced by a free one, is
$$ H_{\tau + \varepsilon}[\phi] = \frac{1}{2}\, \phi^\dagger D_\tau^{-1} \phi + \varepsilon \left[ (S^{-1} m_\tau - b\, d)^\dagger \phi - \frac{1}{2}\, \phi^\dagger M_\tau \phi + \kappa_{m_\tau}^\dagger\, e^{b\, \phi} \right] = \frac{1}{2}\, \phi^\dagger D_\tau^{-1} \phi + \varepsilon \sum_{n=1}^{\infty} \frac{1}{n!}\, \lambda_n^\dagger\, \phi^n, \tag{112} $$
if expressed for the momentary uncertainty field $\phi = s - m_\tau$. Here, the perturbative expansion coefficients are given by
$$ \lambda_1 = \kappa_{m_\tau} b + S^{-1} m_\tau - b\, d, $$
$$ \lambda_2 = \kappa_{m_\tau} b^2 - \hat{M}_\tau, \quad \text{and} $$
$$ \lambda_n = \kappa_{m_\tau} b^n \quad \text{for } n > 2, $$
assuming for simplicity that $M_\tau$ is diagonal.
This is a safe restriction, since we will see that for $\tau \rightarrow \infty$ this is the case asymptotically, even for a non-diagonal initial $M_0$. Thus we can require that our initial guess was also of this form.

In order to approximate this Hamiltonian by a free one, we have to calculate the shifted mean field and its connected two-point correlation function, the full propagator. To first order in $\varepsilon$ only leaf diagrams with a single perturbative interaction vertex contribute to the perturbed expectation value of $\phi$:
$$ \langle \phi \rangle^{(\tau + \varepsilon)}_{(s|d)} = [\text{sum of leaf diagrams}] = \varepsilon\, D_\tau \left[ b\, d - S^{-1} m_\tau - b\, \kappa_{m_\tau}\, e^{b^2 \hat{D}_\tau / 2} \right]. \tag{113} $$
Note that only odd interaction terms shift the expectation value $m_{\tau + \varepsilon} = m_\tau + \langle \phi \rangle^{(\tau + \varepsilon)}_{(s|d)}$. The even ones do not exert any net forces in the vicinity of $\phi_\tau = 0$ since they represent a potential which is mirror symmetric about this point.

The renormalized propagator $D_{\tau + \varepsilon}$ is given by the connected two-point correlation function $\langle \phi\, \phi^\dagger \rangle^{(\tau + \varepsilon)}_{(s|d)}$, which is up to linear order in $\varepsilon$
$$ \langle \phi\, \phi^\dagger \rangle^{(\tau + \varepsilon)}_{(s|d)} = [\text{sum of diagrams}] = D_\tau + \varepsilon\, D_\tau \left( M_\tau - b^2\, \kappa_{m_\tau}\, e^{b^2 \hat{D}_\tau / 2} \right) D_\tau. \tag{114} $$
Rewriting this for an update of $M_\tau$ we find up to linear order in $\varepsilon$
$$ M_{\tau + \varepsilon} = (1 - \varepsilon)\, M_\tau + \varepsilon\, b^2\, \kappa_{m_\tau}\, e^{b^2 \hat{D}_\tau / 2}. \tag{115} $$
Taking the limit $\varepsilon \rightarrow 0$ yields the integro-differential system
$$ \dot{m} = D \left( b\, d - b\, \kappa_0\, e^{b\, m + b^2 \hat{D}/2} - S^{-1} m \right), $$
$$ \dot{M} = b^2\, \kappa_0\, e^{b\, m + b^2 \hat{D}/2} - M, \quad \text{and} \tag{116} $$
$$ D = \left( S^{-1} + \hat{M} \right)^{-1}. $$
This converges at a fixed point, which we previously guessed in Eqs. 103 and 104 for our uncertainty-loop enhanced classical equation.

The classical and the renormalization flow fixed-point equations can be unified:
$$ m = b\, S \left( d - \kappa_{m + T b \hat{D}/2} \right), \qquad D = \left( S^{-1} + b^2\, \hat{\kappa}_{m + T b \hat{D}/2} \right)^{-1}, \tag{117} $$
with $T = 0$ and $T = 1$ for the classical and renormalization result, respectively.

The parameter $T$ is more than a pure convenience.
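The flow of Eq. 116 and its fixed point, Eq. 117, can be explored numerically. The following is only a minimal sketch under stated assumptions, not the implementation used for the figures: it uses plain Euler steps in the pseudo-time, a toy exponential prior covariance on a short pixel row, a Wiener-like starting value for the diagonal of $M$, and it places the parameter $T$ directly in the loop term of the flow so that the endpoint satisfies Eq. 117. The function name `response_flow` and all parameter values are hypothetical.

```python
import numpy as np

def response_flow(S, d, kappa, b, T=1.0, eps=0.01, n_steps=4000):
    """Euler-integrate the flow of Eq. 116, with a temperature T in the loop term.

    Returns the mean m, the diagonal of M, and the propagator D at the end of the flow.
    """
    n = len(d)
    identity = np.eye(n)
    m = np.zeros(n)                  # assumed initial guess for the mean
    M = b**2 * kappa * np.ones(n)    # assumed initial diagonal of M (Wiener-like guess)
    for _ in range(n_steps):
        A = identity + S * M[None, :]        # 1 + S diag(M)
        D = np.linalg.solve(A, S)            # D = (S^-1 + diag(M))^-1 without inverting S
        g = kappa * np.exp(b * m + 0.5 * T * b**2 * np.diag(D))  # loop-corrected response
        m_dot = np.linalg.solve(A, S @ (b * (d - g)) - m)        # D (b d - b g - S^-1 m)
        M_dot = b**2 * g - M
        m = m + eps * m_dot
        M = M + eps * M_dot
    D = np.linalg.solve(identity + S * M[None, :], S)
    return m, M, D

# Toy problem: exponential prior covariance on a 1-D pixel row, Poissonian counts.
rng = np.random.default_rng(0)
n, b, kappa = 48, 0.5, 5.0
x = np.arange(n)
S = np.exp(-np.abs(x[:, None] - x[None, :]) / 4.0)
s = np.linalg.cholesky(S) @ rng.standard_normal(n)
d = rng.poisson(kappa * np.exp(b * s)).astype(float)

m_cl, M_cl, D_cl = response_flow(S, d, kappa, b, T=0.0)   # classical (MAP) solution
m_rn, M_rn, D_rn = response_flow(S, d, kappa, b, T=1.0)   # renormalization result
```

Writing $D = (1 + S \hat{M})^{-1} S$ avoids an explicit inversion of the prior covariance, which is a design choice of this sketch. Intermediate values such as `T=0.5` can be passed in the same way.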
If we had introduced a temperature $T$ at the beginning, via $P(d, s|T) = \exp(-H_d[s]/T)$, Eq. 117 would have been the result of the renormalization flow calculation. The classical limit naturally corresponds to the zero-temperature regime, in which the field expectation value is not affected by any uncertainty fluctuations since the system is at its absolute energy minimum.

An example of such reconstructions can be seen in Fig. 2, and its uncertainty structures in Fig. 3. Here, the renormalization equation indeed seems to provide a better result compared to the classical one. However, a statistical comparison of the two reconstructions using 1000 realizations of the signal and data in Fig. 4 shows that there is at most a marginal difference. This may be surprising, since the classical and renormalization solutions are quite distinct, and the latter is always lower than the former. One might therefore ask whether the two are bracketing the correct solution. And indeed, intermediate solutions constructed using $T = 1/2$ perform better than the ones for $T = 0$ and $T = 1$, as can be seen in Fig. 4.

If neither $T = 0$ nor $T = 1$ provides the optimal reconstruction, what would be the right choice? We have to remember that we replaced the probability density function at each step of the renormalization scheme by a Gaussian with the correct mean and dispersion. However, the real probability is not a Gaussian, and therefore our mean field estimator is not optimal. Reconstructions with different $T$ probe the non-Gaussian probability structure with a differently wide Gaussian kernel in phase space, and therefore result in slightly different signal means due to the anharmonic nature of our Hamiltonian.

G.
Uncertainty structure

The remaining uncertainties at the end of the renormalization flow can mainly be read off the renormalized propagator $D$, which we display in Fig. 3 in comparison to the original, un-renormalized one, $D_0$. The renormalized propagator is a much better approximation to the uncertainty dispersion of the signal posterior distribution around the mean map than the original one. One can clearly see that the data imprinted a highly non-uniform structure into the uncertainty pattern visible in the renormalized propagator, with small uncertainties where there were many galaxy counts. Also the density estimator in Eq. 107 benefits from the knowledge of the uncertainty structure contained in the renormalized propagator, as the lower panel of Fig. 4 shows.

The propagators also visualize the effect any additional data would have at different locations. The height and width of the propagator values define, respectively, the strength of the response to, and the distance of information propagation from, an information source.

The structure of $D_0$ is imprinted by the prior and the mask. At the widest locations of $D_0$ the mask blocks any information source and the structure of the signal prior $S$ becomes visible. At locations where the mask is transparent, the reconstruction response per information source is lower, as plenty of information can be expected there. Also the propagator width is smaller, since the individual pieces of information do not need to be propagated that far, thanks to the richer information source density in such regions.

The structure of $D_m$ has additionally imprinted the expected information source density structure given the reconstruction $m$.
The strongly non-linear signal response has led to regions with very high galaxy count rates, which have larger information densities, and therefore lower and narrower information propagators. This implies that any additional galaxy detection in the regions with high galaxy counts will have little impact on the updated map, whereas any additional detected galaxies in low-density regions will change it more strongly. However, the number of additional galaxies per invested observing time will be larger in high-density regions, which may compensate for the lower information-per-galaxy ratio there. It is therefore interesting to look at the observational information content and how it depends on the actual data realization.

FIG. 4: Top: Statistical reconstruction error from 1000 signal and data realizations. Curves are, roughly in order from top (bad performance) to bottom (good performance): error $\delta m_{\rm naive} = \langle (s - m_{\rm naive})^2 \rangle^{1/2}_{(d,s)}$ of the signal-covariance-convolved naive map $m_{\rm naive} = S\, s_{\rm naive}$ (see Eq. 97), expected Wiener uncertainty $\delta \hat{D}_0 = \hat{D}_0^{1/2}$, averaged renormalized uncertainty $\delta \hat{D}_{T=1} = \langle \hat{D}_{T=1} \rangle^{1/2}_{(d,s)}$, error of the classical map $\delta m_{T=0} = \langle (s - m_{T=0})^2 \rangle^{1/2}_{(d,s)}$, error of the renormalized map $\delta m_{T=1} = \langle (s - m_{T=1})^2 \rangle^{1/2}_{(d,s)}$, and error of the intermediate map $\delta m_{T=0.5} = \langle (s - m_{T=0.5})^2 \rangle^{1/2}_{(d,s)}$. The lowest curve without label is $\kappa$. Bottom: Error variance of estimators for the density, $\varrho = e^{s}$, namely $\delta m^{\rm very\ naive}_{\varrho} = \langle (\varrho - e^{m_{\rm naive}})^2 \rangle^{1/2}_{(d,s)}$, $\delta m^{\rm naive}_{\varrho} = \langle (\varrho - m^{\rm naive}_{\varrho})^2 \rangle^{1/2}_{(d,s)}$ and $\delta m_{\varrho} = \langle (\varrho - m_{\varrho})^2 \rangle^{1/2}_{(d,s)}$ (see Eqs. 106 and 107).

H.
Information gain

In case of a free theory, the amount of information depends on the experimental setup and on the prior, but is independent of the data obtained, as we have shown in Sect. IV D 2. This changes if one wants to harvest information in a situation described by a non-linear IFT. There, the amount of information can strongly depend on the actual data.

This is well illustrated by our LSS reconstruction problem. A perturbative calculation of the non-linear information gain is possible if either the bias factor or the signal amplitude, which both control the strength of the non-linear interactions, is small compared to unity (footnote 13). The information gain, as given by Eq. 83, expanded to the first few orders in $b$,
$$ \Delta I_1 = \underbrace{\frac{1}{2}\, \mathrm{Tr} \log\!\left( 1 + S\, \widehat{\kappa b^2} \right)}_{\Delta I_0} + \frac{1}{2} \left( \kappa\, b^3\, \hat{D}_0 \right)^\dagger \left( m_0 + \frac{1}{2}\, b\, ( \hat{D}_0 + m_0^2 ) \right) + \mathcal{O}(b^5), \tag{118} $$
clearly depends on the actual realization of the data. The different fluctuations in the Wiener map $m_0 = D_0\, j$, with $D_0 = (S^{-1} + \widehat{b^2 \kappa})^{-1}$ and $j = b\, (d - \kappa)$, imply positive and negative information density fluctuations.

Footnote 13: The signal amplitude can, for example, be made small by defining the signal of interest to be the cosmic density field, smoothed on a sufficiently large scale ($> 10$ Mpc) so that $\langle s^2 \rangle_{(s)} < 1$.

To conveniently calculate the information gain of the observation in case of a large bias factor, we use the Gaussian approximation of the joint probability function, as provided by the renormalization scheme. Due to the Gaussianity of this approximate solution, we can simply

FIG. 5: Information gain density (the integrands of Eq. 118 and 119) for the two reconstruction examples presented, the only weakly nonlinear one (top, and Fig.
1) and the strongly non-linear one (bottom, and Fig. 2). The renormalization result for $T = 1$ (Eq. 119) and the zero- and first-order perturbative results (Eq. 118) are shown. The information gain depends on the observational sensitivity as well as the actual data. The latter influence is stronger in the non-linear regime, and disappears in linear inference problems.

use the formula for the information gain of a free theory, as given by Eq. 86. This yields
$$ \Delta I_d = \frac{1}{2}\, \mathrm{Tr} \log\!\left( 1 + S\, \widehat{b^2 \kappa \eta} \right), \tag{119} $$
with $\eta = e^{b\, m + \frac{1}{2} b^2 \hat{D}_{T=1}}$ being proportional to the expected number density of galaxies in this region (see Eq. 107). It is obvious here as well that the information gain depends on the data. In regions with higher observed galaxy numbers $\eta$ is larger, and more information is expected to be harvested by further observations. This is illustrated in Fig. 5, where the information gain density, i.e. the individual contributions to the trace in Eq. 119, as well as the first and all terms of Eq. 118 are shown for the cases displayed in Figs. 1 and 2. The approximate Eq. 118 seems to be adequate for $b \ll 1$, but not for our cases of $b = 0.5$ and $2.5$.

The expected benefit of additional observations at location $x$ can also be calculated by differentiating Eq. 119 with respect to $\kappa(x)$. Using Eqs. 117 and 88 we find
$$ \left\langle \frac{\delta I_d}{\delta \kappa_0} \right\rangle_{(\text{new data}|d)} = \frac{1}{2}\, b^2\, \eta \left( 1 + \frac{1}{2}\, \widehat{\kappa_0 b^2 \eta}\; D^2\, b^2 \right)^{-1} \hat{D}. \tag{120} $$
The expected information gain is especially large for observations at locations where the uncertainty $\hat{D}$ is large, where a large number density of galaxies ($\propto \eta$) can be expected, and where strong non-linearities are present ($\propto b^2$). The inverse term caps the maximally available information gain at some level. For the two reconstruction examples given in Figs. 1 and 2 we display the expected information gain as a function of the observing position in Fig. 6.
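The trace in Eq. 119 and its per-pixel contributions (the information gain density of Fig. 5) can be evaluated with a few lines of linear algebra. The sketch below is illustrative only: the function name `information_gain`, the toy prior covariance, the mask, and the stand-in values for $m$ and $\hat{D}$ are assumptions, whereas in an actual analysis $m$ and $\hat{D}$ would come from the solution of Eq. 117 with $T = 1$.

```python
import numpy as np

def information_gain(S, w):
    """Return 0.5 * Tr log(1 + S diag(w)) and its per-pixel contributions."""
    root_w = np.sqrt(w)
    # Symmetric form 1 + W^(1/2) S W^(1/2): it has the same determinant as 1 + S W,
    # and the diagonal of its logarithm serves as the information gain density.
    K = np.eye(len(w)) + root_w[:, None] * S * root_w[None, :]
    eigval, eigvec = np.linalg.eigh(K)
    density = 0.5 * (eigvec**2) @ np.log(eigval)
    return density.sum(), density

# Hypothetical inputs for illustration.
n, b = 32, 0.5
x = np.arange(n)
S = np.exp(-np.abs(x[:, None] - x[None, :]) / 4.0)   # assumed prior covariance
kappa = np.where((x > 10) & (x < 16), 0.0, 5.0)      # zero-response with an unobserved gap
m = np.sin(2.0 * np.pi * x / n)                      # stand-in for the reconstructed map
D_hat = np.full(n, 0.3)                              # stand-in for the uncertainty variance

eta = np.exp(b * m + 0.5 * b**2 * D_hat)
gain_0, density_0 = information_gain(S, b**2 * kappa)         # data-independent part
gain_d, density_d = information_gain(S, b**2 * kappa * eta)   # Eq. 119
```

In this toy setup `density_d` vanishes inside the masked gap, and it differs from `density_0` wherever the stand-in map `m` is non-zero, which mirrors the data dependence of the information gain discussed above.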
It is apparent from the top panel, showing the case of uniform observation coverage, that additional observations are more advantageous at locations where an increased matter density is already identified. The bottom panel, showing the case of a very inhomogeneous observation of strongly nonlinear data, demonstrates that filling observational gaps should have the highest priority. But there again, regions where the extrapolated galaxy density seems to be larger should be preferred, as can be seen from the asymmetric shape of the expected information gain for observations in the gap around $x = 0.2$. In this example, the information harvest of high galaxy density regions can be so large that further observations of the already well observed regions at the boundary of the domain seem to be more advantageous than improving the poorly observed regions around $x = 0.4$, where a low galaxy density is already apparent from the existing data.

FIG. 6: Differential information gain density for the two reconstruction examples presented, the only weakly nonlinear one (top, and Fig. 1) and the strongly non-linear one (bottom, and Fig. 2).

Of course, in order to plan observations in a real case, the dependence of observational costs on location $x$ and on the already achieved zero-response there, $\kappa(x)$, has to be folded into the considerations.

VI. NON-GAUSSIAN CMB FLUCTUATIONS VIA $f_{\rm nl}$-THEORY

A.
Data model

As an IFT example on the sphere $\Omega = S^2$, involving two interacting uncertainty fields, we investigate the so-called $f_{\rm nl}$-theory of local non-Gaussianities in the CMB temperature fluctuations. This problem currently has a high scientific relevance due to the strongly increasing availability of high-fidelity CMB measurements, which make it possible to constrain the physical conditions at very early epochs of the Universe. The relevant references for this topic were provided in Sect. I C 5.

On top of the very uniform CMB sky with a mean temperature $T_{\rm CMB}$, small temperature fluctuations on the level of $\delta T^{\{I,E,B\}}_{\rm obs} / T_{\rm CMB} \sim 10^{-\{5,6,7\}}$ are observed or expected in total intensity (Stokes I) and in polarization E- and B-modes, respectively. The weak B-modes are mainly due to lensing of E-modes and some unknown level of gravity waves. We will disregard them in the following. These CMB temperature fluctuations are believed and observed to follow mostly a Gaussian distribution. However, inflation predicts some level of non-Gaussianity. Some of the secondary anisotropies imprinted by the LSS of the Universe via CMB lensing, the Integrated Sachs-Wolfe and the Rees-Sciama effects should also have imprinted non-Gaussian signatures [215, 216]. The primordial, as well as some of the secondary, CMB temperature fluctuations are a response to the gravitational potential initially seeded during inflation. Since we are interested in primordial fluctuations, we write
$$ d \equiv \delta T^{\{I,E\}}_{\rm obs} / T_{\rm CMB} = R\, \varphi + n, \tag{121} $$
where $\varphi$ is the 3-dimensional, primordial gravitational potential, and $R$ is the response to it of a CMB instrument, observing the induced CMB temperature fluctuations in intensity and E-mode polarization. These are imprinted by a number of effects, like gravitational redshifting, the Doppler effect, and anisotropic Thomson scattering.
In case the data of the instrument are foreground-cleaned and deconvolved all-sky maps (assuming the data processing to be part of the instrument), the response, which translates the 3-d gravitational field into temperature maps, is well known from CMB theory and can be calculated with publicly available codes like cmbfast, camb, and cmbeasy (see Sect. I C 5). The precise form of the response does not matter for a development of the basic concept, and can be inserted later.

Finally, the noise $n$ subsumes all deviations of the measurement from the signal response due to instrumental and physical effects which are not linearly correlated with the primordial gravitational potential, such as detector noise, remnants of foreground signals, but also primordial gravitational wave contributions to the CMB fluctuations.

The small level of non-Gaussianity expected in the CMB temperature fluctuations is a consequence of some non-Gaussianity in the primordial gravitational potential. Despite the lack of a generic non-Gaussian probability function, many of the inflationary non-Gaussianities seem to be well described by a local process, which taints an initially Gaussian random field, $\phi \hookleftarrow P(\phi) = G(\phi, \Phi)$ (with the $\phi$-covariance $\Phi = \langle \phi\, \phi^\dagger \rangle_{(\phi)}$), with some level of non-Gaussianity. A well controllable realization of such a tarnishing operation is provided by a slightly non-linear transformation of $\phi$ into the primordial gravitational potential $\varphi$ via
$$ \varphi(x) = \phi(x) + f_{\rm nl} \left( \phi^2(x) - \langle \phi^2(x) \rangle_{(\phi)} \right) \tag{122} $$
for any $x$. The parameter $f_{\rm nl}$ controls the level and nature of non-Gaussianity via its absolute value and sign, respectively. This means that our data model reads
$$ d = R \left( \phi + f\, ( \phi^2 - \hat{\Phi} ) \right) + n, \tag{123} $$
where we dropped the subscript of $f_{\rm nl}$.
In the following we assume the noise $n$ to be Gaussian with covariance $N = \langle n\, n^\dagger \rangle_{(n)}$ and define as usual $M = R^\dagger N^{-1} R$ for notational convenience (footnote 14).

Footnote 14: Non-Gaussian noise components are in fact expected, and would need to be included into the construction of an optimal $f_{\rm nl}$-reconstruction. However, currently we aim only at outlining the principles and we are furthermore not aware of any traditional $f_{\rm nl}$-estimator constructed while taking such noise into account. And finally, we show at the end how to identify some of such non-Gaussian noise sources by producing $f_{\rm nl}$-maps on the sphere, which can morphologically be compared to known foreground structures, like our Galaxy.

B. Spectrum, bispectrum, and trispectrum

The nonlinearity of the relation between the hidden Gaussian random field $\phi$ and the observable gravitational potential $\varphi$ (Eq. 122) imprints non-Gaussianity into the latter. In order to be able to extract the value of the non-Gaussianity parameter $f$ from any data containing information on $\varphi$, we need to know its statistics at least up to the four-point function, the trispectrum, which we briefly derive with IFT methods.

To that end, it is convenient to define a $\varphi$-moment generating function $Z[J]$ and its logarithm
$$ \log Z[J] = \log \int \mathcal{D}\phi\; P(\phi)\, e^{J^\dagger \varphi(\phi)} \tag{124} $$
$$ = \frac{1}{2}\, J^\dagger \left( \Phi^{-1} - 2\, \widehat{f J} \right)^{-1} J - (f J)^\dagger\, \hat{\Phi} - \frac{1}{2}\, \mathrm{Tr} \left[ \log \left( 1 - 2\, \Phi\, \widehat{f J} \right) \right]. $$
This permits the calculation via $J$-derivatives (see Eqs.
32-35) of the mean
$$ \bar{\varphi} = \langle \varphi \rangle_{(\phi)} = 0, \tag{125} $$
the spectrum (or covariance)
$$ C^{(\varphi)}_{xy} = \langle \varphi_x\, \varphi_y \rangle^{\rm c}_{(\phi)} = \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y \rangle_{(\phi)} = \Phi_{xy} + 2\, f_x\, \Phi^2_{xy}\, f_y, \tag{126} $$
the bispectrum (footnote 15)
$$ B^{(\varphi)}_{xyz} = \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y\, (\varphi - \bar{\varphi})_z \rangle_{(\phi)} = \langle \varphi_x\, \varphi_y\, \varphi_z \rangle^{\rm c}_{(\phi)} $$
$$ = 2 \left[ \Phi_{xy} f_y \Phi_{yz} + \Phi_{yz} f_z \Phi_{zx} + \Phi_{zx} f_x \Phi_{xy} \right] + 8\, \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zx} f_x, \tag{127} $$
and the trispectrum
$$ T^{(\varphi)}_{xyzu} = \langle (\varphi - \bar{\varphi})_x\, (\varphi - \bar{\varphi})_y\, (\varphi - \bar{\varphi})_z\, (\varphi - \bar{\varphi})_u \rangle_{(\phi)} \tag{128} $$
$$ = \Phi_{xy} \Phi_{zu} + \Phi_{xz} \Phi_{yu} + \Phi_{xu} \Phi_{yz} + \langle \varphi_x\, \varphi_y\, \varphi_z\, \varphi_u \rangle^{\rm c}_{(\phi)} $$
$$ = \frac{1}{8}\, \Phi_{xy} \Phi_{zu} + 2\, \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zu} + \Phi_{xy} f_y \Phi_{yz} f_z \Phi_{zu} f_u \Phi_{ux} f_x + 23\ \text{perm.} $$
of the gravitational potential. Since we will investigate the possibility of a spatially varying non-Gaussianity parameter at the end of this section, we keep track of the spatial coordinate of $f$, but for the time being read $f_x = f$.

Footnote 15: Since the bispectrum contains most of the non-Gaussianity signature, we also provide its Fourier-space version, which is well known for the $f_{\rm nl}$-model [e.g. 217]. The bispectrum for $f = \text{const}$, expressed in terms of the $\varphi$-covariance, reads
$$ B^{(\varphi)}_{xyz} = 2 f \left[ C^{(\varphi)}_{xy} C^{(\varphi)}_{yz} + C^{(\varphi)}_{xz} C^{(\varphi)}_{zy} + C^{(\varphi)}_{yx} C^{(\varphi)}_{xz} \right] + \mathcal{O}(f^3). $$
Fourier transforming this yields
$$ B^{(\varphi)}_{k_1 k_2 k_3} = 2 f\, (2\pi)^3\, \delta(k_1 + k_2 + k_3) \left[ P(k_1) P(k_2) + P(k_2) P(k_3) + P(k_3) P(k_1) \right] + \mathcal{O}(f^3), $$
where $P(k)$ is the power spectrum of $\varphi$, which is identical to that of $\phi$ up to $\mathcal{O}(f^2)$.
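The index notation of Eqs. 126 and 127 can be made concrete with a small Monte Carlo experiment. The sketch below is only an illustration under assumed toy values: a hypothetical three-point covariance $\Phi$ and a position-dependent $f$ are chosen by hand, Gaussian samples of $\phi$ are pushed through the local transformation of Eq. 122, and the sample covariance and third moment are compared with the two formulas. Note that $\Phi^2_{xy}$ in Eq. 126 is read here as the element-wise square.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-point example: covariance Phi of the Gaussian field and a varying f.
Phi = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
f = np.array([0.2, -0.15, 0.1])

# Eq. 126: C_xy = Phi_xy + 2 f_x Phi_xy^2 f_y (the square acts element-wise).
C_theory = Phi + 2.0 * np.outer(f, f) * Phi**2

# Eq. 127: three chain terms of order f and one ring term of order f^3.
B_theory = (2.0 * (np.einsum('xy,y,yz->xyz', Phi, f, Phi)
                   + np.einsum('yz,z,zx->xyz', Phi, f, Phi)
                   + np.einsum('zx,x,xy->xyz', Phi, f, Phi))
            + 8.0 * np.einsum('xy,y,yz,z,zx,x->xyz', Phi, f, Phi, f, Phi, f))

# Monte Carlo: draw phi from G(phi, Phi) and apply the local transformation of Eq. 122.
n_samples = 2_000_000
phi = rng.standard_normal((n_samples, 3)) @ np.linalg.cholesky(Phi).T
varphi = phi + f * (phi**2 - np.diag(Phi))

C_mc = varphi.T @ varphi / n_samples
B_mc = np.einsum('ni,nj,nk->ijk', varphi, varphi, varphi) / n_samples
```

Within the sampling noise, `C_mc` and `B_mc` should agree with `C_theory` and `B_theory`. Since the mean of $\varphi$ vanishes (Eq. 125), raw and central moments coincide here.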
The spectrum, bispectrum and trispectrum of our CMB measurement can easily be calculated from the corresponding quantities of the gravitational potential:
$$ C^{(d)} = R\, C^{(\varphi)} R^\dagger + N, \tag{129} $$
$$ B^{(d)}_{\hat{n}_1 \hat{n}_2 \hat{n}_3} = R_{\hat{n}_1 x}\, R_{\hat{n}_2 y}\, R_{\hat{n}_3 z}\, B^{(\varphi)}_{xyz}, $$
$$ T^{(d)}_{\hat{n}_1 \hat{n}_2 \hat{n}_3 \hat{n}_4} = R_{\hat{n}_1 x}\, R_{\hat{n}_2 y}\, R_{\hat{n}_3 z}\, R_{\hat{n}_4 u}\, T^{(\varphi)}_{xyzu} + \left[ \left( R\, C^{(\varphi)} R^\dagger + \frac{1}{8} N \right)_{\hat{n}_1 \hat{n}_2} N_{\hat{n}_3 \hat{n}_4} + 23\ \text{permutations} \right], $$
where $\hat{n}$ denotes the unit vector on the sphere, and we have made use of the assumption of the noise being Gaussian and independent of the signal. In case the noise itself has a bi- or trispectrum, or there is a signal-dependent noise, e.g. due to an incorrect instrument calibration, then more terms have to be added to the expressions. The usually quoted formulae [e.g. 204, 217, 218, 219] can be obtained from Eq. 129 by applying spherical harmonic transformations.

C. CMB-Hamiltonian

Although we are not interested in the auxiliary field $\phi$, it is nevertheless very useful for its marginalization to define its Hamiltonian, which is
$$ H_f[d, \phi] = -\log \left( G(\phi, \Phi)\; G\!\left( d - R\, ( \phi + f\, ( \phi^2 - \hat{\Phi} ) ),\, N \right) \right) = \frac{1}{2}\, \phi^\dagger D^{-1} \phi + H_0 - j^\dagger \phi + \sum_{n=0}^{4} \frac{1}{n!}\, \Lambda^{(n)}[\phi, \ldots, \phi], \quad \text{with} $$
$$ D^{-1} = \Phi^{-1} + R^\dagger N^{-1} R \equiv \Phi^{-1} + M, $$
$$ j = R^\dagger N^{-1} d, $$
$$ \Lambda^{(0)} = j^\dagger ( f \hat{\Phi} ) + \frac{1}{2}\, ( f \hat{\Phi} )^\dagger M\, ( f \hat{\Phi} ), \tag{130} $$
$$ \Lambda^{(1)} = -( f \hat{\Phi} )^\dagger M \quad \text{and} \quad j' = j - \Lambda^{(1)\dagger}, $$
$$ \Lambda^{(2)} = -2\, \widehat{f j'}, $$
$$ \Lambda^{(3)}_{xyz} = \left( M_{xy}\, f_y\, \delta_{yz} + 5\ \text{permutations} \right), $$
$$ \Lambda^{(4)}_{xyzu} = \frac{1}{2} \left( f_x\, \delta_{xy}\, M_{yz}\, \delta_{zu}\, f_u + 23\ \text{permutations} \right), $$
and $H_0$ collects all terms independent of $\phi$ and $f$. The last two tensors should be read without the Einstein sum convention, but with all possible index permutations. Note that this is a non-local theory for $\phi$ in case either the noise covariance or the response matrix is non-diagonal, yielding a non-local $M$ and therefore non-local interactions $\Lambda^{(3)}$ and $\Lambda^{(4)}$.
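The coefficients of Eq. 130, and in particular the reading of the permutation sums in $\Lambda^{(3)}$ and $\Lambda^{(4)}$, can be checked numerically on a small example. The sketch below is an illustration under assumptions: response, noise, prior covariance, $f$ and data are random toy quantities, and the helper name `symmetrize` is hypothetical. It compares the expanded Hamiltonian with the direct evaluation of the two Gaussian exponents, which should differ only by the $\phi$-independent constant $\frac{1}{2} d^\dagger N^{-1} d$ (Gaussian normalizations are dropped on both sides).

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n_sig, n_dat = 3, 4

# Hypothetical small problem: response R, diagonal noise N, prior covariance Phi, varying f.
R = rng.standard_normal((n_dat, n_sig))
N_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, n_dat))
A = rng.standard_normal((n_sig, n_sig))
Phi = A @ A.T + n_sig * np.eye(n_sig)
Phi_hat = np.diag(Phi)
f = rng.uniform(-0.3, 0.3, n_sig)
d = rng.standard_normal(n_dat)

# Coefficients of Eq. 130.
M = R.T @ N_inv @ R
D_inv = np.linalg.inv(Phi) + M
j = R.T @ N_inv @ d
Lam0 = j @ (f * Phi_hat) + 0.5 * (f * Phi_hat) @ M @ (f * Phi_hat)
Lam1 = -M @ (f * Phi_hat)
j_prime = j - Lam1
Lam2 = -2.0 * np.diag(f * j_prime)

def symmetrize(tensor):
    """Sum a tensor over all permutations of its indices."""
    return sum(np.transpose(tensor, p)
               for p in itertools.permutations(range(tensor.ndim)))

eye = np.eye(n_sig)
Lam3 = symmetrize(np.einsum('xy,y,yz->xyz', M, f, eye))
Lam4 = 0.5 * symmetrize(np.einsum('x,xy,yz,zu,u->xyzu', f, eye, M, eye, f))

def hamiltonian_expanded(phi):
    """Right-hand side of Eq. 130 without the constant H_0."""
    return (0.5 * phi @ D_inv @ phi - j @ phi + Lam0 + Lam1 @ phi
            + 0.5 * phi @ Lam2 @ phi
            + np.einsum('xyz,x,y,z->', Lam3, phi, phi, phi) / 6.0
            + np.einsum('xyzu,x,y,z,u->', Lam4, phi, phi, phi, phi) / 24.0)

def hamiltonian_direct(phi):
    """Exponents of the two Gaussians, without their normalization constants."""
    residual = d - R @ (phi + f * (phi**2 - Phi_hat))
    return 0.5 * phi @ np.linalg.solve(Phi, phi) + 0.5 * residual @ N_inv @ residual
```

For any test vector `phi`, the difference `hamiltonian_direct(phi) - hamiltonian_expanded(phi)` should equal `0.5 * d @ N_inv @ d` up to rounding.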
We should note that Babich [220] derived the now traditional $f_\mathrm{nl}$-estimator from a very similar starting point, the log-probability for $\varphi$. The difference between the resulting estimators is not due to the slightly different approaches ($H_f[d, \varphi]$ versus $H_f[d, \phi]$), but due to the frequentist and Bayesian statistics he and we use, respectively.

In case the noise as well as the response is diagonal in position space, as is often assumed for the instrument response of properly cleaned CMB maps, and as is also approximately valid on large angular scales, where the Sachs-Wolfe effect dominates, we have $N_{xy} = \sigma_n^2(x)\, \delta(x - y)$, $R = -3$ [215] for the total intensity fluctuations, and thus $M_{xy} = 9\, \sigma_n^{-2}(x)\, \delta(x - y)$, if we restrict the signal space to the last-scattering surface, which we identify with $S^2$. This permits us to simplify the Hamiltonian to
$$H_f[d, \phi] = \tfrac{1}{2} \phi^\dagger D^{-1} \phi + H_0 - j^\dagger \phi + \sum_{n=0}^{4} \frac{1}{n!} \lambda_n^\dagger \phi^n,$$
with
$$\begin{aligned}
D^{-1} &= \Phi^{-1} + 9\, \widehat{\sigma_n^{-2}}, \\
j' &= j - \lambda_1 = 3\, (3\, \widehat{\Phi} f - d) / \sigma_n^2, \\
\lambda_0 &= 3\, (\widehat{\Phi} / \sigma_n^2)^\dagger \left( \tfrac{3}{2} f^2 \widehat{\Phi} - f d \right), \\
\lambda_2 &= -2 f j', \quad \lambda_3 = 54 f / \sigma_n^2, \quad \text{and} \quad \lambda_4 = 108 f^2 / \sigma_n^2.
\end{aligned} \tag{131}$$
The numerical coefficients of the last two terms may look large. However, these coefficients stand in front of terms of typically $\phi^3 \sim 10^{-15}$ and $\phi^4 \sim 10^{-20}$, which ensures that they are well behaved in any diagrammatic expansion series.

For later usage, we define the Wiener-filter reconstruction of the gravitational potential as $m_0 = D j$.

D. $f_\mathrm{nl}$-evidence and map making

Since we are momentarily not interested in reconstructing the primordial fluctuations, but in extracting knowledge on $f_\mathrm{nl}$, we marginalize the former by calculating the log-evidence $\log P(d | f)$ up to quadratic order in $f$:
$$\log Z_f[d] = \log \int \mathcal{D}\phi\, P(d, \phi | f) = \log \int \mathcal{D}\phi\, e^{-H_f[d, \phi]} = -H_0 - \Lambda^{(0)} + [\text{sum of connected Feynman diagrams; the diagram images are not recoverable from the extracted text}] + O(f^3).$$
(132) We have made use of the fact that the logarithm of the partition sum is provided by all connected diagrams, and that $j'$ contains a term of order $O(f^0)$, $\Lambda^{(2)}$ and $\Lambda^{(3)}$ contain terms of order $O(f^1)$, and $\Lambda^{(4)}$ one of order $O(f^2)$, so that they can appear an unrestricted number of times, twice, and once in diagrams of order up to $O(f^2)$, respectively. Since only 4th-order interactions are involved, an implementation in spherical harmonics space may be feasible using only the 4th-order $C$-coefficients (Eq. B3), which can be calculated by computer algebra. Finally, we recall
$$[\text{diagram}] = \tfrac{1}{2} \log |2\pi D| = \tfrac{1}{2} \mathrm{Tr}(\log(2\pi D)). \tag{133}$$
Although $f$ is not known, the expressions in Eq. 132 proportional to $f$ and $f^2$ can be calculated separately, which permits writing down the Hamiltonian of $f$ if a suitable prior $P(f)$ is chosen,
$$H_d[f] \equiv -\log(P(d | f)\, P(f)) = \tilde{H}_0 + \tfrac{1}{2} f^\dagger \tilde{D}^{-1} f - \tilde{j}^\dagger f + O(f^3), \tag{134}$$
where we collected the linear and quadratic coefficients into $\tilde{j}$ and $\tilde{D}^{-1}$. It is obvious that the optimal $f$-estimator to lowest order is therefore
$$m_f = \langle f \rangle_{(s, f | d)} = \tilde{D}\, \tilde{j}, \tag{135}$$
and its uncertainty variance is just
$$\langle (f - m_f)(f - m_f)^\dagger \rangle_{(s, f | d)} = \tilde{D}. \tag{136}$$
So far, we have assumed $f$ to have a single universal value. However, we can also permit $f$ to vary spatially, or on the sphere of the sky. In the latter case one would expand $f$ as
$$f(x) = \sum_{l=0}^{l_\mathrm{max}} \sum_{m=-l}^{l} f_{lm}\, Y_{lm}(\hat{x}) \tag{137}$$
up to some finite $l_\mathrm{max}$. Then one would recalculate the partition sum, now separately for terms proportional to $f_{lm}$ and $f_{lm} f_{l'm'}$, which are then sorted into the vector and matrix coefficients of $\tilde{j}$ and $\tilde{D}^{-1}$, respectively, according to
$$\tilde{j}_{(lm)} = -\left. \frac{d H_d[f]}{d f_{lm}} \right|_{f=0}, \quad \text{and} \tag{138}$$
$$\tilde{D}^{-1}_{(lm)(l'm')} = \left. \frac{d^2 H_d[f]}{d f_{lm}\, d f_{l'm'}} \right|_{f=0}.$$
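The step from a Hamiltonian $H_d[f]$ to the estimator $m_f = \tilde{D} \tilde{j}$ can be sketched numerically. The example below is an illustration only: it assumes a real-valued, finite-dimensional coefficient vector $f$, a sign convention in which the source enters the Hamiltonian as $-\tilde{j}^\dagger f$ (so that the Gaussian approximation has its mean at $\tilde{D} \tilde{j}$), and it replaces the diagrammatic calculation by central finite differences of a user-supplied function `H`. All names are hypothetical.

```python
import numpy as np

def quadratic_coefficients(H, n, eps=1e-3):
    """Estimate j_tilde = -dH/df and D_tilde^-1 = d^2H/df df at f = 0
    by central finite differences of the scalar function H(f)."""
    steps = np.eye(n) * eps
    grad = np.zeros(n)
    hess = np.zeros((n, n))
    for a in range(n):
        grad[a] = (H(steps[a]) - H(-steps[a])) / (2.0 * eps)
        for b in range(n):
            hess[a, b] = (H(steps[a] + steps[b]) - H(steps[a] - steps[b])
                          - H(-steps[a] + steps[b]) + H(-steps[a] - steps[b])
                          ) / (4.0 * eps**2)
    return -grad, hess

def f_estimate(j_tilde, D_tilde_inv):
    """Posterior mean m_f = D_tilde j_tilde and covariance D_tilde (Eqs. 135, 136)."""
    D_tilde = np.linalg.inv(D_tilde_inv)
    return D_tilde @ j_tilde, D_tilde
```

For an exactly quadratic `H` the finite differences recover the coefficients up to rounding error. For the truncated expansion of Eq. 134 they are only meaningful as long as the $O(f^3)$ terms are negligible over the step size `eps`.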
$f$-map making can then proceed as described above in spherical harmonics space. By comparing the resulting map in angular space to known foreground sources, such as our Galaxy, the level of non-Gaussian contamination due to their imperfect removal from the data may be assessed.

E. Comparison to traditional estimator

We conclude this chapter with a short comparison to traditional $f_\mathrm{nl}$-estimators. To our knowledge, the most developed estimator in the literature is based on the CMB bispectrum, which is the third-order correlation function of the data [e.g. 220, 221, and references in Sect. I C 5]. The IFT-based filter presented here contains terms which are up to fourth order in the data, and therefore can be expected to be of higher accuracy since both methods are supposed to be optimal. Kogo and Komatsu [219] note that the CMB trispectrum should contain significant information on $f_\mathrm{nl}^2$, and may be superior to the bispectrum for non-Gaussianity detection on small angular scales. However, since the trispectrum is insensitive to the sign of $f_\mathrm{nl}$, its actual usage as a proxy is a bit more subtle. In the IFT estimator, any term proportional to $f_\mathrm{nl}^2$ enters the inverse of the propagator $\tilde{D}$, and therefore the trispectrum seems to unfold its $f_\mathrm{nl}$-estimation power mostly in combination with the bispectrum, which drives $\tilde{j}$.

Under which conditions does the traditional estimator emerge from the IFT one? There are three conceptual differences between the estimators, in that the IFT filter can handle inhomogeneous non-Gaussianity, corrects for chance coupling between CMB sky and exposure, and is unbiased with respect to the posterior.
The traditional estimator is usually written as
$$\varepsilon = \frac{1}{N} \int dx\, A(x)\, B^2(x) = \frac{1}{N}\, m_0^\dagger \Phi^{-1} m_0^2, \tag{139}$$
where $B = D j = m_0$ is the Wiener-filter reconstruction of the gravitational potential, $A = \Phi^{-1} B$ is the same, just additionally filtered by the inverse power spectrum, and $N$ is a normalization constant [e.g. 202]. This is fixed by the condition that the estimator should be unbiased with respect to all signal and noise realizations,
$$\begin{aligned}
N &= \langle m_0^\dagger \Phi^{-1} m_0^2 \rangle_{(d, s | f = 1)} \\
&= B^{(\varphi)}_{xyz} \big|_{f=1}\, (M D)_{xu}\, \Phi^{-1}_{uv}\, (D M)_{vy}\, (D M)_{vz} \\
&= 2 \left[ \Phi_{xy} \Phi_{yz} + \Phi_{yz} \Phi_{zx} + \Phi_{zx} \Phi_{xy} \right] (M D)_{xu}\, \Phi^{-1}_{uv}\, (D M)_{vy}\, (D M)_{vz}.
\end{aligned} \tag{140}$$
The first difference between the estimators is obvious, in that the IFT estimator can handle a spatially varying $f(x)$. Therefore, we will only regard spatially constant non-linearity parameters in the following. Since no CMB experiment is able to measure the monopole temperature fluctuation, the response to any spatially homogeneous signal is zero. This means, in the Fourier basis, that $R_{\hat{n}, k=0} = 0$ and therefore $j_{k=0} = M_{k=0, k'} = 0$. Thus, we find for a Universe with homogeneous statistics ($\widehat{\Phi}_{k \neq 0} = 0$) that $\Lambda^{(0)} = \Lambda^{(1)} = 0$, $j' = j$, and $\Lambda^{(2)} = -2 f\, \widehat{j}$, which reduces the number of diagrams we have to calculate.

The IFT estimator is driven by the $f$-information source $\tilde{j}$, which is given by all diagrams which contain terms linear in $f$. There are four of them, yielding
$$\tilde{j} = \frac{1}{f} \left[ \text{four diagrams; images not recoverable from the extracted text} \right] = m_0^\dagger \Phi^{-1} m_0^2 + m_0^\dagger \left[ \Phi^{-1} \widehat{D} - 2\, \widehat{M D} \right], \tag{141}$$
where we used $M = D^{-1} - \Phi^{-1}$ in order to combine the two tree and the two loop diagrams into the first and second term, respectively. The term resulting from the tree diagrams is actually identical to the unnormalised traditional estimator $\varepsilon$ (Eq. 139).
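To make the structure of Eqs. 139 and 141 concrete, the following sketch evaluates the Wiener-filter map $m_0 = D j$, the unnormalised tree term $m_0^\dagger \Phi^{-1} m_0^2$, and the loop correction $m_0^\dagger [\Phi^{-1} \widehat{D} - 2 \widehat{M D}]$ for small dense matrices. It assumes real-valued quantities in a pixel basis, so that $\dagger$ becomes a plain transpose, and it uses explicit matrix inverses, which is only sensible for toy problem sizes. The function names are hypothetical.

```python
import numpy as np

def wiener_filter(Phi, R, N, d):
    """Return m0 = D j together with D and M = R^T N^-1 R."""
    N_inv = np.linalg.inv(N)
    M = R.T @ N_inv @ R
    D = np.linalg.inv(np.linalg.inv(Phi) + M)
    j = R.T @ N_inv @ d
    return D @ j, D, M

def f_source_terms(Phi, R, N, d):
    """Tree and loop contributions to the f-information source of Eq. 141."""
    m0, D, M = wiener_filter(Phi, R, N, d)
    Phi_inv = np.linalg.inv(Phi)
    # tree term: identical to the unnormalised traditional estimator (Eq. 139)
    tree = m0 @ Phi_inv @ m0**2
    # loop term: a hat on a matrix denotes the vector of its diagonal entries
    loop = m0 @ (Phi_inv @ np.diag(D) - 2.0 * np.diag(M @ D))
    return tree, loop
```

The sum `tree + loop` is then the quantity that the $f$-propagator $\tilde{D}$ acts on in Eq. 135, while the traditional estimator keeps only `tree` and divides by a data-independent constant.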
The terms resulting from the loop diagrams vanish for a homogeneous $M$, which a CMB experiment with uniform exposure and constant noise could produce. In case of an inhomogeneous $M$, which is the more realistic case, the loop term does not vanish and corrects for chance correlations between the CMB realization (as seen through $j$) and the noise and response structure of the experiment (as encoded in $M$ and $D$). Creminelli et al. [222] already pointed out that such a linear correction term is necessary in case of an inhomogeneous sky coverage.

Thus, the second difference between the estimators is that the IFT-based one applies a correction for chance correlations of CMB sky and sky exposure and the traditional one does not. This term is absent in the traditional estimator since the latter was constructed as the optimal estimator which is third order in the data. This excluded the loop term, which is linear in the data. An inclusion of this term into the traditional estimator is straightforward and is actually done in the more recent $f_\mathrm{nl}$ measurements [e.g. 223]. The normalization constant $N$ is unaffected by this, since the expectation value of the loop term averaged over all possible signal realizations is zero.

This brings us to the third difference between the estimators, the different normalization. The traditional estimator is normalized by a data-independent constant $N$, whereas the IFT estimator is normalized by a data-dependent term
$$\tilde{D}^{-1} = \frac{1}{\sigma_f^2} + \frac{2}{f^2} \left[ \text{sum of thirteen diagrams; images not recoverable from the extracted text} \right], \tag{142}$$
where only the first three diagrams are data independent and $\sigma_f^2$ is the variance of the prior, which we assume to be $P(f) = G(f, \sigma_f^2)$. The detailed expressions for the different diagrams can be found in Appendix C.
For both estimators, the traditional and the IFT one, the normalization is supposed to guarantee unbiasedness, however with respect to different probability distributions. The traditional estimator is unbiased in the frequentist sense, for an average over all signal $f$ and data realizations. The IFT estimator, in contrast, is unbiased in the Bayesian sense, with respect to the posterior, the probability distribution of all signals given the data. Since the data are given, and not assumed to vary any more after the observation is performed, they can and should affect the normalization constant, which encodes the sensitivity of our non-Gaussianity estimation.

The reason for the IFT normalization constant (or $f$-propagator) to be data dependent can be understood as follows. There are data realizations which are better suited to reveal the presence of non-Gaussianities than others, even if they have identical $\tilde{j}$. Such a dependence of the detectability of an effect on the concrete data realization is common in non-linear Bayesian inference, and was even more prominent in the example of the reconstruction of a log-normal density field in Sect. V.

VII. SUMMARY AND OUTLOOK

Starting with fundamental information-theoretical considerations about the nature of measurements, signals, noise and their relation to a physical reality given a model of the Universe or the system under consideration, we reformulated the inference problem in the language of information field theory (IFT). IFT is actually a statistical field theory. The information field is identified with a spatially distributed signal, which can freely be chosen by the scientist according to needs and technical constraints. The mathematical apparatus of field theory permits dealing with the ensemble of all possible field configurations given the data and prior information in a consistent way.
With this conceptual framework, we derived the Hamiltonian of the theory, showed that the free theory reproduces the well-known results of Wiener-filter theory, and presented the Feynman rules for non-linear, interacting Hamiltonians in general, and in particular cases. The latter are information fields over Fourier and spherical harmonics spaces for inference problems in $\mathbb{R}^n$ and $S^2$, respectively. Our "philosophical" considerations permitted us to argue why the resulting IFTs are usually well normalized, but often non-local. Since the propagator of the theory is closely related to the Wiener filter, for which nowadays efficient numerical algorithms exist as image reconstruction and map-making codes, and the information source term is usually a noise-weighted version of the data, the necessary computational tools are at hand to convert the diagrammatic expressions into well-performing algorithms.

Furthermore, we provided the Boltzmann-Shannon information measure of IFT based on the Helmholtz free energy, thereby highlighting the embedding of IFT in the framework of statistical mechanics.

As examples of the IFT recipe, two concrete IFT problems with cosmological motivation were discussed, which are also intended as blueprints for other inference problems.

The first targeted the problem of reconstructing the spatially continuous cosmic LSS matter distribution from discrete galaxy counts in incomplete galaxy surveys. The resulting algorithm can also be used for image reconstruction with low-number photon statistics, e.g. in low-dose X-ray imaging.

The second example was the design of an optimal method to measure or constrain any possible local non-linearities in the CMB temperature fluctuations. This may serve as a blueprint for statistical monitoring of the linearity of a signal amplifier.
We conclude here with a short outlook on some problems that are accessible to the presented theory.

Many signal inference problems involve the reconstruction of fields without precisely known statistics. Some coefficients in the IFT Hamiltonians may only be phenomenological in nature, and therefore have to be derived from the same data used for the reconstruction itself. This more intricate interplay of parameter and information field can also be incorporated into the IFT framework, as we will show in a subsequent work.

For cosmological applications, along the lines started in this work, clearly more realistic data models need to be investigated. For example, to understand the response in galaxy formation to the underlying dark matter distribution in terms of a realistic, statistical model, to be used in constructing the corresponding IFT Hamiltonian for a dark-matter information field, detailed higher-order correlation coefficients have to be distilled from numerical simulations or semi-analytic descriptions. Also the CMB Hamiltonian may benefit from the inclusion of remnants from the CMB foreground subtraction process, permitting us to gather more solid evidence on fundamental parameters which are hidden in the CMB fluctuations, like the amplitude of non-Gaussianities.

Furthermore, there exist a number of more or less heuristic algorithms for inverse problems, which have proven to serve well under certain circumstances. Reverse engineering of their implicitly assumed priors and data models may permit us to understand better for which conditions they are best suited, as well as how to improve them in case these conditions are not exactly met.

Finally, we are very curious to see whether and how the presented framework may be suitable for inference problems in other scientific fields.
Acknowledgements

It is a pleasure to thank the following people for helpful scientific discussions on various aspects of this work: Simon White on the dangers of perturbation theory, Benjamin Wandelt on the prospects of large-scale structure reconstruction, Jens Jasche on the pleasures and pains of signal processing, Jörg Rachen on the philosophy of science, and André Waelkens on the invariant, but vertiginous theory of isotropic tensors. We thank Cornelius Weig and Henrik Junklewitz for debates on the connection between IFT and QFT. We gratefully acknowledge helpful comments on the manuscript by Marcus Brüggen and Thomas Riller and by three very constructive referees.

APPENDIX A: NOTATION

We briefly summarize our notation of functions in position and Fourier space.

A here usually real, but in principle also complex function $f(x)$ over the $n$-dimensional space is regarded as a vector $f$ in a discrete and finite-dimensional, or continuous and infinite-dimensional Hilbert space. $f$ will denote this vector, independently of the momentarily chosen function basis, be it the real space $f(x) = \langle x | f \rangle$ or the Fourier basis
$$f(k) = \langle k | f \rangle = \int dx\, f(x)\, e^{i k \cdot x}. \tag{A1}$$
Here, the volume integration usually is performed only over a finite domain with volume $V$. This leads to the convention for the origin of the delta function in $k$-space,
$$\delta(0) = \frac{V}{(2\pi)^n}, \tag{A2}$$
and also to a Fourier transformation operator $F = | k \rangle \langle x |$, with $F_{kx} = e^{i k x}$, and its inverse $F^\dagger = | x \rangle \langle k |$, with $F^\dagger_{xk} = e^{-i k x}$. The dagger is used to denote transposed and complex conjugated objects.
We have $(F^\dagger F)_{xy} = 1_{xy}$ as well as $(F F^\dagger)_{kk'} = 1_{kk'}$ for the following definition of the scalar product of two functions $f$ and $g$ in real and Fourier space:
$$f^\dagger g = \langle f | g \rangle = \int dx\, f^*(x)\, g(x) = \int \frac{dk}{(2\pi)^n}\, f^*(k)\, g(k), \tag{A3}$$
where the asterisk denotes complex conjugation. The statistical power spectrum of $f$ is denoted by $P_f(k) = \langle |f(k)|^2 \rangle_{(f)} / V$.

We also introduce for convenience the position-space component-wise product of two functions
$$(f g)(x) \equiv f(x)\, g(x), \tag{A4}$$
which also permits compact notations like
$$(\log f)(x) = \log(f(x)), \quad (f / g)(x) = f(x) / g(x), \tag{A5}$$
and alike. The component-wise product should not be confused with the tensor product of two vectors $(f g^\dagger)(x, y) = f(x)\, g^*(y)$.

The diagonal components of a matrix $M$ in position-space representation form a vector which we denote by
$$\widehat{M} = \mathrm{diag}_x M, \quad \text{with} \quad \widehat{M}_x = M_{xx}. \tag{A6}$$
Similarly, a diagonal matrix in position-space representation, whose diagonal components are given by a vector $f$, will be denoted by
$$\widehat{f} = \mathrm{diag}_x f \quad \text{with} \quad \widehat{f}_{xy} = f_x\, 1_{xy}. \tag{A7}$$
Thus, $\widehat{\widehat{M}} = M$ if and only if $M$ is diagonal, and $\widehat{\widehat{f}} = f$ always.

In our notation a multivariate Gaussian reads:
$$G(s, S) = \frac{1}{|2\pi S|^{1/2}} \exp\left( -\tfrac{1}{2}\, s^\dagger S^{-1} s \right). \tag{A8}$$
Here, $S = \langle s\, s^\dagger \rangle_{(s)}$ denotes the covariance tensor of the Gaussian field $s$, which is drawn from $P(s) = G(s, S)$. If $s$ is statistically homogeneous, $S$ is fully described by the power spectrum $P_s(k)$:
$$S_{k k'} = (2\pi)^n\, \delta(k - k')\, P_s(k), \quad S^{-1}_{k k'} = (2\pi)^n\, \delta(k - k')\, (P_s(k))^{-1}. \tag{A9}$$
The Fourier representation of the trace of a Fourier-diagonal operator,
$$\mathrm{Tr}(A) = \int dx\, A_{xx} = V \int \frac{dk}{(2\pi)^n}\, P_A(k), \tag{A10}$$
is very useful in combination with the following expression for the determinant of a Hermitian matrix,
$$\log |A| = \mathrm{Tr}(\log A).$$
(A11) Furthermore, we usually suppress the dependency of probabilities on the underlying model $I$ and its parameters $\theta$ in our notation. I.e. instead of $P(s | \theta, I)$ we just write $P(s)$ or $P(s | \theta)$ depending on our focus. Here $\theta = (S, N, R, \ldots)$ contains all the parameters of the model, which are assumed to be known within this work.

APPENDIX B: FEYNMAN RULES ON THE SPHERE

Here, we provide the Feynman rules on the sphere. The real-space rules are identical to those of flat spaces, with just the scalar product replaced by the integral over the sphere, etc. In case the problem at hand has an isotropic propagator, which only depends on the distance of two points on the sphere, but not on their location or orientation, the propagator is diagonal if expressed in spherical harmonics $Y_{lm}(x)$. Thanks to the orthogonality relation of spherical harmonics, we have for $x, y \in S^2$
$$(Y Y^\dagger)_{xy} = \sum_{lm} Y_{lm}(x)\, Y^*_{lm}(y) = \delta(x - y) = (1)_{xy} \tag{B1}$$
and
$$(Y^\dagger Y)_{(l,m)(l',m')} = \int dx\, Y^*_{lm}(x)\, Y_{l'm'}(x) = \delta_{ll'}\, \delta_{mm'} = (1)_{(l,m)(l',m')}. \tag{B2}$$
Therefore, we can just insert real-space identity matrices $1 = Y Y^\dagger$ in between any two terms of a real-space diagrammatic expression and assign $Y^\dagger$ to the right, and $Y$ to the left term of it. This way we find the spherical-harmonics Feynman rules, which are very similar to the Fourier-space ones, in that they also require directed propagator lines for proper angular-momentum conservation. For a theory with only local interactions, these read:

1. An open end of a line has external (not summed) angular-momentum quantum numbers $(l, m)$.
2. A line connecting momentum $(l, m)$ with momentum $(l', m')$ corresponds to a propagator between these momenta: $D_{(l,m)(l',m')} = C_D(l)\, \delta_{ll'}\, \delta_{mm'}$, where $C_D(l)$ is the angular power spectrum of the propagator.
3.
A data source vertex is $(j + J - \lambda_1)(l, m)$, where $(l, m)$ is the angular momentum at the data end of the line.
4. A vertex with quantum number $(l_0, m_0)$ with $n_\mathrm{in}$ incoming and $n_\mathrm{out}$ outgoing lines ($n_\mathrm{in} + n_\mathrm{out} > 1$) with momentum labels $(l_1, m_1) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})$ and $(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})$, respectively, is given by $-\lambda_m(l_0, m_0)\; C^{(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})}_{(l_0, m_0) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})}$, where $C$ will be defined in Eq. B3.
5. An internal vertex has internal (summed) angular-momentum quantum numbers $(l', m')$. Summation means a term $\sum_{l'=0}^{\infty} \sum_{m'=-l'}^{l'}$ in front of the expression.
6. The expression gets divided by the symmetry factor of its diagram.

The interaction structure in spherical harmonics space is complicated due to the non-orthogonality of powers and products of the spherical harmonic functions, compared to the Fourier-space case, where any power or product of Fourier-basis functions is again a single Fourier-basis function.

The spherical structure is encapsulated in the coefficients
$$C^{(l'_1, m'_1) \ldots (l'_{n_\mathrm{out}}, m'_{n_\mathrm{out}})}_{(l_0, m_0) \ldots (l_{n_\mathrm{in}}, m_{n_\mathrm{in}})} \equiv \int dx \left( \prod_{i=0}^{n_\mathrm{in}} Y_{l_i m_i}(x) \right) \left( \prod_{i=1}^{n_\mathrm{out}} Y^*_{l'_i m'_i}(x) \right), \tag{B3}$$
which can be expressed in terms of sums and products of Wigner coefficients, thanks to the relations $Y^*_{lm}(x) = Y_{l, -m}(x)$,
$$Y_{l_1 m_1}(x)\, Y_{l_2 m_2}(x) = \sum_{lm} \sqrt{\frac{(2 l_1 + 1)(2 l_2 + 1)(2 l + 1)}{4\pi}} \begin{pmatrix} l_1 & l_2 & l \\ m_1 & m_2 & m \end{pmatrix} \begin{pmatrix} l_1 & l_2 & l \\ 0 & 0 & 0 \end{pmatrix} Y_{lm}(x), \tag{B4}$$
and the orthogonality relation in Eq. B2, to be applied successively in this order. Due to this complication, it is probably most efficient to calculate propagation in spherical harmonics space, but to change back to real space for any interaction vertex of high order.

APPENDIX C: $f_\mathrm{nl}$-PROPAGATOR

We provide in the following the individual terms of the $f_\mathrm{nl}$-propagator in Eq. 142.
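The individual diagrams are all $O(f^2)$ and are given here for the case $f = 1$. The diagram on the left-hand side of each equation is not recoverable from the extracted text and is written as [diagram]:
$$[\text{diagram}] = -\mathrm{Tr}\left[ D^2 M \right] - \tfrac{1}{2}\, \widehat{D}^\dagger M \widehat{D} \tag{C1}$$
$$[\text{diagram}] = \tfrac{1}{2} \left[ 2\, \widehat{D M} + \widehat{D} M \right]^\dagger D \left[ 2\, \widehat{M D} + M \widehat{D} \right] \tag{C2}$$
$$[\text{diagram}] = \mathrm{Tr}\left[ D^2 M D M \right] + 2\, M_{xy} D_{yy'} M_{y'x'} D_{x'y} D_{xx'} \tag{C3}$$
$$[\text{diagram}] = j^\dagger D^2 j \tag{C4}$$
$$[\text{diagram}] = -2\, m^\dagger M D^2 j - 4\, \mathrm{Tr}\left[ \widehat{m}\, D\, \widehat{j}\, D M \right] \tag{C5}$$
$$[\text{diagram}] = m^\dagger M D^2 M m + 4\, \mathrm{Tr}\left[ \widehat{m}\, D\, \widehat{M m}\, D M \right] + 2\, \mathrm{Tr}\left[ \widehat{m}\, D\, (\widehat{m} M + M \widehat{m})\, D M \right] \tag{C6}$$
$$[\text{diagram}] = -2 \left[ 2\, \widehat{D M} + \widehat{D} M \right]^\dagger D\, \widehat{j}\, m \tag{C7}$$
$$[\text{diagram}] = -(m^2)^\dagger M \widehat{D} - 2\, \mathrm{Tr}\left[ \widehat{m} M \widehat{m} D \right] \tag{C8}$$
$$[\text{diagram}] = \left[ 2\, \widehat{D M} + \widehat{D} M \right]^\dagger D \left( 2\, \widehat{m} M m + M m^2 \right) \tag{C9}$$
$$[\text{diagram}] = \tfrac{1}{2} \left( 2\, \widehat{m} M m + M m^2 \right)^\dagger D \left( 2\, \widehat{m} M m + M m^2 \right) \tag{C10}$$
$$[\text{diagram}] = -2\, (m j)^\dagger D \left( M m^2 + 2\, \widehat{m} M m \right) \tag{C11}$$
$$[\text{diagram}] = -\tfrac{1}{2}\, (m^2)^\dagger M m^2 \tag{C12}$$
$$[\text{diagram}] = 2\, (m j)^\dagger D\, (j m) \tag{C13}$$
We used here the conventions $m = D j$ and $(D^2)_{xy} = (D_{xy})^2$ and recall that $\Lambda^{(0)} = \Lambda^{(1)} = 0$, $j' = j$, $\Lambda^{(2)} = -2 f\, \widehat{j}$, $\Lambda^{(3)}_{xyz} = [M_{xy} \delta_{yz} + 5\ \text{perm.}]$, $\Lambda^{(4)}_{xyzu} = \tfrac{1}{2} [\delta_{xy} M_{yz} \delta_{zu} + 23\ \text{perm.}]$.

[1] T. Bayes, Phil. Trans. Roy. Soc. 53, 370 (1763).
[2] C. E. Shannon, Bell System Technical Journal 27, 379 (1948).
[3] C. E. Shannon and W. Weaver, The mathematical theory of communication (Urbana: University of Illinois Press, 1949).
[4] E. T. Jaynes, Physical Review 106, 620 (1957).
[5] E. T. Jaynes, Physical Review 108, 171 (1957).
[6] E. T. Jaynes, in Statistical Physics 3 (1963), p. 181.
[7] E. T. Jaynes, American Journal of Physics 33, 391 (1965).
[8] E. T. Jaynes, IEEE Trans. on Systems Science and Cybernetics SSC-4, 227 (1968).
[9] E. T. Jaynes, in Proc. IEEE, Volume 70 (1982), pp. 939–952.
[10] E. T. Jaynes and R. Baierlein, Physics Today 57, 76 (2004).
[11] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. T. Teller, Journal of Chemical Physics 21, 1087 (1953).
[12] W. K. Hastings, Biometrika 57, 97 (1970).
[13] S. Geman and D. Geman, IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721 (1984).
[14] S. Duan, A. Kennedy, B.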
Pendleton, and D. Roweth, Phys. Lett. B 195, 216 (1987).
[15] K. P. N. Murthy, M. Janani, and B. Shenbga Priya, ArXiv Computer Science e-prints (2005), arXiv:cs/0504037.
[16] M. A. Tanner, Tools for statistical inference (Springer-Verlag, New York, 1996).
[17] R. M. Neal, in Technical Report CRG-TR-93-1 (Dept. of Computer Science, University of Toronto, 1993).
[18] C. P. Robert, The Bayesian choice (Springer-Verlag, New York, 2001).
[19] A. Gelman, J. B. Carlin, H. S. Stern, and D. Rubin, Bayesian data analysis (Chapman & Hall/CRC, Boca Raton, Florida, 2004).
[20] R. A. Aster, B. Brochers, and C. H. Thurber, Parameter estimation and inverse problems (Elsevier Academic Press, London, 2005).
[21] R. Trotta, ArXiv e-prints 0803.4089 (2008), 0803.4089.
[22] N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series (New York: Wiley, 1949).
[23] W. H. Richardson, Journal of the Optical Society of America (1917-1983) 62, 55 (1972).
[24] L. B. Lucy, AJ 79, 745 (1974).
[25] B. R. Frieden, Journal of the Optical Society of America (1917-1983) 62, 511 (1972).
[26] S. F. Gull and G. J. Daniell, Nature (London) 272, 686 (1978).
[27] J. Skilling, A. W. Strong, and K. Bennett, MNRAS 187, 145 (1979).
[28] R. K. Bryan and J. Skilling, MNRAS 191, 69 (1980).
[29] S. F. Burch, S. F. Gull, and J. Skilling, Computer Vision Graphics and Image Processing 23, 113 (1983).
[30] S. F. Gull and J. Skilling, in Indirect Imaging. Measurement and Processing for Indirect Imaging. Proceedings of an International Symposium held in Sydney, Australia, August 30-September 2, 1983. Editor, J.A. Roberts; Publisher, Cambridge University Press, Cambridge, England, New York, NY, 1984. LC # QB51.3.E43 I53 1984. ISBN # 0-521-26282-8 (1983), p. 267.
[31] S. Sibisi, J. Skilling, R. G. Brereton, E. D. Laue, and J. Staunton, Nature (London) 311, 446 (1984).
[32] D. M. Titterington and J. Skilling, Nature (London) 312, 381 (1984).
[33] J. Skilling and R. K. Bryan, MNRAS 211, 111 (1984).
[34] R. K. Bryan and J. Skilling, Journal of Modern Optics 33, 287 (1986).
[35] S. F. Gull, in Maximum Entropy and Bayesian Methods, edited by J. Skilling (Kluwer Academic Publishers, Dordtrecht, 1989), pp. 53–71.
[36] S. F. Gull and J. Skilling, The MEMSYS5 User's Manual (Maximum Entropy Data Consultants Ltd, Royston, 1990).
[37] J. Skilling, in Maximum Entropy and Bayesian Methods, edited by G. J. Erickson, J. T. Rychert, and C. R. Smith (1998), p. 1.
[38] F. S. Kitaura and T. A. Enßlin, MNRAS 389, 497 (2008), 0705.0429.
[39] R. Narayan and R. Nityananda, ARAA 24, 127 (1986).
[40] R. Molina, J. Nunez, F. J. Cortijo, and J. Mateos, Signal Processing Magazine, IEEE 18, 11 (2001).
[41] E. Bertschinger, ApJL 323, L103 (1987).
[42] J. N. Fry, Astrophys. J. 289, 10 (1985).
[43] W. Bialek and A. Zee, Physical Review Letters 58, 741 (1987).
[44] W. Bialek and A. Zee, Physical Review Letters 61, 1512 (1988).
[45] W. Bialek, C. G. Callan, and S. P. Strong, Physical Review Letters 77, 4693 (1996), arXiv:cond-mat/9607180.
[46] P. Stoica, E. G. Larsson, and J. Li, AJ 120, 2163 (2000).
[47] T. Enßlin and M. Frommert, in preparation (2009).
[48] J. C. Lemm, ArXiv Physics e-prints (1999), physics/9912005.
[49] J. C. Lemm and J. Uhlig, Few-Body Systems 29, 25 (2000), arXiv:quant-ph/0006027.
[50] J. C. Lemm, J. Uhlig, and A. Weiguny, Physical Review Letters 84, 2068 (2000), arXiv:cond-mat/9907013.
[51] J. C. Lemm and J. Uhlig, Physical Review Letters 84, 4517 (2000), arXiv:nucl-th/9908056.
[52] J. C. Lemm, Physics Letters A 276, 19 (2000).
[53] J. C. Lemm, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering, edited by A. Mohammad-Djafari (2001), vol. 568 of American Institute of Physics Conference Series, pp. 425–436.
[54] J. C. Lemm, J. Uhlig, and A. Weiguny, European Physical Journal B 20, 349 (2001), arXiv:quant-ph/0005122.
[55] J. C. Lemm, J. Uhlig, and A. Weiguny, European Physical Journal B 46, 41 (2005).
[56] J. C. Lemm, ArXiv Condensed Matter e-prints (1998), cond-mat/9808039.
[57] J. Binney, N. Dowrick, A. Fisher, and M. Newman, The theory of critical phenomena (Oxford University Press, Oxford, UK, ISBN 0-19-851394-1, 1992).
[58] M. E. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory (Westview Press, Boulder, Colorado, ISBN-13 978-0-201-50397-5, 1995).
[59] A. Zee, Quantum field theory in a nutshell (Princeton, NJ: Princeton University Press, ISBN 0691010196, 2003).
[60] S. Matarrese, F. Lucchin, and S. A. Bonometto, ApJL 310, L21 (1986).
[61] Y. B. Zel'dovich, A&A 5, 84 (1970).
[62] J. M. Bardeen, J. R. Bond, N. Kaiser, and A. S. Szalay, Astrophys. J. 304, 15 (1986).
[63] P. J. E. Peebles, The large-scale structure of the universe (Research supported by the National Science Foundation. Princeton, N.J., Princeton University Press, 1980. 435 p.).
[64] N. Kaiser, MNRAS 227, 1 (1987).
[65] P. J. E. Peebles, Astrophys. J. 362, 1 (1990).
[66] F. Bernardeau, ApJL 390, L61 (1992).
[67] S. Zaroubi and Y. Hoffman, Astrophys. J. 462, 25 (1996).
[68] A. J. S. Hamilton, in The Evolving Universe, edited by D. Hamilton (Kluwer Academic Publishers, Dordtrecht, 1998), vol. 231 of Astrophysics and Space Science Library, p. 185.
[69] F. Bernardeau, M. J. Chodorowski, E. L. Lokas, R. Stompor, and A. Kudlicki, MNRAS 309, 543 (1999), astro-ph/9901057.
[70] E. Branchini, L. Teodoro, C. S. Frenk, I. Schmoldt, G. Efstathiou, S. D. M. White, W. Saunders, W. Sutherland, M. Rowan-Robinson, O. Keeble, et al., MNRAS 308, 1 (1999), astro-ph/9901366.
[71] A. Dekel and O. Lahav, Astrophys. J.
520, 24 (1999), astro-ph/9806193.
[72] S. Zaroubi, ArXiv Astrophysics e-prints (2002), astro-ph/0206052.
[73] R. E. Smith, J. A. Peacock, A. Jenkins, S. D. M. White, C. S. Frenk, F. R. Pearce, P. A. Thomas, G. Efstathiou, and H. M. P. Couchman, MNRAS 341, 1311 (2003), arXiv:astro-ph/0207664.
[74] R. Scoccimarro, Phys. Rev. D 70, 083007 (2004), astro-ph/0407214.
[75] V. Springel, S. D. M. White, A. Jenkins, C. S. Frenk, N. Yoshida, L. Gao, J. Navarro, R. Thacker, D. Croton, J. Helly, et al., Nature (London) 435, 629 (2005), arXiv:astro-ph/0504097.
[76] P. Valageas, A&A 421, 23 (2004), arXiv:astro-ph/0307008.
[77] P. Valageas, A&A 476, 31 (2007), arXiv:0706.2593.
[78] P. Valageas, A&A 484, 79 (2008), arXiv:0711.3407.
[79] M. Crocce and R. Scoccimarro, Physical Review D 73, 063519 (2006), arXiv:astro-ph/0509418.
[80] M. Crocce and R. Scoccimarro, Physical Review D 73, 063520 (2006), arXiv:astro-ph/0509419.
[81] P. McDonald, Phys. Rev. D 74, 103512 (2006), arXiv:astro-ph/0609413.
[82] P. McDonald, Phys. Rev. D 74, 129901(E) (2006).
[83] P. McDonald, Phys. Rev. D 75, 043514 (2007), arXiv:astro-ph/0606028.
[84] D. Jeong and E. Komatsu, Astrophys. J. 651, 619 (2006), arXiv:astro-ph/0604075.
[85] D. Jeong and E. Komatsu, ArXiv e-prints 0805.2632 (2008), 0805.2632.
[86] S. Matarrese and M. Pietroni, Journal of Cosmology and Astro-Particle Physics 6, 26 (2007), arXiv:astro-ph/0703563.
[87] J. Gaite and A. Domínguez, Journal of Physics A Mathematical General 40, 6849 (2007), arXiv:astro-ph/0610886.
[88] S. Matarrese and M. Pietroni, Modern Physics Letters A 23, 25 (2008), arXiv:astro-ph/0702653.
[89] T. Matsubara, Phys. Rev. D 77, 063530 (2008), arXiv:0711.2521.
[90] M. Pietroni, ArXiv e-prints 0806.0971 (2008), 0806.0971.
[91] E. Bertschinger and A. Dekel, ApJL 336, L5 (1989).
[92] E. Bertschinger and A. Dekel, in ASP Conf. Ser.
15: L ar ge-Sc ale Structur es and Pe culiar Motions in the Uni- verse , edited by D. W. Latham and L. A . N. da Costa (1991), p. 67. [93] P . J. E. P eebles, ApJL 344 , L53 (1989). [94] A. D ek el, E. Bertschinger, and S . M. F aber, Astrophys. J. 364 , 349 (1990). [95] N. Kai ser an d A. S tebbins, in ASP C onf. Ser. 15: L ar ge- Sc ale Structur es and Pe culiar Motions in the Universe , edited by D. W . Latham and L. A . N . d a Costa (1991), p. 111. [96] Y. Hoffman and E. Ribak, ApJL 380 , L5 (1991). [97] D. H. W einberg, MNRAS 254 , 315 (1992). [98] A. Nusser and A. Dekel, Astroph ys. J . 391 , 443 (1992). [99] G. B. Ry bic k i and W. H . Press, Astrophys. J. 398 , 169 (1992). [100] M. Gramann, Astrophys. J. 405 , 449 (1993). [101] G. Ganon and Y. H offman, A pJL 415 , L5 (1993). [102] F. Bernardeau, A& A 291 , 697 (199 4), astro- ph/9403020. [103] A. Nu sser and M. Davis, ApJL 421 , L1 (1994), astro- ph/9309009. [104] O. Lahav, in ASP Conf. Ser. 67: Unveili ng L ar ge- Sc ale Structur es Behind the Mil ky W ay , edited by C. Balk o wski and R. C. Kraan-Korteweg (1994), p. 171. [105] O. Lahav, K. B. Fisher, Y. H offman, C. A. Scharf, and S. Zaroubi, ApJL 423 , L93 (1994), astro-ph/9311059. [106] K. B. Fisher, O. Lahav, Y. H offman, D. Ly nden- Bell, and S. Zaroubi, MNRAS 272 , 885 (1995), astro- ph/9406009. [107] R. K. Sheth , MNR AS 277 , 933 (19 95), astro- ph/9511096. [108] S. Zaroubi, Y. H offman, K. B. Fisher, and O. Laha v, Astrophys. J. 449 , 446 (1995), astro-ph/9410080. [109] M. T egmark an d B. C. Bromley, Astrophys. J. 453 , 533 (1995), astro-ph/940903 8. [110] R. A. C. Croft and E. Gaztanaga, MNRAS 285 , 793 (1997), astro-ph/960210 0. [111] V. K. Nara ya nan and D. H. W einberg, Astrophys. J. 508 , 440 (1998), astro-ph/980623 8. [112] U.-L. P en, Astrophys. J. 504 , 601 ( 1998 ), astro- ph/9711180. [113] U. Seljak, Astroph y s. J. 503 , 492 (1998), astro- ph/9710269. [114] U. Seljak, Astrophys. J. 506 , 64 (1998), astro- ph/9711124. [115] V. 
Bistolas and Y. Hoffman, Astrophys. J. 492 , 439 (1998), astro-ph/970724 3. [116] A. T aylor and H . V alentine, MNR AS 306 , 491 (1999), astro-ph/9901171 . [117] V. K. Naray anan and R. A . C. Croft, Astrophys. J. 515 , 471 (1999), astro-ph/980625 5. [118] S. Zaroubi, Y. H offman, and A . Dekel, A stroph ys. J. 520 , 413 (1999), astro-ph/981027 9. [119] D. M. Goldberg and D. N. Sp ergel, in ASP Conf. Ser. 201: C osmic Fl ows W orkshop , edited by S. Courteau and J. Willic k (2000), p. 282. [120] D. M. Goldberg and D . N . Sp ergel, Astrophys. J. 544 , 21 (2000), astro-ph/9912408 . [121] A. Kudlicki, M. Chodorow ski, T. Plew a, and M. R´ o ˙ zyczk a, MNRAS 316 , 464 (2000), astro- ph/9910018. [122] S. Basilak os and M. Plionis, Astrophys. J. 550 , 522 (2001), astro-ph/001126 5. [123] D. M. Goldberg, Astrophys. J. 552 , 413 (2001), astro- ph/0008266. [124] U. F risch, S. Matarrese, R. Mohay aee, and A. Sob olevski, Nature (London) 417 , 260 (2002), arXiv:astro-ph/0109483 . [125] S. Zaroubi, MNRAS 331 , 901 ( 2002 ), astro-ph/001056 1. [126] Y. Brenier, U. F risc h, M. H´ enon, G. Lo ep er, S. Matar- rese, R. Mohay aee, and A. Sob olevski ˘ i, MNR AS 346 , 501 (2003), astro-ph/030421 4. [127] R. Mohay aee, U. F risch, S. Matarrese, and A. Sob olevskii, A&A 406 , 393 (2003), arXiv:astro- ph/0301641. [128] R. Moha yaee , B. T ully, and U. F risc h, ArXiv A stro- physics e-prints (2004), astro-ph/0410063. [129] C. S. Botzler, J. S nigula, R. Bender, and U. Hopp , MN- RAS 349 , 425 ( 2004 ), arXiv:astro-ph/0312018. [130] R. Moha ya ee and R. B. T ully, ApJL 635 , L113 (2005), astro-ph/0509313 . [131] R. Mohay aee, H . Mathis, S. Colom bi, and J. Silk, MN- RAS 365 , 939 ( 2006 ), astro-ph/0501217. [132] V. Ick e and R. v an de W eygaert, qras 32 , 85 (1991). [133] S. Ikeuc hi and E. L. T urner, MNRAS 250 , 519 (1991). [134] F. Bernardeau and R . v an de W ey gaert, MNRAS 279 , 693 (1996). [135] W. E. Schaap and R. 
v an d e W eygaert, A&A 363 , L29 (2000), astro-ph/001100 7. [136] R. v an de W eygaert and W. Schaap, in Mining the Sky , edited by A. J. Banday, S. Zaroubi, and M. Bartelmann (2001), p. 268. [137] M. Ramella, W. Boschin, D. F adda, and M. N onino, A&A 368 , 776 (2001), arXiv:astro-ph/0101411. [138] L. Z aninetti, Chinese Journal of Astronomy and Astro- physics 6 , 387 (2006), arXiv:astro-ph/0602431. [139] E. Bertsc hinger, A. Dekel, S. M. F aber, A. Dressler, and D. Burstein, A strophys. J. 364 , 370 (1990). [140] A. Y ahil, M. A. Strauss, M. Davis, and J. P . H uc hra, Astrophys. J. 372 , 380 (1991). [141] K. B. Fisher, C. A. Sc harf, and O. Laha v , MNRAS 266 , 219 (1994), astro-ph/930902 7. [142] E. J. Shaya , P . J. E. P eebles, an d R. B. T ully, Astro- phys. J. 454 , 15 (1995), astro-ph/9506144 . [143] E. Branc hini, M. Pli onis, and D. W. Sciama , ApJL 461 , L17 (1996), astro-ph/9512055. [144] M. W ebster, O. Lah a v , and K . Fisher, MNRAS 287 , 425 (1997), astro-ph/960802 1. [145] C. Y ess, S. F. Shandarin, and K . B. Fisher, Astrophys. J. 474 , 553 (1997), astro-p h/96050 41. [146] I. M. Schmoldt, V. Saar, P . Sah a, E. Branchini, G. P . Ef- stathiou, C. S. F renk , O. Keeb le, S. Maddox, R. McMa- hon, S. Olive r, et al., Astrophys. J. 118 , 114 6 (1999), astro-ph/9906035 . [147] A. N usser and M. Haehnelt, MNR AS 303 , 179 (1999), astro-ph/9806109 . [148] M. T egmark and B. C. Bromley , The A s- trophysical Journal 518 , L69 (199 9), URL http://www .citebase .org/abst ract?id=oai:arXiv.org:astro- ph / 9 8 0 9 3 2 4 . [149] Y. H offman and S . Zaroubi, ApJL 535 , L5 (2000), astro- ph/0003306. [150] D. M. Goldb erg, Astrophys. J. 550 , 87 (2001), astro- ph/0009046. [151] H. Mathis, G. Lemson, V. Springel, G. Kauffmann , 37 S. D. M. White, A. Eldar, and A. Dekel, MNR AS 333 , 739 (2002), astro-ph/011109 9. [152] P . Erdo˘ gdu, O. Lahav, S. Zaroub i, and et al., MNRAS 352 , 939 (2004), astro-ph/031254 6. [153] M. S. V ogeley , F. Hoyle, R. R. 
Ro jas, and D. M. Gold- b erg, in IAU Col lo q. 195: Outskirts of Galaxy Clus- ters: Intense Life in the Suburbs , edited by A. Diaferio (2004), pp. 5–11. [154] J. Huchra, T . Jarrett, M. Skrutskie, R. Cutri, S. Schnei- der, L. Macri, R. S teining, J. Mader, N. Martim b eau, and T. George, in ASP Conf. Ser. 329: Ne arby L ar ge- Sc ale Structur es and the Zone of A voidanc e , edited by A. P . F airall and P . A. W oudt (2005), p. 135. [155] W. J. Perc iv al, MNR AS 356 , 1168 (2005), astro- ph/0410631. [156] P . Erdo˘ gdu, O. Lahav, J. Hu c h ra, and et al., MNRAS 373 , 45 ( 2006 ), astro-ph/0610005. [157] Y. Hoffman, in ASP Conf. Ser. 67: Unveil ing L ar ge- Sc ale Structur es Behind the Mil ky W ay , edited by C. Balk o wski and R. C. Kraan-Korteweg (1994), p. 185. [158] S. Zaro ubi, in ASP Conf. Ser. 218: Mapping th e Hid- den Universe: T he U ni verse b ehind the Mil y Way - The Universe in HI , ed ited by R. C. K raan-Kortew eg, P . A . Henning, and H. An dernac h (2000), p. 173. [159] R. C. Kraan-Korteweg and O. Laha v , AAPR 10 , 211 (2000), astro-ph/000550 1. [160] J. A. P eaco ck an d S. J. Do dds, MNRAS 267 , 1020 (1994), astro-ph/931105 7. [161] M. S. V ogeley and A. S. Szala y, Astrophys. J. 465 , 34 (1996), astro-ph/960118 5. [162] S. Zaroubi, I . Zeha vi, A. Dekel, Y. Hoffman, and T. Ko- latt, A stroph ys. J. 486 , 21 (1997), astro-ph/961022 6. [163] M. T egmark, Physical Review Letters 79 , 3806 (1997), astro-ph/9706198 . [164] D. J. Eisenstein and W. H u, Astroph ys. J. 511 , 5 (1999), astro-ph/971025 2. [165] G. E fstathiou, J. R . Bond, and S. D. M. White, MNRAS 258 , 1P ( 1992). [166] E. F. Bunn, D. S cott, and M. Wh ite, ApJL 441 , L9 (1995), astro-ph/940900 3. [167] M. A. Janss en and S. Gulkis, in NA TO ASIC Pr o c. 359: The Infr ar e d and Submil l imetr e Sky after COBE , edited by M. Signore an d C. Dup raz (Kluw er Academic Pub- lishers, Dordtrech t, 1992), pp. 391–408 . [168] E. F. Bunn, K. B. Fisher, Y. Hoffman, O. Lahav, J. 
Silk, and S. Zaroubi, Ap JL 432 , L75 (1994), astro- ph/9404007. [169] K. Maisinger, M. P . Hobson, and A. N. Lasenb y , MN- RAS 290 , 313 ( 1997 ). [170] M. T egmark, Phys. Rev . D 56 , 4514 (1997), astro- ph/9705188. [171] M. T egmark, ApJL 480 , L87 ( 1997 ), astro-ph/9611130. [172] S. Do delson, Astrophys. J. 482 , 577 (1997), astro- ph/9512021. [173] M. P . Hobson, A. W. Jones, A. N. Lasenb y, an d F. R. Bouc het, MNRAS 300 , 1 (1998), astro-ph/9806387. [174] P . Natoli, G. de Gasp eris, C. Gheller, and N. Vittorio, A&A 372 , 346 (2001), astro-ph/0101252. [175] O. D or ´ e, R. T eyssier, F. R. Bouchet, D. V ibert, and S. Prunet, A &A 374 , 358 (2001), astro-ph/0101112. [176] R. Stomp or, A. Balbi, J. D. Borrill, P . G. F erreira, S. Hanany, A. H. Jaffe, A. T. Lee, S. Oh, B. Rabii, P . L. Richards, et al., Phys. Rev. D 65 , 022003 (2001), astro-ph/0106451 . [177] B. D. W and elt, D. L. Larson, and A . Lakshmi- nara yanan, Phys. Rev. D 70 , 083511 (2004), astro- ph/0310080. [178] H. K. Eriksen, I. J. O’Dwyer, J. B. Jewel l, B. D. W an- delt, D. L. Larson, K. M. G´ orski, S. Levin, A. J. Ban- day , and P . B. Lilje, ApJS 155 , 227 (2004), astro- ph/0407028. [179] J. Jewell, S. Levin, and C. H. A nderson, Astrophys. J. 609 , 1 (2004), astro-ph /0209 560. [180] D. Yvon and F. Ma yet, A& A 436 , 729 (2005), astro- ph/0401505. [181] E. Keih¨ anen, H. Kurki- Suonio, and T. Po utanen, MN- RAS 360 , 390 ( 2005 ), astro-ph/0412517. [182] E. C. Sutton and B . D. W andelt, ApJS 162 , 401 (2006). [183] D. L. Larson, H. K. Eriksen, B. D. W andelt, K. M. G´ orski, G. Huey, J. B. Jew ell, and I. J. O’Dwyer, A s- trophys. J. 656 , 653 (2007), astro-ph/0608007. [184] G. Hinshaw et al. (WMA P), arXiv 0803.0732 (2008), 0803.07 32. [185] U. Seljak and M. Zaldarriaga, A stroph ys. J. 469 , 437 (1996), arXiv:astro-ph/960303 3. [186] A. Lewis, A. Challinor, and A. Lasen by , A strophys. J. 538 , 473 (2000), astro-ph/9911177. [187] M. 
D oran, Journal of Cosmology and A stro-P article Physics 10 , 11 (2005), arX iv:astro-ph/0302 138. [188] E. F. Bunn and N. Sugiy ama, Astrophys. J. 446 , 49 (1995), astro-ph/940706 9. [189] M. T egmark, A. N. T a ylor, and A. F. Heav en s, Astro- phys. J. 480 , 22 (1997), astro-ph/9603021 . [190] M. T egmark, Phys. Rev. D 55 , 5895 (1997), astro- ph/9611174. [191] M. R. Nolta et al. (WMA P), arXiv 0803.0593 (2008), 0803.05 93. [192] A. H. Guth, Phys. Rev. D 23 , 347 (1981). [193] A. D. Linde, Physics Letters B 108 , 389 (1982). [194] A. Albrech t and P . J. Steinhardt, Physical Review Let- ters 48 , 1220 (1982). [195] A. H. Guth and S.-Y. Pi, Physical Review Letters 49 , 1110 (1982). [196] A. A. Starobinsky, Physi cs Letters B 117 , 175 (1982). [197] J. M. Bardeen, P . J. Steinhardt, and M. S. T urner, Ph ys. Rev. D 28 , 679 (1983). [198] W. Hu, Phys. R ev. D 64 , 083005 (2001), astro- ph/0105117. [199] F. Bernardeau and J.-P . Uzan, Ph ys. Rev. D 66 , 103506 (2002), hep-ph/0207295. [200] N. Ba rtolo, E. Komatsu, S. Matarrese, and A. Riotto, Phys. R ep. 402 , 103 (2004), astro-ph/0406398. [201] D. Babich, P . Creminelli, and M. Zaldarriaga, Journal of Cosmology and Astro-Particle Physics 8 , 9 (2004), arXiv:astro-ph/0405356 . [202] E. Komatsu, B. D. W andelt, D. N. Sp ergel, A. J. Ban- day , and K . M. G´ orski, Astrophys. J. 566 , 19 (2002), arXiv:astro-ph/0107605 . [203] D. Babic h and M. Zaldarriaga, Ph ys. Rev. D 70 , 083005 (2004), arXiv:astro-ph/040845 5. [204] E. Komatsu, D. N. Sp ergel, and B. D. W andelt, Astro- phys. J. 634 , 14 (2005), arXiv:astro-ph/0305189. [205] A. P . S . Y ada v, E. Komatsu, and B. D. W andelt, As- trophys. J. 664 , 680 (2007), arXiv:astro-ph/0701921. [206] A. P . S. Y adav, E. Komatsu, B. D. W andelt, M. Liguori, F. K. Hansen, and S. Matarrese, Astrophys. J. 678 , 578 (2008), arXiv:0711.4 933. [207] E. Komatsu, A. Kogut, M. R. Nolta, C. L. Bennett , 38 M. Halpern, G. Hinshaw, N. Jarosik, M. Limon, S. S . Meye r, L. 
P age, et al., ApJS 148 , 119 (2003), astro- ph/0302223. [208] A. Curto, J. F. Macias-P erez, E. Martinez-Gonzalez, R. B. Barreiro, D. Santos, F. K. Hansen, M. Liguori, and S. Matarrese, ArXiv e-prints 0804.0136 (2008), 0804.01 36. [209] A. P . S . Y adav and B. D. W andelt, Physical Rev iew Letters 100 , 181301 (2008). [210] E. Martinez-Gonzalez, ArXiv e-prints 0805.4157 (2008), 0805.4 157. [211] J. Jasc he, F. S . Kitaura, and T. A. Enssl in, ArXiv e- prints (2009), 0901.3043 . [212] P . Coles and B. Jones, MNRAS 248 , 1 (1991). [213] R. Vio, P . A ndreani, and W. W amsteker, P ASP 113 , 1009 (2001), arXiv:astro-ph/0105107. [214] M. C. Neyrinck, I. Szapu di, and A. S. S zala y, ArXiv e-prints (2009), 0903.4 693. [215] R. K. Sac hs and A. M. W olfe, Astrophys. J . 147 , 73 (1967). [216] M. J. R ees and D. W. Sciama, N ature (London) 217 , 511 (1968). [217] J. R. F ergusson and E. P . S. Shellard, ArXiv e-prints (2008), 0812.3 413. [218] E. K omatsu and D. N. Sp ergel, Phys. Rev . D 63 , 063002 (2001), arXiv:astro-ph/000503 6. [219] N. Kogo and E. Komatsu, Phys. Rev. D 73 , 083007 (2006), arXiv:astro-ph/060209 9. [220] D. Ba bic h, Ph y s. Rev. D 72 , 043003 (2005), arXiv:astro-ph/0503375 . [221] A. F. Heav ens, MNRAS 299 , 805 (199 8), arXiv:astro- ph/9804222. [222] P . Creminelli, A . Nicolis, L. Senatore, M. T egmark, and M. Zaldarriaga , Journal of Cosmology and Astro- P article Physic s 5 , 4 (2006), arXiv:astro-ph/0509029. [223] A. P . Y adav and B. D. W andelt, Ph ys. Rev. D 71 , 123004 (2005), arXiv:astro-ph/050538 6.