In principle determination of generic priors

Cael L. Hasse*
Special Research Centre for the Subatomic Structure of Matter and Department of Physics, University of Adelaide 5005, Australia.
(Dated: July 13, 2018)

Probability theory as extended logic is completed such that essentially any probability may be determined. This is done by considering propositional logic (as opposed to predicate logic) as syntactically sufficient and imposing a symmetry from propositional logic. It is shown how the notions of 'possibility' and 'property' may be sufficiently represented in propositional logic such that 1) the principle of indifference drops out and becomes essentially combinatoric in nature and 2) one may appropriately represent assumptions where one assumes there is a space of possibilities but does not assume the size of the space.

INTRODUCTION

This article is a summation of current and ongoing research. Some work in the literature may not yet be properly considered.

It can be argued that Bayesian probability theory is the general framework for scientific prediction and inference [1, 2]. It is a calculus for normative statements P(A|B), where A and B are propositions. These statements encode degrees of belief/plausibility of A, given B. Fully half of probability theory is encoded into two simple rules¹,

Product rule: P(AB|C) = P(A|C) P(B|AC) = P(B|C) P(A|BC)
Sum rule: P(A|B) + P(Ā|B) = 1,

from which we have a generalised sum rule

P(A + B|C) = P(A|C) + P(B|C) − P(AB|C).

These rules give relationships between different probabilities but do not constrain the probabilities enough to uniquely determine them [14]. This is the incompleteness of probability theory.
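Since the derivations that follow lean on these rules repeatedly, a quick numerical check may help. The sketch below is not from the paper; the toy joint distribution is an arbitrary assumption chosen only so the product rule, sum rule, and generalised sum rule can be verified with exact rational arithmetic.

```python
from fractions import Fraction

# An arbitrary toy joint distribution over two propositions A and B; the
# particular numbers are assumptions of this sketch, chosen only so that
# the rules can be checked exactly with rational arithmetic.
joint = {
    (True, True):   Fraction(1, 8),
    (True, False):  Fraction(3, 8),
    (False, True):  Fraction(2, 8),
    (False, False): Fraction(2, 8),
}

def p(pred):
    return sum(weight for ab, weight in joint.items() if pred(*ab))

def cond(pred, given):
    return p(lambda a, b: pred(a, b) and given(a, b)) / p(given)

A = lambda a, b: a
B = lambda a, b: b

# Product rule: P(AB) = P(A) P(B|A) = P(B) P(A|B)
assert p(lambda a, b: a and b) == p(A) * cond(B, A) == p(B) * cond(A, B)
# Sum rule: P(A) + P(not-A) = 1
assert p(A) + p(lambda a, b: not a) == 1
# Generalised sum rule: P(A + B) = P(A) + P(B) - P(AB)
assert p(lambda a, b: a or b) == p(A) + p(B) - p(lambda a, b: a and b)
```

Any normalised assignment of weights would pass the same checks; that is the sense in which the rules constrain probabilities without determining them.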
¹ The notation we are using corresponds to AB = "A and B", A + B = "A or B", and Ā = "not A".

In particular, in inference we often want to determine the probability of a hypothesis H given data D and perhaps some background knowledge I,

P(H|DI) = P(D|HI) P(H|I) / P(D|I).

Probabilities like P(H|I) are often called 'priors'. These need to be determined to work out P(H|DI). Many methods have been invented to determine priors for different situations. If one is a Subjective Bayesian, where a probability is a degree of belief relative to some agent, the (perhaps counterfactual) agent is free to just choose the value of their priors based on intuition or gut instinct. If one is an Objective Bayesian, one wants to find methods of derivation such that each probability assignation can be considered unique and agent independent (although different agents may make different assumptions and so may still consider different degrees of plausibility; i.e., Objective Bayesians still consider probability theory subjective in one sense). Such methods include Laplace's principle of indifference, transformation group methods, and maximum entropy methods [3]. However, methods such as these are not always of use for calculation of generic priors.

I propose a method whereby in principle any prior may be determined. Moreover, it is a method for calculating essentially² any³ probability. This will be accomplished by completing the Objective Bayesian approach of probability theory as extended logic [1, 4] by imposing a symmetry and treating probability theory as syntactically complete.

LOGIC AND EXTENDED LOGIC

To understand the completion I am proposing, we must understand certain aspects of logic that I propose

² We shall be using a finite sets policy where one always starts with finite sets and only then takes limits. See [1] for detailed motivation.
³ There is also no general method for calculating factors P(D|HI), called likelihoods, unless H predicts D or D̄ for certain; the calculation often reduces to the calculation of a prior.

to impose on probability theory. Consider the following logical argument:

A
B
∴ C

This is not a valid argument. The basic propositions A and B may have meanings for us that are not reflected in the formulation of the argument. For example, we may want the correspondence

A: Socrates is a man.
B: If Socrates is a man, then Socrates is mortal.
C: Socrates is mortal.

A better formulation of the argument will then be to use logical implication A → C instead of B. We then get the new⁴ argument,

A
A → C
∴ C

which is a valid argument. One may see from this a trivial aspect of logic; the validity of an argument is dependent on the structure of the argument. One needs to sufficiently define the structure in order to make an argument that is appropriate. More importantly, the structure of an argument is the only thing the validity is dependent upon. Any influence on the validity of a logical argument beyond the form of the stated argument is extra-logical. One may have extra-logical influences in two ways; the choice of rules that define the logic used may be changed such that one uses a different logic; and one may have some meaning for a proposition in mind that is not defined within the argument. For this second influence, we shall take the position that one has insufficiently represented one's premises and thus the argument is not well formulated. For the first influence, this is a legitimate endeavour. We shall however stick to propositional logic and see how far we can go. An argument is also independent of the labels one uses for the propositions; what is important is logical structure.
⁴ Note here we are not considering uppercase propositions to be propositional variables; within each argument, propositions are constant. It is the arguments that change.

The position we are taking is sometimes called a syntactic logical interpretation of probability theory. Many philosophers take the position that there is meaning for some propositions relevant to scientific inference and prediction that cannot be defined through logical structure alone. These issues and others are discussed in the remarks.

We start by following Cox [4], who derived the product and sum rules from basic desiderata to extend logic. In the system, our primitive objects are degrees of plausibility

A|B,

of A given B, that are equal to real numbers. Probabilities are positive, continuous, monotonic functions, f, of plausibilities,

P(A|B) = f(A|B).

The function, f, is chosen such that certainty corresponds to the number 1. This choice is arbitrary but it leads to the particularly simple forms of the product and sum rules we have given.

We shall consider a degree of plausibility as directly analogous to a logical argument. This means two things:

1. The value of a plausibility (analogous to the validity of an argument) is dependent on only the explicit logical structure. From now on, any non-basic proposition will be written as Z[A_i] or Z[A_1, ..., A_n] as opposed to Z. When calculating a prior for Z[A_i], the product and sum rules will constrain it to be functionally related to probabilities of basic propositions. We may then isolate the probabilities that cannot be calculated by using only the product and sum rules, and the rules of Boolean algebra.

2. Directly related to the above, the value of a degree of plausibility will not depend on the labels used on basic propositions.
This gives us a powerful trick that is fundamental to derivations [1] of indifference and transformation group methods. For example,

P(A_1 | X[A_1, A_2]) = P(A_2 | X[A_2, A_1]).

If X[A_1, A_2] is permutation symmetric, i.e., X[A_1, A_2] = X[A_2, A_1], then

P(A_1 | X[A_1, A_2]) = P(A_2 | X[A_1, A_2]),

which gives us a non-trivial constraint. Relabelling can also be used for individual probabilities separately within a functional relationship. For example,

P(A_1 + A_2 | A_3) = P(A_1 | A_3) + P(A_2 | A_3) − P(A_1 A_2 | A_3) = 2 P(A_1 | A_3) − P(A_1 A_2 | A_3).

EXCLUSIVITY, EXHAUSTIVITY AND INDIFFERENCE

In nearly every probability calculation, exclusivity and exhaustivity for some set of possibilities are (usually implicitly) assumed. Exclusivity means that if you assume that A_i is true, then you may infer that any A_j, in some set {A_i}_{i=1}^n where i ≠ j, must be false. Exhaustivity means that you assume at least one of the set {A_i}_{i=1}^n is true. These assumptions are often parametrised by the conditions

P(A_i A_j | I[A_i]) = P(A_i | I[A_i]) δ_ij;    P(Σ_{i=1}^n A_i | I[A_i]) = 1.

If one wants to write a probability in terms of a mixture of others in the normal way, then it is necessary to make these assumptions;

P(B | X[A_i, B]) = P(B Σ_{i=1}^n A_i | X[A_i, B])
               = Σ_{i=1}^n P(B A_i | X[A_i, B])
               = Σ_{i=1}^n P(A_i | X[A_i, B]) P(B | A_i X[A_i, B]).

If one doesn't assume exclusivity, then one gets extra terms in the above equation. One may consider exclusivity and exhaustivity as the sufficient logical definition of 'possibility' and 'property' [8, 11]. A property may be considered a coarse-grained possibility; for example, any classical observable in Hamiltonian mechanics segments the space of possible states, defining a property.
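The mixture decomposition above can be checked by enumeration. In the sketch below (an illustration added here, not part of the paper's derivation), probabilities are modelled by uniform counting over truth assignments, which anticipates the completion P(A | ) = 1/2 derived later; the premise restricts the worlds to those where A_1, A_2, A_3 are exclusive and exhaustive.

```python
from itertools import product
from fractions import Fraction

# Worlds over (A1, A2, A3, B), restricted by the premise that A1, A2, A3
# are exclusive and exhaustive (exactly one true). Counting the remaining
# worlds uniformly is an assumption of this sketch.
worlds = [w for w in product([False, True], repeat=4) if sum(w[:3]) == 1]

def p(pred, given=lambda w: True):
    den = [w for w in worlds if given(w)]
    return Fraction(sum(1 for w in den if pred(w)), len(den))

B = lambda w: w[3]
# Mixture decomposition: P(B) = sum_i P(A_i) P(B | A_i) under the premise.
mixture = sum(p(lambda w, i=i: w[i]) * p(B, lambda w, i=i: w[i]) for i in range(3))
assert p(B) == mixture
```

Dropping the `sum(w[:3]) == 1` restriction makes the assertion fail, which is the numerical counterpart of the extra terms that appear without exclusivity.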
In particular, the energy of a system is a property, with various values of the energy related to various sets of possible states. The meaning that differentiates different types of properties and possibilities is defined by other logical relationships one assumes. So energy is a classification differentiated by causal structure (i.e., the equations one uses); we assume that if a system has a certain value for its energy at a certain time and is isolated, then the system will have the same energy at a later time; we have logical correlation.

One problem is that the conditions for exclusivity and exhaustivity are not derived from the explicit form of our assumptions I[A_i]. The explicit form of I[A_i] for the simple set {A_1, A_2, A_3} is as follows:

M_3[A_i] = A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3 + Ā_1 Ā_2 Ā_3

for exclusivity and

X_3[A_i] = A_1 A_2 A_3 + A_1 A_2 Ā_3 + A_1 Ā_2 A_3 + Ā_1 A_2 A_3 + A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3

for exhaustivity. This gives us

I_3[A_i] = M_3[A_i] X_3[A_i] = A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3.

These functions have been written in a non-minimal way where the function is a sum of terms, with each term being a product of A_i's and Ā_i's and containing A_i's for all 1 ≤ i ≤ n for some n. In this form - which is often called disjunctive normal form - every term is exclusive by definition. Every propositional function can be written in this form. Each function can then be associated with a subset of the power set of {A_i}_{i=1}^n, {A_i}_p, where each term is associated with an element of {A_i}_p. The sum of terms associated with {A_i}_p is exhaustive by definition. This will give us great flexibility in calculation.
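A truth-table enumeration (a sketch added here, not part of the paper) confirms the explicit forms above: M_3 has four disjunctive-normal-form terms, X_3 has seven, and their product I_3 keeps exactly the three 'exactly one true' worlds.

```python
from itertools import product

# Truth-table check of the explicit forms: a propositional function in
# disjunctive normal form corresponds to the set of worlds where it is true.
worlds = list(product([False, True], repeat=3))

M3 = [w for w in worlds if sum(w) <= 1]   # exclusivity: at most one A_i true
X3 = [w for w in worlds if any(w)]        # exhaustivity: at least one A_i true
I3 = [w for w in M3 if w in X3]           # the product M3 * X3

assert (len(M3), len(X3), len(I3)) == (4, 7, 3)
assert I3 == [w for w in worlds if sum(w) == 1]  # exactly one true
```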
Let us now calculate

P(A_1 | I_3[A_i]) = P(A_1 Ā_2 Ā_3 | ) / [P(A_1 Ā_2 Ā_3 | ) + P(Ā_1 A_2 Ā_3 | ) + P(Ā_1 Ā_2 A_3 | )]
                 = (1/3) × P(A_1 Ā_2 Ā_3 | ) / P(A_1 Ā_2 Ā_3 | )
                 = 1/3,

where we have used relabelling. Probabilities of the form P(· | ) refer to probabilities with minimal assumptions. This will be defined later. Indifference can be seen as fundamentally combinatoric in nature, where the probability is directly related to the number of ways one may assign a single non-negated proposition in a product of propositions. We can generalise to any {A_i}_{i=1}^n in a simple manner.

One aspect of the above derivation is that the probabilities cancel out such that they do not need to be calculated. This is suggestive of why indifference could be derived in the past [1] without needing to go beyond the basic sum and product rules. The above derivation also shows us that exclusivity and exhaustivity are not just necessary but also sufficient in deriving indifference.

DETERMINING GENERIC PROBABILITIES

We define a working set {A_i}_{i=1}^n as the set of all propositions we are working with for a particular probability. This set may be made arbitrarily large:

P(A|B) = P(A|B) P(C + C̄ | AB) = P(A[C + C̄] | B) = P(A | [C + C̄]B).

From this one can see we may add an arbitrary number of tautologies to the premises. One may consider this arbitrariness an important criterion for probability theory; we want our probabilities to be stable under arbitrary additions of tautologies to our assumptions. It is interesting that the product and sum rules give this to us for free. An important thing to note is that we are allowing propositions like C in the conclusions that have no representation in the premises.
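The stability under added tautologies can be illustrated numerically. In this sketch (added here; uniform counting over truth assignments stands in for the completion P(A | ) = 1/2 derived later), conjoining the tautology C + C̄ to the premise B leaves P(A | B) unchanged.

```python
from itertools import product
from fractions import Fraction

# Worlds over (A, B, C); each truth assignment is weighted equally, an
# assumption of this sketch rather than something the identity requires.
worlds = list(product([False, True], repeat=3))

def cond(concl, prem):
    den = [w for w in worlds if prem(w)]
    return Fraction(sum(1 for w in den if concl(w)), len(den))

A = lambda w: w[0]
B = lambda w: w[1]
# Conjoining the tautology (C + not-C) to the premise B changes nothing.
B_with_tautology = lambda w: (w[2] or not w[2]) and w[1]

assert cond(A, B) == cond(A, B_with_tautology)
```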
Probabilities with minimal assumptions may be written as

P(Z[A_1, ..., A_n] | ) = P(Z[A_1, ..., A_n] | Q_n[A_1, ..., A_n]),

where Q_n[A_i] = Π_{i=1}^n (A_i + Ā_i). We may thus consider probabilities P(· | ) as ones either assuming nothing or only tautologies.

Let us now turn our attention to a generic probability P(Z[A_i] | Y[A_i]). We may write

P(Z[A_i] | Y[A_i]) = P(Z[A_i] Y[A_i] | ) / P(Y[A_i] | ).

Both of these factors may be written as sums of terms of the form P(A_1 ... A_i Ā_{i+1} ... Ā_n | ). These terms may be decomposed using the product rule. At this point the derivations of Jaynes [1] and Cox [4] give us no further support. Jaynes appeared [1] (p. 35) to consider probability theory complete, as he expected one to always have background knowledge to determine the terms. Here we are explicitly assuming only tautologies with probabilities P(· | ) and hence cannot rely on such background knowledge. Moreover, as we are aiming at generality, we do not want to rely on such background knowledge.

To determine the terms, consider the following symmetry: The validity of an argument is invariant under swapping a basic proposition A with its negation Ā in both the premises and the conclusions. Moreover, the swapping symmetry A ↔ Ā is a symmetry of logical structure. It may be seen as directly related to the double negation rule; imposing the symmetry on a trivial identity gives us the rule:

Ā = Ā  →  A = Ā̄.

Consider also that a possible state of affairs may be referred to by either A or Ā, with both choices giving equal consequences for argumentation. The two propositions are defined in contrast to one another (their truth values are opposed) and are not distinguished within the system in any other way.
This lack of distinguishing factors is made more apparent when possibility is seen as an explicit extra assumption; the proposition Ā does not by definition mean that a proposition from a set of possibilities, other than A, must be true. Such meaning comes from an assumption I_n[A_i].

I assert that our degrees of plausibility must satisfy the symmetry in order to not introduce an extra-logical bias into our framework. This may be considered part of the desideratum of consistency used in [1].

Consider the notation x_{j,k} = P(A | A_1 ... A_j B̄_1 ... B̄_k). From our symmetry, one may impose⁵

x_{j+1,k} = x_{j,k+1}    ∀ j, k ≥ 0.    (1)

We now prove a lemma: ∀ j, k ≥ 0, if x_{j,k} = P(A | ), then x_{j,k+1} = P(A | ).

Proof: Assume x_{j_0,k_0} = P(A | ) for some j_0, k_0 ≥ 0. Then

P(A | ) = P(A [A_q + Ā_q] | A_1 ... A_{j_0} B̄_1 ... B̄_{k_0})
        = P(A | ) P(A | A_1 ... A_{j_0} A_q B̄_1 ... B̄_{k_0}) + {1 − P(A | )} P(A | A_1 ... A_{j_0} B̄_1 ... B̄_{k_0} Ā_q)
        = P(A | ) x_{j_0+1,k_0} + {1 − P(A | )} x_{j_0,k_0+1}.

Then by eq. (1), x_{j_0+1,k_0} = x_{j_0,k_0+1}, so the right-hand side collapses to x_{j_0,k_0+1}, giving x_{j_0,k_0+1} = P(A | ).

From our lemma and (1), we get by induction from x_{0,0} = P(A | ),

x_{j,k} = P(A | )    ∀ j, k ≥ 0.

Now see

P(A_1 ... A_j B̄_1 ... B̄_k | ) = Π_{l=1}^{j} P(A_l | μ_l) Π_{r=1}^{k} {1 − P(B_r | ν_r)} = P(A | )^j {1 − P(A | )}^k,

where μ_l and ν_r are products of various A's and B's. To determine P(A | ), we impose the symmetry again:

A | = Ā | .

From this condition one arrives at P(A | ) = 1/2. Our generic probability then becomes

P(Z[A_i] | Y[A_i]) = (M/N) 2^{n−m},

where M and N are just the numbers of terms in Z[A_i] Y[A_i] and Y[A_i] respectively when the functions are written in minimal disjunctive normal form.

⁵ This condition is imposed on the plausibilities but may be stated in terms of probabilities.
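Both the closed form just derived and the merely exhaustive case used in the applications can be checked by enumeration. The sketch below is an added illustration: under the completion P(A | ) = 1/2, independently for each basic proposition, every generic probability reduces to a ratio of counts over the 2^n truth assignments.

```python
from itertools import product
from fractions import Fraction

def cond(concl, prem, n):
    """P(concl | prem) over a working set of n propositions, with each of
    the 2**n truth assignments weighted equally (i.e. P(A|) = 1/2)."""
    worlds = list(product([False, True], repeat=n))
    den = [w for w in worlds if prem(w)]
    return Fraction(sum(1 for w in den if concl(w)), len(den))

# Z = A1, Y = I3 (exactly one of three true): ZY has M = 1 minimal-DNF term
# of m = 3 propositions, Y has N = 3 terms of n = 3 propositions, so the
# closed form gives (M/N) * 2**(n - m) = 1/3.
assert cond(lambda w: w[0], lambda w: sum(w) == 1, 3) == Fraction(1, 3)
assert Fraction(1, 3) * 2**(3 - 3) == Fraction(1, 3)

# Merely exhaustive possibilities X_m (at least one of m true, exclusivity
# not assumed): P(A_i | X_m) = 2**(m-1) / (2**m - 1), as in the applications.
for m in range(1, 7):
    assert cond(lambda w: w[0], lambda w: any(w), m) == Fraction(2**(m - 1), 2**m - 1)
```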
Moreover, M can heuristically be thought of as an unnormalised overlap between Z[A_i] and Y[A_i]. The numbers m and n are the numbers of basic propositions in the terms of Z[A_i] Y[A_i] and Y[A_i] respectively, written in minimal disjunctive normal form.

APPLICATIONS

The precision and generality of our scientific statements are directly related to the precision and generality of the language used to make the statements. With the formulation of probability theory I have just proposed, we are able to determine precise probabilistic statements for a greater variety of situations than we were able to before. Immediate applications include situations where we do not or cannot assume exhaustivity and exclusivity:

1. We can calculate probabilities for propositions A_i when we know only that they are one of m exhaustive possibilities;

P(A_i | X_m[A_i]) = 2^{m−1} / (2^m − 1).

2. Consider a situation similar to one presented by Walley [5]: We have a bag of marbles. Suppose we know that they are labelled in a distinguishable way. In particular, they are numbered and we know there is a marble that is labelled with '1'. We know nothing about the number of marbles in the bag (perhaps the bag is magical, with the ability to hold an unlimited number of marbles). We want to know the probability that if we pick a marble from the bag, that marble will be the one labelled with '1'. This will generally depend on our knowledge of how we pick the marble. We are not interested in this particular aspect, and if we know our method of picking cannot discriminate the labels, we may neglect this knowledge for our current purposes. Walley and others have proposed solutions to problems of this sort which go beyond the Bayesian framework.
One sought-after property of a probability in this situation is called regrouping invariance; i.e., it should somehow be invariant to changes in the 'size of the sample space'. This presupposes that our probabilities are defined in terms of 'sample spaces'. Within the framework just proposed, the solution requires only properly stating the salient assumptions; we have positive knowledge that there is a set of exclusive and exhaustive possibilities, we just do not know the size of the set. An appropriate probability will then be of the form

lim_{n→∞} P(A_1 | Σ_{j=1}^n I_j[A_i]).

Note, assuming Σ_{j=1}^n I_j[A_i] does not assume the various sample spaces are exclusive to each other. Exclusivity of sample spaces would require additional assumptions and change the probability. This is just one example of the precise choices we could make in our assumptions, exemplifying the generality of our approach.

3. Quantum theory has severe ontological problems. Our difficulty in solving these problems may be due to an insufficient formulation of probability theory [6]. Most if not all no-go theorems for ontological models of quantum theory [7-11] implicitly assume exclusivity and exhaustivity for the space of ontological states. The framework presented here allows a whole class of models, which do not assume exclusivity and exhaustivity, to be explored.

REMARKS

The probabilistic framework here is considered as a symbolic system rather than a system of functions or measures on a predefined set. The framework is general enough to deal with situations where sets of possibilities are not assumed. The principle of indifference is derived as a consequence of our ability to relabel and the explication of the assumptions we implicitly make to define possibility. Indifference is thus not a principle imposed a priori or arbitrarily.
Probability theory as extended logic is completed by imposing a symmetry from propositional logic. The degree to which one is convinced by the framework proposed here partly depends on whether one is convinced that propositional logic is sufficient for the task of scientific inference. We have seen how one may represent basic notions of possibility and property while still maintaining logical consistency. What propositional logic does not do are universals. I argue that universals are not directly relevant for scientific inference; a scientist would never be able to test the statement 'all⁶ ravens are black'. I propose the notion of universality is related to notions of induction and simplicity.

The framework just proposed does not directly justify induction. This is a good thing. An approach [12] by Carnap - that has similar motivations to the approach here - tries to build induction directly into the framework. One problem is that the inductive predictions do not take into account one's assumptions; whether or not one predicts a sequence to continue at all, and precisely how one predicts this, depends on one's assumptions. Moreover, I submit these things should only depend on one's assumptions; if you make no assumptions you have no reason to predict the continuation of a sequence. One may still perform inductive reasoning given certain assumptions, such as a constant causal mechanism. There is, however, still a problem of induction: One may make valid predictions based on assumptions, but those assumptions may not necessarily be justified.

The Bayesian framework has some built-in notion of simplicity [1] (Ch. 20). Consider two sets of propositional functions we'll call models, Ω_m and Ω_{m+1}, where Ω_m is parametrised by m parameters and Ω_{m+1} by m + 1 parameters.
Suppose the (m+1)'th parameter is θ_{m+1} and the subset Ω_{m+1}|_{θ_{m+1}=0} has a one-to-one correspondence with Ω_m, where each element in both sets is identified with one that produces the same likelihood for some data D[A_i]. We may take Ω_m and Ω_{m+1} as compound models, i.e., models where the parameters are unknown. If the elements are exclusive for both Ω_m and Ω_{m+1}, and the point in the parameter space that gives a maximum likelihood (for data D[A_i]) is near θ_{m+1} = 0 and sharply peaked, then the likelihood for the compound model of Ω_m will generally be greater than the likelihood for Ω_{m+1}; a set of models that predicts the observations as well as another but with fewer parameters will generally be better.

One limitation to this is that one has already chosen the sets of models to consider in a certain way. This has partly to do with one's preferences; do I judge a model with various parameters on the best choice of values of those parameters, or do I judge a model on the total parameter space given to me? The choice also has to do with the choice of using a mathematical framework in the first place. In principle there are an infinite number of propositional functions that one may use as a model that have no discernible or consistent pattern. Can the restriction to propositional functions with consistent patterns be justified? This question becomes manifest in the proposed framework, where we do not rely on calculating things with respect to a predefined set of alternative models; we may ask where those alternatives come from and why.

Note that the framework presented here manifests a primitive notion of simplicity for propositional functions themselves. The probability of some Z[A_i], given no assumptions, is proportional to 2^{−m}, where m is the minimum number of propositions required to write Z[A_i] in disjunctive normal form.
The smaller the value of m, the 'simpler' Z[A_i] is. I speculate that a justification for induction and simplicity comes from an assumption, J[A_i], that restricts the set of propositional functions one may use. This restriction could be justified by epistemological considerations. Models with consistent patterns may then emerge due to combinatoric reasons.

The concept of possibility that is outlined in this article is suggestive of how scientific concepts may be defined generally. Possibility is a pattern of propositions within a model. Crucially, this pattern is not unique; different models with different sizes for possibility spaces will use different patterns (e.g., I_2[A_i] and I_3[A_i] are different). Moreover, the pattern may be nested such that the different possibilities are propositional functions rather than basic propositions. Within this framework, the concept of possibility cannot be defined as a form of classification, in contrast to some other attempts at the definition of a concept [13]. I speculate that concepts like possibility and property may instead be associated with algorithms.

Universality may be defined as a concept. This definition of concept suggests a motivation for its use. Consider an agent with data and assumption J[A_i]. There will likely be an infinite set of models to consider. Calculation for decisions may be computationally intractable. The agent may choose some scheme that best approximates the inferences one would ideally achieve. This scheme could involve algorithms for generating models. It may be the case that the best algorithms come from collections of nested concepts we may call general hypotheses. These general hypotheses may not give unique results but rather generate propositional functions dependent on input. Some of these general hypotheses may be well parametrised by mathematics.
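The 2^{−m} simplicity weighting discussed above can also be checked by enumeration (a sketch added here, with uniform counting over truth assignments standing in for the completion P(A | ) = 1/2): a single-term Z with m basic propositions has P(Z | ) = 2^{−m}, regardless of how large the working set is made.

```python
from itertools import product
from fractions import Fraction

def p_no_assumptions(term, n):
    """P(term | ) over a working set of n propositions, counting the 2**n
    truth assignments uniformly (the completion P(A|) = 1/2)."""
    worlds = list(product([False, True], repeat=n))
    return Fraction(sum(1 for w in worlds if term(w)), len(worlds))

# A single-term Z = A1 A2 has m = 2 basic propositions, so P(Z|) = 2**-2,
# no matter how large the working set n is made.
for n in range(2, 6):
    assert p_no_assumptions(lambda w: w[0] and w[1], n) == Fraction(1, 4)
```

The invariance in n is the same stability under added tautologies noted earlier: enlarging the working set multiplies both the numerator and denominator counts by the same factor.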
Further work is required.

∗ Electronic address: cael.hasse@adelaide.edu.au

[1] E. T. Jaynes, "Probability Theory: The Logic of Science," Cambridge University Press, Cambridge (2003).
[2] B. de Finetti, "Theory of Probability: A Critical Introductory Treatment," John Wiley & Sons Ltd, New York (1974-75).
[3] J. E. Shore and R. W. Johnson, IEEE Transactions on Information Theory, 26 (1980).
[4] R. T. Cox, "The Algebra of Probable Inference," Johns Hopkins Press, Baltimore (1961).
[5] P. Walley, J. R. Statist. Soc. B, 58:3-57 (1996).
[6] C. Fuchs, eprint arXiv:quant-ph/1003.5209v1 (2010).
[7] J. S. Bell, "Speakable and Unspeakable in Quantum Mechanics," Cambridge Univ. Press (1987).
[8] N. Harrigan, R. W. Spekkens, Found. Phys. 40:125-157 (2010).
[9] R. W. Spekkens, Phys. Rev. A, 71:052108 (2005).
[10] L. Hardy, Stud. Hist. Phil. Mod. Phys., 35:267-276 (2004).
[11] M. F. Pusey, J. Barrett, T. Rudolph, Nature Phys., 8:475 (2012).
[12] R. Carnap, "Logical Foundations of Probability," University of Chicago Press (1950).
[13] L. G. Valiant, Communications of the ACM, 27:1134 (1984).
[14] R. T. Cox, Am. J. Phys., 14:1-13 (1946).