In principle determination of generic priors

Cael L. Hasse*
Special Research Centre for the Subatomic Structure of Matter and Department of Physics, University of Adelaide 5005, Australia.
(Dated: July 13, 2018)

Probability theory as extended logic is completed such that essentially any probability may be determined. This is done by considering propositional logic (as opposed to predicate logic) as syntactically sufficient and imposing a symmetry from propositional logic. It is shown how the notions of 'possibility' and 'property' may be sufficiently represented in propositional logic such that 1) the principle of indifference drops out and becomes essentially combinatoric in nature and 2) one may appropriately represent assumptions where one assumes there is a space of possibilities but does not assume the size of the space.

INTRODUCTION

This article is a summation of current and ongoing research. Some work in the literature may not yet be properly considered.

It can be argued that Bayesian probability theory is the general framework for scientific prediction and inference [1, 2]. It is a calculus for normative statements P(A|B), where A and B are propositions. These statements encode degrees of belief/plausibility of A, given B. Fully half of probability theory is encoded into two simple rules¹,

Product rule: P(AB|C) = P(A|C) P(B|AC) = P(B|C) P(A|BC)
Sum rule: P(A|B) + P(Ā|B) = 1,

from which we have a generalised sum rule

P(A + B|C) = P(A|C) + P(B|C) − P(AB|C).

These rules give relationships between different probabilities but do not constrain the probabilities enough to uniquely determine them [14]. This is the incompleteness of probability theory.
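Since the derivations that follow lean on these rules repeatedly, a quick numerical check may help. The sketch below is not from the paper; the toy joint distribution is an arbitrary assumption chosen only so the product rule, sum rule, and generalised sum rule can be verified with exact rational arithmetic.

```python
from fractions import Fraction

# An arbitrary toy joint distribution over two propositions A and B; the
# particular numbers are assumptions of this sketch, chosen only so that
# the rules can be checked exactly with rational arithmetic.
joint = {
    (True, True):   Fraction(1, 8),
    (True, False):  Fraction(3, 8),
    (False, True):  Fraction(2, 8),
    (False, False): Fraction(2, 8),
}

def p(pred):
    return sum(weight for ab, weight in joint.items() if pred(*ab))

def cond(pred, given):
    return p(lambda a, b: pred(a, b) and given(a, b)) / p(given)

A = lambda a, b: a
B = lambda a, b: b

# Product rule: P(AB) = P(A) P(B|A) = P(B) P(A|B)
assert p(lambda a, b: a and b) == p(A) * cond(B, A) == p(B) * cond(A, B)
# Sum rule: P(A) + P(not-A) = 1
assert p(A) + p(lambda a, b: not a) == 1
# Generalised sum rule: P(A + B) = P(A) + P(B) - P(AB)
assert p(lambda a, b: a or b) == p(A) + p(B) - p(lambda a, b: a and b)
```

Any normalised assignment of weights would pass the same checks; that is the sense in which the rules constrain probabilities without determining them.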
¹ The notation we are using corresponds to AB = "A and B", A + B = "A or B", and Ā = "not A".

In particular, in inference we often want to determine the probability of a hypothesis H given data D and perhaps some background knowledge I,

P(H|DI) = P(D|HI) P(H|I) / P(D|I).

Probabilities like P(H|I) are often called 'priors'. These need to be determined to work out P(H|DI). Many methods have been invented to determine priors for different situations. If one is a Subjective Bayesian, where a probability is a degree of belief relative to some agent, the (perhaps counterfactual) agent is free to just choose the value of their priors based on intuition or gut instinct. If one is an Objective Bayesian, one wants to find methods of derivation such that each probability assignation can be considered unique and agent independent (although different agents may make different assumptions and so may still consider different degrees of plausibility; i.e., Objective Bayesians still consider probability theory subjective in one sense). Such methods include Laplace's principle of indifference, transformation group methods, and maximum entropy methods [3]. However, methods such as these are not always of use for calculation of generic priors.

I propose a method whereby in principle any prior may be determined. Moreover, it is a method for calculating essentially² any³ probability. This will be accomplished by completing the Objective Bayesian approach of probability theory as extended logic [1, 4] by imposing a symmetry and treating probability theory as syntactically complete.

LOGIC AND EXTENDED LOGIC

To understand the completion I am proposing, we must understand certain aspects of logic that I propose

² We shall be using a finite sets policy where one always starts with finite sets and only then takes limits. See [1] for detailed motivation.
³ There is also no general method for calculating factors P(D|HI), called likelihoods, unless H predicts D or D̄ for certain; the calculation often reduces to the calculation of a prior.

to impose on probability theory. Consider the following logical argument:

A
B
∴ C

This is not a valid argument. The basic propositions A and B may have meanings for us that are not reflected in the formulation of the argument. For example, we may want the correspondence

A: Socrates is a man.
B: If Socrates is a man, then Socrates is mortal.
C: Socrates is mortal.

A better formulation of the argument will then be to use logical implication A → C instead of B. We then get the new⁴ argument,

A
A → C
∴ C

which is a valid argument. One may see from this a trivial aspect of logic; the validity of an argument is dependent on the structure of the argument. One needs to sufficiently define the structure in order to make an argument that is appropriate. More importantly, the structure of an argument is the only thing the validity is dependent upon. Any influence on the validity of a logical argument beyond the form of the stated argument is extra-logical. One may have extra-logical influences in two ways; the choice of rules that define the logic used may be changed such that one uses a different logic; and one may have some meaning for a proposition in mind that is not defined within the argument. For this second influence, we shall take the position that one has insufficiently represented one's premises and thus the argument is not well formulated. For the first influence, this is a legitimate endeavour. We shall however stick to propositional logic and see how far we can go. An argument is also independent of the labels one uses for the propositions; what is important is logical structure.
⁴ Note here we are not considering uppercase propositions to be propositional variables; within each argument, propositions are constant. It is the arguments that change.

The position we are taking is sometimes called a syntactic logical interpretation of probability theory. Many philosophers take the position that there is meaning for some propositions relevant to scientific inference and prediction that cannot be defined through logical structure alone. These issues and others are discussed in the remarks.

We start by following Cox [4], who derived the product and sum rules from basic desiderata to extend logic. In the system, our primitive objects are degrees of plausibility

A|B,

of A given B, that are equal to real numbers. Probabilities are positive, continuous, monotonic functions, f, of plausibilities,

P(A|B) = f(A|B).

The function, f, is chosen such that certainty corresponds to the number 1. This choice is arbitrary but it leads to the particularly simple forms of the product and sum rules we have given.

We shall consider a degree of plausibility as directly analogous to a logical argument. This means two things:

1. The value of a plausibility (analogous to the validity of an argument) is dependent on only the explicit logical structure. From now on, any non-basic proposition will be written as Z[A_i] or Z[A_1, ..., A_n] as opposed to Z. When calculating a prior for Z[A_i], the product and sum rules will constrain it to be functionally related to probabilities of basic propositions. We may then isolate the probabilities that cannot be calculated by using only the product and sum rules, and the rules of Boolean algebra.

2. Directly related to the above, the value of a degree of plausibility will not depend on the labels used on basic propositions.
This gives us a powerful trick that is fundamental to derivations [1] of indifference and transformation group methods. For example,

P(A_1 | X[A_1, A_2]) = P(A_2 | X[A_2, A_1]).

If X[A_1, A_2] is permutation symmetric, i.e., X[A_1, A_2] = X[A_2, A_1], then

P(A_1 | X[A_1, A_2]) = P(A_2 | X[A_1, A_2]),

which gives us a non-trivial constraint. Relabelling can also be used for individual probabilities separately within a functional relationship. For example,

P(A_1 + A_2 | A_3) = P(A_1 | A_3) + P(A_2 | A_3) − P(A_1 A_2 | A_3) = 2 P(A_1 | A_3) − P(A_1 A_2 | A_3).

EXCLUSIVITY, EXHAUSTIVITY AND INDIFFERENCE

In nearly every probability calculation, exclusivity and exhaustivity for some set of possibilities are (usually implicitly) assumed. Exclusivity means that if you assume that A_i is true, then you may infer that any A_j, in some set {A_i}_{i=1}^n where i ≠ j, must be false. Exhaustivity means that you assume at least one of the set {A_i}_{i=1}^n is true. These assumptions are often parametrised by the conditions

P(A_i A_j | I[A_i]) = P(A_i | I[A_i]) δ_ij;    P(Σ_{i=1}^n A_i | I[A_i]) = 1.

If one wants to write a probability in terms of a mixture of others in the normal way, then it is necessary to make these assumptions;

P(B | X[A_i, B]) = P(B Σ_{i=1}^n A_i | X[A_i, B])
               = Σ_{i=1}^n P(B A_i | X[A_i, B])
               = Σ_{i=1}^n P(A_i | X[A_i, B]) P(B | A_i X[A_i, B]).

If one doesn't assume exclusivity, then one gets extra terms in the above equation. One may consider exclusivity and exhaustivity as the sufficient logical definition of 'possibility' and 'property' [8, 11]. A property may be considered a coarse-grained possibility; for example, any classical observable in Hamiltonian mechanics segments the space of possible states, defining a property.
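The mixture decomposition above can be checked by enumeration. In the sketch below (an illustration added here, not part of the paper's derivation), probabilities are modelled by uniform counting over truth assignments, which anticipates the completion P(A | ) = 1/2 derived later; the premise restricts the worlds to those where A_1, A_2, A_3 are exclusive and exhaustive.

```python
from itertools import product
from fractions import Fraction

# Worlds over (A1, A2, A3, B), restricted by the premise that A1, A2, A3
# are exclusive and exhaustive (exactly one true). Counting the remaining
# worlds uniformly is an assumption of this sketch.
worlds = [w for w in product([False, True], repeat=4) if sum(w[:3]) == 1]

def p(pred, given=lambda w: True):
    den = [w for w in worlds if given(w)]
    return Fraction(sum(1 for w in den if pred(w)), len(den))

B = lambda w: w[3]
# Mixture decomposition: P(B) = sum_i P(A_i) P(B | A_i) under the premise.
mixture = sum(p(lambda w, i=i: w[i]) * p(B, lambda w, i=i: w[i]) for i in range(3))
assert p(B) == mixture
```

Dropping the `sum(w[:3]) == 1` restriction makes the assertion fail, which is the numerical counterpart of the extra terms that appear without exclusivity.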
In particular, the energy of a system is a property, with various values of the energy related to various sets of possible states. The meaning that differentiates different types of properties and possibilities is defined by other logical relationships one assumes. So energy is a classification differentiated by causal structure (i.e., the equations one uses); we assume that if a system has a certain value for its energy at a certain time and is isolated, then the system will have the same energy at a later time; we have logical correlation.

One problem is that the conditions for exclusivity and exhaustivity are not derived from the explicit form of our assumptions I[A_i]. The explicit form of I[A_i] for the simple set {A_1, A_2, A_3} is as follows:

M_3[A_i] = A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3 + Ā_1 Ā_2 Ā_3

for exclusivity and

X_3[A_i] = A_1 A_2 A_3 + A_1 A_2 Ā_3 + A_1 Ā_2 A_3 + Ā_1 A_2 A_3 + A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3

for exhaustivity. This gives us

I_3[A_i] = M_3[A_i] X_3[A_i] = A_1 Ā_2 Ā_3 + Ā_1 A_2 Ā_3 + Ā_1 Ā_2 A_3.

These functions have been written in a non-minimal way where the function is a sum of terms, with each term being a product of A_i's and Ā_i's and containing A_i's for all 1 ≤ i ≤ n for some n. In this form - which is often called disjunctive normal form - every term is exclusive by definition. Every propositional function can be written in this form. Each function can then be associated with a subset of the power set of {A_i}_{i=1}^n, {A_i}_p, where each term is associated with an element of {A_i}_p. The sum of terms associated with {A_i}_p is exhaustive by definition. This will give us great flexibility in calculation.
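A truth-table enumeration (a sketch added here, not part of the paper) confirms the explicit forms above: M_3 has four disjunctive-normal-form terms, X_3 has seven, and their product I_3 keeps exactly the three 'exactly one true' worlds.

```python
from itertools import product

# Truth-table check of the explicit forms: a propositional function in
# disjunctive normal form corresponds to the set of worlds where it is true.
worlds = list(product([False, True], repeat=3))

M3 = [w for w in worlds if sum(w) <= 1]   # exclusivity: at most one A_i true
X3 = [w for w in worlds if any(w)]        # exhaustivity: at least one A_i true
I3 = [w for w in M3 if w in X3]           # the product M3 * X3

assert (len(M3), len(X3), len(I3)) == (4, 7, 3)
assert I3 == [w for w in worlds if sum(w) == 1]  # exactly one true
```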
Let us now calculate

P(A_1 | I_3[A_i]) = P(A_1 Ā_2 Ā_3 | ) / [P(A_1 Ā_2 Ā_3 | ) + P(Ā_1 A_2 Ā_3 | ) + P(Ā_1 Ā_2 A_3 | )]
                 = (1/3) × P(A_1 Ā_2 Ā_3 | ) / P(A_1 Ā_2 Ā_3 | )
                 = 1/3,

where we have used relabelling. Probabilities of the form P(· | ) refer to probabilities with minimal assumptions. This will be defined later. Indifference can be seen as fundamentally combinatoric in nature, where the probability is directly related to the number of ways one may assign a single non-negated proposition in a product of propositions. We can generalise to any {A_i}_{i=1}^n in a simple manner.

One aspect of the above derivation is that the probabilities cancel out such that they do not need to be calculated. This is suggestive of why indifference could be derived in the past [1] without needing to go beyond the basic sum and product rules. The above derivation also shows us that exclusivity and exhaustivity are not just necessary but also sufficient in deriving indifference.

DETERMINING GENERIC PROBABILITIES

We define a working set {A_i}_{i=1}^n as the set of all propositions we are working with for a particular probability. This set may be made arbitrarily large:

P(A|B) = P(A|B) P(C + C̄ | AB) = P(A[C + C̄] | B) = P(A | [C + C̄]B).

From this one can see we may add an arbitrary number of tautologies to the premises. One may consider this arbitrariness an important criterion for probability theory; we want our probabilities to be stable under arbitrary additions of tautologies to our assumptions. It is interesting that the product and sum rules give this to us for free. An important thing to note is that we are allowing propositions like C in the conclusions that have no representation in the premises.
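The stability under added tautologies can be illustrated numerically. In this sketch (added here; uniform counting over truth assignments stands in for the completion P(A | ) = 1/2 derived later), conjoining the tautology C + C̄ to the premise B leaves P(A | B) unchanged.

```python
from itertools import product
from fractions import Fraction

# Worlds over (A, B, C); each truth assignment is weighted equally, an
# assumption of this sketch rather than something the identity requires.
worlds = list(product([False, True], repeat=3))

def cond(concl, prem):
    den = [w for w in worlds if prem(w)]
    return Fraction(sum(1 for w in den if concl(w)), len(den))

A = lambda w: w[0]
B = lambda w: w[1]
# Conjoining the tautology (C + not-C) to the premise B changes nothing.
B_with_tautology = lambda w: (w[2] or not w[2]) and w[1]

assert cond(A, B) == cond(A, B_with_tautology)
```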
Probabilities with minimal assumptions may be written as

P(Z[A_1, ..., A_n] | ) = P(Z[A_1, ..., A_n] | Q_n[A_1, ..., A_n]),

where Q_n[A_i] = Π_{i=1}^n (A_i + Ā_i). We may thus consider probabilities P(· | ) as ones either assuming nothing or only tautologies.

Let us now turn our attention to a generic probability P(Z[A_i] | Y[A_i]). We may write

P(Z[A_i] | Y[A_i]) = P(Z[A_i] Y[A_i] | ) / P(Y[A_i] | ).

Both of these factors may be written as sums of terms of the form P(A_1 ... A_i Ā_{i+1} ... Ā_n | ). These terms may be decomposed using the product rule. At this point the derivations of Jaynes [1] and Cox [4] give us no further support. Jaynes appeared [1] (p. 35) to consider probability theory complete, as he expected one to always have background knowledge to determine the terms. Here we are explicitly assuming only tautologies with probabilities P(· | ) and hence cannot rely on such background knowledge. Moreover, as we are aiming at generality, we do not want to rely on such background knowledge.

To determine the terms, consider the following symmetry: The validity of an argument is invariant under swapping a basic proposition A with its negation Ā in both the premises and the conclusions. Moreover, the swapping symmetry A ↔ Ā is a symmetry of logical structure. It may be seen as directly related to the double negation rule; imposing the symmetry on a trivial identity gives us the rule:

Ā = Ā  →  A = Ā̄.

Consider also that a possible state of affairs may be referred to by either A or Ā, with both choices giving equal consequences for argumentation. The two propositions are defined in contrast to one another (their truth values are opposed) and are not distinguished within the system in any other way.
This lack of distinguishing factors is made more apparent when possibility is seen as an explicit extra assumption; the proposition Ā does not by definition mean that a proposition from a set of possibilities, other than A, must be true. Such meaning comes from an assumption I_n[A_i].

I assert that our degrees of plausibility must satisfy the symmetry in order to not introduce an extra-logical bias into our framework. This may be considered part of the desideratum of consistency used in [1].

Consider the notation x_{j,k} = P(A | A_1 ... A_j B̄_1 ... B̄_k). From our symmetry, one may impose⁵

x_{j+1,k} = x_{j,k+1}    ∀ j, k ≥ 0.    (1)

We now prove a lemma: ∀ j, k ≥ 0, if x_{j,k} = P(A | ), then x_{j,k+1} = P(A | ).

Proof: Assume x_{j_0,k_0} = P(A | ) for some j_0, k_0 ≥ 0. Then

P(A | ) = P(A [A_q + Ā_q] | A_1 ... A_{j_0} B̄_1 ... B̄_{k_0})
        = P(A | ) P(A | A_1 ... A_{j_0} A_q B̄_1 ... B̄_{k_0}) + {1 − P(A | )} P(A | A_1 ... A_{j_0} B̄_1 ... B̄_{k_0} Ā_q)
        = P(A | ) x_{j_0+1,k_0} + {1 − P(A | )} x_{j_0,k_0+1}.

Then by eq. (1), x_{j_0+1,k_0} = x_{j_0,k_0+1}, so the right-hand side collapses to x_{j_0,k_0+1}, giving x_{j_0,k_0+1} = P(A | ).

From our lemma and (1), we get by induction from x_{0,0} = P(A | ),

x_{j,k} = P(A | )    ∀ j, k ≥ 0.

Now see

P(A_1 ... A_j B̄_1 ... B̄_k | ) = Π_{l=1}^{j} P(A_l | μ_l) Π_{r=1}^{k} {1 − P(B_r | ν_r)} = P(A | )^j {1 − P(A | )}^k,

where μ_l and ν_r are products of various A's and B's. To determine P(A | ), we impose the symmetry again:

A | = Ā | .

From this condition one arrives at P(A | ) = 1/2. Our generic probability then becomes

P(Z[A_i] | Y[A_i]) = (M/N) 2^{n−m},

where M and N are just the numbers of terms in Z[A_i] Y[A_i] and Y[A_i] respectively when the functions are written in minimal disjunctive normal form.

⁵ This condition is imposed on the plausibilities but may be stated in terms of probabilities.
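Both the closed form just derived and the merely exhaustive case used in the applications can be checked by enumeration. The sketch below is an added illustration: under the completion P(A | ) = 1/2, independently for each basic proposition, every generic probability reduces to a ratio of counts over the 2^n truth assignments.

```python
from itertools import product
from fractions import Fraction

def cond(concl, prem, n):
    """P(concl | prem) over a working set of n propositions, with each of
    the 2**n truth assignments weighted equally (i.e. P(A|) = 1/2)."""
    worlds = list(product([False, True], repeat=n))
    den = [w for w in worlds if prem(w)]
    return Fraction(sum(1 for w in den if concl(w)), len(den))

# Z = A1, Y = I3 (exactly one of three true): ZY has M = 1 minimal-DNF term
# of m = 3 propositions, Y has N = 3 terms of n = 3 propositions, so the
# closed form gives (M/N) * 2**(n - m) = 1/3.
assert cond(lambda w: w[0], lambda w: sum(w) == 1, 3) == Fraction(1, 3)
assert Fraction(1, 3) * 2**(3 - 3) == Fraction(1, 3)

# Merely exhaustive possibilities X_m (at least one of m true, exclusivity
# not assumed): P(A_i | X_m) = 2**(m-1) / (2**m - 1), as in the applications.
for m in range(1, 7):
    assert cond(lambda w: w[0], lambda w: any(w), m) == Fraction(2**(m - 1), 2**m - 1)
```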
Moreover, M can heuristically be thought of as an unnormalised overlap between Z[A_i] and Y[A_i]. The numbers m and n are the numbers of basic propositions in the terms of Z[A_i] Y[A_i] and Y[A_i] respectively, written in minimal disjunctive normal form.

APPLICATIONS

The precision and generality of our scientific statements are directly related to the precision and generality of the language used to make the statements. With the formulation of probability theory I have just proposed, we are able to determine precise probabilistic statements for a greater variety of situations than we were able to before. Immediate applications include situations where we do not or cannot assume exhaustivity and exclusivity:

1. We can calculate probabilities for propositions A_i when we know only that they are one of m exhaustive possibilities;

P(A_i | X_m[A_i]) = 2^{m−1} / (2^m − 1).

2. Consider a situation similar to one presented by Walley [5]: We have a bag of marbles. Suppose we know that they are labelled in a distinguishable way. In particular, they are numbered and we know there is a marble that is labelled with '1'. We know nothing about the number of marbles in the bag (perhaps the bag is magical, with the ability to hold an unlimited number of marbles). We want to know the probability that if we pick a marble from the bag, that marble will be the one labelled with '1'. This will generally depend on our knowledge of how we pick the marble. We are not interested in this particular aspect, and if we know our method of picking cannot discriminate the labels, we may neglect this knowledge for our current purposes. Walley and others have proposed solutions to problems of this sort which go beyond the Bayesian framework.
One sought-after property of a probability in this situation is called regrouping invariance; i.e., it should somehow be invariant to changes in the 'size of the sample space'. This presupposes that our probabilities are defined in terms of 'sample spaces'. Within the framework just proposed, the solution requires only properly stating the salient assumptions; we have positive knowledge that there is a set of exclusive and exhaustive possibilities, we just do not know the size of the set. An appropriate probability will then be of the form

lim_{n→∞} P(A_1 | Σ_{j=1}^n I_j[A_i]).

Note, assuming Σ_{j=1}^n I_j[A_i] does not assume the various sample spaces are exclusive to each other. Exclusivity of sample spaces would require additional assumptions and change the probability. This is just one example of the precise choices we could make in our assumptions, exemplifying the generality of our approach.

3. Quantum theory has severe ontological problems. Our difficulty in solving these problems may be due to an insufficient formulation of probability theory [6]. Most if not all no-go theorems for ontological models of quantum theory [7-11] implicitly assume exclusivity and exhaustivity for the space of ontological states. The framework presented here allows a whole class of models, which do not assume exclusivity and exhaustivity, to be explored.

REMARKS

The probabilistic framework here is considered as a symbolic system rather than a system of functions or measures on a predefined set. The framework is general enough to deal with situations where sets of possibilities are not assumed. The principle of indifference is derived as a consequence of our ability to relabel and the explication of the assumptions we implicitly make to define possibility. Indifference is thus not a principle imposed a priori or arbitrarily.
Probability theory as extended logic is completed by imposing a symmetry from propositional logic. The degree to which one is convinced by the framework proposed here partly depends on whether one is convinced that propositional logic is sufficient for the task of scientific inference. We have seen how one may represent basic notions of possibility and property while still maintaining logical consistency. What propositional logic does not do are universals. I argue that universals are not directly relevant for scientific inference; a scientist would never be able to test the statement 'all⁶ ravens are black'. I propose the notion of universality is related to notions of induction and simplicity.

The framework just proposed does not directly justify induction. This is a good thing. An approach [12] by Carnap - that has similar motivations to the approach here - tries to build induction directly into the framework. One problem is that the inductive predictions do not take into account one's assumptions; whether or not one predicts a sequence to continue at all, and precisely how one predicts this, depends on one's assumptions. Moreover, I submit these things should only depend on one's assumptions; if you make no assumptions you have no reason to predict the continuation of a sequence. One may still perform inductive reasoning given certain assumptions, such as a constant causal mechanism. There is, however, still a problem of induction: One may make valid predictions based on assumptions, but those assumptions may not necessarily be justified.

The Bayesian framework has some built-in notion of simplicity [1] (Ch. 20). Consider two sets of propositional functions we'll call models, Ω_m and Ω_{m+1}, where Ω_m is parametrised by m parameters and Ω_{m+1} by m + 1 parameters.
Suppose the (m+1)'th parameter is θ_{m+1} and the subset Ω_{m+1}|_{θ_{m+1}=0} has a one-to-one correspondence with Ω_m, where each element in both sets is identified with one that produces the same likelihood for some data D[A_i]. We may take Ω_m and Ω_{m+1} as compound models, i.e., models where the parameters are unknown. If the elements are exclusive for both Ω_m and Ω_{m+1}, and the point in the parameter space that gives a maximum likelihood (for data D[A_i]) is near θ_{m+1} = 0 and sharply peaked, then the likelihood for the compound model of Ω_m will generally be greater than the likelihood for Ω_{m+1}; a set of models that predicts the observations as well as another but with fewer parameters will generally be better.

One limitation to this is that one has already chosen the sets of models to consider in a certain way. This has partly to do with one's preferences; do I judge a model with various parameters on the best choice of values of those parameters, or do I judge a model on the total parameter space given to me? The choice also has to do with the choice of using a mathematical framework in the first place. In principle there are an infinite number of propositional functions that one may use as a model that have no discernible or consistent pattern. Can the restriction to propositional functions with consistent patterns be justified? This question becomes manifest in the proposed framework, where we do not rely on calculating things with respect to a predefined set of alternative models; we may ask where those alternatives come from and why.

Note that the framework presented here manifests a primitive notion of simplicity for propositional functions themselves. The probability of some Z[A_i], given no assumptions, is proportional to 2^{−m}, where m is the minimum number of propositions required to write Z[A_i] in disjunctive normal form.
The smaller the value of m, the 'simpler' Z[A_i] is. I speculate that a justification for induction and simplicity comes from an assumption, J[A_i], that restricts the set of propositional functions one may use. This restriction could be justified by epistemological considerations. Models with consistent patterns may then emerge due to combinatoric reasons.

The concept of possibility that is outlined in this article is suggestive of how scientific concepts may be defined generally. Possibility is a pattern of propositions within a model. Crucially, this pattern is not unique; different models with different sizes for possibility spaces will use different patterns (e.g., I_2[A_i] and I_3[A_i] are different). Moreover, the pattern may be nested such that the different possibilities are propositional functions rather than basic propositions. Within this framework, the concept of possibility cannot be defined as a form of classification, in contrast to some other attempts at the definition of a concept [13]. I speculate that concepts like possibility and property may instead be associated with algorithms.

Universality may be defined as a concept. This definition of concept suggests a motivation for its use. Consider an agent with data and assumption J[A_i]. There will likely be an infinite set of models to consider. Calculation for decisions may be computationally intractable. The agent may choose some scheme that best approximates the inferences one would ideally achieve. This scheme could involve algorithms for generating models. It may be the case that the best algorithms come from collections of nested concepts we may call general hypotheses. These general hypotheses may not give unique results but rather generate propositional functions dependent on input. Some of these general hypotheses may be well parametrised by mathematics.
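The 2^{−m} simplicity weighting discussed above can also be checked by enumeration (a sketch added here, with uniform counting over truth assignments standing in for the completion P(A | ) = 1/2): a single-term Z with m basic propositions has P(Z | ) = 2^{−m}, regardless of how large the working set is made.

```python
from itertools import product
from fractions import Fraction

def p_no_assumptions(term, n):
    """P(term | ) over a working set of n propositions, counting the 2**n
    truth assignments uniformly (the completion P(A|) = 1/2)."""
    worlds = list(product([False, True], repeat=n))
    return Fraction(sum(1 for w in worlds if term(w)), len(worlds))

# A single-term Z = A1 A2 has m = 2 basic propositions, so P(Z|) = 2**-2,
# no matter how large the working set n is made.
for n in range(2, 6):
    assert p_no_assumptions(lambda w: w[0] and w[1], n) == Fraction(1, 4)
```

The invariance in n is the same stability under added tautologies noted earlier: enlarging the working set multiplies both the numerator and denominator counts by the same factor.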
Further work is required.

∗ Electronic address: cael.hasse@adelaide.edu.au

[1] E. T. Jaynes, "Probability Theory: The Logic of Science," Cambridge University Press, Cambridge (2003).
[2] B. de Finetti, "Theory of Probability: A Critical Introductory Treatment," John Wiley & Sons Ltd, New York (1974-75).
[3] J. E. Shore and R. W. Johnson, IEEE Transactions on Information Theory, 26 (1980).
[4] R. T. Cox, "The Algebra of Probable Inference," Johns Hopkins Press, Baltimore (1961).
[5] P. Walley, J. R. Statist. Soc. B, 58:3-57 (1996).
[6] C. Fuchs, eprint arXiv:quant-ph/1003.5209v1 (2010).
[7] J. S. Bell, "Speakable and Unspeakable in Quantum Mechanics," Cambridge Univ. Press (1987).
[8] N. Harrigan, R. W. Spekkens, Found. Phys. 40:125-157 (2010).
[9] R. W. Spekkens, Phys. Rev. A, 71:052108 (2005).
[10] L. Hardy, Stud. Hist. Phil. Mod. Phys., 35:267-276 (2004).
[11] M. F. Pusey, J. Barrett, T. Rudolph, Nature Phys., 8:475 (2012).
[12] R. Carnap, "Logical Foundations of Probability," University of Chicago Press (1950).
[13] L. G. Valiant, Communications of the ACM, 27:1134 (1984).
[14] R. T. Cox, Am. J. Phys., 14:1-13 (1946).