Information-theoretic measures associated with rough set approximations

Ping Zhu a,b, Qiaoyan Wen b

a School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
b State Key Laboratory of Networking and Switching, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

Although some information-theoretic measures of uncertainty or granularity have been proposed in rough set theory, these measures depend only on the underlying partition and the cardinality of the universe, independently of the lower and upper approximations. This seems somewhat unreasonable, since the basic idea of rough set theory is to describe vague concepts by their lower and upper approximations. In this paper, we therefore define new information-theoretic entropy and co-entropy functions, associated with both the partition and the approximations, to measure the uncertainty and granularity of an approximation space. After introducing the novel notions of entropy and co-entropy, we examine their properties. In particular, we discuss the relationship of co-entropies between different universes. The theoretical development is accompanied by illustrative numerical examples.

Keywords: rough set, entropy, co-entropy, uncertainty, granularity

1. Introduction

To handle inexact, uncertain, or vague knowledge in information systems, Pawlak developed rough set theory in the early 1980s [14, 15]. Since then we have witnessed a systematic, worldwide growth of interest in rough set theory and its applications in a number of fields, such as granular computing, data mining, decision analysis, pattern recognition, and approximate reasoning [12, 17, 18, 30, 34, 35].
The starting point of rough set theory in [14, 15] is the idea that elements of a universe having the same description are indiscernible with respect to the available information. The indiscernibility is described by an equivalence relation, in the way that two elements are related by the relation if and only if they are indiscernible from each other. As is well known, any equivalence relation defined on a universe U determines a partition of U into a collection of equivalence classes (blocks): each class contains all and only the elements that are mutually equivalent. Any partition π of U represents a piece of knowledge about the elements of U, forming a classification, and so any equivalence class induced by π is interpreted as a granule of knowledge contained in (or supported by) π. According to Pawlak's terminology in [16], any subset X of the universe U is called a concept in U. If the concept X is a union of equivalence classes from π, then X is precise in π; otherwise X is vague. The basic idea of rough set theory consists in replacing vague concepts with a pair of precise concepts, their lower and upper approximations [16], and thus a basic problem in this framework is to reason about the accessible granules of knowledge. To this end, various knowledge granulations (also called information granulations or granulation measures), as average measures of knowledge granules, have been proposed and addressed in [1, 3, 8, 11, 13, 21, 23, 24, 25, 26, 28, 32]. Among them, there are several information-theoretic measures of uncertainty or granularity for rough sets [1, 3, 8, 10, 11, 13, 21, 23, 25], which are based upon the important notion of entropy introduced by Shannon [22]; for more details, we refer the reader to the excellent survey papers [2, 27].
It is worth noting that the information-theoretic measures mentioned above depend only on the sizes of the equivalence classes (essentially, the underlying partition) and the cardinality of the universe, independently of the lower and upper approximation operators. For example, in [6, 13, 23, 26] the information entropy H(π) of the partition π = {U_1, U_2, ..., U_k} is defined as

H(\pi) = -\sum_{i=1}^{k} \frac{n_i}{n} \log \frac{n_i}{n},

where n_i is the cardinality of U_i and n = \sum_{i=1}^{k} n_i. As a result, some partitions, such as {{1}, {2}} and {{1, 2}, {3, 4}}, have the same entropy (or co-entropy). This seems somewhat unreasonable, since the basic idea of rough set theory aims at describing vague concepts by the lower and upper approximations; in other words, the result of this description relies on both the partition and the approximations. In light of this, we should pay more attention to the lower and upper approximation operators.

Email addresses: pzhubupt@gmail.com (Ping Zhu), wqy@bupt.edu.cn (Qiaoyan Wen)
Preprint submitted to Elsevier, October 31, 2018

The previous observation motivates us to propose, in this paper, another information-theoretic entropy function to measure the uncertainty associated with the partition and the approximation operators. More concretely, given a universe U with n elements and a partition π of U, we count the subsets of U described by each pair of lower and upper approximations. Assume that r_i, 1 ≤ i ≤ m, is the number of subsets described by the rough set approximation (A_i, A'_i), and that every subset of U appears with the same probability. It follows that the rough set approximation (A_i, A'_i) appears with the accumulated probability r_i/2^n, since the number of subsets of U is precisely 2^n. In this way, we obtain a probability distribution

P(\pi) = \left( \frac{r_1}{2^n}, \frac{r_2}{2^n}, \ldots, \frac{r_m}{2^n} \right).
It gives rise to an information entropy, say H(π), according to Shannon's information theory [22]. On the other hand, from the same probability distribution we can obtain a co-entropy G(π). It turns out that H(π) + G(π) = n. After exploring some properties of the entropy and co-entropy, we discuss the relationships of co-entropies between different universes. Roughly speaking, the co-entropy monotonically increases as the partition becomes coarser. For example, the co-entropy of {{1, 2}, {3, 4}} is greater than that of {{1}, {2}}.

The remainder of the paper is structured as follows. In Section 2, we briefly review some basics of Pawlak's rough set theory and the information-theoretic measures of uncertainty and granularity for rough sets in the literature. Section 3 is devoted to our novel notions of entropy and co-entropy and their properties. We address the relationship of co-entropies between different universes in Section 4 and conclude the paper in Section 5 with a brief discussion of future research.

2. Preliminaries

This section consists of two subsections. We briefly recall the definition of Pawlak's rough sets in the first subsection and then review two information-theoretic measures of uncertainty and granularity in rough set theory in the second subsection.

2.1. Rough sets

We start by recalling some basic notions of Pawlak's rough set theory [14, 15]. Let U be a finite and nonempty universal set, and let R ⊆ U × U be an equivalence relation on U. Denote by U/R the set of all equivalence classes induced by R. Such equivalence classes are also called elementary sets; every union (not necessarily nonempty) of elementary sets is called a definable set. For any X ⊆ U, one can characterize X by a pair of lower and upper approximations.
The lower approximation \underline{app}_R X of X is defined as the greatest definable set contained in X, while the upper approximation \overline{app}_R X of X is defined as the least definable set containing X. Formally,

\underline{app}_R X = \bigcup \{ C \in U/R \mid C \subseteq X \} \quad \text{and} \quad \overline{app}_R X = \bigcup \{ C \in U/R \mid C \cap X \neq \emptyset \}.

The pair (\underline{app}_R X, \overline{app}_R X) is referred to as the rough set approximation of X. It follows immediately from the definition that \underline{app}_R X ⊆ X ⊆ \overline{app}_R X for any X ⊆ U. The ordered pair ⟨U, R⟩ is said to be an approximation space. A rough set in ⟨U, R⟩ is the family of all subsets of U having the same lower and upper approximations. Thus, the general notion of a rough set can simply be identified with the rough approximation of any given set.

Recall that a partition of U is a collection of nonempty subsets of U such that every element x of U is in exactly one of these subsets; the subsets making up the partition are called blocks. We write Π(U) for the set of all partitions of U and P(U) for the power set of U. It is well known that the notions of partition and equivalence relation are essentially equivalent: for any equivalence relation R on U, the set U/R is a partition of U, and conversely, from any partition π of U one can define an equivalence relation R_π on U such that U/R_π = π in the obvious way. Thus, we sometimes say that the ordered pair ⟨U, π⟩ is an approximation space and write \underline{app}_π X and \overline{app}_π X for \underline{app}_{R_π} X and \overline{app}_{R_π} X, respectively. More generally, we will use equivalence relations and partitions interchangeably.

If a universe U has more than one element, it is always possible to introduce at least two canonical partitions: one is the trivial partition, denoted by π̌, consisting of a unique block, and the other is the discrete partition, denoted by π̂, consisting of all singletons of U.
Formally, π̌ = {U} and π̂ = {{x} | x ∈ U}. We now define a partial order ⪯ on Π(U): for any π, σ ∈ Π(U), σ ⪯ π if and only if for every C ∈ σ there exists D ∈ π such that C ⊆ D. For instance, π̂ ⪯ π ⪯ π̌ for any π ∈ Π(U). We say that σ is finer than π, and that π is coarser than σ, if σ ⪯ π. When σ ≺ π, that is, σ ⪯ π and σ ≠ π, we say that σ is strictly finer than π and that π is strictly coarser than σ. Informally, this means that σ is a further fragmentation of π.

2.2. Information-theoretic measures

In this subsection, we review two information-theoretic measures associated with rough sets in the literature. These measures are concerned with the uncertainty or granularity of the knowledge provided by a partition. In [6, 13, 23, 26], Shannon entropy [22] has been used as a measure of information for rough set theory as follows. For subsequent need, we fix a notational convention: throughout the paper, all logarithms are to base 2 unless otherwise specified.

Definition 2.1 ([6, 13, 23, 26]). Let ⟨U, π⟩ be an approximation space, where the partition π consists of blocks U_i, 1 ≤ i ≤ k, each having cardinality n_i. The information entropy H(π) of the partition π is defined by

H(\pi) = -\sum_{i=1}^{k} \frac{n_i}{n} \log \frac{n_i}{n}, \quad \text{where } n = \sum_{i=1}^{k} n_i. \qquad (1)

When π = π̌, the entropy function H achieves its minimum value 0, and when π = π̂, it achieves its maximum value log n. Moreover, it has been shown in [23] that for any two partitions π and σ of U, if σ ≺ π, then H(σ) > H(π). Equation (1) can be rewritten as follows:

H(\pi) = \log n - \sum_{i=1}^{k} \frac{n_i}{n} \log n_i. \qquad (2)

Recall that the Hartley measure [7] of uncertainty for a finite set X is H(X) = log |X|, where |X| denotes the cardinality of the set X. It measures the amount of uncertainty associated with a finite set of possible alternatives, that is, the nonspecificity inherent in the set.
The first term log n (i.e., log |U|) in Eq. (2) is exactly the Hartley measure of U, which is a constant independent of any partition. The second term of the equation is essentially an expectation of granularity with respect to all blocks in a partition. This quantity has been used by Yao to measure the granularity of a partition in [26] and has been defined by Liang and Shi as the rough entropy of knowledge in an approximation space in [11]. It has also been referred to as co-entropy by some scholars (see, for example, [2, 3]).

Definition 2.2 ([2, 3, 11, 26]). Let ⟨U, π⟩ be an approximation space, where the partition π consists of blocks U_i, 1 ≤ i ≤ k, each having cardinality n_i. The co-entropy G(π) of the partition π is defined by

G(\pi) = \sum_{i=1}^{k} \frac{n_i}{n} \log n_i, \quad \text{where } n = \sum_{i=1}^{k} n_i. \qquad (3)

It follows immediately from the definitions that H(π) + G(π) = log n. Contrary to the uncertainty measure H, the co-entropy function G achieves its maximum value log n when π = π̌ and its minimum value 0 when π = π̂; moreover, it is known [11] that for any two partitions π and σ of U, if σ ≺ π, then G(σ) < G(π). As argued in [2, 3], the entropy H(π) can be interpreted as an uncertainty measure of the partition π, while the co-entropy G(π) can be regarded as a granularity measure of π.

In [21], Sen and Pal introduced two other entropy measures for crisp sets and fuzzy sets with (crisp or fuzzy) equivalence relations or (crisp or fuzzy) tolerance relations, which are based upon the roughness measures of X and of the complement of X in the universe and have been used to analyze the grayness and spatial ambiguities in images. Under the same name, there are several different concepts of entropy in the literature of rough set theory (see, for example, [9, 20]).
3. A novel pair of entropy and co-entropy

In this section, we first introduce a novel entropy and the corresponding co-entropy and then explore their properties. Let us begin with some notation. Throughout this section, we write ⟨U, π⟩ for an approximation space and assume that |U| = n. Given ⟨U, π⟩, we use A(U, π) to denote the set of rough set approximations of all subsets of U. More formally, we set

\mathcal{A}(U, \pi) = \left\{ \left( \underline{app}_\pi X, \overline{app}_\pi X \right) \mid X \subseteq U \right\}. \qquad (4)

It follows from Eq. (4) that A(U, π) has at least two elements: (∅, ∅) and (U, U). If n = 1, then A(U, π) consists of exactly these two elements; if n > 1 and π = π̌, then A(U, π) contains one more element, (∅, U); for any n ≥ 1, if π = π̂, then A(U, π) = { (X, X) | X ⊆ U }, which consists of 2^n elements. Note that the set A(U, π) is not a multiset; that is, the same element cannot appear more than once in A(U, π). In general, we have |A(U, π)| ≤ 2^n, since the subset X of U in Eq. (4) has only 2^n alternatives. For simplicity, we use m to stand for |A(U, π)|. For any (A_i, A'_i) ∈ A(U, π), 1 ≤ i ≤ m, we set

\mathcal{A}_i = \left\{ X \subseteq U \mid \left( \underline{app}_\pi X, \overline{app}_\pi X \right) = (A_i, A'_i) \right\} \quad \text{and} \quad |\mathcal{A}_i| = r_i. \qquad (5)

In other words, r_i is the number of subsets of U that have the rough set approximation (A_i, A'_i). It turns out that {A_1, A_2, ..., A_m} gives rise to a partition of P(U). Therefore, we get by Eq. (4) that

\sum_{i=1}^{m} r_i = 2^n.

To illustrate the above concepts, let us see an example.

Example 3.1. Consider U = {1, 2, 3, 4} and π = {{1, 2}, {3, 4}}. In this case, U has 16 subsets. For each subset X of U, we compute the rough set approximation of X; the results are listed in Table 1.
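The grouping of subsets by their rough set approximations, as in Eqs. (4) and (5), can be sketched in Python; this is a minimal illustration (the function and variable names are ours, not from the paper):

```python
from itertools import chain, combinations
from collections import defaultdict

def powerset(u):
    """All subsets of u, as frozensets."""
    elems = list(u)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(elems, k) for k in range(len(elems) + 1))]

def approximation(x, partition):
    """The rough set approximation (lower, upper) of x under the given partition."""
    lower = frozenset(e for block in partition if block <= x for e in block)
    upper = frozenset(e for block in partition if block & x for e in block)
    return lower, upper

U = frozenset({1, 2, 3, 4})
pi = [frozenset({1, 2}), frozenset({3, 4})]

# Group the 2^n subsets by their rough set approximation (Eq. (5)).
groups = defaultdict(list)
for x in powerset(U):
    groups[approximation(x, pi)].append(x)

print(len(groups))                              # m = 9 distinct approximations
print(sorted(len(g) for g in groups.values()))  # r-values: [1, 1, 1, 1, 2, 2, 2, 2, 4]
```

The nine groups are exactly the rows of Table 2 below, and the r-values sum to 2^4 = 16, in accordance with Eq. (4).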
Hence, we see that

A(U, π) = { (∅, ∅), (∅, {1, 2}), (∅, {3, 4}), ({1, 2}, {1, 2}), (∅, U), ({3, 4}, {3, 4}), ({1, 2}, U), ({3, 4}, U), (U, U) }.

As an example, let us calculate r_2. By definition,

r_2 = \left| \left\{ X \subseteq U \mid \left( \underline{app}_\pi X, \overline{app}_\pi X \right) = (\emptyset, \{1, 2\}) \right\} \right| = |\{\{1\}, \{2\}\}| = 2.

This is exactly the number of subsets of U that have the rough set approximation (∅, {1, 2}), which can be counted from the table. In light of this, we may obtain Table 2 by rearranging Table 1. It follows immediately from Table 2 that

r_1 = r_4 = r_6 = r_9 = 1, \quad r_2 = r_3 = r_7 = r_8 = 2, \quad \text{and} \quad r_5 = 4.

Table 1: The subsets and corresponding rough set approximations in Example 3.1.

subset      approximation         subset      approximation
∅           (∅, ∅)                {2,3}       (∅, U)
{1}         (∅, {1,2})            {2,4}       (∅, U)
{2}         (∅, {1,2})            {3,4}       ({3,4}, {3,4})
{3}         (∅, {3,4})            {1,2,3}     ({1,2}, U)
{4}         (∅, {3,4})            {1,2,4}     ({1,2}, U)
{1,2}       ({1,2}, {1,2})        {1,3,4}     ({3,4}, U)
{1,3}       (∅, U)                {2,3,4}     ({3,4}, U)
{1,4}       (∅, U)                U           (U, U)

Table 2: The rough set approximations and corresponding subsets in Example 3.1.

approximation         subsets
(∅, ∅)                ∅
(∅, {1,2})            {1}, {2}
(∅, {3,4})            {3}, {4}
({1,2}, {1,2})        {1,2}
(∅, U)                {1,3}, {1,4}, {2,3}, {2,4}
({3,4}, {3,4})        {3,4}
({1,2}, U)            {1,2,3}, {1,2,4}
({3,4}, U)            {1,3,4}, {2,3,4}
(U, U)                U

Because we are concerned with the partition granulation of ⟨U, π⟩ with respect to the approximation operators \underline{app} and \overline{app}, we may assume that every subset of U appears with the same probability 1/2^n.
As a result, the rough set approximation (A_i, A'_i) appears with the accumulated probability r_i/2^n, and we thus obtain a probability distribution

P(\pi) = \left( \frac{r_1}{2^n}, \frac{r_2}{2^n}, \ldots, \frac{r_m}{2^n} \right). \qquad (6)

According to Shannon's information theory [22], the Shannon entropy function of the probability distribution P(π) is defined as follows.

Definition 3.1. Keep the notation as above. The information entropy H(π) of ⟨U, π⟩ (with respect to the approximation operators \underline{app} and \overline{app}) is defined by

H(\pi) = H(P(\pi)) = -\sum_{i=1}^{m} \frac{r_i}{2^n} \log \frac{r_i}{2^n}. \qquad (7)

In the above definition, for simplicity we have used the notation H(π) instead of H(U, π). Following the explanation of Shannon entropy in information theory, the quantity H(π) measures the uncertainty associated with the partition π with respect to the approximation operators \underline{app} and \overline{app}. For instance, the probability distribution corresponding to the partition π = {{1, 2}, {3, 4}} in Example 3.1 is

P(\pi) = \left( \frac{1}{2^4}, \frac{2}{2^4}, \frac{2}{2^4}, \frac{1}{2^4}, \frac{4}{2^4}, \frac{1}{2^4}, \frac{2}{2^4}, \frac{2}{2^4}, \frac{1}{2^4} \right).

It follows from Definition 3.1 that

H(\pi) = -\sum_{i=1}^{9} \frac{r_i}{2^4} \log \frac{r_i}{2^4} = -\left( 4 \cdot \frac{1}{2^4} \log \frac{1}{2^4} + 4 \cdot \frac{2}{2^4} \log \frac{2}{2^4} + \frac{4}{2^4} \log \frac{4}{2^4} \right) = 3.

Similar to other entropy functions in rough set theory, the information entropy in Definition 3.1 has the following properties.

Theorem 3.1.
(1) For any π, σ ∈ Π(U), if σ ≺ π, then H(σ) > H(π).
(2) The entropy function H reaches its maximum value n for the finest partition π̂.
(3) The entropy function H reaches its minimum value n - \frac{2^n - 2}{2^n} \log(2^n - 2) for the coarsest partition π̌.

Proof. (1) Without loss of generality, we may assume that π = {U_1, U_2, ..., U_k} and σ = {U_a, U_b, U_2, ..., U_k}, where U_a ∪ U_b = U_1.
Suppose that |A(U, π)| = m and, for any (A_i, A'_i) ∈ A(U, π), 1 ≤ i ≤ m, write r_i for

\left| \left\{ X \subseteq U \mid \left( \underline{app}_\pi X, \overline{app}_\pi X \right) = (A_i, A'_i) \right\} \right|.

Based on the partition π, the power set P(U) is partitioned into m blocks, and the i-th block has cardinality r_i. Similarly, we denote by s_j the cardinality of the j-th block of P(U) associated with the partition σ. We now consider the elements of A(U, σ). For any (B_j, B'_j) ∈ A(U, σ), there are two possibilities. One is that (B_j, B'_j) ∈ A(U, π), say (B_j, B'_j) = (A_{i_j}, A'_{i_j}) for some i_j; in this case, it is clear that s_j = r_{i_j}. The other case is that (B_j, B'_j) ∈ A(U, σ) \ A(U, π), where A \ B denotes the set of all elements that are members of A but not members of B. In this case, for some i_j,

\left\{ X \subseteq U \mid \left( \underline{app}_\sigma X, \overline{app}_\sigma X \right) = (B_j, B'_j) \right\} \subsetneq \left\{ X \subseteq U \mid \left( \underline{app}_\pi X, \overline{app}_\pi X \right) = (A_{i_j}, A'_{i_j}) \right\},

because the partition σ is strictly finer than π; that is, the i_j-th block of P(U) provided by π is partitioned by σ into smaller blocks, and thus r_{i_j} = \sum_j s_{i_j} > s_{i_j}, where the sum ranges over these smaller blocks. In summary, for every i we have either r_i = s_{i_j} for a single index, or r_i = \sum_j s_{i_j} > s_{i_j}; moreover, the latter case must occur, since σ ≺ π. We thus assume that r_i = s_{i_j} for i ∈ I_1 and r_i = \sum_j s_{i_j} for i ∈ I_2, where I_2 ≠ ∅ and I_1 ∪ I_2 = {1, 2, ..., m}. Let us compare H(σ) with H(π).
H(\pi) = -\sum_{i=1}^{m} \frac{r_i}{2^n} \log \frac{r_i}{2^n}
       = -\sum_{i \in I_1} \frac{r_i}{2^n} \log \frac{r_i}{2^n} - \sum_{i \in I_2} \frac{r_i}{2^n} \log \frac{r_i}{2^n}
       = -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \sum_{i \in I_2} \frac{\sum_j s_{i_j}}{2^n} \log \frac{\sum_j s_{i_j}}{2^n}
       = -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \frac{1}{2^n} \sum_{i \in I_2} \Big( \sum_j s_{i_j} \Big) \Big( \log \Big( \sum_j s_{i_j} \Big) - n \Big)
       = -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \frac{1}{2^n} \sum_{i \in I_2} \Big( \log \Big( \sum_j s_{i_j} \Big)^{\sum_j s_{i_j}} - n \sum_j s_{i_j} \Big)
       < -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \frac{1}{2^n} \sum_{i \in I_2} \Big( \log \prod_j s_{i_j}^{s_{i_j}} - n \sum_j s_{i_j} \Big)
       = -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \frac{1}{2^n} \sum_{i \in I_2} \Big( \sum_j s_{i_j} \log s_{i_j} - n \sum_j s_{i_j} \Big)
       = -\sum_{i \in I_1} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} - \sum_{i \in I_2} \sum_j \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n}
       = H(\sigma),

namely, H(σ) > H(π). Therefore, clause (1) holds.

(2) It follows from (1) that H reaches its maximum value when π = π̂. In this case, we get by definition that

H(\hat{\pi}) = -\sum_{i=1}^{2^n} \frac{1}{2^n} \log \frac{1}{2^n} = n.

This proves (2).

(3) By (1), we see that H reaches its minimum value when π = π̌. In this case, the empty subset ∅ of U has the rough set approximation (∅, ∅), and U itself has the rough set approximation (U, U). Any proper nonempty subset of U, if there is one, has the rough set approximation (∅, U). Hence, r_1 = r_2 = 1 and r_3 = 2^n - 2. We thus obtain by definition that

H(\check{\pi}) = -\frac{1}{2^n} \log \frac{1}{2^n} - \frac{1}{2^n} \log \frac{1}{2^n} - \frac{2^n - 2}{2^n} \log \frac{2^n - 2}{2^n} = n - \frac{2^n - 2}{2^n} \log(2^n - 2).

Whence, (3) holds, finishing the proof of the theorem.

Note that in clause (3) of Theorem 3.1, if n = 1, the value of the corresponding summand 0 log 0 is taken to be 0, which is consistent with the limit lim_{x → 0^+} x log x = 0.
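The value H(π) = 3 from Example 3.1 and the extreme values of Theorem 3.1 can be checked numerically. Here is a small Python sketch (the helper names are ours):

```python
from itertools import chain, combinations
from collections import Counter
from math import log2

def powerset(u):
    """All subsets of u, as frozensets."""
    elems = list(u)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(elems, k) for k in range(len(elems) + 1))]

def approximation(x, partition):
    """The rough set approximation (lower, upper) of x under the given partition."""
    lower = frozenset(e for block in partition if block <= x for e in block)
    upper = frozenset(e for block in partition if block & x for e in block)
    return lower, upper

def entropy(u, partition):
    """H(pi) of Definition 3.1: -sum_i (r_i / 2^n) log(r_i / 2^n)."""
    n = len(u)
    counts = Counter(approximation(x, partition) for x in powerset(u))
    return -sum(r / 2**n * log2(r / 2**n) for r in counts.values())

U = {1, 2, 3, 4}
n = len(U)
print(entropy(U, [frozenset({1, 2}), frozenset({3, 4})]))  # 3.0, as in Example 3.1
print(entropy(U, [frozenset({x}) for x in U]))             # 4.0 = n for the finest partition
print(entropy(U, [frozenset(U)]),                          # minimum for the coarsest partition,
      n - (2**n - 2) / 2**n * log2(2**n - 2))              # both values are approx. 0.6688
```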
For later need, let us recall the following definition from [31].

Definition 3.2. Let ⟨U, π⟩ and ⟨V, σ⟩ be two approximation spaces, and suppose that f : U → V is a mapping.
(1) The mapping f is called a homomorphism from ⟨U, π⟩ to ⟨V, σ⟩ if for any C ∈ π, there exists D ∈ σ such that f(C) ⊆ D, where f(C) = { f(u) | u ∈ C }.
(2) A homomorphism f is called a monomorphism if f is an injective mapping.
(3) A monomorphism f is called strict (strictly monomorphic) if either there exist C ∈ π and D ∈ σ such that f(C) ⊊ D, namely f(C) ⊆ D and f(C) ≠ D, or |V| > |U|.
(4) The mapping f is called an isomorphism if f : U → V is bijective and, moreover, both f and its inverse mapping f^{-1} are homomorphisms.

We can now state the following facts.

Proposition 3.1. Let ⟨U, π⟩ and ⟨V, σ⟩ be two approximation spaces with |U| = |V|, and let f : U → V be a mapping.
(1) If f is a monomorphism from ⟨U, π⟩ to ⟨V, σ⟩ (in particular, if π ⪯ σ), then H(π) ≥ H(σ).
(2) If f is a strict monomorphism from ⟨U, π⟩ to ⟨V, σ⟩ (in particular, if π ≺ σ), then H(π) > H(σ).
(3) If f is an isomorphism from ⟨U, π⟩ to ⟨V, σ⟩, then H(π) = H(σ).

Proof. It follows immediately from Definition 3.1 and Theorem 3.1.

To measure the granularity carried by the partition π with respect to the approximation operators \underline{app} and \overline{app}, we introduce the concept of co-entropy, which corresponds to the information entropy in Definition 3.1.

Definition 3.3. Keep the notation as in Definition 3.1. The co-entropy G(π) of ⟨U, π⟩ (with respect to the approximation operators \underline{app} and \overline{app}) is defined by

G(\pi) = G(P(\pi)) = \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i. \qquad (8)

The quantity G(π) furnishes a measure of the average granularity carried by the partition π as a whole. It follows immediately from the definitions that

H(\pi) + G(\pi) = n. \qquad (9)
It means that the two measures complement each other with respect to the constant quantity n = |U|, which is invariant under the choice of the partition π of U. The co-entropy function G has the following properties.

Theorem 3.2.
(1) For any π, σ ∈ Π(U), if σ ≺ π, then G(σ) < G(π).
(2) The co-entropy function G reaches its minimum value 0 for the finest partition π̂.
(3) The co-entropy function G reaches its maximum value \frac{2^n - 2}{2^n} \log(2^n - 2) for the coarsest partition π̌.

Proof. All the clauses follow directly from Theorem 3.1 and Eq. (9).

Similar to Proposition 3.1, we have the following observation.

Proposition 3.2. Let ⟨U, π⟩ and ⟨V, σ⟩ be two approximation spaces with |U| = |V|, and let f : U → V be a mapping.
(1) If f is a monomorphism from ⟨U, π⟩ to ⟨V, σ⟩ (in particular, if π ⪯ σ), then G(π) ≤ G(σ).
(2) If f is a strict monomorphism from ⟨U, π⟩ to ⟨V, σ⟩ (in particular, if π ≺ σ), then G(π) < G(σ).
(3) If f is an isomorphism from ⟨U, π⟩ to ⟨V, σ⟩, then G(π) = G(σ).

Proof. It follows immediately from Proposition 3.1 and Eq. (9).

As a corollary of Theorem 3.2 and Proposition 3.2, we see that G is a partition measure on U in the sense of [31, Definition 3.4]; that is, G is nonnegative and satisfies the following two conditions: G(σ) < G(π) if σ ≺ π, and G(π) = G(σ) if there exists an isomorphism from ⟨U, π⟩ to ⟨V, σ⟩. Note that our information entropy and co-entropy are not directly based on the blocks of a partition; therefore, in general, they do not satisfy the definition of expected granularity proposed in [28].

4. Relationship of co-entropies between different universes

In the last section, we have seen that if f is a strict monomorphism from ⟨U, π⟩ to ⟨U, σ⟩ (in particular, if π ≺ σ), then H(π) > H(σ) and G(π) < G(σ).
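Before turning to different universes, the co-entropy of Definition 3.3 and the complementarity H(π) + G(π) = n of Eq. (9) can be verified numerically; a minimal Python sketch follows (helper names are ours):

```python
from itertools import chain, combinations
from collections import Counter
from math import log2

def powerset(u):
    """All subsets of u, as frozensets."""
    elems = list(u)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(elems, k) for k in range(len(elems) + 1))]

def approximation(x, partition):
    """The rough set approximation (lower, upper) of x under the given partition."""
    lower = frozenset(e for block in partition if block <= x for e in block)
    upper = frozenset(e for block in partition if block & x for e in block)
    return lower, upper

def counts(u, partition):
    """The multiplicities r_1, ..., r_m of the rough set approximations (Eq. (5))."""
    return Counter(approximation(x, partition) for x in powerset(u)).values()

def entropy(u, partition):       # Definition 3.1
    n = len(u)
    return -sum(r / 2**n * log2(r / 2**n) for r in counts(u, partition))

def coentropy(u, partition):     # Definition 3.3
    n = len(u)
    return sum(r / 2**n * log2(r) for r in counts(u, partition))

U = {1, 2, 3, 4}
pi = [frozenset({1, 2}), frozenset({3, 4})]
print(coentropy(U, pi))                           # 1.0 for the partition of Example 3.1
print(entropy(U, pi) + coentropy(U, pi))          # 4.0 = n, confirming Eq. (9)
print(coentropy(U, [frozenset({x}) for x in U]))  # 0.0, the minimum of Theorem 3.2(2)
```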
In this section, we consider the monotonicity of H and G across different universes. In other words, we compare H(π) with H(σ) and G(π) with G(σ) when there exists a strict monomorphism from ⟨U, π⟩ to ⟨V, σ⟩, where |V| > |U|. For convenience, we write ⟨U, π⟩ ↪ ⟨V, σ⟩ if |V| > |U| and there exists a strict monomorphism from ⟨U, π⟩ to ⟨V, σ⟩.

We start with the following observation on the entropy function H and the co-entropy function G reviewed in Section 2.2. Consider ⟨U_1, π_1⟩ = ⟨{1}, {{1}}⟩, ⟨U_2, π_2⟩ = ⟨{1, 2}, {{1}, {2}}⟩, and ⟨U_3, π_3⟩ = ⟨{1, 2, 3}, {{1, 3}, {2}}⟩. Clearly, ⟨U_1, π_1⟩ ↪ ⟨U_2, π_2⟩ ↪ ⟨U_3, π_3⟩. It is easy to check by Definition 2.1 that H(π_1) = 0, H(π_2) = 1, and H(π_3) = log 3 - 2/3 < 1. This means that the entropy function H of Definition 2.1 is not monotonic. By the way, a direct computation with our co-entropy (Definition 3.3) gives G(π_1) = 0, G(π_2) = 0, and G(π_3) = 1/2.

Let us continue with the monotonicity of the co-entropy function G of Definition 2.2. Consider ⟨U_1, π_1⟩ = ⟨{1}, {{1}}⟩, ⟨U_2, π_2⟩ = ⟨{1, 2}, {{1, 2}}⟩, and ⟨U_3, π_3⟩ = ⟨{1, 2, 3}, {{1, 2}, {3}}⟩. Again, we see that ⟨U_1, π_1⟩ ↪ ⟨U_2, π_2⟩ ↪ ⟨U_3, π_3⟩. It is easy to check by Definition 2.2 that G(π_1) = 0, G(π_2) = 1, and G(π_3) = 2/3. This shows that the co-entropy function G of Definition 2.2 is not monotonic either. In this case, a direct computation with Definition 3.3 gives G(π_1) = 0, G(π_2) = 1/2, and G(π_3) = 1/2.

Finally, we address the monotonicity of our entropy function H of Definition 3.1. Consider ⟨U_1, π_1⟩ = ⟨{1, 2}, {{1, 2}}⟩, ⟨U_2, π_2⟩ = ⟨{1, 2, 3}, {{1, 2}, {3}}⟩, and ⟨U_3, π_3⟩ = ⟨{1, 2, 3, 4}, {{1, 2, 4}, {3}}⟩. Obviously, we have ⟨U_1, π_1⟩ ↪ ⟨U_2, π_2⟩ ↪ ⟨U_3, π_3⟩.
By a routine computation, we get H(π_1) = 3/2, H(π_2) = 5/2, and H(π_3) = 13/4 - (3/4) log 3 < 5/2. Consequently, the entropy function H of Definition 3.1 is not monotonic either. On the other hand, it follows from Eq. (9) that G(π_1) = 1/2, G(π_2) = 1/2, and G(π_3) = 3/4 + (3/4) log 3.

As a result, in all three cases above we always have G(π_1) ≤ G(π_2) ≤ G(π_3). We thus conjecture that G(π) ≤ G(σ) whenever ⟨U, π⟩ ↪ ⟨V, σ⟩. Indeed, this holds true, as we will see later. To prove the conjecture, it is convenient to introduce the following notion and a key lemma.

Definition 4.1. Let ⟨U, π⟩ be an approximation space and a ∉ U. The approximation space ⟨U ∪ {a}, π ∪ {{a}}⟩ is called the one-point extension of ⟨U, π⟩ by a. We say that ⟨V, σ⟩ is a one-point extension of ⟨U, π⟩ if ⟨V, σ⟩ = ⟨U ∪ {a}, π ∪ {{a}}⟩ for some a.

For example, ⟨U_2, π_2⟩ = ⟨{1, 2, 3}, {{1, 2}, {3}}⟩ is the one-point extension of ⟨U_1, π_1⟩ = ⟨{1, 2}, {{1, 2}}⟩ by 3. The following lemma shows that a one-point extension does not change the co-entropy.

Lemma 4.1. Let ⟨V, σ⟩ be a one-point extension of ⟨U, π⟩. Then G(σ) = G(π).

Proof. Suppose that π = {U_1, U_2, ..., U_k} and σ = {U_1, U_2, ..., U_k, {a}}, where a ∉ U; assume that A(U, π) = { (A_i, A'_i) | 1 ≤ i ≤ m } and

\mathcal{A}_i = \left\{ X \subseteq U \mid \left( \underline{app}_\pi X, \overline{app}_\pi X \right) = (A_i, A'_i) \right\} \quad \text{with } |\mathcal{A}_i| = r_i.

It thus follows that

\mathcal{A}(V, \sigma) = \mathcal{A}(U, \pi) \cup \left\{ \left( A_i \cup \{a\}, A'_i \cup \{a\} \right) \mid 1 \leq i \leq m \right\}.

For any (B_i, B'_i) ∈ A(V, σ), we write B_i for { X ⊆ V | (\underline{app}_σ X, \overline{app}_σ X) = (B_i, B'_i) } and s_i for |B_i|. If (B_i, B'_i) = (A_i, A'_i) ∈ A(U, π), then we see that B_i = A_i, and thus s_i = r_i in this case.
If (B_i, B'_i) = (A_i ∪ {a}, A'_i ∪ {a}) ∈ { (A_i ∪ {a}, A'_i ∪ {a}) | 1 ≤ i ≤ m }, then we have B_i = { X ∪ {a} | X ∈ A_i }, and s_i = r_i still holds in this case. Therefore, we get by Definition 3.3 that

G(\sigma) = \sum_{i=1}^{m} \frac{r_i}{2^{n+1}} \log r_i + \sum_{i=1}^{m} \frac{r_i}{2^{n+1}} \log r_i = \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i = G(\pi),

finishing the proof of the lemma.

For subsequent need, we generalize Definition 4.1 as follows.

Definition 4.2. Let ⟨U, π⟩ and ⟨V, σ⟩ be two approximation spaces. We say that ⟨V, σ⟩ is a multi-one-point extension of ⟨U, π⟩ if there are approximation spaces ⟨U_i, π_i⟩, 0 ≤ i ≤ l, with ⟨U_0, π_0⟩ = ⟨U, π⟩ and ⟨U_l, π_l⟩ = ⟨V, σ⟩, such that each ⟨U_i, π_i⟩, 1 ≤ i ≤ l, is a one-point extension of ⟨U_{i-1}, π_{i-1}⟩.

For example, ⟨V, σ⟩ = ⟨{1, 2, 3, 4}, {{1, 2}, {3}, {4}}⟩ is a multi-one-point extension of ⟨U, π⟩ = ⟨{1, 2}, {{1, 2}}⟩. In fact, we may take ⟨U_0, π_0⟩ = ⟨U, π⟩, ⟨U_1, π_1⟩ = ⟨{1, 2, 3}, {{1, 2}, {3}}⟩, and ⟨U_2, π_2⟩ = ⟨V, σ⟩.

The following fact follows immediately from Lemma 4.1.

Corollary 4.1. If ⟨V, σ⟩ is a multi-one-point extension of ⟨U, π⟩, then G(σ) = G(π).

In light of the above corollary, let us refer to multi-one-point extensions simply as one-point extensions. Further, we have the following observation.

Theorem 4.1. Suppose that there is a monomorphism from ⟨U, π⟩ to ⟨V, σ⟩. If there exists ⟨U', π'⟩ that satisfies the following two conditions:
(a) either ⟨U', π'⟩ = ⟨U, π⟩ or ⟨U', π'⟩ is a one-point extension of ⟨U, π⟩, and
(b) ⟨U', π'⟩ is isomorphic to ⟨V, σ⟩,
then G(π) = G(σ); otherwise, G(π) < G(σ).

Proof. We first consider the case that there exists ⟨U', π'⟩ satisfying conditions (a) and (b).
In this case, if h U ′ , π ′ i = h U , π i and h U ′ , π ′ i is isomo rphic to h V , σ i , then | V | = | U | and we see by Pro position 3.2 that G ( π ) = G ( σ ). If h U ′ , π ′ i is a one- point e x tension of h U , π i and h U ′ , π ′ i is isomo rphic to h V , σ i , then we get tha t G ( π ) = G ( π ′ ) by Corollary 4.1 and G ( π ′ ) = G ( σ ) b y Proposition 3.2. Consequently , G ( π ) = G ( σ ). W e n ow co nsider the case that there does no t exist h U ′ , π ′ i such that the co nditions are satisfied. I t forces that the mono morph ism, say f , fro m h U , π i to h V , σ i is strict. T wo cases n eed to consider . On e is th at | V | = | U | . In this case, it f ollows f rom Prop osition 3.2 that G ( π ) < G ( σ ). The o ther case is that | V | > | U | . I n this case, let us set h V 1 , σ 1 i = h f ( U ) , f ( π ) i , where f ( U ) is the im age of U under f and f ( π ) = { f ( U ′ ) | U ′ ∈ π } . In fact, f gives rise to an isomorph ism b etween h U , π i an d h V 1 , σ 1 i . Th erefor e, G ( π ) = G ( σ 1 ). Note that V 1 = f ( U ) ⊆ V . W e now take h V 2 , σ 2 i as follows: V 2 = V , σ 2 = σ 1 ∪ {{ a } | a ∈ V \ V 1 } . It fo llows that h V 2 , σ 2 i is a one- point extension of h V 1 , σ 1 i . Hence, G ( σ 1 ) = G ( σ 2 ). Because f is a strict mono mor- phism, we see that h U , π i ֒ → h V 2 , σ 2 i and σ 2 ≺ σ . Whence, we get G ( σ 2 ) < G ( σ ) by Theor em 3.2. As a result, G ( π ) < G ( σ ). This comp letes the proof of the theorem. Let us provide an informal explanation of Theor em 4 .1. The hypothesis that there is a monomo rphism fro m h U , π i to h V , σ i mean s th at h U , π i is fin er than h V , σ i . I n the special case tha t the mon omorp hism is n ot strict, we h av e that h U , π i an d h V , σ i are isomo rphic, and thus, th ey have the same co -entropy . If the mo nomo rphism is strict, th en after renaming the elements of U , we can get a finer partition than h V , σ i by using one- point e x tensions. 
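Since all the universes involved here are small, the co-entropy values quoted in this section can be checked by brute force. The following sketch is ours, not part of the paper: it enumerates every subset of the universe, buckets subsets by their pair of lower and upper approximations, and evaluates $G(\pi) = \sum_i \frac{r_i}{2^n}\log_2 r_i$, a formula read off from the displayed equation in the proof of Lemma 4.1 (the function names are our own).

```python
from math import log2

def approximation_pairs(universe, partition):
    """Bucket every subset X of the universe by its pair
    (lower approximation, upper approximation) and return
    a dict mapping each pair to its multiplicity r_i."""
    elems = sorted(universe)
    n = len(elems)
    counts = {}
    for mask in range(2 ** n):
        x = {e for i, e in enumerate(elems) if (mask >> i) & 1}
        # Lower approximation: union of the blocks contained in X.
        lower = frozenset(e for b in partition if set(b) <= x for e in b)
        # Upper approximation: union of the blocks meeting X.
        upper = frozenset(e for b in partition if set(b) & x for e in b)
        counts[(lower, upper)] = counts.get((lower, upper), 0) + 1
    return counts

def co_entropy(universe, partition):
    """G(pi) = sum_i (r_i / 2^n) * log2(r_i), with n = |U|."""
    n = len(universe)
    return sum(r / 2 ** n * log2(r)
               for r in approximation_pairs(universe, partition).values())
```

For instance, `co_entropy({1, 2}, [{1, 2}])` and `co_entropy({1, 2, 3}, [{1, 2}, {3}])` both evaluate to 0.5, illustrating Lemma 4.1, while $\langle \{1,2,3,4\}, \{\{1,2,3\},\{4\}\}\rangle$ yields $\frac{3}{4} + \frac{3}{4}\log_2 3$, matching the value quoted above for $G(\pi_3)$. (The spaces $\pi_1$–$\pi_3$ are defined earlier in the paper; these inputs are our reconstructions, chosen to be consistent with the quoted values.)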
In short, Theorem 4.1 says that $G(\pi) < G(\sigma)$ whenever $\pi$ is finer than $\sigma$ and neither $\langle U, \pi\rangle$ nor a one-point extension of it is isomorphic to $\langle V, \sigma\rangle$. We end this section with several examples.

Example 4.1. A trivial example is given by $\langle U, \pi\rangle = \langle \{1,2,3\}, \{\{1,2\},\{3\}\}\rangle$ and $\langle V, \sigma\rangle = \langle \{a,b,c\}, \{\{a,b\},\{c\}\}\rangle$. The mapping $f$ that maps $1$, $2$, and $3$ to $a$, $b$, and $c$, respectively, is a monomorphism; in fact, $f$ is an isomorphism. Hence, $G(\pi) = G(\sigma)$. A direct computation shows that $G(\pi) = 0.5 = G(\sigma)$.

Consider $\langle U, \pi\rangle = \langle \{1,2\}, \{\{1,2\}\}\rangle$ and $\langle V, \sigma\rangle = \langle \{a,b,c,d\}, \{\{a,b\},\{c\},\{d\}\}\rangle$. The mapping $f$ that maps $1$ and $2$ to $a$ and $b$, respectively, is a monomorphism, which yields that $\langle U, \pi\rangle$ is isomorphic to $\langle V_1, \sigma_1\rangle = \langle \{a,b\}, \{\{a,b\}\}\rangle$. Clearly, we can obtain $\langle V, \sigma\rangle$ by one-point extensions of $\langle V_1, \sigma_1\rangle$. Therefore, $G(\pi) = G(\sigma)$. On the other hand, a computation gives $G(\pi) = 0.5 = G(\sigma)$.

Finally, consider $\langle U, \pi\rangle = \langle \{1,2\}, \{\{1,2\}\}\rangle$ and $\langle V, \sigma\rangle = \langle \{a,b,c,d\}, \{\{a,b\},\{c,d\}\}\rangle$. As mentioned earlier, the mapping $f$ that maps $1$ and $2$ to $a$ and $b$, respectively, is a monomorphism, which gives an isomorphism between $\langle U, \pi\rangle$ and $\langle V_1, \sigma_1\rangle = \langle \{a,b\}, \{\{a,b\}\}\rangle$. We can obtain $\langle V_2, \sigma_2\rangle = \langle \{a,b,c,d\}, \{\{a,b\},\{c\},\{d\}\}\rangle$ by one-point extensions of $\langle V_1, \sigma_1\rangle$. Clearly, $\sigma_2 \prec \sigma$. As a result, $G(\pi) < G(\sigma)$. On the other hand, a direct computation gives $G(\pi) = 0.5$ and $G(\sigma) = 1$.

5. Conclusion

In this paper, we have proposed novel notions of entropy and co-entropy that take both partitions and the lower and upper approximations into account. Some desirable properties of the entropy and co-entropy have been presented. Furthermore, we have investigated the relationship of co-entropies between different universes. There are several problems worth further study.
Firstly, the present work focuses on classical rough sets based on partitions. It would be interesting to generalize the notions of entropy and co-entropy introduced here to the framework of covering rough sets [4, 19, 29] or fuzzy rough sets [5]. It is also interesting to compare the entropies (co-entropies) under some special homomorphisms, such as the neighborhood-consistent functions introduced in [33]. Secondly, it remains to develop a corresponding roughness measure, based on the entropy or co-entropy, for numerically measuring the roughness of an approximation. Finally, the conditional entropy and conditional co-entropy [2] are yet to be addressed in our framework.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 60821001, 60873191, 60903152, and 61070251.

References

[1] Beaubouef, T., Petry, F. E., Arora, G., 1998. Information-theoretic measures of uncertainty for rough sets and rough relational databases. Inform. Sci. 109, 185–195.
[2] Bianucci, D., Cattaneo, G., 2009. Information entropy and granulation co-entropy of partitions and coverings: A summary. In: Peters, J., Skowron, A., Wolski, M., Chakraborty, M., Wu, W.-Z. (Eds.), Transactions on Rough Sets X. Vol. 5656 of Lect. Notes Comput. Sci. Springer, Berlin/Heidelberg, pp. 15–66.
[3] Bianucci, D., Cattaneo, G., Ciucci, D., 2007. Entropies and co-entropies of coverings with application to incomplete information systems. Fund. Inform. 75, 77–105.
[4] Bryniarski, E., 1989. A calculus of rough sets of the first order. Bull. Pol. Acad. Sci. 36 (16), 71–77.
[5] Dubois, D., Prade, H., 1990. Rough fuzzy sets and fuzzy rough sets. Int. J. Gen. Syst. 17, 191–208.
[6] Düntsch, I., Gediga, G., 1998. Uncertainty measures of rough set prediction. Artif. Intell. 106 (1), 109–137.
[7] Hartley, R. V. L., 1928. Transmission of information. Bell Syst. Tech. J. 7, 535–564.
[8] Liang, J., Shi, Z., Li, D., Wierman, M. J., 2006. Information entropy, rough entropy and knowledge granulation in incomplete information systems. Int. J. Gen. Syst. 35 (6), 641–654.
[9] Liang, J. Y., Chin, K. S., Dang, C. Y., Yam, R. C. M., 2002. A new method for measuring uncertainty and fuzziness in rough set theory. Int. J. Gen. Syst. 31, 331–342.
[10] Liang, J. Y., Qian, Y. H., 2008. Information granules and entropy theory in information systems. Sci. China Ser. F: Inform. Sci. 51 (10), 1427–1444.
[11] Liang, J. Y., Shi, Z. Z., 2004. The information entropy, rough entropy and knowledge granulation in rough set theory. Int. J. Uncert. Fuzz. Knowl. Syst. 12 (1), 37–46.
[12] Lin, T. Y., Cercone, N. (Eds.), 1997. Rough Sets and Data Mining. Kluwer Academic Publishers, Boston.
[13] Miao, D. Q., Wang, J., 1998. On the relationships between information entropy and roughness of knowledge in rough set theory (in Chinese). Pattern Recognit. Artif. Intell. 11 (1), 34–40.
[14] Pawlak, Z., 1982. Rough sets. Int. J. Comput. Inform. Sci. 11 (5), 341–356.
[15] Pawlak, Z., 1991. Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Boston.
[16] Pawlak, Z., 1992. Rough sets: a new approach to vagueness. In: Zadeh, L. A., Kacprzyk, J. (Eds.), Fuzzy Logic for the Management of Uncertainty. John Wiley & Sons, Inc., New York, pp. 105–118.
[17] Polkowski, L., Skowron, A. (Eds.), 1998. Rough Sets and Current Trends in Computing. Vol. 1424. Springer, Berlin.
[18] Polkowski, L., Skowron, A. (Eds.), 1998. Rough Sets in Knowledge Discovery. Vol. 1 and 2. Physica-Verlag, Heidelberg.
[19] Pomykala, J. A., 1987. Approximation operations in approximation space. Bull. Pol. Acad. Sci. 35 (9-10), 653–662.
[20] Qian, Y. H., Liang, J. Y., 2008. Combination entropy and combination granulation in rough set theory. Int. J. Uncert. Fuzz. Knowl. Syst. 16 (2), 179–193.
[21] Sen, D., Pal, S. K., 2009. Generalized rough sets, entropy, and image ambiguity measures. IEEE Trans. Syst., Man, Cybern. B, Cybern. 39 (1), 117–128.
[22] Shannon, C. E., 1948. A mathematical theory of communication, I, II. Bell Syst. Tech. J. 27, 379–423, 623–656.
[23] Wierman, M., 1999. Measuring uncertainty in rough set theory. Int. J. Gen. Syst. 28, 283–297.
[24] Xu, W. H., Zhang, X. Y., Zhang, W. X., 2009. Knowledge granulation, knowledge entropy and knowledge uncertainty measure in ordered information systems. Appl. Soft Comput. 9 (4), 1244–1251.
[25] Yao, Y. Y., 2003. Information-theoretic measures for knowledge discovery and data mining. In: Karmeshu (Ed.), Entropy Measures, Maximum Entropy and Emerging Applications. Springer, Berlin, pp. 115–136.
[26] Yao, Y. Y., 2003. Probabilistic approaches to rough sets. Expert Syst. 20 (5), 287–297.
[27] Yao, Y. Y., 2010. Notes on rough set approximations and associated measures. J. Zhejiang Ocean Univ. (Natur. Sci.) 29, 399–410.
[28] Yao, Y. Y., Zhao, L. Q., 2009. Granularity of partitions (private communication).
[29] Zakowski, W., 1983. Approximations in the space (u, π). Demonstr. Math. 16, 761–769.
[30] Zhong, N., Yao, Y., Ohshima, M., 2003. Peculiarity oriented multidatabase mining. IEEE Trans. Knowl. Data Eng. 15 (4), 952–960.
[31] Zhu, P., 2009. An axiomatic approach to the roughness measure of rough sets. Fund. Inform. To appear; available as arXiv preprint arXiv:0911.5395.
[32] Zhu, P., 2009. An improved axiomatic definition of information granulation. arXiv preprint arXiv:0908.3999.
[33] Zhu, P., Wen, Q., 2010. Some improved results on communication between information systems. Inform. Sci. 180 (18), 3521–3531.
[34] Zhu, W., Wang, F. Y., 2006. Covering based granular computing for conflict analysis. In: IEEE Int. Conf. Intell. Secur. Inform. (ISI 2006), San Diego, CA, May 23-24, 2006. Vol. 3975 of Lect. Notes Comput. Sci., pp. 566–571.
[35] Ziarko, W. (Ed.), 1994. Rough Sets, Fuzzy Sets, and Knowledge Discovery. Springer-Verlag, Berlin.
