Experts Fusion and Multilayer Perceptron Based on Belief Learning for Sonar Image Classification
Authors: Arnaud Martin (E3I2), Christophe Osswald (E3I2)
Experts Fusion and Multilayer Perceptron Based on Belief Learning for Sonar Image Classification

Arnaud Martin, ENSIETA E3I2 EA3876, 2 rue François Verny, 29806 Brest Cedex 09, France. Email: Arnaud.Martin@ensieta.fr
Christophe Osswald, ENSIETA E3I2 EA3876, 2 rue François Verny, 29806 Brest Cedex 09, France. Email: Christophe.Osswald@ensieta.fr

Abstract—Sonar images provide a rapid view of the seabed in order to characterize it. However, in such an uncertain environment, the real seabed is unknown and the only information we can obtain is the interpretation of different human experts, who are sometimes in conflict. In this paper, we propose to manage this conflict in order to provide a robust reality for the learning step of classification algorithms. The classification is conducted by a multilayer perceptron that takes into account the uncertainty of the reality in the learning stage. The results of this seabed characterization are presented on real sonar images.

I. INTRODUCTION

Seabed characterization serves many useful purposes, e.g. helping the navigation of Autonomous Underwater Vehicles or providing data to sedimentologists. In such sonar applications, seabed images are obtained with many imperfections [1]. Indeed, in order to build images, a huge number of physical data (geometry of the device, coordinates of the ship, movements of the sonar, etc.) are taken into account, but these data are polluted by a large amount of noise caused by instrumentation. In addition, there are interferences due to the signal traveling along multiple paths (reflection on the bottom or surface), due to speckle, and due to fauna and flora. Therefore, sonar images have many imperfections such as imprecision and uncertainty; thus sediment classification on sonar images is a difficult problem even for human experts.
In this kind of application, the reality is unknown and different experts can propose different classifications of the image. Figure 1 exhibits the differences between the interpretation and the certainty of two sonar experts trying to differentiate the type of sediment (rock, cobbles, sand, ripple, silt) or shadow when the information is not visible. Each color corresponds to a kind of sediment, and the associated certainty of the expert for this sediment is expressed in terms of sure, moderately sure and not sure. Thus, in order to train an automatic classification algorithm, we must take into account this difference and the uncertainty of each expert. For example, how must a tile of rock labeled as not sure be taken into account in the learning step of the classifier, and how should this tile be taken into account if another expert says that it is sand?

Textured image classification, such as sonar image classification, is generally done on a local part of the image (a pixel, or most of the time small tiles of e.g. 16×16 or 32×32 pixels). Usual sonar image classification methods are supervised [2], [3], [1] and can be described in three steps. First, significant features are extracted from these tiles. Generally, a second step is necessary in order to reduce these features, because they are too numerous. In the third step, these features feed classification algorithms. The particularity of considering small tiles in image classification is that sometimes two or more classes can co-exist on a tile. How should the tiles with more than one sediment be taken into account?

Fig. 1. Segmentation given by two experts.

Many fusion theories can be used for experts fusion in image classification, such as voting rules [4], [5], possibility theory [6], [7], and belief function theory [8], [9], [10], [11]. In our case, experts can express their certitude on their perception.
As a result, probability-based theories such as the Bayesian theory or the belief function theory are more adapted. Indeed, the possibility theory is more adapted to model imprecise data, whereas probability-based theories are more adapted to model uncertain data. Of course, both possibility and probability-based theories can handle imprecise and uncertain data at the same time, but not so easily. That is why our choice falls on the belief function theory, also called the Dempster-Shafer theory [8], [9] or the Transferable Belief Model [10], [11]. We can divide the fusion approach into two levels: the credal level and the decision level. The credal level can be described in three stages: the belief function model, the estimation of some parameters depending on the model (not always necessary), and the combination. The most difficult step is presumably the first one: the belief function model, from which the other steps follow.

The paper is organized as follows: in a first section we recall the bases of the Transferable Belief Model. Next, we present an approach of experts fusion in order to obtain a reality on our sonar images. We then propose a new multilayer perceptron based on belief learning. In the last section, we show the results of the classification of sonar images.

II. TRANSFERABLE BELIEF MODEL BASES

A. Credal level

1) Belief Function Models: Consider the space of discernment Θ = {C_1, C_2, ..., C_N}, where C_i is the hypothesis "the considered tile belongs to the class i". The belief functions can be expressed in several forms: the basic belief assignment (bba) m, the credibility function bel, and the plausibility function pl, which are in one-to-one correspondence.
The basic belief assignment (bba) m is defined as a mapping of the power set 2^Θ (formed by all the disjunctions of Θ) onto [0, 1], with:

\sum_{X \in 2^\Theta} m(X) = 1.   (1)

In the open world case [10]:

m(\emptyset) > 0.   (2)

These simple conditions in equations (1) and (2) give a large panel of definitions of the bba, which is one of the difficulties of the theory. The belief functions must therefore be chosen according to the intended application.

The credibility function is given for all X ∈ 2^Θ by:

\mathrm{bel}(X) = \sum_{Y \in 2^X, Y \neq \emptyset} m(Y).   (3)

The plausibility function is given for all X ∈ 2^Θ by:

\mathrm{pl}(X) = \sum_{Y \in 2^\Theta, Y \cap X \neq \emptyset} m(Y) = \mathrm{bel}(\Theta) - \mathrm{bel}(X^c),   (4)

where X^c is the complement of X.

2) Combination rules: Many combination rules have been proposed in recent years in the context of the belief function theory ([12], [13], [10], [14], [15], etc.). In the context of the TBM, the combination rule most used today seems to be the conjunctive rule, given by [10] for all X ∈ 2^Θ by:

m_c(X) = \sum_{Y_1 \cap \ldots \cap Y_M = X} \prod_{j=1}^{M} m_j(Y_j),   (5)

where Y_j ∈ 2^Θ is the response of the expert j, and m_j(Y_j) the associated belief function. However, the conflict (that is given by m_c(∅)) can be redistributed on partial ignorance as in the Dubois and Prade rule [13], a mixed conjunctive and disjunctive rule given for all X ∈ 2^Θ, X ≠ ∅ by:

m_{DP}(X) = \sum_{Y_1 \cap \ldots \cap Y_M = X} \prod_{j=1}^{M} m_j(Y_j) + \sum_{\substack{Y_1 \cup \ldots \cup Y_M = X \\ Y_1 \cap \ldots \cap Y_M = \emptyset}} \prod_{j=1}^{M} m_j(Y_j),   (6)

where Y_j ∈ 2^Θ is the response of the expert j, and m_j(Y_j) the associated belief function.

We have proposed another proportional conflict redistribution rule [15] for M experts, for X ∈ 2^Θ, X ≠ ∅:

m_{PCR}(X) = m_c(X) + \sum_{i=1}^{M} m_i(X)^2
\times \sum_{\substack{(Y_{\sigma_i(1)}, \ldots, Y_{\sigma_i(M-1)}) \in (2^\Theta)^{M-1} \\ \left(\bigcap_{k=1}^{M-1} Y_{\sigma_i(k)}\right) \cap X = \emptyset}} \frac{\prod_{j=1}^{M-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})}{m_i(X) + \sum_{j=1}^{M-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})},   (7)

where:

\sigma_i(j) = j \text{ if } j < i, \qquad \sigma_i(j) = j + 1 \text{ if } j \geq i,   (8)

with m_i(X) + \sum_{j=1}^{M-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)}) \neq 0, and where m_c is the conjunctive consensus rule given by equation (5). This rule allows a proportional redistribution of the conflict on the subsets from which the conflict comes, and is equivalent for two experts to the rule given in [16]. This rule will be illustrated on simple examples in the next section. These rules are compared in [17].

B. Decision level

The decision is a difficult task. No measure is able to provide the best decision in all cases. Generally, we consider the maximum of one of three functions: credibility, plausibility, and pignistic probability. The pignistic probability, introduced by [18], is here given for all X ∈ 2^Θ, with X ≠ ∅, by:

\mathrm{betP}(X) = \sum_{Y \in 2^\Theta, Y \neq \emptyset} \frac{|X \cap Y|}{|Y|} \frac{m(Y)}{1 - m(\emptyset)}.   (9)

If the credibility function provides a pessimistic decision, the plausibility function is often too optimistic. The pignistic probability is usually taken as a compromise.

III. EXPERTS FUSION

In order to fuse the opinions of different experts on a given tile X, we have to take into account the certainty of the experts and the proportion of the two (or more) sediments, not only one focal element. In this case, the space of discernment Θ represents the different kinds of sediments on sonar images, such as rock, sand, silt, cobble, ripple or shadow (which means no sediment information). The experts give their perception and belief according to their certainty. For instance, an expert can be moderately sure of his choice when he labels one part of the image as belonging to a certain class, and be totally doubtful on another part of the image.
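The credal and decision levels described above can be sketched in a few lines of code. Below is a minimal illustration (not the authors' implementation) of the conjunctive rule (5) and the pignistic transform (9), with bbas represented as dictionaries mapping focal sets (frozensets over Θ) to masses:

```python
from itertools import product

def conjunctive(*bbas):
    """Conjunctive rule (5): combine bbas given as {frozenset: mass} dicts."""
    result = {}
    for focal_pairs in product(*[b.items() for b in bbas]):
        inter = frozenset.intersection(*[fs for fs, _ in focal_pairs])
        mass = 1.0
        for _, m in focal_pairs:
            mass *= m
        result[inter] = result.get(inter, 0.0) + mass
    return result

def pignistic(bba):
    """Pignistic transform (9) on singletons, with the conflict renormalized."""
    k = 1.0 - bba.get(frozenset(), 0.0)
    singletons = set().union(*[fs for fs in bba if fs])
    return {w: sum(m / len(fs) for fs, m in bba.items() if fs and w in fs) / k
            for w in singletons}
```

For two experts with m_1({A}) = 0.6, m_1({A,B}) = 0.4 and m_2({A}) = 0.3, m_2({B}) = 0.2, m_2({A,B}) = 0.5, this sketch yields m_c(∅) = 0.12 and betP(A) ≈ 0.7955.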
Moreover, on a considered tile, more than one sediment can be present. Consequently, we have to take into account all these aspects of the application. In order to simplify, we consider only two classes in the following: rock, referred to as A, and sand, referred to as B. The proposed models can be easily extended, but their study is easier to understand with only two classes.

Hence, on certain tiles, A and B can be present for one or more experts. The belief functions have to take into account the certainty given by the experts (referred to respectively as c_A and c_B, two numbers in [0, 1]) as well as the proportion of each kind of sediment in the tile X (referred to as p_A and p_B, also two numbers in [0, 1]). We have two interpretations of "the expert believes A": it can mean that the expert thinks that there is A on X and not B, or it can mean that the expert thinks that there is A on X and that there can also be B, but he does not say anything about it. The first interpretation yields that hypotheses A and B are exclusive, and with the second they are not exclusive. We only study the first case: A and B are exclusive. But on the tile X, the expert can also provide A and B; in this case the two propositions "the expert believes A" and "the expert believes A and B" are not exclusive.

We propose a model considering only one belief function according to the proportions, given by:

m(A) = p_A c_A,
m(B) = p_B c_B,
m(A ∪ B) = 1 - (p_A c_A + p_B c_B).   (10)

For instance, consider two experts providing their opinions on the tile X. The first expert says that on tile X there is some rock A with a certainty equal to 0.6. Hence for this first expert we have: p_A = 1, p_B = 0, and c_A = 0.6. The second expert thinks that there are 50% of rock and 50% of sand on the considered tile X, with respective certainties of 0.6 and 0.4. Hence for the second expert we have: p_A = 0.5, p_B = 0.5, c_A = 0.6 and c_B = 0.4. We illustrate all our proposed models with this numerical example. Consequently, we have simply:

          A       B       A ∪ B
m_1      0.6     0       0.4
m_2      0.3     0.2     0.5

The non-normalized conjunctive rule, the credibility, the plausibility and the pignistic probability are given by:

element    m_c      bel      pl       betP
∅          0.12     0        0        -
A          0.6      0.6      0.8      0.7955
B          0.08     0.08     0.28     0.2045
A ∪ B      0.2      0.88     0.88     1

In this case we do not have the possibility to decide on A ∩ B, because the conflict is on ∅. The PCR rule provides:

element    m_PCR    bel      pl       betP
∅          0        0        0        -
A          0.69     0.69     0.89     0.79
B          0.11     0.11     0.31     0.21
A ∪ B      0.2      1        1        1

where m_PCR(A) = 0.60 + 0.09 = 0.69 and m_PCR(B) = 0.08 + 0.03 = 0.11. With the PCR rule, the decision will also be A. Of course, we cannot say on this example which rule is the best, but we can apply these two rules in order to construct a reality taking into account the doubts of the different experts. This reality can serve to train a classifier and also to evaluate this classifier. We can use many supervised classifiers. In the next section, we propose to introduce a new classifier: a multilayer perceptron based on belief learning, which takes into account all the richness of the basic belief assignment.

IV. MULTILAYER PERCEPTRON BASED ON BELIEF LEARNING

We propose in this section a new belief multilayer perceptron, where the difference from the classic multilayer perceptron lies in the learning, which is based on belief learning. In [19], a neural network classifier based on Dempster-Shafer theory is presented. In that work, the neural network considers the bba at each neuron, which is not the case in the approach presented hereafter.

A. A multilayer perceptron

Neural network classifiers are today the most used supervised classifiers.
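The numbers in the tables above can be reproduced with a short sketch: the masses follow model (10), and for two experts the PCR rule (7) sends each partial conflict m_1(Y_1)m_2(Y_2), Y_1 ∩ Y_2 = ∅, back to Y_1 and Y_2 in proportion to the masses involved. The helper names below are ours, not the paper's:

```python
def mass_from_expert(pA, cA, pB, cB):
    """Model (10): bba on {A, B, A u B} from proportions and certainties."""
    mA, mB = pA * cA, pB * cB
    return {'A': mA, 'B': mB, 'AB': 1.0 - (mA + mB)}

def pcr_two_experts(m1, m2):
    """PCR rule (7) specialized to two experts on Theta = {A, B}."""
    # conjunctive consensus (5)
    m = {
        'A': m1['A'] * (m2['A'] + m2['AB']) + m1['AB'] * m2['A'],
        'B': m1['B'] * (m2['B'] + m2['AB']) + m1['AB'] * m2['B'],
        'AB': m1['AB'] * m2['AB'],
    }
    # each partial conflict m1(X)m2(Y), X and Y disjoint, is redistributed
    # to X and Y proportionally to m1(X) and m2(Y)
    for x, y in (('A', 'B'), ('B', 'A')):
        conflict = m1[x] * m2[y]
        if conflict > 0:
            total = m1[x] + m2[y]
            m[x] += conflict * m1[x] / total
            m[y] += conflict * m2[y] / total
    return m

m1 = mass_from_expert(pA=1.0, cA=0.6, pB=0.0, cB=0.0)  # first expert
m2 = mass_from_expert(pA=0.5, cA=0.6, pB=0.5, cB=0.4)  # second expert
m_pcr = pcr_two_experts(m1, m2)
```

Running this reproduces m_PCR(A) = 0.69, m_PCR(B) = 0.11 and m_PCR(A ∪ B) = 0.2 from the table above.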
The multilayer perceptron (MLP) is a feedforward fully connected neural network. The tile X is described by n features (x_1, ..., x_n). Each unit of the network is an artificial neuron called a perceptron, with the structure given in figure 2. All the neuron outputs of every layer are connected to all the neuron inputs of the next layer, weighted by values we have to learn. These weights are first initialized with small random values. In order to learn these values, we present to the network the learning vectors and the corresponding desired outputs. The objective of the learning process is to minimize the quadratic error:

\epsilon = \frac{1}{2} \sum_{i=1}^{N} (d_i - s_i)^2,   (11)

Fig. 2. Artificial neuron structure.

where s_i are the obtained outputs of the multilayer perceptron and d_i is 1 if the class of X is C_i and 0 otherwise. As shown in figure 2, we can use the sigmoid function given by:

f(x) = \frac{1}{1 + e^{-x}}.   (12)

We thus obtain the learning algorithm called the backpropagation algorithm, which gives for the iteration t + 1:

w_{l_1 l_2}(t+1) = w_{l_1 l_2}(t) + \eta \, \delta_{l_2}(t) \, s_{l_1}(t),   (13)

where w_{l_1 l_2} is the weight between the neuron l_1 of one layer and the neuron l_2 of the following layer, η stands for the learning rate, s_{l_1}(t) is the obtained output of the neuron l_1 at the iteration t, and δ_{l_2}(t) is given by:

\delta_i(t) = c \, s_i(t)(1 - s_i(t))(d_i - s_i(t)),   (14)

if l_2 = i is on the output layer, where the constant c controls the slope of the sigmoid function, and

\delta_{l_2}(t) = c \, s_{l_2}(t)(1 - s_{l_2}(t)) \sum_{l} \delta_l(t) \, w_{l l_2}(t),   (15)

otherwise.

B. Belief learning

The use of uncertain and imprecise data for learning has been studied in [20], [21] for decision trees and in [22], [23] for a credal EM approach.
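Equations (12)-(15) can be condensed into a small sketch of one training step for a one-hidden-layer MLP. This is an illustrative toy, not the configuration used in the experiments; the values of η and c are arbitrary:

```python
import math

def sigmoid(x, c=1.0):
    """Sigmoid activation (12) with slope constant c."""
    return 1.0 / (1.0 + math.exp(-c * x))

def train_step(x, d, w_hidden, w_out, eta=0.5, c=1.0):
    """One backpropagation step, equations (13)-(15); weights updated in place."""
    # forward pass
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)), c) for ws in w_hidden]
    s = [sigmoid(sum(w * hi for w, hi in zip(ws, h)), c) for ws in w_out]
    # output-layer deltas, equation (14)
    delta_out = [c * si * (1 - si) * (di - si) for si, di in zip(s, d)]
    # hidden-layer deltas, equation (15)
    delta_h = [c * hi * (1 - hi) *
               sum(delta_out[i] * w_out[i][j] for i in range(len(s)))
               for j, hi in enumerate(h)]
    # weight updates, equation (13)
    for i, ws in enumerate(w_out):
        for j in range(len(ws)):
            ws[j] += eta * delta_out[i] * h[j]
    for j, ws in enumerate(w_hidden):
        for k in range(len(ws)):
            ws[k] += eta * delta_h[j] * x[k]
    # quadratic error (11) before the update
    return 0.5 * sum((di - si) ** 2 for di, si in zip(d, s))
```

Repeated calls on a fixed pair (x, d) decrease the quadratic error (11), as expected of gradient descent.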
In the previous approach, the learning set L is composed of K examples (X_t, C_t), t = 1, ..., K, where X_t is a tile (an n-dimensional vector given by n features calculated on the tile) and C_t ∈ Θ the class of X_t. The learning set is also given by the couples (X_t, d_t), with d_t the function equal to 1 if the class of X_t is C_t and 0 otherwise. The belief learning is based on the use of a learning set \tilde{L} given by:

\tilde{L} = \{(X_t, m_t^\Theta), \, t = 1, \ldots, K\},   (16)

where m_t^Θ is the bba defined on Θ. In our case, a human expert cannot provide with certainty the class of a given tile X, and according to the experts, more than one class can be present on the tile X. Hence we cannot have the function d_i that is 1 if the class of X is C_i and 0 otherwise. The simple idea of belief learning for the multilayer perceptron is to consider the basic belief assignment in order to minimize the error ε given by equation (11). Hence, we obtain 2^{|Θ|} neurons on the output level and we can stay in the credal level.

C. Decision level

Usually the decision is taken by considering the maximum of the values on the output layer. These values are between 0 and 1, but their sum is not 1. We can easily normalize them in order to interpret these values as a basic belief assignment. For instance, the normalization can be made by dividing by the sum of the values of the output layer. Hence, the decision can be conducted by the maximum of the pignistic probability, or with another function such as the credibility or the plausibility. Note that if the output layer is composed only of the singletons, considering the maximum of the values or the maximum of the pignistic probability is the same.

V. ILLUSTRATION

A. Database

Our database contains 42 sonar images provided by the GESMA (Groupe d'Etudes Sous-Marines de l'Atlantique).
These images were obtained with a Klein 5400 lateral sonar with a resolution of 20 to 30 cm in azimuth and 3 cm in range. The sea-bottom depth was between 15 m and 40 m. Three experts have manually segmented these images, giving the kind of sediment (rock, cobble, sand, silt, ripple (horizontal, vertical or at 45 degrees)), shadow or other (typically ships) parts on the images, helped by the manual segmentation interface presented in figure 3. All sediments are given with a certainty level (sure, moderately sure or not sure), and the boundary between two sediments is also given with a certainty level (sure, moderately sure or not sure). Hence, every pixel of every image is labeled as being either a certain type of sediment, a shadow, other, or a boundary, with one of the three certainty levels. We choose the weights 2/3, 1/2 and 1/3 for the certainty levels sure, moderately sure and not sure, respectively.

Fig. 3. Manual Segmentation Interface.

B. Experts Fusion

In order to obtain a kind of reality for the learning task, we first fuse the opinions of the three experts following the presented model. We note A for rock, B for sand, C for cobble, D for silt, E for ripple, F for shadow and G for other; hence we have seven classes and Θ = {A, B, C, D, E, F, G}. We have applied our model on tiles of size 64×64 pixels, given by:

m(A) = p_{A1} c_1 + p_{A2} c_2 + p_{A3} c_3
m(B) = p_{B1} c_1 + p_{B2} c_2 + p_{B3} c_3
m(C) = p_{C1} c_1 + p_{C2} c_2 + p_{C3} c_3
m(D) = p_{D1} c_1 + p_{D2} c_2 + p_{D3} c_3
m(E) = p_{E1} c_1 + p_{E2} c_2 + p_{E3} c_3
m(F) = p_{F1} c_1 + p_{F2} c_2 + p_{F3} c_3
m(G) = p_{G1} c_1 + p_{G2} c_2 + p_{G3} c_3
m(Θ) = 1 - (m(A) + m(B) + m(C) + m(D) + m(E) + m(F) + m(G)),   (17)

where c_1, c_2 and c_3 are the weights associated with the certainty levels "sure", "moderately sure" and "not sure" respectively (e.g.
here c_1 = 2/3, c_2 = 1/2 and c_3 = 1/3). Indeed, we have to consider the cases where the same kind of sediment (but with different certainties) is present on the same tile. The proportion of each sediment in the tile associated with these weights is noted, for instance for A: p_{A1}, p_{A2} and p_{A3}. In order to provide a reality for the learning, the experts can be fused by the non-normalized conjunctive rule or the generalized PCR, as seen before, and the decision can be taken on the maximum of the pignistic probability.

The total conflict between the three experts is 0.2432. This conflict comes essentially from the difference of opinion of the experts and not from the tiles with more than one sediment. Indeed, we have a weak auto-conflict (conflict coming from the combination of the same expert three times). The values of the auto-conflict for the three experts are: 0.0841, 0.0840, and 0.0746. We note a difference of decision between the combination rules given by equations (7) for the generalized PCR and (5) for the conjunctive rule. The proportion of tiles with a different decision is 1.01% between the generalized PCR and the conjunctive rule. However, we cannot evaluate on this database which combination rule is the best.

C. Results

In order to classify the tiles of size 64×64 pixels, we first have to extract texture parameters from each tile. Here, we choose the co-occurrence matrices approach [1]. The co-occurrence matrices are calculated by counting the occurrences of identical gray-level pairs of two pixels. Four directions are considered: 0, 45, 90 and 135 degrees. For these four directions, six parameters given by [24] are calculated: homogeneity, contrast estimation, entropy estimation, correlation, directivity, and uniformity. This classical approach yields 24 parameters.
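The feature extraction just described can be sketched as follows. This is a minimal illustration assuming quantized gray levels; the exact normalizations of the six parameters of [24] may differ, and only three of them are shown:

```python
import math

def cooccurrence(tile, dx, dy, levels=8):
    """Normalized gray-level co-occurrence matrix for one direction (dx, dy)."""
    p = [[0] * levels for _ in range(levels)]
    rows, cols = len(tile), len(tile[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < rows and 0 <= j2 < cols:
                p[tile[i][j]][tile[i2][j2]] += 1
                total += 1
    return [[v / total for v in row] for row in p]

def features(p):
    """Three of the classical parameters: homogeneity, contrast, entropy."""
    n = len(p)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))
    contrast = sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    return homogeneity, contrast, entropy

# the four directions used in the paper: 0, 45, 90 and 135 degrees
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1)]
```

Applying a routine like features to the four directional matrices of a tile gives the per-direction texture parameters, which are then concatenated into the tile's feature vector.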
The problem with co-occurrence matrices is their non-invariance under translation. Typically, this problem can appear in the characterization of a ripple texture. More feature extraction approaches can be used, such as the run-length matrix, the wavelet transform and the Gabor filters [1].

Hence, each tile is represented by the 24 parameters, and we can try to classify the tiles with the multilayer perceptron and the belief multilayer perceptron. The input layer contains 24 neurons, and the output layer contains 7 neurons (one for each class). For the belief multilayer perceptron, the mass calculated by the fusion of the three experts according to the model given in (17) drives the learning. We test both combinations, given by the conjunctive non-normalized rule (5) and the PCR rule (7). The mass model gives focal elements only on the singletons and the ignorance Θ. In order to learn only on the singletons, we consider only the bba given on the singletons, and we renormalize them so as to obtain one for the singleton with maximum belief; the output values are thus not bbas in all cases. Hence, the output layer of the belief multilayer perceptron is composed of only seven neurons (one for each class). Of course, it could be more interesting to keep 2^{|Θ|} = 128 neurons on the last layer in order to stay in the credal level and keep the power of this classifier. However, this is possible only if enough data are available for the learning.

In order to take a decision on a bba with the maximum of the pignistic probability, we annul the minimum value of the output layer and then normalize by the sum of the values. Here it is similar to deciding on the maximum of the values of the output layer, but it is not the same in all cases, as shown afterwards. On the 42 sonar images, we have 9266 tiles of size 64×64 pixels. Our database has been randomly divided into two parts.
The first one (2/3 of the database) is used for the multilayer perceptron and belief multilayer perceptron learning, and the second one for tests. We repeat this random division 30 times in order to achieve a good estimation of the classification rate, and we analyze the mean good-classification rate, defined as the number of correctly classified tiles divided by the total number of tiles.

With the non-normalized conjunctive rule, we obtain a 64.49% good-classification rate (with a confidence interval of [64.07; 64.91]) for the classic multilayer perceptron and a 65.10% good-classification rate (with a confidence interval of [64.72; 65.48]) for the belief multilayer perceptron. If the reality is obtained by the generalized PCR, we have a 64.96% good-classification rate (with a confidence interval of [64.44; 65.25]) for the classic multilayer perceptron and a 64.84% good-classification rate (with a confidence interval of [64.55; 65.39]) for the belief multilayer perceptron.

The evaluation is made on an unknown reality, so we cannot say that the experts fusion given by the non-normalized conjunctive rule is better than the experts fusion obtained by the generalized PCR rule. In the case of the non-normalized conjunctive rule, the belief multilayer perceptron gives significantly better good-classification rates than the multilayer perceptron. In the case of the generalized PCR rule, the results are not significantly different. However, if we repeat the random division 1000 times, we obtain a 65.043% good-classification rate (with a confidence interval of [64.97; 65.11]) for the classic multilayer perceptron and a 65.125% good-classification rate (with a confidence interval of [65.06; 65.19]) for the belief multilayer perceptron, when the reality is obtained by the generalized PCR.
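Confidence intervals such as the ones above can be obtained, under a normal approximation, from the mean and sample standard deviation of the good-classification rate over the repeated random divisions. A minimal sketch with made-up rates (the paper does not state its exact interval construction, so this is an assumption):

```python
import math
import statistics

def confidence_interval(rates, z=1.96):
    """Approximate 95% confidence interval for the mean classification rate."""
    mean = statistics.mean(rates)
    half = z * statistics.stdev(rates) / math.sqrt(len(rates))
    return mean - half, mean + half

# hypothetical good-classification rates over repeated train/test divisions
rates = [64.4, 65.1, 64.8, 64.6, 65.3, 64.9, 65.0, 64.7]
low, high = confidence_interval(rates)
```

As the number of random divisions grows from 30 to 1000, the interval tightens roughly as 1/sqrt(n), which is consistent with the narrower intervals reported for the 1000-division experiment.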
These results show that, here also, the belief multilayer perceptron significantly improves the classification rates.

Another interest of the belief multilayer perceptron comes from the decision step. For instance, cobble can be seen as a doubt between rock and sand, depending on the size of the tile. Hence, the class C can be rewritten as A ∪ B. The learning will be the same as previously, but the decision by the maximum of the pignistic probability will provide another result. We cannot compare these results with the classic multilayer perceptron, because its decision step is taken by the maximum of the values of the output layer. Another example can be given if we consider the class shadow as the absence of information; e.g. we can associate this class with the ignorance and rewrite the model as:

m(A) = p_{A1} c_1 + p_{A2} c_2 + p_{A3} c_3
m(B) = p_{B1} c_1 + p_{B2} c_2 + p_{B3} c_3
m(C) = p_{C1} c_1 + p_{C2} c_2 + p_{C3} c_3
m(D) = p_{D1} c_1 + p_{D2} c_2 + p_{D3} c_3
m(E) = p_{E1} c_1 + p_{E2} c_2 + p_{E3} c_3
m(G) = p_{G1} c_1 + p_{G2} c_2 + p_{G3} c_3
m(Θ) = 1 - (m(A) + m(B) + m(C) + m(D) + m(E) + m(G)).   (18)

The ignorance Θ can be learned and so be represented by a neuron on the output layer. Here also, the decision by the maximum of the pignistic probability provides another result, which we cannot compare with the classic multilayer perceptron.

VI. CONCLUSIONS

We have proposed in this paper two different fusion approaches for sonar image processing. The first novelty is the experts fusion model, which we can apply in many image processing problems. Indeed, if some images represent uncertain environments, the reality is unknown and we must compose and propose a reality (e.g. in order to train a classifier) from the experts' opinions.
In this kind of environment, experts cannot say with certainty what is exactly on the images, and we have to take into account the doubt of the experts in order to describe the images. The second novelty is that the multilayer perceptron with belief learning significantly improves on the classic multilayer perceptron. It could be more interesting to keep 2^{|Θ|} neurons on the last layer in order to stay in the credal level and keep the power of this classifier. Hence, this classifier can provide a belief on every subset of 2^Θ, and the decision can be made on this space. The evaluation of this classifier must be made on more data sets, especially on databases where the real classes are known and with data given in terms of belief. The problem of image classification evaluation is very hard to solve in an uncertain environment [25].

REFERENCES

[1] A. Martin, Comparative study of information fusion methods for sonar images classification, The Eighth International Conference on Information Fusion, Philadelphia, USA, 25-29 July 2005.
[2] G. Le Chenadec and J.M. Boucher, Sonar Image Segmentation using the Angular Dependence of Backscattering Distributions, IEEE Oceans'05 Europe, Brest, France, 20-23 June 2005.
[3] M. Lianantonakis and Y.R. Petillot, Sidescan sonar segmentation using active contours and level set methods, IEEE Oceans'05 Europe, Brest, France, 20-23 June 2005.
[4] L. Xu, A. Krzyzak, and C.Y. Suen, Methods of Combining Multiple Classifiers and Their Application to Handwriting Recognition, IEEE Transactions on Systems, Man, and Cybernetics, Vol 22(3), pp 418-435, 1992.
[5] L. Lam and C.Y. Suen, Application of Majority Voting to Pattern Recognition: An Analysis of Its Behavior and Performance, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol 27(5), pp 553-568, 1997.
[6] L.
Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, Vol 1, pp 3-28, 1978.
[7] D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum Press, New York, 1988.
[8] A.P. Dempster, Upper and Lower probabilities induced by a multivalued mapping, Annals of Mathematical Statistics, Vol 38, pp 325-339, 1967.
[9] G. Shafer, A mathematical theory of evidence, Princeton University Press, 1976.
[10] Ph. Smets, The Combination of Evidence in the Transferable Belief Model, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 12(5), pp 447-458, 1990.
[11] Ph. Smets and R. Kennes, The Transferable Belief Model, Artificial Intelligence, Vol 66, pp 191-234, 1994.
[12] R.R. Yager, On the Dempster-Shafer Framework and New Combination Rules, Information Sciences, Vol 41, pp 93-137, 1987.
[13] D. Dubois and H. Prade, Representation and Combination of uncertainty with belief functions and possibility measures, Computational Intelligence, Vol 4, pp 244-264, 1988.
[14] Ph. Smets, Belief functions: the Disjunctive Rule of Combination and the Generalized Bayesian Theorem, International Journal of Approximate Reasoning, Vol 9, pp 1-35, 1993.
[15] A. Martin and C. Osswald, Human Experts Fusion for Image Classification, submitted to Information & Security: An International Journal, Special issue on Fusing Uncertain, Imprecise and Paradoxist Information (DSmT), 2006.
[16] F. Smarandache and J. Dezert, Information Fusion Based on New Proportional Conflict Redistribution Rules, Information Fusion, Philadelphia, USA, 25-29 July 2005.
[17] C. Osswald and A. Martin, Understanding the large family of Dempster-Shafer theory's fusion operators - a decision-based measure, 9th International Conference on Information Fusion, Florence, Italy, 10-13 July 2006.
[18] Ph.
Smets, Constructing the pignistic probability function in a context of uncertainty, Uncertainty in Artificial Intelligence, Vol 5, pp 29-39, 1990.
[19] T. Denœux, A Neural Network Classifier Based on Dempster-Shafer Theory, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol 30(2), pp 131-150, 2000.
[20] T. Denœux and M. Skarstein Bjanger, Induction of decision trees from partially classified data using belief functions, Proceedings of SMC'2000, Nashville, USA, pp 2923-2928, 2000.
[21] P. Vannoorenberghe and T. Denœux, Handling uncertain labels in multiclass problems using belief decision trees, IPMU'2002, Annecy, France, Vol 3, pp 1919-1926, July 2002.
[22] C. Ambroise, T. Denœux, G. Govaert, and Ph. Smets, Learning from an imprecise teacher: probabilistic and evidential approaches, ASMDA'2001, Compiègne, France, Vol 1, pp 100-105, 2001.
[23] P. Vannoorenberghe and Ph. Smets, Partially supervised learning by a credal EM approach, ECSQARU 2005, Barcelona, Spain, July 2005.
[24] R. Haralick, Statistical and structural approaches to texture, Proceedings of the IEEE, Vol 67, No 5, pp 786-804, 1979.
[25] A. Martin, Fusion for Evaluation of Image Classification in Uncertain Environments, 9th International Conference on Information Fusion, Florence, Italy, 10-13 July 2006.