Dimensionality Reduction and Reconstruction using Mirroring Neural Networks and Object Recognition based on Reduced Dimension Characteristic Vector
Dasika Ratna Deepthi (1), Sujeet Kuchibholta (2) and K. Eswaran (3)

Address: (1) Ph.D. Student, College of Engineering, Osmania University, Hyderabad-500007, A.P., (2) Altech Imaging & Computing, Sri Manu Plaza, A. S. Rao Nagar, Hyderabad-500062, A.P., (3) Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad-501301, A.P., India

Email addresses: radeep07@gmail.com, ksujeet@gmail.com, kumar.e@gmail.com

ABSTRACT

In this paper, we present a Mirroring Neural Network architecture that performs non-linear dimensionality reduction and object recognition using a reduced low-dimensional characteristic vector. In addition to dimensionality reduction, the network also reconstructs (mirrors) the original high-dimensional input vector from the reduced low-dimensional data. The Mirroring Neural Network architecture has more processing elements (adalines) in the outer layers and the fewest elements in the central layer, forming a "converging-diverging" configuration. Since this network is able to reconstruct the original image from the output of the innermost layer (which contains all the information about the input pattern), these outputs can be used as an object signature to classify patterns. The network is trained to minimize the discrepancy between the actual output and the input by back-propagating the mean squared error from the output layer to the input layer. After successful training, the network can reduce the dimension of input vectors and mirror the patterns fed to it. The Mirroring Neural Network architecture gave very good results on various test patterns.
KEYWORDS

Mirroring Neural Network, non-linear dimensionality reduction, characteristic vector, adalines, classification.

1. Introduction

This paper proposes a pattern recognition algorithm using a new neural network architecture called the Mirroring Neural Network. It uses facial patterns as an example to explain the mirroring neural network architecture and illustrate its performance. Facial pattern recognition can be broadly classified into two techniques: manually specifying the facial features, and automatically extracting the features. This paper deals with the second technique, in which a neural network recognizes face patterns automatically. Many problems can be resolved using neural networks, such as face detection [1] & [2], optical character recognition [3], visual pattern recognition and gender classification [4]. Mirroring neural networks are used to mirror the input image pattern and reduce the dimension of the input pattern. This reduced-dimension vector (the outputs of the central hidden layer) is considered the signature of the pattern and is used to classify the object. From this reduced-dimension vector, the mirroring neural network reconstructs the original image pattern with minimal distortion. The approach is very different from past work on recognition and classification of patterns as discussed in [1], [2], [3] and [4], since we use a different architecture coupled with a rescaled learning rate parameter and a different activation function. Mirroring Neural Networks can be used to recognize any type of pattern or object of interest.
If many such networks are trained for a multitude of patterns, then we have a set of networks that together have the ability to recognize patterns, albeit one per network. If these networks are connected in a framework where a pattern is fed as input to all of them and the architecture returns the output of the network that successfully mirrors the pattern, then such an architecture could be a possible data structure for simulated memory. A detailed discussion of this architecture can be found in [5]. In this proposed work we developed the mirroring neural network for face patterns.

2. Architecture of Mirroring Neural Network

The present section explains the architecture of the Mirroring Neural Network, which can reduce dimension and mirror input patterns; the network can be trained on a single pattern. Upon successful training, the Mirroring Neural Network can recognize the image pattern for which it has been trained. The layered architecture, the learning rate parameter and the random weights are configured such that the neural network converges with minimal loss of information. The coded form of the object at the least-dimensional hidden layer is called the signature of the object; it can be used to classify objects. The Mirroring Neural Network contains more adalines in the outer layers and the fewest in the middle layer, forming a "converging-diverging" shape as shown in Fig. 1.

Fig. 1 Typical neural network with inputs, hidden layers and outputs

The converging part of this example network starts with 'n' units at the input layer, 'n1' units (n > n1) in the 1st hidden layer, 'n2' units (n1 > n2) in the 2nd hidden layer, and so on until it reaches the least-dimensional hidden layer (the p-th hidden layer) with 4 units. This converging part condenses the high-dimensional pattern into a low-dimensional code format. The diverging part of the network starts at the least-dimensional central hidden layer and ends with the output layer. As we have 4 units at the p-th layer of the network, the (p+1)-th layer will have 4+q units (4+q < n), and so on until it reaches the output layer 'nL' having 'n' units (equal to the input vector). The number of hidden layers and the values of the variables 'n1', 'n2', ..., 'nL' and q are selected in such a way that the input pattern is mirrored at the output with minimum distortion. For example, consider a network which has 25-10-6-3-8-25 nodes in its respective layers. This network has 25 inputs, 10 adalines in the 1st hidden layer, 6 adalines in the 2nd hidden layer, 3 adalines in the 3rd, 8 adalines in the 4th and 25 adalines in the last layer. The pattern is reconstructed at the output with its original dimension of 25 units from this signature. Input patterns with 25 dimensions can thus be represented by the 3 code units of the 3rd hidden layer (the least-dimensional layer). We tried various architectures with varying hidden layer dimensions. After considerable experimentation, we found that a network having one hidden layer and an output layer is a suitable choice for our pattern.
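As an illustration, the 25-10-6-3-8-25 example above can be set up as follows. This is a minimal NumPy sketch of the converging-diverging layout, not code from the paper; the names `layer_sizes` and `signature_layer` are our own, and the initial weight range of -0.2 to +0.2 follows the initialization described later in this section.

```python
import numpy as np

# Example converging-diverging layout from the paper:
# 25 inputs -> 10 -> 6 -> 3 (signature) -> 8 -> 25 outputs.
layer_sizes = [25, 10, 6, 3, 8, 25]

rng = np.random.default_rng(0)
# One weight matrix and one bias vector per adaline layer,
# initialized with small random values in [-0.2, +0.2].
weights = [rng.uniform(-0.2, 0.2, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.uniform(-0.2, 0.2, size=m) for m in layer_sizes[1:]]

# The innermost (smallest) hidden layer holds the object signature.
signature_layer = int(np.argmin(layer_sizes[1:-1])) + 1
```

Each weight matrix maps one layer's outputs to the next layer's net inputs, so the first matrix has shape (10, 25) and the last (25, 8).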
The degree of reduction of the input pattern plays an important role in reconstructing the input pattern from the reduced-dimension vector, and so the number of units in the least-dimensional hidden layer must be chosen after careful experimentation. After trying different dimensions for the hidden layers by trial & error and checking the neural network's performance, we found that 40 units at the hidden layer gave the most accurate results. We designed our mirroring neural network with 676 inputs, 40 hidden (code) units and 676 output units (676-40-676). The inputs to the network were 26x26 grayscale images. The input grayscale intensities were rescaled to [0, 255]. The rescaled grayscale intensities were then mapped to the range [-1, +1]. The initial weights were chosen randomly in the range -0.2 to +0.2. A detailed discussion of this architecture is given in the following sections.

2.1 Input Pattern Rescaling

The input image intensity values were rescaled to the range 0 to 255 as described in [6]. The rescale function is defined as:

G_i = (G_i - MIN) * 255 / (MAX - MIN)

where
G_i = intensity of the input image
MIN = minimum intensity in the image
MAX = maximum intensity in the image

After rescaling the image pattern, the resulting intensities were mapped to the range [-1, +1] using the formula:

G_i = (G_i - 128) / 128

Rescaling and mapping was done for every input image presented to the Mirroring Neural Network.

2.2 Non-linear Dimensionality Reduction

Initially, small random values were chosen for the weights and bias terms of the hidden layer and the output layer. We used the hyperbolic tangent function [8], instead of the linear and logistic functions used in [7], for faster convergence of the network.
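The two-step rescaling of Section 2.1 can be sketched as below. The function name `rescale` is our own; note that, by the paper's formulas, the mapped intensities land in [-1, 127/128], i.e. approximately [-1, +1].

```python
import numpy as np

def rescale(image):
    """Rescale intensities to [0, 255], then map to roughly [-1, +1] (Sec. 2.1)."""
    g = image.astype(float)
    # G_i = (G_i - MIN) * 255 / (MAX - MIN)
    g = (g - g.min()) * 255.0 / (g.max() - g.min())
    # G_i = (G_i - 128) / 128
    return (g - 128.0) / 128.0
```

Applied to any image, the darkest pixel maps to exactly -1 and the brightest to 127/128.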
This differentiable non-linear activation function is implemented at each node of the hidden layer and output layer. The transfer function can be defined as:

f(x) = tanh(x/2) = (1 - e^(-x)) / (1 + e^(-x))

Fig. 2 Weighted input and use of sigmoid function to obtain output

The functionality of any node 'j' in the hidden layer or output layer of the mirroring neural network is shown in Fig. 2. It can be mathematically defined* as follows.

For a hidden layer node:

NetInput_hj = bias_hj + Σ (k = 0 ... image size) W_hjk * G_k
Adaline_hj = tanh(NetInput_hj / 2)

where
NetInput_hj = net input to the j-th node in the hidden layer
bias_hj = bias term of the j-th node in the hidden layer
W_hjk = k-th weight of the j-th node in the hidden layer
G_k = k-th intensity of the input image
Adaline_hj = output of the j-th node in the hidden layer

For an output layer node:

NetInput_oj = bias_oj + Σ (k = 0 ... hidden layer size) W_ojk * Adaline_hk
Adaline_oj = tanh(NetInput_oj / 2)

where
NetInput_oj = net input to the j-th node in the output layer
bias_oj = bias term of the j-th node in the output layer
W_ojk = k-th weight of the j-th node in the output layer
Adaline_hk = output of the k-th node in the hidden layer
Adaline_oj = output of the j-th node in the output layer

While training the back-propagating Mirroring Neural Network we used the usual gradient descent [10] to minimize the mean squared error between the input and its reconstruction at the output. The activation function and the variable learning rate parameter [11] reduce out-of-range values and help in faster convergence of the network. The learning rate parameter was incremented by 10% at the hidden layer compared to the output layer.
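The per-node computations above can be sketched as a vectorized forward pass. This is our own sketch, assuming one weight matrix and bias vector per layer as in the architecture description; it covers only the forward (mirroring) direction, not the back-propagation update.

```python
import numpy as np

def activation(x):
    # f(x) = tanh(x/2) = (1 - e^(-x)) / (1 + e^(-x)), as in Sec. 2.2
    return np.tanh(x / 2.0)

def forward(g, weights, biases):
    """One converging-diverging pass over input vector g.

    Returns the outputs of every layer; the smallest layer's output
    is the reduced-dimension signature, the last is the mirrored input.
    """
    outputs = [g]
    for W, b in zip(weights, biases):
        net = b + W @ outputs[-1]        # NetInput_j = bias_j + sum_k W_jk * out_k
        outputs.append(activation(net))  # Adaline_j = tanh(NetInput_j / 2)
    return outputs
```

For a 25-10-6-3-8-25 network, `forward` returns six arrays, with the signature at index 3 and a 25-dimensional reconstruction at the end.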
The mirroring neural network, with learning rate rescaling in combination with the hyperbolic tangent function, learnt the input patterns rapidly and reconstructed them with low deformation.

* This mathematical definition is pertinent to the neural architecture specified in this paper.

2.3 Object Recognition

After successfully training the neural network to a desired accuracy, the average feature vector for the input pattern was computed by averaging the least-dimensional hidden layer outputs over all input images. Recognition of the object was based on two threshold values. The first threshold value was the Euclidean distance† between the reduced-dimension feature vector of the test image and the average feature vector computed after training the Mirroring Neural Network. The second threshold value was the Euclidean distance between the output and the input of the Mirroring Neural Network. These two values were computed for each test image, and if both were within the accepted thresholds we categorized the test image as a face pattern. The threshold values were fixed after considerable experimentation in order to maximize the success rate and reduce false acceptance.

2.4 Results and Discussion

A training set consisting of 549 facial images (8-bit) of size 26x26 was fed to the mirroring neural network. After sufficient training (success rate above 95%), the network could recognize faces and reject non-face patterns. Recognition was based on the two threshold values discussed in Section 2.3. Typical sample input images from the training set and their corresponding mirror images (outputs of the mirroring neural network) are shown in Fig. 3.
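The two-threshold test of Section 2.3 can be sketched as follows. The function name and the threshold arguments `t1` and `t2` are illustrative assumptions (the paper fixes its thresholds experimentally), and the unit-norm scaling is one plausible reading of the footnote that the Euclidean distance is computed after normalization.

```python
import numpy as np

def is_face(test_image, mirror_output, signature, avg_signature, t1, t2):
    """Accept a test image only if both threshold tests pass (Sec. 2.3)."""
    # First test: distance between the (normalized) reduced-dimension
    # signature of the test image and the average face signature.
    s = signature / np.linalg.norm(signature)
    a = avg_signature / np.linalg.norm(avg_signature)
    d1 = np.linalg.norm(s - a)
    # Second test: reconstruction error between network output and input.
    d2 = np.linalg.norm(mirror_output - test_image)
    return d1 < t1 and d2 < t2
```

A perfectly mirrored image whose signature matches the average passes both tests; a large reconstruction error fails the second test regardless of the signature.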
Test samples of face and non-face images are given in Fig. 4.

Fig. 3 Typical face samples and their mirror images

† The Euclidean distance is computed after normalization.

Fig. 4 Typical face & non-face images

We tested the network with 250 face images, of which the algorithm classified 245 correctly, for an accuracy of 98%. We also tested 300 more images containing 150 face and 150 non-face images. The algorithm could correctly classify 277 images out of 300 (92.33%). Both sets of test images (250 + 300 = 550) were entirely new images, none of which were in the training set. We also used training images of size 26x42 to train the network. We increased the nodes in the hidden layer to 70, and the network performed almost equally well.

3. Conclusions and future work

The architecture described in this paper is a simple approach to object recognition which is applicable to various image categories such as faces, furniture, flowers, trees, etc., and was tested on these with slight changes in the network architecture with respect to hidden layer size and threshold values. Such networks could be used for face detection by incorporating them in an application to verify possible face candidates. Such networks can also be "parallelized" for recognition tasks pertaining to different patterns. For example, an image of any pattern may be sent to all the specialized mirroring neural networks in parallel, and the network which gives the least value below the two thresholds would identify the input pattern. An overview of this multiple-pattern mirroring architecture is illustrated in Fig. 5.

Fig. 5 Pattern input to multiple mirror networks in parallel

References

[1] C. Garcia & M. Delakis, Convolutional face finder: A neural architecture for fast and robust face detection, IEEE Trans. Pattern Anal. Mach. Intell., 26(11), Nov. 2004, 1408-1423.
[2] M.-H. Yang, D. Kriegman & N. Ahuja, Detecting faces in images: A survey, IEEE Trans. Pattern Anal. Mach. Intell., 24(1), Jan. 2002, 34-58.
[3] M. D. Ganis, C. L. Wilson & J. L. Blue, Neural network-based systems for handprint OCR applications, IEEE Trans. Image Process., 7(8), Aug. 1998, 1097-1112.
[4] Son Lam Phung & Abdesselam Bouzerdoum, A pyramidal neural network for visual pattern recognition, IEEE Transactions on Neural Networks, 18(2), March 2007, 329-343.
[5] K. Eswaran, "System & Method of Identifying Patterns", patents filed in IPO on 19/7/06 and 19/03/07. Also see "New Automated Techniques for Pattern Recognition & their Applications", Altech Report, 6th July 2006.
[6] R. C. Gonzalez and R. E. Woods, Digital image processing (Englewood Cliffs, NJ: Prentice-Hall, 2002).
[7] G. E. Hinton & R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, 313, July 2006, 504-507.
[8] J.-S. R. Jang, C.-T. Sun & E. Mizutani, Neuro-fuzzy and soft computing (Delhi, India: Pearson Education (Singapore) Pte. Ltd., 1997 (Second Indian Reprint, 2005)).
[9] Christopher M. Bishop, Neural networks for pattern recognition (New York: Oxford University Press Inc., 1999).
[10] James A. Freeman & David M. Skapura, Neural networks (Eastern Press (Bangalore) Pvt. Ltd.: Addison-Wesley Publishing Company, Inc., 1991 (First ISE reprint, 1999)).
[11] A. K. Rigler, J. M. Irvine and T. P. Vogl, Rescaling of variables in back propagation learning, Neural Networks, 4(2), 1991, 225-229.