Nearest Prime Simplicial Complex for Object Recognition

The structure representation of data distribution plays an important role in understanding the underlying mechanism of generating data. In this paper, we propose nearest prime simplicial complex approaches (NSC) by utilizing persistent homology to ca…

Authors: Junping Zhang, Ziyu Xie, Stan Z. Li

Junping Zhang (1), Ziyu Xie (1), and Stan Z. Li (2)
(1) Shanghai Key Laboratory of Intelligent Information Processing and School of Computer Science, Fudan University, Shanghai, China
(2) Institute of Automation, Chinese Academy of Sciences, Beijing, China

Abstract. The structure representation of data distribution plays an important role in understanding the underlying mechanism of generating data. In this paper, we propose nearest prime simplicial complex approaches (NSC) by utilizing persistent homology to capture such structures. Assuming that each class is represented with a prime simplicial complex, we classify unlabeled samples based on the nearest projection distances from the samples to the simplicial complexes. We also extend the extrapolation ability of these complexes with a projection constraint term. Experiments on simulated and practical datasets indicate that, compared with several published algorithms, the proposed NSC approaches achieve promising performance without losing the structure representation.

Keywords: Topology; Persistent Homology; Object Recognition; Supervised Learning

1 Introduction

The structure representation is important to understanding the underlying mechanism of generating data. To capture such structures, manifold learning algorithms [1,2] assume that data are generated from an underlying low-dimensional manifold. However, it is not easy to discover and preserve the topological structure hidden in the manifold. For example, [3] pointed out that the view and style-independent action manifolds, which are used to describe human activities, can be assumed to lie on a torus. Persistent homology can effectively discover topological invariants such as holes, which cannot easily be obtained by other means such as manifold learning algorithms [4].
The method first incrementally constructs nested families of simplicial complexes from point cloud data (PCD), and then computes the lifecycle of each possible topological invariant by placing the complexes within an evolutionary growth process. Finally, it extracts the true topological invariants or features, which have longer lifecycles, and removes topological noise [5]. However, how to employ it for practical applications (e.g., object recognition) remains an open problem.

[Author contact: Junping Zhang, Ziyu Xie, Email: jpzhang@fudan.edu.cn, ziyu.ryan@gmail.com; Stan Z. Li, Email: szli@nlpr.ia.ac.cn]

In this paper, we propose a novel method, called nearest prime simplicial complex approaches (NSC), to obtain a structure-preserving representation and achieve higher performance in object recognition. Specifically, we generate a nested family of simplicial complexes per class, and estimate a prime simplicial complex per class by weighting the lifecycles of the topological structures that are still alive. Then we classify objects based on the nearest projection distances from each object to the simplices in these simplicial complexes. Furthermore, we utilize a projection constraint term to enhance the extrapolation ability of NSC and prevent incorrect projection. The main contribution is that we extend the geometrical framework of simplicial complexes to object recognition. Specifically, we propose NSC approaches to object recognition and show how to use them for point classification. Experiments on several simulated and practical datasets show that, without losing the structure representation, the proposed NSC approaches attain promising performance compared with several well-known algorithms.

The remainder of this paper is organized as follows. In Section 2, we introduce some preliminaries on simplicial complexes and give a brief survey of persistent homology.
In Section 3 we detail our proposed NSC algorithm. We evaluate the performance of the proposed NSC approaches in Section 4. We conclude the paper in Section 5.

2 Preliminary and Related work

In this section, we introduce some preliminaries on simplicial complexes and their construction, and survey the development of persistent homology.

2.1 Preliminary

A simplicial complex is a collection of simplices subject to some rules. The simplex and the simplicial complex are defined as follows:

Definition 1: Let $\{v_0, v_1, \cdots, v_p\}$ be a geometrically independent set in $R^N$. We define the $p$-simplex $\sigma$ spanned by $v_0, v_1, \cdots, v_p$ to be the set of all points $x$ of $R^N$ such that [6]: $x = \sum_{i=0}^{p} \lambda_i v_i$, where $\sum_{i=0}^{p} \lambda_i = 1$ and $\forall i, \lambda_i \geq 0$.

In general, each $p$-simplex $\sigma$ has $p + 1$ faces which are $(p - 1)$-simplices. Each such face is obtained by deleting one of the vertices $v_0, v_1, \cdots, v_p$.

Definition 2: A simplicial complex $K$ in $R^N$ is a collection of simplices in $R^N$ such that [6]:
1) Every face of a simplex of $K$ is in $K$.
2) The intersection of any two simplices of $K$ is a face of each of them.

An illustration of the distinction between a simplicial complex and a non-simplicial complex is shown in Fig. 1. Obviously, the non-simplicial complex violates the second rule. In this paper, we mainly utilize Lazywitness complexes, which behave like Delaunay triangulations computed in the intrinsic geometry of the data set $X$, to construct the simplicial complex for PCD.

Fig. 1. The distinction between non-simplicial complex (left) and simplicial complex (right).

Specifically, we first select a subset $Z = \{v_1, \cdots, v_p\} \subset X$ as the vertex set by using sampling techniques such as max-min sampling or random sampling.
The max-min sampling method randomly extracts one point as the first vertex, and iteratively selects the next $n - 1$ points, each of which maximizes the minimal distance to the previously selected vertices. In this way, the method generates a vertex set uniformly distributed around the data structure [4]. Then we utilize the remaining points as the witness point set $\{w_1, w_2, \cdots, w_q\}$ to determine which simplices occur in the complex [4].

More formally, let $D$ be a $p \times q$ distance matrix, where $q$ denotes the number of the remaining points. Each element $D(i, j)$ measures the distance between landmark point $i$ and witness point $j$. To discover a persistent topological invariant from PCD, we construct a nested family of simplicial complexes $W(D; R, f)$. Here $R$ is the radius of the metric ball, and $f$ is a non-negative integer. If $f = 0$, we define $m_j = 0, \forall j = 1, 2, \cdots, q$. Otherwise, let $m_j$ be the $f$-th smallest entry of the $j$-th column of matrix $D$. Then we utilize two rules to determine which simplices can be added into the complex [4]:
1) a 1-simplex $\sigma = [v_a v_b]$ will be added to $W(D; R, f)$ iff there exists a witness $w_j$ ($1 \leq j \leq q$) satisfying $\max(D(a, j), D(b, j)) \leq R + m_j$.
2) a $k$-simplex $[v_{a_0} v_{a_1} \cdots v_{a_k}]$ will be added to $W(D; R, f)$ iff there exists a witness $w_j$ ($1 \leq j \leq q$) satisfying $\max(D(a_0, j), D(a_1, j), \cdots, D(a_k, j)) \leq R + m_j$.

Note that when the number of training samples is small, which is very common in the object recognition domain, we can instead use the Rips complex method to obtain the nested families of simplicial complexes. Assuming that each point is the center of a closed Euclidean ball with radius $R$, the Rips method iteratively builds a complex by forming a line between any two points whose balls intersect [4].

2.2 Related work

Persistent homology aims to discover stable topological invariants from PCD.
To achieve this goal, there are three crucial steps [5]: 1) selecting a subset that expresses the non-trivial topological attributes measured by homology groups, 2) measuring the importance of these subsets and 3) eliminating those topological attributes with the minimum number of side-effects.

[4,7] investigated the influence of the sampling technique on the estimation of topological invariants. With persistent homology and a sampling strategy, [4,7] discovered that image patches with edges lie in a Klein-bottle-shaped space. [8] proposed to use geodesic Delaunay triangulation to reduce the number of samples required to capture the topology of PCD.

To discover the topological structure from point cloud data, it is necessary to construct simplicial complexes. There are several different complexes in the literature, including Čech, Rips, Explicit, Witness and Lazywitness complexes. Let the PCD be $X$, and the radius of the PCD be $R$. Specifically, the Čech complex Čech($X$, $R$) means the nerve of the collection of metric balls $\{B(x_j, R/2)\}$, $x_j \in X, j = 0, 1, \cdots, p$ [9], with vertex set $X$. [5] proposed $\alpha$-complexes, and estimated the invariants through computing the persistent Betti numbers. To save storage space, Rips($X$, $R$) only stores the edges and vertices, and forms the largest simplicial complex that has the same 1-skeleton (i.e. vertices and edges) as Čech($X$, $R$). However, both methods produce a very large number of simplices, especially for large-scale PCD. To improve efficiency, [4] proposed the witness complex, which selects a group of landmark points and utilizes the remaining points as witnesses of the existence of simplices.

[10] employed the Barcode technique to measure the importance of topological attributes.
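As a rough illustration of the Rips construction described above, the following Python sketch builds Rips($X$, $R$) up to dimension 2 by brute force. It assumes that two points are joined when their Euclidean distance is at most $R$ (i.e. balls of radius $R/2$ intersect, as in the Čech($X$, $R$) convention of this section; with balls of radius $R$ the threshold would be $2R$). The names `rips_complex` and `max_dim` are illustrative and do not come from the paper or from any library.

```python
from itertools import combinations

import numpy as np


def rips_complex(X, R, max_dim=2):
    """Brute-force Rips complex: a simplex is kept iff all its edges are present.

    Assumption (not from the paper): an edge joins two points whose
    Euclidean distance is at most R. Only suitable for small point sets.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    adjacent = dist <= R
    simplices = {0: [(i,) for i in range(n)]}
    for k in range(1, max_dim + 1):
        # A k-simplex needs every pair of its k + 1 vertices to be adjacent,
        # which yields the largest complex with the given 1-skeleton.
        simplices[k] = [
            s for s in combinations(range(n), k + 1)
            if all(adjacent[a, b] for a, b in combinations(s, 2))
        ]
    return simplices


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    print(rips_complex(points, R=1.5))
```

Because every vertex subset is enumerated, this sketch only makes the definition concrete; it is not meant for large-scale PCD, which is exactly the case the witness complex addresses.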
Furthermore, [11] applied persistent homology to extract topological features from character-shape point cloud data. [12] studied the smallest coverage issue in sensor networks based on persistent homology. Assuming that stratified spaces consist of multiple manifolds or non-manifolds, each of which has varying dimension, [13] generalized the computation of persistent homology to that of intersection homology for better analyzing the stratified spaces. Moreover, [14] clustered data points into different stratified spaces using methods derived from kernel and cokernel persistent homology. [15] investigated the persistent homology of random fields and manifold learning. A major difficulty is that it is not easy to bridge the gap between persistent homology and practical applications.

3 Nearest Prime Simplicial Complex

In this section, we detail the NSC approaches by dividing them into two parts.

3.1 Selecting prime simplicial complexes

To utilize persistent homology for recognition, we propose three crucial steps: eliminating the redundant simplices, recording a recognition-related Barcode and selecting the prime simplicial complexes.

Specifically, with the methods mentioned above, we can construct a filtered simplicial complex from the point cloud data by increasing $R$ from 0 to $\infty$. The filtered complex is an increasing sequence of simplicial complexes which determine an inductive system of homology groups [10]. In our research, we find that in this sequence, a proper complex, named the prime simplicial complex, is useful for recognition. The prime simplicial complex is a relatively stable complex from which we can capture the homology of the data's topological structure. For better understanding, an example is shown in Fig. 2.

Fig. 2. We can construct a simplicial complex through metric balls with a radius $R$.
A good choice of $R$ (left) induces a prime simplicial complex which can help us to capture the homology of an annulus from the union of balls. Meanwhile, the union of balls with an incorrect radius will induce an incorrect structure representation (middle and right).

A $k$-simplex which is not a face of any $(k+1)$-simplex of the same complex is a relatively highest-dimensional simplex compared with its lower-dimensional faces. Note that we always focus on the relatively highest-dimensional simplices of the prime simplicial complex, since their lower-dimensional faces have been implicitly considered in our NSC approaches. To avoid repeated computation, we propose to remove these faces when constructing the prime simplicial complex. Here we give pseudo-code based on Lazywitness complexes in Algorithm 1. Note that in line 3, the matrix $E$ is calculated as:

$$E(i, j) = \min_k \left\{ \max\left(D(i, k), D^{*}(k, j)\right) - m_k \right\} \tag{1}$$

and in line 9, the lower-dimensional simplices will be removed after the merging procedure is completed.

Algorithm 1 Construct the Prime Simplicial Complex using Lazywitness Complexes
input: Point Data $P$, $R$, the ratio $r$, the family $f$
output: the vertices of each simplex constructing the simplicial complex
1: Choose $p$ landmark points and $q$ witness points, where $p = size(P)/(r + 1)$ and $q = p \cdot r$.
2: Compute the $p \times q$ matrix $D$ of distances.
3: Compute the $p \times p$ matrix $E$ with off-diagonal entries $E(i, j) = R[v_i v_j]$ which record the time when edge $v_i v_j$ appears.
4: Consider every pair $(i, j)$ where $i < j \leq p$
5: if $E(i, j) \leq R$ then
6:    Add $[v_i v_j]$ to $W(D; R, f)$.
7:    Remove $[v_i], [v_j]$ from $W(D; R, f)$
8: end if
9: Generate higher-dimensional cells inductively: the $k$-simplex $[v_{a_0} v_{a_1} \cdots v_{a_k}]$ occurs iff the three lower-dimensional simplices $[v_{a_1} \cdots v_{a_k}]$, $[v_{a_0} \cdots v_{a_{k-1}}]$ and $[v_{a_0} v_{a_k}]$ all occur.

Once the prime simplicial complex is constructed, we use the Barcode technique to record the lifetime of each simplex belonging to the complex as the parameter $R$ increases until $R_{max}$ is reached. We only consider the simplices which are still alive when $R = R_{max}$. An example is shown in Fig. 3 (see footnote 3).

[Footnote 3: Note that our Barcode is different from that in [15]. The reason is that although [15]'s Barcode technique is a good way to describe the persistent homology by recording the birth and death time of some topological invariants, only the alive simplices are useful for our proposed NSC algorithm.]

Obviously, it is hard to find the best prime simplicial complex from the sequence. Therefore, we propose to select a radius $R^{*}$ based on the weighted lifecycles:

$$R^{*} = \frac{\sum_{i=1}^{m} \ell_i M_i}{\sum_{i=1}^{m} \ell_i} \tag{2}$$

where $m$ is the number of simplices, $\ell_i$ is the length of the $i$-th barcode, and $M_i$ is the radius corresponding to the median of $\ell_i$. Intuitively, the shorter the lifecycle, the more unstable the corresponding simplex, and the less influence it has on the determination of a stable prime simplicial complex. Formally, let the lengths of the shorter lifecycles be $\ell_{A,i}$ ($i = 1, \cdots, s$) and the others be $\ell_{B,j}$ ($j = 1, \cdots, s'$) with $s + s' = m$; then we can rewrite Eq. (2) as:

$$R^{*} = \frac{\sum_{i=1}^{s} \ell_{A,i} M_i + \sum_{j=1}^{s'} \ell_{B,j} M_j}{\sum_{i=1}^{s} \ell_{A,i} + \sum_{j=1}^{s'} \ell_{B,j}} = \frac{\sum_{i=1}^{s} \ell_{A,i} M_i}{\sum_{i=1}^{s} \ell_{A,i} + \sum_{j=1}^{s'} \ell_{B,j}} + \frac{\sum_{j=1}^{s'} \ell_{B,j} M_j}{\sum_{i=1}^{s} \ell_{A,i} + \sum_{j=1}^{s'} \ell_{B,j}} \tag{3}$$

When $\ell_{A,i} \ll \ell_{B,j}$ holds for all the lifecycles, Eq. (3) can be approximated by:

$$R^{*} \approx \frac{\sum_{j=1}^{s'} \ell_{B,j} M_j}{\sum_{j=1}^{s'} \ell_{B,j}} \tag{4}$$

This indicates that the prime simplicial complex is less sensitive to those simplices with shorter lifecycles. It is also worth noting that we construct one prime simplicial complex per class for classification.
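To make the selection of $R^{*}$ more concrete, the following Python sketch computes the edge appearance times of Eq. (1) and then the weighted radius of Eq. (2). It is only an illustration under stated assumptions, not the authors' implementation: $D^{*}$ is read as the transpose of $D$, only 1-simplices are considered, each bar is taken to run from the birth radius to $R_{max}$, and $M_i$ is interpreted as the midpoint of that bar. The function names are hypothetical.

```python
import numpy as np


def edge_birth_radii(D, f):
    """Eq. (1): E(i, j) = min_k { max(D(i, k), D(j, k)) - m_k }.

    D is the p x q landmark-to-witness distance matrix. m_k is 0 when
    f == 0, otherwise the f-th smallest entry of column k of D.
    Assumption: D*(k, j) in the paper denotes D(j, k).
    """
    D = np.asarray(D, dtype=float)
    p, q = D.shape
    if f == 0:
        m = np.zeros(q)
    else:
        m = np.sort(D, axis=0)[f - 1]  # f-th smallest entry per column
    # pair_max[i, j, k] = max(D[i, k], D[j, k]); memory grows as p * p * q.
    pair_max = np.maximum(D[:, None, :], D[None, :, :])
    return (pair_max - m).min(axis=2)


def weighted_radius(births, r_max):
    """Eq. (2) for simplices still alive at r_max.

    Assumptions: bar i spans [birth_i, r_max], so its length is
    r_max - birth_i, and M_i is the midpoint of the bar.
    """
    births = np.asarray(births, dtype=float)
    alive = births[births <= r_max]
    lengths = r_max - alive
    if lengths.sum() == 0:
        raise ValueError("no simplex with positive lifetime at r_max")
    midpoints = alive + lengths / 2.0
    return float((lengths * midpoints).sum() / lengths.sum())


if __name__ == "__main__":
    D = [[1.0, 3.0], [2.0, 1.0]]  # 2 landmarks, 2 witnesses
    E = edge_birth_radii(D, f=1)
    print(E[0, 1])  # appearance time of edge [v_0 v_1]
    print(weighted_radius([0.0, 0.5, 2.0], r_max=1.0))
```

In this reading, a bar born close to $R_{max}$ has a short length and therefore contributes little to $R^{*}$, which matches the approximation in Eq. (4).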
3.2 Classifying objects based on NSC

Assuming that data distribution per class is represented by a prime simplicial complex, we attempt to classify unlabelled samples by projecting the samples to the simplices of prime simplicial complexes. With this way, we can avoid projecting them to some holes and voids that may exist in the structures. The holes and voids will lead to incorrect projection and impair the classification performance.

Fig. 3. We construct a simplicial complex (left) for a circle-shape data. In the panel, red dotted line and blue line denote 1-simplex and 2-simplex, respectively. Each barcode (right) of its simplices starts at a specific $R$ value, and ends up at $R_{max}$ which is used to determine when to stop the computation of barcode. In this figure, $R_{max}$ is set to 1. Those disappeared simplices haven't been shown in the figure.

Specifically, let $\sigma_i$ ($i = 1, 2, \cdots, m$) be a $k$-simplex with vertices $\{v_0, v_1, \cdots, v_k\}$. Then the projection position $x_p$ of sample $x$ can be defined as a linear combination of vertices in the simplex:

$$x_p = \sum_{i=0}^{k} \lambda_i v_i, \quad \text{where} \quad \sum_{i=0}^{k} \lambda_i = 1 \tag{5}$$

Here $\lambda_i$ is the weight value. Take a 2-simplex as an example. The weight is equal to

$$\lambda_i = \begin{cases} (B^T B)^{-1} B^T (x - v_i), & i = 0, 1 \\ 1 - \lambda_0 - \lambda_1, & i = 2 \end{cases} \tag{6}$$

where $B = [v_0 - v_1, v_1 - v_2]$. As for a 1-simplex, the weight is equal to:

$$\lambda_i = \begin{cases} \dfrac{(x - v_1)^T (v_1 - v_0)}{(v_1 - v_0)^T (v_1 - v_0)}, & i = 0 \\ 1 - \lambda_0, & i = 1 \end{cases} \tag{7}$$

If the projection index $0 \leq \lambda_i \leq 1$, the projection position locates inside the face. Otherwise, it locates outside the face. For $\lambda_i > 1$ or $\lambda_i < 0$, on one hand, it can lead to an incorrect projection for distant points. On the other hand, it provides a "forward" and "backward" interpolation along a face when the number of training sample is small.
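The projection weights of Eq. (5) can be illustrated with a short Python sketch. Note the assumptions: instead of the paper's matrix $B$, the sketch uses the standard least-squares formulation with edge vectors taken relative to the last vertex, $B = [v_0 - v_k, \cdots, v_{k-1} - v_k]$, which gives weights that sum to one by construction; the helper names are hypothetical. It projects onto the affine hull of the simplex and then reports whether all weights lie in $[0, 1]$.

```python
import numpy as np


def barycentric_weights(x, vertices):
    """Least-squares weights lambda_0..lambda_k with sum(lambda) = 1.

    Assumption (differs from the paper's notation): edge vectors are
    taken relative to the last vertex v_k.
    """
    V = np.asarray(vertices, dtype=float)  # shape (k + 1, N)
    x = np.asarray(x, dtype=float)
    base = V[-1]
    B = (V[:-1] - base).T                  # shape (N, k)
    coeffs, *_ = np.linalg.lstsq(B, x - base, rcond=None)
    return np.append(coeffs, 1.0 - coeffs.sum())


def project_to_simplex(x, vertices, tol=1e-12):
    """Return (x_p, lambdas, inside) for the affine-hull projection of x."""
    V = np.asarray(vertices, dtype=float)
    lam = barycentric_weights(x, V)
    x_p = lam @ V                          # Eq. (5): weighted vertex sum
    inside = bool(np.all((lam >= -tol) & (lam <= 1.0 + tol)))
    return x_p, lam, inside


if __name__ == "__main__":
    segment = [[0.0, 0.0], [2.0, 0.0]]     # a 1-simplex in the plane
    print(project_to_simplex([0.5, 1.0], segment))  # falls inside the face
    print(project_to_simplex([3.0, 1.0], segment))  # falls outside the face
```

For the second query point the weights leave the interval $[0, 1]$, which is the situation discussed above: the projection extrapolates along the face beyond its end vertex.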
To make a compromise between preventing data from being incorrectly projected outside the face and preserving the extrapolation ability of the topological structure, we introduce a parameter $\gamma$ to compute the projection position and the corresponding projection distance as follows:

$$x_p = \begin{cases} v_i + (1 + \gamma)(v_j - v_i), & \text{if } \lambda_i \geq 1 + \gamma \\ v_i - \gamma (v_j - v_i), & \text{if } \lambda_i \leq -\gamma \end{cases} \tag{8}$$

where $v_i, v_j$ denote two different vertices of a simplex. Then the distance between a sample $x$ and the simplicial complex of the $c$-th class is:

$$d_{NSC}(x \mid SC_c) = \min_{\ell} \, (x - x_p^{\ell,c})^T A \, (x - x_p^{\ell,c}), \quad \ell = 1, \cdots, m; \; c = 1, \cdots, C \tag{9}$$

where $m$ denotes the number of simplices in the complex, $C$ is the number of classes, and $A$ is a non-negative matrix (see footnote 4). Finally, we assign the sample to the class whose simplicial complex is nearest to the sample:

$$C(x) = \arg\min_{c} d_{NSC}(x \mid SC_c), \quad c = 1, \cdots, C \tag{10}$$

4 Experiments

Experiments are performed to evaluate the performance of the NSC approach. Here, two face recognition datasets and eight UCI benchmark data sets [16] are used, as listed in Tab. 1. We also use five simulated datasets and four practical multi-view datasets.

The five simulated datasets are generated from different topological structures plus random noise with variance $\rho$. They are 1) D1: two concentric circles ($\rho = 1.0$); 2) D2: two spirals ($\rho = 3.5$); 3) D3: circle-cross-circle ($\rho = 2.0$); 4) D4: four circle-cross-circle ($\rho = 2.0$) and 5) D5: sphere-cross-sphere ($\rho = 1.5$) datasets, as shown in Fig. 4. Each dataset includes 2 classes, each of which has 500 training samples and 500 test samples without overlap. We use the max-min sampling strategy to select 50% of the training samples as the landmark points [4] and the remaining samples as the witness points to construct the prime simplicial complexes. Some examples of these complexes are illustrated in Fig. 4.
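The landmark and witness split used here can be sketched in Python as follows. This is a minimal sketch of max-min sampling as described in Section 2.1, assuming Euclidean distances; the function name `maxmin_split` and its `seed` parameter are illustrative and not part of the paper.

```python
import numpy as np


def maxmin_split(X, n_landmarks, seed=None):
    """Split sample indices into landmarks (max-min sampling) and witnesses.

    The first landmark is drawn at random; each further landmark is the
    point whose minimal distance to the landmarks chosen so far is largest.
    Assumes n_landmarks does not exceed the number of distinct points.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(X)))
    landmarks = [first]
    # min_dist[i] = distance from point i to its closest landmark so far.
    min_dist = np.linalg.norm(X - X[first], axis=1)
    while len(landmarks) < n_landmarks:
        nxt = int(np.argmax(min_dist))
        landmarks.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    chosen = set(landmarks)
    witnesses = [i for i in range(len(X)) if i not in chosen]
    return landmarks, witnesses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(500, 2))   # stand-in for one class of samples
    landmarks, witnesses = maxmin_split(X_train, len(X_train) // 2, seed=0)
    print(len(landmarks), len(witnesses))
```

With `n_landmarks = len(X_train) // 2`, half of the training samples of a class become landmarks and the other half act as witnesses, mirroring the 50% setting stated above.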
From the figures we can see that the NSC approaches preserve the structure representation well.

The four practical multi-view data sets used for object recognition are COIL-20 [17], COIL-100 [18], SOIL-47A and SOIL-47B [19]. The COIL-20 dataset consists of 20 objects, each of which has 72 different views that are sampled every 5° around an axis passing through the object. Each object is an image with size 128 × 128. We subsample them to 32 × 32 ones. The COIL-100 dataset has 100 objects which is collected with the same way as the COIL-20. We subsample each object image to colored 16 × 16 one, short for COIL-100A and gray 32 × 32 one, short for COIL-100B.

[Footnote 4: It can be obtained by metric learning which goes beyond the scope of this paper. In our paper, we set it to be an identity matrix or inverse covariance matrix. The former one is equivalent to a Euclidean distance. Meanwhile, the latter one leads to a classical Mahalanobis distance, named NSC-M.]

Table 1. Description of several benchmark data sets. Here "#", "Dim" denote the number of samples and means dimension, respectively. 'C' denotes the number of classes, and 'RA' denotes the ratio of the number of training samples in each dataset or the number of training samples versus that of test samples. The latter one means that training set and test set have been separated by their provider.

Datasets                  #       Dim     C    RA
ORL                       400     10304   40   0.5
UMIST                     575     10304   20   0.5
Iris                      150     4       3    0.5
Landsat Satellite         6335    36      2    0.1
Image Segmentation        2310    16      7    210/2100
Gaussian Elena            5000    8       2    0.5
Breast Cancer Wisconsin   569     31      2    0.5
Phoneme                   5404    5       2    0.1
Pendigits                 10992   17      10   7494/3498
Optdigits                 5620    65      10   3823/1797

Table 2. Experiment I: The influence of $f$ to the classification performance on the five simulated datasets. Experiment II: The influence of $R_{max}$ and comparison with other algorithms. The experiment results are the average of 20 repetitions. In the table, A ± B means average error rate and standard deviation (%).

                         D1            D2             D3            D4             D5
Experiment I: R_max = 0.5, γ = 0
NSC (f = 0)              4.24 ± 0.68   13.38 ± 1.18   8.09 ± 1.03   9.08 ± 1.05    3.85 ± 0.56
NSC-M (f = 0)            4.40 ± 0.85   13.27 ± 1.10   9.52 ± 0.92   13.29 ± 1.55   7.62 ± 1.04
NSC (f = 1)              3.81 ± 0.59   11.70 ± 1.09   6.01 ± 0.74   6.94 ± 0.82    2.97 ± 0.57
NSC-M (f = 1)            3.83 ± 0.59   11.66 ± 1.13   6.54 ± 0.79   8.53 ± 1.00    5.15 ± 0.92
NSC (f = 2)              3.87 ± 0.81   10.68 ± 1.05   6.19 ± 0.70   6.55 ± 0.72    2.89 ± 0.73
NSC-M (f = 2)            3.79 ± 0.76   10.63 ± 1.06   6.69 ± 0.63   7.79 ± 1.05    4.29 ± 0.84
Experiment II: f = 2, γ = 0
NSC: R_max = 0.5         3.41 ± 0.51   11.70 ± 0.62   5.77 ± 0.42   6.58 ± 0.57    2.68 ± 0.70
NSC-M: R_max = 0.5       3.43 ± 0.52   11.67 ± 0.70   6.23 ± 0.49   8.06 ± 0.76    4.36 ± 1.01
NSC: R_max = 1.0         3.83 ± 0.74   10.60 ± 0.76   6.18 ± 0.46   6.77 ± 0.70    2.42 ± 0.67
NSC-M: R_max = 1.0       3.78 ± 0.74   10.59 ± 0.76   6.46 ± 0.52   7.60 ± 0.89    3.42 ± 0.88
1-NN                     4.58 ± 0.53   13.24 ± 0.80   7.30 ± 0.88   7.88 ± 0.71    3.13 ± 0.86
3-NN                     4.20 ± 0.63   11.74 ± 0.87   6.65 ± 0.76   7.09 ± 0.73    2.83 ± 0.77
SVM-G                    3.24 ± 0.62   10.23 ± 0.87   5.46 ± 0.73   6.30 ± 0.63    2.35 ± 0.52

Fig. 4. From left to right, from top to bottom: D1 to D5 datasets. In each panel, red dotted line and blue line denote 1-simplex and 2-simplex, respectively. The test sets are generated based on the same distribution. Note that the fifth dataset cannot be shown correctly in the three-dimensional space since two spheres which cross each other can only be seen in four or higher-dimensional space.
The SOIL-47A and SOIL-47B datasets are sampled under different illuminations [19]. Each dataset consists of 47 objects, each of which has 21 different views that are sampled every 9° around an axis passing through the object. Each object image is subsampled to a colored image with size 24 × 30. All of these images directly serve as the feature vectors. Some objects in the three practical datasets are shown in Fig. 5. As to the two face datasets, the UMIST face dataset [20] is a multi-view one for testing the robustness of our approach, and the ORL dataset [21] is another popular benchmark for face recognition. It is worth mentioning that our object recognition is at the instance level, i.e., all the data points in a data set belong to the same category, and is not in the sense of the VOC (visual object classes) challenges.

We compare the performance of our approaches with 1-nearest neighbor (1-NN), 3-NN and SVM with Gaussian RBF kernels (SVM-G) [22]. The parameters in SVM are tuned by cross-validation. The whole training set is used by these approaches.

Fig. 5. From Left to Right: examples of COIL-20 [17], COIL-100 [18] and SOIL-47 [19] benchmark datasets.

4.1 Simulated datasets and parameter influences

We investigate the influence of $f$ on the five datasets. Given $f = 0, 1, 2$, $R_{max} = 0.5$ and $\gamma = 0$, the average results of 20 repetitions are shown in Tab. 2. From Tab. 2 we can see that the performance of the proposed approaches with $f = 2$ is better in most cases. A possible reason is that, as [4] pointed out, $f = 2$ provides a clean persistent interval graph with little "noise". Therefore, it leads to a more stable structure representation. Note that in practical noisy environments, such graphs cannot be easily obtained.
Meanwhile, $f = 1$ can be interpreted as arising from a family of coverings of the space $X$ with Voronoi-like regions surrounding each landmark point. We thus set the parameter $f$ to be 1 or 2 in the subsequent experiments. Note that in the "small training samples" case shown in the next subsection, we use the Rips complex, which does not need the parameter $f$, for object recognition. Furthermore, we found that the Mahalanobis distance is helpful to improve the performance of the proposed algorithms in some cases.

We also study the influence of the parameter $R_{max}$ in determining the optimal value $R^{*}$, which is closely related to the selection of the prime simplicial complex per class. We perform experiments on the five simulated datasets by selecting a group of $R_{max}$ values, followed by computing the corresponding $R^{*}$. The results are shown in Tab. 2 and Fig. 6. From the results we can see that when $R_{max}$ lies in the interval $[0.3, 1]$, the classification performance is better. A reason is that the radius of these simulated datasets is close to 0.5. As a result, the topological structure can be preserved well when $R_{max}$ is selected around 0.5.

Furthermore, we compare the NSC approaches with 1-NN, 3-NN and SVM methods on the five 2-class datasets. The reported results are shown in Tab. 2. It can be seen from Tab. 2 that on these five datasets, the NSC approaches are always better than 1-NN and 3-NN, and achieve competitive performance compared with SVM-G. It is worth noting that, as illustrated in Fig. 4, our approaches also preserve reasonable structure representations of these data distributions.

Fig. 6. Parameter Influence on the D1 simulated dataset. (Panel title: Two Concentric Circles [f=2]; axes: R versus NSC mean error; legend: R_max, R chosen.)
4.2 Small training samples and high-dimensional datasets

We test the proposed approaches on four multi-view object recognition datasets, each of which can be regarded as being generated from a circle-shaped structure. We use four different views per class (i.e., 0°, 90°, 180° and 270°) and eight views per class as the training set, respectively. The remaining images are used as the test set. Since the number of training samples is small, we set $\gamma$ to be 1 to enhance the extrapolation ability of NSC, and employ the Rips method [4] instead for the construction of prime simplicial complexes based on the whole training set. Here $R$ is set to be 30. Note that $R$ in the Rips method is different from that in the Lazywitness method. The results are shown in Tab. 3. From the results we can see that, compared with NN and SVM algorithms, the proposed NSC approaches achieve the best performance in 4 out of 5 datasets. In the SOIL-47B dataset, the performance of NSC is slightly worse than that of the SVM algorithms. This indicates that the proposed NSC approaches can work well on high-dimensional multi-view structures. Note that here we have not reported the results of the NSC-M approach since the computation of the covariance matrix is ill-posed when the number of samples is less than the dimension of a data set.

Furthermore, we also observe that with 8 views as the training samples, our approach obtains performance competitive with that of state-of-the-art algorithms using 4 views on the COIL and SOIL datasets [23]. However, the latter utilize very effective feature extraction and image registration techniques. In contrast, our approaches achieve a good trade-off between recognition accuracy and topology preservation by only introducing 4 additional views.

Table 3. The error rates and standard deviations (%) of several approaches in the 12 practical datasets. Here '4V' and '8V' denote 4 and 8 views, respectively.
                          NSC            NSC-M          1-NN           3-NN           SVM-G
COIL-100A (4V)            12.84          N/A            16.85          28.40          14.23
COIL-100A (8V)            2.81           N/A            5.33           13.36          4.06
COIL-100B (4V)            24.01          N/A            29.50          43.66          24.23
COIL-100B (8V)            7.50           N/A            12.78          27.67          8.36
SOIL-47A (4V)             16.67          N/A            19.12          56.99          21.08
SOIL-47A (8V)             11.61          N/A            13.54          26.49          12.65
SOIL-47B (4V)             22.92          N/A            23.41          40.93          22.67
SOIL-47B (8V)             15.33          N/A            18.30          25.60          14.58
COIL-20 (4V)              15.00          N/A            16.76          28.24          17.36
COIL-20 (8V)              2.97           N/A            5.39           12.27          4.85
ORL                       7.02 ± 2.00    N/A            8.78 ± 2.62    17.23 ± 2.5    6.38 ± 2.01
UMIST                     3.66 ± 1.52    N/A            5.34 ± 1.52    11.05 ± 2.36   6.43 ± 1.31
Iris                      5.27 ± 2.10    4.00 ± 2.10    7.40 ± 1.81    5.93 ± 2.10    4.53 ± 1.80
Landsat Satellite         13.49 ± 0.44   20.40 ± 0.61   13.54 ± 0.38   13.58 ± 0.55   11.89 ± 0.54
Image Segmentation        6.38           45.90          7.10           10.62          6.05
Gaussian-elena            15.24 ± 0.58   34.61 ± 1.23   20.15 ± 0.70   18.52 ± 0.55   9.98 ± 0.51
Breast Cancer Wisconsin   3.77 ± 0.95    10.63 ± 2.03   5.12 ± 1.32    4.18 ± 0.83    3.18 ± 1.12
Phoneme                   19.91 ± 0.61   21.85 ± 0.66   16.13 ± 0.45   16.57 ± 0.54   15.40 ± 0.78
Pendigits                 2.20           4.20           2.57           2.43           1.83
Optdigits                 3.06           3.95           3.45           3.28           1.56

4.3 Face recognition

We compare our approaches with others in ORL [21] and UMIST [20] face recognition datasets. In ORL dataset, the images of each subject are taken at different times with various lighting, facial expressions and facial details [21]. In UMIST dataset, the images of each subject are taken by varying angles from left profile to right profile. We employ PCA to reduce the original dimensions to 40-dimensional subspaces since empirically, the subspaces preserve most of the principal structures. Furthermore, we also employ Rips method to construct the prime simplicial complexes based on the whole training set. The results are shown in Tab. 3.
It can be seen from the results that the NSC approach obtains the best performance on the UMIST data set and ranks second on the ORL data set.

4.4 UCI datasets

Finally, we evaluate the performance of the NSC approaches on 8 UCI datasets. Different from the aforementioned datasets, these datasets are taken from remarkably different domains. The results are shown in Tab. 3. We can see from Tab. 3 that the proposed NSC approaches achieve competitive performance on these datasets. NSC ranks first in 2 of 8 datasets and second in 5 of 8 datasets. This means that although they are devoted to preserving structures, the proposed NSC approaches can also be applied to some general fields.

4.5 Discussion

Here we perform a significance analysis of the proposed NSC approaches based on the results shown in Tab. 3. With a significance level of 5%, the p-values of the paired t-test for the NSC approaches on the 20 data sets are shown in Tab. 4. It indicates that NSC, 1-NN and SVM-G are statistically similar on these datasets.

Table 4. The p-value of the paired t-test results based on Tab. 3. The p-value in bold type indicates a rejection of the null hypothesis at the 5% significance level, which means there is significant difference between the two approaches.

            NSC vs 1-NN   NSC vs 3-NN     NSC vs SVM-G
p-value     0.4014        0.0101 (bold)   0.8163

We also want to discuss some limitations of the proposed approaches. First of all, although our goal is to preserve the topological structure of datasets, current persistent homology techniques, like our approaches, can only provide approximations to the true topological invariants. It is also unclear whether such topological structures indeed exist in high-dimensional data sets.
Secondly, the evaluation is to a certain extent unfair to our approaches since other approaches use the whole training sets to train their models, whereas, due to the nature of the witness complex, we can use at most 50% of the training set as landmark points to build our classification model for large-scale training samples. Thirdly, the computational complexity is higher. Specifically, given the dimension $d$ and the number of samples $n$ in the dataset, the computational complexity of Rips complexes is $O(d \cdot n^2)$, and that of witness complexes is $O(r \cdot d \cdot n^2)$, where $r$ is the ratio of the number of landmark points to $n$. Furthermore, the computational complexity of computing the nearest distance from a data point to the prime simplicial complexes is $O(n^2)$. Finally, when data follow a Gaussian distribution, the proposed approaches will lose their merits in recognizing objects.

5 Conclusion

We propose new structure-preserving NSC approaches by utilizing persistent homology in this paper. We refine the construction of the simplicial complex by removing some simplices that are redundant to the NSC approaches. We present a new Barcode method to determine a prime simplicial complex per class for classification. We also propose a nearest projection technique by computing the distance from unlabelled samples to the prime simplicial complexes. Furthermore, we generalize the extrapolation ability of simplicial complexes with a projection constraint term. Experiments indicate that, compared with several well-known algorithms, our proposed NSC approaches achieve promising performance without losing the structure representation.

In this paper, the proposed approaches do not consider how to deal with recognizing faces in the wild.
However, our goal is to design a topology-preserving classifier for object recognition and supervised learning, and the "face in the wild" problem can be avoided by employing a near-infrared sensor to alleviate the influence of the background if we attempt to apply our approach to such a scenario.

In the future, we will investigate how to apply the NSC approaches to other practical applications with more complex topological structures. Furthermore, how to construct a more suitable prime simplicial complex deserves study. Moreover, we will study the performance of the proposed NSC approaches for object recognition at the category level rather than at the instance level. Finally, we will consider further refining the performance of the NSC approaches by utilizing metric learning methods.

References

1. Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290 (2000) 2323–2326
2. Tenenbaum, J., Silva, V., Langford, J.: A global geometric framework for nonlinear dimensionality reduction. Science 290 (2000) 2319–2323
3. Lewandowski, M., Makris, D., Nebel, J.C.: View and style-independent action manifolds for human activity recognition. In: ECCV. (2010) 547–560
4. de Silva, V., Carlsson, G.: Topological estimation using witness complexes. In Alex, M., Rusinkiewicz, S., eds.: Eurographics Symposium on Point-Based Graphics. (2004)
5. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. In: IEEE Symposium on Foundations of Computer Science. (2000) 454–463
6. Munkres, J.R.: Elements of Algebraic Topology. Massachusetts Institute of Technology, Cambridge, Massachusetts (1984)
7. Carlsson, G., Ishkhanov, T., de Silva, V., Zomorodian, A.: On the local behavior of spaces of natural images. IJCV 76 (2007) 1–12
8. Oudot, S.Y., Guibas, L.J., Gao, J., Wang, Y.: Geodesic Delaunay triangulations in bounded planar domains. ACM Transactions on Algorithms 6 (2010)
9. Spanier, E.H.: Algebraic Topology. McGraw-Hill Book Co. (1966)
10. Zomorodian, A., Carlsson, G.: Computing persistent homology. In: IEEE Symposium on Computational Geometry. (2004)
11. Collins, A., Zomorodian, A., Carlsson, G., Guibas, L.: A barcode shape descriptors for curve point cloud data. In Alex, M., Rusinkiewicz, S., eds.: Eurographics Symposium on Point-Based Graphics. (2004)
12. de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebraic & Geometric Topology 7 (2007) 339–358
13. Bendich, P.: Analyzing Stratified Spaces Using Persistent Versions of Intersection and Local Homology. PhD thesis, Department of Mathematics, Duke University (2009)
14. Bendich, P., Wang, B., Mukherjee, S.: Towards stratification learning through homology inference, http://arxiv.org/abs/1008.3572 (2010)
15. Adler, R.J., Bobrowski, O., Borman, M.S., Subag, E., Weinberger, S.: Persistent homology for random fields and complexes, http://arxiv.org/abs/1003.1001 (2010)
16. Asuncion, A., Newman, D.J.: UCI machine learning repository (2007)
17. Nene, S., Nayar, S., Murase, H.: Columbia object image library (COIL-20). Technical Report CUCS-005-96, Columbia University (1996)
18. Nene, S., Nayar, S., Murase, H.: Columbia object image library (COIL-100). Technical Report CUCS-006-96, Columbia University (1996)
19. Koubaroulis, D., Matas, J., Kittler, J.: Evaluating colour-based object recognition algorithms using the SOIL-47 database. In: ACCV. (2002) 840–845
20. Graham, D.B., Allinson, N.M.: Face recognition: From theory to applications. In: NATO ASI Series F, Computer and Systems Sciences. Volume 163. (1998) 446–456
21. Samaria, F., Harter, A.: Parameterisation of a stochastic model for human face identification. In: Proceedings of 2nd IEEE Workshop on Applications of Computer Vision. (1994)
22. Canu, S., Grandvalet, Y., Guigue, V., Rakotomamonjy, A.: SVM and kernel methods matlab toolbox. Perception Systèmes et Information, INSA de Rouen, Rouen, France (2005)
23. Mori, G., Belongie, S., Malik, J.: Shape contexts enable efficient retrieval of similar shapes. In: CVPR. (2001)
