Cross-Domain Sparse Coding

Cr oss-Domain Spars e Coding Jim Jing-Y an W ang 1 , 2 1 Unive rsity at Buffalo , The State University of Ne w Y ork, Buffalo , NY 14203, USA 2 National K ey Laborato r y for No vel S oftware T echnology , Nanjing Unive rsity , Nanji ng 210023, China jimjywang @gmail.com ABSTRA CT Sparse codin g has show n its p o wer a s an eﬀective data rep- resen tation method. How ever, up to no w, all the sparse cod - ing approac hes are limited wi thin the single domain learning problem. In this pap er, w e extend the sparse cod in g t o cross domain learning problem, which tries to learn from a source domain to a target domai n with signiﬁcant diﬀeren t distribu- tion. W e imp ose the Maximum Mean Discrepancy (MMD) criterion to redu ce th e cross-domain distribution diﬀerence of sparse co des, and also regularize the sparse co des by the class lab els of the samples from b oth domains to increase the discriminativ e abilit y . The encouragi ng exp erimen t results of the proposed cross-domain sparse co d ing alg orithm on tw o chall enging tasks — image classiﬁcation of photograph and oil painting domains, and multiple user spam detection — show t h e adv antage of the prop osed metho d o ver other cross-domain d ata representation metho ds. Categories and Subject Descriptors I.2.6 [ AR TIFICIAL INTELLIGENCE ]: Learning Keyw ords Cross-Domain Learning; Sparse Coding; Maximum Mea n Discrepancy 1. INTR ODUCTION T raditional ma chine l earning metho ds usually assume that there are suﬃcient t raining sa mples to train the classiﬁer. How ever, in many real-w orld app licatio ns, t h e num b er of labeled samples are alwa y s limited, making the learned clas- siﬁer not robust enough. Recently , cross-domain learning has been proposed to solve this problem [2], by borrow ing labeled samples from a so called “source domain” for the learning p roblem of th e “target domain” in h and. The sam- ples from these tw o domains hav e diﬀerent d istributions but are related, and share th e same class lab el and feature space. Tw o t yp es of domain transfer learning metho ds hav e b een Permission to make digital or hard copies of all or part of this work for personal or classroom us e is granted without fee provided that copies are not made or distributed for proﬁt or commercial advantage and that copies bear this notice and the full cita- tion on the ﬁrst page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. T o copy otherwise, or re- publish, to pos t on servers or to redistribute to lis ts, requires prior s peciﬁc permission and/or a fee. Request permissions from permissions@acm. or g. Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00. studied: classiﬁe r tr ansfer metho d which learns a classi- ﬁer for the target domain by t he target d omain samples with help of the source domain samples, while cross domain data represent ation tries to map all the samples from b oth source and target domains t o a data representation space with a common distribution across domains, whic h could b e used t o train a single d omain classiﬁer for t he target domain [2, 5, 8, 9]. In this pap er, we fo cus on the cross do- main rep resen tation p roblem. Some w orks hav e b een done in this ﬁeld by v arious data representation metho ds. F or exam- ple, Blitzer et al. [2] proposed the structural corresp ondence learning (SCL) algorithm to indu ce corresp ondence among features from th e source and target domains, Daume I II [5] prop osed the feature rep lication (FR) metho d to augmen t features for cross-domain learning. Pan et al. [8] prop osed transfer comp onent analysis (TCA) whic h learns transfer compon ents across d omains via Maximum Mean Discrep- ancy (MMD) [3], and extended it to semi-sup ervised TCA (SSTCA) [9]. Recently , sparse coding has attracted man y attention as an eﬀective d ata representation metho d, w hich represen t a data sample as the sparse linear combinatio n of some code- w ords in a cod ebo ok [7]. Most of th e sparse coding algo- rithms are unsup ervised, due to the small num b er of la- b eled samples. Some semi-supervised sparse coding meth- ods are proposed to utilize the labeled samples and signif- ican t p erformance improv ement has b een rep orted [4]. In this case, it would b e very interesting to inv estigate the use of cross-domain represen tation to p rovide more a v ailable la- b eled samples from t he source domain. T o our kno wledge, no work has b een done u sing the sparse coding meth od to solv e the cross-domain problem T o ﬁll in t h is gap, in this p a- p er, we prop ose a no vel cross-domain sparse cod ing method to combine the adva ntages of b oth sparse co ding and cross- domain learning. T o this end, we will try to learn a common codeb o ok for the sparse codin g of the samples from b oth the s ource and target domains. T o utilize the class lab els, a semi -sup ervised regulariza tion will also b e in trod uced to the sparse codes. Moreov er, t o reduce the mismatch b etw een the distributions of the sparse co des of the source and target samples, w e adapt t h e MMD rule to sparse co des. The remaining of this pap er is organized as follo ws: In Section 2 , we will introd uce the formulatio ns of th e pro- p osed Cross-Domain Sparse coding (CroDomSc), and its im- plementatio ns. Section 3 reports exp erimental results, and Section 4 concludes the pap er. 2. CR OSS-DOMAIN SP ARSE C ODING In t his section, w e will introdu ce the p roposed CroDomSc metho d . 2.1 Objecti ve function W e denote the t raining dataset with N samples as D = { x i } N i =1 ∈ R D , where N is the num b er of data samples, x i is the feature vector of the i -th sample, an d D is the feature dimensionalit y . It is also organized as a matrix X = [ x 1 , · · · , x N ] ∈ R D × N . The training set is comp osed of the source domain set D S and target domain set D T , i.e., D = D S S D T . W e also denote N D and N T as the num b er of samples in source and target domain set separativel y . All the samples from the source d omain set D S are l ab eled, while only a few samples from the target domain D T are lab eled. F or each labeled sample x i , w e d enote its class lab el as y i ∈ C , where C is the class lab el space. T o construct th e ob jective function, w e consider the follow ing three p rob lems: Sparse Co ding Problem Given a sample x i ∈ D and a codeb o ok matrix U = [ u 1 , · · · , u K ] ∈ R D × K , where the k -th column is the k -th cod ew ord and K is the num b er of codewords in the co deb ook, sparse coding tries to reconstruct x by the linear reconstruction of the co dew ords in the codeb o ok as x i ≈ P K k =1 v ki u k = U v i , where v i = [ v 1 i , · · · , v K i ] ⊤ ∈ R K is the recon- struction coeﬃcient vector for x i , which should b e as sparse as possible, thus v i is also called sparse code. The problem of sparse co ding can b e formulated as follo ws: min U,V X i : x i ∈D  k x i − U v i k 2 2 + α k v i k 1  = k X − U V k 2 2 + α X i : x i ∈D k v i k 1 s.t. k u k k ≤ c, k = 1 , · · · , K (1) where V = [ v 1 , · · · , v N ] ∈ R K × N is the sparse co de matrix, with its i -th collum the sparse co d e of i -th sample. Semi-Sup ervised Sparse Co ding Re gul arization In th e sparse code space, th e in tra-class v ariance should b e minimized while the inter-cla ss va riance should b e max- imized for all th e samples labeled, from b oth t arget and source domains. W e ﬁ rst deﬁne the semi-sup ervised regularization matrix as W = [ W ij ] ∈ { +1 , − 1 , 0 } N × N , where W ij =    +1 , if y i = y j , − 1 , if y i 6 = y j , 0 , if y i or y j is unk own. (2) W e deﬁ ne the degree of x i as d i = P j : x j ∈D W ij , D = diag ( d 1 , · · · , d N ), and L = D − W as the is the Lapla- cian matrix. Then we formulate the semi-sup ervised regularization p roblem as the follo wing problem: min V X i,j : x i , x j ∈D k v i − v j k 2 2 W ij = tr [ V ( D − W ) V ⊤ ] = tr ( V LV ⊤ ) (3) In this w a y , the l 2 norm distance b etw een sparse co des of intra -class pair ( W ij = 1) will b e minimized, while inter-cla ss pair ( W ij = − 1) maximized. Reducing Mismatch of Sparse Code Di s tribution T o reduce the mismatch of the d istributions of th e source domain and target domain in the sparse co d e space, w e ad opt the MMD [3] as a criteri on, whic h is based on the minimization of the d istance b etw een the means of codes from tw o domains. The problem of redu cing the mismatc h of the sparse cod e distribution b etw een source and target domains could b e formatted as fol- lo ws, min V    1 N S P i : x i ∈D S v i − 1 N T P j : x j ∈D T v j    2 = k V π k 2 2 = T r [ V π π ⊤ V ⊤ ] = T r ( V Π V ⊤ ) (4) where π = [ π 1 , · · · , π N ] ⊤ ∈ R N with π i the domain indicator of i -th sample deﬁned as π i = ( 1 N S , x i ∈ D S , − 1 N T , x i ∈ D T . (5) and Π = π π ⊤ . By summarizing the formulations in (1), (3) and (4), t he CroDomSc problem is mo deled as the follo wing optimization problem: min U,V k X − U V k 2 2 + β T r [ V LV ⊤ ] + γ T r [ V Π V ⊤ ] + α X i : x i ∈D k v i k 1 = k X − U V k 2 2 + T r [ V E V ⊤ ] + α X i : x i ∈D k v i k 1 s.t. k u k k ≤ c, k = 1 , · · · , K (6) where E = ( β L + γ Π). 2.2 Optimization Since direct optimization of (6) is diﬃcult, an iterativ e, tw o-step strategy is used to optimize the cod ebo ok U and sparse co des V alternately while ﬁxing the oth er one. 2.2.1 On optimizing V by ﬁxing U By ﬁxing the co debo ok U , the optimization problem (6) is redu ced to min V k X − U V k 2 2 + T r [ V E V ⊤ ] + α X i : x i ∈D k v i k 1 (7) Since the reconstruction error term can b e rewritten as k X − U V k 2 2 = P i : x i ∈D k x i − U v i k 2 2 , and th e sparse code regularization term could b e rewritten as T r [ V E V ⊤ ] = P i,j : x i , x j ∈D E ij v ⊤ i v j , (7) could b e rewritten as: min V X i : x i ∈D k x i − U v i k 2 2 + X i,j : x i , x j ∈D E ij v ⊤ i v j + α X i : x i ∈D k v i k 1 (8) When up dating v i for any x i ∈ D , th e other codes v j ( j 6 = i ) for x j ∈ D , j 6 = i are ﬁxed . Thus, we get the follow ing optimization prob lem: min v i k x i − U v i k 2 2 + E ii v ⊤ i v i + v ⊤ i f i + α k v i k 1 (9) with f i = 2 P j : x j ∈D ,j 6 = i E ii v j . The ob jective function in (9) could b e optimized eﬃciently by the modiﬁed feature-sign searc h algorithm prop osed in [10]. 2.2.2 On optimizing U by ﬁxing V By ﬁx ing the sparse co des V an d removing irrelev ant terms, the opt imization p roblem (6) is red uced to min U k X − U V k 2 2 s.t. k u k k 2 2 ≤ c, k = 1 , · · · , K (10) The p roblem is a least square problem with quadratic con- strain ts, and it can b e solv ed in the same wa y as [7]. 2.3 Algorithm The p roposed Cross Domai n Sparse co ding algor ithm, named as CroDomSc , is summarized in A lgorithm 1. W e hav e applied the original sparse coding method s to the sam- ples from b oth the source and target domains for initializa- tion. Algorithm 1 CroDom-Ss Algorithm INPUT : T raining sample set from b oth source and target sets D = D S S D T ; Initialize the co deb ooks U 0 and sparse co des V 0 for sam- ples in D by using single domain sp arse co ding. for t = 1 , · · · , T do for i : x i ∈ D do Up date the sp arse co de v t i for x i by ﬁxing U t − 1 and other sparse co des v t − 1 j for x j ∈ D , j 6 = i by solving (9). end for Up date the co deb o ok U t by ﬁxing the sparse co d e ma- trix V t by solving ( 10). end for OUTPUT : U T and V T . When a test sample from target domain comes, we simply solv e problem (9) to obtain its sparse co de. 3. EXPERIMENTS In the exp eriments , w e exp erimen t ally ev aluate th e pro- p osed cross domain data representation meth od, CroDomSc. 3.1 Experimen t I: Cr oss-Domain Image Clas - siﬁcation In the ﬁrst exp erimen t, w e considered the problem of cross domain image cl assiﬁcation of the ph otographs and the oil paintings , which are treated as tw o diﬀerent domains. 3.1.1 Dataset and Setup W e collected an image d atabase of b oth photographs and oil paintings. The database contains totally 2,000 images of 20 seman t ical classes. There are 100 images in each class, and 50 of them are photographs, and the remaining 50 ones are oil paintings. W e extracted and concatenated the color, texture, shap e and bag-of-w ords h istogra m features as visual feature vector from each image. T o conduct the exp eriment, we use photograph domain and oil painting domain as source domain and target domain in turns. F or each target d omain, we randomly split it into training subset (600 images) and test subset (400 images), while 200 images from the training sub set are randomly se- lected as lab el samples and all the source domain samples are labeled. The random splits are rep eated for 1 0 times. W e ﬁrst p erform the CroDomSc to the training set and u se the sparse co d es learned to train a semi-sup ervised SV M classiﬁer [6]. Then the test samples will also b e represented as sparse co de and classiﬁed u sing the learned SV M. 3.1.2 Results W e compare our CroDomSc against severa l cross-Domain data representation metho ds: SSTCA [9], TCA [8], FR [5] and SCL [2]. The boxplots of the classiﬁcation accuracies of the 10 splits using photograph and oil p ain t ing as target domains are rep orted in Figure 1. F rom Figure 1 we can see that the prop osed CroDomSc outp erforms the other four compet in g metho ds for b oth photograph and oil painting domains. It’s also interesting to notice th at the classiﬁcation of the FR and SCI metho ds are p o or, at around 0.7. SSTCA and TCA seems b etter than FR and SC but are still not compet itive to CroDomSc. CroDomSc SSTCA TCA FR SCL 0.5 0.6 0.7 0.8 0.9 1 Target Domain: Photograph Accuracy (a) Photograph as target domain CroDomSc SSTCA TCA FR SCL 0.5 0.6 0.7 0.8 0.9 1 Target Domain: Oil Painting Accuracy (b) Oil p ain t ing as t arget domain Figure 1: The b oxplot of classiﬁcation accuracies of 10 spli ts of CroDomSc and compared m ethods. 3.2 Experimen t II: Multiple User Spam Email Detection In the second experiment, w e wi ll ev aluate the prop osed cross-domain d ata representation metho d for the m ultiple user based spam email detection. 3.2.1 Dataset and Setup A email dataset with 15 in b o xes from 15 diﬀerent users is used in this exp eriment [1]. There are 400 email samples in eac h inbox, and half of th em are spam and the other half non-spam. Due to the signiﬁcan t diﬀerences of the email source among diﬀerent users, the email set of diﬀerent users could b e treated as diﬀerent domains. T o condu ct the ex periment, w e randomly select tw o users’ inboxes as source and target domains. The target domai n will further b e split into test set (100 emails) and training set (300 emails, 100 of which lab eled, and 200 unlab eled). The source domain emails are all lab eled. The word o ccurrence frequency histogra m is extracted from each email as origi- nal feature vector. The CroDomSc algorithm was p erformed to learn the sparse co de o f both source and target d omain samples, whic h w ere used to train the semi-sup ervised clas- siﬁer. The target domain test samples were also represented as sparse cod es, which were classiﬁed u sing the learned clas- siﬁer. This selection will b e rep eated for 40 times to reduce the b ias of each selection. 3.2.2 Results Figure 2 shows the b o xplots of classiﬁcation accuracies on th e spam detection task. A s we can observed from the ﬁgure, the prop osed CroDomSc alw ays outp erforms its com- p etitors. This is another solid evidence of the eﬀectiveness of the sparse coding metho d for the cross-domain represen- tation problem. Mo reov er, SSTCA, whic h is also a semi- sup ervised cross-domain representation metho d, seems to outp erform other metho ds in some cases. How ever, the dif- ferences of its p erformances and oth er ones are not signiﬁ- cant. CroDom−Sc SSTCA TCA FR SCL 0.7 0.8 0.9 1 ACC ACC Figure 2: The b o xplots of detection accuracies of 40 runs for spam dete ction task. 4. CONCLUSION In this paper, we introd uce the ﬁrst sparse co ding algo- rithm for cross-domain data representa tion problem. T he sparse co de distribution diﬀerences b etw een source and tar- get domai ns are reduced b y regularizi ng sparse co des wi th MMD criterion. Moreo ver, the class labels of b oth source and target domain samples are ut ilized to encourage the dis- criminativ e abilit y . The developed cross-domain sparse co d- ing al gorithm is tested o n t w o cross-domain learning tas ks and th e eﬀectiveness was shown. Acknowledgeme nts This work was supp orted by the National Key Lab oratory for Nov el Soft wa re T ec hnology , Nanjing U nive rsity (Grant No. KFKT2012B1 7). 5. REFERE NCES [1] S. Bic kel. ECML/PKDD Discov ery Challenge 2006, September 2006. [2] J. Blitzer, R. McDonald, and F. P ereira. Domain adaptation with structural corresp ondence learning. In 2006 Confer enc e on Em piric al M etho ds in Natur al L anguage Pr o c essing, Pr o c e e dings of the Confer enc e , pages 120 – 128, 2006. [3] K. M. Borgwa rdt, A . Gretton, M. J. Rasc h, H.-P . Kriegel, B. Schoelkopf, and A. J. S mola. Integrating structured biological data by Kernel Maximum Mean Discrepancy. Bi oinformatics , 22(14):E49– E57, JUL 2006. [4] L. Cao, R. Ji, Y. Gao, Y. Y ang, and Q . Tian. W eakly Sup ervised Sparse Co ding with Geometric Consistency P o oling. In CVPR 2012 , pages 3578–35 85. IEEE, 2012. [5] H. Daume I I I. F rustratingly easy d omain adaptation. In ACL 2007 - Pr o c e e dings of the 45th Annua l Me eting of the Asso ciation for Computational Linguistics , p ages 256 – 263, 2007. [6] B. Geng, D . T ao, C. Xu, L. Y ang, and X.-S. H ua. Ensem ble Manifold R egulariza tion. IEEE T r ansactions on Pattern Ana lysis and Machine Intel ligenc e , 34(6):1227–1233 , JUN 2012. [7] H. Lee, A . Battle, R. R aina, and A. Y. N g. Eﬃcient sparse co ding algorithms. In NI PS , pages 801–808. NIPS, 2007. [8] S. J. Pan, I. W. Tsang, J. T. K wok, and Q. Y ang. Domain Adaptation via T ransfer Component Analysis. In IJCAI 2009 , pages 1187–11 92, 2009. [9] S. J. Pan, I. W. Tsang, J. T. K wok, and Q. Y ang. Domain Adaptation via T ransfer Component Analysis. IEEE T r ansactions on Neur al Networks , 22(2):199– 210, FEB 2011. [10] M. Zheng, J. Bu, C. Chen, C. W ang, L. Z h ang, G. Q iu, and D . Cai. Graph Regularized Sparse Co ding for Image Representation. IEEE T r ansactions on Image Pr o c essing , 20(5):1327– 1336, MA Y 2011.

Cross-Domain Sparse Coding

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment