A Novel Multiple Kernel Sparse Representation based Classification for Face Recognition
KSII Transactions on Internet and Information Systems (TIIS). 2014. Apr, 8(4): 1463-1480
Copyright © 2014, Korean Society For Internet Information
  • Received : December 20, 2013
  • Accepted : March 25, 2014
  • Published : April 28, 2014
About the Authors
Hao Zheng
School of Mathematics and Information Technology, Nanjing XiaoZhuang University Nanjing, China
Qiaolin Ye
Computer science department, Nanjing Forestry University Nanjing, China
Zhong Jin
School of Computer Science and Technology, Nanjing University of Science and Technology Nanjing, China

Abstract
It is well known that sparse coding is effective for feature extraction in face recognition; in particular, the sparse model can be learned in a kernel space to obtain better performance. Some recent algorithms use a single kernel in the sparse model, but this does not make full use of the kernel information. The key issues are how to select suitable kernel weights and how to combine the selected kernels. In this paper, we propose a novel multiple kernel sparse representation based classification for face recognition (MKSRC), which performs sparse coding and dictionary learning in a multiple kernel space. First, several candidate kernels are combined and the sparse coefficients are computed; then the kernel weights are obtained from the sparse coefficients. Finally, iterating until convergence makes the kernel weights optimal. The experimental results show that our algorithm outperforms other state-of-the-art algorithms and demonstrate the promising performance of the proposed algorithm.
1. Introduction
Over the last decade, there has been rapid development in image recognition, and many algorithms have been proposed, such as Eigenface [1], locality preserving projection (LPP) [2], Fisherface [3], and maximum margin criterion (MMC) [4]. In a recent work, Yu and Tao [5] proposed an adaptive hypergraph learning method for transductive image classification. Afterward, a semi-supervised classification algorithm named semi-supervised multiview distance metric learning [6] was proposed. To efficiently combine visual features for subsequent retrieval and synthesis tasks, another semi-supervised multiview subspace learning algorithm [7] was proposed. In addition, Wang et al. [8] proposed a neighborhood similarity measure to explore the local sample and label distributions. To integrate multiple complementary graphs into a regularization framework, the optimized multigraph-based semi-supervised learning algorithm [9] was subsequently proposed, and other related methods [10-14] were proposed as well. Within image recognition, face recognition is a particularly challenging problem, and the classifier plays an important role in the final performance. The nearest-neighbor (NN) algorithm is extremely simple, yet accurate and applicable to various problems [15]. The simplest 1-NN algorithm assigns an input sample to the category of its nearest neighbor in the labeled training set. To address the NN's shortcoming that only one training sample is used to represent the test face image, the nearest feature line classifier [16] was proposed, which uses two training samples of each class to represent the test face image. Then the nearest feature plane classifier [17] was proposed, which uses three samples to represent the test image. Later, to represent the test image by all the training samples of each class, the local subspace classifier [18] and the nearest subspace classifier [19-21] were proposed. The support vector machine (SVM) classifier is another classifier, solidly based on the theory of structural risk minimization in statistical learning. It is well known that the SVM maps the inputs to a high-dimensional feature space and then finds a large-margin hyperplane between the two classes, which can be obtained through quadratic programming. For noisy images, the sparse representation based classifier (SRC) [22] performs well, and SRC shows exciting results in dealing with occlusion by assuming a sparse coding residual. Later, many extended algorithms were proposed, e.g. Gabor SRC [23], SRC for face misalignment or pose variation [24-25], SRC for continuous occlusion [26], and heteroscedastic SRC [27].
Recently the kernel approach [28] has attracted great attention. It offers an alternative way to increase the computational power of linear learning machines by mapping the data into a high dimensional feature space. The approach has been studied and extended to kernel-based algorithms such as kernel principal component analysis (KPCA) [29] and kernel Fisher discriminant analysis (KFD) [30, 31]. As an extension of the conventional nearest-neighbor algorithm, the kernel optimization algorithm [32-38] was proposed, which can be realized by substituting a kernel distance metric for the original one in Hilbert space. By choosing an appropriate kernel function, the results of the kernel nearest-neighbor algorithm are better than those of the conventional nearest-neighbor algorithm. Similarly, the single-kernel SVM classifier was proposed, and various remedies were introduced, such as the reduced set method [39], [40], the bottom-up method [41], the building of a sparse large margin classifier [42], [43], and the incremental building of a reduced-complexity classifier.
However, the above methods have some disadvantages. NN predicts the category of the test image using only its nearest neighbor in the training data, so it is easily affected by noise. NS assigns the test image to the category that minimizes the reconstruction error, so its performance is not ideal when the classes are highly correlated with each other. The shortcoming of the SVM is that it is often not as compact as other classifiers such as neural networks. Fortunately, Wright et al. [22] proposed the sparse representation based classifier for face recognition (SRC), which first codes a testing sample as a sparse linear combination of all the training samples, and then classifies the testing sample by evaluating which class leads to the minimum representation error. SRC is much more effective than state-of-the-art methods in dealing with face occlusion, corruption, lighting and expression changes, etc. It is well known that if an appropriate kernel function is utilized, more neighbors of a test sample are likely to share its class label in the high dimensional feature space. Sparse representation in the high dimensional space can therefore improve recognition performance and discriminative ability. Methods such as the kernel sparse representation based classification algorithm (KSRC) [44-46] were proposed along these lines. However, it is often unclear which kernel is most suitable for the task at hand, and hence the user may wish to combine several candidate kernels. One problem with simply adding kernels is that uniform weights are possibly not optimal. To overcome this, we propose a novel algorithm named the multiple kernel sparse representation based classifier (MKSRC), which can optimize the kernel weights while training the dictionary. The contributions of this paper can be summarized as follows.
• 1) We propose a multiple kernel sparse representation based classifier. By making full use of the kernel information, classification performance is improved compared with state-of-the-art classifiers.
• 2) Through dictionary learning, the kernel weights can be adaptively selected. Because the weights are adjusted automatically, our classifier is more robust, especially for occluded images.
• 3) We conduct experiments on two facial image databases, under both no occlusion and block occlusion. The experimental results validate the effectiveness of the new classifier.
2. Related Work
- 2.1 Sparse representation based classification
Sparse representation based classification (SRC) was reported by Wright [22] for robust face recognition. In Wright's pioneering work, the training face images are used as the dictionary of representative samples, and an input test image is coded as a sparse linear combination of these sample images via l1-norm minimization.
Given a signal (or an image) y ∈ ℜ^m, and a matrix A = [a_1, a_2, ⋯, a_n] ∈ ℜ^{m×n} containing the elements of an overcomplete dictionary in its columns, the goal of sparse representation is to represent y using as few entries of A as possible. This can be formally expressed as follows:
min_x ‖x‖_0  s.t.  y = Ax    (1)
where x ∈ ℜ^n is the coefficient vector, and ‖x‖_0 is the l0-norm, which is equal to the number of non-zero components of x. However, this criterion is not convex, and finding the sparsest solution of Eq. (1) is NP-hard. Fortunately, this difficulty can be overcome by convexifying the problem and solving
min_x ‖x‖_1  s.t.  y = Ax    (2)
where the l1-norm is used instead of the l0-norm. It can be shown that if the sought solution x is sparse enough, the solution of the l1 minimization problem is equal to the solution of the l0 minimization problem.
Finally, for each class i, let δ_i : ℜ^n → ℜ^n be the characteristic function which selects the coefficients associated with the i-th class. Using only the coefficients associated with the i-th class, one can approximately reconstruct the test sample y as
ŷ_i = A δ_i(x̂_1)    (3)
then classify y based on these approximations by assigning it to the class that minimizes the residual:
r_i(y) = ‖y − A δ_i(x̂_1)‖_2    (4)
If r_l(y) = min_i r_i(y), y is assigned to class l.
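As a concrete illustration of this decision rule, the following Python sketch implements SRC with a Lasso solver standing in for the l1 minimization of Eq. (2); the function and variable names (src_classify, labels) are ours, and the alpha value is only a placeholder for a cross-validated choice.

```python
import numpy as np
from sklearn.linear_model import Lasso  # l1-regularized stand-in for Eq. (2)

def src_classify(A, labels, y, alpha=0.001):
    """Classify a test sample y by sparse representation over dictionary A.

    A      : (m, n) matrix whose columns are training samples
    labels : length-n array with the class label of each column of A
    y      : (m,) test sample
    alpha  : sparsity weight of the relaxed (Lasso) problem
    """
    # Solve min_x ||y - A x||^2 + alpha*||x||_1, a relaxed form of Eq. (2)
    # (sklearn scales the quadratic term by 1/(2m) internally).
    x_hat = Lasso(alpha=alpha, fit_intercept=False,
                  max_iter=10000).fit(A, y).coef_

    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x_hat, 0.0)     # delta_c(x_hat), Eq. (3)
        residuals[c] = np.linalg.norm(y - A @ x_c)  # r_c(y), Eq. (4)
    return min(residuals, key=residuals.get)        # class of smallest residual
```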
Now suppose that the face image is partially occluded or corrupted. The problem can then be expressed as follows:
y = Ax + ε    (5)
where ε is the residual caused by occlusion or corruption. We can approximately reconstruct the test sample y as
ŷ_i = A δ_i(x̂) + ε̂
then compute the residuals:
r_i(y) = ‖y − A δ_i(x̂) − ε̂‖_2
If r_l(y) = min_i r_i(y), y is assigned to class l.
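Wright et al. [22] handle the occlusion term by augmenting the dictionary with an identity matrix, so that y = Ax + ε becomes a sparse coding problem over [A, I]. The sketch below follows that reading of Eq. (5); the names and the solver choice are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify_occluded(A, labels, y, alpha=0.001):
    """SRC under occlusion: code y over [A, I] so that y = A x + e (Eq. (5))."""
    m, n = A.shape
    B = np.hstack([A, np.eye(m)])           # extended dictionary [A, I]
    w_hat = Lasso(alpha=alpha, fit_intercept=False,
                  max_iter=10000).fit(B, y).coef_
    x_hat, e_hat = w_hat[:n], w_hat[n:]     # sparse code and occlusion estimate

    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x_hat, 0.0)
        # r_c(y) = ||y - A delta_c(x) - e||_2, matching the residual above
        residuals[c] = np.linalg.norm(y - A @ x_c - e_hat)
    return min(residuals, key=residuals.get)
```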
- 2.2 Kernel sparse representation based classification (KSRC) [44]
It is well known that the kernel approach can change the distribution of samples by mapping them into a high dimensional feature space through a nonlinear mapping. In the high dimensional feature space, a sample can be represented more accurately by the sparse representation dictionary.
Suppose there are p classes in all, the set of training samples is A = [A_1, A_2, …, A_p] = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] ∈ ℜ^{d×N}, where N = Σ_{i=1}^{p} n_i is the total number of training samples, and y ∈ ℜ^{d×1} is a test sample. The samples are mapped from the original feature space into a high dimensional feature space, y → φ(y), A = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] → U = [φ(x_{1,1}), φ(x_{1,2}), …, φ(x_{p,n_p})], by a nonlinear mapping φ : ℜ^d → ℜ^K (d < K). The sparse representation model can be formulated as
min_x ‖x‖_0  s.t.  φ(y) = Ux    (6)
where φ(y) is the test sample in the high dimensional feature space. Because Eq. (6) is NP-hard, its solution can be obtained through the following Eq. (7):
min_x ‖x‖_1  s.t.  φ(y) = Ux    (7)
In the presence of noise, Eq. (7) should be relaxed, and the following optimization problem is obtained:
min_x ‖x‖_1  s.t.  ‖φ(y) − Ux‖_2 ≤ ε    (8)
Though U and φ(y) are unknown, according to [44] we can prove that Eq. (8) is equivalent to the following Eq. (9):
min_x ‖x‖_1  s.t.  ‖k_y − Kx‖_2 ≤ ε    (9)
where K ∈ ℜ^{N×N} is the kernel Gram matrix with entries K_{ij} = k(x_i, x_j) = φ(x_i)^T φ(x_j), and k_y = [k(x_{1,1}, y), k(x_{1,2}, y), …, k(x_{p,n_p}, y)]^T ∈ ℜ^N.
The procedure of the KSRC algorithm is summarized as Algorithm 1:
Algorithm 1. KSRC.
3. Multiple Kernel Sparse Representation based Classifier (MKSRC)
Suppose there are p classes in all, the set of training samples is A = [A_1, A_2, …, A_p] = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] ∈ ℜ^{d×N}, and y ∈ ℜ^{d×1} is the test sample. The traditional sparse coding model is equivalent to the so-called LASSO problem [47]:
min_x ‖y − Ax‖_2^2 + σ‖x‖_1    (10)
where σ > 0 is a constant.
Suppose there is a feature mapping function φ : ℜ^d → ℜ^K (d < K). It maps the features and basis to the high dimensional feature space:
y → φ(y), A = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] → U = [φ(x_{1,1}), φ(x_{1,2}), ⋯, φ(x_{p,n_p})]. One problem is that a single kernel is not necessarily the most suitable one, so we wish to combine several candidate kernels. Multiple kernel sparse representation based classification (MKSRC) is a way of optimizing the kernel weights while training the dictionary. The multiple kernel model of Lanckriet [48] is
k(x_i, x_j) = Σ_{k=1}^{M} α_k k_k(x_i, x_j)    (11)
and we constrain the kernel weights by Σ_{k=1}^{M} α_k^2 = 1, α_k ≥ 0. Substituting the mapped features and basis into the sparse coding formulation, we obtain the following objective function:
min_{x,α} ‖φ(y) − Ux‖_2^2 + σ‖x‖_1  s.t.  Σ_{k=1}^{M} α_k^2 = 1, α_k ≥ 0    (12)
where the mapping φ is induced by the combined kernel of Eq. (11).
The Lagrangian function for Eq. (12) is:
J(x, α, λ) = ‖φ(y) − Ux‖_2^2 + σ‖x‖_1 + λ(Σ_{k=1}^{M} α_k^2 − 1)    (13)
For samples x and y, we have:
k(x, y) = φ(x)^T φ(y) = Σ_{k=1}^{M} α_k k_k(x, y)    (14)
Therefore
‖φ(y) − Ux‖_2^2 = k(y, y) − 2 k_y^T x + x^T K x = Σ_{k=1}^{M} α_k h_k,  with  h_k = k_k(y, y) − 2 (k_y^{(k)})^T x + x^T K^{(k)} x    (15)
where K^{(k)} is the Gram matrix of the k-th kernel with K^{(k)}_{ij} = k_k(x_i, x_j), and (k_y^{(k)})_i = k_k(x_i, y).
Setting the derivative of J with respect to the primal variable α_k to zero,
∂J/∂α_k = h_k + 2λα_k = 0    (16)
Finally we obtain:
α_k = −h_k/(2λ) = h_k / √(Σ_{j=1}^{M} h_j^2)    (17)
where λ is chosen so that the constraint Σ_k α_k^2 = 1 is satisfied.
Because φ(y) and U are unknown, Eq. (12) cannot be solved directly. But according to [34], Eq. (12) can be transformed to
min_x x^T K x − 2 k_y^T x + σ‖x‖_1    (18)
where K = Σ_{k=1}^{M} α_k K^{(k)} is the combined Gram matrix and k_y = Σ_{k=1}^{M} α_k k_y^{(k)} is the combined kernel vector.
Since the initial weights are only an estimate and not optimal, the implementation of MKSRC is an iterative process. The iteration is stopped when the change of the weights α is small enough, which can be formulated as ‖α^{t+1} − α^t‖ ≤ tol. To verify the convergence of the MKSRC algorithm, experiments on the ORL database were conducted. It is straightforward to see that the proposed MKSRC algorithm converges, because the recognition rate is stable after several iterations, as illustrated in Fig. 1.
Fig. 1. Illustration of the convergence of Algorithm 2.
The MKSRC algorithmic procedure can be summarized as Algorithm 2:
Algorithm 2. MKSRC.
4. Experiments and discussions
In this section, we perform experiments on face databases to demonstrate the efficiency of MKSRC. To evaluate the performance of MKSRC more comprehensively, Section 4.1 describes the comparison methods and experimental configurations, Section 4.2 tests FR without occlusion, and Section 4.3 tests FR with block occlusion. For the experiments we chose three kernel functions: the linear kernel, the polynomial kernel, and the Gaussian kernel, whose kernel parameters were tuned using cross validation. For statistical stability, we generate ten different training and test dataset pairs by random permutation. We compare the performance of the proposed MKSRC with state-of-the-art classifiers: SVM [41], SRC [22], KSRC (Polynomial) [44], and KSRC (Gaussian) [44].
- 4.1 Comparison methods and configurations
To verify the performance of the MKSRC method, we selected the following methods for comparison.
• 1) SVM. Here, we use the one-versus-all strategy and select the RBF kernel. The radius parameter is tuned to its optimal value through cross validation.
• 2) SRC. SRC is an effective classifier which codes a testing sample as a sparse linear combination of all the training samples. To obtain better performance, we select the basis pursuit algorithm. The parameter value of SRC, 0.001, is selected by cross validation.
• 3) KSRC (Polynomial). KSRC uses a polynomial kernel function to improve classification performance. In the experiments without occlusion, the kernel parameter is set to 2 on the FERET face database and to 13 on the ORL face database, while under block occlusion both values are set to 2.
• 4) KSRC (Gaussian). KSRC uses a Gaussian kernel function to improve classification performance. In the experiments without occlusion, the kernel parameter is set to 2 on both the FERET and ORL face databases, while under block occlusion it is set to 2 on the FERET face database and to 3 on the ORL face database. (The kernel forms we assume are sketched after this list.)
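For concreteness, the following sketch writes out one standard form for each of the three kernel families used in the comparison. The exact parameterizations are our assumption, since the paper only names the parameters d and t.

```python
import numpy as np

# Illustrative kernel definitions; the exact parameterizations used in the
# paper are not spelled out, so these standard forms are assumptions.

def linear_kernel(x, y):
    return x @ y

def polynomial_kernel(x, y, d=2):
    # d is the degree parameter tuned in Fig. 4(a)
    return (x @ y + 1.0) ** d

def gaussian_kernel(x, y, t=2.0):
    # t is the width parameter tuned in Fig. 4(b); the 2*t**2 scaling is
    # one common convention, assumed here
    diff = x - y
    return np.exp(-(diff @ diff) / (2.0 * t ** 2))
```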
- 4.2 Face recognition without occlusion
- 1) The FERET face dataset
The FERET database [49] was used in our experiments, including the images marked with the two-character strings “ba,” “bj,” “be,” “bk,” “bf,” “bd,” and “bg.” Thus, the entire data set includes 1400 images of 200 different subjects, with 7 images per subject. All these images were aligned according to the center points of the eyes and are of size 80 by 80. Some sample images are shown in Fig. 2. 800 images of the 200 subjects were randomly used for training, while the remaining 600 were used for testing. Table 1 and Fig. 3 show the recognition rates of the different algorithms. We can see that the MKSRC algorithm achieves higher performance than SRC and KSRC. At dimension 300, the recognition rate of MKSRC is 69.32%, which is 4.5% higher than that of SVM.
Fig. 2. Sample images of one person from the FERET face database.
Table. Important notations used in this paper and their description.
Fig. 3. The average recognition rates of SVM, SRC, KSRC (Polynomial), KSRC (Gaussian) and MKSRC versus the dimensions on the FERET face database.
For the selection of the kernel parameters, we consider the candidate interval from 1 to 10. For simplicity of computation, we find the optimal kernel parameters within this interval through single kernel experiments. Fig. 4(a) shows the recognition rates of the polynomial kernel versus the parameter d, and Fig. 4(b) shows the recognition rates of the Gaussian kernel versus the parameter t. From Fig. 4, the optimal values of both parameters d and t are 2.
Fig. 4. (a) Recognition rates versus the parameter d of the polynomial kernel. (b) Recognition rates versus the parameter t of the Gaussian kernel.
Table 1. Accuracy on the FERET face database.
- 2) The ORL face dataset
The ORL face database consists of 400 frontal face images of 40 subjects, captured under various lighting conditions and cropped and normalized to 112 × 92 pixels. We randomly split the database into two halves: one half (5 images per person) was used for training, and the other half for testing. The images are reduced to 30, 60, and 110 dimensions, respectively. Table 2 and Fig. 5 illustrate the face recognition rates of the different methods. We can see that the recognition rates increase with larger dimensions. Our MKSRC algorithm achieves recognition rates between 89% and 97.8%, much better than the other algorithms; in particular, at dimension 60 MKSRC achieves the best performance.
Table 2. Accuracy on the ORL face database.
Fig. 5. The average recognition rates of SVM, SRC, KSRC (Polynomial), KSRC (Gaussian) and MKSRC versus the dimensions on the ORL face database.
From the experiments without occlusion, we can see that the proposed MKSRC method outperforms not only SRC but also KSRC. The experimental results demonstrate that kernel information helps improve the recognition rate. This can be attributed to two reasons: 1) face image features in the kernel feature space contain more effective discriminant information than features in the original feature space, so the samples can be separated more easily; 2) an appropriate kernel combination makes the test sample in the high dimensional feature space reflect its class label information more accurately. In addition, different kernel functions lead to different experimental results, so the selection of the kernel functions and their kernel parameters is important.
- 4.3 Face recognition with block occlusion
- 1) FERET database
The next experiment concerns occlusion on the FERET database. We randomly take four face images of each person for training and the remaining three for testing. We simulate various levels of contiguous occlusion, from 10% to 30%, by replacing a randomly located square block of each test image with an unrelated image. Again, the location of the occlusion is randomly chosen for each image and is unknown to the computer. For computational convenience, the dimension is reduced to 50.
From Table 3 we can see that the accuracy of all the methods declines as the occlusion level increases, which indicates that the loss of features affects face recognition performance. But MKSRC achieves better performance than the other algorithms. When the occlusion is 30%, SRC achieves only 25.5%, while MKSRC achieves 29.2%, a 3.7% improvement over SRC.
Table 3. Accuracy on the FERET face database under occlusion.
- 2) ORL database with regular shapes occlusion
Next, we test the robustness of MKSRC to block occlusion using the ORL face dataset. We randomly take one half of the images for training and the rest for testing. We simulate various levels of contiguous occlusion, from 10% to 30%, by replacing a randomly located square block of each test image with an unrelated image. Again, the location of the occlusion is randomly chosen for each image and is unknown to the computer. A test example from ORL with a 30% occluded block is shown in Fig. 6. Here, for computational convenience, the images are cropped to 32 × 32 and reduced to 60 dimensions. The results of these experiments are even more encouraging. From Table 4 we can see that the accuracy of all the methods declines as the occlusion level increases, which indicates that the loss of features affects face recognition performance. But MKSRC retains a good performance of 74.8% when the occlusion percentage is 30%. These experiments verify that the combination of multiple kernels can improve face recognition performance.
Fig. 6. A test example from the ORL face database with a 30% occluded block.
Table 4. Accuracy on the ORL face database under occlusion.
- 3) ORL database with irregular shapes occlusion
The next experiment is more challenging: we chose an irregularly shaped occlusion, such as a conch. The location of the occlusion is randomly chosen for each image and is unknown to the computer. A test example from ORL with an irregularly shaped occlusion is shown in Fig. 7. Here, for computational convenience, the images are cropped to 32 × 32 and reduced to 60 dimensions. From Table 5 we can see that the MKSRC method retains good performance, with an accuracy 5.7% higher than that of SRC. This demonstrates that the MKSRC method is stable and suitable under different occlusion conditions.
Fig. 7. A test example from the ORL face database with an irregularly shaped occluded block.
Table 5. Accuracy on the ORL face database under irregular shape occlusion.
The face experiments with block occlusion demonstrate that the MKSRC method is more robust than the other methods. We conducted exhaustive experiments not only on two face image databases, but also under both regular and irregular shape occlusion. Because the kernel weights can be adaptively selected, the MKSRC method finds a more suitable kernel combination and, as a result, achieves better performance than the other methods. As the occlusion rate increases, the performance of the proposed method does not decline significantly. This means that the multiple kernel classifier is not sensitive to occlusion.
5. Conclusion
This paper proposed a multiple kernel sparse representation based classification. On high-dimensional data such as face images, the KSRC algorithm achieves better performance than SRC, but it does not make full use of kernel information. The MKSRC algorithm solves this problem by combining several candidate kernels, e.g. the Gaussian kernel, while selecting suitable weights for the kernel functions. On various face databases the MKSRC algorithm achieves the best performance. Because the kernel parameters are important for recognition performance, we will focus on estimating the kernel parameters in future work.
BIO
Hao Zheng received his BS degree from SouthEast University in 1998, the MS degree from Nanjing University of Posts and Telecommunications in 2005, and the PhD degree in pattern recognition and intelligence systems from Nanjing University of Science and Technology in 2013. He visited the Center of Quantum Computation & Intelligent Systems, University of Technology Sydney, Australia, from September 2013 to March 2014. He is currently an associate professor with the School of Mathematics and Information Technology at Nanjing Xiaozhuang University. His research interests include pattern recognition, image processing, face recognition, and computer vision.
Qiaolin Ye received the BS degree in Computer Science from Nanjing Institute of Technology, Nanjing, China, in 2007, the MS degree in Computer Science and Technology from Nanjing Forestry University, Jiangsu, China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology, Jiangsu, China, in 2013. He is currently an associate professor with the computer science department at the Nanjing Forestry University, Nanjing, China. He has authored more than 30 scientific papers in pattern recognition, machine learning and data mining. His research interests include machine learning, data mining, and pattern recognition.
Zhong Jin received his BS in mathematics, MS in applied mathematics, and PhD in pattern recognition and intelligence systems from Nanjing University of Science and Technology (NUST), China, in 1982, 1984, and 1999, respectively. He is a professor in the Department of Computer Science, NUST, and previously was a research assistant at the Department of Computer Science and Engineering, Chinese University of Hong Kong, from 2000 to 2001. He visited the Laboratoire HEUDIASYC, Universite de Technologie de Compiegne, France, from October 2001 to July 2002. He visited the Centre de Visio per Computador, Universitat Autonoma de Barcelona, Spain, as a Ramon y Cajal Research Fellow from September 2005 to October 2005. His current interests are in the areas of pattern recognition, computer vision, face recognition, facial expression analysis, and content-based image retrieval.
References
Turk M. , Pentland A. 1991 “Eigenfaces for recognition” J. Cognitive Neuroscience Article (CrossRef Link) 3 (1) 71 - 86    DOI : 10.1162/jocn.1991.3.1.71
He X. , Yan S. , Hu Y. , Niyogi P. , Zhang H. J. 2005 “Face recognition using laplacianfaces” IEEE Trans. Pattern Anal. Mach. Intell. Article (CrossRef Link) 27 (3) 328 - 340    DOI : 10.1109/TPAMI.2005.55
Belhumeur P. N. , Hespanha J. P. , Kriengman D. J. 1997 “Eigenfaces versus Fisherfaces: Recognition using class specific linear projection” IEEE Trans. Pattern Anal. Mach. Intell. Article (CrossRef Link) 19 (7) 711 - 720    DOI : 10.1109/34.598228
Li H , Jiang T , Zhang K. 2004 “Efficient and robust feature extraction by maximum margin criterion” MIT Press In Proc. of Proceedings of the advances in neural information processing systems Vancouver, Canada vol.16, Article (CrossRef Link)
Yu J. , Tao D. 2012 “Adaptive Hypergraph Learning and Its Application in Image Classification” IEEE Trans. on Image Processing Article (CrossRef Link) 21 (7) 3262 - 3271    DOI : 10.1109/TIP.2012.2190083
Yu J. , Wang M. , Tao D. 2012 “Semisupervised Multiview Distance Metric Learning for Cartoon Synthesis” IEEE Trans. on Image Processing Article (CrossRef Link) 21 (11) 4636 - 4648    DOI : 10.1109/TIP.2012.2207395
Yu J. , Liu D. , Tao D. 2012 “On Combining Multiview Features for Cartoon Character Retrieval and Clip Synthesis” IEEE Trans. on Systems, Man, and Cybernetics Article (CrossRef Link) 42 (5) 1413 - 1427    DOI : 10.1109/TSMCB.2012.2192108
Wang M. , Hua X. , Hong R. , Tang J. , Qi G. , Song Y. 2009 “Unified Video Annotation via Multigraph Learning” IEEE Trans. on Circuits and System for Video Technology Article (CrossRef Link) 19 (5) 733 - 746    DOI : 10.1109/TCSVT.2009.2017400
Wang M. , Hua X. , Tang J. , Hong R. 2009 “Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation” IEEE Trans. on Multimedia Article (CrossRef Link) 11 (3) 465 - 476    DOI : 10.1109/TMM.2009.2012919
Wang M. , Ni B. , Hua X. , Chua T. 2012 “Assistive Tagging: A Survey of Multimedia Tagging with Human-Computer Joint Exploration” ACM Computing Surveys Article 25, Article (CrossRef Link) 4 (4)
Wang M. , Hua X. 2011 “Active Learning in Multimedia Annotation and Retrieval: A Survey” ACM Transactions on Intelligent Systems and Technology Article (CrossRef Link) 2 (2) 10 - 31    DOI : 10.1145/1899412.1899414
Gao Y. , Wang M. , Zha Z. , Shen J. 2012 “Visual Texttual Joint Relevance Learning for Tag-Based Social Image Search” IEEE Trans. on Image Processing Article (CrossRef Link) 22 (1) 363 - 376    DOI : 10.1109/TIP.2012.2202676
Gao Y. , Wang M. , Tao D. , Ji R. 2012 “3D Object Retrieval and Recognition with Hypergraph Analysis” IEEE Trans. on Image Processing Article (CrossRef Link) 21 (9) 4290 - 4303    DOI : 10.1109/TIP.2012.2199502
Gao Y. , Wang M. , Zha Z. , Tian Q. 2011 “Less is More: Efficient 3D Object Retrieval with Query View Selection” IEEE Trans. on Multimedia Article (CrossRef Link) 13 (5) 1007 - 1018    DOI : 10.1109/TMM.2011.2160619
Duda R.O. , Hart P.E. 1973 “Pattern Classification and Scene Analysis” Wiley New York Article (CrossRef Link)
Li S.Z. , Lu J. 1999 “Face recognition using nearest feature line method” IEEE Trans. Neural Network Article (CrossRef Link) 10 (2) 439 - 443    DOI : 10.1109/72.750575
Chien J.T. , Wu C.C. 2002 “Discriminant waveletfaces and nearest feature classifiers for face recognition” IEEE Trans. Pattern Analysis and Machine Intelligence Article (CrossRef Link) 24 (12) 1644 - 1649    DOI : 10.1109/TPAMI.2002.1114855
Laaksonen J. 1997 “Local subspace classifier” in Proc. of Int’l Conf. Artificial Neural Networks Article (CrossRef Link)
Lee K. , Ho J. , Kriegman D. 2005 “Acquiring linear subspaces for face recognition under variable lighting” IEEE Trans. Pattern Analysis and Machine Intelligence Article (CrossRef Link) 27 (5) 684 - 698    DOI : 10.1109/TPAMI.2005.92
Li S.Z. 1998 “Face recognition based on nearest linear combinations” in Proc. of IEEE Int’l Conf. Computer Vision and Pattern Recognition Article (CrossRef Link)
Naseem I. , Togneri R. , Bennamoun M. 2010 “Linear regression for face recognition” IEEE Trans. Pattern Analysis and Machine Intelligence Article (CrossRef Link) 32 (11) 2106 - 2112    DOI : 10.1109/TPAMI.2010.128
Wright J. , Yang A.Y. , Ganesh A. , Sastry S.S. , Ma Y. 2009 “Robust face recognition via sparse representation” TPAMI Article (CrossRef Link) 31 (2) 210 - 227    DOI : 10.1109/TPAMI.2008.79
Yang M. , Zhang L. 2010 “Gabor Feature based Sparse Representation for Face Recognition with Gabor Occlusion Dictionary” in Proc. of European Conf. Computer Vision Article (CrossRef Link)
Huang J.Z. , Huang X.L. , Metaxas D. 2008 “Simultaneous image transformation and sparse representation recovery” in Proc. of IEEE Conf. Computer Vision and Pattern Recognition Article (CrossRef Link)
Wagner A. , Wright J. , Ganesh A. , Zhou Z.H. , Ma Y. 2009 “Towards a Practical Face Recognition System: Robust Registration and Illumination by Sparse Representation” in Proc. of IEEE Conf. Computer Vision and Pattern Recognition Article (CrossRef Link)
Zhou Z. , Wagner A. , Mobahi H. , Wright J. , Ma Y. 2009 “Face recognition with contiguous occlusion using markov random fields” in Proc. of IEEE Int’l Conf. Computer Vision Article (CrossRef Link)
Zheng H. , Xie J. , Jin Z. 2012 “Heteroscedastic Sparse Representation Classification for Face Recognition” Neural Processing Letters Article (CrossRef Link) 35 (3) 233 - 244    DOI : 10.1007/s11063-012-9214-4
Aizerman M. A. , Braverman E. M. , Rozonoer L. I. 1964 “Theoretical foundation of potential function method in pattern recognition learning” Automat. Remote Contr. Article (CrossRef Link) 25 821 - 837
Scholkopf B. , Smola A. , Muller K. 1998 “Nonlinear component analysis as a kernel eigenvalue problem” Neural Comput Article (CrossRef Link) 10 1299 - 1319    DOI : 10.1162/089976698300017467
Mika S. , Ratsch G. , Weston J. , Scholkopf B. , Muller K. 1999 “Fisher discriminant analysis with kernels” in Proc. of the 1999 IEEE Signal Processing Society Workshop Neural Networks for Signal Processing vol. IX, Article (CrossRef Link) 41 - 48
Mika S. , Ratsch G. , Scholkopf B. , Smola A. , Weston J. , Muller K.R. 1999 “Invariant feature extraction and classification in kernel spaces” in Proc. of the 13th Annual Neural Information Processing Systems Conference Article (CrossRef Link) 526 - 532
Argyriou A. , Hauser R. , Micchelli C. A. , Pontil M. 2006 “A DC algorithm for kernel selection” in Proc. of 23rd Int. Conf. Mach. Learn. Pittsburgh, PA Article (CrossRef Link) 41 - 49
Argyriou A. , Micchelli C. A. , Pontil M. 2005 “Learning convex combinations of continuously parameterized basic kernels” in Proc. of 18th Annu. Conf. Learn. Theory Bertinoro, Italy Article (CrossRef Link) 338 - 352
Ong C. S. , Smola A. J. , Williamson R. C. 2005 “Learning the kernel with hyperkernels” J. Mach. Learn. Res. Article (CrossRef Link) 6 1043 - 1071
Rakotomamonjy A. , Bach F. , Canu S. , Grandvalet Y. 2007 “More efficiency in multiple kernel learning” in Proc. of 24th Int. Conf. Mach.Learn. Corvallis, OR Article (CrossRef Link) 775 - 782
Rakotomamonjy A. , Bach F. R. , Canu S. , Grandvalet Y. 2008 “SimpleMKL” J. Mach. Learn. Res. Article (CrossRef Link) 9 2491 - 2521
Ratsch G. , Schafer C. , Scholkopf B. 2006 “Large scale multiple kernel learning” J. Mach. Learn. Res. Article (CrossRef Link) 7 1531 - 1565
Zien A. , Ong C. S. 2007 “Multiclass multiple kernel learning” in Proc. of 24th Int. Conf. Mach. Learn. Corvallis, OR Article (CrossRef Link) 1191 - 1198
Burges C. J. C. 1996 “Simplified support vector decision rules” in Proc. of 13th Int. Conf. Mach. Learn. San Mateo, CA Article (CrossRef Link) 71 - 77
Scholkopf B. , Smola A. 2002 “Learning with Kernels” MIT Press Cambridge,MA Article (CrossRef Link)
Nguyen D. , Ho T. 2005 “An efficient method for simplifying support vector machines” in Proc. of 22nd Int. Conf. Mach. Learn. Bonn, Germany Article (CrossRef Link) 617 - 624
Wu M. , Scholkopf B. , Bakir B. 2006 “A direct method for building sparse kernel learning algorithms” J. Mach. Learn. Res. Article (CrossRef Link) 7 603 - 624
Wu M. , Scholkopf B. , Bakir G. 2005 “Building sparse large margin classifiers” in Proc. of 22nd Int. Conf. Mach. Learn. Bonn, Germany Article (CrossRef Link) 996 - 1003
Yin Jun , Jin Zhong 2012 “Kernel sparse representation based classification” Neurocomputing Article (CrossRef Link) 77 (1) 120 - 128    DOI : 10.1016/j.neucom.2011.08.018
Gao Shenghua , Tsang Ivor Wai-Hung , Chia Liang-Tien 2010 “Kernel Sparse Representation for Image Classification and Face Recognition” in Proc. of the 11th European Conference on Computer Vision Article (CrossRef Link) 1 - 14
Zhang Li , Zhou Wei-Da 2012 “Kernel sparse representation-based classifier” IEEE Transactions on Signal Processing Article (CrossRef Link) 60 (4) 1684 - 1695    DOI : 10.1109/TSP.2011.2179539
Tibshirani R. 1996 “Regression shrinkage and selection via the lasso” Journal of the Royal Statistical Society B Article (CrossRef Link) 58 (1) 267 - 288
Lanckriet G.R.G. 2004 “Learning the Kernel Matrix with Semidefinite Programming” J.Machine Learning Research Article (CrossRef Link) 5 27 - 72
Phillips P. J. , Moon H. , Rivzi S. A. , Rauss P. 2000 “The FERET EvaluationMethodology for Face-Recognition Algorithms” IEEE Transactions on Pattern Analysis and Machine Intelligence Article (CrossRef Link) 22 1090 - 1104    DOI : 10.1109/34.879790