It is well known that sparse coding is effective for feature extraction in face recognition; in particular, a sparse model can be learned in a kernel space to obtain better performance. Some recent algorithms employ a single kernel in the sparse model, but this does not make full use of the kernel information. The key issues are how to select suitable kernel weights and how to combine the selected kernels. In this paper, we propose a novel multiple kernel sparse representation based classification for face recognition (MKSRC), which performs sparse coding and dictionary learning in the multiple kernel space. Initially, several candidate kernels are combined and the sparse coefficients are computed; the kernel weights are then obtained from the sparse coefficients, and iteration to convergence makes the kernel weights optimal. The experimental results show that our algorithm outperforms other state-of-the-art algorithms and demonstrate its promising performance.
1. Introduction
Over the last decade, image recognition has developed rapidly, and many algorithms have been proposed, such as Eigenfaces [1], locality preserving projection (LPP) [2], Fisherfaces [3], and maximum margin criterion (MMC) [4]. In a recent work, Yu and Tao [5] proposed an adaptive hypergraph learning method for transductive image classification. Afterward, a semi-supervised classification algorithm named semi-supervised multiview distance metric learning [6] was proposed. To efficiently combine visual features for subsequent retrieval and synthesis tasks, another semi-supervised multiview subspace learning algorithm [7] was proposed. In addition, Wang et al. [8] proposed a neighborhood similarity measure that explores the local sample and label distributions. To integrate multiple complementary graphs into a regularization framework, the optimized multigraph-based semi-supervised learning algorithm [9] was subsequently proposed, and other related methods [10-14] followed. Within image recognition, face recognition is a particularly challenging problem, and the classifier plays a decisive final role. The nearest-neighbor (NN) algorithm is extremely simple, accurate, and applicable to various problems [15]. The simplest 1-NN rule assigns an input sample to the category of its nearest neighbor in the labeled training set. Because NN uses only one training sample to represent the test face image, the nearest feature line classifier [16] was proposed, which uses two training samples per class to represent the test image; the nearest feature plane classifier [17] then extended this to three samples. Later, to represent the test image by all the training samples of each class, the local subspace classifier [18] and the nearest subspace classifier [19-21] were proposed. The support vector machine (SVM) is another classifier, solidly grounded in the structural risk minimization principle of statistical learning theory: it maps the inputs into a high-dimensional feature space and finds a large-margin hyperplane between two classes, which can be solved by quadratic programming. For noisy images, the sparse representation based classifier (SRC) [22] performs well and shows exciting results in dealing with occlusion by assuming a sparse coding residual. Many extensions were later proposed, e.g., Gabor SRC [23], SRC for face misalignment or pose variation [24-25], SRC for continuous occlusion [26], and heteroscedastic SRC [27].
Recently the kernel approach [28] has attracted great attention. It offers an alternative way to increase the computational power of linear learning machines by mapping the data into a high-dimensional feature space. The approach has been studied and extended to kernel-based algorithms such as kernel principal component analysis (KPCA) [29] and kernel Fisher discriminant analysis (KFD) [30,31]. As an extension of the conventional nearest-neighbor algorithm, kernel optimization algorithms [32-38] were proposed, realized by substituting a kernel distance metric for the original metric in Hilbert space. By choosing an appropriate kernel function, the kernel nearest-neighbor algorithm outperforms the conventional nearest-neighbor algorithm. Similarly, the single-kernel SVM classifier was proposed, and various remedies were introduced, such as the reduced set method [39],[40], the bottom-up method [41], the building of a sparse large margin classifier [42],[43], and the incremental building of a reduced-complexity classifier.
However, the above methods have disadvantages. NN predicts the category of the test image using only its nearest neighbor in the training data, so it is easily affected by noise. NS assigns the test image to the category that minimizes the reconstruction error, so its performance suffers when the classes are highly correlated with each other. The shortcoming of the SVM is that it is often not as compact as other classifiers such as neural networks. Fortunately, Wright et al. [22] proposed the sparse representation based classifier (SRC) for face recognition, which first codes a testing sample as a sparse linear combination of all the training samples and then classifies it by evaluating which class leads to the minimum representation error. SRC is much more effective than state-of-the-art methods in dealing with face occlusion, corruption, lighting and expression changes, etc. It is well known that if an appropriate kernel function is utilized, more neighbors of a test sample are likely to share its class label in the high-dimensional feature space, and sparse representation in that space can improve recognition performance and discriminative ability. Methods such as the kernel sparse representation based classification algorithm (KSRC) [44-46] were proposed along these lines. However, it is often unclear which kernel is most suitable for the task at hand, and hence the user may wish to combine several candidate kernels. One problem with simply adding kernels is that uniform weights are possibly not optimal. To overcome this, we propose a novel algorithm named the multiple kernel sparse representation based classifier (MKSRC), which optimizes the kernel weights while training the dictionary. The contributions of this paper can be summarized as follows.

1) We propose a multiple kernel sparse representation based classifier. By making full use of the kernel information, classification performance is improved compared with state-of-the-art classifiers.

2) Through the dictionary learning, the kernel weights can be adaptively selected. Because the weights are adjusted automatically, our classifier is more robust, especially for occluded images.

3) We conduct experiments on two facial image databases under both no-occlusion and block-occlusion conditions. The experimental results validate the effectiveness of the new classifier.
2. Related Work
 2.1 Sparse representation based classification
Sparse representation based classification (SRC) was reported by Wright [22] for robust face recognition. In Wright's pioneering work, the training face images are used as the dictionary of representative samples, and an input test image is coded as a sparse linear combination of these sample images via l_{1}-norm minimization.
Given a signal (or an image) y ∈ ℜ^{m} and a matrix A = [a_{1}, a_{2}, ⋯, a_{n}] ∈ ℜ^{m×n} containing the elements of an overcomplete dictionary in its columns, the goal of sparse representation is to represent y using as few entries of A as possible. This can be formally expressed as follows:

min_{x} ║x║_{0}  s.t.  y = Ax    (1)
where x ∈ ℜ^{n} is the coefficient vector and ║x║_{0} is the l_{0} norm, which equals the number of nonzero components of x. However, this criterion is not convex, and finding the sparsest solution of Eq. (1) is NP-hard. Fortunately, this difficulty can be overcome by convexifying the problem and solving

min_{x} ║x║_{1}  s.t.  y = Ax    (2)
where the l_{1} norm is used instead of the l_{0} norm. It can be shown that if the sought solution x is sparse enough, the solution of the l_{0}-minimization problem equals that of the l_{1}-minimization problem.
Finally, for each class i, let δ_{i} : ℜ^{n} → ℜ^{n} be the characteristic function that selects the coefficients associated with the i-th class. Using only the coefficients associated with the i-th class, one can approximately reconstruct the test sample y as ŷ_{i} = Aδ_{i}(x̂), and then classify y based on these approximations by assigning it to the class that minimizes the residual:

r_{i}(y) = ║y − Aδ_{i}(x̂)║_{2}    (3)

If r_{l}(y) = min_{i} r_{i}(y), y is assigned to class l.
Now suppose that the face image is partially occluded or corrupted; the problem can be expressed as follows:

y = Ax + ε    (4)

where ε is the residual. We can approximately reconstruct the test sample y as ŷ_{i} = Aδ_{i}(x̂) + ε̂, then compute the residuals:

r_{i}(y) = ║y − Aδ_{i}(x̂) − ε̂║_{2}    (5)

If r_{l}(y) = min_{i} r_{i}(y), y is assigned to class l.
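As a concrete illustration of the SRC pipeline above, the following sketch solves the penalized form of the l_{1} problem with a simple iterative soft-thresholding (ISTA) loop and then applies the class-wise residual rule of Eq. (3). The toy dictionary, the λ value, and the ISTA solver are illustrative assumptions of this sketch; the experiments in this paper use basis pursuit instead.

```python
import numpy as np

def ista(A, y, lam=0.01, n_iter=500):
    """Minimize (1/2)||Ax - y||_2^2 + lam*||x||_1 by iterative soft thresholding."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - (A.T @ (A @ x - y)) / L        # gradient step on the smooth part
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y, lam=0.01):
    """Assign y to the class with the smallest class-wise residual (Eq. (3))."""
    x = ista(A, y, lam)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        delta = np.where(mask, x, 0.0)         # delta_i(x): keep class-i coefficients
        residuals[c] = np.linalg.norm(y - A @ delta)
    return min(residuals, key=residuals.get)

# toy dictionary: columns are normalized training samples of two classes
rng = np.random.default_rng(0)
A = np.column_stack([rng.normal(c, 0.1, size=8) for c in (0.0, 0.0, 1.0, 1.0)])
A /= np.linalg.norm(A, axis=0)
labels = [0, 0, 1, 1]
y = A[:, 2] + 0.01 * rng.normal(size=8)        # noisy copy of a class-1 sample
print(src_classify(A, labels, y))              # → 1
```

Because the test sample is a slightly perturbed class-1 training column, its sparse code concentrates on the class-1 atoms and the class-1 residual is the smallest.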
 2.2 Kernel sparse representation based classification (KSRC) [44]
It is well known that the kernel approach can change the distribution of samples by mapping them into a high-dimensional feature space through a nonlinear mapping. In that high-dimensional feature space, a sample can be represented more accurately over the sparse representation dictionary.
Suppose there are p classes in all, and the set of training samples is A = [A_{1}, A_{2}, …, A_{p}] = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] ∈ ℜ^{d×N}, where N = Σ_{i} n_{i} is the total number of training samples and y ∈ ℜ^{d×1} is the test sample. The samples are mapped from the original feature space into a high-dimensional feature space, y → φ(y) and A = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] → U = [φ(x_{1,1}), φ(x_{1,2}), …, φ(x_{p,n_p})], by a nonlinear mapping φ : ℜ^{d} → ℜ^{K} (d < K). The sparse representation model can be formulated as

min_{x} ║x║_{0}  s.t.  φ(y) = Ux    (6)

where φ(y) is the test sample in the high-dimensional feature space. Because Eq. (6) is NP-hard, its solution can be obtained through the following Eq. (7):

min_{x} ║x║_{1}  s.t.  φ(y) = Ux    (7)

In the presence of noise, Eq. (7) should be relaxed, and the following optimization problem is obtained:

min_{x} ║x║_{1}  s.t.  ║φ(y) − Ux║_{2} ≤ ε    (8)

Though U and φ(y) are unknown, according to [44] we can prove that Eq. (8) is equivalent to the following Eq. (9):

min_{x} ║x║_{1}  s.t.  k(y, y) − 2k(y)^{T}x + x^{T}Kx ≤ ε²    (9)

where K = U^{T}U is the kernel Gram matrix with entries K_{ij} = k(x_{i}, x_{j}) and k(y) = U^{T}φ(y) = [k(x_{1,1}, y), …, k(x_{p,n_p}, y)]^{T}, so that every quantity can be computed from kernel evaluations alone.
The procedures of the KSRC algorithm are summarized in Algorithm 1:
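Algorithm 1 itself is not reproduced here, but the kernel trick behind Eq. (9) can be sketched as follows: every term of the feature-space objective reduces to kernel evaluations, so sparse coding runs without ever forming φ(y) or U. The Gaussian kernel, the λ value, the toy data, and the ISTA-style solver are illustrative assumptions of this sketch.

```python
import numpy as np

def gaussian_kernel(X, Z, t=2.0):
    # Gram matrix of the Gaussian kernel between the columns of X and Z
    d2 = np.sum(X**2, 0)[:, None] + np.sum(Z**2, 0)[None, :] - 2.0 * X.T @ Z
    return np.exp(-d2 / t)

def kernel_sparse_code(K, ky, lam=0.01, n_iter=500):
    """ISTA on (1/2) x^T K x - ky^T x + lam*||x||_1, which equals
    (1/2)||phi(y) - Ux||^2 + lam*||x||_1 up to a constant (cf. Eq. (9))."""
    L = np.linalg.norm(K, 2)                   # Lipschitz constant of the smooth part
    x = np.zeros(K.shape[0])
    for _ in range(n_iter):
        g = x - (K @ x - ky) / L
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    return x

def feature_space_residual(K, ky, kyy, x):
    # ||phi(y) - Ux||^2 written purely with kernel values
    return kyy - 2.0 * ky @ x + x @ K @ x

# toy data: columns are training samples of two classes (hypothetical)
X = np.array([[0.0, 0.1, 1.0, 0.9],
              [0.0, 0.1, 1.0, 1.1]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.95, 1.0])

K = gaussian_kernel(X, X)
ky = gaussian_kernel(X, y[:, None])[:, 0]
kyy = 1.0                                      # k(y, y) = exp(0) for this kernel
x = kernel_sparse_code(K, ky)
res = [feature_space_residual(K, ky, kyy, np.where(labels == c, x, 0.0))
       for c in (0, 1)]
print(int(np.argmin(res)))                     # → 1
```

The test sample lies near the class-1 training columns, so the class-1 residual in feature space is the smaller one.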
3. Multiple Kernel Sparse Representation based Classifier (MKSRC)
Suppose there are p classes in all, and the set of training samples is A = [A_{1}, A_{2}, …, A_{p}] = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] ∈ ℜ^{d×N}, and y ∈ ℜ^{d×1} is the test sample. The traditional sparse coding model is equivalent to the so-called LASSO problem [47]:

min_{x} ║y − Ax║_{2}^{2}  s.t.  ║x║_{1} ≤ ε    (10)

Suppose there is a feature mapping function φ : ℜ^{d} → ℜ^{K} (d < K). It maps the features and basis to the high-dimensional feature space: y → φ(y), A = [x_{1,1}, x_{1,2}, …, x_{p,n_p}] → U = [φ(x_{1,1}), φ(x_{1,2}), ⋯, φ(x_{p,n_p})], giving the kernel-space sparse coding problem

min_{x} ║φ(y) − Ux║_{2}^{2} + λ║x║_{1}    (11)

One problem remains: no single kernel is necessarily the most suitable one, so we wish to combine several candidate kernels. Multiple kernel sparse representation based classification (MKSRC) is a way of optimizing the kernel weights while training the dictionary. The multiple kernel model of Lanckriet [48] is k(x_{i}, x_{j}) = Σ_{k} α_{k}k_{k}(x_{i}, x_{j}), and we constrain the kernel weights by Σ_{k} α_{k}^{2} = 1, α_{k} ≥ 0. Substituting the mapped features and basis into the sparse coding formulation, we obtain the objective function:

min_{x,α} ║φ(y) − Ux║_{2}^{2} + λ║x║_{1}  s.t.  Σ_{k} α_{k}^{2} = 1,  α_{k} ≥ 0    (12)

where the inner products in the feature space are given by the combined kernel k = Σ_{k} α_{k}k_{k}.
The Lagrangian function for Eq. (12) is

J = ║φ(y) − Ux║_{2}^{2} + λ║x║_{1} + η(Σ_{k} α_{k}^{2} − 1)    (13)

For samples x and y, we have

⟨φ(x), φ(y)⟩ = k(x, y) = Σ_{k} α_{k}k_{k}(x, y)    (14)

Therefore

║φ(y) − Ux║_{2}^{2} = Σ_{k} α_{k}h_{k},  with h_{k} = k_{k}(y, y) − 2k_{k}(y)^{T}x + x^{T}K_{k}x    (15)

where K_{k} is the Gram matrix of the k-th kernel on the training set and k_{k}(y) = [k_{k}(x_{1,1}, y), …, k_{k}(x_{p,n_p}, y)]^{T}. Setting the derivative of J w.r.t. the primal variable α_{k} to zero,

h_{k} + 2ηα_{k} = 0    (16)

Finally, using the constraint Σ_{k} α_{k}^{2} = 1, we obtain:

α_{k} = h_{k} / (Σ_{j} h_{j}^{2})^{1/2}    (17)
Because φ(y) and U are unknown, Eq. (12) cannot be solved directly. But according to [34], using Eq. (15), Eq. (12) can be transformed to

min_{x,α} Σ_{k} α_{k}(k_{k}(y, y) − 2k_{k}(y)^{T}x + x^{T}K_{k}x) + λ║x║_{1}  s.t.  Σ_{k} α_{k}^{2} = 1,  α_{k} ≥ 0    (18)

which involves only kernel evaluations.
Since the initial weights are only an estimate and are not optimal, the implementation of MKSRC is an iterative process. When the difference of the weights α is small enough, the iteration stops; this can be formulated as ║α^{t+1} − α^{t}║ ≤ tol. To verify the convergence of the MKSRC algorithm, experiments on the ORL database were performed. The proposed MKSRC algorithm evidently converges, because the recognition rate is stable after several iterations, as illustrated in Fig. 1.
Fig. 1. Illustration of the convergence of Algorithm 2
The MKSRC algorithmic procedures can be summarized as Algorithm 2:
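The alternating procedure of Algorithm 2, as far as it can be reconstructed from the description above, can be sketched as follows. The ISTA-based sparse coding step, the weight update α_k ∝ h_k followed by normalization to Σα_k² = 1, the toy data, and the stopping tolerance are assumptions of this sketch, not the paper's exact implementation.

```python
import numpy as np

def soft(g, t):
    # soft-thresholding operator used by ISTA
    return np.sign(g) * np.maximum(np.abs(g) - t, 0.0)

def mksrc_sparse_code(Ks, kys, kyys, lam=0.01, tol=1e-4, max_iter=50):
    """Alternate (i) sparse coding under the combined kernel sum_k alpha_k*K_k
    and (ii) a kernel-weight update, stopping when ||alpha^{t+1}-alpha^t|| <= tol."""
    m = len(Ks)
    alpha = np.ones(m) / np.sqrt(m)            # uniform start, sum alpha_k^2 = 1
    x = np.zeros(Ks[0].shape[0])
    for _ in range(max_iter):
        K = sum(a * Kk for a, Kk in zip(alpha, Ks))
        ky = sum(a * kk for a, kk in zip(alpha, kys))
        # ISTA sparse coding step in the combined kernel space
        L = np.linalg.norm(K, 2)
        x = np.zeros(K.shape[0])
        for _ in range(300):
            x = soft(x - (K @ x - ky) / L, lam / L)
        # h_k = per-kernel reconstruction error ||phi_k(y) - U_k x||^2 >= 0
        h = np.array([kyy - 2.0 * kk @ x + x @ Kk @ x
                      for Kk, kk, kyy in zip(Ks, kys, kyys)])
        new_alpha = h / np.linalg.norm(h)      # assumed update: alpha_k ∝ h_k
        if np.linalg.norm(new_alpha - alpha) <= tol:
            alpha = new_alpha
            break
        alpha = new_alpha
    return alpha, x

# toy usage: linear and Gaussian kernels on four 2-D training samples (hypothetical)
X = np.array([[0.0, 0.1, 1.0, 0.9],
              [0.0, 0.1, 1.0, 1.1]])
y = np.array([0.95, 1.0])
K_lin, ky_lin, kyy_lin = X.T @ X, X.T @ y, float(y @ y)
d2 = np.sum(X**2, 0)[:, None] + np.sum(X**2, 0)[None, :] - 2.0 * X.T @ X
K_rbf = np.exp(-d2 / 2.0)
ky_rbf = np.exp(-np.sum((X - y[:, None])**2, 0) / 2.0)
alpha, x = mksrc_sparse_code([K_lin, K_rbf], [ky_lin, ky_rbf], [kyy_lin, 1.0])
print(alpha)
```

Given the converged weights and sparse code, classification then proceeds via the class-wise residuals exactly as in Section 2.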
4. Experiments and discussions
In this section, we perform experiments on face databases to demonstrate the efficiency of MKSRC. To evaluate the performance of MKSRC more comprehensively, Section 4.1 discusses the comparison methods and experimental configurations, Section 4.2 tests face recognition without occlusion, and Section 4.3 tests face recognition with block occlusion. Through experiments we chose three kernel functions: the linear kernel, the polynomial kernel, and the Gaussian kernel, whose kernel parameters were tuned using cross validation. For statistical stability, we generate ten different training/test dataset pairs by random permutation. We compare the performance of the proposed MKSRC with state-of-the-art classifiers: SVM [41], SRC [22], KSRC (Polynomial) [44], and KSRC (Gaussian) [44].
 4.1 Comparison methods and configurations
To verify the performance of the MKSRC method, we selected the following methods for comparison.

1) SVM. Here, we use the one-versus-all strategy and select the RBF kernel. The radius parameter is tuned to its optimal value through cross validation.

2) SRC. SRC is an effective classifier that codes a testing sample as a sparse linear combination of all the training samples. To obtain better performance, we select the basis pursuit algorithm. The SRC parameter value of 0.001 is selected by cross validation.

3) KSRC (Polynomial). KSRC uses a polynomial kernel function to improve classifier performance. In the experiments without occlusion, the kernel parameter is set to 2 on the FERET face database and to 13 on the ORL face database, while under block occlusion both values are set to 2.

4) KSRC (Gaussian). KSRC uses a Gaussian kernel function to improve classifier performance. In the experiments without occlusion, the kernel parameter is set to 2 on both the FERET and ORL face databases, while under block occlusion it is set to 2 on the FERET face database and to 3 on the ORL face database.
 4.2 Face recognition without occlusion
 1) The FERET face dataset
The FERET database [49] was used in our experiments, including the images marked with two-character strings, i.e., "ba," "bj," "be," "bk," "bf," "bd," and "bg." Thus, the entire data set includes 1400 images of 200 different subjects, with 7 images per subject. All these images were aligned according to the center points of the eyes and are of size 80 by 80. Some sample images are shown in Fig. 2. The 800 images of the 200 subjects were randomly selected for training, while the remaining 600 images were used for testing.
Table 1 and Fig. 3 show the recognition rates of the different algorithms. We can see that the MKSRC algorithm retains higher performance than SRC and KSRC. At dimension 300, the recognition rate of MKSRC is 69.32%, which is 4.5% higher than that of SVM.
Sample images of one person on FERET face database
Important notations used in this paper and their description
The average recognition rates of SVM, SRC, KSRC (Polynomial), KSRC (Gaussian) and MKSRC versus the dimensions on FERET face database
For the selection of the kernel parameters, we search the candidate interval from 1 to 10. For simplicity of computation, we find the optimal kernel parameters within this interval through single-kernel experiments. Fig. 4(a) shows the recognition rates of the polynomial kernel versus the parameter d, and Fig. 4(b) shows the recognition rates of the Gaussian kernel versus the parameter t. From Fig. 4, the optimal values of both parameters t and d are 2.
(a) Recognition rates versus the parameter d of polynomial kernel (b) Recognition rates versus the parameter t of Gaussian kernel
Accuracy on FERET face database
 2) The ORL face dataset
The ORL face database consists of 400 frontal face images of 40 subjects. They were captured under various lighting conditions and cropped and normalized to 112 × 92 pixels. We randomly split the database into two halves: one half (5 images per person) was used for training, and the other half for testing. The images are reduced to 30, 60 and 110 dimensions, respectively.
Table 2 and Fig. 5 illustrate the face recognition rates of the different methods. We can see that the recognition rates increase with larger dimensions. Our MKSRC algorithm achieves recognition rates between 89% and 97.8%, much better than the other algorithms; at dimension 60 in particular, MKSRC obtains the best performance.
Accuracy on ORL face database
The average recognition rates of SVM, SRC, KSRC (Polynomial), KSRC (Gaussian) and MKSRC versus the dimensions on ORL face database
From the experiments without occlusion, we can see that the proposed MKSRC method outperforms not only SRC but also KSRC. The experimental results demonstrate that kernel information helps to improve the recognition rate, which can be attributed to two reasons: 1) face image features in the kernel feature space contain more effective discriminant information than features in the original feature space, so the samples can be separated more easily; 2) an appropriate kernel combination makes the test sample in the high-dimensional feature space reflect its class label information more accurately. In addition, different kernel functions produce different experimental results, so the selection of the kernel functions and their parameters is important.
 4.3 Face recognition with block occlusion
 1) FERET database
The next experiment concerns occlusion on the FERET database. We randomly take four face images of each person for training and the remaining three for testing. We simulate various levels of contiguous occlusion, from 10% to 30%, by replacing a randomly located square block of each test image with an unrelated image. Again, the locations of the occlusions are randomly chosen for each image and are unknown to the computer. For computational convenience, the dimension is reduced to 50.
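The block-occlusion protocol described above can be sketched as follows; the use of uniform noise as the "unrelated image" and the 32 × 32 image size are illustrative assumptions of this sketch.

```python
import numpy as np

def occlude(img, frac, rng):
    """Replace a randomly located square block covering about `frac` of the
    image area with unrelated content (here: uniform noise as a stand-in
    for an unrelated image)."""
    h, w = img.shape
    side = int(round(np.sqrt(frac * h * w)))   # square block with ~frac of the area
    r = rng.integers(0, h - side + 1)          # random, unknown block location
    c = rng.integers(0, w - side + 1)
    out = img.copy()
    out[r:r + side, c:c + side] = rng.uniform(0, 255, size=(side, side))
    return out

rng = np.random.default_rng(1)
face = rng.uniform(0, 255, size=(32, 32))      # stand-in for a cropped face image
occluded = occlude(face, 0.30, rng)
changed = float(np.mean(occluded != face))
print(f"fraction of pixels replaced: {changed:.2f}")
```

Each test image is occluded independently in this way before dimensionality reduction and classification.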
From Table 3 we can see that the accuracy of all the methods declines as the occlusion level increases, which indicates that the loss of features affects face recognition performance. But MKSRC achieves better performance than the other algorithms: when the occlusion is 30%, SRC reaches only 25.5%, while MKSRC reaches 29.2%, a 3.7% improvement over SRC.
Accuracy on FERET face database under occlusion
 2) ORL database with regular shapes occlusion
Next, we test the efficiency of MKSRC against block occlusion using the ORL face dataset. We randomly take the first half for training and the rest for testing. We simulate various levels of contiguous occlusion, from 10% to 30%, by replacing a randomly located square block of each test image with an unrelated image. Again, the location of the occlusion is randomly chosen for each image and is unknown to the computer. A test example from ORL with a 30% occluded block is shown in Fig. 6. Here, for computational convenience, the images are cropped to 32 × 32, and their dimensions are reduced to 60. The results of these experiments are even more encouraging. From Table 4 we can see that the accuracy of all the methods declines as the occlusion level increases, which indicates that the loss of features affects face recognition performance. But MKSRC retains a good performance of 74.8% when the occlusion percentage is 30%. These experiments verify that the combination of multiple kernels can improve the performance of face recognition.
A test example of ORL face database with 30% occluded block
Accuracy on ORL face database under occlusion
 3) ORL database with irregular shapes occlusion
The next experiment is more challenging: we chose an irregularly shaped occlusion such as a conch. The location of the occlusion is randomly chosen for each image and is unknown to the computer. A test example from ORL with an irregularly shaped occluding block is shown in Fig. 7. Here, for computational convenience, the images are cropped to 32 × 32 and their dimensions are reduced to 60. From Table 5 we can see that the MKSRC method retains good performance, with an accuracy 5.7% higher than that of SRC. This demonstrates that the MKSRC method is stable and suitable under different occlusion conditions.
A test example of ORL face database with irregular shape occluded block
Accuracy on ORL face database under irregular shape occlusion
The face experiments with block occlusion demonstrate that the MKSRC method is more robust than the other methods. We conducted exhaustive experiments not only on two face image databases, but also under both regular-shape and irregular-shape occlusion conditions. Because the kernel weights can be adaptively selected, the MKSRC method obtains a more suitable kernel combination and as a result achieves better performance than the other methods. As the occlusion rate increases, the performance of the proposed method does not decline significantly, which means that the multiple kernel classifier is not sensitive to occlusion.
5. Conclusion
This paper proposed a multiple kernel sparse representation based classification. On high-dimensional data such as face images, the KSRC algorithm obtains better performance than SRC, but it does not make full use of the kernel information. The MKSRC algorithm solves this problem by combining several candidate kernels, e.g., the Gaussian kernel, while selecting suitable weights for the kernel functions. On various face databases the MKSRC algorithm achieves the best performance. Because the kernel parameters are important for recognition performance, we will focus on estimating the kernel parameters in future work.
BIO
Hao Zheng received his BS degree from Southeast University in 1998, the MS degree from Nanjing University of Posts and Telecommunications in 2005, and the PhD degree in pattern recognition and intelligence systems from Nanjing University of Science and Technology in 2013. He visited the Centre for Quantum Computation & Intelligent Systems, University of Technology Sydney, Australia, from September 2013 to March 2014. He is currently an associate professor in the School of Mathematics and Information Technology at Nanjing Xiaozhuang University. His research interests include pattern recognition, image processing, face recognition and computer vision.
Qiaolin Ye received the BS degree in Computer Science from Nanjing Institute of Technology, Nanjing, China, in 2007, the MS degree in Computer Science and Technology from Nanjing Forestry University, Jiangsu, China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology, Jiangsu, China, in 2013. He is currently an associate professor with the computer science department at the Nanjing Forestry University, Nanjing, China. He has authored more than 30 scientific papers in pattern recognition, machine learning and data mining. His research interests include machine learning, data mining, and pattern recognition.
Zhong Jin received his BS in mathematics, MS in applied mathematics, and PhD in pattern recognition and intelligence systems from Nanjing University of Science and Technology (NUST), China, in 1982, 1984, and 1999, respectively. He is a professor in the Department of Computer Science, NUST, and was previously a research assistant in the Department of Computer Science and Engineering, Chinese University of Hong Kong, from 2000 to 2001. He visited the Laboratoire HEUDIASYC, Universite de Technologie de Compiegne, France, from October 2001 to July 2002, and the Centre de Visio per Computador, Universitat Autonoma de Barcelona, Spain, as a Ramon y Cajal Research Fellow from September 2005 to October 2005. His current interests are in the areas of pattern recognition, computer vision, face recognition, facial expression analysis, and content-based image retrieval.
References
[1] M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991. DOI: 10.1162/jocn.1991.3.1.71
[2] X. He, S. Yan, Y. Hu, P. Niyogi and H. J. Zhang, "Face recognition using Laplacianfaces," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 3, pp. 328-340, 2005. DOI: 10.1109/TPAMI.2005.55
[3] P. N. Belhumeur, J. P. Hespanha and D. J. Kriegman, "Eigenfaces versus Fisherfaces: recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711-720, 1997. DOI: 10.1109/34.598228
[4] H. Li, T. Jiang and K. Zhang, "Efficient and robust feature extraction by maximum margin criterion," in Proc. of Advances in Neural Information Processing Systems, vol. 16, MIT Press, Vancouver, Canada, 2004.
[5] J. Yu and D. Tao, "Adaptive hypergraph learning and its application in image classification," IEEE Trans. on Image Processing, vol. 21, no. 7, pp. 3262-3271, 2012. DOI: 10.1109/TIP.2012.2190083
[6] J. Yu, M. Wang and D. Tao, "Semi-supervised multiview distance metric learning for cartoon synthesis," IEEE Trans. on Image Processing, vol. 21, no. 11, pp. 4636-4648, 2012. DOI: 10.1109/TIP.2012.2207395
[7] J. Yu, D. Liu and D. Tao, "On combining multiview features for cartoon character retrieval and clip synthesis," IEEE Trans. on Systems, Man, and Cybernetics, vol. 42, no. 5, pp. 1413-1427, 2012. DOI: 10.1109/TSMCB.2012.2192108
[8] M. Wang, X. Hua, R. Hong, J. Tang, G. Qi and Y. Song, "Unified video annotation via multigraph learning," IEEE Trans. on Circuits and Systems for Video Technology, vol. 19, no. 5, pp. 733-746, 2009. DOI: 10.1109/TCSVT.2009.2017400
[9] M. Wang, X. Hua, J. Tang and R. Hong, "Beyond distance measurement: constructing neighborhood similarity for video annotation," IEEE Trans. on Multimedia, vol. 11, no. 3, pp. 465-476, 2009. DOI: 10.1109/TMM.2009.2012919
[10] M. Wang, B. Ni, X. Hua and T. Chua, "Assistive tagging: a survey of multimedia tagging with human-computer joint exploration," ACM Computing Surveys, vol. 44, no. 4, Article 25, 2012.
[11] M. Wang and X. Hua, "Active learning in multimedia annotation and retrieval: a survey," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 2, pp. 10-31, 2011. DOI: 10.1145/1899412.1899414
[12] Y. Gao, M. Wang, Z. Zha and J. Shen, "Visual-textual joint relevance learning for tag-based social image search," IEEE Trans. on Image Processing, vol. 22, no. 1, pp. 363-376, 2012. DOI: 10.1109/TIP.2012.2202676
[13] Y. Gao, M. Wang, D. Tao and R. Ji, "3D object retrieval and recognition with hypergraph analysis," IEEE Trans. on Image Processing, vol. 21, no. 9, pp. 4290-4303, 2012. DOI: 10.1109/TIP.2012.2199502
[14] Y. Gao, M. Wang, Z. Zha and Q. Tian, "Less is more: efficient 3D object retrieval with query view selection," IEEE Trans. on Multimedia, vol. 13, no. 5, pp. 1007-1018, 2011. DOI: 10.1109/TMM.2011.2160619
[15] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[16] S. Z. Li and J. Lu, "Face recognition using nearest feature line method," IEEE Trans. Neural Networks, vol. 10, no. 2, pp. 439-443, 1999. DOI: 10.1109/72.750575
[17] J. T. Chien and C. C. Wu, "Discriminant waveletfaces and nearest feature classifiers for face recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 12, pp. 1644-1649, 2002. DOI: 10.1109/TPAMI.2002.1114855
[18] J. Laaksonen, "Local subspace classifier," in Proc. of Int'l Conf. on Artificial Neural Networks, 1997.
[19] K. Lee, J. Ho and D. Kriegman, "Acquiring linear subspaces for face recognition under variable lighting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5, pp. 684-698, 2005. DOI: 10.1109/TPAMI.2005.92
[20] S. Z. Li, "Face recognition based on nearest linear combinations," in Proc. of IEEE Int'l Conf. on Computer Vision and Pattern Recognition, 1998.
[21] I. Naseem, R. Togneri and M. Bennamoun, "Linear regression for face recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 11, pp. 2106-2112, 2010. DOI: 10.1109/TPAMI.2010.128
[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210-227, 2009. DOI: 10.1109/TPAMI.2008.79
[23] M. Yang and L. Zhang, "Gabor feature based sparse representation for face recognition with Gabor occlusion dictionary," in Proc. of European Conf. on Computer Vision, 2010.
[24] J. Z. Huang, X. L. Huang and D. Metaxas, "Simultaneous image transformation and sparse representation recovery," in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, 2008.
[25] A. Wagner, J. Wright, A. Ganesh, Z. H. Zhou and Y. Ma, "Towards a practical face recognition system: robust registration and illumination by sparse representation," in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, 2009.
[26] Z. Zhou, A. Wagner, H. Mobahi, J. Wright and Y. Ma, "Face recognition with contiguous occlusion using Markov random fields," in Proc. of IEEE Int'l Conf. on Computer Vision, 2009.
[27] H. Zheng, J. Xie and Z. Jin, "Heteroscedastic sparse representation classification for face recognition," Neural Processing Letters, vol. 35, no. 3, pp. 233-244, 2012. DOI: 10.1007/s11063-012-9214-4
[28] M. A. Aizerman, E. M. Braverman and L. I. Rozonoer, "Theoretical foundations of the potential function method in pattern recognition learning," Automation and Remote Control, vol. 25, pp. 821-837, 1964.
[29] B. Scholkopf, A. Smola and K. Muller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Computation, vol. 10, pp. 1299-1319, 1998. DOI: 10.1162/089976698300017467
[30] S. Mika, G. Ratsch, J. Weston, B. Scholkopf and K. Muller, "Fisher discriminant analysis with kernels," in Proc. of the 1999 IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, vol. IX, pp. 41-48, 1999.
[31] S. Mika, G. Ratsch, B. Scholkopf, A. Smola, J. Weston and K. R. Muller, "Invariant feature extraction and classification in kernel spaces," in Proc. of the 13th Annual Neural Information Processing Systems Conference, pp. 526-532, 1999.
[32] A. Argyriou, R. Hauser, C. A. Micchelli and M. Pontil, "A DC algorithm for kernel selection," in Proc. of 23rd Int. Conf. on Machine Learning, Pittsburgh, PA, pp. 41-49, 2006.
[33] A. Argyriou, C. A. Micchelli and M. Pontil, "Learning convex combinations of continuously parameterized basic kernels," in Proc. of 18th Annual Conf. on Learning Theory, Bertinoro, Italy, pp. 338-352, 2005.
[34] C. S. Ong, A. J. Smola and R. C. Williamson, "Learning the kernel with hyperkernels," J. Mach. Learn. Res., vol. 6, pp. 1043-1071, 2005.
[35] A. Rakotomamonjy, F. Bach, S. Canu and Y. Grandvalet, "More efficiency in multiple kernel learning," in Proc. of 24th Int. Conf. on Machine Learning, Corvallis, OR, pp. 775-782, 2007.
[36] A. Rakotomamonjy, F. R. Bach, S. Canu and Y. Grandvalet, "SimpleMKL," J. Mach. Learn. Res., vol. 9, pp. 2491-2521, 2008.
[37] G. Ratsch, C. Schafer and B. Scholkopf, "Large scale multiple kernel learning," J. Mach. Learn. Res., vol. 7, pp. 1531-1565, 2006.
[38] A. Zien and C. S. Ong, "Multiclass multiple kernel learning," in Proc. of 24th Int. Conf. on Machine Learning, Corvallis, OR, pp. 1191-1198, 2007.
[39] C. J. C. Burges, "Simplified support vector decision rules," in Proc. of 13th Int. Conf. on Machine Learning, San Mateo, CA, pp. 71-77, 1996.
[40] B. Scholkopf and A. Smola, Learning with Kernels, MIT Press, Cambridge, MA, 2002.
[41] D. Nguyen and T. Ho, "An efficient method for simplifying support vector machines," in Proc. of 22nd Int. Conf. on Machine Learning, Bonn, Germany, pp. 617-624, 2005.
[42] M. Wu, B. Scholkopf and G. Bakir, "A direct method for building sparse kernel learning algorithms," J. Mach. Learn. Res., vol. 7, pp. 603-624, 2006.
[43] M. Wu, B. Scholkopf and G. Bakir, "Building sparse large margin classifiers," in Proc. of 22nd Int. Conf. on Machine Learning, Bonn, Germany, pp. 996-1003, 2005.
[44] S. Gao, I. W. Tsang and L. T. Chia, "Kernel sparse representation for image classification and face recognition," in Proc. of 11th European Conference on Computer Vision, pp. 1-14, 2010.
[45] L. Zhang and W. D. Zhou, "Kernel sparse representation-based classifier," IEEE Transactions on Signal Processing, vol. 60, no. 4, pp. 1684-1695, 2012. DOI: 10.1109/TSP.2011.2179539
[47] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society B, vol. 58, no. 1, pp. 267-288, 1996.
[48] G. R. G. Lanckriet et al., "Learning the kernel matrix with semidefinite programming," J. Machine Learning Research, vol. 5, pp. 27-72, 2004.
[49] P. J. Phillips, H. Moon, S. A. Rizvi and P. Rauss, "The FERET evaluation methodology for face-recognition algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, pp. 1090-1104, 2000. DOI: 10.1109/34.879790