Conventional skin detection methods define skin color on the basis of prior knowledge, and the threshold value that separates the skin region from the background is determined subjectively through experimentation. A drawback of such techniques is that their performance depends on a threshold value estimated from repeated experiments. To overcome this, the present paper introduces a skin region detection method that uses a histogram approximation based on the mean shift algorithm. The proposed method applies the mean shift procedure to the histogram of a skin map of the input image, which is generated by comparing the image with standard skin colors in the CbCr color space, and separates the skin region from the background by selecting the maximum value according to the brightness level. Because the histogram, which is accumulated from the brightness values of the pixels, has the form of a discontinuous function, it is first approximated by a Gaussian mixture model (GMM) using the Bezier curve technique. The mean shift procedure then determines the maximum value, which becomes the dividing point in place of the manually selected threshold value used in existing techniques. Experiments confirm that the new procedure effectively detects the skin region.
1. INTRODUCTION
Skin detection plays an important role in various areas of human-computer interaction, such as face detection, face tracking, content-based image data search systems, and gesture analysis. Recently, skin detection methods based on skin color data have attracted considerable attention because they are computationally efficient and robust to rotation, changes in size, and partial obstruction of the relevant region. Skin color is used to complement geometric data in designing accurate face detection systems [1-4]. Skin detection via skin color is therefore a significant preprocessing tool in face detection and recognition.
Research on skin detection is generally conducted using visible spectrum images. However, skin detection in visible spectrum images is limited by ambient illumination, camera characteristics, ethnicity, and personal characteristics. Procedures using non-visible spectrum images, such as infrared images, have been considered as a means of resolving such issues, but these procedures require prohibitively expensive hardware devices or extremely limited environments [5-9].
As a means of classification, skin detection distinguishes two categories: the skin region and the non-skin region. For skin detection based on color data, an efficient classification procedure requires (1) the selection of an appropriate color model, (2) the selection of a suitable model distribution for skin and non-skin pixels, and (3) the consideration of the actual distribution being modeled. The choice of color model determines how efficiently a given skin color distribution can be modeled. The skin color distribution is usually modeled by a histogram or a Gaussian distribution. Various techniques for obtaining the actual distribution have been researched, from the simple use of a skin color lookup table to complex pattern recognition. In existing research on skin color detection, RGB images are transformed to a color space in which intensity and color are separated, so that the effects of external illumination can be excluded. TSL, NCC, HSV, and YCbCr are the color spaces usually considered. The transformations from RGB are either linear or nonlinear: linear transformations include YIQ, YUV, and YCbCr, and nonlinear transformations include NCC, HSV, and HSL [10-13]. Recent research efforts have generally transformed higher-dimensional color spaces into lower-dimensional color spaces to save computational time [14, 15]. When skin detection is conducted using predefined skin color data, the skin similarity threshold value that divides the background region from the skin region is determined by repeated experimentation [16]. Such methods are limited in that the threshold values vary according to the experimental environment and skin color data. Moreover, these threshold values are not standardized objectively and depend partly on the subjective judgment of the user. To overcome these weaknesses, this study introduces a skin color detection technique that applies the mean shift algorithm to histogram data in a lower-dimensional color space. Unlike the existing procedures, this technique does not use experimentally determined threshold values. Instead, it uses the mean shift algorithm to find local maxima, which are used as segmentation points to detect the skin region.

Figure 1. Block diagram of the proposed skin region detection method.
Methods of skin color detection can be classified into those based on physical characteristics and those based on statistical characteristics. The latter can be subdivided into parametric approaches [17-19] and nonparametric approaches [20-22]. Generally, methods based on statistical characteristics detect the skin region in a lower-dimensional color space to reduce the effects of illumination. The skin color distributions used in parametric statistical approaches tend to follow a Gaussian mixture model (rather than a simple Gaussian model), and thus methods of skin detection based on these approaches usually employ a Gaussian mixture model [18, 23]. Selecting the model dimensions is one of the important problems that arise in such procedures. The selection is usually accomplished inductively, based on a priori environmental data, and the parameters of the skin color distribution are determined variously on the basis of ethnicity and illumination. Recent studies have made frequent use of the Expectation-Maximization (EM) algorithm to estimate the parameters of the Gaussian mixture model [24]. The EM algorithm uses a probability density function, determining the parameters of a skin color distribution by inductive estimation, based on ambient illumination and ethnicity. Nonparametric approaches use a model that expresses the general shape of the skin color distribution more easily than parametric approaches [21, 22]. Such techniques usually employ a histogram to represent the characteristics of the skin color distribution in color space. One major advantage of using a histogram is that the probability density function can be calculated quantitatively from the quantization levels, even when the skin color distribution is complex.

Figure 2. Analysis of skin color in the CbCr space.
Unlike the statistical approach, skin detection based on physical characteristics uses a physical model of inherent skin color. Such physical models are often employed in research methods to detect skin regions because they neglect background-based changes in illumination, and use permanent skin color characteristics.
2. METHOD OF SKIN REGION DETECTION
Figure 1 shows the overall block diagram of the proposed skin region detection method.
The proposed technique comprises color transformation, skin map histogram generation, histogram approximation, and skin region detection via mean shift. For the color transformation, input images are converted from the RGB color space to the YCbCr color space using a color transformation formula. The skin map histogram expresses the skin region distribution in terms of brightness values; it is calculated from the similarity between the input image and a standard skin color table established in advance from the skin color characteristics of the CbCr color space. Histogram approximation treats the histogram as a discontinuous function, which is approximated by a continuous Gaussian function using a Bezier curve. Finally, the mean shift algorithm is applied to the histogram to find the Gaussian local maxima in regions having similar brightness distributions. The brightness values of the pixels in each such region are set to the corresponding local maximum, and the regions having the maximum brightness value are detected as skin regions via region growing.
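The color transformation step can be sketched as follows. The text does not reproduce the transformation formula, so this minimal example assumes the common full-range ITU-R BT.601 (JPEG-style) matrix; the function names `rgb_to_ycbcr` and `image_to_cbcr` are illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: full-range BT.601 coefficients; the paper's exact
    formula is not given in the text.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def image_to_cbcr(image):
    """Map an image (rows of (R, G, B) tuples) to rows of (Cb, Cr) pairs.

    The Y channel is dropped, since only the chrominance plane is used
    for the skin color analysis.
    """
    return [[rgb_to_ycbcr(*pixel)[1:] for pixel in row] for row in image]
```

Dropping Y at this stage reflects the lower-dimensional CbCr representation described in the text; a neutral gray pixel maps to Cb = Cr = 128 under this assumed matrix.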
- 2.1 Skin color analysis
The skin region occupies certain parts of a color space, and this characteristic enables skin color to be separated from other background colors. The skin region distribution varies according to color space, and thus the choice of color space affects detection performance [16]. The YCbCr color space consists of Y, which indicates the brightness value, and Cb and Cr, which represent color differences. With Y excluded, the pixels of the CbCr color space contain only color data and are less affected by illumination. Thus, the skin color region in the CbCr color space remains effective in various illumination environments, as brightness has less effect on the color values than in other color spaces. The first step in skin color region detection is to define the skin color region. This definition is obtained from existing skin color region images taken from various people's faces under varying ambient illumination. The effects of illumination should be considered in detecting skin color regions. In this study, the lower-dimensional CbCr color space is used to minimize the effects of illumination [14, 15].
Figure 2 shows the distribution regions of the standard skin color table in the CbCr color space.
In the images used in this study, the Cb skin color values were distributed primarily between 102 and 118, while the Cr skin color values were between 137 and 152. The standard skin color table was constructed from 100 Korean male and female adults illuminated by fluorescent lights in ordinary buildings. According to the results of research on face detection, the skin color distribution is similar in form to a Gaussian distribution.
Thus, the skin color distribution can be expressed as a 2D Gaussian function $G(\mu_{CbCr}, \Sigma_{CbCr})$.
Here, $C_b$ and $C_r$ denote pixel color values, $\mu_{CbCr}$ denotes the Gaussian mean color values, and $\Sigma_{CbCr}$ denotes the 2D Gaussian covariance matrix.
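As a rough illustration of how the Gaussian skin color model could be built, the sketch below estimates a mean vector and a 2×2 covariance matrix from sample (Cb, Cr) pairs and evaluates the corresponding 2D Gaussian density. The function names and the use of the unbiased (n - 1) estimator are assumptions; the defining equations are not reproduced in the text.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate the mean and 2x2 covariance of (Cb, Cr) skin samples.

    Assumption: unbiased sample covariance (division by n - 1).
    """
    n = len(samples)
    mu_cb = sum(cb for cb, _ in samples) / n
    mu_cr = sum(cr for _, cr in samples) / n
    s_bb = sum((cb - mu_cb) ** 2 for cb, _ in samples) / (n - 1)
    s_rr = sum((cr - mu_cr) ** 2 for _, cr in samples) / (n - 1)
    s_br = sum((cb - mu_cb) * (cr - mu_cr) for cb, cr in samples) / (n - 1)
    return (mu_cb, mu_cr), ((s_bb, s_br), (s_br, s_rr))


def gaussian_density(c, mean, cov):
    """Evaluate the 2D Gaussian density at the color c = (Cb, Cr)."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = c[0] - mean[0], c[1] - mean[1]
    # Quadratic form (c - mu)^T Sigma^{-1} (c - mu) for a symmetric 2x2 matrix.
    q = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))
```

In practice the samples would come from the standard skin color table; the density is highest at the mean color and falls off for colors that are unlikely to be skin.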
- 2.2 Skin-map generation
The skin map used in this study is calculated from the skin region similarity between a standard skin color table and input images, and then normalized to brightness values between 0 and 255. The skin map is generated by applying the Mahalanobis distance to the Gaussian mean and covariance of predefined skin color images.
Equation (6) is the formula for calculating the Mahalanobis distance. $\Sigma_{CbCr}^{-1}$ denotes the inverse of the 2D Gaussian covariance matrix, and N is the total number of pixels in the input image. The values obtained from Equation (6) indicate the degree of similarity to the skin region, but they must be normalized to the range 0 to 255 to be expressed as image intensities. Equation (7) provides the normalization formula; it produces brightness values close to 255 when the similarity to the skin region is high.
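The skin-map computation might look like the following sketch. The Mahalanobis distance follows the description above, but the normalization is an assumption: since Equation (7) is not reproduced in the text, a simple min-max rescaling over the image is used so that the most skin-like pixel maps to 255. The names `mahalanobis` and `skin_map` are illustrative.

```python
import math


def mahalanobis(c, mean, cov):
    """Mahalanobis distance of color c = (Cb, Cr) from the skin model."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = c[0] - mean[0], c[1] - mean[1]
    return math.sqrt((d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det)


def skin_map(cbcr_image, mean, cov):
    """Build a 0..255 skin map from rows of (Cb, Cr) pairs.

    Assumption: min-max normalization over the image, inverted so that
    small distances (high skin similarity) become bright values.
    """
    dist = [[mahalanobis(c, mean, cov) for c in row] for row in cbcr_image]
    flat = [v for row in dist for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant images
    return [[round(255 * (1.0 - (v - lo) / span)) for v in row] for row in dist]
```

With this convention the resulting map can be treated as an 8-bit intensity image in which likely skin pixels are the brightest.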
- 2.3 Skin region detection using histogram approximation
The proposed technique uses skin map histogram approximation to efficiently detect skin regions in environments with varying or complex illumination. The procedure is carried out in three steps. First, the skin map histogram is regarded as a discontinuous function and is approximated by a continuous Gaussian function using a Bezier curve. In the second step, the mean shift algorithm is used to find Gaussian local maxima in regions having similar brightness distributions, and the brightness values of pixels in the relevant regions are set to the local maxima. In the third step, the uniform brightness value of each region is examined, and the region with the highest brightness value is detected via region growing.
- 2.3.1 Histogram approximation using Bezier curve
Generally, a skin map histogram has the form of a discontinuous function determined by the accumulated brightness values of the pixels. In the proposed method, this histogram is approximated by a continuous Bezier curve, using the brightness value of each level of the histogram as a Bezier control point. Equation (8) is used to obtain a histogram from a skin map.
Here, N indicates the size of the skin map.
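A minimal sketch of the histogram step is shown below. It assumes the skin map holds integer brightness values in 0..255 and that each bin count is divided by N, the number of pixels in the skin map; the exact form of Equation (8) is not reproduced in the text, so this normalization is an assumption.

```python
def skin_map_histogram(skin_map, levels=256):
    """Return the normalized histogram of an integer-valued skin map.

    Each entry is the fraction of pixels at that brightness level
    (assumed normalization by the skin-map size N).
    """
    counts = [0] * levels
    for row in skin_map:
        for value in row:
            counts[value] += 1
    n = sum(counts)
    return [count / n for count in counts]
```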
Equations (9) and (10) are the Bernstein function equations for the Bezier curve [18].
Here, $P_i$ is a control point for generating the Bezier curve, and $u$ is the parameter that controls the distance along the curve (smaller values of $u$ correspond to shorter distances on the curve). The Bezier curve is approximated by a Gaussian curve with Bezier control points given by h(level, value), which denotes the frequency of a given brightness level of the histogram. The degree of the Bezier curve is determined by the number of control points, so with 256 histogram levels the Bernstein polynomials are of degree 255. Numerical errors cause such high-degree Bernstein functions to generate unstable Bezier curves, and thus this study repeatedly applies the first-degree (linear) interpolation of the de Casteljau algorithm instead of evaluating the Bernstein functions directly [27]. Equation (11) is the Bezier curve formula using the de Casteljau algorithm.
The control point $P_S(x)$ indicates the frequency at each brightness level of the histogram, and $t$ is a distance control variable calculated via Equation (12) (smaller values of $t$ correspond to shorter curves).
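The repeated linear interpolation described above can be sketched in a few lines. The histogram frequencies serve as control points, and the curve is evaluated at evenly spaced parameter values; the mapping t = x / (levels - 1) is an assumption, since Equation (12) is not reproduced in the text.

```python
def de_casteljau(control, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    points = list(control)
    while len(points) > 1:
        # One de Casteljau pass: blend each pair of neighboring points.
        points = [(1.0 - t) * p + t * q for p, q in zip(points, points[1:])]
    return points[0]


def smooth_histogram(hist, samples=None):
    """Approximate a histogram by a Bezier curve sampled at `samples` points.

    Assumption: the parameter for sample i is t = i / (samples - 1).
    """
    n = samples or len(hist)
    return [de_casteljau(hist, i / (n - 1)) for i in range(n)]
```

Each evaluation costs a number of operations quadratic in the number of control points, which is manageable for a 256-bin histogram and avoids computing very large binomial coefficients.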
Figure 3. Skin detection process using the proposed method: (a) input image, (b) skin map of (a), (c) histogram of (b), (d) histogram of (c) smoothed by de Casteljau's algorithm, (e) skin region detection of (d) by the mean shift algorithm.
- 2.3.2 Establishment of threshold value using mean shift algorithm
In the mean shift algorithm, the mode of the probability density function is found by hill climbing. Here, the probability density function is the brightness distribution of pixels in the intensity image. The algorithm converges on a local maximum within the kernel by repeatedly calculating the mean location and mean brightness value of pixels having a similar brightness distribution in a neighborhood of the given pixel. In other words, the pixel value at the current location is transformed to the brightness value at the local maximum, and thus the brightness values in the spatial region are made uniform. The optimal segmentation threshold value is therefore obtained with the mean shift algorithm: it is the valley point on the boundary between the uniform regions, or equivalently in the Gaussian histogram approximation. Equation (13) expresses the mean shift algorithm.
Equation (14) describes the transformation of the current pixel brightness value to the local maximum brightness value via the mean shift algorithm.
Here, x denotes the current pixel brightness value and k denotes the weight variable. Thus, $P_M(X')$ transforms the current pixel brightness value to the local maximum brightness value in a given region via Equation (13). Equation (15) gives the optimal threshold value for segmenting the background region and the skin region via Equations (13) and (14).
The maximum value is used because the skin region is the brightest region in the skin map.
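The idea of replacing a hand-tuned threshold with a mean shift search can be illustrated on a 1D histogram. This is only a sketch under stated assumptions: a Gaussian kernel with a hypothetical `bandwidth` of 8 levels stands in for the weight function of Equation (13), and the threshold is taken as the lowest brightness level that climbs to the brightest mode, which is one reading of Equation (15).

```python
import math


def mean_shift_mode(hist, start, bandwidth=8.0, tol=1e-3, max_iter=200):
    """Hill-climb from brightness level `start` to a local mode of `hist`."""
    x = float(start)
    for _ in range(max_iter):
        # Kernel-weighted histogram mass around the current position.
        weights = [h * math.exp(-0.5 * ((k - x) / bandwidth) ** 2)
                   for k, h in enumerate(hist)]
        total = sum(weights)
        if total == 0.0:
            break
        new_x = sum(k * w for k, w in enumerate(weights)) / total
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x


def skin_threshold(hist, bandwidth=8.0, merge_tol=1.0):
    """Lowest occupied level whose mean shift path ends at the brightest mode."""
    modes = {k: mean_shift_mode(hist, k, bandwidth)
             for k, h in enumerate(hist) if h > 0}
    top = max(modes.values())
    return min(k for k, m in modes.items() if abs(m - top) <= merge_tol)
```

For a bimodal histogram with a dark background peak and a bright skin peak, the returned level falls in the valley between the two peaks, so no user-supplied threshold is needed. The `merge_tol` parameter is an illustrative tolerance for grouping converged positions into one mode.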
Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.
Equation (16) is the equation for region growing.
Here, $I_R$ indicates the skin region. The proposed technique is realized as follows.
step 1. The RGB input image is transformed to a YCbCr image.
step 2. From the analysis of the standard skin color, the skin color similarity is calculated using the Mahalanobis distance (Equations (6) and (7)), and a skin map is generated. Here, $\mu_{CbCr}$ denotes the mean, $\Sigma_{CbCr}$ denotes the covariance, and each pixel is represented by its CbCr component in $I_{CbCr}$.
step 3. After $I_S$ is quantized, a skin-map histogram $H_S$ is obtained and is approximated by a Gaussian function using a Bezier curve computed with de Casteljau's algorithm [27].
step 4. The mean shift algorithm is used to find Gaussian local maxima in certain regions having similar brightness distributions. The brightness values of pixels in the relevant region are made uniform with the local maximum.
step 5. After the brightness values of the segmented regions are investigated, the regions having the maximum brightness value are detected as skin regions via region growing.
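The final region-growing step in the list above can be sketched as a breadth-first flood fill. Since Equation (16) is not reproduced in the text, this example makes several assumptions: the seed is the brightest pixel of the skin map, neighbors are 4-connected, and a neighbor joins the region when its brightness is at or above the threshold found by mean shift. The name `grow_region` is illustrative.

```python
from collections import deque


def grow_region(skin_map, threshold):
    """Grow a 4-connected region from the brightest pixel of the skin map.

    Returns the set of (row, col) coordinates in the grown region.
    """
    height, width = len(skin_map), len(skin_map[0])
    seed = max(((r, c) for r in range(height) for c in range(width)),
               key=lambda p: skin_map[p[0]][p[1]])
    if skin_map[seed[0]][seed[1]] < threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and (nr, nc) not in region and skin_map[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

A single seed only recovers one connected skin area; detecting several separate areas (for example a face and a hand) would require seeding from every pixel above the threshold.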
3. RESULTS
In the experiment, RGB color images 320 × 240 in size were captured with an ordinary digital camera.
Figure 3 shows the process of skin region detection via the proposed method. Figure 3(a) shows the input image, and Figure 3(b) shows the skin similarity normalized to brightness values of 0 to 255 by applying Equations (6) and (7) to the input image. In Figure 3(b), the face is brighter than the background because Equations (6) and (7) accurately identified the face of the input image as the skin region. Figure 3(c) shows the brightness data of Figure 3(b) converted to a histogram, and Figure 3(d) shows the continuous approximation of the histogram via de Casteljau's algorithm. Figure 3(e) shows the skin region detected from Figure 3(d) by the mean shift algorithm.
Figure 4 compares the proposed method to the existing method [16] using the skin color model. The same skin color model was used for both methods so that the comparison would be objective. As Figure 4 indicates, the proposed method accurately detected the skin regions via the mean shift procedure without user-supplied threshold values. The figure also shows that, compared to the existing method, the proposed method can accurately segment the skin and lip regions.
Table 1. Performance evaluation of skin color detection.
Figure 5 shows the results of skin region detection using the existing method [25, 26], which establishes an appropriate threshold value based on ambient illumination, and the proposed method, in which segmentation points are determined via the mean shift algorithm.
Images used in skin detection experiments are generally captured indoors under fluorescent light, where contamination of the skin color by illumination is insignificant. In this study, strong illumination was deliberately projected onto part of the left side of the human face to investigate the performance of the proposed method. The same skin color model was used with both the existing and the proposed methods so that the comparison would be objective.
Figure 5(a) shows input images in which illumination was projected onto the faces from a certain direction. Figure 5(b) shows skin detection results using the existing method, and Figure 5(c) shows the results using the proposed method. In this experiment, the threshold values for the existing method were the experimentally determined values that gave the best skin detection. As Figure 5 indicates, the proposed method detected skin regions more effectively than the existing method, even though the skin color was altered by the directional illumination. The existing technique detects the region by calculating the skin similarity and applying a threshold value at each pixel. The proposed method, in contrast, applies the mean shift algorithm to the skin map histogram to find Gaussian local maxima in regions having similar brightness distributions, and assigns uniform brightness values to the pixels in those regions.
To evaluate the performance of the proposed method, we used precision, recall, and accuracy, which are defined as follows.
Recall is defined as the ratio between the number of skin pixels correctly classified by the proposed method and the total number of actual skin pixels. Accuracy indicates how closely the skin region detected by the proposed method matches the real skin region. In Equations (18) and (19), N(SM) and N(SA) denote the number of pixels in the real skin region and in the detected skin region, respectively. N(SM∩SA) denotes the number of pixels matched between the real skin region and the detected skin region. N(SU) denotes the number of pixels in the real skin region that were not detected, and N(SO) denotes the number of pixels detected as skin in the non-skin region.
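The evaluation measures can be sketched from the pixel counts defined above. The defining equations are not reproduced in the text, so the formulas here are assumptions based on the usual definitions: precision as N(SM∩SA)/N(SA), recall as N(SM∩SA)/N(SM), and accuracy as the fraction of all pixels that are neither missed (SU) nor falsely detected (SO). Regions are represented as sets of (row, col) coordinates, and `evaluate` is an illustrative name.

```python
def evaluate(real, detected, total_pixels):
    """Return (precision, recall, accuracy) for a detected skin region.

    real         -- set of pixels in the real skin region (SM)
    detected     -- set of pixels in the detected skin region (SA)
    total_pixels -- number of pixels in the whole image
    """
    matched = len(real & detected)        # N(SM ∩ SA)
    missed = len(real - detected)         # N(SU): real skin not detected
    false_skin = len(detected - real)     # N(SO): non-skin detected as skin
    precision = matched / len(detected) if detected else 0.0
    recall = matched / len(real) if real else 0.0
    accuracy = (total_pixels - missed - false_skin) / total_pixels
    return precision, recall, accuracy
```

If the paper's accuracy measure is defined differently, only the last formula needs to change; the counts themselves follow the definitions in the text.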
Table 1 shows the performance of the proposed method; the horizontal axis indicates the input images and the vertical axis indicates the performance as a percentage. Recall was 95.8% and accuracy was 97.8%. Recall is lower than accuracy because illumination changes contaminate the skin color. These results indicate that the proposed method is a robust way to detect the skin region.
4. CONCLUSION
This study introduces a method of skin detection that applies the mean shift algorithm to histogram data. In existing methods that use standard skin color models, the skin similarity threshold values for segmenting the background region and the skin region are determined by repeated experimentation. A weakness of these techniques is that the threshold values vary according to illumination and environment. Moreover, the established threshold values cannot be standardized objectively and include subjective factors determined by individual users.
In the proposed method, a skin map histogram of an input image is created by using standard skin color characteristics of the CbCr color space. The accumulated data at each brightness level are analyzed via the mean shift algorithm, and the skin region is detected by finding the regional segmentation points. Even when the skin color is contaminated by illumination, this procedure can accurately segment the skin region and the background region. The proposed method may be useful in detecting facial regions as a preprocessing step for face recognition under various types of illumination.
Acknowledgements
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2011-0005952).
References
Kherchaoui S, Houacine A (2010) Face Detection Based on A Model of the Skin Color with Constraints and Template Matching. Int'l Conf. Machine and Web Intell., 469-472. DOI: 10.1109/ICMWI.2010.5648043
Uongqiu T, Faling Y, Guohua C, Shizhong J (2010) Skin Color Detection by Illumination Estimation and Normalization in Shadow Regions. IEEE Conf. ICIA, 1082-1085. DOI: 10.1109/ICINFA.2010.5512300
Socolinsky D.A, Selinger A, Neuheisel J.D (2003) Face Recognition with Visible and Thermal Infrared Imagery. Computer Vision Image Understanding 91(2), 72-114. DOI: 10.1016/j.physletb.2003.10.071
Kong S.G, Heo J, Abidi B.R, Paik J, Abidi M.A (2005) Recent Advances in Visual and Infrared Face Recognition: A Review. Computer Vision Image Understanding 97(1), 103-135. DOI: 10.1016/j.cviu.2004.04.001
Liensberger C, Stottinger J, Kampel M (2009) Color-Based and Context-Aware Skin Detection for Online Video Annotation. IEEE Trans. Int'l MMSP, 1-6. DOI: 10.1109/MMSP.2009.5293337
Pan Z, Healey G, Prasad M, Tromberg B (2003) Face Recognition in Hyperspectral Images. IEEE Trans. Pattern Anal. Mach. Intell. 25(12), 1552-1559. DOI: 10.1109/CVPR.2003.1211372
Niazi M, Jafar S (2010) Hybrid Face Detection with HSV Color method and HAAR Classifier. Int'l Conf. Software Technology and Engineering, 325-329. DOI: 10.1109/ICSTE.2010.5608795
Popov A, Dimitrova D (2008) A New Approach for Finding Face Features in Color Images. IEEE Int'l Intelligent Systems, 33-37. DOI: 10.1109/IS.2008.4670517
Adachi Y, Imai A, Ozaki M, Ishii N (2000) Extraction of face region by using characteristics of color space and detection of face direction through an eigenspace. Int'l Conf. Knowledge-Based Intelligent Engineering Systems and Allied Technologies, 393-396. DOI: 10.1109/KES.2000.885839
Xinyu W, Huosheng X, Heng W, Heng L (2008) Robust Real-Time Face Detection with Skin Color Detection and The Modified Census Transform. Int'l Conf. ICIA, 590-595. DOI: 10.1109/ICINFA.2008.4608068
Sebastian P, Vooi V (2007) Tracking using Normalized Cross Correlation and Color Space. Int'l Conf. Intelligent and Advanced Systems, 770-774. DOI: 10.1109/ICIAS.2007.4658490
Hsu R. L, Abdel-Mottaleb M, Jain A. K (2002) Face Detection in Color Images. IEEE Trans. on PAMI 24(5), 696-706. DOI: 10.1109/34.1000242
Darrell T, Gordon G. G, Harville M, Woodfill J (1998) Integrated Person Tracking Using Stereo Color and Pattern Detection. Proc. IEEE Conf. CVPR, 601-607. DOI: 10.1109/CVPR.1998.698667
Zhu X, Yang J, Waibel A (2000) Segmenting Hands of Arbitrary Color. Proc. Int'l Conf. Automatic Face and Gesture Recognition, 446-453. DOI: 10.1109/AFGR.2000.840673
Yang M. H, Ahuja N (1999) Gaussian Mixture Model for Human Skin Color and Its Application in Image and Video Databases. Proc. SPIE Conf. Storage and Retrieval for Image and Video Databases, 458-466. DOI: 10.1117/12.333865
Saxe D, Foulds R (1996) Toward Robust Skin Identification in Video Image. Proc. Int'l Conf. Automatic Face and Gesture Recognition, 379-384. DOI: 10.1109/AFGR.1996.557295
Schwerdt K, Crowley J. L (2000) Robust Face Tracking Using Color. Proc. Int'l Conf. Automatic Face and Gesture Recognition, 90-95. DOI: 10.1109/AFGR.2000.840617
Soriano M, Martinkauppi B, Huovinen S, Laaksonen M (2000) Skin Detection in Video under Changing Illumination Conditions. Proc. Int'l Conf. Pattern Recognition 1(1), 839-842. DOI: 10.1109/ICPR.2000.905542
Pal A (2008) Multicues Face Detection in Complex Background for Frontal Faces. Int'l Machine Vision and Image Processing Conf., 57-62. DOI: 10.1109/IMVIP.2008.32
Diplaros A, Gevers T, Vlassis N (2004) Skin Detection using The EM Algorithm with Spatial Constraints. IEEE Int'l Conf. Systems, Man and Cybernetics 4, 3071-3075. DOI: 10.1109/ICSMC.2004.1400810
Ukil Y, Minsung K, Kar-Ann T, Kwanghoon S (2010) An Illumination Invariant Skin-Color Model for Face Detection. IEEE Int'l Conf. Biometrics: Theory Applications and Systems, 1-6. DOI: 10.1109/BTAS.2010.5634474
D. Hyun-Chul, Y. Ju-Yeon, C. Sung-Il (2007) Skin Color Detection through Estimation and Conversion of Illuminant Color under Various Illuminations. IEEE Trans. Consumer Electronics, 1103-1108. DOI: 10.1109/TCE.2007.4341592