Skin Region Detection Using a Mean Shift Algorithm Based on the Histogram Approximation
Transactions on Electrical and Electronic Materials. 2012. Feb, 13(1): 10-15
Copyright ©2012, The Korean Institute of Electrical and Electronic Material Engineers
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • Received : September 19, 2011
  • Accepted : November 24, 2011
  • Published : February 25, 2012
About the Authors
Ki-Won Byun
Ki-Gon Nam

Conventional skin detection methods define skin color from prior knowledge, and the threshold value that separates the background from the skin region is determined subjectively through experimentation. A drawback of such techniques is that their performance depends on a threshold value estimated from repeated experiments. To overcome this, the present paper introduces a skin region detection method that uses a histogram approximation based on the mean shift algorithm. The proposed method applies the mean shift procedure to the histogram of a skin map of the input image. The skin map is generated by comparing the input image with standard skin colors in the CbCr color space, and the background is divided from the skin region by selecting the maximum value according to the brightness level. Because the histogram, which is accumulated according to the brightness values of the pixels, has the form of a discontinuous function, it is approximated by a Gaussian mixture model (GMM) using the Bezier curve technique. The proposed method then detects the skin region by using the mean shift procedure to determine a maximum value, which becomes the dividing point in place of the manually selected threshold value used in existing techniques. Experiments confirm that the new procedure effectively detects the skin region.
Skin detection plays an important role in various areas of human-computer interaction, such as face detection, face tracking, content-based image data search systems, and gesture analysis. Recently, skin detection methods based on skin color data have attracted considerable attention because of their computational efficiency and their robustness to rotation, size, and partial obstruction of the relevant region. Skin color is used to complement geometric data in designing accurate face detection systems [1-4]. Skin detection via skin color is a significant preprocessing tool in face detection and recognition.
Research on skin detection is generally conducted using visible-spectrum images. However, skin detection in visible-spectrum images is limited by ambient illumination, camera characteristics, ethnicity, and personal characteristics. Procedures using non-visible-spectrum images, such as infrared images, have been considered as a means of resolving such issues, but these procedures require prohibitively expensive hardware or highly constrained environments [5-9].
As a means of classification, skin detection distinguishes two categories: the skin region and the non-skin region. For skin detection based on color data, an efficient classification procedure requires (1) the selection of an appropriate color model, (2) the selection of a suitable model distribution for skin and non-skin pixels, and (3) the consideration of the actual distribution being modeled. Selection of an appropriate color model determines the efficiency with which a given skin color distribution can be modeled. The skin color distribution is usually modeled by a histogram or a Gaussian distribution. Various techniques for obtaining the actual distribution have been researched, from the simple use of a skin color lookup table to complex pattern recognition.
Figure 1. Block diagram of the proposed skin region detection method.
In existing research on skin color detection, RGB images are transformed to a color space in which intensity and color can be separated, to exclude the effects of external illumination. TSL, NCC, HSV, and YCbCr are the color spaces that are usually considered. Techniques for transforming the color space include linear and nonlinear transformations of RGB. Linear transformations include YIQ, YUV, and YCbCr. Nonlinear transformations include NCC, HSV, and HSL [10-13]. Recent research efforts have generally used techniques that transform higher-dimensional color spaces into lower-dimensional color spaces to save computational time [14, 15]. When skin detection is conducted using predefined skin color data, the skin similarity threshold value that divides the background region from the skin region is determined from repeated experimentation [16]. Such methods are limited in that the threshold values vary according to the experimental environment and skin color data. Also, these threshold values are not standardized objectively and partly reflect the subjective judgment of the user. To overcome the weaknesses in the existing procedures, this study introduces a technique of skin color detection using histogram data based on the mean shift algorithm in a lower-dimensional color space. Unlike the existing procedures, this technique does not use experimentally determined threshold values. Instead, it uses the mean shift algorithm to find local maxima, which are used as segmentation points to detect the skin region.
Methods of skin color detection can be classified into those based on physical characteristics and those based on statistical characteristics. The latter can be subdivided into parametric approaches [17-19] and nonparametric approaches [20-22]. Generally, methods based on statistical characteristics detect the skin region in a lower-dimensional color space to reduce the effects of illumination. The skin color distributions used in parametric statistical approaches tend to follow a Gaussian mixture model (rather than a simple Gaussian model), and thus methods of skin detection based on these approaches usually employ a Gaussian mixture model [18, 23]. Selection of the model dimension is one of the important problems that arise in such procedures. The selection is usually accomplished inductively, based on a priori environmental data, and the parameters of the skin color distribution are determined differently depending on ethnicity and illumination. Recent studies have made frequent use of the Expectation-Maximization (EM) algorithm as a technique for estimating the parameters of the Gaussian mixture model [24]. The EM algorithm uses a probability density function, determining the parameters of a skin color distribution by inductive estimation, based on ambient illumination and ethnicity. Nonparametric approaches use a model that expresses the general shape of the skin color distribution more easily than parametric approaches [21, 22]. Such techniques usually employ a histogram to represent the characteristics of the skin color distribution in color space. One major advantage of using a histogram is that the probability density function can be calculated directly from the counts at each quantization level, even when the skin color distribution is complex.
Figure 2. Analysis of skin color in the CbCr space.
Unlike the statistical approach, skin detection based on physical characteristics uses a physical model of inherent skin color. Such physical models are often employed to detect skin regions because they disregard background-dependent changes in illumination and rely on permanent skin color characteristics.
Figure 1 shows the overall block diagram of the proposed skin region detection method.
The proposed technique consists of color transformation, skin map histogram generation, histogram approximation, and skin region detection via mean shift. In the color transformation, input images are transformed from the RGB color space to the YCbCr color space using a color transformation formula. The skin map histogram is generated by expressing the skin region distribution in terms of brightness values, which are calculated from a standard skin color table and similarities established in advance by using the skin color characteristics of the CbCr color space. Histogram approximation is carried out by regarding the histogram as a discontinuous function, which is approximated by a continuous Gaussian function using the Bezier curve technique. Finally, the mean shift algorithm is applied to the histogram to find the Gaussian local maxima in regions having similar brightness distributions. The brightness values of the pixels in the relevant regions are made uniform with these local maxima, and the regions having the maximum brightness value are detected as skin regions via region growing.
- 2.1 Skin color analysis
The skin region occupies certain parts of a color space, and this characteristic enables skin color to be separated from other background colors. The skin region distribution varies according to color space, and thus the choice of color space affects detection performance [16]. The YCbCr color space includes Y, which indicates the brightness value, and Cb and Cr, which represent color differences. With Y excluded, pixels in the CbCr color space contain only color data and are less affected by illumination. Thus, the skin color region in the CbCr color space remains effective in various illumination environments, as brightness has less effect on color values than in other color spaces. The first step in skin color region detection is to define the skin color region. This definition is accomplished by using existing skin color region images made from images of various people's faces in varying ambient illumination. The effects of illumination should be considered in detecting skin color regions. In this study, the lower-dimensional CbCr color space is used to minimize the effects of illumination [14, 15]. Figure 2 shows the distribution regions of the standard skin color table in the CbCr color space.
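As a minimal sketch of the color transformation step, the following Python function converts RGB pixels to YCbCr. The paper does not state which conversion formula it uses; the full-range ITU-R BT.601 coefficients (as used in JPEG/JFIF) are assumed here, and the function name `rgb_to_ycbcr` is illustrative.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (..., 3) RGB array in [0, 255] to float YCbCr.

    ASSUMPTION: full-range BT.601 coefficients; the paper's own formula
    is not given in the text.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)
```

Only the last two channels (`[..., 1:]`) would be passed on to the skin color model, since Y carries the brightness that the method sets aside.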
In the images used in this study, the Cb skin color values were distributed primarily between 102 and 118, while the Cr color values were between 137 and 152. A standard skin color table was constructed on the basis of 100 Korean male and female adults illuminated by fluorescent lights in ordinary buildings. According to the results of research on face detection, skin color distribution is similar in form to a Gaussian distribution.
Thus, skin color distribution can be expressed as a 2D Gaussian function G ( μ CbCr , ∑ CbCr ).
[Equations (1) to (5)]
Here, Cb and Cr denote pixel color values, μ Cb and μ Cr denote the Gaussian mean color values, and ∑ CbCr denotes the 2D Gaussian covariance matrix.
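The equation images that define this model are not reproduced in this text. As a reference point only, the standard bivariate Gaussian can be written as follows in notation close to that of the text. The sample index k and the sample count n are notation introduced here, and the paper's own Equations (1) to (5) may group or normalize the terms differently.

```latex
\begin{aligned}
\mathbf{x} &= (C_b,\; C_r)^{\mathsf T}, \qquad
\mu_{CbCr} = (\mu_{Cb},\; \mu_{Cr})^{\mathsf T}
           = \frac{1}{n}\sum_{k=1}^{n} \mathbf{x}_k, \\
\Sigma_{CbCr} &= \frac{1}{n}\sum_{k=1}^{n}
   (\mathbf{x}_k - \mu_{CbCr})(\mathbf{x}_k - \mu_{CbCr})^{\mathsf T}, \\
G(\mathbf{x};\, \mu_{CbCr}, \Sigma_{CbCr}) &=
   \frac{1}{2\pi\,\lvert \Sigma_{CbCr} \rvert^{1/2}}
   \exp\!\Big( -\tfrac{1}{2}\,
   (\mathbf{x} - \mu_{CbCr})^{\mathsf T}\,
   \Sigma_{CbCr}^{-1}\,
   (\mathbf{x} - \mu_{CbCr}) \Big).
\end{aligned}
```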
- 2.2 Skin-map generation
The skin map used in this study is calculated from the skin region similarity between a standard skin color table and input images, and then normalized to brightness values between 0 and 255. The skin map is generated by applying the Mahalanobis distance to the Gaussian mean and covariance of predefined skin color images.
[Equation (6)]
Equation (6) is the formula for calculating the Mahalanobis distance. The inverse of the 2D Gaussian covariance matrix ∑ CbCr appears in this formula, and N is the total number of pixels in the input image. The values obtained from Equation (6) indicate the degree of similarity to the skin region, but they must be normalized to the range 0 to 255 to express the intensity of the image. Equation (7) normalizes the values obtained from Equation (6), producing brightness values close to 255 when the similarity to the skin region is high.
[Equation (7)]
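The skin-map step can be sketched in Python as below. The Mahalanobis distance is standard, but the normalization is an assumption: the text only says that brightness approaches 255 when similarity is high, so a linear inversion against the largest distance in the image is used here. The function name `skin_map` and the example mean and covariance values are hypothetical.

```python
import numpy as np

def skin_map(cbcr, mean, cov):
    """Return a uint8 skin map from an (H, W, 2) array of CbCr values.

    Brightness is high where the Mahalanobis distance to the skin color
    model (mean, cov) is small.
    """
    diff = np.asarray(cbcr, dtype=np.float64) - np.asarray(mean, dtype=np.float64)
    inv_cov = np.linalg.inv(np.asarray(cov, dtype=np.float64))
    # Mahalanobis distance per pixel: sqrt(diff^T * inv_cov * diff).
    dist = np.sqrt(np.einsum("...i,ij,...j->...", diff, inv_cov, diff))
    d_max = dist.max()
    if d_max == 0.0:
        return np.full(dist.shape, 255, dtype=np.uint8)
    # ASSUMPTION: linear inversion so that distance 0 maps to 255 and the
    # largest distance in the image maps to 0.
    return np.round(255.0 * (1.0 - dist / d_max)).astype(np.uint8)
```

For example, `skin_map(ycbcr[..., 1:], mean=(110.0, 145.0), cov=[[25.0, 0.0], [0.0, 25.0]])` would use illustrative parameters chosen inside the Cb and Cr ranges reported above; real values would come from the standard skin color table.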
- 2.3 Skin region detection using histogram approximation
The proposed technique uses skin map histogram approximation to efficiently detect skin regions in environments with varying or complex illumination. The procedure is carried out in three steps. First, the skin map histogram is regarded as a discontinuous function to be approximated by a continuous Gaussian function, using the Bezier curve technique. In the second step, the mean shift algorithm is used to find Gaussian local maxima in regions having similar brightness distributions, and the brightness values of pixels in the relevant regions are set to the local maxima. In the third step, the uniform brightness value of each region is examined, and the region with the highest brightness value is detected as the skin region via region growing.
- 2.3.1 Histogram approximation using Bezier curve
Generally, a skin map histogram has the form of a discontinuous function determined by the accumulated brightness values of the pixels. In the proposed method, this histogram is approximated by a continuous Bezier curve, using the brightness value of each level of the histogram as a Bezier control point. Equation (8) is used to obtain a histogram from a skin map.
[Equation (8)]
Here, N indicates the size of the skin map.
Equations (9) and (10) are the Bernstein function equations for the Bezier curve [18].
[Equations (9) and (10)]
Here, P i is a control point for generating the Bezier curve, and u is the parameter that controls distance along the curve (smaller values of u correspond to shorter distances on the curve). The Bezier curve is approximated by a Gaussian curve with Bezier control points given by h (level, value), which denotes the frequency of any given brightness level of the histogram. The number of terms in the Bezier curve formula is determined by the number of control points, and thus the Bernstein formulation has 256 terms (degree 255). Numerical errors cause such high-degree Bernstein functions to generate unstable Bezier curves, and thus this study repeatedly applies linear (first-degree) De Casteljau interpolation instead of evaluating the Bernstein functions [27]. Equation (11) is the Bezier curve formula using the De Casteljau algorithm.
[Equation (11)]
The control point PS(x) indicates the frequency at each brightness level of the histogram, and t is a distance control variable calculated via Equation (12) (smaller values of t corresponding to shorter curves).
[Equation (12)]
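The two evaluation routes described above can be compared in a short Python sketch. `bernstein_bezier` evaluates the standard Bernstein form directly, while `de_casteljau` reaches the same value by repeated linear interpolation. The function names are illustrative, and treating the histogram frequencies as control values at evenly spaced brightness levels is an assumption about how the paper applies Equation (11).

```python
from math import comb

def de_casteljau(control_values, t):
    """Evaluate a Bezier curve at t in [0, 1] by repeated linear interpolation.

    control_values must be a non-empty sequence of numbers.
    """
    pts = [float(v) for v in control_values]
    while len(pts) > 1:
        pts = [(1.0 - t) * a + t * b for a, b in zip(pts, pts[1:])]
    return pts[0]

def bernstein_bezier(control_values, t):
    """Evaluate the same curve from the Bernstein basis (for comparison)."""
    n = len(control_values) - 1
    return sum(comb(n, i) * t**i * (1.0 - t)**(n - i) * v
               for i, v in enumerate(control_values))

def smooth_histogram(hist):
    """Sample the Bezier curve once per brightness level.

    ASSUMPTION: control points sit at evenly spaced levels, so the curve's
    horizontal coordinate is linear in t and t = level / (len(hist) - 1).
    """
    if len(hist) < 2:
        return [float(v) for v in hist]
    n = len(hist) - 1
    return [de_casteljau(hist, level / n) for level in range(len(hist))]
```

With 256 control points the curve is a heavy smoother: it passes through the first and last histogram values but only approximates the interior ones, which is what removes the bin-to-bin discontinuities.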
Figure 3. Skin detection process using the proposed method: (a) input image, (b) skin map of (a), (c) histogram of (b), (d) histogram of (c) smoothed by De Casteljau's algorithm, (e) skin region detection of (d) by the mean shift algorithm.
- 2.3.2 Establishment of threshold value using mean shift algorithm
In the mean shift algorithm, the mode of the probability density function is found by hill climbing. Here, the probability density function describes the brightness distribution of pixels in the intensity image. The algorithm converges on a local maximum within the kernel by repeatedly calculating the mean locations and mean brightness values of pixels that have a similar brightness distribution in a neighborhood of the given pixel. In other words, the pixel value at the current location is transformed to the brightness value at the local maximum, and thus the brightness values in the spatial region are made uniform. The optimal segmentation threshold value is therefore obtained by using the mean shift algorithm: it is the valley point on the boundary between the uniform regions, or in the Gaussian histogram approximation. Equation (13) expresses the mean shift algorithm.
[Equation (13)]
Equation (14) describes the transformation of the current pixel brightness value to the local maximum brightness value via the mean shift algorithm.
[Equation (14)]
Here, x denotes the current pixel brightness value and k denotes the weight variable. Thus, PM(X') transforms the current pixel brightness value to the local maximum brightness value in a given region via Equation (13). Equation (15) gives the optimal threshold value for segmenting the background region and the skin region via Equations (13) and (14).
[Equation (15)]
The maximum value is used because the skin region is the brightest region in the skin map.
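The mode-seeking idea can be sketched with a generic one-dimensional mean shift over the histogram. This is not a transcription of Equations (13) to (15), which are not reproduced in this text: the Gaussian kernel, the `bandwidth` value, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def mean_shift_modes(hist, bandwidth=8.0, tol=1e-3, max_iter=500):
    """For every brightness level, climb to the histogram mode it converges to."""
    hist = np.asarray(hist, dtype=np.float64)
    levels = np.arange(len(hist), dtype=np.float64)
    modes = np.empty(len(hist), dtype=np.float64)
    for start in range(len(hist)):
        x = float(start)
        for _ in range(max_iter):
            # Kernel-weighted histogram mass around the current position.
            w = hist * np.exp(-0.5 * ((levels - x) / bandwidth) ** 2)
            total = w.sum()
            if total == 0.0:
                break
            x_new = float((w * levels).sum() / total)  # weighted mean location
            shift = abs(x_new - x)
            x = x_new
            if shift < tol:
                break
        modes[start] = x
    return modes

def skin_levels(hist, bandwidth=8.0):
    """Boolean mask over levels that converge to the brightest mode."""
    modes = np.round(mean_shift_modes(hist, bandwidth)).astype(int)
    return modes == modes.max()
```

Given a uint8 skin map `m` and its (smoothed) histogram `h`, `skin_levels(h)[m]` yields a per-pixel mask of the brightest cluster, so the dividing point between clusters plays the role of the threshold without being chosen by hand.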
Figure 4. Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.
Equation (16) is the equation for region growing.
[Equation (16)]
Here, IR indicates the skin region. The proposed technique is realized as follows.
step 1. The RGB input image is transformed to a YCbCr image.
step 2. From the analysis of standard skin color, skin color similarity is calculated using the following formula and a skin map is generated.
Figure 5. Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.
[Equation]
Here, μ CbCr denotes the mean, ∑ CbCr denotes the covariance, and the input vector is the CbCr component of I CbCr .
step 3. After IS is quantized, a skin-map histogram HS is obtained and approximated by a Gaussian function using the Bezier curve of De Casteljau's algorithm [27].
step 4. The mean shift algorithm is used to find Gaussian local maxima in certain regions having similar brightness distributions. The brightness values of pixels in the relevant region are made uniform with the local maximum.
step 5. After the brightness values of the segmented regions are investigated, the regions having the maximum brightness value are detected as skin regions via region growing.
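The region growing in step 5 can be illustrated with a simple breadth-first flood fill. Equation (16) is not reproduced in this text, so the details here are assumptions: 4-connectivity, seeds taken from pixels at the brightest mode, and an `accept` grid marking pixels that belong to the brightest uniform region. The function name `region_grow` is hypothetical.

```python
from collections import deque

def region_grow(accept, seeds):
    """Return a boolean grid of pixels reachable from the seeds.

    accept: 2D list (rows of truthy/falsy values) marking candidate pixels.
    seeds:  iterable of (row, col) starting positions.
    ASSUMPTION: regions are 4-connected.
    """
    height, width = len(accept), len(accept[0])
    grown = [[False] * width for _ in range(height)]
    queue = deque()
    for r, c in seeds:
        if accept[r][c] and not grown[r][c]:
            grown[r][c] = True
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width:
                if accept[nr][nc] and not grown[nr][nc]:
                    grown[nr][nc] = True
                    queue.append((nr, nc))
    return grown
```

Growing from seeds rather than thresholding every pixel keeps isolated bright pixels in the background from being labeled as skin unless they connect to a seeded region.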
In the experiment, RGB color images 320 × 240 in size were captured with an ordinary digital camera. Figure 3 shows the process of skin region detection via the proposed method.
Figure 3 (a) shows the input image, and Figure 3 (b) shows the normalization of skin similarity to brightness values of 0 to 255 by applying Equations (6) and (7) to the input image. In Figure 3 (b), the face is brighter than the background because the face of the input image was accurately identified as the skin region by Equations (6) and (7). Figure 3 (c) shows the brightness data of Figure 3 (b) converted to a histogram, and Figure 3 (d) shows the continuous approximation of the histogram via De Casteljau's algorithm. Figure 3 (e) illustrates the result of skin region detection via the proposed method. Figure 4 compares the proposed method to the existing method [16] using the skin color model. The same skin color model was used in this study for the objectivity of verification.
As Figure 4 indicates, the proposed method accurately detected the skin regions via the mean shift procedure without a user-supplied threshold value. The figure also shows that, compared to the existing method, the proposed method can accurately segment the skin and lip regions.
Table 1. Performance evaluation of skin color detection.
Figure 5 shows the results of skin region detection using the existing method [25, 26], which establishes an appropriate threshold value based on ambient illumination, and the proposed method, in which segmentation points are determined via the mean shift algorithm.
Images used in skin detection experiments are generally captured in an indoor environment under fluorescent light, where the skin color contamination by illumination is insignificant. In this study, strong illumination was deliberately projected onto part of the left side of the human face to investigate the performance of the proposed method. The same skin color model was used with both the existing and the proposed methods for the sake of an objective comparison.
Figure 5 (a) shows input images in which illumination was projected onto the faces from a certain direction. Figure 5 (b) shows skin detection results using the existing method, and Figure 5 (c) shows the results using the proposed method. In this experiment, the threshold values for the existing method were the values that gave the best skin detection in experiments. As Figure 5 indicates, the proposed method detected skin regions more effectively than the existing method, even though the skin color was changed by the directional illumination. The existing technique detects the region by calculating skin similarity and applying a threshold value at each pixel. In contrast, the proposed method applies the mean shift algorithm to the skin map histogram to find Gaussian local maxima in regions having similar brightness distributions, and assigns uniform brightness values to pixels in the relevant regions.
To evaluate the performance of the proposed method, we used precision, recall, and accuracy, which are defined as follows.
[Equations (18) and (19)]
Recall is defined as the ratio of the number of skin pixels correctly classified by the proposed method to the total number of actual skin pixels. Accuracy indicates how closely the skin region detected by the proposed method matches the real skin region. In Equations (18) and (19), N(SM) and N(SA) denote the number of pixels in the real skin region and in the detected skin region, respectively. N(SM∩SA) denotes the number of pixels matched between the real skin region and the detected skin region. N(SU) denotes the number of pixels in the real skin region that were not detected, and N(SO) denotes the number of pixels detected as skin in the non-skin region.
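The pixel counts named above can be computed from two binary masks as in the following Python sketch. Recall follows the definition given in the text, and precision uses its usual definition. The paper's exact accuracy formula is not recoverable from this text, so it is deliberately left out rather than guessed. The function name and dictionary keys are illustrative.

```python
import numpy as np

def skin_detection_counts(real_mask, detected_mask):
    """Count matched, missed, and false skin pixels from two boolean masks."""
    real = np.asarray(real_mask, dtype=bool)
    detected = np.asarray(detected_mask, dtype=bool)
    n_sm = int(real.sum())                      # N(SM): real skin pixels
    n_sa = int(detected.sum())                  # N(SA): detected skin pixels
    n_match = int((real & detected).sum())      # N(SM ∩ SA): matched pixels
    n_su = int((real & ~detected).sum())        # N(SU): real skin not detected
    n_so = int((~real & detected).sum())        # N(SO): non-skin detected as skin
    return {
        "N_SM": n_sm, "N_SA": n_sa, "N_match": n_match,
        "N_SU": n_su, "N_SO": n_so,
        # Recall: matched pixels over all real skin pixels.
        "recall": n_match / n_sm if n_sm else 0.0,
        # Precision (standard definition): matched over all detected pixels.
        "precision": n_match / n_sa if n_sa else 0.0,
    }
```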
Table 1 shows the performance of the proposed method; the x axis indicates the input images and the y axis indicates the percentage of performance. Recall was 95.8% and accuracy was 97.8%. Recall is lower than accuracy because of the contamination of skin color by illumination changes. These results indicate that the proposed method is a robust method for detecting the skin region.
This study introduces a method of skin detection by applying the mean shift algorithm to histogram data. In the existing methods using standard skin color models, skin similarity threshold values for segmenting the background region and the skin region are determined by repeated experimentation. A weakness of these techniques is that the threshold values vary according to illumination and environment. Also, established threshold values cannot be standardized objectively, and include subjective factors, determined by individual users.
In the proposed method, a skin map histogram of an input image is created by using standard skin color characteristics of the CbCr color space. The accumulated data at each brightness level are analyzed via the mean shift algorithm, and the skin region is detected by finding the regional segmentation points. Even when the skin color is contaminated by illumination, this procedure can accurately segment the skin region and the background region. The proposed method may be useful in detecting facial regions as a preprocessing step for face recognition in various types of illumination.
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2011-0005952).
[1] Jiang Z, Wu Z, Yao M 2008 Skin Detection on Images with Color Deviation. IEEE Trans Congress on Services, Part Ⅱ 171-174 DOI: 10.1109/SERVICES-2.2008.21
[2] Kherchaoui S, Houacine A 2010 Face Detection Based on A Model of the Skin Color with Constraints and Template Matching. Int'l Conf. Machine and Web Intell. 469-472 DOI: 10.1109/ICMWI.2010.5648043
[3] Zhengming L, Tong Z, Jin Z 2010 Skin Detection in Color Images. Int'l Conf. ICCET. 156-159 DOI: 10.1109/ICCET.2010.5486235
[4] Uongqiu T, Faling Y, Guohua C, Shizhong J 2010 Skin Color Detection by Illumination Estimation and Normalization in Shadow Regions. IEEE. Conf. ICIA. 1082-1085 DOI: 10.1109/ICINFA.2010.5512300
[5] Socolinsky D.A, Selinger A, Neuheisel J.D 2003 Face Recognition with Visible and Thermal Infrared Imagery. Computer Vision Image Understanding 91 (2) 72-114 DOI: 10.1016/j.physletb.2003.10.071
[6] Kong S.G, Heo J, Abidi B.R, Paik J, Abidi M.A 2005 Recent Advances in Visual and Infrared Face Recognition: A Review. Computer Vision Image Understanding 97 (1) 103-135 DOI: 10.1016/j.cviu.2004.04.001
[7] Nunez A. S, Mendenhall M. J 2008 Detection of Human Skin in Near Infrared Hyperspectral Imagery. IEEE. Int'l IGARSS. 2 621-624 DOI: 10.1109/IGARSS.2008.4779069
[8] Liensberger C, Stottinger J, Kampel M 2009 Color-Based and Context-Aware Skin Detection for Online Video Annotation. IEEE. Trans. Int'l MMSP 1-6 DOI: 10.1109/MMSP.2009.5293337
[9] Pan Z, Healey G, Prasad M, Tromberg B 2003 Face Recognition in Hyperspectral Images. IEEE Trans. Pattern Anal. Mach. Intell 25 (12) 1552-1559 DOI: 10.1109/CVPR.2003.1211372
[10] Hjelm E, Low B.K 2001 Face Detection: A Survey. Computer Vision and Image Understanding 83 (3) 236-274 DOI: 10.1006/cviu.2001.0921
[11] Niazi M, Jafar S 2010 Hybrid Face Detection with HSV Color method and HAAR Classifier. Int'l Conf. Software Technology and Engineering 325-329 DOI: 10.1109/ICSTE.2010.5608795
[12] Popov A, Dimitrova D 2008 A New Approach for Finding Face Features in Color Images. IEEE. Int'l. Intelligent Systems 33-37 DOI: 10.1109/IS.2008.4670517
[13] Adachi Y, Imai A, Ozaki M, Ishii N 2000 Extraction of face region by using characteristics of color space and detection of face direction through an eigenspace. Int'l Conf. Knowledge-Based Intelligent Engineering Systems and Allied Technologies 393-396 DOI: 10.1109/KES.2000.885839
[14] Xinyu W, Huosheng X, Heng W, Heng L 2008 Robust Real-Time Face Detection with Skin Color Detection and The Modified Census Transform. Int'l Conf. ICIA. 590-595 DOI: 10.1109/ICINFA.2008.4608068
[15] Sebastian P, Vooi V 2007 Tracking using Normalized Cross Correlation and Color Space. Int'l Conf. Intelligent and Advanced Systems 770-774 DOI: 10.1109/ICIAS.2007.4658490
[16] Hsu R. L, Abdel-Mottaleb M, Jain A. K 2002 Face Detection in Color Images. IEEE Trans. on PAMI 24 (5) 696-706 DOI: 10.1109/34.1000242
[17] Darrell T, Gordon G. G, Harville M, Woodfill J 1998 Integrated Person Tracking Using Stereo Color and Pattern Detection. Proc. IEEE Conf. CVPR 601-607 DOI: 10.1109/CVPR.1998.698667
[18] Zhu X, Yang J, Waibel A 2000 Segmenting Hands of Arbitrary Color. in Proc. Int'l Conf. Automatic Face and Gesture Recognition 446-453 DOI: 10.1109/AFGR.2000.840673
[19] Yang M. H, Ahuja N 1999 Gaussian Mixture Model for Human Skin Color and Its Application in Image and Video Databases. in Proc. SPIE Conf. Storage and Retrieval for Image and Video Databases 458-466 DOI: 10.1117/12.333865
[20] Saxe D, Foulds R 1996 Toward Robust Skin Identification in Video Image. in Proc. Int'l Conf. Automatic Face and Gesture Recognition 379-384 DOI: 10.1109/AFGR.1996.557295
[21] Schwerdt K, Crowley J. L 2000 Robust Face Tracking Using Color. in Proc. Int'l Conf. Automatic Face and Gesture Recognition 90-95 DOI: 10.1109/AFGR.2000.840617
[22] Soraino M, Martinkauppi B, Huovinen S, Laaksonen M 2000 Skin Detection in Video under Changing Illumination Conditions. in Proc. Int'l Conf. Pattern Recognition 1 (1) 839-842 DOI: 10.1109/ICPR.2000.905542
[23] Pal A 2008 Multicues Face Detection in Complex Background for Frontal Faces. Int'l. Machine Vision and Image Processing Conf. 57-62 DOI: 10.1109/IMVIP.2008.32
[24] Diplaros A, Gevers T, Vlassis N 2004 Skin Detection using The EM Algorithm with Spatial Constraints. IEEE. Int'l. Conf. Systems, Man and Cybernetics 4 3071-3075 DOI: 10.1109/ICSMC.2004.1400810
[25] Ukil Y, Minsung K, Kar-Ann T, Kwanghoon S 2010 An Illumination Invariant Skin-Color Model for Face Detection. IEEE. Int'l Conf. Biometrics: Theory Applications and Systems 1-6 DOI: 10.1109/BTAS.2010.5634474
[26] D. Hyun-Chul, Y. Ju-Yeon, C. Sung-Il 2007 Skin Color Detection through Estimation and Conversion of Illuminant Color under Various Illuminations. IEEE. Trans. Consumer Electronics 1103-1108 DOI: 10.1109/TCE.2007.4341592
[27] Ding R, Zhang Y 2003 The Extension of The Dual De Casteljau Algorithm. Int'l Conf. on PDCAT 688-692 DOI: 10.1109/PDCAT.2003.1236392