Robust Sign Recognition System at Subway Stations Using Verification Knowledge
ETRI Journal. 2014. Aug, 36(5): 696-703
Copyright © 2014, Electronics and Telecommunications Research Institute(ETRI)
  • Received : January 10, 2014
  • Accepted : September 05, 2014
  • Published : August 01, 2014
Dongjin Lee
Hosub Yoon
Myung-Ae Chung
Jaehong Kim

In this paper, we present a walking guidance system for the visually impaired for use at subway stations. This system, which is based on environmental knowledge, automatically detects and recognizes both exit numbers and arrow signs from natural outdoor scenes. The visually impaired can, therefore, utilize the system to find their own way (for example, using exit numbers and the directions provided) through a subway station. The proposed walking guidance system consists mainly of three stages: (a) sign detection using the MCT-based AdaBoost technique, (b) sign recognition using support vector machines and hidden Markov models, and (c) three verification techniques to discriminate between signs and non-signs. The experimental results indicate that our sign recognition system has a high performance with a detection rate of 98%, a recognition rate of 99.5%, and a false-positive error rate of 0.152.
I. Introduction
The automatic detection and recognition of both signs and text in real-world environments has been studied for several years [1]–[3]. One reason for such studies is the importance of obtaining useful information about real-world environments so as to assist the visually impaired. According to the Korean Ministry of Health and Welfare, the number of visually impaired persons in South Korea grew from 180,526 in 2005 to 249,259 in 2011, which is an increase of 38%. However, there were only 65 guide dogs in South Korea in 2011. For this reason, we have been developing a walking guidance system for application at subway stations. This system allows us to extract signs from a real-world image and to classify them into specific categories. The final guiding message is then provided to the blind person through tiny speakers embedded in their hat. In this way, the system can help the visually impaired find their way around a subway station.
The main role of the walking guidance system is to detect candidates for subway signs from natural scenes and then to filter out non-sign regions correctly. One of the representative algorithms for this type of system is AdaBoost, which many researchers have adopted for the detection of signs and text [4]–[6]. Although the algorithm extracts signs and text from a cluttered background exceptionally well, it is sensitive to noisy data and outliers. Furthermore, it has difficulty detecting most signs that are smaller than 20 pixels in diameter [7].
To solve these problems, as described in Section III of this paper, we suggest a detection method that is robust even for small or geometrically distorted signs in natural-scene images. In Section IV, we introduce the three verification techniques used to identify candidate sign regions. In addition, the results demonstrating the superiority of our proposed system are provided. We end this paper with our conclusions in Section V.
II. System Overview
The aim of our walking guidance system is to detect and recognize two different types of signs, exit numbers and arrows, in the natural-scene images found at subway stations. As shown in Fig. 1, our system consists of the following three main stages:
1) Detection. Candidate sign (arrow and number) regions are detected from the natural subway scenes using the MCT-AdaBoost algorithm. A large set of images is used to train the sign detector. A general sign detector can occasionally miss a small numeral, such as an exit number located behind a detected arrow. In this case, we use an additional a priori knowledge-based detection algorithm that allows arrows and exit numbers to be detected in a pairwise manner. Figure 2 illustrates the types of signs used in our work.
2) Recognition. The recognition process is based on the following three steps:
  • Support vector machine (SVM)-based feature extraction and classification.
  • Hidden Markov model (HMM)-based feature extraction and classification.
  • SVM/HMM hybrid recognition.
3) Knowledge-based verification. Non-sign regions are excluded using the following three effective means of verification:
  • HMM log-likelihood
  • Hough transform (HT)
  • Color information
To capture the scene images, we designed a hat with two embedded webcams, as shown in Fig. 3. The two webcams allow us to obtain a wider field of view (FOV) of 100° at 1,280 pixels × 480 pixels, with real-time processing at about 3 fps to 8 fps on a 1.8 GHz Samsung ATIV smart PC.
Fig. 1. Flowchart of proposed walking guidance system.
Fig. 2. Examples of eight directional arrows and exit number patterns.
Fig. 3. Hat embedded with two webcams and a smart PC.
III. Sign Detection
- 1. AdaBoost-Based Detection
To detect candidate subway signs, we adopted the MCT-AdaBoost technique, which was originally used for face detection and for aligning a detected face image [8], [9]. This technique is divided into three steps. First, to train the classifiers of the proposed sign detector, it is necessary to obtain a large set of sign (positive) and non-sign (negative) images from subway stations. We therefore collected a large set of images in seven different subway stations in Daejeon, South Korea.
Second, we used the modified census transform (MCT) feature, which compares the intensity of each pixel in a 3 × 3 kernel with the kernel mean. The resulting values of the MCT feature are integer indices representing the local structure around a pixel. Third, we trained the weak classifiers of AdaBoost by examining the distribution of the integer indices for the pixels. The AdaBoost algorithm then combined these weak classifiers into a strong classifier. Figure 4 shows the steps of the proposed MCT-AdaBoost detector.
Fig. 4. Steps of MCT-AdaBoost detector.
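As a rough illustration of the MCT feature described above, the following sketch computes a 9-bit index for every interior pixel of a grayscale image. The function name, the bit ordering (row-major, first kernel pixel as the most significant bit), and the choice to skip border pixels are assumptions made for this example; the paper does not specify them.

```python
def modified_census_transform(image):
    """Return the 9-bit MCT index for every interior pixel.

    `image` is a list of rows of grayscale intensities. Border pixels
    are skipped, so the result has (H - 2) x (W - 2) entries.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            # Collect the 3 x 3 kernel in row-major order.
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            mean = sum(window) / 9.0
            index = 0
            for value in window:
                # One bit per kernel pixel: 1 if brighter than the kernel mean.
                index = (index << 1) | (1 if value > mean else 0)
            row.append(index)
        result.append(row)
    return result
```

Because each index depends only on comparisons with the local mean, a uniform brightness offset leaves the index unchanged, which is one reason this kind of feature is attractive under varying illumination.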
In our work, 30,000 positive and 80,000 negative training samples were used. With these, we obtained a sign detector with a detection rate of 95% and a false-positive rate of 5%. To reduce false-positive sign results, we introduce several verification methods in Section IV-2.
- 2. Detection of Small Numbers
In this section, we introduce a detection technique that is robust with respect to small and geometrically distorted signs in subway scenes. The main idea is simple but very effective. We first detect and recognize the relatively large arrow signs, and then detect a small sign within a limited area of the scene next to the large sign. In our case, we used the prior knowledge that the small exit number signs were located next to the arrow signs. Moreover, a region including small signs has a black (or close to black) background, so the correct region is mostly darker than other regions, as shown in Fig. 5(a). Based on this idea, both candidate regions (the same size as the arrow sign) were investigated to choose the correct one. We then worked out the coordinates of the selected region of interest (ROI) in the scene under the following conditions:
Figure 5(b) shows the selected candidate region. Next, Sauvola’s locally adaptive binarization method was performed to binarize the selected region [10]. This method is an improvement on Niblack's method (a local thresholding method) and is especially suited to badly illuminated documents [11]. Then, a connected-component analysis (CCA) algorithm, which is used in computer vision to find connected regions in binary images, was performed to inspect all of the blobs and to remove non-sign blobs, as shown in Fig. 5(c) [12]. Finally, we were able to recognize the sign blobs, as shown in Fig. 5(d), and verify the results using an adaptive Hough transform (AHT), the algorithm of which is described in detail in Section IV-2. With this approach, we achieved a detection rate of 98%.
Fig. 5. Sequence of detecting and recognizing small signs: (a) detected arrow, (b) selected candidate ROI from the detected arrow, (c) result of a blob analysis on ROI, and (d) recognition results.
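The locally adaptive binarization step can be sketched as follows, using the standard Sauvola threshold T = m · (1 + k · (s / R − 1)), where m and s are the mean and standard deviation in a window around each pixel. The window size, k = 0.5, and R = 128 are common defaults assumed here for illustration; the values used in the paper are not stated. The output marks pixels that are brighter than their local threshold.

```python
import math

def sauvola_binarize(image, window=15, k=0.5, r=128.0):
    """Binarize a grayscale image (list of rows) with Sauvola's local threshold.

    Returns 1 where a pixel is brighter than its local threshold, else 0.
    Windows are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [image[j][i]
                      for j in range(max(0, y - half), min(height, y + half + 1))
                      for i in range(max(0, x - half), min(width, x + half + 1))]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            # Sauvola: the threshold drops below the local mean when contrast is low.
            threshold = mean * (1.0 + k * (math.sqrt(variance) / r - 1.0))
            row.append(1 if image[y][x] > threshold else 0)
        output.append(row)
    return output
```

This direct version recomputes the window statistics for every pixel; for larger regions, integral images are a common way to reduce the cost.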
- 3. Foreground/Background Separation
The purpose of this step is to perform foreground/background separation. In Fig. 6, we show two types of exit numbers with different foreground/background structures. Our detection algorithm provides these two types of areas, and we then have to separate each type.
Fig. 6. Two types of exit numbers with different backgrounds.
In our previous research, we utilized the Otsu threshold, which automatically finds the optimal threshold value that minimizes the intra-class variance of the white and black pixels. For our system, this method binarized a candidate region better than Niblack's method [12]. However, Otsu’s method occasionally did not work well with small signs under varying types of illumination. Moreover, the method assumes that the image histogram is bimodal; hence, it breaks down if the two classes are of very unequal size. To cope with this problem, we used Sauvola’s algorithm (explained in Section III-2) when the width or height of the detected region was less than 37 pixels. The results of this algorithm are shown in Fig. 7.
Fig. 7. Comparison of binarization with small signs.
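A minimal sketch of Otsu's method and of the size-based switch to Sauvola's algorithm is given below. Minimizing the intra-class variance is equivalent to maximizing the between-class variance, which is what the loop does. The function names are hypothetical, and the 37-pixel limit is the only value taken from the text.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    `pixels` is a flat list of integer intensities in [0, levels).
    Pixels <= t form one class, pixels > t the other.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    total_sum = sum(i * count for i, count in enumerate(histogram))
    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue
        weight_high = total - weight_low
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_variance:
            best_variance, best_threshold = between, t
    return best_threshold

def pick_binarizer(region_width, region_height, limit=37):
    """Use the local method for small regions, the global one otherwise."""
    if region_width < limit or region_height < limit:
        return "sauvola"
    return "otsu"
```

The second function only encodes the selection rule stated in the text; the binarization itself would be delegated to the chosen method.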
Next, in the segmentation step, we performed the CCA (explained in Section III-2) to eliminate non-sign blobs from the labeled blobs. A blob was eliminated under the following conditions:
  • Too small a blob: W < 0.33 × IW, H < (0.33 × IH or 9 pixels).
  • Too small or too large an area compared to the size of the bounding box: A < 0.18 × W × H, A > 0.62 × W × H.
Here, the width (W), height (H), area (A), and center of the bounding box (X, Y) of the blobs are used, and IW and IH are the width and height of the detected sign region. When no blob remained after these conditions were applied, we rejected the detected sign region. Finally, the selected blob was normalized to 30 pixels × 30 pixels, and a median filter was then used to smooth the blob.
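The blob analysis can be sketched as a connected-component pass followed by the size and area tests. This is one possible reading of the conditions, stated as an assumption: a blob is discarded if any single test fires, and the height test is taken to mean H < 0.33 × IH or H < 9 pixels. The 8-connectivity and the function names are also choices made for this example.

```python
from collections import deque

def connected_components(binary):
    """Find 8-connected foreground blobs; return bounding-box stats per blob."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xs, ys = [], []
            while queue:
                x, y = queue.popleft()
                xs.append(x)
                ys.append(y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            blobs.append({"W": max(xs) - min(xs) + 1,
                          "H": max(ys) - min(ys) + 1,
                          "A": len(xs)})
    return blobs

def is_sign_blob(blob, region_w, region_h):
    """Apply the size and fill-ratio tests (assumed reading, see lead-in)."""
    w, h, a = blob["W"], blob["H"], blob["A"]
    too_small = w < 0.33 * region_w or h < 0.33 * region_h or h < 9
    bad_fill = a < 0.18 * w * h or a > 0.62 * w * h
    return not (too_small or bad_fill)
```

A detected region would then be rejected when `is_sign_blob` returns False for every blob found in it.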
IV. Sign Recognition
- 1. Learning-Based Recognition
A. SVM-Based Recognition
In our previous work, we adopted the gradient-based feature extraction method proposed by Liu [13]. For this gradient feature extraction, we computed the gradient at each pixel and used it to cast a vote, weighted by the gradient magnitude, into an orientation histogram with eight orientation bins over the interval [0, 2π]. The gradient was computed using a Sobel operator, which approximates the horizontal and vertical derivatives. Next, to reduce the dimensionality of the feature vectors, the normalized 30 × 30 pixel blob from Section III-3 was divided into N × N blocks. Spatially connected pixels were merged into a block, so that the combination of the pixel histograms created the block histogram. In our work, the block size was 5 × 5; thus, the dimensionality of the feature vectors was 200 (block size (5) × block size (5) × bin size (8)).
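The following sketch shows one way to build such a feature vector. It assumes that the 30 × 30 blob is split into a 5 × 5 grid of blocks (6 × 6 pixels each), which is the reading that yields 5 × 5 × 8 = 200 dimensions; the border handling (edge replication) and the absence of any normalization are further assumptions, not details from the paper.

```python
import math

def gradient_histogram_feature(image, grid=5, bins=8):
    """Magnitude-weighted orientation histograms over a grid x grid layout.

    `image` is a square list of rows (for example 30 x 30).
    Returns a flat list of grid * grid * bins values.
    """
    size = len(image)
    cell = size // grid
    feature = [0.0] * (grid * grid * bins)

    def pixel(y, x):
        # Replicate the border so the 3 x 3 Sobel kernel is defined everywhere.
        return image[min(max(y, 0), size - 1)][min(max(x, 0), size - 1)]

    for y in range(size):
        for x in range(size):
            # Sobel approximations of the horizontal and vertical derivatives.
            gx = (pixel(y - 1, x + 1) + 2 * pixel(y, x + 1) + pixel(y + 1, x + 1)
                  - pixel(y - 1, x - 1) - 2 * pixel(y, x - 1) - pixel(y + 1, x - 1))
            gy = (pixel(y + 1, x - 1) + 2 * pixel(y + 1, x) + pixel(y + 1, x + 1)
                  - pixel(y - 1, x - 1) - 2 * pixel(y - 1, x) - pixel(y - 1, x + 1))
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            orientation = int(angle / (2 * math.pi) * bins) % bins
            block = min(y // cell, grid - 1) * grid + min(x // cell, grid - 1)
            feature[block * bins + orientation] += magnitude
    return feature
```

For a 30 × 30 input the returned list has 200 entries, matching the dimensionality quoted above.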
In the recognition step, the implementation of SVMs was based on the LIBSVM library, and we were therefore able to employ a multiclass classification using SVMs with a radial basis function (RBF) kernel, defined formally as [14]
$$K(X_i, X_j) = \exp\left(-\gamma \lVert X_i - X_j \rVert^2\right).$$
In this work, we trained the classifier on 38,250 sample images, including 2,250 non-sign (noise) images, which were selected randomly from natural-scene images in subway stations.
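As a small worked example of the RBF kernel, the function below evaluates it for two feature vectors. The value of γ in the usage line is arbitrary; the paper does not report the value used with LIBSVM.

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """Evaluate exp(-gamma * ||x_i - x_j||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)

# Identical vectors give 1.0; the value decays as the vectors move apart.
similarity = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```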
B. HMM-Based Recognition
In the preceding subsection, we showed that an SVM-based classifier can classify sign and non-sign candidates well. However, we cannot train on all samples found in a real environment, as there is a wide variety of signs and non-signs in subway stations. We therefore also used an HMM-based classifier [15], which has several advantages. First, this classifier is a stochastic model and is robust to noise and shape variations. Second, we can use the log-likelihood of the HMM as a verification technique to discriminate signs from non-signs (noise).
In the first step, to obtain the skeleton of a normalized image, we compared two different thinning algorithms: Zhang’s [16] and Ahmed’s [17]. We found Ahmed’s thinning algorithm to be more suitable for our work because it preserves the shape of the normalized image in a way that is invariant to rotation, as shown in Fig. 8. This method is a rule-based thinning algorithm that generates a single-pixel-wide skeleton in a binary image and has the advantage of a low computational cost.
Fig. 8. Comparison of two thinning algorithms: (a) input image, (b) thinned image of (a) using Zhang’s algorithm, (c) thinned image of (a) using Ahmed’s algorithm, (d) rotated input image, (e) thinned image of (d) using Zhang’s algorithm, and (f) thinned image of (d) using Ahmed’s algorithm.
For this reason, we adopted Ahmed’s thinning method. In the next step, we created a chain code by tracing the skeleton of a thinned image and storing the tracking information in a vector. We then smoothed the chain code, with an effect similar to that of a median filter [18]. Next, we found the most important points, such as end, curve, and branch points, in the chain code. Finally, we created the weighted chain code features described in our previous study [12]. Figure 9 shows how to generate feature vectors from a chain code. This set of weighted features was used to create the model for each class using HMMs.
Fig. 9. Method for generating a feature vector: (a) chain code of eight directions with end, curve, and branch points; (b) change the important points of each feature value; (c) ordered feature set from (b); and (d) set of weighted feature values for the significant points.
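The chain-code step can be illustrated with a short sketch that converts an ordered skeleton path into eight-direction codes. The direction numbering (0 = east, counter-clockwise on screen with y growing downward) is an assumed convention, and `direction_changes` is only a rough stand-in for locating candidate curve points; the weighting scheme of [12] is not reproduced here.

```python
# Assumed convention: 0 = east, then counter-clockwise on screen.
# Image y grows downward, so "north" corresponds to dy = -1.
DIRECTIONS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
              (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(path):
    """Convert an ordered list of 8-connected (x, y) points into codes 0..7."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(path, path[1:])]

def direction_changes(codes):
    """Indices at which the code differs from the previous one."""
    return [i for i in range(1, len(codes)) if codes[i] != codes[i - 1]]
```

For example, a path that runs east, then diagonally up, then straight up yields the codes 0, 1, 2, and each change of code marks a place where the stroke bends.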
- 2. Knowledge-Based Verification
Despite the high classification rates, some non-signs were still erroneously classified as signs (false positives). For this reason, in addition to the log-likelihood of the HMM, we introduced two complementary techniques: an HT and color information.
A. HMM Log-Likelihood Verification
In our previous research, we introduced the log-likelihood of the HMM as a verification technique to discriminate signs from non-sign regions [12] . The decision to accept or reject was based on the log-likelihood score of the HMM [19] . To obtain the optimal threshold, we examined 40,000 images (2,500 per class). However, we still experienced erroneous classifications of non-signs.
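The accept/reject decision can be sketched with the scaled forward algorithm for a discrete HMM. The model layout (start, transition, and emission probability tables), the discrete observation symbols, and the threshold value are all assumptions for illustration; the paper only states that the threshold was tuned on 40,000 images.

```python
import math

def hmm_log_likelihood(observations, start, transition, emission):
    """Log P(observations | model) via the scaled forward algorithm.

    start[s], transition[r][s], and emission[s][symbol] are probabilities.
    """
    n_states = len(start)
    alpha = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    log_likelihood = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [sum(alpha[r] * transition[r][s] for r in range(n_states))
                     * emission[s][observations[t]] for s in range(n_states)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale at every step to avoid numerical underflow.
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_likelihood

def verify_sign(observations, model, threshold):
    """Accept the candidate only if its log-likelihood reaches the threshold."""
    start, transition, emission = model
    return hmm_log_likelihood(observations, start, transition, emission) >= threshold
```

In practice, the log-likelihood is often normalized by the sequence length before thresholding so that long and short chain codes are comparable; whether the authors did so is not stated.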
B. Number Verification
The HT was used to verify the exit number signs, as all such signs at subway stations in Daejeon have a circular shape. The circle equation can be described as follows:
$$(x - a)^2 + (y - b)^2 = r^2,$$
where (a, b) is the coordinate of the circle’s center and r is the radius of the circle. To find circles in the sign regions, we utilized the AHT, which reduces both storage and computational requirements compared to the HT [20]. This algorithm is divided into two steps. The first step involves a two-parameter HT to find the center of the circle, and the second step involves a one-dimensional HT, which is a simple histogram used to identify the radius of the circle. To evaluate the performance of this method, we tested 1,500 samples of exit numbers and compared the results with those obtained using the ellipse detection method in [21]. The detection rate of the AHT was 99.8%, whereas the ellipse detection method yielded only 96.2%. This means that the AHT method is more suitable than the ellipse detection method, even though most of the images collected from the hat with two embedded webcams (Fig. 3) are slightly skewed. Figure 10(a) shows the detected circle using the AHT.
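The two-step structure (a two-parameter vote for the center, then a one-dimensional radius histogram) can be sketched as below. This is a plain, non-adaptive version: the coarse-to-fine accumulator refinement that makes the AHT of [20] adaptive is omitted, and the unit gradient direction at each edge point is assumed to be available from an earlier edge-detection step.

```python
import math
from collections import Counter

def find_circle(edge_points, max_radius):
    """Estimate (a, b, r) from edge points given as (x, y, gx, gy).

    (gx, gy) is the unit gradient direction at the edge point. On a circle
    it points along the radius, so every such line passes through the center.
    """
    # Step 1: two-parameter accumulator for the center (a, b).
    centre_votes = Counter()
    for x, y, gx, gy in edge_points:
        for t in range(-max_radius, max_radius + 1):
            centre_votes[(round(x + t * gx), round(y + t * gy))] += 1
    (a, b), _ = centre_votes.most_common(1)[0]

    # Step 2: one-dimensional histogram of distances to the chosen center.
    radius_votes = Counter(round(math.hypot(x - a, y - b))
                           for x, y, _, _ in edge_points)
    r, _ = radius_votes.most_common(1)[0]
    return a, b, r
```

Separating the center and radius searches keeps the accumulators two- and one-dimensional rather than three-dimensional, which is the storage saving the text refers to.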
C. Arrow Verification
In this stage, color information is used to verify the arrow signs. We found that all of the signs can be classified into four different types, as shown in Fig. 10(b) .
To use the color information, we adopted a multicolor model based on a hue-saturation-value (HSV) color space, which is more robust against illumination changes than the RGB color space [22].
Fig. 10. Number verification techniques: (a) result of AHT and (b) a variety of signs from a subway station in Daejeon.
A histogram of the HSV color space was made of $N = N_h N_s + N_v$ bins, and we denoted $b_t(p) \in \{1, \ldots, N\}$ as the bin index associated with the color vector $y_t(p)$ at pixel location $p$ in sign image $t$. Here, the number of bins $N$ is 110, where $N_h$, $N_s$, and $N_v$ are set to 10. A histogram of four different signs is shown in Fig. 11.
Fig. 11. Color (HSV) histograms of the four different signs.
To compare the color model between the digit and arrow, we defined the candidate region as $R(x_t)$ of the state vector $x_t$, while the kernel density estimate $q_t(x) = \{q_t(n; x)\}_{n=1,\ldots,N}$ of the color distribution at time $t$ was then composed using [23]
$$q_t(n; x) = C \sum_{p \in R(x)} \delta[b_t(p) - n],$$
where $\delta$ is the Kronecker delta function, $C$ is a normalizing constant that ensures $\sum_{n=1}^{N} q_t(n; x) = 1$, and $p$ is any pixel within $R(x_t)$. The resulting $N$-bin histogram is used as the feature vector to train an SVM classifier. In our work, we trained on 16,560 samples of four different signs and tested on the other 1,840 samples to evaluate the performance of the classifier. The classification rate was 99.72%, and only five samples were misclassified.
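A sketch of the 110-bin histogram is given below. It follows the layout used in the color-based tracking work cited as [22]: pixels with enough saturation and value populate the hue-saturation bins, and the remaining, nearly colorless pixels populate the value-only bins. The two cut-offs (0.1 and 0.2) and the 0-based bin indices are assumptions for this example and are not stated in the paper.

```python
import colorsys

def hsv_histogram(pixels, n_h=10, n_s=10, n_v=10, s_min=0.1, v_min=0.2):
    """Normalized HSV histogram with n_h * n_s + n_v bins.

    `pixels` is an iterable of (r, g, b) tuples with components in [0, 1].
    """
    n_bins = n_h * n_s + n_v
    counts = [0] * n_bins
    total = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        if s > s_min and v > v_min:
            # Colored pixel: joint hue-saturation bin.
            h_bin = min(int(h * n_h), n_h - 1)
            s_bin = min(int(s * n_s), n_s - 1)
            index = h_bin * n_s + s_bin
        else:
            # Nearly gray or dark pixel: value-only bin.
            index = n_h * n_s + min(int(v * n_v), n_v - 1)
        counts[index] += 1
        total += 1
    # Dividing by the pixel count plays the role of the constant C.
    return [c / total for c in counts] if total else [0.0] * n_bins
```

The returned list sums to 1 and can be passed directly to a classifier as a 110-dimensional feature vector.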
Before the verification techniques were applied, we combined the recognition results from the two types of classifiers, SVM and HMM, to obtain a lower false-positive rate, because a low false-positive rate is much more important than a high true-positive rate when informing a visually impaired person of sign information. For this work, we adopted Wu’s pairwise coupling method using the SVM [24]. This method estimates the probabilities for multiclass classification by combining all pairwise comparisons of binary SVM classifiers. When the SVM recognized a candidate as an exit number or an arrow (that is, not noise) but the estimated probability was less than a certain threshold, we rejected the result from the SVM. However, when the recognition results from the SVM and HMM were the same, we accepted the result without reference to the multiclass probabilities of the SVM.
In Table 1, we compare four different combinations of verification techniques. Table 1 shows the variation in the average recognition rate and the false-positive error (FPE) across the four cases. As expected, adopting the proposed verification rules yielded higher recognition rates and lower FPE rates. In addition, Fig. 12 shows how the recognition rate varies for each sign.
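The combination rule can be written as a small decision function. The label strings and the threshold value are hypothetical, and the final fallback (keeping the SVM label when the classifiers disagree but the SVM probability is high) is an assumption, since the text does not spell out that case.

```python
NOISE = "noise"  # hypothetical label for the non-sign class

def combine_svm_hmm(svm_label, svm_probability, hmm_label, threshold=0.5):
    """Combine the SVM and HMM decisions for one candidate region."""
    if svm_label == hmm_label:
        # Agreement: accept without consulting the SVM probability.
        return svm_label
    if svm_label != NOISE and svm_probability < threshold:
        # Disagreement and a weak SVM score for a sign class: reject.
        return NOISE
    # Assumed fallback: keep the SVM decision.
    return svm_label
```

Rejecting low-confidence disagreements trades a few missed signs for fewer false announcements, which matches the priority stated in the text.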
Table 1. Average recognition rates and FPE of four different cases.
Verification case                           FPE      Average recognition rate (%)
SVM+HMM                                     1.087    97.9
SVM+HMM+log-likelihood                      0.604    98.2
SVM+HMM+log-likelihood+number               0.257    98.9
SVM+HMM+log-likelihood+number+arrow         0.152    99.5
Fig. 12. Fluctuation in detailed recognition rate in accordance with each sign.
Finally, we show misclassified examples, which can occur for several different reasons, including changes in illumination, shadow, motion blur, and changes in pose (see Table 2 ).
Table 2. Examples of misclassified sign candidates (original and binarized images omitted).
Recognition result    Correct recognition result    Cause of misclassification
Noise                 Number 2                      Reflection
Number 2              Number 1                      Reflection
Number 6              Number 5                      Motion blur
Number 7              Number 1                      Pose changes/motion blur
Noise                 Arrow down-left               Illumination
Noise                 Arrow up                      Shadow
V. Conclusion
We described a walking guidance system that can detect and recognize sixteen classes of exit numbers and arrow signs. Our system may be useful for assisting the visually impaired when finding their way through a subway station. The main contributions of this paper are divided into two parts. First, we exploited three verification methods: (a) HMM log-likelihood, (b) the Hough transform, and (c) color information, as described in Section IV-2. This significantly decreased the false-positive rate and hence increased the reliability of the system. Second, we suggested a detection method for small signs. As described in Section III, our detector for small signs extracted an additional 1,057 candidate regions (including three noise regions) that did not overlap with the regions detected by the MCT-AdaBoost detector. Additionally, the recognition rate for small signs was 99.5%. These results indicate the robustness of our detector for small and geometrically distorted signs.
However, some improvements can be considered in future work. First, it is necessary to add more classes, such as toilet, elevator, and ticket-office pictograms in subway stations, allowing the proposed system to provide more subway-sign information. Second, the system must work in real time. The processing speed of our system is currently about 7 fps to 15 fps, depending on how many sign candidates are extracted from natural scenes. We will therefore also work on improving the processing time.
This research was supported by the Converging Research Center Program through the Ministry of Science, ICT and Future Planning, Korea (2013K000329) and the R&D program of MOTIE & KEIT [10041610, The development of the recognition technology for user identity, behavior, and location that has a performance approaching recognition rates of 99% on 30 people by using perception sensor network in the real environment].
Dongjin Lee received his MS degree in computer software engineering (image processing) from the University of Science and Technology, Daejeon, Rep. of Korea, in 2013. From 2008 to 2010, he was an engineer at the Samsung S1 Corporation, Seoul, Rep. of Korea. He has been a research scientist at the Electronics and Telecommunications Research Institute, Daejeon, Rep. of Korea, since 2013. He is working toward his PhD degree in electronics at Chungnam National University, Daejeon, Rep. of Korea. His research interests are in computer vision, machine learning, and affective computing.
Corresponding Author
Hosub Yoon received his PhD degree in computer science (image processing) from the Korea Advanced Institute of Science and Technology, Daejeon, Rep. of Korea, in 2003. In 1991, he was with the System Engineering Research Institute, Korea Institute of Science and Technology, Seoul, Rep. of Korea. Since 1998, he has been a research scientist at the Electronics and Telecommunications Research Institute, Daejeon, Rep. of Korea. His research interests include HRI, image processing, pattern matching, and speech processing.
Myung-Ae Chung received her BS and MS degrees in chemistry from Ewha Womans University, Seoul, Rep. of Korea, in 1986 and 1988, respectively. She moved to Germany during her PhD course at Ewha Womans University and received her PhD degree in physical chemistry from Clausthal Technical University, Lower Saxony, Germany, in 1997. From 1998 to 1999, she was a member of the research staff at the Max-Planck Institute for Polymer Research, Mainz, Germany. She joined the Electronics and Telecommunications Research Institute, Daejeon, Rep. of Korea, in 2000 and is engaging in research projects on IT convergence technologies, such as in nano-bio materials and in cognition systems.
Jaehong Kim received his PhD degree in computer engineering from Kyungpook National University, Daegu, Rep. of Korea, in 2006. He has been a research scientist at the Electronics and Telecommunications Research Institute, Daejeon, Rep. of Korea, since 2001. His research interests include socially assistive robotics for the elderly, human-robot interaction, and gesture/activity recognition.
References
[1] Gerónimo D. 2013 “Traffic Sign Recognition for Computer Vision Project-Based Learning,” IEEE Trans. Educ. 25 (3) 364–371 DOI: 10.1109/TE.2013.2239997
[2] Park H.S. 2013 “In-Vehicle AR-HUD System to Provide Driving-Safety Information,” ETRI J. 35 (6) 1038–1047 DOI: 10.4218/etrij.13.2013.0041
[3] Müller-Schneiders S., Nunny C., Meuter M. “Performance Evaluation of a Real-Time Traffic Sign Recognition System,” IEEE Intell. Veh. Symp. Eindhoven, Netherlands June 4–6, 2008 79–84 DOI: 10.1109/IVS.2008.4621164
[4] Lee J. “AdaBoost for Text Detection in Natural Scene,” Int. Conf. Document Anal. Recogn. Beijing, China Sept. 18–21, 2011 429–434 DOI: 10.1109/ICDAR.2011.93
[5] Choi K. 2014 “State Machine and Downhill Simplex Approach for Vision-Based Nighttime Vehicle Detection,” ETRI J. 36 (3) 439–449 DOI: 10.4218/etrij.14.0113.0509
[6] Hanif S.M., Prevost L. 2009 “Text Detection and Localization in Complex Scene Images Using Constrained AdaBoost Algorithm,” Int. Conf. Document Anal. Recogn. Barcelona, Spain July 26–29, 2009 1–5 DOI: 10.1109/ICDAR.2009.172
[7] Bahlmann C. “A System for Traffic Sign Detection, Tracking, and Recognition Using Color, Shape, and Motion Information,” IEEE Intell. Veh. Symp. Las Vegas, NV, USA June 6–8, 2005 255–260 DOI: 10.1109/IVS.2005.1505111
[8] Froba B., Ernst A. “Face Detection with the Modified Census Transform,” IEEE Conf. Autom. Face Gesture Recogn. Seoul, Rep. of Korea May 17–19, 2004 91–96 DOI: 10.1109/AFGR.2004.1301514
[9] Ban K-D. 2011 “Tiny and Blurred Face Alignment for Long Distance Face Recognition,” ETRI J. 33 (2) 251–258 DOI: 10.4218/etrij.11.1510.0022
[10] Sauvola J., Pietikainen M. 2000 “Adaptive Document Image Binarization,” Pattern Recogn. 33 (1) 149–160 DOI: 10.1016/S0031-3203(99)00055-2
[11] Yoon Y. 2013 “Best Combination of Binarization Methods for License Plate Character Segmentation,” ETRI J. 35 (3) 491–500 DOI: 10.4218/etrij.13.0112.0545
[12] Lee D., Yoon H. “Sign Recognition with HMM/SVM Hybrid for the Visually-Handicapped in Subway Stations,” Int. Joint Conf. Comput. Intell. Barcelona, Spain Oct. 5–9, 2012 631–634 DOI: 10.5220/0004155006310634
[13] Liu C-L. 2008 “Handwritten Chinese Character Recognition: Effects of Shape Normalization and Feature Extraction,” Lecture Notes Comput. Sci. 4768 104–128 DOI: 10.1007/978-3-540-78199-8_7
[14] Chang C.C., Lin C.J. 2011 “LIBSVM: A Library for Support Vector Machines,” ACM Trans. Intell. Syst. Technol. 2 (27) 2–27 DOI: 10.1145/1961189.1961199
[15] Schlapbach A., Bunke H. “Off-line Handwriting Identification Using HMM Based Recognizers,” Int. Conf. Pattern Recogn. Cambridge, UK Aug. 23–26, 2004 654–658 DOI: 10.1109/ICPR.2004.1334343
[16] Zhang T.Y., Suen C.Y. 1984 “A Fast Parallel Algorithm for Thinning Digital Patterns,” Commun. ACM 27 (3) 236–239 DOI: 10.1145/357994.358023
[17] Ahmed M., Ward R. 2002 “A Rotation Invariant Rule-Based Thinning Algorithm for Character Recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 24 (12) 1672–1678 DOI: 10.1109/TPAMI.2002.1114862
[18] Kim J., Yoon H. “Graph Matching Method for Character Recognition in Natural Scene Images: A Study of Character Recognition in Natural Scene Image Considering Visual Impairments,” IEEE Int. Conf. Intell. Eng. Syst. Poprad, Slovakia June 23–25, 2011 347–350 DOI: 10.1109/INES.2011.5954771
[19] Van B.L., Garcia-Salicetti S., Dorizzi B. 2004 “Fusion of HMM’s Likelihood and Viterbi Path for On-line Signature Verification,” Springer Berlin Heidelberg 3087 318–331 DOI: 10.1007/978-3-540-25976-3_29
[20] Illingworth J., Kittler J. 1987 “The Adaptive Hough Transform,” IEEE Trans. Pattern Anal. Mach. Intell. 9 (5) 690–698 DOI: 10.1109/TPAMI.1987.4767964
[21] Xie Y., Ji Q. “A New Efficient Ellipse Detection Method,” Int. Conf. Pattern Recogn. Quebec, Canada Aug. 11–15, 2002 957–960 DOI: 10.1109/ICPR.2002.1048464
[22] Perez P. “Color-Based Probabilistic Tracking,” European Conf. Comput. Vis. Copenhagen, Denmark May 28–31, 2002 661–675 DOI: 10.1007/3-540-47969-4_44
[23] Comaniciu D., Ramesh V., Meer P. “Real-Time Tracking of Non-Rigid Objects Using Mean Shift,” Comput. Vis. Pattern Recogn. Hilton Head, SC, USA June 13–15, 2000 2 142–149 DOI: 10.1109/CVPR.2000.854761
[24] Wu T., Lin C., Weng R.C. 2004 “Probability Estimates for Multiclass Classification by Pairwise Coupling,” J. Mach. Learning Res. 52 975–1005