We present a scalable object tracking framework that is capable of removing shadows and tracking people. The framework consists of background subtraction, fuzzy-based shadow removal and a boundary tracking algorithm. This work proposes a general-purpose method that combines statistical assumptions with object-level knowledge of moving objects, apparent objects, and shadows acquired while processing the previous frames. Pixels belonging to moving objects and shadows are processed differently in order to supply an object-based selective update. Experimental results demonstrate that the proposed method is able to track object boundaries under significant shadows with noise and background clutter.
Video surveillance systems rely on the ability to detect moving objects in the video stream, which is a relevant information extraction step in a wide range of computer vision applications. This should be done in a reliable and effective way in order to cope with unconstrained conditions such as non-stationary backgrounds, shadows and other noisy environments. The scientific challenge is to devise and implement automatic systems able to detect and track moving objects, and to interpret their behaviours and activities.
preservation that will be specifically addressed in the sequel include illumination changes, moving background, cast shadows, bootstrapping and camouflage. More generally, Elgammal, Harwood and Davis proposed a nonparametric estimation method for modelling the background. They used kernel density estimation (KDE) to establish the local membership of a pixel. Besides the temporally local information, spatially global cues may provide the pixel model with complementary evidence for foreground segmentation.
Cucchiara, Grana, Piccardi, and Prati used temporal median filtering in the RGB color space to produce a background model, and explored the hue, saturation and value (HSV) color space for shadow detection, classifying as shadows those pixels having approximately the same hue and saturation values as the background but lower luminosity. Salvador, Cavallaro, and Ebrahimi presented a method for shadow identification suited for both still images and video sequences. In particular, their approach for video sequences consists of an initial hypothesis based on RGB differences between each frame and the reference image, and a validation stage that exploits photometric and geometric properties of shadows.
Leone and Distante proposed a shadow detection algorithm for intensity images based on texture analysis. In their approach, patches of each new frame are compared with the respective patches of the background model, and Gabor functions are used to detect whether the textural information remains the same. YingLi Tian, Haowei Liu and Ming-Ting Sun proposed a background model based on Gaussian mixtures. To handle complex situations, several improvements are implemented for shadow removal, quick lighting-change adaptation, fragment reduction, and keeping a stable update rate for video streams with different frame rates. Rittscher, Kato, Joga, and Blake used both a hidden Markov model (HMM) and a Markov random field (MRF) for foreground and shadow segmentation. In their work, each site (or block) is modelled by a single HMM independent of the neighbouring sites (or blocks). The HMM and the MRF are employed in two different processes to impose temporal and spatial contextual constraints, respectively.
In this work, we provide fuzzy-based shadow removal and integrated boundary detection for video surveillance. The main features of the proposed method are the following:
1) Background subtraction,
2) Edge detection,
3) Image denoising,
4) Shadow removal based on fuzzy logic,
5) Boundary detection, and
6) People tracking.
The paper is organized as follows: Motivation and related works are illustrated in Section 2. The proposed method is presented in Section 3. Experimental results using the proposed system are described in Section 4. Section 5 concludes this paper and discusses future work.
2. Motivation and Related Works
In the literature, video object tracking has been intensively studied and many effective methods have been proposed. For single-target tracking, various object appearance models and motion models are exploited to estimate the target state (location, velocity, etc.). Recently, a class of techniques called “tracking by detection” has been shown to provide promising results. Texture-based analysis is used to differentiate the background from the shadow regions of the input sequence, and is among the most promising approaches to shadow detection. The foreground contains the objects of interest and the background is its complementary set. What a background subtraction technique should identify as a foreground region depends on the application, since the definition of foreground objects relates to the application level.
Cast shadows produce troublesome effects, typically for object tracking from a fixed viewpoint, since they yield appearance variations of objects depending on whether they are inside or outside the shadow. Matsushita et al. proposed a framework based on the idea of intrinsic images to handle such appearance variations by removing shadows in the image sequence. Wang, Tan, Loe and Wu proposed a probabilistic approach for background subtraction and shadow removal. In their method, a combined intensity and edge measure was used for shadow detection, and temporal continuity was used to improve detection results. The results of their method are good, but the determination of several parameters needed by their model increases the computational cost of the method. An adaptive background model based on Gaussian mixtures has also been proposed, together with a local normalized cross-correlation metric to detect shadows and a texture similarity metric to detect illumination changes. In another method, the foreground is separated from the shadow by dynamically integrating gradient and intensity information. Line-based algorithms have been proposed to improve the accuracy of shadow elimination. A simple statistical model has been proposed for background separation, with the standard deviation of pixel ratios in small neighborhoods explored for shadow identification. The MoG model has been used for background subtraction together with “ratio edges” for shadow identification and removal; the core of that approach to shadow identification is to compute local ratios, which are modeled as chi-squared distributions in shadowed regions.
Leone and Distante proposed a shadow detection algorithm for intensity images based on texture analysis. In their approach, patches of each new frame were compared with the respective patches of the background model, and Gabor functions were used to detect whether the textural information remains the same for shadowed regions. A moving cast shadow detection algorithm that combines shading, color, texture, neighborhoods, and temporal consistency in the scene has also been proposed. Yang Wang presented an approach to moving vehicle detection and cast shadow removal for video-based traffic monitoring. He developed a computationally efficient algorithm to discriminate moving cast shadows and handle non-stationary background processes for real-time vehicle detection in video streams. A statistical learning-based approach has been proposed to learn and remove cast shadows. Jung proposed a new method for background subtraction and shadow removal for grayscale video sequences. The background image was modeled using robust statistical descriptors, and a noise estimate was obtained. Foreground pixels were extracted, and a statistical approach combined with geometrical constraints was adopted to detect and remove shadows.
Besides chroma information, stereo information has also been combined to remove shadows. However, this kind of method needs multiple cameras and complicated camera calibration. A framework has been described for the detection and suppression of shadowed regions in most scenarios occurring in real video sequences. This technique can detect both achromatic and chromatic shadows even in the presence of camouflage, which occurs when foreground regions are very similar in color to shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by the values of the current frame in the RGB color space. A method for shadow removal using Markov random fields (MRF) has also been presented. In this method, the shadow model is first constructed in a hierarchical manner: at the pixel level, a Gaussian mixture model is used to model the behavior of cast shadows for every pixel in the HSV color space. Second, an MRF model is constructed to represent the dependencies between the label of a pixel and the shadow models of its neighbors.
In this paper we use fuzzy-based shadow removal and object tracking with Kalman filters. This efficient combination of shadow removal and object tracking achieves good accuracy. The proposed method is described in detail in the next section.
3. Proposed System
The proposed method is summarized in a block diagram, which consists of the following major blocks: background subtraction, edge detection, image denoising, shadow removal, boundary detection and people tracking. These blocks are explained in the following subsections.
Proposed shadow removal method for people tracking
- 3.1 Background subtraction
The foreground objects can be segmented from the background frame of the current video sequence by several algorithms, such as MoG, fuzzy methods and thresholding. We use the following method to segment the foreground pixels: each pixel of the image under test is classified using the background model. The background scene model B(x) is represented by the minimum m(x) and maximum n(x) intensity values and the maximum difference d(x) between consecutive frames; using these, a pixel x from image I is classified as a foreground pixel by a threshold test.
Here B(x, y) is the background pixel, I(x, y) is the difference between the current frame and the background frame, m(x) is the minimum intensity value, n(x) is the maximum intensity value, and d(x) is the maximum difference between consecutive frames. k is the threshold constant; in this case we consider k ranging from 150 to 180.
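The thresholding step can be illustrated with a minimal sketch. It assumes the simplest reading of the definitions above: a pixel is marked as foreground when the absolute difference between the current frame and the background frame exceeds k. The function name `foreground_mask`, the use of plain nested lists for grayscale frames, and the exact form of the test are assumptions made for illustration, not the authors' implementation.

```python
def foreground_mask(frame, background, k=150):
    """Return a binary mask: 1 where |frame - background| exceeds k, else 0.

    frame and background are 2D lists of grayscale intensities (0..255).
    The rule |I - B| > k is an assumed form of the threshold test.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(f - b) > k else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Hypothetical 2x3 frames: two pixels differ strongly from the background.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [250, 10, 12]]
mask = foreground_mask(frame, background, k=150)
```

Raising k toward 180 keeps only pixels with a larger intensity change, which trades missed foreground pixels for fewer false detections.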
- 3.2. Edge Detection
We use the Canny edge detection algorithm to detect the edges and fill the image regions using morphological operations. The purpose of edge detection is to reduce the amount of data in the image while preserving its structural properties for further processing.
- 3.3 Image denoising
Generally, denoising is applied after moving object detection to improve accuracy. However, this can cause some moving objects to be mistaken for noise and deleted. Therefore, we use median filtering to remove noise in the brightness distortion and chromaticity distortion of each frame before moving shadow removal and object detection. This reduces the influence of noise on moving object detection, removes most of the noise, and increases detection accuracy while protecting the moving objects. Median filtering of the brightness distortion matrix also reduces the squared error of the brightness distortion. Median filtering is a smoothing technique that is effective at removing noise in smooth patches or regions of a signal while preserving edges, which are of critical importance to the visual appearance of images. We used a 3×3 window-based median filter to remove noise from the background-subtracted images.
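A minimal sketch of a 3×3 median filter is given here for illustration. It operates on a plain 2D list and, as an assumption of this sketch, leaves the one-pixel image border unchanged; the function name `median_filter_3x3` is hypothetical.

```python
def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of pixel values.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = sorted(image[rr][cc]
                            for rr in (r - 1, r, r + 1)
                            for cc in (c - 1, c, c + 1))
            out[r][c] = window[4]  # middle of the 9 sorted values
    return out


# A single bright noise pixel in a dark 5x5 image is removed.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter_3x3(noisy)
```

Because the median of a window ignores isolated outliers, the noise pixel disappears while uniform regions are left untouched.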
- 3.4 Shadow removal using fuzzy
Shadow detection is the process of classifying foreground pixels as shadow pixels based on their appearance with respect to the reference frame and the background. The fuzzy shadow detection algorithm we have defined aims to prevent moving cast shadows from being misclassified as moving objects, thereby improving the background update and reducing the under-segmentation problem. The major problem is how to distinguish between moving cast shadows and moving object points. The fuzzy rule uses membership functions to check, iteratively over the given image, the possibility that a pixel is a shadow pixel. In our work the fuzzy rule is heuristically defined using two membership functions, which represent the background pixel distribution and the shadow pixel distribution, respectively. Each membership function has a corresponding membership value for every region, which indicates the degree of belonging to that region, in combination with the widely used fuzzy IF–THEN rule structure. B(i, j) denotes the background pixel, where i and j are the rows and columns of the image; E(i, j) denotes the edge pixels; and S(i, j) denotes the shadow pixels. The surface plot of the pixel distribution obtained with the fuzzy viewer is shown in the corresponding figure. The rules are:
IF B(i, j) == 1 AND E(i, j) == 1
THEN S(i, j) == 0
ELSEIF B(i, j) == 0 AND E(i, j) == 1
THEN S(i, j) == 0
ELSEIF B(i, j) == 0 AND E(i, j) == 0
THEN S(i, j) == 0
ELSEIF B(i, j) == 1 AND E(i, j) == 0
THEN S(i, j) == 1
Surface plot of pixel distribution using fuzzy viewer
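Read as crisp 0/1 values, the four IF–THEN rules mark a pixel as shadow only when B(i, j) = 1 and E(i, j) = 0. The sketch below encodes exactly that truth table; it does not model the membership functions, whose shapes are not given in the text, and the function name `shadow_mask` is hypothetical.

```python
def shadow_mask(B, E):
    """Crisp reading of the rule table: S = 1 only where B == 1 and E == 0.

    B and E are 2D lists of 0/1 values with the same shape.
    """
    return [[1 if b == 1 and e == 0 else 0
             for b, e in zip(b_row, e_row)]
            for b_row, e_row in zip(B, E)]


# One pixel for each of the four rule combinations.
B = [[1, 0], [0, 1]]
E = [[1, 1], [0, 0]]
S = shadow_mask(B, E)
```

Only the last combination (B = 1, E = 0) produces a shadow pixel, matching the fourth rule.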
- 3.5 Boundary detection
The border tracing algorithm is used to extract the contours of the objects (regions) from an image. When applying this algorithm, it is assumed that the image with regions is either binary or that the regions have been previously labelled.
1. Search the image from the top left until a pixel of a new region is found; this pixel P0 is the starting pixel of the region border. Define a variable dir which stores the direction of the previous move along the border from the previous border element to the current border element. Assign
(i) dir = 0 if the border is detected in 4-connectivity, as shown in Fig. 3(a);
(ii) dir = 7 if the border is detected in 8-connectivity, as shown in Fig. 3(b).
2. Search the 3×3 neighbourhood of the current pixel in an anti-clockwise direction, beginning the neighbourhood search at the pixel positioned in the direction
(i) (dir + 3) mod 4, as shown in Fig. 3(c);
(ii) (dir + 7) mod 8 if dir is even, as shown in Fig. 3(d), or (dir + 6) mod 8 if dir is odd, as shown in Fig. 3(e).
The first pixel found with the same value as the current pixel is the new boundary element Pn; update dir accordingly.
3. If the current boundary element Pn is equal to the second border element P1 and the previous border element Pn−1 is equal to P0, stop. Otherwise repeat step 2.
4. The detected border is represented by the pixels P0 … Pn−2. Boundary tracing in 8-connectivity is shown in Fig. 3(f). The dashed lines in the figure show pixels tested during the border tracing.
Fig. 3. (a) Direction notation, 4-connectivity; (b) 8-connectivity; (c) pixel neighbourhood search sequence in 4-connectivity; (d) and (e) search sequence in 8-connectivity; (f) boundary tracing in 8-connectivity.
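The steps above can be sketched for the 8-connectivity case as follows. The sketch assumes a binary image stored as a 2D list with a single region, and assumes the usual direction coding in which 0 points right and the codes increase anti-clockwise (the figure defining the notation is not reproduced here). The names `OFFSETS` and `trace_border` are hypothetical.

```python
# (d_row, d_col) for directions 0..7: 0 = right, then anti-clockwise.
# This coding is an assumption about the direction notation of Fig. 3(b).
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]


def trace_border(image):
    """Trace the inner border of the first region found, in 8-connectivity."""
    rows, cols = len(image), len(image[0])
    # Step 1: scan from the top left for the first region pixel P0.
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if image[r][c]), None)
    if start is None:
        return []
    border = [start]
    direction = 7
    while True:
        r, c = border[-1]
        # Step 2: starting direction depends on the parity of dir.
        if direction % 2 == 0:
            first = (direction + 7) % 8
        else:
            first = (direction + 6) % 8
        for step in range(8):
            d = (first + step) % 8
            nr, nc = r + OFFSETS[d][0], c + OFFSETS[d][1]
            if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc]:
                border.append((nr, nc))
                direction = d
                break
        else:
            return border  # isolated pixel: no neighbour found
        # Step 3: stop when Pn == P1 and Pn-1 == P0.
        if len(border) > 2 and border[-1] == border[1] and border[-2] == border[0]:
            return border[:-2]  # Step 4: border is P0 .. Pn-2


# A 2x2 square region inside a 4x4 image.
square = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
contour = trace_border(square)
```

For the square, the trace starts at the top-left region pixel and visits the four pixels anti-clockwise before the stop condition fires.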
- 3.6 People tracking
Establishing correspondence of connected components between frames is accomplished using a linearly predictive multiple hypotheses tracking algorithm which incorporates both position and size. We have implemented an online method for seeding and maintaining sets of Kalman filters, as explained below. The equations for the Kalman filter fall into two groups: time update equations and measurement update equations. The time update equations are responsible for projecting forward (in time) the current state and error covariance.
For each time step k, a Kalman filter first makes a prediction $\hat{x}_k^-$ of the state at this time step:

$$\hat{x}_k^- = A\,\hat{x}_{k-1} + B\,u_k$$

where $\hat{x}_{k-1}$ is a vector representing the process state at time k−1 and A is a process transition matrix. $u_k$ is a control vector at time k, which accounts for the action that the robot takes in response to state $x_k$; B converts the control vector $u_k$ into state space. In our model of moving objects on 2D camera images, the state is a 4-dimensional vector [x; y; dx; dy], where x and y represent the coordinates of the object’s center, and dx and dy represent its velocity. The transition matrix is thus simply

$$A = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
3.6.2. Error covariance prediction
The Kalman filter concludes the time update steps by projecting the estimate error covariance $P_k^-$ forward one time step:

$$P_k^- = A\,P_{k-1}\,A^T + Q$$

where $P_k^-$ is a matrix representing the error covariance in the state prediction at time k, $P_{k-1}$ is the error covariance from the previous time step, and Q is the process noise covariance. Intuitively, the lower the prediction error covariance $P_k^-$, the more we trust the prediction of the state $\hat{x}_k^-$. The prediction error covariance will be low if the process is precisely modelled, so that the entries of Q are fairly low. Unfortunately, determining Q for any process model is often difficult: Q depends on hard-to-predict variables such as how often the target object changes velocity.
3.6.3. Measurement update
After predicting the state $\hat{x}_k^-$ (and its error covariance) at time k using the time update steps, the Kalman filter next uses measurements to “correct” its prediction during the measurement update steps.
Kalman gain: First, the Kalman filter computes a Kalman gain $K_k$, which is later used to correct the state estimate $\hat{x}_k$:

$$K_k = P_k^- H^T \left( H P_k^- H^T + R_k \right)^{-1}$$

where H is a matrix converting state space into measurement space (discussed below), and $R_k$ is the measurement noise covariance. Like Q, determining $R_k$ for a set of measurements is often difficult; many Kalman filter implementations statically analyse training data to determine a fixed R for all future time updates. We instead allow R to be dynamically calculated from the measurement algorithms’ state. This procedure is detailed at the end of this section.
3.6.4 State update
Using the Kalman gain $K_k$ and the measurements $Z_k$ from this time step k, we can update the state estimate:

$$\hat{x}_k = \hat{x}_k^- + K_k \left( Z_k - H \hat{x}_k^- \right)$$

Conventionally, the measurements $Z_k$ are derived from sensors. In our approach, the measurements $Z_k$ are instead the output of various tracking algorithms given the same input: one frame of a streaming video, and the most likely (x, y) coordinates of the target object in this frame.
3.6.5 Error covariance update
The final step of the Kalman filter’s iteration is to update the error covariance $P_k^-$ into $P_k$:

$$P_k = \left( I - K_k H \right) P_k^-$$

The updated error covariance will be significantly decreased if the measurements are accurate (some entries in $R_k$ are low), or only slightly decreased if the measurements are noisy (all of $R_k$ is high). Kalman filters are easily able to take tracking algorithm outputs as measurements. However, the difficulty in combining arbitrary tracking algorithms as measurements comes from computing the Kalman gain: $R_k$, the measurement covariance matrix, is complex to resolve. Our approach to this problem computes an error or noise estimate for each tracking algorithm. This computation is trained based on regression of image features that represent each algorithm’s weaknesses. The features we propose are detailed below.
At each frame, we have an available pool of Kalman models and a new available pool of connected components that they could explain. First, the models are probabilistically matched to the connected regions that they could explain. Second, the connected regions which could not be sufficiently explained are checked to find new Kalman models. Finally, models whose fitness (as determined by the inverse of the variance of its prediction error) falls below a threshold are removed. Matching the models to the connected components involves checking each existing model against the available pool of connected components which are larger than a pixel or two. All matches are used to update the corresponding model. If the updated model has sufficient fitness, it will be used in the following frame. If no match is found a “null” match can be hypothesized which propagates the model as expected and decreases its fitness by a constant factor. The unmatched models from the current frame and the previous two frames are then used to hypothesize new models. Using pairs of unmatched connected components from the previous two frames, a model is hypothesized. If the current frame contains a match with sufficient fitness, the updated model is added to the existing models. To avoid possible combinatorial explosions in noisy situations, it may be desirable to limit the maximum number of existing models by removing the least probable models when excessive models exist.
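The predict and correct cycle of a single Kalman filter in Section 3.6 can be sketched in dependency-free Python. The sketch assumes the standard Kalman filter equations, no control input (the B u term is dropped), and a measurement matrix H that observes only the position (x, y); these choices, the helper names, and the values of Q and R in the usage example are assumptions for illustration. Seeding, matching and removal of models are not covered.

```python
# Constant-velocity model for state [x, y, dx, dy], one frame per step.
A = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
# Assumption: only the position (x, y) is measured.
H = [[1, 0, 0, 0],
     [0, 1, 0, 0]]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def inv2(m):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def predict(x, P, Q):
    """Time update: project state and error covariance one step forward."""
    x_pred = matmul(A, x)  # no control input assumed
    P_pred = add(matmul(matmul(A, P), transpose(A)), Q)
    return x_pred, P_pred


def correct(x_pred, P_pred, z, R):
    """Measurement update: Kalman gain, state update, covariance update."""
    Ht = transpose(H)
    S = add(matmul(matmul(H, P_pred), Ht), R)
    K = matmul(matmul(P_pred, Ht), inv2(S))
    x = add(x_pred, matmul(K, sub(z, matmul(H, x_pred))))
    identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    P = matmul(sub(identity, matmul(K, H)), P_pred)
    return x, P


# Hypothetical usage: one object starting at (0, 0) with velocity (1, 1).
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Q = [[0.01 if i == j else 0 for j in range(4)] for i in range(4)]
R = [[1.0, 0.0], [0.0, 1.0]]
x_pred, P_pred = predict([[0], [0], [1], [1]], I4, Q)
x_new, P_new = correct(x_pred, P_pred, [[1.2], [0.9]], R)
```

Lower entries in R pull the corrected position closer to the measurement, while higher entries keep it closer to the prediction, which is the behaviour described for the error covariance update.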
4. Experimental Results
In the proposed shadow removal method for people tracking, the work is carried out in the following steps: a) background subtraction, b) edge detection, c) image denoising, d) shadow removal based on fuzzy logic, e) boundary detection and f) people tracking. First, background subtraction is carried out on the selected frame and the foreground object is detected. For edge detection we use the Canny edge detection algorithm. Image denoising is carried out using a median filter, and shadow removal is then performed using fuzzy logic. Boundary detection is necessary to track the people: the border tracing algorithm is used to extract the contours of the objects (regions) from an image. Finally, people are tracked without shadows using Kalman filters. The simulation results for different frames are given in the following figures.
Processing on the 78th frame: (a) input frame; (b) background subtracted; (c) detected shadow; (d) detected foreground; (e) tracking with shadow; (f) tracking without shadow
Processing on the 102nd frame: (a) input frame; (b) background subtracted; (c) detected shadow; (d) detected foreground; (e) tracking with shadow; (f) tracking without shadow
To analyze the results of the proposed method, each pixel in the selected frame was classified as a true positive (TP), a correctly classified foreground pixel; a true negative (TN), a correctly classified background pixel; a false positive (FP), a background pixel incorrectly classified as foreground; or a false negative (FN), a foreground pixel incorrectly classified as background. After every pixel had been classified into one of these four groups (TP, TN, FP and FN), sensitivity (recall), specificity, F1 (figure of merit or F-measure), precision and accuracy were calculated.
Sensitivity (recall) measures the proportion of actual positives that are correctly identified. Specificity measures the proportion of negatives that are correctly identified. Precision and accuracy describe the quality of the estimate or prediction. Recall, also known as detection rate, gives the percentage of detected true positives relative to the total number of true positives in the ground truth. Moreover, we considered F1, which is the weighted harmonic mean of precision and recall. Sensitivity, specificity, F1, accuracy and precision are specified in Eqs. (8)-(11) and (12), respectively. The parameter analysis of the proposed method is tabulated in Table 1. The simulation results show that the proposed method achieves over 99% accuracy in people tracking by eliminating shadows in the frame.
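A minimal sketch of these measures, assuming the standard confusion-matrix definitions: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), precision = TP/(TP+FP), accuracy = (TP+TN)/(TP+TN+FP+FN), and F1 as the harmonic mean of precision and recall. The function name and the pixel counts in the usage example are hypothetical and are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute pixel-level evaluation measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall / detection rate
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for one frame (illustrative only).
metrics = classification_metrics(tp=90, tn=900, fp=10, fn=10)
```

Because background pixels usually dominate a surveillance frame, accuracy can be high even when sensitivity is modest, which is why the measures are reported together.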
5. Conclusion
This paper has presented a fuzzy-based shadow removal algorithm for moving object detection in image sequences. The system explicitly addresses troublesome situations such as shadows and noisy environments. It has been tested in a wide range of environments and applications, and evaluated on several video sequences including both indoor and outdoor scenes. Unlike other shadow removal methods that compute multiple, more complex statistics at a time, the very simple fuzzy operator requires very limited computation. This approach consequently allows fast detection of moving objects, which for many applications is possible in real time even on ordinary PCs. This, in turn, allows subsequent higher-level tasks such as tracking and classification to be performed in real time. Comparisons with other approaches presented in the literature have shown that our approach provides better results. Currently, the method requires a stationary camera, which restricts its usage in certain applications. In future we plan to extend the method to support moving cameras and to develop a more accurate shadow removal method that could be implemented on a Field Programmable Gate Array (FPGA).
Niranjil Kumar A received the M.E. degree, specializing in Applied Electronics, from Anna University, Chennai, Tamil Nadu, India, in 2006. He is working as Assistant Professor in the Department of ECE, P.S.R. Rengasamy College of Engineering for Women, Sivakasi, Tamil Nadu, India. He has published many papers on video surveillance, background subtraction, image quality measures and image segmentation. His research is mainly focused on video surveillance.
Sureshkumar C received the M.E. degree in Computer Science in 2006 and the Ph.D. in Computer Science and Engineering in 2011, both from Anna University, Tamil Nadu, India. He is working as Principal at J. K. K. Nattraja College of Engineering and Technology, Namakkal, Tamil Nadu. He has published many papers on computer vision applied to automation, motion analysis, image matching, image classification and view-based object recognition, as well as management-oriented empirical and conceptual papers, in leading journals and magazines. His present research is focused on statistical learning and its application to computer vision and image understanding, problem recognition and video surveillance.
References
“Non-parametric Model for Background Subtraction,” 6th European Conference on Computer Vision.
Shum H. Y., in Proc. ECCV.
“Detecting moving objects, ghosts, and shadows in video streams,” IEEE Trans. Pattern Anal. Mach. Intell. DOI: 10.1109/TPAMI.2003.1233909
“Cast shadow segmentation using invariant color features,” Comput. Vis. Image Understand. DOI: 10.1016/j.cviu.2004.03.008
“A Probabilistic Background Model for Tracking,” Proc. European Conf. Computer Vision.
“An Introduction to the Kalman Filter,” Proceedings of SIGGRAPH 2001.
“Fuzzy Adaptive Mechanism for Improving the Efficiency and Precision of Vision-based Automatic Guided Vehicle Control,” IEEE Conf. Systems, Man, and Cybernetics.
“The Vision-based Fast and Robust Recognition Method for Detecting Road Boundary,” in Proc. Conf. on Computer Vision Graphic and Image Processing.
Halkarnikar P. P., Khandagale H. P., Talbar Dr. S. N., “Object Detection Under Noisy Condition,” Journal Of American Institute of Physics.
“Human Motion Detection and Tracking for Video Surveillance,” National Conference on Communication.
El Baf F., “Background Modeling using Mixture of Gaussians for Foreground Detection-A Survey,” Recent Patents on Computer Science. DOI: 10.2174/2213275910801030219
Hung V. T., Bac L. H., “GPU implementation of extended Gaussian mixture model for background subtraction,” IEEE International Conference on Computing and Communication Technologies, Research, Innovation, and Vision for the Future.
“Saturation adjustment method based on human vision with YCbCr color model characteristics and luminance changes,” IEEE International Symposium on Intelligent Signal Processing and Communications Systems.
Yoo Y. J., Choi J. Y., “Adaptive shadow estimator for removing shadow of moving object,” Computer Vision and Image Understanding. DOI: 10.1016/j.cviu.2010.06.003
“Evaluation of background subtraction techniques for video surveillance,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
“Illumination Normalization with Time-Dependent Intrinsic Images for Video Surveillance,” IEEE Transaction on Pattern Analysis and Machine Intelligence.
“A probabilistic approach for foreground and shadow segmentation in monocular image sequences.”
“Robust and efficient foreground analysis for real-time video surveillance,” in Proc. IEEE Comput. Soc. Conf. Computer Vision and Pattern Recognition.
“A dynamic conditional random field model for foreground and shadow segmentation,” IEEE Trans. Patt. Anal. Mach. Intel. DOI: 10.1109/TPAMI.2006.25
“Automatic traffic surveillance system for vehicle tracking and classification,” IEEE Trans. Intell. Transp. Syst. DOI: 10.1109/TITS.2006.874722
Jacques J. C. S., Jung Jr, C. R., Musse S. R., “A background subtraction model adapted to illumination changes,” in Proc. IEEE Int. Conf. Image Processing.
Fang X. Z., “Moving cast shadows detection based on ratio edge,” in Proc. 18th Int. Conf. Pattern Recognition.
Fang X. Z., Yang X. K., “Moving cast shadows detection using ratio edge,” IEEE Trans. Multimedia.
“Shadow detection for moving objects based on texture analysis.”
“Real-Time Moving Vehicle Detection with Cast Shadow Removal in Video Based on Conditional Random Field,” IEEE Transactions on Circuits and Systems for Video Technology.
“Moving cast shadow detection using physics-based features,” in IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit.
Jung Cláudio Rosito, “Efficient Background Subtraction and Shadow Removal for Monochromatic Video Sequences,” IEEE Transactions on Multimedia.
“Detection of moving objects with removal of cast shadows and periodic changes using stereo vision,” Proc. 20th ICPR.
Mozerov Mikhail G., Bagdanov Andrew D., “Accurate Moving Cast Shadow Suppression Based on Local Color Constancy Detection,” IEEE Transaction on Image Processing.
“Cast Shadow Removal in a Hierarchical Manner Using MRF,” IEEE Transactions on Circuits and Systems for Video Technology.