Crowd Activity Recognition using Optical Flow Orientation Distribution
KSII Transactions on Internet and Information Systems (TIIS). 2015. Aug, 9(8): 2948-2963
Copyright © 2015, Korean Society For Internet Information
  • Received : January 21, 2015
  • Accepted : June 15, 2015
  • Published : August 31, 2015
About the Authors
Jinpyung Kim
Artificial Intelligence Lab, College of Information and Communication Engineering Sungkyunkwan University 440-746, 2066 Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do, Republic of Korea
Gyujin Jang
Artificial Intelligence Lab, College of Information and Communication Engineering Sungkyunkwan University 440-746, 2066 Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do, Republic of Korea
Gyujin Kim
Artificial Intelligence Lab, College of Information and Communication Engineering Sungkyunkwan University 440-746, 2066 Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do, Republic of Korea
Moon-Hyun Kim
Artificial Intelligence Lab, College of Information and Communication Engineering Sungkyunkwan University 440-746, 2066 Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do, Republic of Korea

Abstract
In the field of computer vision, visual surveillance systems have recently become an important research topic. Growth in this area is being driven by both the increase in the availability of inexpensive computing devices and image sensors as well as the general inefficiency of manual surveillance and monitoring. In particular, the ultimate goal for many visual surveillance systems is to provide automatic activity recognition for events at a given site. A higher level of understanding of these activities requires certain lower-level computer vision tasks to be performed. So in this paper, we propose an intelligent activity recognition model that uses a structure learning method and a classification method. The structure learning method is provided as a K2-learning algorithm that generates Bayesian networks of causal relationships between sensors for a given activity. The statistical characteristics of the sensor values and the topological characteristics of the generated graphs are learned for each activity, and then a neural network is designed to classify the current activity according to the features extracted from the multiple sensor values that have been collected. Finally, the proposed method is implemented and tested by using PETS2013 benchmark data.
1. Introduction
A very important task for many visual recognition systems is to analyze the activities performed by an object within the frame. Activity recognition systems have inspired novel user interfaces and new applications for smart environments, including surveillance, emergency response, and military missions. In addition, a challenging problem for research in machine learning is to provide automatic recognition of object activity from data collected from imaging sensors [1] .
The problem of learning patterns of human activity from sequences of images arises in many different areas where computer science is applied, including intelligent environments, surveillance, and assistive technology for the disabled. In particular, video surveillance has become more and more important due to the increase in need for security and related applications [2] . Video surveillance of dynamic scenes has a potentially wide range of applications, such as to assist security guards for communities, understand the behavior of crowds, provide traffic surveillance in cities and expressways, detect military targets, etc. [3] . One of the most important research topics in computer vision in particular is the video surveillance of dynamic scenes that contain crowds and objects [4] [5] . A central topic of this area is the automatic analysis and recognition of crowd activity in video sequences [5] [6] . The recognition of crowd activity can be described as a combination of two tasks: feature extraction and modeling classes of activities.
Cermeno et al. [7] proposed a method that extracts global features from an image, and they trained a two-class classifier using a feature vector for event recognition. Wang et al. proposed a method to detect abnormal events based on histograms of the orientation of the optical flow descriptor and a one-class SVM classifier [8] . Technically speaking, activity recognition can be divided into two tasks: (1) activity information extraction and (2) activity pattern modeling. The activity information represents attributes of movement (velocity, orientation, location) in the data while the activity patterns are representations of events that occur frequently.
In this study, we propose an intelligent activity recognition model that works from an image sequence. This paper makes the following contributions to image sequence-based activity recognition studies. First, the model provides a representation method for crowd activity in an image sequence by using a histogram of optical flow. Second, a new machine learning method for activity recognition is presented. The proposed learning method transforms a histogram of the optical flow into a graph, specifically a Bayesian network, in which each node corresponds to a histogram bin. The model extracts common structural features from the graphs generated for each activity, and these structural features are reflected in the structure of the neural network [8] as groups of input nodes. In addition, a numerical feature that represents the statistical properties of each histogram bin is used as the input value for each input node.
2. Activity Recognition
In this paper, we present a three-stage method to recognize object activities. This method consists of a representation stage, a structure learning stage, and a classification stage. During the representation stage, we compute the optical flow and construct a histogram of the orientation of the optical flow (HOOF) [9] in order to describe the movement of the crowd in an image sequence. The HOOF for the i-th frame is denoted as a vector Ɵi = [o1, o2,…, o9], where ok denotes the value of the k-th histogram bin. These vectors are collected over a moving time window so that a set of vectors Θi = {Ɵi-T+1,…, Ɵi} can be formulated for T consecutive frames in a time window. In the structure learning stage, we generate a Bayesian network from Θi by using the K2-algorithm. Each node in the Bayesian network is a component of the HOOF vector. The Bayesian network generalizes causal relationships between the nodes; we name this network the context network. For each activity class Aclass, a set of context networks CNclass is constructed by collecting the Bayesian networks generated from frames with non-overlapping time windows. For each class, the common paths are extracted from the context networks. Each path Pi is defined as a structural feature of the class, and the structural features represent the topological characteristics of the context networks. A structural feature Pi is implemented as a group of input nodes of the neural network, and a two-layer neural network is designed and trained with a training image sequence for each class. During the training process, the structural features are extracted from the current context network, while the numerical features of the nodes included in the structural features are applied to the input nodes of the neural network. The numerical features of a node represent its statistical characteristics during a given time window.
For the input nodes of non-existing structural features, 0's are applied in order to avoid training for the sample. After training the two-layer perceptron, the current activity is classified by deriving the context network from the current HOOF sequence Θi. The paths of the context network are extracted and compared to the stored structural features; the input nodes of the existing structural features are applied with the numeric values of the nodes, while the other input nodes are applied with 0 values. The class is identified as the output node with the maximum output value. Fig. 1 illustrates the proposed activity recognition method.
Fig. 1. The block diagram of the proposed method for activity recognition.
- 2.1 Frame Description
- 2.1.1 Optical Flow
The optical flow comprises the apparent velocities of the pixels in an image sequence. Since the direction and amplitude of movement are a representation of the activity, the optical flow is used as the scene description. B. Horn and B. Schunck [11] compute the optical flow by using a global smoothness constraint, and this basic Horn-Schunck (HS) method is used to compute the optical flow in this paper. The HS method minimizes an error function that combines two constraints. The first is a brightness constancy constraint that assumes the grey level of a point is constant across frames. The second is a smoothness constraint that assumes continuity in the velocities of adjacent pixels [12]. Equation (1) shows the error function that is used.
E = \iint \left[ (I_x u + I_y v + I_t)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\,dy \qquad (1)
Ix, Iy, and It are the derivatives of the image intensity values along the x and y spatial axes and the time axis, u and v are the horizontal and vertical components of the optical flow, and λ is a regularization constant. The optical flow can be iteratively computed by using (2) and (3), where k denotes the iteration step, and ūk and v̄k are weighted averages of u and v in the neighborhood of the pixel (x, y).

u^{k+1} = \bar{u}^k - \frac{I_x (I_x \bar{u}^k + I_y \bar{v}^k + I_t)}{\lambda + I_x^2 + I_y^2} \qquad (2)

v^{k+1} = \bar{v}^k - \frac{I_y (I_x \bar{u}^k + I_y \bar{v}^k + I_t)}{\lambda + I_x^2 + I_y^2} \qquad (3)
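As a rough illustration, the HS iteration of (2) and (3) can be sketched as follows. This is a minimal sketch, not the authors' implementation; the finite-difference derivatives, the neighborhood-averaging kernel, and all names are our own assumptions.

```python
import numpy as np

def horn_schunck(I1, I2, lam=1.0, n_iter=100):
    """Sketch of the Horn-Schunck iteration in (2)-(3).
    I1, I2: consecutive grayscale frames as 2-D float arrays."""
    I1 = I1.astype(np.float64)
    I2 = I2.astype(np.float64)
    # Spatial and temporal derivatives (simple finite differences).
    Ix = np.gradient(I1, axis=1)
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    # Averaging kernel approximating the neighborhood means u_bar, v_bar.
    kernel = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], float) / 12.0

    def local_avg(f):
        # 3x3 weighted average with edge padding.
        fp = np.pad(f, 1, mode='edge')
        out = np.zeros_like(f)
        for dy in range(3):
            for dx in range(3):
                out += kernel[dy, dx] * fp[dy:dy + f.shape[0], dx:dx + f.shape[1]]
        return out

    for _ in range(n_iter):
        u_bar = local_avg(u)
        v_bar = local_avg(v)
        # Update from equations (2) and (3).
        common = (Ix * u_bar + Iy * v_bar + It) / (lam + Ix**2 + Iy**2)
        u = u_bar - Ix * common
        v = v_bar - Iy * common
    return u, v
```

For a pattern translating one pixel to the right between frames, the recovered u field settles near +1 while v stays near 0.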
- 2.1.2 Histogram of Orientation Optical Flow
Fig. 2 shows a partition of an image into non-overlapping blocks. Each block has the same size of n × n pixels, and each image frame is divided into m blocks, where m = (heightimage/n) × (widthimage/n). For each block, the average of the optical flows within the block is computed to describe the local moving direction of the crowd in the block. For block k, the average optical flow is represented using the polar coordinates (rk, θk). The orientation bins are evenly spaced into 9 parts from 0° to 360°, as shown in Fig. 2, and block k votes into the orientation bin that includes θk. The HOOF for frame i is denoted as a vector Ɵi = [o1, o2,…, o9] that describes the distribution of the direction in which the crowd is moving in the entire frame. For T consecutive frames up to the current i-th frame, each HOOF is collected in order to construct a HOOF sequence Θi = {Ɵi-T+1,…, Ɵi}. The HOOF sequence represents the change in the direction in which the crowd is moving during a given time window, and it is applied to the K2 algorithm in order to generate a context network.
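The block-vote construction above can be sketched as follows; block size, the motion threshold, and function names are our own assumptions, not the authors' code.

```python
import numpy as np

def hoof(u, v, block=8, n_bins=9):
    """Sketch of the HOOF construction: average the flow per block,
    convert to polar form (r_k, theta_k), and vote into orientation bins."""
    h, w = u.shape
    hist = np.zeros(n_bins)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            # Average optical flow of the block (local moving direction).
            mu = u[by:by + block, bx:bx + block].mean()
            mv = v[by:by + block, bx:bx + block].mean()
            r = np.hypot(mu, mv)
            if r < 1e-6:
                continue  # skip blocks with no motion
            theta = np.degrees(np.arctan2(mv, mu)) % 360.0
            hist[int(theta // (360.0 / n_bins)) % n_bins] += 1
    return hist
```

A frame moving uniformly to the right puts all block votes into the first (0° to 40°) bin.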
Fig. 2. 8×8 cells of the histogram of optical flow descriptor.
- 2.2 Structure Learning Stage
- 2.2.1 K2-Algorithm
A Bayesian Network (BN) is a graphical model that efficiently encodes the joint probability distribution for a set of variables [13]. The BN provides a powerful knowledge representation and reasoning tool under uncertainty. A BN is a directed acyclic graph (DAG) with a conditional probability distribution for each node; the nodes of the DAG represent the domain variables, while the arcs between the nodes represent probabilistic dependencies [13]. We use the K2-algorithm to extract the structural relationships between the histogram bin values. The K2-algorithm proposed by Cooper and Herskovits [14] is the best-known Bayesian structure learning algorithm. It searches for a Bayesian graph G that maximizes a Bayesian metric score, called the K2-metric, which is the best-known Bayesian network evaluation function. The K2-metric is expressed in (4).
P(G, D) = P(G) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! \qquad (4)

Maximizing P(G, D) searches for the most probable Bayesian network structure G given a database D. P(G) is the structure prior probability, which is constant for each G. In (4), ri represents the number of possible values of the node oi, πi is the set of parents of node oi, and qi is the number of possible instantiations of πi. Nijk is the number of cases in D in which the node oi is instantiated with its k-th value and the parents of oi in πi are instantiated with the j-th instantiation, and Nij = Σk Nijk is the number of cases in the database in which the parents of oi in πi are instantiated with the j-th instantiation [14].
The K2-algorithm starts by assuming that a node has no parents, after which it incrementally adds, at every step, the parent whose addition most increases the probability of the resulting structure. The K2-algorithm stops adding parents to a node when no single additional parent can increase the probability of the network for the given data [14][15]. The structure learning stage uses the K2-algorithm to obtain the graphs from the training data for each of the four classes. These learned graphs are named context-graphs G and are used to extract the distinctive path patterns that serve as input features for the recognition of each class [16].
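The greedy search just described can be sketched as follows, using the log form of the K2-metric in (4). This is a simplified sketch under our own assumptions: discrete data with uniform arity, a fixed node ordering, and a cap on the number of parents.

```python
import math
from itertools import product

def k2_score(data, child, parents, arity):
    """Log K2 metric contribution of one node given a parent set (eq. (4)).
    data: list of samples, each a list of discrete values in {0..arity-1}."""
    r = arity
    score = 0.0
    # Enumerate parent instantiations (q_i of them) and count N_ij, N_ijk.
    for inst in product(range(arity), repeat=len(parents)):
        counts = [0] * r
        for row in data:
            if all(row[p] == val for p, val in zip(parents, inst)):
                counts[row[child]] += 1
        n_ij = sum(counts)
        # log[(r-1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
        score += math.lgamma(r) - math.lgamma(n_ij + r)
        score += sum(math.lgamma(c + 1) for c in counts)
    return score

def k2(data, order, arity, max_parents=2):
    """Greedy K2 search: repeatedly add the parent that most increases
    the score; only nodes earlier in the ordering may become parents."""
    parents = {node: [] for node in order}
    for i, node in enumerate(order):
        best = k2_score(data, node, parents[node], arity)
        improved = True
        while improved and len(parents[node]) < max_parents:
            improved = False
            for cand in order[:i]:
                if cand in parents[node]:
                    continue
                s = k2_score(data, node, parents[node] + [cand], arity)
                if s > best:
                    best, best_cand, improved = s, cand, True
            if improved:
                parents[node].append(best_cand)
    return parents
```

On data where the second variable deterministically copies the first, the search adds the first variable as the parent of the second.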
- 2.2.2 Pattern Extraction
The context-graph and the extracted path patterns are generated as shown in Fig. 3. A node oi in the context-graph corresponds to the value of the i-th histogram bin. In the graph, nodes o4 and o5 depend on the parent node o2. The graph is implemented as an adjacency matrix in which each element represents the existence of a connection between two nodes.
Fig. 3. Generated context-networks.
An element of the adjacency matrix A[i, j] = 1 if there is an edge from the i-th node to the j-th node, and A[i, j] = 0 otherwise. A context-network CN = (V, E) is a directed graph where V is a set of nodes and E is a set of edges. An edge e = <os, oe> ∈ E, where os and oe are the tail and head of edge e, represents a causal relationship; that is, os affects the occurrence of oe. Thus the structural features, i.e., the topological characteristics, of the generated context-network reflect these causal relationships among the nodes in the current situation. The paths of the graphs generated in the structure learning stage indicate the patterns that describe the specific relations between the nodes that characterize each class. The path patterns generated for the context-network are extracted, and these are used as structural features during situation recognition. Each path pattern is a path from the root to a leaf node of the context-network [16], and each path is represented as a sequence of nodes ordered from the root node to the leaf node. The paths from the root to the leaf nodes in the context-network are extracted, i.e., o1-o2-o4-o8, o1-o2-o5-o7, and o1-o3-o6-o9, where oi is the i-th orientation bin node.
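The root-to-leaf path extraction over the adjacency matrix can be sketched as a depth-first traversal; this is a minimal sketch with our own function names, using 0-based node indices.

```python
def extract_paths(adj):
    """Enumerate all root-to-leaf paths of a context-network given as an
    adjacency matrix (adj[i][j] == 1 iff there is an edge o_i -> o_j)."""
    n = len(adj)
    has_parent = [any(adj[i][j] for i in range(n)) for j in range(n)]
    roots = [i for i in range(n) if not has_parent[i]]
    paths = []

    def dfs(node, path):
        children = [j for j in range(n) if adj[node][j]]
        if not children:          # leaf node: record the completed path
            paths.append(path)
            return
        for c in children:
            dfs(c, path + [c])

    for r in roots:
        dfs(r, [r])
    return paths
```

For the example graph in the text (edges o1→o2, o2→o4, o2→o5, o4→o8, o5→o7, o1→o3, o3→o6, o6→o9), the traversal yields exactly the three paths o1-o2-o4-o8, o1-o2-o5-o7, and o1-o3-o6-o9.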
The context-graphs for each class c ∈ C = {Walk, Run, Evacuation, Merge} are learned by using the training data for each class. The set of context-graphs for class c ∈ C is named CNc and is shown in (6), where CN_c^i denotes the i-th generated context-network for class c and Nc is the number of generated context-networks for class c.

CN_c = \{ CN_c^1, CN_c^2, \ldots, CN_c^{N_c} \} \qquad (6)
For each context network CN_c^i, which is the i-th context-network for class c, the path patterns are extracted, where K_{c,i} is the number of path patterns extracted from CN_c^i. Table 1 shows the set of path patterns for class c. Each column denotes the set of path patterns from one context network. The walk, run, merge, and evacuation classes each have their own path patterns from the respective learned context-networks.
Table 1. Path patterns from context-networks.
- 2.2.3 Classification stage
During the classification stage, we designed a two-layer neural network for pattern classification by using the extracted path patterns, as shown in Fig. 4. For each class c ∈ C, the conditional probability P(p | c) for each path p ∈ PS is computed during training. For each class, the path patterns with the highest conditional probabilities are selected and used as input features for classification [17]. These selected path patterns are defined as structural features. For each distinct selected structural feature pi, a set of input nodes IPi of the neural network is assigned: every node oj in pi, j = 1, 2,…, m, is assigned an input node in IPi. Table 2 shows the structural features, i.e., the selected path patterns, for each class. The input nodes in each selected path feature are grouped separately, as shown in Fig. 4. Note that a structural feature can appear in different classes, e.g., IP2 in both the run class and the merge class. Null denotes that no such path pattern exists in the context-network of that class. By using this organization for the input layer, we can reflect not only the topology of the generated context-network but also the numerical properties of the variables. A two-layer perceptron with 70 input nodes and 4 output nodes is used; the four output nodes correspond to the classes Walk, Run, Merge, and Evacuation. The activation function used for a node ni is
net_i = \sum_{j} W_{ij} X_j

X_i = \frac{1}{1 + e^{-net_i}}
Fig. 4. Neural network architecture for activity recognition.
Table 2. Implementation of structural features as clusters of input nodes.
where Wij is the weight of the connection between ni and nj, the j-th node in the preceding layer, and the output value of node nj is denoted as Xj. This function allows a smooth transition between the low and high outputs of the neuron [17]. The weights of the connections are learned by using a back-propagation algorithm. The learning algorithm minimizes the sum of squared errors E between the network outputs a and the target outputs t, where E is defined as:
E = \frac{1}{2} \sum_{i=1}^{N} (t_i - a_i)^2
where ti ∈ {0, 1} and ai ∈ [0, 1] are the target value and the network output for the i-th output node, respectively, and N is the number of output nodes, i.e., 4. Backpropagation is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. The basic idea of backpropagation learning [18] is to repeatedly apply the chain rule in order to compute the influence of each weight in the network on the error function E:
\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial a_i} \frac{\partial a_i}{\partial net_i} \frac{\partial net_i}{\partial w_{ij}}
where wij is the weight of the connection from node nj to node ni, ai is the output, and neti is the weighted sum of the inputs of node ni. Once the partial derivative of each weight is known, the error function can be minimized by performing a simple gradient descent:
w_{ij}(t+1) = w_{ij}(t) - \epsilon \frac{\partial E}{\partial w_{ij}}
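One such gradient-descent step for a two-layer sigmoid perceptron can be sketched as follows. This is a minimal sketch under our own assumptions (layer sizes, learning rate, and names are ours; biases are omitted for brevity), not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    # Smooth transition between low and high neuron outputs.
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, t, W1, W2, eps=0.5):
    """One back-propagation step minimizing E = 1/2 * sum((t - a)^2).
    W1: (n_hidden, n_in), W2: (n_out, n_hidden). Returns new weights and E."""
    # Forward pass.
    h = sigmoid(W1 @ x)            # hidden activations
    a = sigmoid(W2 @ h)            # output activations
    E = 0.5 * np.sum((t - a) ** 2)
    # Backward pass: chain rule dE/dw = dE/da * da/dnet * dnet/dw.
    delta_out = (a - t) * a * (1 - a)             # dE/dnet at output layer
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # propagated to hidden layer
    W2 = W2 - eps * np.outer(delta_out, h)
    W1 = W1 - eps * np.outer(delta_hid, x)
    return W1, W2, E
```

Repeating the step on a fixed sample drives the squared error down, as expected for gradient descent.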
During the training phase, the input nodes are partitioned into two sets S1 and S2 with

S_1 \cup S_2 = I, \quad S_1 \cap S_2 = \emptyset
where I is the set of input nodes. Set S1 includes the nodes that belong to the structural features of the current target class, and set S2 is the set of the remaining input nodes. The input nodes in S1 are given the preprocessed current numeric values, while the input nodes in S2 are given 0's. Thus, training selectively adjusts the weights connected to the input nodes of the structural features of the current target class. During the recognition phase after training, the paths are extracted from the generated context-network. The input nodes of the structural features present in the current context-network are provided with the current numeric values, and the other input nodes are given 0's. The current class is classified as
c^* = \arg\max_{c \in C} v(O_c)
where Oc is the output node for class c, and v(·) denotes the value of an output node.
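The S1/S2 masking and the final argmax selection above can be sketched as follows; the feature-to-input-node layout and all names are our own assumptions.

```python
import numpy as np

def build_input(structural_features, current_paths, node_values, n_inputs):
    """Input nodes of structural features found in the current
    context-network receive the numeric values of their HOOF nodes;
    all other input nodes stay 0 (the S2 set).
    structural_features: dict mapping a path (tuple of node ids) to the
    index of its first input node."""
    x = np.zeros(n_inputs)
    for path, start in structural_features.items():
        if path in current_paths:            # S1: feature exists
            for k, node in enumerate(path):
                x[start + k] = node_values[node]
        # S2: absent features keep 0 inputs, so their weights stay untouched
    return x

def classify(outputs, classes):
    """Pick the class whose output node has the maximum value."""
    return classes[int(np.argmax(outputs))]
```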
3. Experiments and Result
We implemented the proposed method and measured its performance by using the PETS 2013 crowd activity dataset (S3) as experimental data.
- 3.1 Crowd Activity Dataset
The crowd activity dataset contains different crowd activities, and the objective is to provide a probabilistic estimation at different times for each of the following events: walking, running, evacuation, and merging. Furthermore, we are interested in systems that can identify the start and end of the events as well as the transitions between them [19]. The image sequences depict four crowd activities, and image frames for the four classes are shown in Fig. 5. 40 people act out the different crowd scenarios in the image sequences. In order to validate our approach, we tested the PETS Dataset S3, High Level, which contains four sequences with timestamps of 14:16 and 14:33. For each sequence, we use the videos recorded by camera1 (view001) and camera2 (view002). Moreover, we compared the performance of the proposed method against that of the MLP and the Bayesian Network Classifier (BNC). Table 5 shows the distribution of the class frames in the view001 and view002 sequences. The numbers of image frames in the walk class and the evacuation class are almost equal, while the image frames of the merge class comprise the largest portion of the dataset. The pedestrians have relatively similar sizes in the view001 image sequence, whereas they have different sizes in the view002 image sequence due to the characteristics of the perspective projection. Therefore, the optical flow vectors in view001 have uniform magnitudes while the magnitudes of the optical flow vectors in view002 are irregular.
Table. Distribution of classes in the dataset.
Fig. 5. Four crowd activity classes in view001.
- 3.2 Feature Extraction
We proposed a method that analyzes the activity of the crowd through the optical flow and accumulates the orientation of the representative optical flow for each image block. Fig. 6 shows the representative optical flow extracted in the feature extraction stage for the run class in view001.
Fig. 6. Representative optical flow extracted for the run class in view001.
Fig. 7 presents the accumulated orientation of the optical flow vectors of each class over the 9 orientation bins, which shows a different distribution depending on the class. For the evacuation class, the bin from 80° to 120° contains the maximum number of optical flow vectors. For the run class, the bin from 240° to 280° contains the maximum number of flow vectors, and the walk class has its maximum number of optical flow vectors in the bin from 160° to 200°.
Fig. 7. Distribution of HOOF for the four classes.
- 3.3 Structure Learning
For the i-th image frame, a HOOF sequence Θi = {θi-T+1,…, θi} with T = 10, which is the collection of the HOOF of the previous T frames, is constructed. The K2 algorithm is applied to this HOOF sequence in order to generate context networks for each class. From the context networks of each class, the path patterns with the highest conditional probabilities are extracted. The paths that appear most frequently in the patterns for the four crowd activity classes are shown in Table 4, which lists path patterns according to their number of occurrences during the training phase. The path o1-o2-o3-o7 for the walk class appears 12 times during training; this path also appears in the merge class with 38 occurrences. The path patterns represent the particular correlations among the variables that must be treated as significant during the recognition phase. Thus these patterns become structural features for the input image sequence that is to be classified.
Table 4. Most frequently appearing path patterns for the activity dataset.
- 3.4 Classification
During the classification stage, structural features are used to classify the current activity by detecting the existence of these features in the current context network. However, common structural features are present in different classes. For example, Table 4 shows that the o1-o2-o8 path pattern belongs to the structural feature sets of all classes. Since the mere existence of the structural features cannot distinguish the classes, we use a second set of features, the numeric features, which are the values of the HOOF variables. For the path pattern o1-o2-o8, the conditional joint probability distribution of o1, o2, and o8 for each class is shown in Fig. 8.
Fig. 8. Distribution of HOOF values for the four classes.
Fig. 8 shows the magnitude of the optical flow over the 9 orientation bins of each class, namely the directional velocity. Each class exhibits a quite different distribution of the optical flow of the crowd behavior in the image sequence. In Fig. 8(a) and Fig. 8(c), the walk class and the merge class mostly have narrow distributions of their values, whereas in Fig. 8(b) and Fig. 8(d), the run class and the evacuation class mostly have wider distributions. In Fig. 8(a) and Fig. 8(b), the walk class and the run class have skewed value distributions due to movement in one dominant direction in the image sequence, while in Fig. 8(c) and Fig. 8(d), the values of the merge class and the evacuation class are distributed over movements in many directions. Table 5 shows the mean of each variable, E[oi], as well as the correlation coefficient between the variables, ρoioj, for the o1-o2-o8 path pattern. As shown in Table 5, the value distribution of the path pattern for each class is non-symmetric because the context-network is a directed network. The run and evacuation classes have a large mean value for o1, E[o1], and strong positive correlations exist between the variables.
Table 5. Value of each node in path patterns.
In particular, o1-o2 in the evacuation class has the strongest relationship, 0.91, since the pedestrians spread in every direction.
Table 6 presents the variable values for the path pattern o1-o2-o4-o5-o6 in the context networks of the walk class. The values are almost identical across every context network of the same class.
Table 6. Value of each node in path patterns.
- 3.5 Evaluation
We compare the performance of the proposed method against that of a conventional MLP and a Bayesian classifier. Generally, a conventional MLP exhibits better results as the number of hidden nodes increases; however, the conventional MLP shows its best performance when the hidden layer consists of 20 hidden nodes. The input nodes for the MLP and Bayesian classifiers are the variables o1,…, o9, and the input values for the variables are averages over the time window, as shown in (15).
\bar{o}_k = \frac{1}{T} \sum_{j=i-T+1}^{i} o_k(j) \qquad (15)
The best performance of the proposed method is achieved when it uses 25 hidden nodes and 99.4% of the whole image sequence as the training sequence. The performance is measured in terms of the precision in (16) and is compared with the performance of the other methods in Table 7.
Precision = \frac{TP}{TP + FP} \qquad (16)
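Assuming the standard definition of precision in (16) and frame-count weights for the average in (17), the computation can be sketched as follows (names are our own):

```python
def precision(y_true, y_pred, cls):
    """Per-class precision, TP / (TP + FP), as in equation (16)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t != cls)
    return tp / (tp + fp) if tp + fp else 0.0

def weighted_avg_precision(y_true, y_pred, classes):
    """Weighted average over classes, weighted by class frequency (eq. (17))."""
    n = len(y_true)
    return sum(precision(y_true, y_pred, c) * y_true.count(c) / n
               for c in classes)
```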
Table 7. Accuracy evaluation of crowd activity recognition.
The proposed method exhibits the best performance for the view001 sequence while exhibiting the worst performance for the evacuation class in the view002 sequence. Since this class takes the smallest part of the training sequence, it is confused with other classes in view002 due to poor training. The conventional Bayesian classifier showed the worst performance when compared to the proposed method and the conventional MLP. We also compare against the method proposed by Wang [9], which uses the HOOF as features and an SVM as a classifier. They performed experiments on several classes (walking in all directions, walking in the same direction, crowd formation and evacuation, local dispersion) but not the run class, so we compared their classes with our corresponding classes. We compute the performance as a weighted average of the precision of the four classes using (17).
P_{avg} = \sum_{c \in C} \frac{N_c}{N} P_c \qquad (17)

where Pc is the precision of class c, Nc is the number of frames of class c, and N is the total number of frames.
The average precision is plotted in Fig. 9 and compared to the average precision of a conventional MLP; the proposed method performs better for both the view001 and view002 sequences.
Fig. 9. Accuracy evaluation of crowd activity recognition.
4. Conclusion
In this paper, we have proposed a crowd behavior recognition method that can be applied to a variety of complex activity environments. The proposed method consists of a feature extraction stage, a structure learning stage, and a classification stage. We propose a systematic structure learning approach that automatically learns an appropriate context-network, and the neural network with structural features (path patterns) designed as part of this method achieves an improved classification performance. Specifically, the structure learning stage is implemented in three steps: an input temporal vector formulation step, a context-network generation step using the K2-algorithm, and a path pattern extraction step from the context-network. Our automatically learned recognition model outperformed the Multi-Layer Perceptron and the Bayesian classifier. These results demonstrate that the proposed approach is feasible and provides sufficient recognition accuracy for multiple sensor signals. In the future, we will expand the path features in order to recognize complicated activities, such as loitering, and will investigate path features that are automatically generated from context-networks. In addition, the proposed method does not yet consider real-time execution; we are investigating an efficient implementation of a real-time crowd activity recognition system.
BIO
Jin-Pyung Kim received his M.S. and Ph.D. degrees from the College of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea, in 2006 and 2014. He is currently a postdoctoral researcher at Sungkyunkwan University, Suwon, Korea. His research interests include structure learning, pattern recognition, and computer vision.
Gyu-Jin Jang received his M.S. degree in College of Information and Communication Engineering from Sungkyunkwan University, Suwon, Korea, in 2011. He is currently a Ph.D. candidate in Sungkyunkwan University, Suwon, Korea. His research interests include artificial intelligence, computer vision, and pattern recognition.
Gyu-Jin Kim received his M.S. degree in College of Information and Communication Engineering from Sungkyunkwan University, Suwon, Korea, in 2011. He is currently a Ph.D. candidate in Sungkyunkwan University, Suwon, Korea. His research interests include computer vision, pattern recognition, and artificial intelligence.
Moon-Hyun Kim received the B.S. degree in Electronic Engineering from Seoul National University in 1978, the M.S. degree in Electrical Engineering from KAIST, Korea, in 1980, and the Ph.D. degree in Computer Engineering from the University of Southern California in 1988. He joined the College of Information and Communication Engineering, Sungkyunkwan University, Seoul, Korea in 1988, where he is currently a Professor. In 1995, he was a Visiting Scientist at the IBM Almaden Research Center, San Jose, California. In 1997, he was a Visiting Professor at the Signal Processing Laboratory of Princeton University, Princeton, New Jersey. His research interests include artificial intelligence, pattern recognition, and machine learning.