Classification of Induction Machine Faults using Time Frequency Representation and Particle Swarm Optimization
Journal of Electrical Engineering and Technology. 2014. Jan, 9(1): 170-177
Copyright © 2014, The Korean Institute of Electrical Engineers
  • Received : December 26, 2012
  • Accepted : August 03, 2013
  • Published : January 01, 2014
About the Authors
A. Medoued
Corresponding Author: Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda, Algeria.
A. Lebaroud
Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda, Algeria.
A. Laifa
Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda, Algeria.
D. Sayad
Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda, Algeria.

This paper presents a new method for classifying induction machine faults using a time-frequency representation (TFR), particle swarm optimization (PSO) and an artificial neural network (ANN). The essence of the feature extraction is to project the signal of a faulty machine onto a low-dimensional TFR that is deliberately designed to maximize the separability between classes; a distinct TFR is designed for each class. The size of the feature vectors is optimized using PSO, and the classifier is an ANN. The method allows accurate classification independently of the load level. Introducing PSO into the classification procedure gave good results with the reduced-size feature vectors obtained by the optimization process. The results are validated on a 5.5-kW induction motor test bench.
1. Introduction
Today’s industry strives to improve performance and profitability while maintaining and improving safety. The challenges include the reliable and safe operation of electric motors in industrial processes. Thus, very expensive scheduled maintenance is performed in order to detect machine problems before they result in catastrophic failure [1 - 2] . Nowadays, reducing maintenance costs is the top priority for electrical drives, in order to prevent unscheduled downtime and to increase operational effectiveness. Recent advances in signal processing techniques, such as artificial neural networks [3 - 8] and wavelets [9] , have provided more powerful tools for fault diagnosis.
A limitation of many diagnosis systems is that they use signals in either the time domain or the frequency domain. It is potentially more informative to use both time and frequency, which is the approach taken here. Time-frequency analysis of the motor current makes signal properties related to fault detection more evident in the transform domain [10] .
Traditionally, the objective of time–frequency research is to create a function that will describe the energy density of a signal simultaneously in time and frequency. For explicit classification, it is not necessarily desirable to accurately represent the energy distribution of a signal in time and frequency. In fact, such a representation may conflict with the goal of classification, generating a TFR that maximizes the separability of TFRs from different classes. It may be advantageous to design TFRs that specifically highlight differences between classes [11 - 14] .
Since all TFRs can be derived from the ambiguity plane, no a priori assumption is made about the smoothing required for accurate classification. Thus, the smoothing quadratic TFRs retain only the information that is essential for classification.
This classification allows us to run an optimization routine based on the particle swarm technique to find an appropriate size for the feature vectors, in order to reduce calculation time while keeping the relevant signal information within the vectors.
In this paper, we propose a classification algorithm based on the design of an optimized TFR from the time-frequency ambiguity plane in order to extract the feature vector. The optimal size of the feature vectors is determined by the PSO algorithm. The PSO technique is reported to generate high-quality solutions with shorter calculation time and more stable convergence than other stochastic methods [15 - 17] .
Finally, a neural network-based decision criterion is used for classification. The goal of this work is an accurate classification system for motor faults such as bearing faults, stator faults and broken rotor bars, independently of the load level.
2. Classification Algorithm
The classification algorithm consists of three parts: feature extraction, optimization of the feature vectors and decision making. In the training stage, three optimal kernels are designed for separating four classes [18] :
  • 1) Class of healthy motor;
  • 2) Class of bearing fault;
  • 3) Class of stator fault;
  • 4) Class of broken bars.
The kernel design process selects, for each class, a number of locations from the time-frequency ambiguity plane. In the decision making stage, we propose an ANN classifier trained with the Levenberg-Marquardt algorithm. The details of each step are described in the following sections.
3. Feature Extraction
- 3.1 Optimal TFR
For further details, we refer the reader to our previous works [19] and [20] .
The expression of the TFR is given by:
\[ TFR(t, f) = \mathcal{F}^{-1}_{\eta \to t}\, \mathcal{F}_{\tau \to f} \{ A(\eta, \tau)\, \phi(\eta, \tau) \} \]
The characteristic function for each TFR is A ( η , τ ) φ ( η , τ ) , η represents the discrete frequency shift and τ represents the discrete time delay. This means that the optimal-classification representation TFRi can be obtained by smoothing the ambiguity plane A ( η , τ ) with an appropriate kernel φopt , which is an optimal classification kernel. The problem of designing the TFRi becomes equivalent to designing the optimal classification kernel φopt ( η , τ ). This method, used to design kernels (and thus TFRs), optimizes the discrimination between predefined sets of classes.
Features can be extracted directly from A ( η , τ ) φopt ( η , τ ) instead of the optimal classification TFRi. This shortcut reduces the computational cost of feature extraction.
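As a rough illustration of reading features straight from the ambiguity plane, the sketch below computes a discrete ambiguity plane with circular lags and then samples it at a few kernel locations. The function names, the circular-lag convention, the use of the magnitude |A| as the feature value and the binary (0/1) kernel are all assumptions made for this example, not details confirmed by the paper.

```python
import cmath

def ambiguity_plane(x):
    """Discrete ambiguity plane A[eta][tau] of a sequence x (circular lags assumed)."""
    n_samples = len(x)
    plane = [[0j] * n_samples for _ in range(n_samples)]
    for tau in range(n_samples):
        # Instantaneous autocorrelation at lag tau, with circular indexing.
        r = [x[(n + tau) % n_samples] * complex(x[n]).conjugate()
             for n in range(n_samples)]
        # A DFT along time n gives the frequency-shift (Doppler) axis eta.
        for eta in range(n_samples):
            plane[eta][tau] = sum(
                r[n] * cmath.exp(-2j * cmath.pi * eta * n / n_samples)
                for n in range(n_samples))
    return plane

def extract_features(plane, locations):
    """Read |A| at the (eta, tau) points kept by a binary kernel (assumed form)."""
    return [abs(plane[eta][tau]) for eta, tau in locations]

signal = [1.0, 2.0, 3.0, 4.0]
plane = ambiguity_plane(signal)
# A[0][0] is the signal energy (30 here); A[0][1] is the lag-1 circular correlation (24).
features = extract_features(plane, [(0, 0), (0, 1)])
print(features)
```

This direct form costs on the order of N^3 operations and is only meant to make the definition concrete; a practical implementation would use an FFT along the time axis.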
- 3.2 Design of classification kernels
The kernel φopt ( η , τ ) is designed for each specific classification task. We determine N locations from the ambiguity plane, in such a way that the values in these locations are very similar for signals from the same class, but they vary significantly for signals from different classes. In our design, we use Fisher’s discriminant ratio, FDR [19 - 20] , to get these N locations.
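The point selection can be sketched as follows, assuming the common two-class form of Fisher's discriminant ratio, (m1 − m2)² / (v1 + v2), evaluated independently at every ambiguity-plane location. The flattened-plane data layout and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def fisher_ratio(values_a, values_b, eps=1e-12):
    """Two-class Fisher discriminant ratio at one location (eps avoids 0/0)."""
    spread = variance(values_a) + variance(values_b) + eps
    return (mean(values_a) - mean(values_b)) ** 2 / spread

def select_locations(class_a, class_b, n_points):
    """Return the indices of the n_points locations with the highest ratio.

    class_a and class_b are lists of examples; each example is a flat list of
    ambiguity-plane magnitudes, one entry per (eta, tau) location.
    """
    n_locations = len(class_a[0])
    scores = [fisher_ratio([ex[k] for ex in class_a],
                           [ex[k] for ex in class_b])
              for k in range(n_locations)]
    ranked = sorted(range(n_locations), key=lambda k: scores[k], reverse=True)
    return ranked[:n_points]

# Toy data: location 1 separates the two classes well, location 2 not at all.
class_a = [[1.0, 5.0, 0.0], [1.1, 5.2, 1.0]]
class_b = [[1.0, 9.0, 1.0], [1.2, 9.1, 0.0]]
print(select_locations(class_a, class_b, 2))  # [1, 0]
```

Locations with a large ratio have values that are close within each class and far apart between classes, which is the property the kernel design looks for.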
In our classification procedure, C−1 kernels must be designed for a C-class classification system. In order to avoid unnecessary computation when separating classes, we have proposed the principle of the remaining classes [11] . The discrimination between classes is made by separating class i from all the remaining classes {i+1,…,C}. In this case, the stator fault kernel is designed to discriminate the stator fault class from the other classes (rotor fault, bearing fault and healthy motor). The rotor fault kernel is designed to discriminate the rotor fault class from the remaining classes (bearing fault and healthy motor). The bearing fault kernel is designed to discriminate the bearing fault class from the healthy motor class. The advantage of the method lies in the optimum separation between the different classes.
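The remaining-classes principle can be pictured as an ordered list of "one class versus the rest of the list" tasks, followed by a cascade at decision time. The sketch below is only an illustration: the helper names are hypothetical, and the threshold detectors in the usage lines are dummies that stand in for the trained kernel-plus-classifier stages.

```python
def remaining_class_tasks(ordered_classes):
    """Pair each class with the classes that come after it (C-1 tasks for C classes)."""
    return [(target, ordered_classes[i + 1:])
            for i, target in enumerate(ordered_classes[:-1])]

def cascade_classify(sample, detectors, fallback):
    """Apply the detectors in order; the first one that fires gives the label."""
    for label, is_member in detectors:
        if is_member(sample):
            return label
    return fallback

classes = ["stator fault", "rotor fault", "bearing fault", "healthy"]
for target, rest in remaining_class_tasks(classes):
    print(target, "vs", rest)

# Dummy detectors (plain thresholds on a number), for illustration only.
detectors = [("stator fault", lambda s: s > 10),
             ("rotor fault", lambda s: s > 5),
             ("bearing fault", lambda s: s > 1)]
print(cascade_classify(7, detectors, fallback="healthy"))  # rotor fault
```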
4. Feature Vector Optimization
One objective of our approach is to represent the signal by a feature vector of very small size without losing relevant information. Hence, the search for an optimum size of this vector provides a good compromise between the relevance of the information and the computation time.
- 4.1 Particle Swarm Optimization (PSO)
Particle Swarm Optimization (PSO), introduced by Eberhart and Kennedy [21] , is based on the analogy of bird flocks and fish schools. In PSO, each individual, called a particle, makes its decision using its own experience together with the experience of the other individuals. Two notions of best position are used: the individual best and the global best. As a particle moves through the search space, it compares its fitness value at the current position to the best fitness value it has attained previously. The position associated with the best fitness encountered so far is called the individual best, or pbest. The global best, or gbest, is the best position among all of the individual best positions achieved so far ( Fig. 1 ).
Fig. 1. Particle swarm method principle
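Using the gbest and the pbest, the velocity of the ith particle is updated according to the following equation [22] :
\[ v_i^{k+1} = w\, v_i^{k} + c_1\, \mathrm{rand}_1\, (pbest_i - x_i^{k}) + c_2\, \mathrm{rand}_2\, (gbest - x_i^{k}) \]
Based on the updated velocities, each particle changes its position according to the equation:
\[ x_i^{k+1} = x_i^{k} + v_i^{k+1} \]
Where w is a weighting function, cj are acceleration factors, rand is a random number between 0 and 1, and x_i^k and v_i^k are the position and velocity of particle i at iteration k.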
The following weighting function is usually utilized:
\[ w = w_{max} - \frac{w_{max} - w_{min}}{iter_{max}}\, iter \]
Where wmax is the initial weight, wmin is the final weight, itermax is the maximum iteration number, and iter is the current iteration number.
The parameters used in this work are taken as follows [22 - 26] :
c1=c2=2.05; wmin =0.1; wmax =0.9.
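A minimal sketch of one particle update with these parameter values is given below, assuming the standard inertia-weight form of PSO with a linearly decreasing weight. The function and variable names are illustrative.

```python
import random

C1 = C2 = 2.05           # acceleration factors quoted above
W_MAX, W_MIN = 0.9, 0.1  # initial and final inertia weights quoted above

def inertia_weight(iteration, max_iterations):
    """Linearly decreasing weight, from W_MAX at iteration 0 to W_MIN at the end."""
    return W_MAX - (W_MAX - W_MIN) * iteration / max_iterations

def update_particle(position, velocity, pbest, gbest, w, rng=random):
    """One velocity update followed by one position update, per dimension."""
    new_velocity = [
        w * v + C1 * rng.random() * (pb - x) + C2 * rng.random() * (gb - x)
        for x, v, pb, gb in zip(position, velocity, pbest, gbest)
    ]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity

w = inertia_weight(iteration=0, max_iterations=100)
position, velocity = update_particle([1.0, 4.0], [0.5, -0.5],
                                     pbest=[2.0, 3.0], gbest=[2.5, 2.5], w=w)
print(position, velocity)
```

When a particle already sits on both its pbest and the gbest, the two attraction terms vanish and only the inertia term w·v moves it.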
Selection of maximum velocity:
At each iteration step, the algorithm proceeds by adjusting the distance (velocity) that each particle moves in every dimension of the problem hyperspace. The velocity of the particle is a stochastic variable and is, therefore, subject to creating an uncontrolled trajectory, making the particle follow wider cycles in the problem space. In order to damp these oscillations, upper and lower limits can be defined for the velocity vi :
\[ -v_{max} \le v_i \le v_{max} \]
Most of the time, the value of v max is selected empirically, according to the characteristics of the problem. It is important to note that if the value of this parameter is too large, then the particles may move erratically, going beyond a good solution; on the other hand, if v max is too small, then the particle’s movement is limited and the optimal solution may not be reached.
Fan and Shi [27] have shown that an appropriate dynamically changing v max can improve the PSO algorithm performance. To keep the velocity bound uniform, we fixed v max at a value chosen after many test runs.
Integer PSO formulation:
When integer variables are included in the optimization problem, such as the size of a feature vector, the PSO algorithm can be reformulated by rounding the particle’s position to the nearest integer. Mathematically, the velocity and position updates (3) and (4) are still valid, but once the new particle position is determined in the real-number space, it must be converted to the integer space.
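The velocity limit and the integer rounding can be combined in a small one-dimensional loop, sketched below. The toy fitness, the search bounds, the swarm size, the value of v_max and the choice to minimize are assumptions made for the example; the paper does not state them.

```python
import random

def integer_pso(fitness, lower, upper, n_particles=10, n_iterations=50,
                v_max=5.0, c1=2.05, c2=2.05, w_max=0.9, w_min=0.1, seed=0):
    """Minimize fitness(size) over the integers in [lower, upper]."""
    rng = random.Random(seed)
    positions = [rng.randint(lower, upper) for _ in range(n_particles)]
    velocities = [rng.uniform(-v_max, v_max) for _ in range(n_particles)]
    pbest = positions[:]
    pbest_fit = [fitness(p) for p in positions]
    best = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[best], pbest_fit[best]

    for it in range(n_iterations):
        w = w_max - (w_max - w_min) * it / n_iterations
        for i in range(n_particles):
            v = (w * velocities[i]
                 + c1 * rng.random() * (pbest[i] - positions[i])
                 + c2 * rng.random() * (gbest - positions[i]))
            v = max(-v_max, min(v_max, v))      # clamp to [-v_max, v_max]
            x = round(positions[i] + v)         # back to the integer space
            x = max(lower, min(upper, x))       # stay inside the search range
            velocities[i], positions[i] = v, x
            f = fitness(x)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x, f
            if f < gbest_fit:
                gbest, gbest_fit = x, f
    return gbest, gbest_fit

# Toy fitness with its minimum at size 10 (illustrative only).
size, value = integer_pso(lambda s: (s - 10) ** 2, lower=1, upper=50)
print(size, value)
```

Python's built-in round sends exact .5 ties to the even integer; any consistent nearest-integer rule would serve the same purpose here.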
- 4.2 Fitness function
To search for an optimized size of the feature vector with the PSO algorithm, a fitness function is needed. In this work, the variance calculated for each candidate size of the feature vector is taken as the fitness for that size, and the goal is to optimize this fitness.
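The text does not spell out how the variance is computed, so the following is only one possible reading and not the authors' confirmed definition: the variance of each feature element is taken across the training vectors, and the elements are ranked by it. All names are hypothetical.

```python
def element_variances(vectors):
    """Population variance of each feature element across the training vectors."""
    n_vectors = len(vectors)
    variances = []
    for k in range(len(vectors[0])):
        column = [vec[k] for vec in vectors]
        m = sum(column) / n_vectors
        variances.append(sum((c - m) ** 2 for c in column) / n_vectors)
    return variances

def top_variance_indices(vectors, size):
    """Indices of the `size` elements with the largest variance (assumed criterion)."""
    variances = element_variances(vectors)
    ranked = sorted(range(len(variances)), key=lambda k: variances[k], reverse=True)
    return ranked[:size]

training_vectors = [[1.0, 10.0, 5.0],
                    [1.0, 20.0, 6.0],
                    [1.0, 30.0, 7.0]]
print(top_variance_indices(training_vectors, 2))  # [1, 2]
```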
5. Classification Using Neural Networks
In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. The learning procedure tries to find a set of connections w that gives a mapping that fits well the training set.
Furthermore, neural networks can be viewed as highly nonlinear functions with the basic form:
\[ y = F(x, w) \]
Where x is the input vector presented to the network, w are the weights of the network, and y is the corresponding output vector approximated or predicted by the network. The weight vector w is commonly ordered first by layer, then by neurons, and finally by the weights of each neuron plus its bias.
This view of network as a parameterized function will be the basis for applying standard function optimization methods to solve the problem of neural network training.
- 5.1 Network training as a function optimization problem
As mentioned previously, neural networks can be viewed as highly non-linear functions. From this perspective, the training problem can be considered as a general function optimization problem, with the adjustable parameters being the weights and biases of the network, and the Levenberg-Marquardt algorithm can be applied straightforwardly in this case.
- 5.2 Levenberg-Marquardt algorithm
Basically, it consists in solving the equation:
\[ (J^{T} J + \lambda I)\, \delta = J^{T} E \]
Where J is the Jacobian matrix (defined in Section 5.3), λ is the Levenberg damping factor, δ is the weight update vector that we want to find and E is the error vector containing the output errors for each input vector used in training the network. The δ tells us by how much we should change the network weights to achieve a (possibly) better solution. The J^T J matrix is also known as the approximated Hessian. The λ damping factor is adjusted at each iteration and guides the optimization process. If the reduction of E is rapid, a smaller value can be used, bringing the algorithm closer to the Gauss-Newton algorithm, whereas if an iteration gives insufficient reduction in the residual, λ can be increased, giving a step closer to the gradient descent direction.
- 5.3 Computing the Jacobian
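The Jacobian is an N-by-M matrix of all first-order partial derivatives of a vector-valued function. N is the number of entries in our training set and M is the total number of parameters (weights + biases) of our network. It can be created by taking the partial derivatives of each output with respect to each weight, and has the form
\[ J = \begin{bmatrix} \frac{\partial F(x_1, w)}{\partial w_1} & \cdots & \frac{\partial F(x_1, w)}{\partial w_M} \\ \vdots & \ddots & \vdots \\ \frac{\partial F(x_N, w)}{\partial w_1} & \cdots & \frac{\partial F(x_N, w)}{\partial w_M} \end{bmatrix} \]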
Where F ( xi , w ) is the network function evaluated for the ith input vector of the training set using the weight vector w and wj is the jth element of the weight vector w of the network.
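To make the shape of J concrete, the sketch below builds it by central finite differences for a small network with one hidden layer of 5 tanh neurons and a single linear output. The architecture details (tanh activation, scalar output), the step size and the function names are assumptions made for the example; the weight vector is flattened by layer, then by neuron, with each neuron's weights followed by its bias, as described earlier.

```python
import math

def network(x, w, n_hidden):
    """One-hidden-layer network F(x, w) with tanh units and a linear output."""
    n_in = len(x)
    hidden = []
    idx = 0
    for _ in range(n_hidden):
        s = sum(w[idx + k] * x[k] for k in range(n_in)) + w[idx + n_in]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    return sum(w[idx + j] * hidden[j] for j in range(n_hidden)) + w[idx + n_hidden]

def jacobian(inputs, w, n_hidden, h=1e-6):
    """N-by-M matrix with entries dF(x_i, w)/dw_j, by central differences."""
    rows = []
    for x in inputs:
        row = []
        for j in range(len(w)):
            w_plus, w_minus = w[:], w[:]
            w_plus[j] += h
            w_minus[j] -= h
            diff = network(x, w_plus, n_hidden) - network(x, w_minus, n_hidden)
            row.append(diff / (2 * h))
        rows.append(row)
    return rows

inputs = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]   # N = 3 training inputs
n_hidden = 5
n_weights = n_hidden * (2 + 1) + (n_hidden + 1)  # M = 21 weights and biases
w = [0.05 * (k - 10) for k in range(n_weights)]
J = jacobian(inputs, w, n_hidden)
print(len(J), len(J[0]))  # 3 21
```

Analytic derivatives obtained by backpropagation are the usual choice in practice; finite differences are used here only because they follow the definition directly.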
- 5.4 General Levenberg-Marquardt algorithm
As stated earlier, the Levenberg-Marquardt algorithm basically consists in solving the equation of Section 5.2 with different values of λ until the sum of squared errors decreases. So, each learning iteration (epoch) consists of the following basic steps:
  • 1 Compute the Jacobian
  • 2 Compute the error gradient: g = J^T E
  • 3 Approximate the Hessian: H = J^T J
  • 4 Solve (H + λI) δ = g to find δ
  • 5 Update the network weights w using δ
  • 6 Recalculate the sum of squared errors
  • 7 If the sum of squared errors has not decreased, discard the new weights, increase λ using v and go to step 4.
  • 8 Else decrease λ using v and end the iteration.
Variations of the algorithm may use different values of v, one for decreasing λ and another for increasing it. Others may solve (H + λ diag(H)) δ = g instead of (H + λI) δ = g, while others may select the initial λ according to the size of the elements of H, by setting λ0 = t max(diag(H)), where t is a chosen value.
We can see that a problem arises if the error does not decrease after several iterations. For this case, the algorithm also stops if λ becomes too large [28 - 29] .
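The epoch structure of the Levenberg-Marquardt procedure, including the stop on a too-large λ, can be sketched as follows. To keep the example short, a two-parameter exponential model stands in for the network function F(x, w); the model, the initial λ, the factor v = 10, the λ limit and all function names are assumptions made for illustration.

```python
import math

def model(x, w):
    # Hypothetical stand-in for the network function F(x, w).
    return w[0] * math.exp(w[1] * x)

def jacobian(xs, w):
    # Analytic partial derivatives of the stand-in model w.r.t. w[0] and w[1].
    return [[math.exp(w[1] * x), w[0] * x * math.exp(w[1] * x)] for x in xs]

def solve(a, b):
    """Solve a d = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    d = [0.0] * n
    for r in range(n - 1, -1, -1):
        d[r] = (m[r][n] - sum(m[r][c] * d[c] for c in range(r + 1, n))) / m[r][r]
    return d

def sse(xs, ys, w):
    return sum((y - model(x, w)) ** 2 for x, y in zip(xs, ys))

def levenberg_marquardt(xs, ys, w, lam=0.01, v=10.0, max_epochs=50, lam_max=1e10):
    n = len(w)
    current = sse(xs, ys, w)
    for _ in range(max_epochs):
        jac = jacobian(xs, w)                                  # step 1
        err = [y - model(x, w) for x, y in zip(xs, ys)]
        g = [sum(row[j] * e for row, e in zip(jac, err))       # step 2: g = J^T E
             for j in range(n)]
        h = [[sum(row[j] * row[k] for row in jac)              # step 3: H = J^T J
              for k in range(n)] for j in range(n)]
        while True:
            damped = [[h[j][k] + (lam if j == k else 0.0) for k in range(n)]
                      for j in range(n)]
            delta = solve(damped, g)                           # step 4
            trial = [wi + di for wi, di in zip(w, delta)]      # step 5
            trial_sse = sse(xs, ys, trial)                     # step 6
            if trial_sse < current:                            # step 8: accept
                w, current = trial, trial_sse
                lam /= v
                break
            lam *= v                                           # step 7: reject
            if lam > lam_max:                                  # lambda too large
                return w, current
    return w, current

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # noiseless data from w = (2, 0.5)
w_fit, final_sse = levenberg_marquardt(xs, ys, [1.0, 0.0])
print(w_fit, final_sse)
```

Because a trial step is only accepted when it lowers the sum of squared errors, the error never increases from one epoch to the next.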
6. Experiment Results
The experimental data were collected at the Ampère Laboratory, University of Lyon. The experimental bench consists of a three-phase squirrel-cage asynchronous motor, Leroy Somer LS 132S, IP 55, Class F, T standard = 40 °C. The motor is loaded by a powder brake. Its maximum torque (100 Nm) is reached at rated speed.
This brake is sized to dissipate a maximum power of 5 kW. Fig. 2 shows the motor bench. The wear obtained on the bearings is real wear ( Fig. 3 ). For the rotor fault, a bar of the squirrel cage was broken by drilling ( Fig. 4 ). To simulate the stator imbalance fault, an unbalanced supply is obtained with a variable auto-transformer placed on one phase of the network ( Fig. 2 ).
Fig. 2. The 5.5 kW motor coupled with load (powder brake).
Fig. 3. Accelerated wear of the bearings by immersion in acid.
Fig. 4. Rotor with broken bars
The acquisition system used to measure these signals has eight differential inputs for current measurement, with sampling at up to 20 MHz and 14-bit resolution.
The current signals are sampled at 20 kHz. Each signal contains N=100000 samples over an acquisition period of 5 s. The data acquisition set consists of 15 examples of stator current recorded at different load levels (0%, 25%, 50%, 75% and 100%). Four operating conditions of the machine were considered: healthy, bearing fault, stator fault and rotor fault. The first ten current examples are used for training, and the last five are used to test the classification.
Each signal is passed through a lowpass filter and resampled with a downsampling rate of 50. Only the range of the required frequencies is preserved. The lowpass filter is used to avoid aliasing during downsampling. The dimension of the ambiguity plane is 200×200=40000 points; by symmetry with respect to the origin, we retain only a quarter of the ambiguity plane, which corresponds to N=10000 points.
We designed three kernels: stator fault kernel, rotor fault kernel and bearing fault kernel [18] . The Fisher point locations in the Doppler-delay plane are arranged in the feature vectors {FV 1 ,…, FV N } as the training database of the neural network.
In a neural network, if there are too few neurons in the hidden layer, the network may not contain sufficient degrees of freedom to form a representation. If too many neurons are defined, the network might become overtrained. Therefore, the number of neurons must be chosen carefully. In this work, we used one hidden layer and tried different numbers of neurons to determine a suitable network. As a stop criterion, we set a goal of 10^-12, which defines the convergence of the algorithm. The goal is reached in a minimum of 16 and 24 epochs ( Figs. 5 and 6 , respectively).
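The low-pass filtering and downsampling step described above can be sketched as follows. With a 20 kHz sampling rate and a factor of 50, the new rate is 400 Hz and a 100000-sample record shrinks to 2000 samples. The filter design (a Hamming-windowed sinc with 401 taps and a cutoff at the new Nyquist frequency) is an assumption; the paper does not specify its filter.

```python
import math

def lowpass_taps(factor, n_taps=401):
    """Hamming-windowed sinc low-pass with cutoff at the post-decimation Nyquist."""
    cutoff = 0.5 / factor            # in cycles per sample of the original rate
    mid = (n_taps - 1) / 2
    taps = []
    for k in range(n_taps):
        t = k - mid
        if t == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (n_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [tap / total for tap in taps]   # normalize to unit gain at DC

def decimate(signal, factor, n_taps=401):
    """Filter and keep every `factor`-th sample (zero padding at the edges)."""
    taps = lowpass_taps(factor, n_taps)
    half = (n_taps - 1) // 2
    out = []
    for center in range(0, len(signal), factor):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = center + k - half
            if 0 <= idx < len(signal):
                acc += tap * signal[idx]
        out.append(acc)
    return out

fs, factor = 20000, 50
print(fs / factor)        # new sampling rate in Hz: 400.0
print(100000 // factor)   # samples left from a 5 s record: 2000

# Short synthetic check: a 10 Hz tone survives, sampled 50 times more coarsely.
tone = [math.sin(2 * math.pi * 10 * n / fs) for n in range(5000)]
print(len(decimate(tone, factor)))  # 100
```

The filter is evaluated only at the samples that are kept, which avoids filtering the 49 out of every 50 samples that would be discarded anyway.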
Fig. 5. Training diagrams for optimum case of 5 hidden neurons in kernel 1.
Fig. 6. Training diagrams for optimum case of 5 hidden neurons in kernel 2
The training algorithm gives the best performance with 5 neurons in the hidden layer for the three kernels ( Table 1 ).
Table 1. Misclassification results
Fig. 7 shows that, in the case of Kernel 1, 14 of the 15 test vectors were correctly classified, which indicates that the classification error is acceptable. This is also true for the two other kernels. Furthermore, increasing the size of the feature vector significantly reduces this error. The classification error is further reduced when the number of training vectors is increased to 35 (10 vectors of stator currents at 0% load, 5 at 25%, 5 at 50%, 5 at 75% and 10 at 100% of rated load). Fig. 8 clearly shows a marked improvement in the classification process.
Fig. 7. Classification of test vectors for 20 training vectors
Fig. 8. Classification of test vectors for 35 training vectors
The objective of introducing PSO is to optimize the size of the feature vectors. With the variance as the fitness function, the size of the feature vectors was found to be 10. This means that the first 10 elements, which have the larger variance values, are the most relevant ( Figs. 9 , 10 ). It is important to note that the number of correctly classified vectors strongly correlates with the number of training vectors, as can be seen in Fig. 11 .
Fig. 9. Feature vectors size optimization by PSO (class 1)
Fig. 10. Feature vectors size optimization by PSO (class 2)
Fig. 11. Classification of test vectors versus training vectors
7. Conclusion
In this paper, we have proposed a new fault classification algorithm for induction machines based on TFR and ANN. We have introduced the PSO algorithm to optimize the size of the feature vectors. Our classification is based on the Doppler-delay ambiguity plane, from which all TFRs can be derived by a suitable choice of kernel. Each type of fault was characterized by a specific kernel. The classification algorithm was tested on experimental data collected from stator current measurements at different load levels. The assignment of signals was made by an ANN classifier. The results show that the new algorithm, with the neural network classifier as a decision criterion and PSO as an optimizing technique, is able to detect and diagnose faults with acceptable accuracy and with reduced computation time compared to the case without PSO optimization, independently of the load condition and the fault type.
Ammar Medoued received the degree of Doctor of Sciences in Electrical Engineering from the University of Skikda, Algeria. He is currently a Lecturer at the University of Skikda and the Head of the Department of Electrical Engineering. His main research field is electrical machine diagnosis.
Abdesselam Lebaroud was born in Constantine, Algeria, in 1969. He received the PhD degree in electrical engineering from University Claude Bernard Lyon I, Ampere laboratory, France, in 2007. Currently, he is a Professor at the Department of Electrical Engineering, University of Skikda. He carried out research on the diagnosis of electrical machines at LGEC of Constantine.
Abdelaziz Laifa worked in the oil industry for many years. Since 2001, he has been with the University of Skikda as a lecturer and researcher, where he received the PhD in Electrical Power Engineering in 2012. His main interests are power systems analysis and control using intelligent programming and meta-heuristic methods.
Djamel Sayad received the degree of Magister in Electronics from the University of Constantine in 1998. He is currently a Lecturer at the University of Skikda, Algeria. His main fields of research are signal processing and diagnosis.
Tavner P. J. , Gaydon B. G. , Ward D. M. 1986 “Monitoring Generators and Large Motors,” Proc. Inst. Elect. Eng. — B 133 (3) 169 - 180
Vas P. 1993 “Parameter Estimation, Condition Monitoring and Diagnosis of Electrical Machines” Clarendon Oxford, U.K.
Bouzid M. , Champenois G. , Bellaaj N.M. , Signac L. , Jelassi K. 2008 “An Effective Neural Approach for the Automatic Location of Stator Interturn Faults in Induction Motor” IEEE Transactions on Industrial Electronics 55 (12) 4277 - 4289    DOI : 10.1109/TIE.2008.2004667
Abdesselam Lebaroud , Guy Clerc 2011 “Study of Rotor Asymmetry Effects of an Induction Machine by Finite Element Method” JEET, Journal of Electrical Engineering & Technology 6 (3) 342 - 349    DOI : 10.5370/JEET.2011.6.3.342
Cupertino F. , Giordano V. , Mininno E. , Salvatore L 2005 “Application of Supervised and Unsupervised Neural Networks for Broken Rotor Bar Detection in Induction Motors” IEEE International Conference on Electric Machines and Drives 1895 - 1901
Medoued Ammar , Lebaroud Abdesselem , Boukadoum Ahcene , Clerc Guy 2010 “On-line Faults Signature Monitoring Tool for Induction Motor Diagnosis” Journal of Electrical Engineering & Technology 5 (1) 140 - 145    DOI : 10.5370/JEET.2010.5.1.140
Chow M.-y. , Mangum P.M. , Yee S.O. 1991 “A neural network approach to real-time condition monitoring of induction motors” IEEE Transactions on Industrial Electronics 38 (6) 448 - 453    DOI : 10.1109/41.107100
Su H. , Chong K. T. 2007 “Induction Machine Condition Monitoring Using Neural Network Modeling,” IEEE Trans. Ind. Electron. 54 (1) 241 - 249
Ordaz-Moreno A. , de Jesus Romero-Troncoso R. , Vite-Frias J. A. , Rivera-Gillen J. R. , Garcia-Perez A. 2008 “Automatic Online Diagnosis Algorithm for Broken-Bar Detection on Induction Motors Based on Discrete Wavelet Transform for FPGA Implementation” IEEE Trans. Ind. Electron. 55 (5) 2193 - 2202    DOI : 10.1109/TIE.2008.918613
Yazıcı B. , Kliman G. B. 1999 “An Adaptive Statistical Time-Frequency Method For Detection of Broken Bars and Bearing Faults in Motors Using Stator current,” IEEE Trans. Ind. Appl. 35 (2) 442 - 452    DOI : 10.1109/28.753640
Wang M. , Rowe G. I. , Mamishev A. V. 2004 “Classification of Power Quality Events Using Optimal Time-Frequency Representations — Part 2: Application,” IEEE Trans. Power Del. 19 (3) 1496 - 1503    DOI : 10.1109/TPWRD.2004.829869
Davy M. , Doncarli C. 1998 “Optimal kernels of time-frequency representations for signal classification,” in Proc. IEEE-SP Int. Symp. Time-Freq. Time-Scale Anal. 581 - 584
Heitz C. 1995 “Optimum Time-Frequency Representations for the Classification and Detection of Signals,” Appl. Signal Process. 2 (3) 124 - 143
Gillespie B. W. , Atlas L. 2001 “Optimizing Time-Frequency Kernels for Classification,” IEEE Trans. Signal Process. 49 (3) 485 - 496    DOI : 10.1109/78.905863
Wong K. P. , Yuryevich J. 1998 Evolutionary Programming Based Algorithm for Environmentally Constrained Economic Dispatch IEEE Trans. Power Syst. 13 (2) 301 -    DOI : 10.1109/59.667339
Angeline P. J. 1998 Using Selection to Improve Particle Swarm Optimization in Proc. IEEE International Conference on Evolutionary. Computations 84 - 89
Kennedy J. , Eberhart R. 1995 Particle swarm optimization Proc. IEEE Int. Conf. Neural Networks IV 1942 - 1948
Medoued A. , Lebaroud A. , Boukadoum A. , Boukra T. , Clerc G. 2011 “Back Propagation Neural Network for Classification of Induction Machine Faults,” 8th SDEMPED, IEEE Symposium on Diagnostics for Electrical Machines, Power Electronics & Drives Bologna, Italy September 5-8, 2011 525 - 528
Lebaroud A. , Clerc G. 2008 “Classification of Induction Machine Faults by Optimal Time frequency Representations,” IEEE Trans. on Industrial Electronics 55 (12)
Lebaroud A. , Clerc G. 2009 “Accurate Diagnosis of Induction Machine Faults Using Optimal Time-Frequency Representations” Engineering Applications of Artificial Intelligence 22 (4-5) 815 - 822    DOI : 10.1016/j.engappai.2009.01.002
Kennedy J. , Eberhart R. 1995 “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Netw. 4 1942 - 1948
Rashtchi V. , Aghmasheh R. 2010 “A New Method for Identifying Broken Rotor Bars in Squirrel Cage Induction Motor Based on Particle Swarm Optimization Method,” World Academy of Science, Engineering and Technology 67 694 - 698
Eberhart R. , Shi Y. 2001 “Particle swarm optimization: developments, applications and resources,” in Proc. Cong. Evol.Comput 1 81 - 86
Kennedy J. , Mendes R. 2003 “Neighborhood topologies in fully informed and best-of-neighborhood particle swarms,” Proc. of the IEEE International Workshop 45 - 50
M'hamed B. 2009 “Using Two Pso-Structures Approaches To Estimate Induction Machine Parameters ” 13th European Conference on Power Electronics and Applications 8-10 Sept. 2009 1 - 8
Hamid R.H.A. , Amin A.M.A. , Ahmed R.S. , El-Gammal A. 2006 “New Technique for Maximum Efficiency and Minimum Operating Cost of Induction Motors Based on Particle Swarm Optmization (PSO)” IEEE International Symposium on Industrial Electronics 3 (21) 2176 - 2181
Fan H. , Shi Y. 2001 “Study on Vmax of particle swarm optimization,” in Proc. Workshop on Particle Swarm Optimization, Purdue School of Engineering and Technology Indianapolis, IN
Fausett L. 1994 “Fundamentals of neural networks architectures, algorithms, and applications.” Prentice Hall Englewood Cliffs, NJ
Haykin S. 1998 “Neural networks: a comprehensive foundation” 2nd ed Macmillan New York