Short-term Electric Load Forecasting Based on Wavelet Transform and GMDH
Short-term Electric Load Forecasting Based on Wavelet Transform and GMDH
Journal of Electrical Engineering and Technology. 2015. May, 10(3): 832-837
Copyright © 2015, The Korean Institute of Electrical Engineers
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • Received : May 01, 2014
  • Accepted : December 10, 2014
  • Published : May 01, 2015
Export by style
Cited by
About the Authors
Bon-Gil Koo
Dept. of Electrical and Electronic Engineering, Pusan National University, Korea.
Heung-Seok Lee
Dept. of Electrical and Electronic Engineering, Pusan National University, Korea.
Juneho Park
Corresponding Author: Dept. of Electrical and Electronic Engineering, Pusan National University, Korea. (

The group method of data handling (GMDH) algorithm has proven to be a powerful and effective way to extract rules or polynomials from an electric load pattern. However, because it is nonstationary, the load pattern needs to be decomposed using a discrete wavelet transform. In addition, if a load pattern has a complicated curve pattern, GMDH should use a higher polynomial, which requires complex computing and consumes a lot of time. This paper suggests a method for short-term electric load forecasting that uses a wavelet transform and a GMDH algorithm. Case studies with the proposed algorithm were carried out for one-day-ahead forecasting of hourly electric loads using data during the years 2008–2011. To prove the effectiveness of our proposed approach, the results were evaluated and compared with those obtained by Holt-Winters method and artificial neural network. Our suggested method resulted in better performance than either comparison group.
1. Introduction
The importance of short-term electric load forecasting (STLF) has been growing over the years. STLF is one of the significant factors in the operation of an electricity grid. Traditionally, accurate STLF can lead to the proper selection of the amount of reserve margin, which contributes to the improved efficiency of the power supply. STLF also helps in the controlling and scheduling of power systems. By contrast, inaccurate STLF can cause a residual or short supplement and may lead to overinvestment or shortage of electric power.
STLF is necessary not only for the stable and efficient operation of grid but also for appropriately accommodating new industrial restructuring. Nowadays, as the electric power market experiences privatization and deregulation, accurate STLF has become a matter of common interest to market participants. In the near future, demand response will be available owing to technical improvements in metering and real-time communication. Tools such as advanced metering infrastructure (AMI) will help customers access time-varying electricity prices [1] . Thus, electric load forecasting will have a key role in electric market.
Various models have been established for STLF over the course of several decades. Each model can be classified into three categories by their base method, statistical method, intelligent method and hybrid method. For statistical method, the regression method are quite common. To meet the specific purpose, improved regression algorithm, such as ARMA, ARIMA, have been presented [2 - 3] However, functional relationship between electric load and various weather variables or other social variables. Nowadays, the methods based on intelligence algorithm such as Artificial Neural Network, Fuzzy algorithm can be alternatives to overcome the finding function. They also flexible and robust for nonlinear problem. So, during recent decade, these methods are focused to solve nonlinear relationship between meteorological data and electric load data [4 - 7] . There are several hybrid method to complement the single algorithm model. [8 - 9]
Electric load data are time series data that have a certain period. However, time series data are the sum of information from various periods. The period for time series data consists of a weekly day-type pattern. Time series data also have a shorter-term pattern that comes from industrial load, and a longer-term pattern that comes from seasonal or weather changes. For this reason, it is helpful to analyze the origin of data after we divide the data into some frequencies. To express the data in the frequency domain, the Fourier transform is one of most popular methods. However, practical electric load data are nonstationary, so a Wavelet transform is more suitable than the Fourier transform. Many papers have been published that explore Wavelet decomposition. Most of these papers have featured an approach that uses artificial neural networks [4 - 7] . This approach showed relatively high accuracy compared to conventional forecasting methods [7] .
2. Wavelet Transform
Using a wavelet means that certain seismic signals can be suitably modeled by combining translations and dilations of a simple oscillatory function of finite duration [10] . Unlike trigonometric functions, this wave enables us to analyze a nonstationary signal using a pulse that is located in time and has finite energy that integrates to zero.
- 2.1 Wavelet decomposition and reconstruction
Wavelet decomposition is a method to gather information from the breaking up of a signal into shifted and scaled versions of the mother wavelet. As it travels, the original signal becomes a lower-resolution signal. There are two types of filters (low-pass and high-pass) that can decompose or reconstruct the signal. Both the low- and high-pass filters form a system. The original signal S is decomposed into details (abbreviated as D1) and approximations (abbreviated as A1) by going through each filter. The coefficient D1, obtained from the high-pass filter, contains a high-frequency component that describes the short-term period pattern. The other coefficient, A1, is obtained from the low-pass filter and contains a low-frequency component that describes the long-term period pattern. If the decomposition level is two or higher, only an approximate coefficient is decomposed. The coefficient A1 is decomposed into A2 and D2. The next level of decomposition is achieved by repeating this sequence. The decomposition can continue only until the individual details consist of a single sample or pixel [11] . After this process is complete, the signal can be smoothed by eliminating the high frequency or by separating it into high frequency and low frequency. This is shown in Fig. 1 .
PPT Slide
Lager Image
Process of a signal being decomposed into three resolution levels.
Wavelet-decomposed components can be reconstruct without loss of data. This process is called reconstruction, and the procedure is shown in Fig. 2 .
PPT Slide
Lager Image
Reconstruction tree of decomposed signal.
The major characteristic of reconstruction is that it can reconstruct the signal from the coefficients of the details and approximations. The original signal can be reconstructed from the coefficient vectors of the approximations and details. During this process, reconstruction only needs coefficient vectors, which are produced by downsampling the length of the signal by half. Therefore, before reconstruction, the coefficient should be modified by allocating zeros between samples.
Using multiresolution analysis theory, a fast discrete wavelet transform known as the Mallat algorithm was developed. When the signal is decomposed, it uses different resolution levels.
Fig. 1 shows this process using a signal that has been decomposed into three resolution levels. As shown in Fig. 1 , the Mallat algorithm focuses on a careful decomposition of the low-frequency space of the signal, resulting in better resolution in the low-frequency series.
Compared with the original signal, the series decomposed by the Mallat algorithm shows a great reduction in the nonstationary characteristics. Because the series decomposed by wavelet decomposition is an approximate stationary series, wavelet decomposition is useful in improving the forecasting accuracy.
A three-level wavelet transform can be divided into three details and three approximations, as seen below:
  • S=A1+D1
  •      =A2+D2
  •      =A3+D3+D2+D1
- 2.2 Continuous and discrete wavelet transforms
Wavelet Transform is consist of a continuous Wavelet and a discrete wavelet transform (DWT). W(a, b) of signal f(x), with respect to a mother wavelet f(x), is described as [12]
PPT Slide
Lager Image
where a and b are real, and * denotes complex conjugation. The scale parameter a controls the spread of the wavelet, and translation parameter b determines its central position.
The W(a, b) coefficient represents how well the original signal f(x) and the scaled / translated mother wavelet match. Thus, for all a, b associated with a particular signal, the set of all wavelet coefficients W(a, b) is the wavelet representation of the signal with respect to the mother wavelet. The CWT provides a redundant representation of the signal in the sense that the entire support of W(a, b) need not be used to recover f(t). Instead of using that approach, the mother wavelet can be scaled and translated using certain scales and positions usually based on powers of two. This is known as the DWT. This scheme is more efficient than, and is as accurate as, the CWT [10] . The DWT is described in the following equation:
PPT Slide
Lager Image
where a, b are same in (1) and can be expressed in m. In (2), k is an integer variable that refers to a particular point of the input signal, and n is the discrete time index.
3. GMDH(Group Method of Data Handling) Algorithm
The GMDH algorithm was first developed by A.G. Ivakhnenko. It is a heuristic self-organization method. The GMDH can formulate or identify a complex system without tracking the path of input-output. Generally, the relationship between inputs and output in a nonlinear function can be expressed by a complicated polynomial series in the form of the Volterra series:
PPT Slide
Lager Image
The discrete form of (3) can be expressed by (4) named Kolmogorov-Gabor polynomial [13]
PPT Slide
Lager Image
where x represents the input of the system, N is the number of inputs, and a represents coefficients.
In the case of double variables and the quadratic form, (4) can be simplified by (5) [14]
PPT Slide
Lager Image
To obtain coefficient a, the algorithm follows this procedure.
  • Step 1: Select m input variables affecting the output variable. If required, normalize input data. This input data calls x1, x2, …, xm.
  • Step 2: Divide N input-output data (x1i, x2i,⋯, xni, yi, i = 1, 2, …, N) into training data and testing data. N = Ntrain+ Ntest
  • Step 3: Select input data xp, xq among n input data. Establish double-variable, second-order partial descriptions (PD).
PPT Slide
Lager Image
  • where a0, a1,⋯, a5are coefficients, and Gkis a intermediate-variables
  • Step 4: Estimate coefficients a0, a1, ⋯, a5, using linear regression and its input training data. Coefficients can be obtained by minimizing E in the equation (8).
PPT Slide
Lager Image
  • Step 5: Using coefficients obtained from the previous step, calculate the square error EL. Among E1, E2, ⋯ En(n-1)/2, the lowest error E is assigned Ebest. If the polynomials and intermediate variables satisfy the condition Ebest≤ Ec, terminate the algorithm, or arrange organized polynomials by error ELfrom lowest to highest. Select m polynomials in order of lowest error.
  • Step 6: Using the selected polynomial in the previous step, construct the next layer using new input-output data, and repeat the process beginning with Step 3.
This procedure can be shown as Fig. 3 .
PPT Slide
Lager Image
GMDH network.
4. Case Study
- 4.1 Electric load classification
Before electric load forecast, electric load data has to be classified to their day-type. To do this, normally has been used calendar based method. However, electric load data does not always follows their day-type, it has possibility to obtain incorrect data with respect to their genuine day-type characteristic. To avoid error, this paper classified electric load data into seasonal and day-type using K-mean and kNN respectively.
- 4.2 Electric load decomposition
For our proposed approach, we selected the DWT with Mallat method to perform the load decomposition. For basis function f(t), we selected Db1 (Daubechies wavelet of order 1). Some of the existing research with the wavelet transform in electric load forecasting and price forecasting usually used Db4 (Daubechies wavelet of order 4), but this approach was not appropriate for GMDH. Generally, GMDH has inputs at the utmost 2nd order polynomial, so it is better to fit the curve of the load as a square rather than as a quadratic or higher order curve.
Fig. 4 shows different results of the three-layer wavelet decomposition using Db1 and Db4, respectively. In Fig. 4 , we see that Db1 disassembles the original load into a suitable wave to be fit by lower order polynomials than Db4 does.
PPT Slide
Lager Image
Comparison of results using Db1 and Db4
- 4.3 Electric load forecasting and development of whole forecasting model
GMDH has two input units, which describe the previous load and temperature trend by the hour. The layer is modeled to be created repeatedly, until there is no more improvement in the lowest error. Data groups utilize the previous 72 h, divided into 48 h of training data and 24 h of test data. Each model can have different numbers of layers, but the maximum number is set at 100.
The variation of data structure of the proposed forecasting system is shown in Fig. 5 . Before applying the wavelet transform, the electric load is classified by its day-type and seasonal type. We divide the electric data into four seasons commonly used and five day-types such as Monday, Weekdays, Saturday, Sunday, Special holiday. Each classified electric load is used to establish forecasting models except for special holiday(summer vacation, new year’s day, etc). The load is decomposed using the wavelet transform and is forecast using GMDH, Holt-Winters method, and an artificial neural network.
PPT Slide
Lager Image
Structure of overall forecasting system
PPT Slide
Lager Image
Comparison of applying wavelet and non-wavelet forecasted
- 4.4 Test and discussion
Case studies for our proposed algorithm involved a oneday-ahead forecasting of hourly electric loads using the data from 2008–2011. To compare our data with the data from other papers, we evaluated our results using the following indices:
Mean Absolute Percentage Error(MAPE)
PPT Slide
Lager Image
Standard deviation
PPT Slide
Lager Image
To prove the validity of our proposed method, we compared our results with the same data from the applications of Holt-Winters multiplicative method and the artificial neural network. We found the smoothing constant α=0.5, β=1, γ=0.1 by increasing each value from 0.1 to 1. We selected the multilayer perceptron type with 26 input nodes, 2 layers(one of which is hidden) and 24 output neurons for Artificial Neural Network. [15] It was trained by back propagation algorithm, while the selected iteration tolerance after the error is less than 0.005. Table 1 shows day-ahead forecasting results, for each day types in MAPE. Table 2 lists the standard deviations and MAPE of the prediction errors, for each hour and for all types, in 2008 and 2011.
Comparison of forecasting results in day-type
PPT Slide
Lager Image
Comparison of forecasting results in day-type
Comparison of forecasting results in hour
PPT Slide
Lager Image
Comparison of forecasting results in hour
According to Table 1 , for spring season forecasting, our proposed method improves the accuracy in MAPE difference over Holt-Winters, ANN by 1.592%, 0.312%, respectively, for summer season forecasting, by 1.191%, 0.294%, respectively, for fall season forecasting, by 1.9%, 0.317%, respectively, for winter season forecasting, by 0.568%, 0.277%, respectively. For all day-types in each season, our proposed algorithm outperforms other methods.
As shown in Table 2 , for overall mean performance, our prosed method results in the lowest forecasting error but not in standard deviation. For all hourly data, it shows better performance than other two method. The results shows, it represent 53% and 23% higher accuracy with respect to other two algorithm. Most of all, we can find its competitive performance for peak load period(2-3P.M. in summer, 10A.M. in winter). Whereas, for all hourly results, our proposed method doesn’t show better performance than ANN in standard deviation.
Two additional test cases — weekday data only (case 1) and weekend data included (case 2) — showed case 1 with 0.96% and case 2 with 1.01%, respectively. There was not much of a difference.
5. Conclusion
This paper proposed a new algorithm using DWT and the GMDH algorithm. The forecasting was performed after we classified the electric loads by their day-type and seasonal type, using a data mining technique. Refined electric load data were decomposed into their frequency. We then used GMDH to perform load forecasting, and the data were reconstructed by using an inverted DWT.
Regarding accuracy, our proposed algorithm showed a lower error rate over the tested period, not only in the comparison group but also in reference paper. Most of all, its accuracy for peak time was competitive. Whereas, it is necessary to improve in standard deviation. Including additional meteorological factors such as wind speed, solar irradiance, and rainfall rate can help to decrease the error rate.
This work was supported by the National Research Foundation of Korea (NRF-2013R1A1A2013798).
Bon-Gil Koo He received his B.S. and M.S. Degrees from Pusan National University, Busan, Korea in 2008 and 2010, respectively. He is currently PH.D candidate at Pusan National University. His research interests include electric load forecasting and intelligent systems applications to power systems.
Heung-Seok Lee He received his B.S. Degrees from Ulsan University, Ulsan, Korea in 2012. He received his M.S. Degrees from Pusan National University, 2014. He is currently PH.D candidate at Pusan National University. His research interests include electric load forecasting and also electric ship power systems.
June Ho Park He received the B.S., M.S. and Ph.D. degrees from Seoul National University, Seoul, Korea in 1978, 1980, and 1987, respectively, all in electrical engineering. He is currently a Professor at the School of Electrical Engineering, Pusan National University, Busan, Korea. His research interests include intelligent systems applications to power systems. Dr. Park has been a member of the IEEE Power Engineering Society.
2006 “Energy policy Act of 2005,” section 1252
Amjady. N 2001 “Short-term hourly load forecasting using time-series modeling with peak load estimation capability” IEEE Transactions on Power systems. 16 (3) 498 - 505
Huang Shyh-Jier , Shih Kuang-Rong 2003 “Short-term load forecasting via ARMA model identification includ-ing non-Gaussian process considerations” IEEE Transactions on Power Systems 18 (2) 673 - 679    DOI : 10.1109/TPWRS.2003.811010
Singh D. , Singh S. P. 2001 “A self selecting neural network for short-term load forecasting,” Electric Power Component System 29 117 - 130    DOI : 10.1080/153250001300003386
Ranaweera D. K. , Hubele N. F. , Karady G. G. 1996 “Fuzzy logic for short-term load forecasting,” Electric Power Energy System 18 (4) 215 - 222    DOI : 10.1016/0142-0615(95)00060-7
Srinivasan D. , Liow A. C. , Chang C. S. 1994 “Forecasting daily load curves using a hybrid fuzzy neural approach” IEE Proceedings Generation Transmission Distribution 141 (6) 561 - 567    DOI : 10.1049/ip-gtd:19941288
Park J. H. , Park Y. M. , Lee K. Y. 1991 “Composite Modeling for Adaptive Short-term Load Forecasting” IEEE Transactions on Power Systems 6 (2) 450 - 457    DOI : 10.1109/59.76686
El Desouky A.A , Elkateb. M.M 2000 “Hybrid adaptive techniques for electric-load forecast using ANN and ARIMA” Generation, Transmission and Distribution, IEE Proceedings 147 (4) 213 - 217    DOI : 10.1049/ip-gtd:20000521
Huang Dai-Zheng , Shu Gong Ren-Xi Gong 2015 ”Prediction of Wind Power by Chaos , BP Artificial Neural Networks Approach Based on Genetic Algorithm” Journal of Electrical Engineering & Technology 10 (1) 41 - 46    DOI : 10.5370/JEET.2015.10.1.041
Agnaldo J.Rocha Reis , Alexandre P.Alves da Silva 2005 “Feature Extraction via Multiresolution Analysis for Short-term Load Forecasting” IEEE Transactions on Power Systems 20 (1)
Ajay Shekhar Pandey , Devender Singh , Sunil Kumar Sinha 2010 “Intelligent Hybrid Wavelet Models for Short-Term Load Forecasting” IEEE Transactions on Power Systems 25 (3) 1266 - 1273    DOI : 10.1109/TPWRS.2010.2042471
Yanqju Bi , Jianguo Zhao , Dahai Zhang “Power Load Forecasting Algorithm Based on Wavelet Packet Analysis” Power System Technology – POWERCON 1 987 - 990
Aiyun Zheng , Weimin Liu , Fanggeng Zhao 2010 “Double Trends Time Series Forecasting Using a Combined ARIMA and GMDH Model” Control and Decision Conference (CCDC), 2010 Chinese 1820 - 1824
Ehab. E. Elattar , John Yannis Goulermas 2012 “Generalized Locally Weighted GMDH for Short Term Load Forecasting” IEEE Transactions on Systems, Man and Cybernetics-part C: Applications and Reviews 42 (3) 345 - 356    DOI : 10.1109/TSMCC.2011.2109378
Henrique Steinherz Hippert , Carlos Eduardo Pedreira , Reinaldo Castro Souza 2001 “Neural Networks for Short-Term Load Forecasting:A Review and Evaluation” IEEE Transactions on Power Systems 16 (1) 44 - 55    DOI : 10.1109/59.910780