A HIGHER ORDER ITERATIVE ALGORITHM FOR MULTIVARIATE OPTIMIZATION PROBLEM
Journal of Applied Mathematics & Informatics, 2014 Sep; 32(5-6): 747-760
Copyright © 2014, Korean Society of Computational and Applied Mathematics
  • Received : February 22, 2014
  • Accepted : May 24, 2014
  • Published : September 30, 2014
SUVRA KANTI CHAKRABORTY
GEETANJALI PANDA

Abstract
In this paper a higher order iterative algorithm is developed for an unconstrained multivariate optimization problem. The Taylor expansion of matrix-valued functions is used to prove the cubic order of convergence of the proposed algorithm. The methodology is supported with numerical and graphical illustrations.
AMS Mathematics Subject Classification: 90C25, 49M15.
1. Introduction
The classical Newton method is one of the most popular gradient-based iterative methods, widely used for its quadratic convergence property. In recent years, much research has been devoted to developing higher order iterative algorithms based on the logic of Newton's method, in different areas of numerical computation. Several important higher order iterative methods for finding the root of a nonlinear equation appear in the literature. Homeier proposed a modification of Newton's method for finding the zero of univariate functions that converges cubically [6, 7]. Kou et al. proposed a cubic order convergent algorithm for solving nonlinear equations [8] and also some variants of Ostrowski's method with seventh-order convergence [9]. Chun contributed schemes with fourth order convergence and their families [2, 3]. Liu et al. proposed an eighth order method with high efficiency index [10], and Cordero et al. proposed sixth and seventh order schemes [4] for finding the root of a univariate nonlinear equation. Employing any of these iterative methods, one can optimize a univariate, nonlinear differentiable function efficiently. In this paper, however, an attempt is made to develop a higher order iterative process for optimizing a multivariate function. To develop this scheme, the trapezoidal approximation of a definite integral is used and the classical Newton method is considered in an implicit form. The theory of Taylor expansion of matrix-valued functions is used to establish the convergence of the algorithm. It is proved that the proposed algorithm has the cubic order convergence property.
The calculus of matrix-valued functions has been widely used in various fields of mathematics. This theory has been developed in several directions by researchers such as Turnbull [12, 13], Dwyer et al. [5] and Vetter [14, 15]. The theory of matrix calculus by Vetter [14, 15], which uses Kronecker algebra, is the most popular one for its consistency and completeness. It has later been adopted by researchers from various fields such as system theory [1], sensitivity analysis [18], stochastic perturbation [17], statistical models [20, 16], econometric forecasting [11] and neural networks [19]. In this paper we use the Taylor expansion of a matrix-valued function as developed by Vetter [15] to prove the convergence of our algorithm.
The content of this paper is summarized in the following sections. In Section 2, the new scheme is proposed. In Section 3, a detailed convergence analysis of the proposed scheme is given. A comparative study between the classical Newton method and the proposed method is discussed in Section 4. Finally, a table with several test functions and a graphical illustration are given in the Appendix.
2. Proposing a new multivariate optimization algorithm
Consider the optimization problem

(P)   min_{x∈ℝ^s} f(x),

where f : ℝ^s → ℝ is a sufficiently differentiable function.

Denote by x_n the current iteration point and let φ(θ) = ∇f(x_n + θ(x − x_n)). Then φ(0) = ∇f(x_n), φ(1) = ∇f(x), and φ′(θ) = [∇²f(x_n + θ(x − x_n))](x − x_n). From the fundamental theorem of calculus,

φ(1) = φ(0) + ∫₀¹ φ′(θ) dθ.

So

∇f(x) = ∇f(x_n) + ∫₀¹ [∇²f(x_n + θ(x − x_n))](x − x_n) dθ.

Hence, approximating the integral by the trapezoidal rule,

∫₀¹ [∇²f(x_n + θ(x − x_n))](x − x_n) dθ ≈ (1/2)[∇²f(x_n) + ∇²f(x)](x − x_n).

Hence

∇f(x) ≈ ∇f(x_n) + (1/2)[∇²f(x_n) + ∇²f(x)](x − x_n).

Let x_{n+1} be the root of the equation

∇f(x_n) + (1/2)[∇²f(x_n) + ∇²f(x_{n+1})](x_{n+1} − x_n) = 0.

Thus,

x_{n+1} = x_n − 2[∇²f(x_n) + ∇²f(x_{n+1})]⁻¹ ∇f(x_n).

This is an implicit functional relation in x_{n+1} at x_n. We replace ∇²f(x_{n+1}) by ∇²f(z_n), where z_n is the next iteration point derived by the classical Newton method at x_n. Then the new iteration scheme becomes

x_{n+1} = x_n − 2[∇²f(x_n) + ∇²f(z_n)]⁻¹ ∇f(x_n),  where z_n = x_n − [∇²f(x_n)]⁻¹ ∇f(x_n).   (1)
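Computationally, each step of (1) needs one gradient, two Hessians and two linear solves. The following is a minimal NumPy sketch of scheme (1); the routine name proposed_method, the stopping rule ||∇f(x_n)|| < tol and the test function are our illustrative choices, not taken from the paper.

```python
import numpy as np

def proposed_method(grad, hess, x0, tol=1e-10, max_iter=100):
    """Iterate scheme (1): a Newton predictor z_n followed by the
    trapezoidal correction x_{n+1} = x_n - 2[H(x_n) + H(z_n)]^{-1} g(x_n).
    `grad` and `hess` return the gradient vector and Hessian matrix of f."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < tol:              # stationarity reached
            return x, k
        z = x - np.linalg.solve(H, g)            # classical Newton predictor z_n
        x = x - 2.0 * np.linalg.solve(H + hess(z), g)   # corrected step (1)
    return x, max_iter

# Illustrative example: f(x, y) = x^4 + y^4 + x*y (smooth, nonquadratic)
grad = lambda v: np.array([4*v[0]**3 + v[1], 4*v[1]**3 + v[0]])
hess = lambda v: np.array([[12*v[0]**2, 1.0], [1.0, 12*v[1]**2]])
xstar, iters = proposed_method(grad, hess, [1.0, -1.5])
```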
3. Convergence analysis of the new scheme
To study the convergence of the new scheme (1), the following notations and definitions are explained as prerequisites in Subsection 3.1. In Subsection 3.2, some new definitions and lemmas are introduced, which are then used to prove the convergence theorem in Subsection 3.3.
3.1. Prerequisites.
I_s = the s × s identity matrix.
ρ(A) = the spectral radius of the matrix A.
A ⊗ B = the Kronecker product of two matrices A and B:
for A = (a_ij)_{m×n} and B = (b_ij)_{s×t}, A ⊗ B = (a_ij B)_{ms×nt}.
A^{×k} = A ⊗ A ⊗ · · · ⊗ A (the k-th Kronecker power of A).
AB = the matrix product of two matrices A and B.
For matrices A, B, C and D the following properties hold.
(P1) A ⊗ (B + C) = A ⊗ B + A ⊗ C.
(P2) (A + B) ⊗ C = A ⊗ C + B ⊗ C.
(P3) (kA) ⊗ B = A ⊗ (kB) = k(A ⊗ B), where k is a scalar.
(P4) (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C).
(P5) (A ⊗ B)(C ⊗ D) = AC ⊗ BD, provided the dimensions agree so that the matrix products AC and BD hold.
(P6) (A ⊗ B) ⊗ I_s = (A ⊗ I_s)(B ⊗ I_s) (this follows from (P5)).
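Properties such as (P1) and (P5) are easy to sanity-check numerically; a small sketch using numpy.kron (our choice of tool, with shapes chosen so the products in (P5) conform):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.random((2, 3)), rng.random((4, 5))
C, D = rng.random((3, 2)), rng.random((5, 4))

# (P5): (A ⊗ B)(C ⊗ D) = AC ⊗ BD, dimensions agreeing for AC and BD
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
assert np.allclose(lhs, rhs)

# (P1): A ⊗ (B + C') = A ⊗ B + A ⊗ C' for C' of the same shape as B
Cp = rng.random((4, 5))
assert np.allclose(np.kron(A, B + Cp), np.kron(A, B) + np.kron(A, Cp))
```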
Definition 3.1 (Matrix function). A matrix function

A_{p×q}(B_{s×t}) : ℝ^{s×t} → ℝ^{p×q}

maps a matrix of dimension s × t to a matrix of dimension p × q.

Definition 3.2 (Matrix derivative [15]). The derivatives of a matrix-valued function A_{p×q}(B) with respect to a scalar element b_kl and with respect to the matrix B_{s×t} are defined as

∂A(B)/∂b_kl = (∂a_ij/∂b_kl)_{p×q},   D_B A(B) = [∂A(B)/∂b_kl],

the latter being the ps × qt block matrix whose (k, l)-th block is ∂A(B)/∂b_kl. Higher order derivatives are given recursively as

D_B^k A(B) = D_B(D_B^{k−1} A(B)),  k = 2, 3, . . . .   (2)

Matrix Taylor expansion: The Taylor expansion of a matrix-valued function A_{p×q}(u) of a column vector u ∈ ℝ^s about the column vector u₀ ∈ ℝ^s, described in [15], is

A(u) = Σ_{k=0}^{M−1} (1/k!) [D_{u^T}^k A(u)]_{u=u₀} ((u − u₀)^{×k} ⊗ I_q) + R_M(u, u₀),   (3)

where the remainder term is

R_M(u, u₀) = (1/M!) [D_{u^T}^M A(u)]_{u=ξ} ((u − u₀)^{×M} ⊗ I_q)

for some ξ on the line segment joining u₀ and u.
3.2. New definitions and lemmas.
Definition 3.3. Let f : ℝ^s → ℝ be a sufficiently differentiable function and let ∇f be its gradient. A function

∇^{k+1} f : ℝ^s → ℝ^{s×s^k},  k = 0, 1, 2, . . . ,

is defined as

∇^{k+1} f(x) = D_{x^T}^k (∇f(x)),

where D_{x^T}^k is as defined in (2) and x^T is the row vector (x₁, x₂, . . . , x_s).
Lemma 3.4. ∇²f(x) = D_{x^T}(∇f(x)), i.e., the function ∇²f of Definition 3.3 (k = 1) is the usual s × s Hessian matrix of f.

Proof. This result follows from Definition 3.3 directly.
Lemma 3.5. D_{x^T}^k (∇²f(x)) = ∇^{k+2} f(x) for k ∈ ℕ.

Proof. Using Lemma 3.4 and Definition 3.3,

D_{x^T}^k (∇²f(x)) = D_{x^T}^k (D_{x^T}(∇f(x))) = D_{x^T}^{k+1}(∇f(x)) = ∇^{k+2} f(x).
Lemma 3.6. If a ∈ ℝ^m and b ∈ ℝ^n, then ||a ⊗ b|| = ||a|| ||b||, where ||·|| is the Euclidean norm.

Proof. For a = (a₁, a₂, . . . , a_m)^T and b = (b₁, b₂, . . . , b_n)^T,

a ⊗ b = (a₁b₁, . . . , a₁b_n, a₂b₁, . . . , a_m b_n)^T.

So

||a ⊗ b||² = Σ_{i=1}^m Σ_{j=1}^n a_i² b_j² = (Σ_{i=1}^m a_i²)(Σ_{j=1}^n b_j²) = ||a||² ||b||².
Lemma 3.7. If u ∈ ℝ^s, then ||u^{×n}|| = ||u||^n for n ∈ ℕ.

Proof. For n = 1, ||u^{×1}|| = ||u|| = ||u||¹.
Suppose ||u^{×k}|| = ||u||^k for some k. Then for n = k + 1, using Lemma 3.6,

||u^{×(k+1)}|| = ||u ⊗ u^{×k}|| = ||u|| ||u^{×k}|| = ||u|| ||u||^k = ||u||^{k+1},

so the result follows by induction.
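Lemmas 3.6 and 3.7 can likewise be checked numerically; in the sketch below, reduce builds the Kronecker power u^{×n} (the vectors and sizes are arbitrary illustrative choices):

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)
a, b, u = rng.random(3), rng.random(4), rng.random(3)

# Lemma 3.6: ||a ⊗ b|| = ||a|| ||b||
assert np.isclose(np.linalg.norm(np.kron(a, b)),
                  np.linalg.norm(a) * np.linalg.norm(b))

# Lemma 3.7: ||u^{×n}|| = ||u||^n, with u^{×n} the n-th Kronecker power
n = 4
u_pow = reduce(np.kron, [u] * n)
assert np.isclose(np.linalg.norm(u_pow), np.linalg.norm(u) ** n)
```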
Lemma 3.8. If u ∈ ℝ^s, then (u^{×n} ⊗ I_s)(u^{×1} ⊗ I₁) = u^{×(n+1)} for n ∈ ℕ.

Proof. For n = 1, (u^{×1} ⊗ I_s)(u^{×1} ⊗ I₁) = (u ⊗ I_s)u = u ⊗ u = u^{×2}.
Suppose (u^{×k} ⊗ I_s)(u^{×1} ⊗ I₁) = u^{×(k+1)} for some k. Then for n = k + 1, by (P5),

(u^{×(k+1)} ⊗ I_s)(u^{×1} ⊗ I₁) = (u^{×(k+1)} ⊗ I_s)(1 ⊗ u) = (u^{×(k+1)} · 1) ⊗ (I_s u) = u^{×(k+2)}.
3.3. Third order convergence of the algorithm.
Let α ∈ ℝ^s be the solution of ∇f = 0. Using the matrix Taylor expansion (3) about α, ∇f(x) and ∇²f(x) can be expressed as

∇f(x) = ∇f(α) + Σ_{k=1}^{M−1} (1/k!) [D_{x^T}^k ∇f(x)]_{x=α} (x − α)^{×k} + R_M(x, α),   (4)

where R_M(x, α) is the remainder term of the expansion of ∇f, and

∇²f(x) = ∇²f(α) + Σ_{k=1}^{M−1} (1/k!) [D_{x^T}^k ∇²f(x)]_{x=α} ((x − α)^{×k} ⊗ I_s) + R̄_M(x, α),   (5)

where R̄_M(x, α) is the corresponding remainder term.
Using Definition 3.3, ∇f(α) = 0 and replacing x by x_n, (4) can be rewritten as

∇f(x_n) = Σ_{k=1}^{M−1} (1/k!) ∇^{k+1}f(α) (x_n − α)^{×k} + R_M(x_n, α).   (6)

Using Lemma 3.5 and replacing x by x_n, (5) can be rewritten as

∇²f(x_n) = ∇²f(α) + Σ_{k=1}^{M−1} (1/k!) ∇^{k+2}f(α) ((x_n − α)^{×k} ⊗ I_s) + R̄_M(x_n, α).   (7)

Denote

C_k = (1/k!) [∇²f(α)]⁻¹ ∇^{k+1}f(α),  k = 1, 2, . . . ,   (8)

so that, in particular, C₁ = I_s.
Neglecting the remainder terms for large M and using (8), we rewrite (6) and (7) as

∇f(x_n) = ∇²f(α)[C₁(x_n − α) + C₂(x_n − α)^{×2} + C₃(x_n − α)^{×3} + · · ·]   (9)

and

∇²f(x_n) = ∇²f(α)[I_s + 2C₂((x_n − α) ⊗ I_s) + 3C₃((x_n − α)^{×2} ⊗ I_s) + · · ·],   (10)
respectively. Now, (10) can be written as

∇²f(x_n) = ∇²f(α)(I_s + B),  where B = 2C₂((x_n − α) ⊗ I_s) + 3C₃((x_n − α)^{×2} ⊗ I_s) + · · · .   (11)

For large n, x_n is in a sufficiently close neighborhood of α such that ρ(B) < 1. So from (11),

[∇²f(x_n)]⁻¹ = (I_s − B + B² − · · ·)[∇²f(α)]⁻¹.   (12)
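The expansion in (12) is a matrix Neumann series, valid whenever ρ(B) < 1. A quick numerical illustration follows; rescaling a random matrix so that ρ(B) = 1/2 is our own device for forcing convergence of the series:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.random((4, 4))
B *= 0.5 / max(abs(np.linalg.eigvals(B)))   # force spectral radius 1/2

# (I + B)^{-1} equals the alternating Neumann series sum_k (-B)^k
inv = np.linalg.inv(np.eye(4) + B)
series = sum(np.linalg.matrix_power(-B, k) for k in range(60))
assert np.allclose(inv, series)
```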
From (9) and (12), we have

[∇²f(x_n)]⁻¹∇f(x_n) = (I_s − B + B² − · · ·)[C₁(x_n − α) + C₂(x_n − α)^{×2} + C₃(x_n − α)^{×3} + · · ·].   (13)

Expanding each term on the right hand side of the above expression, we have

[∇²f(x_n)]⁻¹∇f(x_n) = C₁(x_n − α) + C₂(x_n − α)^{×2} + C₃(x_n − α)^{×3} − BC₁(x_n − α) − BC₂(x_n − α)^{×2} + B²C₁(x_n − α) + · · · .

Rearranging the terms on the right side of the above expression according to Kronecker power, using Lemma 3.8 and putting C₁ = I_s, we get

[∇²f(x_n)]⁻¹∇f(x_n) = (x_n − α) − C₂(x_n − α)^{×2} + 2[C₂(I_s ⊗ C₂) − C₃](x_n − α)^{×3} + · · · .
z_n is the classical Newton iterate at x_n (see (1)). Replacing x_n by z_n in (10), we get

∇²f(z_n) = ∇²f(α)[I_s + 2C₂((z_n − α) ⊗ I_s) + 3C₃((z_n − α)^{×2} ⊗ I_s) + · · ·].   (14)

Substituting the expression of z_n from (1) in the above expression, we have

∇²f(z_n) = ∇²f(α)[I_s + 2C₂(D ⊗ I_s) + 3C₃(D^{×2} ⊗ I_s) + · · ·].

Denote

P = (1/2)[B + 2C₂(D ⊗ I_s) + 3C₃(D^{×2} ⊗ I_s) + · · ·]

and

D = z_n − α = C₂(x_n − α)^{×2} − 2[C₂(I_s ⊗ C₂) − C₃](x_n − α)^{×3} + · · · .   (15)

One may observe that in the expression of D in (15) the lowest Kronecker power of (x_n − α) is 2. As we are writing only those terms which produce at most the third Kronecker power of (x_n − α), there is no need to write D^{×2} and D^{×3} explicitly. After simplifying, the expression for P becomes

P = C₂((x_n − α) ⊗ I_s) + (3/2)C₃((x_n − α)^{×2} ⊗ I_s) + C₂((C₂(x_n − α)^{×2}) ⊗ I_s) + · · · .

Hence (14) can be expressed as

(1/2)[∇²f(x_n) + ∇²f(z_n)] = ∇²f(α)(I_s + P).   (16)
So

[∇²f(x_n) + ∇²f(z_n)]⁻¹ = (1/2)(I_s + P)⁻¹[∇²f(α)]⁻¹.

(For large n, x_n is in a sufficiently close neighborhood of α, so ρ(P) < 1. Hence (I_s + P)⁻¹ = I_s − P + P² − · · · .) Substituting the value of P,

(I_s + P)⁻¹ = I_s − C₂((x_n − α) ⊗ I_s) − (3/2)C₃((x_n − α)^{×2} ⊗ I_s) − C₂((C₂(x_n − α)^{×2}) ⊗ I_s) + [C₂((x_n − α) ⊗ I_s)]² + · · · .
From the iteration scheme (see (1)),

x_{n+1} − α = (x_n − α) − 2[∇²f(x_n) + ∇²f(z_n)]⁻¹∇f(x_n) = (x_n − α) − (I_s + P)⁻¹[∇²f(α)]⁻¹∇f(x_n).

Denote e_n = x_n − α. Then,

e_{n+1} = e_n − (I_s + P)⁻¹[e_n + C₂e_n^{×2} + C₃e_n^{×3} + · · ·] = [C₂(C₂ ⊗ I_s) + (1/2)C₃] e_n^{×3} + · · · .

Using Lemma 3.7,

||e_{n+1}|| ≤ ||C₂(C₂ ⊗ I_s) + (1/2)C₃|| ||e_n||³ + O(||e_n||⁴).

Since ||e_n|| → 0, from some large n onwards,

||e_{n+1}|| ≤ (||C₂(C₂ ⊗ I_s) + (1/2)C₃|| + ε) ||e_n||³,

where ε is a small positive real number, or,

||e_{n+1}|| ≤ r ||e_n||³,

where r is a positive real constant.
This implies that the new scheme has third order convergence. Hence the following result holds.
Theorem 3.9. Let f : ℝ^s → ℝ be a sufficiently differentiable function, locally convex at α ∈ ℝ^s, such that ∇f(α) = 0. Then the algorithm (1), with an initial point x₀ sufficiently close to α, converges cubically to the local minimizer α of the problem min_{x∈ℝ^s} f(x).
4. Numerical Result
The new algorithm is executed in MATLAB (version R2013b) and the numerical computations are summarized in Table 1. One may observe that the total number of iterations of the proposed method is less than the total number of iterations of the classical Newton method. All the steps for one of these test functions are illustrated graphically in Fig. 1, where it is seen that the proposed process reaches the minimizer more rapidly than the existing process. CNM denotes the classical Newton method and PM denotes the proposed method. Table 1 and Fig. 1 are provided in the Appendix.
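For readers who wish to reproduce the flavor of this comparison, a minimal Python harness is sketched below; the test function f(x, y) = x⁴ + y⁴ + xy, the tolerance and the starting point are our illustrative choices, not the paper's actual test set.

```python
import numpy as np

def run(step, grad, hess, x0, tol=1e-12, max_iter=200):
    """Count iterations until ||grad f(x)|| < tol for a given update rule."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return k
        x = step(x, g, hess)
    return max_iter

def cnm_step(x, g, hess):                     # classical Newton method (CNM)
    return x - np.linalg.solve(hess(x), g)

def pm_step(x, g, hess):                      # proposed method (PM), scheme (1)
    z = x - np.linalg.solve(hess(x), g)
    return x - 2.0 * np.linalg.solve(hess(x) + hess(z), g)

# Illustrative test function f(x, y) = x^4 + y^4 + x*y
grad = lambda v: np.array([4*v[0]**3 + v[1], 4*v[1]**3 + v[0]])
hess = lambda v: np.array([[12*v[0]**2, 1.0], [1.0, 12*v[1]**2]])

x0 = [0.8, -0.9]
print("CNM iterations:", run(cnm_step, grad, hess, x0))
print("PM iterations: ", run(pm_step, grad, hess, x0))
```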
5. Conclusion
Several higher order optimization algorithms exist for single-dimensional optimization problems in the literature of numerical optimization. The Newton, quasi-Newton and conjugate gradient algorithms, which are used for multidimensional optimization problems, have second order and superlinear rates of convergence. This paper has developed a cubic order iterative algorithm for unconstrained optimization problems in higher dimensions. The Taylor expansion of matrix-valued functions is the key concept used to prove the convergence of the algorithm. Using this logic, the reader may extend the present work to develop similar algorithms with order of convergence greater than 3. In the process of developing the recurrence relation, the trapezoidal approximation is used; however, one may also try other types of approximation.
BIO
S.K. Chakraborty received his M.Sc. from the Indian Institute of Technology Kharagpur, India, where he is currently pursuing a Ph.D. His research interests include numerical optimization.
Department of Mathematics, Indian Institute of Technology Kharagpur, India.
e-mail: suvrakanti@maths.iitkgp.ernet.in
G. Panda received her Ph.D. from Utkal University, India. She is currently an Associate Professor at the Indian Institute of Technology Kharagpur, India. Her research interests include convex optimization, numerical optimization and optimization under uncertainty.
Department of Mathematics, Indian Institute of Technology Kharagpur, India.
e-mail: geetanjali@maths.iitkgp.ernet.in
References
Brewer, John (1978), Kronecker products and matrix calculus in system theory, IEEE Transactions on Circuits and Systems 25(9), 772-781.
Chun, Changbum (2007), A family of composite fourth-order iterative methods for solving nonlinear equations, Applied Mathematics and Computation 187(2), 951-956. DOI: 10.1016/j.amc.2006.09.009
Chun, Changbum (2008), Some fourth-order iterative methods for solving nonlinear equations, Applied Mathematics and Computation 195(2), 454-459. DOI: 10.1016/j.amc.2007.04.105
Cordero, Alicia, Hueso, José L., Martínez, Eulalia and Torregrosa, Juan R. (2010), A family of iterative methods with sixth and seventh order convergence for nonlinear equations, Mathematical and Computer Modelling 52(9), 1490-1496. DOI: 10.1016/j.mcm.2010.05.033
Dwyer, Paul S. and MacPhail, M.S. (1948), Symbolic matrix derivatives, The Annals of Mathematical Statistics 19(4), 517-534. DOI: 10.1214/aoms/1177730148
Homeier, H.H.H. (2003), A modified Newton method for rootfinding with cubic convergence, Journal of Computational and Applied Mathematics 157(1), 227-230. DOI: 10.1016/S0377-0427(03)00391-1
Homeier, H.H.H. (2005), On Newton-type methods with cubic convergence, Journal of Computational and Applied Mathematics 176(2), 425-432. DOI: 10.1016/j.cam.2004.07.027
Kou, Jisheng, Li, Yitian and Wang, Xiuhua (2006), A modification of Newton method with third-order convergence, Applied Mathematics and Computation 181(2), 1106-1111. DOI: 10.1016/j.amc.2006.01.076
Kou, Jisheng, Li, Yitian and Wang, Xiuhua (2007), Some variants of Ostrowski's method with seventh-order convergence, Journal of Computational and Applied Mathematics 209(2), 153-159. DOI: 10.1016/j.cam.2006.10.073
Liu, Liping and Wang, Xia (2010), Eighth-order methods with high efficiency index for solving nonlinear equations, Applied Mathematics and Computation 215(9), 3449-3454. DOI: 10.1016/j.amc.2009.10.040
Estrada, Alfredo Martinez (2007), A treatise on econometric forecasting, Ph.D. thesis, California Institute of Technology.
Turnbull, H.W. (1928), On differentiating a matrix, Proceedings of the Edinburgh Mathematical Society (Series 2) 1(2), 111-128. DOI: 10.1017/S0013091500007434
Turnbull, H.W. (1930), A matrix form of Taylor's theorem, Proceedings of the Edinburgh Mathematical Society (Series 2) 2(1), 33-54. DOI: 10.1017/S0013091500007537
Vetter, William J. (1970), Derivative operations on matrices, IEEE Transactions on Automatic Control 15(2), 241-244. DOI: 10.1109/TAC.1970.1099409
Vetter, William J. (1973), Matrix calculus operations and Taylor expansions, SIAM Review 15(2), 352-369. DOI: 10.1137/1015034
Walter, Eric and Pronzato, Luc (1997), Identification of Parametric Models, Communications and Control Engineering, Springer.
Zhang, Yimin, Chen, Suhuan, Liu, Qiaoling and Liu, Tieqiang (1996), Stochastic perturbation finite elements, Computers & Structures 59(3), 425-429. DOI: 10.1016/0045-7949(95)00267-7
Zhang, Yimin, Wen, Bangchun and Liu, Qiaoling (2002), Sensitivity analysis of rotor-stator systems with rubbing, Mechanics of Structures and Machines 30(2), 203-211. DOI: 10.1081/SME-120003016
Zhang, Y.M., Zhang, L., Zheng, J.X. and Wen, B.C. (2006), Neural network for structural stress concentration factors in reliability-based optimization, Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering 220(3), 217-224. DOI: 10.1243/09544100G00505
Zhao, Qi and Bohn, Christian (2013), A linearization free recursive prediction error method for combined state and parameter estimation for nonlinear systems, American Control Conference (ACC), 2013, 899-904.