Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 DISCRETE BARYCENTER METHOD FOR DIRECT OPTIMIZATION ´ n∗, Felipe Miguel Pait∗ Guilherme Scabin Vicinansa∗, Diego Colo ∗ Laborat´ orio de Automa¸ca ˜o e Controle - LAC/PTC, Escola Polit´ecnica da USP Av. Prof. Luciano Gualberto, trav 3, 158 Cidade Universit´ aria, S˜ ao Paulo, BRAZIL Emails: [email protected], [email protected], [email protected] Abstract— The Discrete Barycenter Method is a direct, derivative–free optimization algorithm which uses the weighted average values of a sequence of guesses to search for minimum points of a function. Good guesses receive larger weights while bad ones are penalized. The method converges with a superlinear order. Simulations show that its performance is at least comparable to that of existing methods, and examples of applications to controller design are presented. Finally, conclusions and suggestions of future work are presented. Keywords— Direct Optimization, Barycenter. Resumo— O M´ etodo do Baricentro Discreto ´ e um algoritmo de otimiza¸ca ˜o sem derivadas que utiliza a m´ edia ponderada de estimativas para busca de m´ınimos de uma certa fun¸c˜ ao custo. Boas estimativas recebem pesos maiores, enquanto que as piores s˜ ao penalizadas. O m´ etodo do baricentro discreto converge com ordem superlinear. S˜ ao realizadas simula¸c˜ oes que mostram que o desempenho deste m´ etodo ´ e, no m´ınimo, compar´ avel ao dos existentes atualmente, e exemplos de aplica¸c˜ ao em projetos de controladores s˜ ao apresentados. Ao final, conclus˜ oes e sugest˜ oes de trabalhos futuros s˜ ao apresentadas. Palavras-chave— 1 Otimiza¸ca ˜o Direta, Baricentro. randomly choose a curiosity that is added to the estimate, and provides the dynamics of the search. Each new estimate is considered in the calculation of next barycenter, and the goal of the algorithm is for the sequence of barycenters to converge to a minimum. This paper is organized as follows: in Section 2, we define the method and its parameters, and show an equivalent expression that is convenient for analysis. In Section 3, we prove convergence and superlinear convergence order. In Section 4, we compare the barycenter method with the Compass Search Method (Kolda et al., 2003) and Particle Swarm Optimization (Kennedy and Eberhart, 1995), a meta–heuristic method that has become popular in recent years. In Section 5, we design a proportional and a proportional plus derivative (PD) controllers using the Discrete Barycenter Method. Conclusions about the method’s performance and directions for future work are presented in section 6. Throughout this paper || · || is the usual Euclidian norm in Rm . Also R+ and R∗+ denote respectively the set of positive and strictly positive real numbers. Introduction Although the success of derivative based optimization methods, such as Newton, quasi–Newton, and steepest descent is fully demonstrated in many applications, there are cases in which they cannot be applied, for instance when the derivatives are unavailable or cannot be calculated reliably (Conn et al., 2008). To solve those problems, one may use direct search methods, also called derivative– free methods. Direct search methods arose in the 1950’s, but their golden age was the 1960’s (Lewis et al., 1998), when several successful applications were proposed. Because of the lack of convergence proofs, and the fact that direct search methods were developed heuristically, they suffered neglect from the mathematical optimization community (Kolda et al., 2003). Nonetheless, interest in such methods continued for three main reasons: • Many work well in practice and some are very intuitive. • They normally are easier to implement than derivative based methods. • They often work well when other methods fail. 2 The Discrete Barycenter Method In this section, we define the main concepts related to the barycenter method. We present the Discrete Barycenter Method, a new direct search algorithm that uses a weighted mean to search for the minimum point of a nonlinear function and solve unrestricted optimization problems. This weighted mean is found much in the same manner as in the calculation of the center of mass of a set of particles. The main characteristic of the recursive version of the method is to Definition 2.1 (Curiosity) The curiosity of the method is a nonzero vector b ∈ Rm with random components bi ≥ 0, ∀i ∈ {1, ..., m}. Roughly speaking, the higher the curiosity, the greater the search region, that is, for larger curios- 3565 Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 ity, the method is more appropriate for searching for a global minimum, and if the curiosity is small, the method will search locally. 3 Definition 2.2 (Discrete Barycenter Method) Let z(x) be an objective functions from Rm → R+ and µ ∈ R∗+ . The sequences {xn }n ∈ Rm and {ˆ xn }n ∈ Rm are given by: Pn −µ·zk k=0 xk · e x ˆn = P n −µ·zk k=0 e In order to prove convergence in Theorem 2, we require Lemma 1, which provides a recursive form for the Discrete Barycenter Method. Lemma 1 For n ≥ 1, equations (1) and (2) are equivalent to: (1) xn+1 = x ˆn + bn+1 , Convergence x ˆn = x ˆ0 + n X bk · e−µ·zk k=1 (2) Sk (3) where where Sk = k X 1. bn is the curiosity sequence, e−µ·zi ; i=0 2. zk is the value of the objective function z(x) at the point xk , and and if n = 0 3. {ˆ xn }n is the sequence of barycenters. with x0 an initial condition. x ˆ 0 = x0 Proof: By induction, for n = 1 equation (1) reads: We wish that the latter sequence approaches the minimum point. The idea behind the method is that if the value of objective function is small, then the corresponding point receives a large weight; and conversely if the value of the function is large, the weight will be small and the point will contribute little to the barycenter computation. Notice that even if the barycenter approaches the minimum, a high curiosity can take it away from there. So some modifications in the method should be implemented. In the following, we define another parameter that can be used to improve convergence. x ˆ1 = x ˆ0 + b1 · e−µ·z1 S1 e−µ·z1 e−µ·z0 + e−µ·z1 −µ·z0 x ˆ0 · e +x ˆ0 · e−µ·z1 + b1 · e−µ·z1 = S1 x ˆ0 · e−µ·z0 + e−µ·z1 (ˆ x0 + b1 ) = S1 x ˆ0 · e−µ·z0 + x1 · e−µ·z1 = S1 −µ·z0 x0 · e + x1 · e−µ·z1 = S1 P1 xk · e−µ·zk = k=0 S1 =x ˆ0 + b1 · Definition 2.3 (Shape Factor) The shape factor is: γ zˆn−1 maxi∈{1,...,n−1} {ˆ zi } which is a particular case of (1). For n = s + 1 where: 1. zˆk is the value of the objective function z(x) at the point x ˆk ; x ˆs+1 = x ˆ0 + 2. γ ∈ R is such that 0 ≤ γ ≤ 1; =x ˆ0 + If an is a nonzero vector such that an ∈ Rm , the curiosity becomes bn = an zˆn−1 maxi∈{1,...,n−1} {ˆ zi } s+1 X bk · e−µ·zk k=1 s X bk k=1 Sk · e−µ·zk bs+1 · e−µ·zs+1 + Sk Ss+1 bs+1 · e−µ·zs+1 Ss+1 x ˆs · Ss+1 + bs+1 · e−µ·zs+1 = Ss+1 =x ˆs + γ . The purpose of the shape factor is to diminish the amplitude of the search step when the function approaches its minimum, by reducing the curiosity. From the induction hypothesis: Ps x ˆs = 3566 k=0 xk · e−µ·zk , Ss Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 and limn→∞ zn 6= ∞. Hence, ∃w ∈ R∗+ such as 0 < w ≤ e−µ·zn , ∀n ∈ N, and limn→∞ w 6= 0. so x ˆs+1 Ps ( k=0 xk · e−µ·zk ) · SSs+1 bs+1 · e−µ·zs+1 s = + Ss+1 Ss+1 Ps e−µzs+1 −µzk ( k=0 xk e ) 1 + Ss bs+1 · e−µzs+1 = + Ss+1 Ss+1 Ps −µzk x e k Ps −µz k=0 s+1 −µzk + b e s+1 Ss xk e = k=0 + Ss+1 Ss+1 Ps+1 xk · e−µ·zk = k=0 Ss+1 which is a particular case of (1), concluding the proof. 2 Our main theorem shows that the Discrete Barycenter Method converges. Ideally, we would like the limit point to be a minimum of the objective function, but this cannot be guaranteed without extra assumptions. Issues concerning the convergence to the minimum will be reported in a subsequent contribution (Vicinansa et al., 2014, submitted ). k=0 1 1 Pn ≤ −µ·z n (n + 1) · w k=0 e Then: bn+1 · e−µ·zn+1 ||ˆ xn+1 − x ˆn || = Sn+1 M · e−µ·zn+1 M · e−µ·zn+1 ≤ ≤ Sn+1 (n + 1) · w and taking to the limit lim ||ˆ xn+1 − x ˆn || = 0 n→∞ 2 which concludes the proof. Note that the convexity of the function’s domain guarantees that the barycenters sequence remains inside the function’s domain. Moreover, the barycenter sequence cannot diverge for finite values of n (see footnote 1). Now we show that the method converges superlinearly. Theorem 2 (Convergence) Let z(x) be a function from Rm → R+ , µ ∈ R∗+ and {bn }n the sequence of curiosities, such that ||bn || ≤ M, ∀n with M ∈ R∗+ . Let the vector sequences {xn }n ∈ Rm and {ˆ xn }n ∈ Rm be generated by the Discrete Barycenter Method as described in equation (3), both of them belonging to the function’s domain. Then limn→∞ ||ˆ xn+1 − x ˆn || → 0. Definition 3.1 If the sequence {xn }n converges to the point x∗ , the rate of convergence is said to be superlinear if ||x∗ − x ˆn+1 || =0 n→∞ ||x∗ − x ˆn || Proof: There are two cases to consider: 1. The first case occurs when the sequence {zn }n diverges, which means limn→∞ zn = +∞. Using Lemma 1:1 bn+1 · e−µ·zn+1 M · e−µ·zn+1 ≤ ||ˆ xn+1 − x ˆn || = Sn+1 Sn+1 lim Theorem 3 (Order of Convergence ) Let z be a function from Rm → R+ , µ ∈ R∗+ , {bn }n a limited sequence of curiosities, such that ||bn || ≤ M, ∀n with M ∈ R∗+ and ||bn || ≥ r, ∀n with r ∈ R∗+ . Then the convergence order of the barycenter sequence {ˆ xn }n is superlinear. Taking limits: M · e−µ·zn+1 n→∞ Sn+1 | {z } lim ||ˆ xn+1 − x ˆn || ≤ lim n→∞ w ≤ e−µ·zn n X (n + 1) · w ≤ e−µ·zn Proof: By formula (1), ! n+1 X e−µ·zk · bk ||x∗ − x ˆn+1 || = x∗ − +x ˆ 0 Sk k=1 ∞ ∞ X X e−µ·zk e−µ·zk · bk = ≤ M Sk Sk →0 thus lim ||ˆ xn+1 − x ˆn || = 0, n→∞ and the method converges. 2. In the second case, the sequence of weights {e−µ·zn }n is limited2 to the semi-open set (0, 1] k=n+2 k=n+2 It is also true that 1 Note that the sequence {zn }n cannot diverge for a finite value of n, because there are no singularities in the function’s domain. Hence, the function can only diverge if the point xn falls in the domain’s boundary, but in that case xn must diverge as well. Once the curiosity sequence is limited, the sequence {xn }n can only diverge for a infinite value of n. 2 Because the image set of the function z(x) is R∗ , then + the image set of e−µz(x) , with µ positive, is (0,1] ! n −µ·zk X e · b k ||x∗ − x ˆn || = x∗ − +x ˆ0 Sk k=1 ∞ ∞ X e−µ·zk · b X e−µ·zk k = , ≥ r Sk Sk k=n+1 3567 k=n+1 Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 hence (0.935595, 0.875188) with the value for the objective function equal to z(x, y) = 0.00415028. PSO, an evolutionary method based on flocking patterns of birds (Kennedy and Eberhart, 1995), converged to the point (x, y) = (0.846025, 0.714902) with the corresponding value for the objective function equal to z(x, y) = 0.0237817. Figures 1 and 2 show the evolution of both methods in the domain space and in time, respectively. The code of colors is the following: Discrete Barycenter Method sequence in represented in red, the Compass Search Method sequence in blue, and the PSO in green.4 In the case presented, the PSO was faster. On the other hand, its performance is highly variable, and after several simulations, we concluded that it may be comparable to the Barycenter Method5 . P∞ e−µ·zk ||x∗ − x ˆn+1 || M k=n+2 Sk ≤ · P∞ e−µ·zk ||x∗ − x ˆn || r k=n+1 Sk P∞ e−µ·zk M k=n+2 Sk = · −µ·zn+1 e−µ·zk r P∞ +e k=n+2 Sk Sn+1 Taking the limit ||x∗ − x ˆn+1 || n→∞ ||x∗ − x ˆn || lim →0 z ≤ lim n→∞ }| { ∞ X e−µ·zk Sk M k=n+2 · ∞ X e−µ·zk r k=n+2 | Sk {z =0 −µ·zn+1 + e Sn+1 2.0 } →0 1.5 so ∗ lim n→∞ ||x − x ˆn+1 || =0 ||x∗ − x ˆn || 1.0 2 concluding the proof. 0.5 4 Simulation Results and Comparative Analysis -1.0 In this section, we use the Compass Search Method (Kolda et al., 2003), Particle Swarm Optimization (PSO) (Kennedy and Eberhart, 1995) and the Discrete Barycenter Method to minimize the Rosenbrock’s Function, a traditional benchmark for derivative-free optimization methods (Conn et al., 2008). The simulations in this article concern unimodal functions. Simulation results with multimodal function (Ackley’s function) appear in the paper (Vicinansa et al., 2014, submitted ), using an alternative version of the Barycenter Method. The parameters of the Discrete Barycenter Method are µ = 100000 and γ = 0.5 (shape factor) and the parameter of the Compass Search Method is = 10. For the PSO, we used the parameters3 w = 0.5, c1 = 2.0 and c2 = 2.0, with 100 iterations and 30 particles. All methods start with the same initial point, that is (x0 , y0 ) = (0.0, 1.2) The Rosenbrock function is given by 2 2 0.5 -0.5 1.0 1.5 2.0 Figure 1: Evolution of Compass Search (blue), Barycenter Method (red), and PSO (green). 1.0 0.8 0.6 0.4 0.2 20 40 60 80 100 Figure 2: Value of the objective function versus number of iterations. 5 2 z(x, y) = 100 · (x − y) + (1 − x) . Application to Optimal Controller Design In this section, we present applications of the barycenter method to problems of interest in controls: the off–line optimal design of closed loop P (proportional) and PD (proportional plus derivative) controllers for linear time invariant plants. It has a single minimum at (1, 1) where its value is zero. After 100 iterations, the Compass Search Method converged to the point (x, y) = (0.544434, 0.29668) with the corresponding value for the objective function equal to z(x, y) = 0.205895, while the Discrete Barycenter Method converged to the point (x, y) = 4 The graphic that shows the evolution of the objective function value takes in consideration each time that the PSO method find a new candidate to the minimum point. 5 However, further research and statistical analysis must be made to assure that 3 The parameter w is the inertia factor, c is the self 1 confidence factor and c2 is the swarm confidence factor 3568 Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 We also apply the Compass Search and the PSO methods for comparison. 5.1 Proportional Controller For the proportional controller design, we chose a quadratic cost function, as usual in optimal control. 0.8 0.6 0.4 0.2 Definition 5.1 (Cost Function) The cost function in the interval from t = 0 to t = tf is Ztf z= 5 10 15 20 Figure 3: Closed–loop step response with proportional control tuned by PSO (green), Compass Search (blue) and Barycenter Method (red). (q · (y(t) − v(t))2 + r · u(t)2 )dt 0 where: 1. y(t) is the plant’s output, 20 2. v(t) is the reference signal, 15 3. u(t) is the controller output, and q and r are weights of the error between the plant output and the reference signal and the controller effort. 10 5 Considering a linear time–invariant SISO b , with plant with transfer function G(s) = s·(s+a) a = 1 and b = 1 and a unit step function as a reference signal, we used the Discrete Barycenter Method, Compass Search, and PSO to find a proportional controller that minimizes the cost function for 0 ≤ t ≤ 1, with the parameters q and r equal to 1. The PSO method was used with 5 particles and 50 iterations. The method parameters are c1 = 1.0, c2 = 1.0 and w = 0.5. The method found a proportional gain k = 0.0813761 and the corresponding cost function value equal to z(k) = 0.985255. For the case of Compass Search, 50 iterations were performed. The parameter is = 1.0. The method found a proportional gain k = 0.133467 and the corresponding cost function value equal to z(k) = 0.982671. Finally, we used the Discrete Barycenter Method with 50 iterations. The parameters are µ = 10000 and Shape Factor γ = 0.5. The method found a proportional gain k = 0.136357 and the corresponding cost function equal to z(k) = 0.982679. Figure 3 shows the closed–loop unit step response for the proportional controller found by the three methods: 1) PSO in green, 2) Compass Search in blue and 3) Barycenter in red. Figure 4 shows the evolution of the cost with the number of iterations with the same code of colors. For the sake of comparison, in figure 5 the cost function for the proportional controller is presented (it was numerically determined). 0 5 10 15 20 Figure 4: Cost function evolution for PSO (green), Compass Search (blue), and the Barycenter Method (red). The differences in the step responses (particularly the one found by PSO, in green) tend to disappear with longer simulation times. 1.5 1.4 1.3 1.2 1.1 0.2 0.4 0.6 0.8 1.0 Figure 5: Cost function of the closed–loop system with proportional control As shown in Figure 4, the PSO and the Barycenter Method converged to a final value after 3569 Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 a small number of steps. The bigger the number of particles in PSO, the faster the convergence (in terms of steps) but higher the time between successive steps. In the case of Compass Search, it was estimated that the execution time was, in average, 1.3 times higher than the Barycenter. Thus, we can say that, as far as the execution time is concerned, the barycenter method might have a similar performance to the compass search method and PSO, if the parameters of the last one are chosen properly. It is important to say that our aim is to present a new simple and intuitive method that works at least as well as the old ones. value z(kp , kd ) = 8.56515. Figure 7 shows the evolution of the gains in the cost function domain for 1) PSO in green, 2) Compass Search in blue and 3) Barycenter in red. Figure 8 shows the cost evolution as a function of time, and Figure-9 the closed loop unit step response for the closed loop system (the colors are the same in the three figures). 5 4 3 5.2 PD controller We also used the Discrete Baricenter Method, Compass Search and PSO to design a PD (proportional-derivative) controller for an unstab with a = −1 and b = 1. ble plant G(s) = s·(s+a) The cost function, which has the same expression as in the last section, is now defined for 0 ≤ t ≤ 10 and with weights q = 10.0 and r = 1.0 (with an increased penalty on the output error term), and again the reference input is unit step. The graph of cost function under the later condition is shown in Figure 6. 2 1 2 3 4 5 Figure 7: Evolution of the method, the x axis represents kd and the y axis represents kp 100 80 60 40 20 0 5 10 15 20 25 30 Figure 8: Evolution of the objective function with the number of iterations Figure 6: Cost function for the PD design 1.0 The PSO method was used with 5 particles and 50 iterations. The method parameters are c1 = 1.0, c2 = 1.0 and w = 0.5. The method found a proportional gain kp = 3.16214 and a derivative gain kd = 3.70651 with the correspondent cost function value equals to z(kp , kd ) = 8.55836. For the case of Compass Search, 50 iterations were performed using the parameter = 2.5. The method found a proportional gain kp = 3.16228, a derivative gain kd = 3.70639, and the corresponding cost function z(kp , kd ) = 8.55836. Finally, we used the Discrete Barycenter Method with 50 iterations, µ = 10 and shape factor γ = 0.5. The method found a proportional gain kp = 2.97849, a derivative gain kd = 3.56687, and cost function 0.8 0.6 0.4 0.2 5 10 15 20 Figure 9: Closed-loop system output with the PD controller designed by the three optimization methods 3570 Anais do XX Congresso Brasileiro de Automática Belo Horizonte, MG, 20 a 24 de Setembro de 2014 The methods worked equally well in minimizing the cost. Finding the PD controller took approximately 10 minutes in a personal computer with CPU AMDTurion X2 Ultra Dual-Core with memory of 3GB running Mathematica, while PSO and Compass Search Methods took approximately 20 and 13 minutes respectively. In this particular case, the Barycenter Method outperformed the others from the point of view of computation time. 6 The authors are also grateful for useful comments by the anonymous reviewer, which were helpful in improving and clarifying the text. References Conn, A. R., Scheinberg, K. and N. Vicente, L. (2008). Introduction to derivative-free optimization, SIAM, Society for Industrial and Applied Mathematics Philadelphia, PA, USA. Conclusion Kennedy, J. and Eberhart, R. (1995). Particle swarm optimization, IEEE Internatinal Conference on Neural Networks 4: 1942–1948. In this article, we presented the Discrete Barycenter Method, a derivative–free approach proposed by the authors for solving optimization problems, especially those that cannot be solved by derivative based methods such as Newton’s and gradient–type. We have shown that the method converges superlinearly, however we could not yet prove that it will converge to the minimum point (local or global), if it exists (see (Vicinansa et al., 2014, submitted ) for a modification in the method that guaratees convergence to the minimum). In order to argue that the method reaches the minimum in practice, we applied it to find the minimum of the Rosenbrock function, a benchmark function for testing nonlinear optimization algorithms. We also compared the Barycenter Method with the Compass Search and PSO (particle swarm) methods. The Barycenter Method resulted in performance at least as good as the other two, and in many cases, better. The method was also applied to designing optimal controllers offline. Based on the simulations, we could conclude that the Barycenter Method was able to find quickly an optimal proportional controller for that cost function with a good precision, after a small number of iterations. For the case of PD controller, the Barycenter Method performed better. Future work will analyze conditions under which the method converges to the actual minimum point, and apply the Barycenter Method to real time optimization. The goal is to find minima of functions whose values for specified parameters can be computed by experiment or simulation, but whose mathematical closed–form expression is not a priori known. One application we have in mind is to Adaptive Control, where heretofore methods such as recursive least squares have been employed. 7 Kolda, T. G., Lewis, R. M. and Torczon, V. (2003). Optimization by direct search: New perspectives on some classical and modern methods, SIAM Review 45: 385–482. Lewis, R. M., Torczon, V. and Trosset, M. W. (1998). Direct search methods: Then and now, Optimia 59: 1–7. Vicinansa, G. S., Col´on, D. and Pait, F. (2014, submitted ). Directional barycenter method for control engineering optimization, 4th International Conference on Engineering Optimization . Acknowledgements The author Guilherme Scabin Vicinansa thanks to FAPESP for the scholarship under grant 2014/00539-0. The author Diego Col´ on also thanks to FAPESP for the grant to participate in this reunion. 3571
© Copyright 2024 ExpyDoc