Last Update: 02.06.2014

1 State Dependent Riccati Equation Method and H∞ Control

1.1 Introduction

1.1.1 The Linear Quadratic Regulator (LQR)

Consider the system

    \dot{x} = A x + B u

and the performance criterion

    J[u(·)] = ∫_0^∞ [ x^T Q x + u^T R u ] dt,   Q ≥ 0, R > 0.

Problem: Calculate a function u : [0, ∞) → ℝ^m such that J[u] is minimized.

Remarks:
1. LQR can be considered for finite final times.
2. LQR can be considered for time-varying matrices.
3. LQR can be extended in several ways to nonlinear systems (e.g. State Dependent Riccati Equations).
4. LQR assumes full knowledge of the state.

The LQR controller has the following form

    u(t) = -R^{-1} B^T P x(t),

where P ∈ ℝ^{n×n} is given by the symmetric positive semidefinite solution of

    0 = P A + A^T P + Q - P B R^{-1} B^T P.

This equation is called the Riccati equation. It admits a unique stabilizing solution P ≥ 0 iff the pair (A, B) is stabilizable and (Q^{1/2}, A) is detectable.

LQR controller design
1. (A, B) is given by "design" and cannot be modified at this stage.
2. (Q, R) are the controller design parameters. A large Q penalizes transients of x, a large R penalizes usage of the control action u.

1.1.2 The Linear Quadratic Gaussian Regulator (LQG)

In LQR we assumed that the whole state is available for control at all times (see the formula for the control action above). This is unrealistic: at the very least there is always measurement noise. One possible generalization is to look at

    \dot{x} = A x + B u + w
    y = C x + v

where w, v are stochastic processes called process and measurement noise, respectively. For simplicity one assumes these processes to be white noise (i.e. zero-mean, uncorrelated, Gaussian).

Now only y(t) is available for control. It turns out that for linear systems a separation principle holds:
1. First, calculate an estimate x̂(t) of the full state x(t) using the available information.
2. Secondly, apply the LQR controller, using the estimate x̂(t) in place of the true (now unknown) state x(t).

Observer design (Kalman Filter)

The estimate x̂(t) is calculated by integrating in real time the following ODE

    \dot{x̂} = A x̂ + B u + L (y - C x̂)

with the following matrices calculated offline (i.e. beforehand):

    L = P C^T R^{-1},
    0 = A P + P A^T - P C^T R^{-1} C P + Q,   P ≥ 0,

where here Q = E(w w^T) and R = E(v v^T) denote the noise covariances (not the LQR weights). The Riccati equation above has its origin in the minimization of the cost functional

    J[x̂(·)] = ∫_{-∞}^{0} (x̂ - x)^T (x̂ - x) dt.
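As a concrete illustration of the two offline computations above (controller gain and observer gain), here is a minimal MATLAB sketch; it assumes the Control System Toolbox, and the plant matrices and noise covariances are placeholders, not taken from the text:

    % LQG design: LQR gain plus Kalman filter gain, both computed offline.
    A = [0 1; -2 -3];   B = [0; 1];   C = [1 0];   % placeholder plant
    Q  = eye(2);        R  = 1;                    % LQR weights
    Qw = 0.1*eye(2);    Rv = 0.01;                 % covariances E(ww^T), E(vv^T)

    K = lqr(A, B, Q, R);              % solves the control ARE; u = -K*xhat
    L = lqe(A, eye(2), C, Qw, Rv);    % solves the filter ARE; Kalman gain

    % Online, one integrates the observer ODE
    %   d/dt xhat = A*xhat + B*u + L*(y - C*xhat),   with u = -K*xhat.

Note how the two Riccati solutions are computed independently: this is exactly the separation principle at work.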
1.2 State Dependent Riccati Equation Approach

The State-Dependent Riccati Equation (SDRE) strategy provides an effective algorithm for synthesizing nonlinear feedback controls by allowing nonlinearities in the system states, while additionally offering design flexibility through state-dependent weighting matrices. The method entails a factorization (that is, a parameterization) of the nonlinear dynamics into the product of a matrix-valued function that depends on the state and the state vector itself. In doing so, the SDRE algorithm brings the nonlinear system to a non-unique linear structure having state-dependent coefficient (SDC) matrices. The method then minimizes a nonlinear performance index having a quadratic-like structure. An algebraic Riccati equation (ARE) using the SDC matrices is solved on-line to give the suboptimal control law. The coefficients of this equation vary with the given point in state space. The algorithm thus involves solving, at a given point in state space, an algebraic state-dependent Riccati equation, or SDRE. The non-uniqueness of the parameterization creates extra degrees of freedom, which can be used to enhance controller performance.

Problem Formulation

Consider the deterministic, infinite-horizon nonlinear optimal regulation (stabilization) problem, where the system is full-state observable, autonomous, nonlinear in the state, and affine in the input, represented in the form

    \dot{x}(t) = f(x) + B(x) u(t),   x(0) = x_0,   (1)

where x ∈ ℝ^n is the state vector, u ∈ ℝ^m is the input vector, and t ∈ [0, ∞), with C^1 functions f : ℝ^n → ℝ^n and B : ℝ^n → ℝ^{n×m}, and B(x) ≠ 0 for all x. Without loss of generality, x = 0 is assumed to be an equilibrium point: f(0) = 0. In this context, the minimization of the infinite-time performance criterion

    J(x_0, u(·)) = ∫_0^∞ [ x^T(t) Q(x) x(t) + u^T(t) R(x) u(t) ] dt   (2)

is considered, which is non-quadratic in x but quadratic in u. The state and input weighting matrices are assumed state dependent, Q : ℝ^n → ℝ^{n×n} and R : ℝ^n → ℝ^{m×m}. These design parameters satisfy Q(x) ≥ 0 and R(x) > 0 for all x.

Under the specified conditions, a control law

    u(x) = k(x) = -K(x) x,   k(0) = 0,   (3)

where K(·) ∈ C^1(ℝ^n), is then sought that will (approximately) minimize the cost (2) subject to the input-affine nonlinear differential constraint (1) while regulating the system to the origin, such that lim_{t→∞} x(t) = 0.

Extended Linearization is the process of factorizing a nonlinear system into a linear-like structure containing SDC matrices. Under the assumptions f(0) = 0 and f(·) ∈ C^1(ℝ^n), a continuous nonlinear matrix-valued function A(x) always exists such that

    f(x) = A(x) x,   (4)

where A : ℝ^n → ℝ^{n×n} is found by mathematical factorization and is, clearly, non-unique when n > 1. Hence, after extended linearization, the input-affine nonlinear system (1) becomes

    \dot{x}(t) = A(x) x(t) + B(x) u(t),   x(0) = x_0,   (5)

which has a linear structure with SDC matrices A(x), B(x). The application of any linear control synthesis method to the linear-like SDC structure (5), treating A(x) and B(x) as constant matrices, forms an extended linearization control method: broadly, a family of control methods that find a controller u(x) = -K(x) x such that A(x) - B(x) K(x) is pointwise Hurwitz.

The following conditions are required for guaranteeing local asymptotic stability.

Condition 1: A(·), B(·), Q(·) and R(·) are C^1 matrix-valued functions.

Condition 2: The respective pairs {A(x), B(x)} and {A(x), Q^{1/2}(x)} are pointwise stabilizable and detectable SDC parameterizations of the nonlinear system (1) for all x.

Theorem 2 (Mracek & Cloutier, 1998). Under Conditions 1 and 2, consider the nonlinear multivariable system (1), where x ∈ ℝ^n is the state vector and u ∈ ℝ^m is the input vector, with the feedback law

    u(x) = -R^{-1}(x) B^T(x) P(x) x,

where P(x) is the unique, symmetric, positive-definite solution of the algebraic State-Dependent Riccati Equation

    P(x) A(x) + A^T(x) P(x) - P(x) B(x) R^{-1}(x) B^T(x) P(x) + Q(x) = 0.

Then the method produces a closed-loop solution which is locally asymptotically stable.

Remark. Note that global stability has not been established; this is a local result. In general, even if A_cl(x) = A(x) - B(x) K(x) is Hurwitz for all x, this does not imply global stability. One can prove, though, that if A_cl(x) is symmetric and Hurwitz for all x, then global stability holds. The proof is obtained by showing that under these conditions V(x) = x^T x is a Lyapunov function for system (1).
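As a rough illustration of the pointwise design in Theorem 2, consider the following MATLAB sketch; it assumes the Control System Toolbox, and the SDC handles are hypothetical placeholders for one particular factorization:

    % SDRE feedback: re-solve the ARE at the current state, apply u = -K(x)*x.
    Afun = @(x) [0 1; -x(1)^2 -1];   % SDC choice for f(x) = [x2; -x1^3 - x2]
    Bfun = @(x) [0; 1];
    Qfun = @(x) eye(2);              % state-dependent weights (constant here)
    Rfun = @(x) 1;

    x = [1; 0];                                        % current state
    [K, P] = lqr(Afun(x), Bfun(x), Qfun(x), Rfun(x));  % P solves the SDRE at x
    u = -K * x;                                        % u(x) = -R^{-1} B^T P(x) x

In a real implementation these lines run inside the control loop, so the ARE is re-solved at every sampling instant along the trajectory.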
Optimality of the solution (Mracek & Cloutier, 1998). Under Conditions 1 and 2, the SDRE nonlinear feedback solution and its associated state and costate trajectories satisfy the first necessary condition for optimality, ∂H/∂u = 0, of the nonlinear optimal regulator problem.

Example. Steer the following system to x = (d, 0):

    dx_1/dt = x_2
    dx_2/dt = -a sin(x_1) - b x_2 + c u(t)

Writing the regulation problem in the shifted coordinate x_1 - d, one possible SDC parameterization is

    A(x) = [ 0, 1; -a sin(x_1 - d)/(x_1 - d), -b ],   B(x) = [0; 1].

We choose

    Q(x) = [ max(1, (x_1 - d)^2), 0; 0, max(1, x_2^2) ],   R = 1.

The choice of Q(x) ensures larger control actions for large deviations from the equilibrium.

[Figure: simulation comparison, not reproduced.] The blue trajectory is obtained using LQR on the standard linearization of the original system with Q = eye(2), R = 1. The magenta line is the SDRE method described in the example. Note how the state is brought to the equilibrium faster in the SDRE case.

1.3 H∞ Control

Let us introduce the so-called H∞ norm of a transfer function G(s) ∈ C^{p×q}. This is a mapping from the space of matrix transfer functions into the positive real numbers defined by

    ||G(s)||_∞ = max_{ω∈ℝ} ||G(jω)||_2 = max_{ω∈ℝ} σ_1(G(jω)),

where σ_1 denotes the largest singular value of the complex matrix G(jω). The intuition associated to this definition is that the H∞ norm quantifies the maximal amplification that a signal may experience once applied to the system given by the transfer matrix G(s).

Figure 1. H∞ norm

Let us now consider the system depicted below.

Figure 2. H∞ Objects

Here w(t) is a disturbance acting on the system, while u(t) is the control action. Further, z(t) is to be understood as a performance index. Without loss of generality, one assumes that this continuous-time system is given by the matrices and equations below:

    \dot{x} = A x + B_1 w + B_2 u
    z = C_1 x + D_12 u
    y = C_2 x + D_21 w

Figure 3. Underlying H∞ System Equations

Problem: H∞ control is about finding the stabilizing control law u(t) = F(y(t)) (not necessarily memoryless) that stabilizes the system above AND minimizes the effect of the disturbance w(t) on the performance index z. This goal can be achieved by minimizing the H∞ norm of the transfer function T : w → z that is generated once a controller design has been chosen.

Figure 4. H∞ Goal

In other words, one designs a controller that minimizes the effect of the worst possible disturbance. Another related formulation is to have a controller that renders the system dissipative and internally stable, see below.

1.3.1 Linear H∞ Control Design

By analogy with the LQG case, for the design of this controller we expect to have to solve two Riccati equations: one for calculating the optimal action given the system state, and one for generating an estimate of the state. The main difference is that we are now dealing with two objects:
1. the worst possible disturbance, and
2. the best possible reaction to that worst disturbance.

A relation to game theory is seen here: we have two players, one trying to maximize a cost function and one trying to minimize it. The applicable cost function here is

    J[w, u, x_0] = ∫_0^t ( ||z||^2 - γ^2 ||w||^2 ) ds,

subject to the system equations, see Figure 3. In practice one does not seek the optimal controller (i.e. the one that produces the absolute minimal value of ||T||_∞), but the design is made using an iterative procedure that seeks to reduce the norm while still looking at other performance measures.
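In MATLAB, for instance, such a synthesis step is typically delegated to the packaged routine hinfsyn from the Robust Control Toolbox; the following minimal sketch uses an illustrative placeholder plant, not an example from the text:

    % Build the generalized plant P mapping [w; u] -> [z; y], then synthesize.
    A   = [0 1; -1 -1];
    B1  = [0; 1];      B2  = [0; 1];    % disturbance and control inputs
    C1  = [1 0; 0 0];  D12 = [0; 1];    % performance output z = C1*x + D12*u
    C2  = [1 0];       D21 = 1;         % measurement y = C2*x + D21*w

    D = [zeros(2,1) D12; D21 0];
    P = ss(A, [B1 B2], [C1; C2], D);
    [K, CL, gam] = hinfsyn(P, 1, 1);    % 1 measurement, 1 control input
    % gam is the achieved closed-loop norm ||T : w -> z||_inf

The returned gam can then be traded off against other frequency-domain requirements, as described above.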
Indeed, for any sufficiently large given γ > 0 one can calculate a controller achieving ||T||_∞ < γ using the following formulae:

i) Find X ≥ 0 that satisfies the ARE

    0 = X A + A^T X + C_1^T C_1 + X ( γ^{-2} B_1 B_1^T - B_2 B_2^T ) X

and such that A + ( γ^{-2} B_1 B_1^T - B_2 B_2^T ) X is stable.

ii) Find Y ≥ 0 that satisfies the ARE

    0 = A Y + Y A^T + B_1 B_1^T + Y ( γ^{-2} C_1^T C_1 - C_2^T C_2 ) Y

and such that A + Y ( γ^{-2} C_1^T C_1 - C_2^T C_2 ) is stable.

iii) ρ(XY) < γ^2, where ρ denotes the spectral radius.

Then the dynamic H∞ controller has the form

    \dot{x̂} = A x̂ + B_1 w_worst(t) + B_2 u_opt(t) + Z L ( C_2 x̂(t) - y(t) )
    w_worst(t) = γ^{-2} B_1^T X x̂(t)
    u_opt(t) = F x̂(t)

where

    Z = ( I - γ^{-2} Y X )^{-1},   L = -Y C_2^T,   F = -B_2^T X.

Remarks
1. For very small values of γ these equations have no solution, while for very large γ they reduce to the LQG case.
2. The best design is obtained by an iterative procedure reducing γ that keeps an eye on the disturbance rejection properties at all interesting frequencies (and not just the maximum value).

It can be proved that these Riccati equations may have a solution only if the following "structural" assumptions hold.

Figure 5. Linear H∞ Assumptions

These assumptions are rather natural, as they transfer properties such as observability and controllability to this new context.

Remark: H∞ design provides robustness at the cost of a pessimistic control law: it assumes that the worst possible perturbation is acting on the system at all times.

1.3.2 Nonlinear H∞ Control Design

All elements of the linear case are present here, and we expect a solution scheme in which one has an optimal control law given the full system state, and a law that helps us generate optimal estimates of this actually unknown full system state. The difference we do expect, though, is that in place of two Riccati equations, one for optimal control and one for the optimal observer, we will have to deal here with two Hamilton-Jacobi equations. We have met these partial differential equations already, when we treated "Optimal Control and Dynamic Programming". We know that they are hyperbolic, that they are amenable to treatment with numerical methods, and that they have an interesting mathematical theory due to the fact that their solutions may become discontinuous, so that usual concepts like differentiability cannot be applied straightforwardly.

The theory has been developed for nonlinear systems of the form

    \dot{x} = A(x) x + B_1(x) w + B_2(x) u
    z = C_1(x) x + D_12(x) u
    y = C_2(x) x + D_21(x) w

For this system the "observation" PDE has the form shown below.

Figure 6. Observation Hamilton Jacobi Equation

It turns out that the best estimate of the full state is given by the formula

    x̂(p) = arg max_x ( p(x) + V(x) ),

where V(x) is the solution of the optimal control Hamilton-Jacobi PDE. In the corresponding simulation plot (not reproduced), the blue line is the true state, while the magenta line is its estimate via the formula x̂(p) = arg max_x ( p(x) + V(x) ). The control law is then given by

    u*(p) = u*_state( x̂(p) ) = -B_2^T( x̂(p) ) ∇_x V( x̂(p) )^T

and steers the system to equilibrium.

Naturally, for all this to hold we need the system matrix functions to have good structural properties for this PDE to make sense in the first place. So we expect:
1. A(x), C_1(x), C_2(x) globally Lipschitz, smooth, vanishing at 0;
2. B_1(x), B_2(x) globally Lipschitz, bounded, smooth;
3. D_12(x), D_21(x) constant (for simplicity).
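To make the estimate-then-control recipe above concrete, here is a hypothetical one-dimensional MATLAB sketch; it assumes the HJ solution V(x) and the information state p(x) are already available on a grid, and both are placeholders here:

    % Evaluate xhat(p) = argmax_x ( p(x) + V(x) ) and the feedback on a grid.
    xg = linspace(-2, 2, 401);        % state grid
    V  = xg.^2;                       % placeholder for the control HJ solution
    p  = -(xg - 0.5).^2;              % placeholder information state

    [~, idx] = max(p + V);            % maximizer of p(x) + V(x)
    xhat = xg(idx);                   % state estimate

    dVdx = gradient(V, xg);           % numerical gradient of V on the grid
    B2   = @(x) 1;                    % placeholder input map B2(x)
    u    = -B2(xhat)' * dVdx(idx);    % u*(p) = -B2(xhat)^T * grad V(xhat)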
1.4 References

C.P. Mracek & J.R. Cloutier, Control Designs for the Nonlinear Benchmark Problem via the State-Dependent Riccati Equation Method, International Journal of Robust and Nonlinear Control 8 (1998) 401-433.

J.W. Helton & M.R. James, A General Framework for Extending H-Infinity Control to Nonlinear Systems, SIAM, 1998.

J.A. Ball & J.W. Helton, H-Infinity Control for Stable Plants, MCSS 5 (1992) 233-262.

J.A. Ball, J.W. Helton & M. Walker, H-Infinity Control for Nonlinear Systems via Output Feedback, IEEE TAC 38 (1993) 546-559.

M.R. James & J.S. Baras, Robust H-Infinity Output Feedback Control for Nonlinear Systems, IEEE TAC 40 (1995) 1007-1017.