A Continuous Feedback Optimal Control based on Second-Variations for Problems with Control Constraints
Authors: Joris T. Olympio
Abstract

The paper describes a continuous second-variation algorithm to solve optimal control problems where the control is defined on a closed set. A second-order expansion of a Lagrangian provides linear updates of the control to construct a locally feedback-optimal control of the problem. Since the process involves a backward and a forward stage, which require storing trajectories, a method has been devised to accurately store continuous solutions of ordinary differential equations. Thanks to the continuous approach, the method implicitly adapts the numerical time mesh. The novel method is demonstrated on bang-bang optimal control problems, showing its suitability for automatically identifying the optimal switching points of the control.

1 Introduction

The current paper is concerned with the resolution of optimal control problems with bang-bang structure. Most optimal control solvers use a direct problem formulation[5] that transforms the optimal control problem into a nonlinear program[1][2][3][4]. Indeed, one could argue that direct methods offer the most straightforward formulation for a generic optimal control solver, and they are quite robust. Alternative methods, for instance indirect methods[6], require formulating a boundary value problem, and convergence is usually difficult. Neither direct nor indirect method-based algorithms can be both robust and accurate in computing the optimal control solution. Besides the direct and indirect approaches, one should explore second-order methods, which appear as a very good alternative. In the 1960s, while trying to elegantly solve linear-quadratic problems (LQP), researchers devised second-order methods such as second-variation methods or differential dynamic programming (DDP)[7][8][9][10][11][12].
The classical second-order algorithms for optimal control problems use dynamic programming equations. Referred to as differential dynamic programming (DDP), they provide a simple way of computing an optimal control through a quadratic expansion of the Hamilton-Jacobi-Bellman value function around a nominal trajectory[8]. Over the past 50 years, these methods have also been tested and adapted on non-LQP problems, and in the space community, DDP-based methods seem to be getting a new lease of life. Current computer architectures allow better implementations of the methods thanks to significant increases in memory capacity and processing-unit speed. So far, Whiffen[13][14] has certainly proposed the best improvement of the DDP method, providing a generic tool for high-fidelity low-thrust space trajectory optimisation. The major difficulty in low-thrust optimal control problems is that the control is often of a bang-bang structure, because the control is linear in the dynamics. Some work using DDP then considers simplifications in the dynamics[15], or considers a quadratic cost in the design of low-thrust trajectories to asteroids to improve the convergence rate[16]. To the author's knowledge there is not a lot of supporting theory for the convergence and behaviour of DDP. Second-order methods can also be devised using Lagrangian expansions rather than Bellman's value function. The approach has been introduced by Bullock[17][18] and Bryson, and extended by others for space trajectory optimization[19][20]. Although the basis is different from the derivation of DDP, the end results are quite similar. Both DDP and gradient-based Lagrangian methods compute a linear control update iteratively using the gradient of the Hamiltonian function.
However, the derivations based on the expansion of the Lagrangian rather than the Hamilton-Jacobi-Bellman value function provide a better theoretical framework. The algorithm presented in this paper uses as baseline an expansion of the Lagrangian. The novelty of the approach is however manyfold. This paper proposes an approach for approximating the continuous solution through interpolation, thus improving the backward-forward integrations and increasing the accuracy of the sensitivity matrices and the feedback coefficients. A method for handling dynamical control constraints without using penalization is presented. Penalization of the control or the state for constraints is one of the major drawbacks of current DDP and second-variation methods. A regularisation of the Lagrangian is also presented to improve the convergence rate. In section 2, the main problem is introduced with notations and assumptions. The methodology is developed in section 3. This includes the second-order developments, the generation of an optimal feedback control law, the conditions of optimality, and the treatment of state constraints. Section 4 presents the optimal control update and deals with the particular case of control constraints, which is important for the bang-bang problem class. Section 5 introduces a practical solution method to produce continuous solutions for the problem, with an adaptive implicit mesh. The algorithm of the method is presented in section 6. Eventually, the last section, section 7, illustrates the method and demonstrates that it can find the bang-bang optimal control without requiring any insight into the switching structure.

2 Problem Statement

2.1 Preliminaries

To simplify notation, the following conventions are used. Scalars are denoted in italics, while column vectors, matrices and tensors are in bold.
In addition, column vectors use lowercase and matrices use uppercase. For derivatives,

    F_x = \frac{\partial F}{\partial x}, \qquad F_{xy} = \frac{\partial^2 F}{\partial x \, \partial y}

where $F$ is a general scalar function. For clarity of the expressions, when the points of evaluation are omitted it is assumed that the expression is evaluated around a nominal trajectory $[\bar{x}(t), \bar{u}(t), \bar{p}]$. The dimension of the state $x(t)$ is $n_x$, and the dimension of the control $u(t)$ is $n_u$.

2.2 Formulation

The developments include the case where a scalar constant parameter $p$ is included and needs to be optimised along with the control. The dynamics are described by an ordinary differential equation

    \frac{dx}{dt} = f(x, u, p; t), \quad t \in [t_0, t_f], \quad u(t) \in U, \quad p \in P    (1)

    0 = \varphi(x(t_0), x_0, p_0)    (2)

where $\varphi : \mathbb{R}^{n_x} \times \mathbb{R} \to \mathbb{R}^{n_\varphi}$ describes initial conditions on the state at date $t_0$ and unknown parameter $p_0$. Functional $f$, of state function $x \in C([t_0, t_f], \mathbb{R}^{n_x})$ and unknown constant parameter $p \in P \subset \mathbb{R}$, is assumed at least twice continuously differentiable, $f \in W_2^2(\mathbb{R}^{n_x} \times U \times P)$, with the Sobolev space

    W_p^k(\Omega) = \{ f \in L^2(\Omega) : \forall \alpha \in \mathbb{N}, \, |\alpha| \le k, \, \partial^\alpha f \in L^p(\Omega) \}

Furthermore, the control is assumed a bounded measurable function, $u \in L^2([t_0, t_f], U)$. Let the control region $U \subset \mathbb{R}^{n_u}$ be defined by the box constraints:

    u_i^l \le u_i \le u_i^u, \quad 1 \le i \le n_u    (3)

where $u_i^l \in \mathbb{R}$ and $u_i^u \in \mathbb{R}$ are respectively the lower and upper bounds on the $i$th component of the control. Let terminal constraints be defined as

    \psi(x(t_f), p_f) = 0    (4)

where $\psi : \mathbb{R}^{n_x} \times \mathbb{R} \to \mathbb{R}^{n_\psi}$ is assumed twice continuously differentiable. Scalar $p_f$ is an unknown parameter. With very few modifications, these problem parameters can also be slack variables of possible terminal inequality constraints.
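As a concrete illustration of the formulation of Eqs. (1)-(4), a minimal double-integrator instance can be written down. The dynamics, bounds and helper names below are illustrative assumptions, not the paper's own example:

```python
import numpy as np

# Toy instance of Eqs. (1)-(4): a double integrator with one bounded control.
# State x = (position, velocity); the scalar parameter p is unused here.

def f(x, u, p, t):
    # dx/dt = f(x, u, p; t), Eq. (1)
    return np.array([x[1], u[0]])

u_l, u_u = -1.0, 1.0  # box constraints on the control, Eq. (3)

def psi(x_f, p_f):
    # terminal constraints psi(x(t_f), p_f) = 0, Eq. (4): reach the origin at rest
    return np.array([x_f[0], x_f[1]])

def admissible(u):
    # membership test for the control region U of Eq. (3)
    return bool(np.all((u >= u_l) & (u <= u_u)))
```

For this instance the optimal control is known to be bang-bang, which makes it a convenient sanity check for the class of problems targeted by the method.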
Consider the minimisation of an objective function $J$ written under the Mayer form,

    J = J(x(t_f), p_f) \to \min    (5)

where $J : \mathbb{R}^{n_x} \times \mathbb{R} \to \mathbb{R}$ is a functional of the vector $x(t_f)$ and the parameter $p_f$. It is assumed well defined (Jacobian of maximum rank) and at least twice continuously differentiable. A second-order method is presented to minimise the objective function $J$, Eq. (5), subject to the state constraints Eqs. (2) and (4) and the control constraint Eq. (3), with iterative updates on the control $u(t)$ and the parameters $p$, $p_0$ and $p_f$.

3 Second Order Gradient Method

3.1 Second Order Equations

For this problem, an extended terminal value function is introduced,

    \phi(x, \nu, p_f; t_f) = J(x(t_f), p_f) + \nu^T \psi(x(t_f), p_f) + \psi(x(t_f), p_f)^T C \, \psi(x(t_f), p_f)    (6)

where $\nu \in \mathbb{R}^{n_\psi}$ is a non-zero constant Lagrange multiplier vector for the constraints $\psi$, and $C > 0$ is a square diagonal regularisation matrix, $C \in M_{n_\psi, n_\psi}(\mathbb{R})$. Let the augmented Lagrangian $L \in W_2^2(\mathbb{R}^{n_x} \times U \times P \times \mathbb{R}^2 \times \mathbb{R}^{n_\psi})$ of the constrained optimal control problem be defined by

    L(x, u, p, p_0, p_f, \nu) = \phi(x, \nu, p_f; t_f) + \eta^T \varphi(x_0, p_0; t_0) + \int_{t_0}^{t_f} \lambda^T \left( f(x, u, p; t) - \frac{dx}{dt} \right) dt    (7)

It accounts for the initial conditions Eq. (2), the terminal constraints Eq. (4) and the objective function Eq. (5). The costate vector $\lambda$ and the Lagrange multiplier $\eta$ have been introduced for the dynamics and the initial constraints Eq. (2), respectively. While most second-order methods published to date only use penalisation to reach constraint satisfaction, the presented method introduces Lagrange multipliers to achieve quicker and better satisfaction of the constraints. As part of a min-max problem, $L$ should be minimised with respect to $u$ and $p$, while maximised with respect to $\nu$.
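For concreteness, the extended terminal value function of Eq. (6) is a direct computation; the function and argument names below are illustrative, not from the paper:

```python
import numpy as np

def phi_terminal(J_val, psi_val, nu, C):
    """Eq. (6): extended terminal value function
    phi = J + nu^T psi + psi^T C psi, with C diagonal and positive."""
    return J_val + nu @ psi_val + psi_val @ (C @ psi_val)
```

The quadratic term $\psi^T C \psi$ is the augmented-Lagrangian penalty; the linear term $\nu^T \psi$ is what distinguishes this merit function from a pure penalty approach.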
Indeed, the constrained maximisation supposes the existence of a min-max and max-min problem, or also the existence of a saddle point. The regularisation matrix $C$ of the augmented Lagrangian (Eq. (7)) is thus used to regularise the Lagrangian and find a saddle point of $L$ by reducing any duality gap. Gill[21], Hestenes[22], Powell[23] and others have demonstrated that solving the extended function of the unconstrained problem Eq. (6) is equivalent to solving the constrained problem Eqs. (4-5). Let the Hamiltonian $H \in W_2^2(\mathbb{R}^{n_x} \times \mathbb{R}^{n_x} \times U \times \mathbb{R})$ be defined as

    H(x, \lambda, u, p; t) = \lambda(t)^T f(x, u, p; t)    (8)

since the objective function $J$ (Eq. (5)) does not include Lagrangian terms (integral terms). Looking for control updates $\delta u$, the unknowns of the problem are $\delta x$, $dp$ and $d\nu$, such that a control update can be expressed in the form

    \delta u = \alpha + \beta \, \delta x + \gamma \, dp + \omega \, d\nu    (9)

    u = \bar{u} + \varepsilon_u \, \delta u    (10)

with $\alpha \in \mathbb{R}^{n_u}$, $\beta \in M_{n_u, n_x}(\mathbb{R})$, $\gamma \in \mathbb{R}^{n_u}$ and $\omega \in M_{n_u, n_\psi}(\mathbb{R})$. $\bar{u}$ is the current nominal control and $\varepsilon_u \in [0, 1]$ is an update factor. This control update can be seen as a feedback control law on the state $x$. Since $J$, $f$, $\varphi$ and $\psi$ are assumed at least twice continuously differentiable, the development of the Lagrangian $L$ to second order, around the nominal trajectory $\bar{x}$, control $\bar{u}$ and nominal parameter values, yields

    dL(\bar{x}, \bar{u}, p, p_0, p_f, \nu) = d\phi + d\eta^T \varphi - \lambda_f^T \delta x_f + \lambda_0^T \delta x_0 - \delta\lambda_f^T \delta x_f + \delta\lambda_0^T \delta x_0
        + \int_{t_0}^{t_f} \left[ \left( \frac{d(\lambda + \delta\lambda)}{dt} \right)^T \delta x - \delta\lambda^T \frac{dx}{dt} \right] dt
        + \int_{t_0}^{t_f} dH \, dt + o(\|\delta x\|^2, \|\delta\lambda\|^2, dp^2)    (11)

with

    dH = H_x \delta x + H_u \delta u + H_p \, dp + \delta\lambda^T f
        + \tfrac{1}{2} \delta x^T H_{xx} \delta x + \tfrac{1}{2} \delta u^T H_{uu} \delta u + \tfrac{1}{2} H_{pp} \, dp^2
        + \delta u^T H_{ux} \delta x + \delta x^T H_{xp} \, dp + \delta u^T H_{up} \, dp
        + \delta\lambda^T f_x \delta x + \delta\lambda^T f_u \delta u + \delta\lambda^T f_p \, dp    (12)

The residual term $o(\|\delta x\|^2, \|\delta\lambda\|^2, dp^2)$ describes the higher-order terms of the development, and

    o(\|\delta x\|^2, \|\delta\lambda\|^2, dp^2) = \Delta L - dL(x, u, p, p_0, p_f, \nu)    (13)

    \Delta L = L(\bar{x}, \bar{u}, \bar{p}, \bar{p}_0, \bar{p}_f, \bar{\nu}) - L(x, u, p, p_0, p_f, \nu)    (14)

$\Delta L$ is a measure of the improvement owing to the updated control law (Eq. (10)). The second term of the right-hand side must thus be of the same order of magnitude as $\Delta L$. For problems linear or quadratic in $x$, $u$ and $p$, the residual term is zero.

Lemma 1 (Riccati Transformation). The costate vector is related to the other variables by the transformation

    \delta\lambda = R \, \delta x + K \, d\nu + T \, dp    (15)

with $R \in M_{n_x, n_x}(\mathbb{R})$, $K \in M_{n_x, n_\psi}(\mathbb{R})$ and $T \in M_{n_x, 1}(\mathbb{R})$.

Proof. The state transition matrix $\Phi(t_f, t)$ for the dynamics $f$ yields

    x(t_f) = \Phi_{11}(t_f, t) \, x(t) + \Phi_{12}(t_f, t) \, \lambda(t) + \Phi_{13}(t_f, t) \, p    (16)

    \lambda(t) = \Phi_{21}(t, t_f) \, x(t_f) + \Phi_{22}(t, t_f) \, \lambda(t_f) + \Phi_{23}(t, t_f) \, p    (17)

    p = \Phi_{31}(t, t_f) \, x(t_f) + \Phi_{32}(t, t_f) \, \lambda(t_f) + \Phi_{33}(t, t_f) \, p    (18)

The transversality conditions for the terminal constraints provide the value of $\lambda(t_f)$. The transition matrix is invertible.
Combining equations (16) and (17), and considering small variations of the variables $x$, $p$ and $\nu$, yields:

    \delta\lambda(t) = (I - \Phi_{21}(t, t_f)\Phi_{12}(t_f, t))^{-1} \Phi_{21}(t, t_f)\Phi_{11}(t_f, t) \, \delta x(t)
        + (I - \Phi_{21}(t, t_f)\Phi_{12}(t_f, t))^{-1} (\Phi_{21}(t, t_f)\Phi_{13}(t_f, t) + \Phi_{23}(t, t_f)) \, dp
        + (I - \Phi_{21}(t, t_f)\Phi_{12}(t_f, t))^{-1} \Phi_{22}(t, t_f) \, \psi_{x_f}^T \, d\nu

Assumption 1. Consider a nominal trajectory $\bar{x}$ and a nominal control $\bar{u}$, possibly not optimal. The solutions of Eqs. (25, 26, 27) are bounded under the following assumptions[24][25][26]:

• Legendre-Clebsch condition $H_{uu} > 0$. This discards any problems where the control appears linearly. A procedure that guarantees this condition in the method is presented hereafter.
• $H_{xx} - H_{xu} H_{uu}^{-1} H_{ux}$ is positive semidefinite.
• $R(t_f)$ is positive semidefinite.

The first assumption is a necessary condition to ensure that no singular arcs are encountered. However, this has no influence on the local optimality of the final solution.

Definition 1. Considering a quadratic model $(Q, R)$, the Trust-Region problem consists in minimising the quadratic expansion for the solution to be within a given ball $B(x, r)$,

    \min_p \; f + p^T R + \tfrac{1}{2} p^T Q p \quad \text{s.t.} \quad \|p\| \le \Delta

where $p = \delta x$ and $\Delta$ is called the Trust-Region radius. Furthermore, the solution $p^*$ is the solution of the linear problem $(Q + \lambda I) p = -R$ for some $\lambda \ge 0$ such that $Q + \lambda I$ is positive definite.

Applying the Trust-Region algorithm[27][28][29] then results in the computation of a shift matrix $S_{uu} = \lambda I$ such that the Hessian $H_{uu} \leftarrow H_{uu} + \lambda I$ is positive definite. A counterpart of applying the Trust-Region method to correct the Hessian is a limitation of the control update amplitude $\delta u$ in accordance with the selection of the Trust-Region radius $\Delta$, which eventually has an influence on the result of Eq. (14).
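The Hessian shift of Definition 1 amounts to moving the smallest eigenvalue of $H_{uu}$ above zero. A minimal eigenvalue-based sketch (the function name and the `margin` safeguard are assumptions for illustration):

```python
import numpy as np

def trust_region_shift(Huu, margin=1e-8):
    """Return lam >= 0 such that Huu + lam * I is positive definite,
    mimicking the shift matrix S_uu = lam * I of Definition 1."""
    lam_min = np.linalg.eigvalsh(Huu).min()  # smallest eigenvalue (symmetric case)
    return max(0.0, margin - lam_min)

# Example: an indefinite Hessian gets shifted, a definite one is left alone.
H_bad = np.array([[1.0, 0.0], [0.0, -2.0]])
lam = trust_region_shift(H_bad)
H_shifted = H_bad + lam * np.eye(2)
```

In a full Trust-Region implementation, $\lambda$ would instead be chosen so that the step satisfies $\|p\| \le \Delta$; the eigenvalue shift above is only the convexification part of that computation.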
Transformation of Eq. (15) allows computing the feedback coefficients of Eq. (9). Assuming $H_{uu} > 0$, this yields

    \alpha = -H_{uu}^{-1} H_u    (19)

    \beta = -H_{uu}^{-1} (H_{ux} + f_u^T R)    (20)

    \gamma = -H_{uu}^{-1} (H_{up} + f_u^T T)    (21)

    \omega = -H_{uu}^{-1} f_u^T K    (22)

And, using the conditions of optimality, ordinary differential equations for $R$, $K$, $T$ and the costate vector $\lambda$ are obtained. Thus,

    \frac{d\lambda^T}{dt} = -H_x + H_u H_{uu}^{-1} (H_{ux} + f_u^T R)    (23)

    \frac{dx}{dt} = H_\lambda^T = f    (24)

    -\frac{dR}{dt} = H_{xx} + R^T f_x + f_x^T R + \beta^T (H_{ux} + f_u^T R)    (25)

    -\frac{dK}{dt} = f_x^T K + (H_{ux} + f_u^T R)^T \omega    (26)

    -\frac{dT}{dt} = H_{xp} + f_x^T T + (H_{ux} + f_u^T R)^T \gamma + R^T f_p    (27)

These ODEs allow computing a bounded update $\delta u$ of the current nominal control $\bar{u}$ through Eq. (9). If the truncation error defined by Eq. (14) is limited, this control update should minimise the Lagrangian of the problem and, to some extent, the terminal constraint violation and/or the terminal cost $J$. Additionally, to decrease the constraint violation, an update of the Lagrange multiplier $\nu$ should be provided, and similarly an update for the parameter $p$ should be produced to minimise the Lagrangian.

3.2 Parameter and Lagrangian Vector Updates

Proposition 1. The following transformations are considered

    d\psi = K^T \delta x + Q \, d\nu + V \, dp    (28)

    d\tau_p = T^T \delta x + V^T d\nu + W \, dp    (29)

This can be demonstrated by computing the ODEs and identifying the similarities with the previous ODEs.
This yields additional ODEs:

    -\frac{dV}{dt} = K^T (f_u \gamma + f_p)    (30)

    -\frac{dQ}{dt} = K^T f_u \omega    (31)

    -\frac{dW}{dt} = H_{pp} + \gamma^T H_{up} + T^T (f_u \gamma + f_p) + f_p^T T    (32)

Eventually, assuming $W(t_0) < 0$, the problem parameter update is

    dp = -W(t_0)^{-1} \phi_{\nu p}    (33)

Also, assuming $Q(t_0)^{-1}$ exists, the Lagrange multipliers are updated with the linear update

    d\nu = \varepsilon_\nu Q(t_0)^{-1} (\delta\phi - V(t_0) \, dp)    (34)

If $Q(t_0) < 0$, $d\nu$ defines a direction for the constraint reduction, and $\varepsilon_\nu \in [0, 1]$ is thus determined to minimise the constraints $\psi$. The ODE system (Eqs. (30-32)) is integrated backwards from terminal conditions depending on the terminal constraints $\psi$ and the terminal cost $J$, while the state dynamics (Eq. (1)) are integrated forward. This results in numerically decoupled schemes, which yields numerical robustness.

3.3 Terminal Conditions

The formulation allows the handling of various types of constraints without changing the backward-integrated ODEs, since the treatment of constraints has an influence only on the terminal conditions. The following terminal conditions are used to integrate backwards the ODEs (Eqs. (25), (26), (27), (30), (31), (32)):

    R(t_f) = J_{xx}(t_f) + \phi_{xx}(t_f)    (35)

    K(t_f) = \phi_{\nu x}(t_f)    (36)

    Q(t_f) = 0    (37)

    T(t_f) = \phi_{xp}(t_f)    (38)

    V(t_f) = \phi_{\nu p}(t_f)    (39)

    W(t_f) = \phi_{pp}(t_f)    (40)

    \lambda(t_f) = \phi_x(t_f)    (41)

The satisfaction of the constraints $\psi$ depends on the Lagrange multiplier $\nu$.

Lemma 2. Assume $H_{uu}^{-1}$ is positive definite, the Jacobian $\psi_x(x(t_f))$ of the constraints with respect to the terminal state $x_f$ has full rank $n_\psi$, and $Q(t_0)$ is negative definite; then the Lagrange multiplier update $d\nu$ gives the (local) direction of minimisation of the constraints $\psi$.

Proof. See [19].
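At each time node the backward quantities reduce Eqs. (19)-(22) and (33)-(34) to dense linear algebra. A minimal sketch, with assumed array conventions ($f_u$ of size $n_x \times n_u$, $T$ stored as a length-$n_x$ vector, scalar parameter $p$) and assumed function names:

```python
import numpy as np

def feedback_gains(Huu, Hu, Hux, Hup, fu, R, K, T):
    """Eqs. (19)-(22), assuming Huu has already been shifted positive definite."""
    Huu_inv = np.linalg.inv(Huu)
    alpha = -Huu_inv @ Hu                  # Eq. (19)
    beta  = -Huu_inv @ (Hux + fu.T @ R)    # Eq. (20)
    gamma = -Huu_inv @ (Hup + fu.T @ T)    # Eq. (21)
    omega = -Huu_inv @ (fu.T @ K)          # Eq. (22)
    return alpha, beta, gamma, omega

def updates_at_t0(Q0, V0, W0, dpsi, phi_nup, eps_nu=1.0):
    """Eqs. (33)-(34): parameter and Lagrange-multiplier updates at t0,
    assuming W(t0) < 0 and Q(t0) invertible.
    dpsi stands for the residual delta-phi of Eq. (34) (constraint violation)."""
    dp = -phi_nup / W0                                  # Eq. (33), scalar p
    dnu = eps_nu * np.linalg.solve(Q0, dpsi - V0 * dp)  # Eq. (34)
    return dp, dnu
```

In practice one would factorise $H_{uu}$ (e.g. Cholesky after the shift) rather than forming its inverse, but the explicit inverse keeps the correspondence with Eqs. (19)-(22) visible.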
3.4 Parameters Determination

The unknown parameters for the initial conditions and the dynamics do not follow the same update rule as $dp$. Because the equations of the sensitivity matrices, ($R$, $K$, and $Q$) and ($T$, $V$, and $W$), are integrated backwards, setting the initial constraint $\varphi$ does not change the terminal conditions of integration, Eqs. (35, 36, 37, 38, 39, and 40). Assuming the initial constraint $\varphi$ is well defined (maximum rank condition), the update formula for the parameter $p_0$ is

    dp_0 = -\varphi_{pp}^{-1} \varphi_p    (42)

There is no need to update the Lagrange vector $\eta$ introduced in Eq. (7). One may want to include a relaxation factor to control the update $dp_0$. Likewise, for the parameter part of the terminal constraint,

    dp_f = -\psi_{pp}^{-1} \psi_p    (43)

4 Treatment of Control Constraints

4.1 Acceptance of Control Update

Once the backward equations are integrated from their respective terminal conditions, around a nominal trajectory, the control update is computed using Eqs. (9) and (10). Note that according to Eq. (34), $\delta u$ depends on $\varepsilon_\nu$. Both $\varepsilon_\nu$ and $\varepsilon_u$ permit adjusting the control update to yield an improvement of the extended value function, and to ensure that the truncation error of the second-order developments (Eq. (14)) is small enough. If Lagrange multipliers $\nu$ are not used, a linesearch procedure is used to find the $\varepsilon_u$ that minimises the merit function. When $\varepsilon_\nu$ is used, the direction of minimisation is given by a linear combination of $\alpha$ and $d\nu$. One approach is to perform linesearches on $\varepsilon_u$ and $\varepsilon_\nu$ successively, the purpose being to quickly find an improvement of the merit function, regardless of whether or not it is a strict minimiser.
Indeed, as in [9], the validity of the second-order development can be checked using

    \Delta L = \int_{t_0}^{t_f} [ H(\bar{x}, \bar{\lambda}, \bar{u}, p, t) - H(\bar{x}, \bar{\lambda}, u, p, t) ] \, dt    (44)

    \Delta L \approx (\Delta L)_{d\nu = 0}    (45)

Clearly,

    \lim_{\varepsilon_u \to 0} \Delta L = \lim_{\varepsilon_u \to 0} (\Delta L)_{d\nu = 0} = 0    (46)

For $\varepsilon_u$ small enough, $\Delta L$ gets close to zero, thus ensuring the second-order developments can be assumed valid. If, in addition, the value $\Delta L$ of Eq. (44) is negative, then the control update $du$ can be accepted. Under these conditions the method generates iterates of the control $u_i(t)$ that minimise the Lagrangian $L$.

4.2 Bang-Bang Control and Control Constraints

According to the Pontryagin Maximum Principle[6], when the control belongs to a compact set $U$, for the problem class considered, where the control appears linearly in the Hamiltonian, the optimal control has a bang-bang structure. Obviously, this result does not come from the variational equations (that seek $\nabla_u H = 0$), but directly from the maximisation of the Hamiltonian over the compact set. In the problem formulation, Eq. (3), a closed control set is assumed. However, the development of the method, referring to the calculus of variations, assumes the Lagrangian is differentiated on an open set. Constraints on the control can be set such that the control is implicitly limited. One approach is to formulate the control constraint (Eq. (3)) using a barrier function $\rho$ that is incorporated into the cost[13][15]. Thus, when the control violates its bounds at any time, the solution gets penalised and eventually the solver corrects the violation. This approach seems to provide practical results; also, as the control bounds may not be respected in the initial iterations of the solver, the solver can go through a forbidden region of the search space and possibly find solutions with that extra controllability.
This approach, however, does not ensure that the optimal control solution is an admissible control (i.e. one can have $u \notin U$). The approach considered here is to explicitly force the control not to exceed the desired limits. The principle is to identify a continuous function $m : \mathbb{R} \to U$ and to introduce a slack variable $v_i$ such that the box constraint Eq. (3) can be replaced with the explicit equality constraint

    u_i - m(v_i) = 0    (47)

Indeed, one can thus replace $u_i$ accordingly in the developments, and solve the problem for $v_i$ directly instead of $u_i$. Since the compact set $U$ is then not used, the gradient $\nabla_v H$ can be used for the computation of the control updates, and this also provides an indication of the optimality of the control $u$ (Weierstrass condition). Such a mapping $m$ can be, for instance,

    m(v_i) = u_i^l + (u_i^u - u_i^l) \sin^2 v_i

This approach, although numerically more challenging than penalisation, makes sure that the control satisfies the problem description and never violates the box constraints Eq. (3).

4.3 Heuristic for Convergence of Bang-Bang Problems

Some transformations $m : \mathbb{R} \to U$ may, however, introduce triviality or singularity into the problem. For instance, when Lagrange multipliers $\nu$ are not used, for the $i$-th component of the control $u$, some transformations $m$ yield

    \exists t \in [t_0, t_f], \; m(v_i(t)) = 0 \Rightarrow (H_{v_i}(t) = 0, \; \delta v_i(t) = 0)    (48)

No updates are made even though the control solution is not optimal. A diagonal matrix $D > 0$ is thus introduced in the Hamiltonian,

    H(x, \lambda, u, p; t) = \lambda^T f(x, u, p; t) + u^T D u    (49)

This is equivalent to adding a quadratic Lagrange term to the cost, $J = J(x(t_f), u, p_f)$. As a result, the control $u$ is smoothed. To obtain a bang-bang saturated control, $D$ is updated and decreased periodically with the iterations, for the control to end in a neighbourhood of an optimal bang-bang control solution.
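The $\sin^2$ mapping above, and the chain-rule factor needed to differentiate $H$ with respect to $v$ instead of $u$, can be sketched as follows (the helper names are illustrative):

```python
import numpy as np

def m(v, u_l, u_u):
    """Squashing map of section 4.2: m(v) = u_l + (u_u - u_l) * sin(v)^2,
    so u = m(v) always lies in [u_l, u_u]."""
    return u_l + (u_u - u_l) * np.sin(v) ** 2

def dm_dv(v, u_l, u_u):
    # dm/dv = (u_u - u_l) * sin(2 v); by the chain rule, H_v = H_u * dm_dv.
    return (u_u - u_l) * np.sin(2.0 * v)
```

Note that `dm_dv` vanishes whenever the mapped control sits exactly on a bound, so gradient information degenerates at saturation; this degeneracy is related to the stalling behaviour that the $D$-regularisation of section 4.3 is designed to mitigate.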
5 Continuous Solution Approximation

5.1 Problematic

To compute the feedback control coefficients $\alpha$, $\beta$, $\gamma$ and $\omega$, a backwards integration (e.g. Eqs. (25), (26), (27)) is done around a nominal trajectory $\{x, p\}$ that results from a forward integration. It is thus not possible to solve all the ODEs concurrently. It is necessary to store a continuous solution of the ODEs, that is, both the trajectory (e.g. $x$ and $\lambda$) and the sensitivity matrices (e.g. $R$, $K$), to then compute the control update. To respect the continuous developments of the previous section, an efficient and reliable continuous data storage mechanism has to be devised.

5.2 Current Discrete Approximation Approaches

Often, the alternative is to work with the best discrete-time approximate solutions and the definition of a time mesh where the solution can be evaluated. Many approaches have been proposed in the literature to produce a discrete control time mesh. The approach of keeping the control constant for fixed durations[15][13] produces good results when the number of nodes has been well selected. Likewise, the control can be kept constant over multiple Runge-Kutta time steps for smooth control[16]. But, since the control is used in the forward integration but not during the backwards integration, using the Runge-Kutta (RK) time steps alone does not seem appropriate for the bang-bang problem considered. Runge-Kutta integration is only time-reversible when used with a constant step size[30]. Others[31][32] propose a refinement procedure based on an integer programming problem, and consider the minimisation of the continuity error and the minimisation of the number of points to add. When the control is of a bang-bang structure, another approach should be followed, because a constant time mesh for the control is likely to miss the switching times of the optimal control.
The usual approaches increase the number of points close to the discontinuity. This has the major drawback of increasing the number of data points to store. The choice of a discrete, or multi-stage, formulation could be argued in regard to potential applications, since it is always simpler to implement a piecewise control. From a theoretical point of view, however, it is necessary to propose a scheme for limiting the approximation made on the control. Properties of the continuous control solution (e.g. switching structure, singular arcs) can eventually be used to produce a good discrete control solution.

5.3 Continuous Trajectory Approximation

The trajectory and the control are initially sampled at $N$ given node points that can be described by a grid $t_i \in [t_0, t_f]$, $i = 1..N$. The trajectory and the control are thus exactly defined at the nodes, and an approximation to the continuous solution is constructed between the node points. Consider an explicit adaptive step-size Runge-Kutta integration method. It is the baseline for solving the ODE systems in the current paper. An explicit Runge-Kutta integration method uses the following iterative equations:

    y_{n+1} = y_n + h \sum_{i=1}^{s} b_i k_i    (50)

    k_i = f\left( t_n + c_i h, \; y_n + h \sum_{j=1}^{s} a_{ij} k_j \right)    (51)

where $a_{ij}$, $b_i$ and $c_i$ are constant scalars, and $s$ is the number of stages of the method. Shampine's method for dense output computation in Runge-Kutta integration is used[33]. One can introduce a polynomial $S(x)$ and an error function $e(x)$ such that

    y(x) = S(x) + e(x)    (52)

    \frac{de}{dx} = f(x, S(x)) - \frac{dS(x)}{dx}, \qquad e(x_0) = 0

Shampine[33] thus defines polynomials $P(x)$ of maximum degree $m + 1$ ($m$ being the number of sub-integrations in one subinterval of length $h$),

    \frac{dP(x)}{dx} = \sum_{i=0}^{m} l_{i,m}(x) \, f(x_i, e_i + S(x_i))    (53)

where $l_{i,m}$ are Lagrange basis polynomials of order $m$.
It is demonstrated that there exist an integer $n$ and constants $c_1$ and $c_{2,m}$ such that

    \left\| \frac{d^2 y(x)}{dx^2} - \frac{d^2 S(x)}{dx^2} \right\| \le c_1 h^n    (54)

    \| y(x) - P(x) \| \le c_{2,m} h^{2 + \min(m, n+1)}    (55)

That is, $P$ is a better, higher-order approximation of $y$ than $S$ is. An iterative scheme is used to construct $P(x)$. The procedure starts with a simple Euler integration model for $S(x)$ with $m = 1$, and then increases the order $m$ by evaluating successively $S(x)$ and $P(x)$. The construction of $S$ and $P$ is dependent on the extrapolation method (e.g. Runge-Kutta method), and $P$ is only available at the end of the integration over the interval of length $h$. For a fifth-order explicit RK method,

    \forall x \in [x_n, x_{n+1}], \quad y(\bar{x}) = r_1 + \bar{x} \left( r_2 + (1 - \bar{x}) \left( r_3 + \bar{x} \left( r_4 + (1 - \bar{x}) r_5 \right) \right) \right)    (56)

where

    r_1 = y_n    (57)

    r_2 = h \sum_{i=1}^{s} b_i k_i    (58)

    r_3 = h k_1 - (y_{n+1} - y_n)    (59)

    r_4 = -h (k_7 + k_1)    (60)

    r_5 = h (d_1 k_1 + d_3 k_3 + d_4 k_4 + d_5 k_5 + d_6 k_6 + d_7 k_7)    (61)

and $d_i$ are constants of the method. Globally, the solution for the state trajectory and the sensitivity matrices can be stored as a $C^1$ approximation. For a fifth-order RK method, five coefficients for each element of the ODE system, plus the range of validity of the polynomial, are stored. Consequently, during the forward integration, the trajectory state $x(t)$ is stored continuously using Eq. (56), and this continuous approximation is used during the backward integration. Likewise, during the backward integration, continuous approximations of the costate $\lambda(t)$ and the sensitivity matrices $R(t)$, $K(t)$, $T(t)$, $V(t)$ are stored. Thus, all matrices and vectors involved in the computations follow a high-resolution continuous approximation. Eventually, the backward stage being fully continuous, a continuous control update $\delta u$ is available.
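A generic explicit RK step, Eqs. (50)-(51), and the evaluation of a stored interpolant of the form of Eq. (56), can be sketched as follows. This is a minimal illustration with assumed names; the Butcher tableau is passed in, and `theta` denotes the normalised position within the step:

```python
import numpy as np

def rk_step(f, t, y, h, A, b, c):
    """One explicit Runge-Kutta step, Eqs. (50)-(51)."""
    s = len(b)
    k = np.zeros((s, np.size(y)))
    for i in range(s):
        # k_i = f(t + c_i h, y + h * sum_{j<i} a_ij k_j) (explicit: strictly lower A)
        k[i] = f(t + c[i] * h, y + h * (A[i, :i] @ k[:i]))
    return y + h * (b @ k)  # Eq. (50)

def dense_eval(theta, r):
    """Eq. (56): evaluate the stored interpolant at theta in [0, 1];
    r = (r1, ..., r5) are the five stored coefficients per component."""
    r1, r2, r3, r4, r5 = r
    return r1 + theta * (r2 + (1.0 - theta) * (r3 + theta * (r4 + (1.0 - theta) * r5)))

# Classic RK4 tableau as an example (not the fifth-order pair of the paper)
A = np.array([[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]])
b = np.array([1, 2, 2, 1]) / 6.0
c = np.array([0.0, 0.5, 0.5, 1.0])
y1 = rk_step(lambda t, y: y, 0.0, np.array([1.0]), 0.1, A, b, c)  # approximates exp(0.1)
```

Per Eq. (58), $r_2 = y_{n+1} - y_n$, so `dense_eval(1.0, r)` reproduces the node value $y_{n+1}$ exactly; the interpolant is therefore continuous across steps, which is what allows the backward stage to consume it without re-integration.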
Additionally, to obtain the best trajectory and the most accurate optimal control from the continuous control update $\delta u$, an approach should be devised to construct a continuous approximation of the nominal control $\bar{u}$.

5.4 Continuous Nominal Control Approximation

The continuous control approximation is constructed a posteriori, from the results of the backward integration. In regard to Eq. (10), the continuous control update should be added to the nominal control $\bar{u}$ to provide an updated control $u$ that can be used as the new nominal control for the following iteration during the backward integration. Consider a nominal control $\bar{u}$ during the forward recursion, when integrating the ODE Eq. (1). According to section 5.3, the control update $\delta u$ can be constructed accurately using the RK interpolants of the matrices $R$, $K$ for all $t \in [t_0, t_f]$, and likewise an accurate updated control $u$ can be computed. This online-produced nominal control has however to be stored, such that it can be used again for the backward recursion and future iterations. Practically, this is a difficult problem since neither $\bar{u}$ nor $\delta u$ have a readily available closed form, in general. Consider $t_k$ the RK nodes of the forward integration with the control $\bar{u} + \varepsilon_u \delta u$, with values $u(t_k)$. The problem is to completely determine $u$ such that

    | J(\bar{u} + \varepsilon_u \delta u) - J(u) | \le \eta_1    (62)

    \| \phi(\bar{u} + \varepsilon_u \delta u) - \phi(u) \| \le \eta_2    (63)

where $J$ and $\phi$ are, respectively, the target merit function value and the constraints, computed with the improved control $\bar{u} + \varepsilon_u \delta u$. $\eta_1$ and $\eta_2$ are given small tolerances. For very sensitive problems, the discrepancy between this new control $u$ and the control $\bar{u} + \varepsilon_u \delta u$ can result in a large difference in the merit function value. This difference can possibly be bigger than the expected improvement (Eq. (44)), thus resulting in a failure of the method to converge properly.
Each component of $u$ is thus approximated between the coarse grid nodes by a piecewise-polynomial function, or spline (linear function, cubic spline interpolant). Intermediate nodes at $t_{k+1/2}$ can be tested, and added if this results in a better approximation of $\bar{u} + \varepsilon_u \delta u$. Eventually, $u \to \bar{u}$.

6 Second Order Algorithm and Implementation

6.1 The Algorithm

The algorithm (C-SOGO) is implemented in C++. It uses efficient linear algebra classes with shared-memory parallelisation. A high-level description of the algorithm is the following. Given:

• convergence tolerances on the constraints, $\eta_\psi$, and on the optimality condition, $\eta_H$;
• a trust-region radius for the control, $\Delta_u > 0$;
• a trust-region radius for the Lagrange multipliers, $\Delta_\nu > 0$, and a parameter $\Delta_p > 0$;
• a nominal control $\bar{u}(t)$ for $t \in [t_0, t_f]$;
• a nominal Lagrange multiplier $\bar{\nu} \in \mathbb{R}^{n_\psi}$;
• a regularisation matrix $C \in \mathbb{R}^{n_\psi \times n_\psi}$;
• an initial state $x_0 = x(t_0)$.

Step 0. Initialisation
Step 0.a. Initialise the iteration counter.
Step 0.b. Set $\varepsilon_u^* = 0$ and $\varepsilon_\nu^* = 0$.

Step 1. Nominal trajectory
Step 1.a. Using the nominal control $\bar{u}(t)$, compute the state trajectory $x(t)$.
Step 1.b. Compute the nominal merit function value $\bar{L}$.

Step 2. Check terminal criteria
Compute the terminal constraints $\psi$, Eq. (4). If $\|\psi\| > \eta_\psi$, go to Step 3. If the optimality condition $\max(\Delta L, \|H_u\|_\infty) > \eta_H$, go to Step 3. Otherwise, an optimal solution has been found; stop.

Step 3. Backward sweep
Step 3.a. Compute the terminal conditions with Eqs. (36), (37), (35) and Eqs. (38), (40), (36).
Step 3.b. Compute the shift matrix $S_{uu}$ using the trust-region algorithm (see Definition 1) so that $H_{uu}(t) + S_{uu}(t) > 0$ for all $t \in [t_0, t_f]$. The shift matrix $S_{uu}$ shall ensure the boundedness of the ODE solutions.
Step 3.c. Integrate Eqs. (25), (26) and (31), and compute continuous-time approximations of $\lambda(t)$, $R(t)$, $K(t)$ and $Q(t)$ for all $t$. During the integration, $H_{uu}$ is shifted with $S_{uu}$. As the equations are integrated, in accordance with Sec. 5.3, a mesh and polynomials are constructed to provide a continuous approximation of them.
Step 3.d. Likewise, if the problem includes a parameter, integrate Eqs. (27), (30) and (32), and compute continuous-time approximations of $T(t)$, $V(t)$ and $W(t)$ for all $t$.

Step 4. Computation of the Lagrange multipliers $\nu$
Step 4.a. If $Q(t_0) < 0$, go to Step 4.c.
Step 4.b. Apply the trust-region algorithm to obtain a negative definite $Q(t_0)$.
Step 4.c. Compute the Lagrange multipliers $\nu$ for the constraints using Eq. (34).

Step 5. Forward integration loop
Step 5.a. Integrate the dynamical state equations, Eq. (1), with the control $u + \varepsilon_u \delta u$, where the control update $\delta u$ is computed using Eq. (10). The continuous approximations of the sensitivity matrices $R(t)$, $K(t)$ and $Q(t)$ accurately provide the feedback coefficients $\alpha$, $\beta$ and $\gamma$. To produce the improving control update $\delta u$, algorithm A1 is used to find $\varepsilon_u$ and $\varepsilon_\nu$ that minimise the merit function, Eq. (6).
Step 5.b. Evaluate the constraints $\psi(x; t_f)$, the objective function $J(x; t_f)$, and the merit function value $L(x, \nu; t_f)$.
Step 5.c. If the merit function improves, $L < \bar{L}$, take $\varepsilon_u^* = \varepsilon_u$ and $\varepsilon_\nu^* = \varepsilon_\nu$, and go to Step 6.
Step 5.d. Otherwise, no improvement could be found; the conditions of validity of the second-order development might not be met. Reduce the trust-region radius $\Delta_u$, increase the iteration counter, and go to Step 3.

Step 6. Accept the current control
Step 6.a. Construct a continuous-time approximation of the new nominal control $\bar{u} \leftarrow \bar{u} + \varepsilon_u^* \delta u$.
Step 6.b. Update the Lagrange multipliers, $\nu = \bar{\nu} + \varepsilon_\nu^* d\nu$.
Step 6.c. Update the nominal merit function value $\bar{L}$.
Step 6.d. Update the regularisation matrix $C$.

Step 7. Continue with the next iteration. Increase the iteration counter and go to Step 2.

Algorithm A1 is either a line-search method to find $\varepsilon_u \in [0, 1]$ that minimises $L$ when $\nu$ is not used, or otherwise a simple two-dimensional $[\varepsilon_u, \varepsilon_\nu]$-grid search method that also finds the additional parameter $\varepsilon_\nu \in [0, 1]$. On the current iteration, the line search, or grid search, is stopped when a sufficient decrease is obtained. Convergence is achieved when both the constraints $\psi$ and the optimality norm are satisfied to a given tolerance. A norm of optimality can be given by Eq. (44) and $H_u$.

7 Examples

The initial motivation of this work has been the optimisation of low-thrust space trajectories, which present a bang-bang optimal control structure. Both presented examples have control-affine dynamics with the control defined on a closed set, and thus the optimal control should be bang-bang.

7.1 Double Integrator Problem

The dynamics are

\[ f(x, u; t) = \frac{d}{dt} \begin{pmatrix} p \\ v \end{pmatrix} = \begin{pmatrix} v \\ u \end{pmatrix} \tag{64} \]

where $p$ is the one-dimensional position of the particle and $v$ is its velocity. The control bounds are

\[ -1 \le u \le 1 \tag{65} \]

Owing to the control constraints, the transformation $u = 1 - 2\sin^2 w$ is used, and $w$ is the new, unconstrained control to seek. The terminal constraints are

\[ \phi(x; t_f) = \begin{pmatrix} p(t_f) \\ v(t_f) \end{pmatrix} = 0 \tag{66} \]

and the initial conditions are $p(0) = 1$, $v(0) = 1$. The objective is to minimise the total time,

\[ J(x; t_f) = t_f \to \min \tag{67} \]

The time of flight $t_f$ is thus considered as a parameter of the problem. The integration of the equations is then done through the change of variable $\tau = t / t_f$, so that the integration is performed on $[0, 1]$, with the Jacobian $t_f$ of the change of variable appearing in the dynamics.
That is, one should use

\[ \tilde{f}(x, u, t_f; \tau) = f(x, u; \tau t_f)\, t_f \tag{68} \]

Figure 1: Iterates of a simple control problem. The coarse grid includes 3 node points. Left, continuous control iterates; right, gradient of the Hamiltonian with respect to the continuous control.

Figure 1 shows the iterates of the program to find the solution. The control is approximated with a constrained cubic spline to prevent overshoot at intermediate points (with constrained cubic splines, the slope is imposed rather than the curvature). At the plateaux of the control in Figure 1, the control is thus approximated with straight lines. Using the initial guess $t_f = 3.5$, we found an optimal time $t_f^* = 3.449581$. One can note how the number of points varies. The comparison to the smoothness of the control or the optimality of the solution is not straightforward, however. It is arguable that the control obtained with the method is not strictly bang-bang, in the sense that it can take intermediate values on a set of nonzero measure. However, from a numerical point of view, for a small enough optimality tolerance the error with respect to the true bang-bang solution should be negligible. And, possibly, the solution obtained can be refined at a later stage with the switching times as parameters, since an optimal control structure is then known. The result of the refinement is likely to be negligible if the optimal switching structure has already been identified accurately.

7.2 Simple Orbital Transfer Problem

The problem is that of transferring a spacecraft from one circular orbit to another with a higher radius.
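As a concrete illustration of the change of variable and of the trigonometric control transformation used in the double-integrator example, here is a short C++ sketch. The unconstrained control is written `w` to avoid clashing with the velocity $v$; all names are illustrative, not the paper's code:

```cpp
#include <array>
#include <cmath>

// The unconstrained control w is mapped to u = 1 - 2*sin^2(w) = cos(2w),
// which automatically satisfies the bounds -1 <= u <= 1 of Eq. (65).
double control_transform(double w) {
    return 1.0 - 2.0 * std::sin(w) * std::sin(w);
}

// Scaled dynamics of Eq. (68): with tau = t / tf on [0, 1],
// d/dtau (p, v) = tf * (v, u), i.e. Eq. (64) times the Jacobian tf.
std::array<double, 2> rhs_scaled(const std::array<double, 2>& x,
                                 double w, double tf) {
    const double u = control_transform(w);
    return {tf * x[1], tf * u};
}
```

The transformation saturates smoothly: $w = 0$ gives $u = +1$ and $w = \pi/2$ gives $u = -1$, so a bang-bang arc corresponds to the transformed control sitting at a stationary point of $\cos 2w$.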
The terminal orbit is sought in the same plane as the initial orbit, and thus the dynamics are simplified to be two-dimensional. The spacecraft is propelled with continuous and constant thrust. With the gravitational constant $\mu$, the dynamics are then

\[ f(x, u; t) = \frac{d}{dt} \begin{pmatrix} r \\ v_r \\ v_\theta \\ m \end{pmatrix} = \begin{pmatrix} v_r \\ \dfrac{v_\theta^2}{r} - \dfrac{\mu}{r^2} + \dfrac{F}{m} \sin u \\ -\dfrac{v_r v_\theta}{r} + \dfrac{F}{m} \cos u \\ -\dfrac{F}{V_e} \end{pmatrix} \tag{69} \]

The state elements $r$, $v_r$, $v_\theta$ and $m$ denote respectively the radial position, the radial velocity, the ortho-radial velocity and the mass of the spacecraft. $F$ is the thrust force ($0 \le F \le F_0 = 5\,\mathrm{N}$) and $V_e$ is the exhaust velocity ($V_e = g_0 I_{sp} = 19612\,\mathrm{m/s}$). The controls are thus the steering angle $u$ and the thrust amplitude $F$. The initial orbit is circular with radius $r_0 = 20000$ km. The constraint is to be on a circular orbit of radius $r_f = 42000$ km at the final time. Using the cylindrical coordinates, this yields the constraints

\[ \phi(x; t_f) = \begin{pmatrix} v_r \\ v_\theta - \sqrt{\mu / r} \\ r - r_f \end{pmatrix} = 0 \tag{70} \]

The time of flight is fixed to $t_f = 4$ days, and the initial mass is 1000 kg. The objective is to maximise the final mass,

\[ J(x; t_f) = m(t_f) \to \max \tag{71} \]

The final optimal mass is $m(t_f)^* = 932.15$ kg, and the spacecraft makes about 8 revolutions around the Earth. The optimal control is depicted in Figure 3, and the optimal trajectory in Figure 2. As can be seen in Figure 3, the optimal control is of bang-bang type. The solver is able to find the commutation points accurately, while no insight into the control structure is initially provided. Such a problem usually has many local optima, which can be sorted with respect to the number of revolutions. That is, some solutions may have more, or fewer, than 8 revolutions for a constant time of flight. In general, however, the best control is to thrust at the perigee (where the cost and the terminal constraints are more sensitive to thrust) and coast at the apogee, as depicted in Figure 2.
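The planar dynamics of Eq. (69) translate directly into code. In the C++ sketch below, the state ordering and function name are illustrative, and the Earth value of $\mu$ used in the test is an assumption consistent with the example:

```cpp
#include <array>
#include <cmath>

// State: x = (r, v_r, v_theta, m); controls: steering angle u and
// thrust amplitude F. Planar orbital dynamics of Eq. (69).
std::array<double, 4> orbit_rhs(const std::array<double, 4>& x,
                                double u, double F,
                                double mu, double Ve) {
    const double r = x[0], vr = x[1], vt = x[2], m = x[3];
    return {
        vr,                                                 // dr/dt
        vt * vt / r - mu / (r * r) + (F / m) * std::sin(u), // dv_r/dt
        -vr * vt / r + (F / m) * std::cos(u),               // dv_theta/dt
        -F / Ve                                             // dm/dt
    };
}
```

A quick sanity check of the model: coasting ($F = 0$) on a circular orbit with $v_\theta = \sqrt{\mu/r}$ should leave the radial velocity, the radial acceleration and the mass flow all zero, which is exactly the equilibrium encoded in the constraint vector of Eq. (70).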
Figure 2: Optimal trajectory of the orbital transfer problem (7.96 revolutions).

Figure 3: Control of the orbital transfer problem (normalised control amplitude versus time, in days).

8 Conclusions

The solution of optimal control problems with a continuous backward-forward sweep algorithm based on second-order variations has been presented. A second-order expansion of a Lagrangian provides linear updates of the control to construct a locally feedback optimal control of the problem. The control updates are computed following backward and forward stages. Ordinary differential equations are integrated backward around a state trajectory that has been computed forward in an earlier stage. The method uses an accurate continuous approximation of the ODE solutions to ensure the precision of the control updates. The method was demonstrated on two examples with bang-bang optimal control solutions. It was shown that the solver can accurately find the switching structure of the solution, without any prior insight.

References

[1] I. Michael Ross and Fariba Fahroo. Legendre pseudospectral approximations of optimal control problems. 2003.
[2] A. V. Rao, D. A. Benson, C. L. Darby, M. A. Patterson, C. Francolin, and G. T. Huntington. Algorithm 902: GPOPS, a MATLAB software for solving multiple-phase optimal control problems using the Gauss pseudospectral method. ACM Transactions on Mathematical Software, 37(2):1–39, April 2010. doi:10.1145/1731022.1731032.
[3] V. M. Becerra. Solving complex optimal control problems at no cost with PSOPT. In Proc. IEEE Multi-conference on Systems and Control, pages 1391–1396, September 2010.
[4] J. T. Betts and W. P. Huffman. Sparse optimal control software, SOCS. Technical report, Boeing Information and Support Services, Seattle, Washington, 1997.
[5] J. T. Betts. Survey of numerical methods for trajectory optimization. Journal of Guidance, Control, and Dynamics, 21(2):193–207, 1998. doi:10.2514/2.4231.
[6] L. S. Pontryagin. The Mathematical Theory of Optimal Processes. John Wiley and Sons Inc, 1962.
[7] S. R. McReynolds and A. E. Bryson, Jr. A successive sweep method for solving optimal programming problems. Technical Report AD0459518, Harvard University, 1965.
[8] D. H. Jacobson and D. Q. Mayne. Differential Dynamic Programming. American Elsevier, New York, 1969.
[9] D. Q. Mayne. A second-order gradient method for determining optimal trajectories for nonlinear discrete-time systems. Int. J. Control, 3:85–95, 1966.
[10] S. K. Mitter. Successive approximation methods for the solution of optimal control problems. Automatica, (3):135–149, 1966.
[11] S. B. Gershwin and D. H. Jacobson. A discrete-time differential dynamic programming algorithm with application to optimal orbit transfer. AIAA Journal, 8(9):1616–1626, 1970.
[12] P. Dyer and S. R. McReynolds. Optimization of control systems with discontinuities and terminal constraints. IEEE Transactions on Automatic Control, 14(3), 1969.
[13] G. J. Whiffen and J. A. Sims. Application of a novel optimal control algorithm to low-thrust trajectory optimization. In AAS/AIAA Space Flight Mechanics Meeting, Feb 2001. AAS 01-209.
[14] G. J. Whiffen and J. A. Sims. Application of the SDC optimal control algorithm to low-thrust escape and capture including fourth-body effects. In 2nd International Symposium on Low-Thrust Trajectories, Toulouse, 18-20 June 2002.
[15] G. Lantoine and R. P. Russell. A hybrid differential dynamic programming algorithm for low-thrust optimization. In AIAA/AAS Astrodynamics Specialist Conference and Exhibit, Honolulu, Hawaii, 18-21 August 2008.
[16] C. Colombo, M. Vasile, and G. Radice. Optimal low-thrust trajectories to asteroids through an algorithm based on differential dynamic programming. Celestial Mechanics and Dynamical Astronomy, 105(1):75–112, 2009. doi:10.1007/s10569-009-9224-3.
[17] T. E. Bullock. Computation of optimal controls by a method based on second variations. Technical report, Stanford University, 1966.
[18] T. Bullock and G. Franklin. A second-order feedback method for optimal control computations. IEEE Transactions on Automatic Control, 12(6):666–673, 1967.
[19] J. T. Olympio. Optimisation and Optimal Control Methods for Planet Sequence Design of Low-Thrust Interplanetary Transfer Problems with Gravity-Assists. PhD thesis, Ecole des Mines de Paris, October 2008.
[20] J. T. Olympio. Algorithm for low thrust optimal interplanetary transfers with escape and capture phases. In AIAA/AAS Astrodynamics Specialist Conference and Exhibit, Honolulu, Hawaii, August 2008. 2008-7363.
[21] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. Some theoretical properties of an augmented Lagrangian merit function. Technical Report SOL 86-6R, Stanford University, Sep 1986.
[22] M. R. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4(5):303–320, 1969. doi:10.1007/BF00927673.
[23] M. J. D. Powell. A method for nonlinear constraints in minimization problems, pages 283–298. Academic Press, 1969.
[24] R. E. Kalman. Contributions to the theory of optimal control. Boletin de la Sociedad Matematica Mexicana, 5(102):102–119, 1960.
[25] R. S. Bucy. Global theory of the Riccati equation. Journal of Computer and System Sciences, 1(4), 1967. doi:10.1016/S0022-0000(67)80025-4.
[26] D. H. Jacobson. New conditions for boundedness of the solution of a matrix Riccati differential equation. Journal of Differential Equations, 8(2):258–263, 1970. doi:10.1016/0022-0396(70)90005-7.
[27] J. F. Rodriguez, J. E. Renaud, and L. T. Watson. Trust region augmented Lagrangian methods for sequential response surface approximation and optimization. Journal of Mechanical Design, 15:58–66, 1997. doi:10.1115/1.2826677.
[28] A. R. Conn, N. I. M. Gould, and P. L. Toint. Trust-Region Methods. MPS-SIAM Series on Optimization 1. Society for Industrial and Applied Mathematics, 2000.
[29] T. F. Coleman and A. Liao. An efficient trust region method for unconstrained discrete-time optimal control problems. Computational Optimization and Applications, 4:47–66, 1993.
[30] D. Stoffer. Variable steps for reversible integration methods. Computing, 55(1):1–22, 1995. doi:10.1007/BF02238234.
[31] J. Laurent-Varin, J. F. Bonnans, N. Berend, M. Haddou, and C. Talbot. Interior-point approach to trajectory optimization. Journal of Guidance, Control, and Dynamics, 30(5), 2007. doi:10.2514/1.18196.
[32] J. T. Betts and W. P. Huffman. Mesh refinement in direct transcription methods for optimal control. Optimal Control Applications and Methods, 19(1):1–21, 1998.
[33] Lawrence F. Shampine. Interpolation for variable order Runge-Kutta methods. Computers and Mathematics with Applications, 14(4):255–260, 1987. doi:10.1016/0898-1221(87)90133-7.