Modeling of Dynamical Systems via Successive Graph Approximations
In this work, we propose a non-parametric technique for online modeling of systems with unknown nonlinear Lipschitz dynamics. The key idea is to successively utilize measurements to approximate the graph of the state-update function using envelopes d…
Authors: Siddharth H. Nair, Monimoy Bujarbaruah, Francesco Borrelli
Modeling of Dynamical Systems via Successiv e Graph A ppr oximations Siddharth H. Nair , Monimoy Bujarbaruah, and Francesco Borrelli Abstract — In this work, we pr opose a non-parametric tech- nique f or online modeling of systems with unknown nonlinear Lipschitz dynamics. The key idea is to successiv ely utilize measurements to approximate the graph of the state-update function using env elopes described by quadratic constraints. The proposed approach is then demonstrated on two control applications: ( i ) computation of tractable bounds for un- modelled dynamics, and ( ii ) computation of positive inv ariant sets. W e further highlight the efficacy of the pr oposed appr oach via a detailed numerical example. I . I N T RO D U C T I O N Characterization of system model and associated uncer - tainties is of paramount importance while dealing with autonomous systems. In recent times, as data-driven decision making and control becomes ubiquitous [1], [2], system identification methods are being integrated with control algo- rithms for control of uncertain dynamical systems. In com- puter science community , data driv en reinforcement learning algorithms [3], [4] hav e been extensi vely utilized for policy and v alue function learning of uncertain systems. In control theory , if the actual model of a system is unkno wn, adaptiv e control [5], [6] strategies have been applied for simultaneous system identification and control. T echniques for system modelling and identification have been traditionally rooted in statistics and data sciences [7], [8]. Statistical models that describe observ ed data, can be classified into parametric [9], non-parametric and semi-parametric [10] models. Parametric models assume a model structure a priori, based on the application and domain expertise of the de- signer . In almost all of classical adapti ve control methods, parametric models are learned from data in terms of point estimates, and asymptotic con ver gence of such estimates are prov en under persistence of excitation [11] conditions. The concept of online model learning and adaptation has been extended to systems under constraints as well, after obtaining a set or a confidence interval containing possible realizations of the system model. Gaussian Mixture Modeling (GMM) [12], [13] has also been used to identify unknown system parameters, via an expectation maximization algorithm. Howe ver , parametric models are restricted only to speci- fied forms of function classes, and so to widen the richness of model estimates, non-parametric models are increasingly being utilized, whereby the model structure is also inferred from data. For non-parametric modeling of systems, Gaus- sian Process (GP) regression [14] has been one of the most widely used tools in control theory literature. GP regression keeps track of a Gaussian distribution ov er infinite The authors are with the MPC Lab, UC Berkeley , USA; E-mails: { siddharth nair, monimoyb, fborrelli } @berkeley.edu. dimensional function spaces, in terms of a mean function and a covariance kernel, which are updated with data. Giv en any system state, GP regression returns the mean function value at that state, along with a confidence interval. Kernel regression methods such as local linear re gression [15], [16] and Nadaraya-W atson estimator [17] are some other non- parametric methods for system identification and control. Estimates obtained using these methods often come with confidence intervals as detailed in [18], instead of sets containing all possible realizations of the system, which is a critical drawback from the perspectiv e of robust control. The focus of this paper is to propose a simple non- parametric approach for modelling the unknown dynamics of a discrete time autonomous system. The proposed approach applies to unknown nonlinear systems with dynamics de- scribed by a state-update function that is globally Lipschitz ov er a bounded domain, with kno wn Lipschitz constant. Instead of identifying the state-update function itself, we identify its graph- the set of all state and corresponding state-update function value pairs. This is done by computing en velopes of the state-update function, which are sets that contain the graph of the state-update function. These en- velopes are b uilt by using historical data of state trajectories and the Lipschitz property of the function. The paper is divided into two parts. In the first part we describe a method to compute the en velope set which contains all possible realizations of the unknown state-update function at any giv en state. The authors in [19], [20] use GP regression modeling to provide probabilistic confidence intervals on the state-update function at an y gi ven state. The key difference is that we approximate a function via a subset of the Euclidean space rather than approximating it directly in a function space. In the second part, we provide two applications of the proposed approach, namely ( i ) obtaining tractable set based outer approximations of the unknown state-update function and ( ii ) computing positiv e in variant sets [21], [22] for the unknown system using the s-procedure [23]. I I . N OTA T I O N k · k denotes the Euclidean norm in R n unless explicitly stated otherwise. An open ball in R n , of radius r and centered at x is denoted as B r ( x ) . Notation O ( · ) is used to describe an expression that decays to 0 as fast as its argument. The Minko wski sum of two sets A and B is given by A ⊕ B = { a + b | a ∈ A, b ∈ B } . W e use ell( c, R ) to denote an ellipse that is centered at point c and has a shape matrix R = R > 0 . I I I . P RO B L E M F O R M U L A T I O N Consider the discrete time autonomous, time in variant system x k +1 = f ( x k ) , (1) where the state-update function f ( · ) : X → X describes the system dynamics and is defined o ver the state space X ⊆ R n . Assumption 1: The function f ( · ) : X → X is continuous and differentiable over a con vex and closed domain X ⊂ R n with k∇ f ( x ) k ≤ L , for all x ∈ X and some L > 0 . Pr oposition 1 ([24]): Let Assumption 1 hold. Then k f ( x ) − f ( y ) k ≤ L k x − y k for all x, y ∈ X , i.e., f ( · ) is L -Lipschitz in the domain X . Now suppose that the function f ( · ) is unknown. The objectiv e of this work is to compute a set containing f ( x ) for any state x in the state space X using trajectory data { x 0 , x 1 , x 2 , . . . } and the Lipschitz property of the unknown function f ( · ) . Assumption 2: The Lipschitz constant L is kno wn. In case the Lipschitz constant L is unknown, it can be estimated using methods such as [25]. Integrating such estimation methods into the proposed work is a subject of future research. Remark 1: The problem of characterizing L − Lipschitz un-modelled dynamics d ( · ) in x k +1 = ¯ f ( x k ) | {z } assumed model + d ( x k ) | {z } un-modelled dynamics can also be cast into a problem of the form (1). In this case, we use the trajectory data { x 0 , x 1 , x 2 , . . . } to construct { x 1 − ¯ f ( x 0 ) , x 2 − ¯ f ( x 1 ) , . . . } which is then used for computing a set containing d ( x ) at x ∈ X . I V . P R O P O S E D A P P R OAC H W e will make use of the follo wing definitions. Definition 1 (Graph): The graph of function f ( · ) : X → X is the set G ( f ) = { ( x, f ( x )) ∈ R n × R n | ∀ x ∈ X } . (2) Definition 2 (En velope): An en velope of function f ( · ) : X → X is a set E f ⊆ R n × R n , with the property G ( f ) ⊆ E f . (3) W e use trajectory data { x 0 , x 1 , x 2 , . . . } of the system dy- namics (1) to construct an en velope of the system dynamics f ( · ) . Observe that the trajectory data can be used to construct tuples ( x k , f ( x k )) = ( x k , x k +1 ) . In particular, at e very time instant N , we hav e access to measurements ( x k , f ( x k )) , for all k = 0 , 1 , . . . , N − 1 . These measurements are utilized to construct env elopes recursively . Our approach for en velope construction is summarised as follo ws: 1) At time N , compute an en velope E ( x N − 1 ) using the tuple ( x N − 1 , f ( x N − 1 )) and the L − Lipschitz property of f ( · ) . 2) Compute a refined env elope E f N by intersecting the en velope E f N − 1 from time N − 1 with the en velope E ( x N − 1 ) computed in step 1), i.e., E f N = E f N − 1 ∩ E ( x N − 1 ) . For 1), the env elope is obtained as the sublevel set of a quadratic function. Afterwards, 2) is obtained by using the set membership approach [26]–[29]. Finally , we use the computed env elope for obtaining a set containing the value of f ( x ) at any x ∈ X , using the notion of a slice of an en velope defined below . Definition 3 (En velope Slice): The slice of an en velope E f ⊆ R n × R n at a gi ven ¯ x ∈ X is the set defined as E f x = ¯ x = { ( x, y ) ∈ E f ⊆ R n × R n | x = ¯ x } . (4) Fig. 1 sho ws a typical realization of the proposed approach along with the associated set definitions which are detailed next. f ( · ) x y ¯ x Fig. 1: Construction of an en velope for a one dimensional system to approximate the graph G ( f ) (black curve) of its state-update function f ( · ) . Tuples ( x, f ( x )) (red points) obtained from trajectory data are used for constructing the en velope (blue set) and its slice (yello w set) at x = ¯ x . A. En velope Construction Inspired by [30], [31], we use quadratic constraints (QCs) as our main tool to approximate the graph of a function. A definition appropriate for our purposes is presented below . Definition 4 (QC Satisfaction): A set X ⊂ R n satisfies the quadratic constraint specified by a symmetric matrix Q if x 1 > Q x 1 ≤ 0 , ∀ x ∈ X . (5) The following proposition uses a QC to characterize a coarse approximation of the graph of an L − Lipschitz function. Pr oposition 2: The graph G ( f ) of an L − Lipschitz func- tion f ( · ) satisfies the QC specified by the matrix Q f L ( x k ) = − L 2 I n 0 n × n L 2 x k 0 n × n I n − f ( x k ) L 2 x > k − f > ( x k ) − L 2 x > k x k + f > ( x k ) f ( x k ) , (6) for any ( x k , f ( x k )) ∈ G ( f ) . Pr oof: Using the definition of the L − Lipschitz property of f ( · ) (from Proposition 1), at any ( x k , f ( x k )) ∈ G ( f ) , we hav e k f ( x ) − f ( x k ) k ≤ L k x − x k k ∀ ( x, f ( x )) ∈ G ( f ) , ⇐ ⇒ ( f ( x ) − f ( x k )) > ( f ( x ) − f ( x k )) ≤ L 2 ( x − x k ) > ( x − x k ) ∀ ( x, f ( x )) ∈ G ( f ) , ⇐ ⇒ x f ( x ) 1 > Q f L ( x k ) x f ( x ) 1 ≤ 0 , ∀ ( x, f ( x )) ∈ G ( f ) . Therefore G ( f ) satisfies the QC specified by Q f L ( x k ) . The follo wing corollary then giv es us the definition of the en velope E ( x k ) . Cor ollary 1: The set defined by E ( x k ) = { ( x, y ) ∈ R n × R n | x y 1 > Q f L ( x k ) x y 1 ≤ 0 } (7) is an env elope for all L − Lipschitz functions that pass through ( x k , f ( x k )) . Pr oof: Let g ( · ) be an y L − Lipschitz function such that g ( x k ) = f ( x k ) . From the definition of Lipschitz property we hav e k g ( x ) − f ( x k ) k ≤ L k x − x k k , ∀ ( x, g ( x )) ∈ G ( g ) , ⇐ ⇒ x g ( x ) 1 > Q f L ( x k ) x g ( x ) 1 ≤ 0 , ∀ ( x, g ( x )) ∈ G ( g ) , ⇐ ⇒ ( x, g ( x )) ∈ E ( x k ) , ∀ ( x, g ( x )) ∈ G ( g ) , ⇐ ⇒ G ( g ) ⊆ E ( x k ) . Remark 2: The proposed formulation can also be ex- tended to accommodate bounded noise in the measurements of x k in (1). Suppose that the measurement model is gi ven by z k = x k + w k , where w k belongs to a compact set W . Then the en velope that is guaranteed to contain G ( f ) is giv en by E ( z k ) ⊕ ( W × W ) where Q f L ( · ) is no w constructed using ( z k , z k +1 ) . B. Successive Graph Appr oximation At time N , the env elope E ( x N − 1 )) constructed in (7) using the tuple ( x N − 1 , f ( x N − 1 )) can now be used to recur - siv ely compute a ne w en velope E f N by refining the en velope E f N − 1 from time N − 1 via set intersection- E f N = E f N − 1 ∩ E ( x N − 1 ) (8) In the following lemma we show that the sets computed in this fashion are indeed env elopes. Lemma 1: For N ≥ 1 , giv en a sequence { x k } N − 1 k =0 ob- tained under the dynamics (1), we have G ( f ) ⊆ E f N = E f N − 1 ∩ E ( x N − 1 ) = N − 1 \ k =0 E ( x k ) . (9) Pr oof: See Appendix. The recursion is initialized with the trivial env elope E f 0 = R n × R n . The procedure is described in Algorithm 1. Algorithm 1: Recursive En velope Refinement Initialization: E f 0 = R n × R n Input: At time N , ( x N − 1 , f ( x N − 1 )) and E f N − 1 Output: Approximation of G ( f ) at time N : E f N 1 Compute Q f L ( x N − 1 ) (from (6)) 2 Compute E ( x N − 1 ) using Q f L ( x N − 1 ) (from (7)) 3 Set E f N = E f N − 1 ∩ E ( x N − 1 ) Note that since the en velope at any time N is computed by intersecting with the en velope at time N − 1 , the y are getting successi vely refined, i.e., E f N ⊆ E f N − 1 ⊆ E f N − 2 · · · ⊆ E f 0 (10) Now we provide a condition under which the shrinking sets generated by recursion (8) stop shrinking i.e., recursion (8) attains a fixed point. Intuitiv ely , we would expect this to happen when the incoming tuples ( x, f ( x )) constructed from trajectory data hav e already been seen previously . The fol- lowing definition formalises the notion of such trajectories. Definition 5 (P eriodic Orbit [32]): A p -periodic orbit of the discrete dynamical system (1) is the set of states obtained under dynamics x k +1 = f ( x k ) with the property that x k = x k + p for some finite p ≥ 1 and for all k ≥ 0 , i.e., O p ( x 0 ) = { x ∈ X | x k +1 = f ( x k ) , x k = x k + p , x = x k , ∀ k ≥ 0 } . (11) Note that the set O p ( ¯ x eq ) = { ¯ x eq } for all p ≥ 1 where ¯ x eq is the fixed point ¯ x eq = f ( ¯ x eq ) of system (1). Associated to each fixed point, one can define the set of states that con verge to it as follo ws. Definition 6 (Domain of Attr action [33]): The domain of attraction of fixed point ¯ x eq is defined as the set D ( ¯ x eq ) = { x ∈ X | x k +1 = f ( x k ) , lim k →∞ x k = ¯ x eq , x = x 0 } . The following proposition uses Definition 5 and Definition 6 to identify sufficient conditions on system trajectories for termination of the recursion (8). Pr oposition 3: Gi ven a system trajectory denoted by the set { x 0 , x 1 , x 2 , . . . } , the recursion (8) has a fixed point if either of the follo wing conditions hold: 1) O p ( x k ) ⊆ { x 0 , x 1 , x 2 , . . . } for some finite p ≥ 1 and some k ≥ 0 . 2) x 0 ∈ D ( ¯ x eq ) for some fixed point ¯ x eq . Pr oof: See Appendix. Next we present how the en velope slice is derived from the constructed env elopes for obtaining a set-valued estimate of f ( x ) at any x ∈ X C. En velope Slice Computation The set of possible values of function f ( ¯ x ) at any ¯ x ∈ X can be obtained using (7) from the function values f ( x k ) collected at k = { 0 , 1 , 2 , . . . , N − 1 } . From only the k -th measurement, we can obtain an estimate of the set of possible values of f ( ¯ x ) , by constructing the slice of en velope E ( x k ) at x = ¯ x , from Definition 3. W e denote this slice with the set S ( x k , ¯ x ) as S ( x k , ¯ x ) = E ( x k ) x = ¯ x , = { y ∈ X | ¯ x y 1 > Q f L ( x k ) ¯ x y 1 ≤ 0 } = { y ∈ X | y 1 > ¯ A ( k , ¯ x ) y 1 ≤ 0 } , (12) where we ha ve denoted ¯ A ( k , ¯ x ) = M Q f L ( x k ) M > − L 2 0 0 0 ( ¯ x − 2 x k ) > ¯ x , for any k = { 0 , 1 , 2 , . . . , N − 1 } , with matrix M = 0 1 0 0 0 1 . Corollary 1 then implies f ( x ) ∈ S ( x k , x ) at any x ∈ X . Pr oposition 4: At any ¯ x ∈ X , S ( x k , ¯ x ) is a closed norm ball of radius L k ¯ x − x k k , centered at f ( x k ) for each k = { 0 , 1 , . . . , N − 1 } . Pr oof: Expanding out (12) gi ves us k f ( ¯ x ) − f ( x k ) k 2 ≤ L 2 k ¯ x − x k k 2 , (13) for each k = { 0 , 1 , . . . , N − 1 } and thus proves the claim. As we successively collect data points ( x k , f ( x k )) for k = { 0 , 1 , 2 , . . . , N − 1 } under dynamics (1), the set of possible values of f ( ¯ x ) at any ¯ x ∈ X is refined as F N ( ¯ x ) = N − 1 \ k =0 S ( x k , ¯ x ) = N − 1 \ k =0 E ( x k ) x = ¯ x = E f N x = ¯ x , (14) with the guarantee f ( ¯ x ) ∈ F N ( ¯ x ) at any giv en time N ≥ 1 . Notice that F N ( ¯ x ) is a slice of env elope E f N at x = ¯ x , as per Definition 3. W e further note that F N ( ¯ x ) in (14) is con vex and compact, as it is an intersection of con vex and compact sets (13). So far we hav e seen that the env elopes generated by Algorithm 1 are getting successiv ely refined (in (10)) and possibly stop improving (as noted in proposition 3). But giv en a trajectory that yields a terminating recursion (8), where in the state space X are the env elope slices “tight”? W e use the notion of the diameter of a compact set ([24]) to quantify “tightness” or size of the env elope slice. In the following theorem, we show that if a trajectory starts in the domain of attraction D ( ¯ x eq ) of a fixed point ¯ x eq of (1), then the error in approximation of G ( f ) by E f N at points arbitrarily close to ¯ x eq (measured by the diameter of the en velope slice F N ( x ) at any x ∈ X ), gets arbitrarily small for large enough N . Theor em 1: Suppose we are giv en a system trajectory denoted by the set { x 0 , x 1 , x 2 , . . . , x N } where x 0 ∈ D ( ¯ x eq ) . Then for states x arbitrarily close to ¯ x eq , the diameter of F N ( x ) is arbitrarily small for large enough N , i.e., ∀ > 0 , ∃ ¯ N ( ) : max y ∈F N ( x ) k y − f ( x ) k = O ( ) , ∀ x ∈ B ( ¯ x eq ) , ∀ N ≥ ¯ N ( ) . Pr oof: See Appendix. V . A P P L I C A T I O N S In this section we demonstrate two applications and cor- responding computationally tractable algorithms that utilize the proposed approach in the paper . A. Ellipsoidal Outer Appr oximation of F N ( ¯ x ) In order to design computationally tractable rob ust opti- mization [34] algorithms for all realizations of f ( x ) at any x ∈ X and N ≥ 1 , one must have a “nice” geometric representation of the en velope slice F N ( ¯ x ) = E f N x = ¯ x , for all N ≥ 1 . W e hereby propose an approach to obtain an ellipsoidal outer approximation to F N ( ¯ x ) for any N ≥ 1 using the s-procedure [35, Section 11.4], having collected measurements at k = 0 , 1 , . . . , N − 1 , Let us parametrize an ellipsoidal outer approximation of F N ( x N ) , which we denote by ell( c ( ¯ x ) , R ( ¯ x ) ) as ell ( c ( ¯ x ) , R ( ¯ x )) = { y ∈ R n | ( y − c ( ¯ x )) > R − 1 ( ¯ x )( y − c ( ¯ x )) ≤ 1 } , where v ector c ( ¯ x ) and matrix R ( ¯ x ) are the decision v ariables, and are functions of ¯ x . W e seek the smallest ellipsoidal set such that F N ( ¯ x ) = N − 1 \ k =0 S k ( x k , ¯ x ) ⊆ ell ( c ( ¯ x ) , R ( ¯ x )) . From s-procedure [36] we know that the abov e holds true, if there exists scalars { τ 0 , τ 1 , . . . , τ N − 1 } ≥ 0 such that R − 1 ( ¯ x ) − R − 1 ( ¯ x ) c ( ¯ x ) − c > ( ¯ x ) R − 1 ( ¯ x ) c > ( ¯ x ) R − 1 ( ¯ x ) c ( ¯ x ) − 1 − N − 1 X k =0 τ k ¯ A s ( k , ¯ x ) 0 . (15) W e reformulate the above feasibility problem (15) as a semi- definite program (SDP) in the appendix. B. P ositive Invariant Set Computation Definition 7 (P ositive In variant Set): A set X I ⊆ X is said to be positiv e inv ariant for the system with dynamics (1) if x ∈ X I ⇒ f ( x ) ∈ X I , i.e. the set X I maps to itself, under the dynamics map f ( · ) : X → X . Let there be an equilibrium point ¯ x eq defined as f ( ¯ x eq ) = ¯ x eq . W e wish to characterize a positiv e in v ariant set X I ⊆ X containing this equilibrium. For the sake of computational E F X I X I ¯ x eq y x Fig. 2: Illustration of in variance using an env elope of system dynamics. X I (orange) is in variant for all f ( · ) with G ( f ) ⊂ E f (blue). tractability , we represent this set as an intersection of n I ellipsoids, centered at ¯ x eq . W e parameterize the inv ariant set for some P j 0 with j = 1 , 2 , . . . n I as follo ws X I = { x ∈ R n | x 1 > P j − P j ¯ x eq − ¯ x > eq P j ¯ x > eq P j ¯ x eq − 1 x 1 ≤ 0 , ∀ j = 1 , 2 , . . . n I } . (16) Observing that the tuple ( x, f ( x )) ∈ G ( f ) ⊆ E f N consists of a point x ∈ X and its image f ( x ) ∈ X under the map f ( · ) , we can use the collected Q f L ( x k ) ’ s for k = { 0 , 1 , . . . , N − 1 } at any time N , to obtain a sufficienc y condition for (16) as detailed in the follo wing proposition. Pr oposition 5: Suppose that we are giv en an approxi- mation of G ( f ) at time N , i.e., E f N = T N − 1 k =0 E ( x k ) , constructed by Algorithm 1. If there exists τ k ≥ 0 , for all k = 0 , 1 , . . . N − 1 and P j 0 , ρ j m > 0 for all j, m = 1 , 2 , . . . , n I , such that the following Bilinear Matrix Inequality (BMI) is feasible n I X j =1 − ρ j m P j 0 ρ j m P j ¯ x eq 0 P m − P m ¯ x eq ¯ x > eq ρ j m P j − ¯ x > eq P m − ¯ x > eq ( ρ j m P j − P m ) ¯ x eq +( ρ j m − 1) − N − 1 X k =0 τ k Q f L ( x k ) 0 , (17) for all m = 1 , 2 , . . . , n I , then the set X I is a positi ve in variant set for the system with dynamics (1). Pr oof: See Appendix. One of many approaches to solving such a BMI (see [37]) is detailed in the appendix. V I . N U M E R I C A L E X A M P L E In this section we demonstrate the approach proposed in Section IV for characterizing the un-modelled dynamics of a pendulum. W e also showcase construction of a positiv e in variant set for this system, utilizing the tools from Sec- tion V -B. A. P endulum Model The continuous time model of the considered pendulum is gi ven by ml 2 ¨ θ + mg l sin θ + ˜ d ( θ , ˙ θ ) = T , (18) where m is the mass, l is the length, θ is the angle the pendulum makes with the vertical axis, ˜ d ( θ , ˙ θ ) is an un- modelled damping force with known Lipschitz constant L d and T is a known e xternal torque. In this work, we simulate the system with the damping force ˜ d ( θ , ˙ θ ) = − L d ˙ θ and characterize state-dependent bounds for the same. W e write the pendulum dynamics (18) in state-space form as ˙ θ ¨ θ = 0 I 0 0 θ ˙ θ + " 0 T ml 2 − g l sin θ − ˜ d ( θ, ˙ θ ) ml 2 # , (19) where x = [ θ ˙ θ ] > is the state of the pendulum. W e consider a torque T that stabilizes the pendulum’ s state when it’ s upright, i.e., when ¯ x eq = [ π 0] > . W e discretize system (19) and write it in the form of (1) as x k +1 = f ( x k ) . W e then simulate the system forward in time with a v ariational integrator for mechanical systems, as in [38]. The simulation parameters are: m = 2kg , l = 2m and L d = 0 . 2N . B. En velope Construction for Damping F or ce The discrete time model x k +1 = f ( x k ) is decomposed as x k +1 = ¯ f ( x k ) | {z } assumed model + d ( x k ) | {z } un-modelled damping , (20) where x k = [ θ k ˙ θ k ] > , d ( · ) is the unkno wn damping in discrete time with Lipschitz constant ˜ L d = L d T S ml 2 and T S = 0 . 005 s is the sampling period. Our experiment is succinctly described belo w: • T rajectories up to a specified time instant N , start- ing from four different initia l conditions x 0 = { [ 5 π 6 0] > , [ 5 π 3 − 0 . 5] > , [ π 6 0] > , [ 5 π 4 − 0 . 2] > } are simulated (solid lines in Fig. 3) and stored. • Realizations of the un-modelled dynamics d ( x k ) are recorded via the measurement model d ( x k ) = x k +1 − ¯ F ( x k ) . • Having recorded the measurements ( x k , d ( x k )) for k = 0 , 1 , . . . , N − 1 along all four trajectories, we construct the estimate D N ( ¯ x ) (defined as in (14)) of d ( ¯ x ) at six different query points ( ? and ◦ in Fig. 3) using Algorithm 1. From T able I, we observe that the range of un-modelled dynamics D N ( ¯ x ) shrinks at all query points ¯ x , as more data is collected. This is a direct consequence of the fact that as shown in (14), D N ( ¯ x ) is obtained with successive inter- section operations upon gathering new measurements. More- ov er , the learned dynamics are more accurate for query points near the fixed point ¯ x eq = [ π , 0] > than for query points far away , as shown in Fig. 3. For example, at query points ¯ x = { [1 . 40 0 . 34] > , [3 . 05 − 0 . 37] > , [3 . 11 0 . 84] > , [5 . 60 0 . 22] > } , we see around 100% decrease in the uncertainty range estimate as N increases from 100 to 4000 . The corresponding percentages for ¯ x = { [4 . 21 0 . 38] > , [2 . 12 − 0 . 45] > } are just around 73% and 34% respecti vely . T ABLE I: Uncertainty range (up to three significant digits). Symbol [ · , · ] denotes an interval Query Point ¯ x D 100 ( ¯ x ) / 10 − 4 D 4000 ( ¯ x ) / 10 − 4 [2 . 12 − 0 . 45] > [ − 0 . 837 , 0 . 759] [ − 0 . 001 , 0 . 759] [3 . 11 0 . 84] > [ − 1 . 22 , 0 . 73] [ − 1 . 04 , − 1 . 04] [1 . 40 0 . 34] > [ − 1 . 58 , 0 . 79] [ − 0 . 43 , − 0 . 43] [3 . 05 − 0 . 37] > [ − 0 . 708 , 0 . 486] [0 . 46 , 0 . 46] [4 . 21 0 . 38] > [ − 2 . 05 , 0 . 74] [ − 0 . 56 , 0 . 16] [5 . 60 0 . 22] > [ − 3 . 73 , 2 . 46] [ − 0 . 28 , − 0 . 28] 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 -1 -0.5 0 0.5 1 Fig. 3: Data collection trajectories (solid lines) and query points ( ? and ◦ ) in state-space. C. Computation of P ositive In variant Set For pendulum dynamics (19) in discrete-time, we use the BMI (17) of Proposition 5 to compute an ellipsoidal positive in variant set. In this specific example, we hav e used all the N = 4000 samples from each of the previously collected four trajectories in Section VI-B. The number of intersecting ellipsoids n I in (17) is set as n I = 2 . Fig. 4 shows the in variant set X I ⊆ X , where X = [0 , 2 π ] × [ − 2 . 5 , 2 . 5] . T o further check numerically that the set X I in Fig. 4 is indeed a positiv e inv ariant set, we run simulations from six initial conditions inside the set. As seen in Fig. 4, all six trajectories stay within X I . V I I . C O N C L U S I O N S W e presented a non-parametric technique for online mod- eling of systems with nonlinear Lipschitz dynamics. The k ey idea is to successively use measurements to approximate the graph of the function using env elopes described by quadratic constraints. Using techniques from conv ex optimization, we also computed a set valued estimate of the range of the unknown function at any giv en point in its domain, and a positiv e inv ariant set around a known equilibrium. W e further 1 2 3 4 5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 Fig. 4: Inv ariant set (solid black line) computed using (17). T rajectories starting inside the set are contained within. highlighted the efficac y of the proposed methodology via a detailed numerical example. A C K N O W L E D G E M E N T S This work was partially funded by Of fice of Nav al Re- search grant ONR-N00014-18-1-2833. R E F E R E N C E S [1] B. Recht, “ A tour of reinforcement learning: The view from contin- uous control, ” Annual Re view of Contr ol, Robotics, and Autonomous Systems , vol. 2, pp. 253–279, 2019. [2] U. Rosolia, X. Zhang, and F . Borrelli, “Data-driven predictive control for autonomous systems, ” Annual Review of Control, Robotics, and Autonomous Systems , vol. 1, pp. 259–286, 2018. [3] D. P . Bertsekas and D. A. Castanon, “ Adaptive aggregation methods for infinite horizon dynamic programming, ” IEEE T ransactions on Automatic Contr ol , v ol. 34, no. 6, pp. 589–598, 1989. [4] C. J. W atkins and P . Dayan, “Q-learning, ” Machine learning , vol. 8, no. 3-4, pp. 279–292, 1992. [5] M. Krstic, I. Kanellakopoulos, and P . V . K okotovic, Nonlinear and adaptive contr ol design . Wile y , 1995. [6] S. Sastry and M. Bodson, Adaptive control: Stability , conver gence and r obustness . Courier Corporation, 2011. [7] J. Friedman, T . Hastie, and R. T ibshirani, The elements of statistical learning . Springer series in statistics New Y ork, 2001, vol. 1, no. 10. [8] T . J. Hastie, “Generalized additiv e models, ” in Statistical models in S . Routledge, 2017, pp. 249–307. [9] T . Hothorn, F . Bretz, and P . W estfall, “Simultaneous inference in general parametric models, ” Biometrical journal , v ol. 50, no. 3, pp. 346–363, 2008. [10] W . K. H ¨ ardle, M. M ¨ uller , S. Sperlich, and A. W erwatz, Nonparametric and semiparametric models . Springer Science & Business Media, 2012. [11] M. Green and J. B. Moore, “Persistence of excitation in linear systems, ” Systems & control letters , vol. 7, no. 5, pp. 351–360, 1986. [12] Z. Ghahramani and S. T . Roweis, “Learning nonlinear dynamical systems using an em algorithm, ” in Advances in neural information pr ocessing systems , 1999, pp. 431–437. [13] N. Kalouptsidis, G. Mileounis, B. Babadi, and V . T arokh, “ Adaptiv e al- gorithms for sparse system identification, ” Signal Processing , vol. 91, no. 8, pp. 1910–1919, 2011. [14] C. E. Rasmussen, “Gaussian processes in machine learning, ” in Summer School on Machine Learning . Springer, 2003, pp. 63–71. [15] J. Fan et al. , “Local linear regression smoothers and their minimax efficiencies, ” The annals of Statistics , vol. 21, no. 1, pp. 196–216, 1993. [16] U. Rosolia and F . Borrelli, “Learning how to autonomously race a car: a predictiv e control approach, ” arXiv preprint , 2019. [17] E. Schuster , S. Y ako witz et al. , “Contributions to the theory of nonparametric regression, with application to system identification, ” The Annals of Statistics , vol. 7, no. 1, pp. 139–149, 1979. [18] T . Armstrong and M. Koles ´ ar , “Simple and honest confidence intervals in nonparametric regression, ” Cowles F oundation Discussion P aper , 2018. [19] F . Berkenkamp, M. T urchetta, A. Schoellig, and A. Krause, “Safe model-based reinforcement learning with stability guarantees, ” in Advances in Neural Information Pr ocessing Systems , 2017, pp. 908– 918. [20] T . K oller , F . Berkenkamp, M. T urchetta, and A. Krause, “Learning- based model predictiv e control for safe exploration, ” in 2018 IEEE Confer ence on Decision and Contr ol (CDC) . IEEE, 2018, pp. 6059– 6066. [21] F . Blanchini, “Set inv ariance in control, ” Automatica , vol. 35, no. 11, pp. 1747–1767, 1999. [22] I. K olmanovsky and E. G. Gilbert, “Theory and computation of dis- turbance in variant sets for discrete-time linear systems, ” Mathematical Pr oblems in Engineering , v ol. 4, no. 4, pp. 317–367, 1998. [23] I. P ´ olik and T . T erlaky , “ A survey of the s-lemma, ” SIAM revie w , vol. 49, no. 3, pp. 371–418, 2007. [24] W . Rudin et al. , Principles of mathematical analysis . McGra w-hill New Y ork, 1964, vol. 3. [25] A. Chakrabarty , D. K. Jha, and Y . W ang, “Data-driv en control policies for partially known systems via kernelized lipschitz learning, ” in 2019 American Contr ol Conference (A CC) . IEEE, 2019, pp. 4192–4197. [26] M. T anaskovic, L. Fagiano, R. Smith, and M. Morari, “ Adaptiv e receding horizon control for constrained MIMO systems, ” Automatica , vol. 50, no. 12, pp. 3019–3029, 2014. [27] M. Bujarbaruah, X. Zhang, and F . Borrelli, “ Adaptive MPC with chance constraints for FIR systems, ” in 2018 Annual American Contr ol Confer ence (ACC) , June 2018, pp. 2312–2317. [28] M. Bujarbaruah, X. Zhang, M. T anaskovic, and F . Borrelli, “ Adaptive MPC under time varying uncertainty: Robust and Stochastic, ” arXiv pr eprint arXiv:1909.13473 , 2019. [29] D. Bertsekas and I. Rhodes, “Recursive state estimation for a set- membership description of uncertainty , ” IEEE T ransactions on A uto- matic Contr ol , vol. 16, no. 2, pp. 117–128, 1971. [30] M. Fazlyab, M. Morari, and G. J. Pappas, “Safety verification and robustness analysis of neural networks via quadratic constraints and semidefinite programming, ” arXiv pr eprint arXiv:1903.01287 , 2019. [31] A. Megretski and A. Rantzer, “System analysis via integral quadratic constraints, ” IEEE T ransactions on Automatic Control , vol. 42, no. 6, pp. 819–830, 1997. [32] Z. Zhou, “Periodic orbits on discrete dynamical systems, ” Computers & Mathematics with Applications , vol. 45, no. 6-9, pp. 1155–1161, 2003. [33] J. M. Ortega, “Stability of difference equations and con vergence of iterativ e processes, ” SIAM Journal on Numerical Analysis , vol. 10, no. 2, pp. 268–282, 1973. [34] A. Ben-T al, L. El Ghaoui, and A. Nemirovski, Robust optimization . Princeton University Press, 2009, vol. 28. [35] G. C. Calafiore and L. El Ghaoui, Optimization models . Cambridge univ ersity press, 2014. [36] S. Boyd and L. V andenberghe, Conve x optimization . New Y ork, NY , USA: Cambridge University Press, 2004. [37] Z. Jarvis-Wloszek, R. Feeley , W eehong T an, Kunpeng Sun, and A. Packard, “Some controls applications of sum of squares pro- gramming, ” in 42nd IEEE International Confer ence on Decision and Contr ol , vol. 5, Dec 2003, pp. 4676–4681. [38] S. H. Nair and R. N. Banav ar , “Discrete optimal control of inter- connected mechanical systems, ” 11th IF AC Symposium on Nonlinear Contr ol Systems (NOLCOS) , 2019. A P P E N D I X A. T ractable Optimization Pr oblems for Section V 1) SDP for Ellipsoidal Outer Appr oximation of F ( x N ) : Follo wing [35, Section 11.4] finding the the minimum trace ellipsoid ell ( c ( x N ) , R ( x N )) that satisfies (15) can be posed as an SDP using Schur complement rule as: min ξ trace ( R ( x N )) s.t. p ( x N ) q ( x N ) − I n q > ( x N ) r ( x N ) c > ( x N ) − I n c ( x N ) − R ( x N ) 0 , τ k ≥ 0 , ∀ k = 0 , 1 , . . . , N − 1 , R ( x N ) 0 , where ξ = { R ( x N ) , c ( x N ) , τ 0 , . . . , τ N − 1 } and we have used the v ariable nomenclature p ( x N ) = − N − 1 X k =0 τ k I n , q ( x N ) = N − 1 X k =0 τ k f ( x k ) , r ( x N ) = − 1 − N − 1 X k =0 τ k − L 2 ( x N ) > x N + 2 L 2 ( x N ) > x k − L 2 ( x k ) > ( x k ) + f > ( x k ) f ( x k ) . 2) Bisection Method for P ositive In variant Set Computa- tion: Note that (17) is linear in P j for a fixed ρ j m and vice- versa. This facilitates using a bisection search on ρ j m until a feasible solution is obtained. For bounded X , feasibility is guaranteed for some ρ such that all ρ j m = ρ because of continuity of (17) as well as its feasibility for ρ = 0 . After iterating over ρ j m , (17) is solv ed as a Linear Matrix Inequality (LMI) min P 1 ,...,P n I ,τ 0 ,...,τ N − 1 P n I j =1 trace ( P j ) s.t. (17) , τ k ≥ 0 , ∀ k = 0 , 1 , . . . , N − 1 . B. Pr oofs Pr oof of Lemma 1: For any ( x, f ( x )) ∈ G ( f ) , we ha ve from the Lipschitz inequality , k f ( x ) − f ( y ) k ≤ L k x − y k , ∀ y ∈ X , and choosing y = x k for k = 0 , 1 , . . . N − 1 in the abov e inequality , in view of Corollary 1 yields, ( x, f ( x )) ∈ E ( x k ) , ∀ k = 0 , 1 , . . . N − 1 , ⇒ ( x, f ( x )) ∈ N − 1 \ k =0 E ( x k ) . Note the fact that f is globally Lipschitz ensures that the intersections are non-empty . Since this was sho wn for an y ( x, f ( x )) ∈ G ( f ) , we can thus conclude that G ( f ) = [ x ∈X ( x, f ( x )) ⊆ N − 1 \ k =0 E ( x k ) . The other equalities follo w from (8). Pr oof of Pr oposition 3: W e first prove the implica- tion for when condition (1) holds. Let k ? be the time at which the system enters the p -periodic orbit, i.e., O p ( x k ? ) ⊆ { x 0 , x 1 , x 2 , . . . } . From Lemma 1 we ha ve E f N = T N − 1 k =0 E ( x k ) at any time N . For N ≥ k ? + p , consider the set E f N +1 = N \ k =0 E ( x k ) = ( k ? − 1 \ k =0 E ( x k )) ∩ ( k ? + p − 1 \ k = k ? E ( x k )) ∩ ( N \ k = k ? + p E ( x k )) , = E f k ? + p ∩ ( N \ k = k ? + p E ( x k )) . From Definition 5, we have that x k ∈ O p ( x k ? ) for all k = k ? + p, . . . , N . Using (9) and the fact that f ( · ) is globally Lipschitz on X , we thus have E f k ? + p ⊆ E ( x k ) , for all k = k ? + p, . . . , N . Combining this implication with the definition of E f N +1 abov e yields E f N +1 = E f k ? + p , ∀ N ≥ k ? + p, and so E f k ? + p is a fixed point for recursion (8). Now we prov e the implication for when condition (2) holds, i.e., lim k →∞ x k = ¯ x eq . Since the sets E f N = T N − 1 k =0 E ( x k ) are non-increasing in the sense of (10), the follo wing limit set is well defined E f ? = lim N →∞ E f N = lim N →∞ N − 1 \ k =0 E ( x k ) , = lim N →∞ ( N − 2 \ k =0 E ( x k )) ∩ E ( x N − 1 ) , = lim N →∞ N − 2 \ k =0 E ( x k ) ∩ E ( lim N →∞ x N − 1 ) . The last equality follows from the property of product of con vergent sequences because all the limits on both sides of the equation are well defined. Computing the limits then giv es us the following equality E f ? = E f ? ∩ E ( ¯ x eq ) . Thus E f ? is a fixed point for recursion (8). Pr oof of Theor em 1: From the definition x 0 in the theo- rem, we hav e that the sequence n x k o ∞ k =0 con verges to the fixed point ¯ x eq of (1). From the definition of the conv ergence of a sequence, we have that for ev ery > 0 , there exists a ¯ N ( ) , such that k x k − ¯ x eq k ≤ , ∀ k ≥ ¯ N ( ) . The con ver gent sequence is a Cauchy sequence satisfying with the same and ¯ N ( ) as abov e. That is, k x k 1 − x k 2 k ≤ 2 , ∀ k 2 , k 1 ≥ ¯ N ( ) . (21) Consider a subsequent queried point ¯ x at most distance away from ¯ x eq . W e further have from the Lipschitz inequal- ity , k f ( x k 1 ) − f ( ¯ x ) k ≤ L k x k 1 − ¯ x k . (22) From Proposition 4, we kno w that the possible v alues of f ( ¯ x ) lie within a sphere of radius L k x k 1 − ¯ x k centered at f ( x k 1 ) . The diameter of the above sphere bounds the maximum error in the estimate of f ( ¯ x ) , i.e., k y − f ( ¯ x ) k ≤ 2 L k x k 1 − ¯ x k , ∀ y ∈ S ( x k 1 , ¯ x ) . For k 1 chosen as in (21), the above inequality can be written as k y − f ( ¯ x ) k ≤ 4 L, ∀ y ∈ S ( x k 1 , ¯ x ) . Now for another k 2 chosen as in (21) such that f ( x k 1 ) 6 = f ( x k 2 ) , we ha ve k f ( x k 2 ) − f ( ¯ x ) k ≤ L k x k 2 − ¯ x k . (23) The intersections of the env elopes constructed from (22) and (23) is depicted in Fig. 5. W e thus obtain a tighter bound on the error in the estimate of f ( ¯ x ) via the diameter of the Fig. 5: Intersection of set of possible values of f ( ¯ x ) , giv en two samples x k 1 , x k 2 n − 2 dimensional sphere obtained at the intersection of n − 1 dimensional spheres, as gi ven by k y − f ( ¯ x ) k ≤ 2 r L 2 ( k x k 1 − ¯ x k 2 + k x k 2 − ¯ x k 2 ) − k f ( x k 2 ) − f ( x k 1 ) k 2 2 , ≤ min (2 L k x k 2 − ¯ x k , 2 L k x k 1 − ¯ x k ) ≤ 4 L, ∀ y ∈ S ( x k 1 , ¯ x ) ∩ S ( x k 2 , ¯ x ) ⇒ max y ∈S ( x k 1 , ¯ x ) ∩S ( x k 2 , ¯ x ) k y − f ( ¯ x ) k ≤ 2 r L 2 ( k x k 1 − ¯ x k 2 + k x k 2 − ¯ x k 2 ) − k f ( x k 2 ) − f ( x k 1 ) k 2 2 , ≤ min (2 L k x k 2 − ¯ x k , 2 L k x k 1 − ¯ x k ) ≤ 4 L T aking intersections using all the en velopes collected (which are non-empty due to Lipschitz property of f ( · ) on X ) further shrinks the possible error and hence yields the desired result. Pr oof of Proposition 5: Consider any vector [ x > y > 1] > ∈ R 2 n +1 such that x ∈ X I and [ x > y > ] > ∈ G ( f ) . Given that BMI (17) is feasible, we multiply [ x > y > 1] > ∈ R 2 n +1 on both sides of (17) for any m ∈ { 1 , 2 , . . . , n I } to get n I X j =1 " − ρ j m P j 0 ρ j m P j ¯ x eq 0 P m − P m ¯ x eq ¯ x > eq ρ j m P j − ¯ x > eq P m − ¯ x > eq ( ρ j m P j − P m ) ¯ x eq + ρ j m − 1 # − N − 1 X k =0 τ k Q f L ( x k ) 0 , ⇒ n I X j =1 − ρ j m h x 1 i > ¯ P j h x 1 i + n I h y 1 i > ¯ P m h y 1 i − " x y 1 # > N − 1 X k =0 τ k Q f L ( x k )) " x y 1 # ≤ 0 , ⇒ h y 1 i > ¯ P m h y 1 i ≤ 0 , ∀ m = 1 , 2 . . . , n I , where ¯ P j = P j − P j ¯ x eq − ¯ x > eq P j ¯ x > eq P j ¯ x eq − 1 . The last implication follows from the fact x ∈ X I and G ( f ) ⊆ E f (as proved in Lemma 1). The last inequality further implies that G ( f ) 3 y ∈ X I , for all x ∈ X I , and hence, that X I is in variant.
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment