Persistence of Excitation in Reproducing Kernel Hilbert Spaces, Positive Limit Sets, and Smooth Manifolds

This paper studies the relationship between the positive limit sets of continuous semiflows and the newly introduced definition of persistently excited (PE) sets and associated subspaces of reproducing kernel Hilbert (RKH) spaces. It is shown that if…

Authors: Andrew J. Kurdila, Jia Guo, Sai Tej Paruchuri

Andrew J. Kurdila^a, Jia Guo^a, Sai Tej Paruchuri^a, Parag Bobade^b

^a Department of Mechanical Engineering, Virginia Tech, Blacksburg, VA 24060, USA
^b Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA

Abstract

This paper studies the relationship between the positive limit sets of continuous semiflows and the newly introduced definition of persistently excited (PE) sets and associated subspaces of reproducing kernel Hilbert (RKH) spaces. It is shown that if the RKH space contains a rich collection of cut-off functions, persistently excited sets are contained as subsets of the positive limit set of the semiflow. The paper demonstrates how the new PE condition can be used to guarantee convergence of function estimates in the RKH space embedding method for adaptive estimation. In particular, the paper is applied to uncertain ODE systems with positive limit sets given by certain types of smooth manifolds, and it establishes convergence of adaptive function estimates over the manifolds.

Key words: Adaptive Estimation, Reproducing Kernel, Persistence of Excitation

1 Introduction

In this paper we study the method of reproducing kernel Hilbert (RKH) space embedding for adaptive estimation of uncertain, or unknown, dynamic systems that are governed by systems of coupled, nonlinear ordinary differential equations (ODEs). The RKH embedding method for adaptive estimation has been introduced in [1,2,3]. This general formulation constructs estimates in $\mathbb{R}^d$ of the state of the unknown governing ODEs as well as estimates of an unknown function contained in the RKH space $H$ that characterizes the uncertain governing ODEs.
This paper investigates several unanswered questions related to the new notion of persistency of excitation (PE) that has been introduced in the latter two of these three papers. We derive relationships between the PE condition over an indexing set $\Omega$ that is a subset of the state space $X$ and the positive limit sets of semiflows over $X$. We also construct or select good kernels that define the RKH space $H$ in applications where the governing semiflows exhibit certain asymptotic structural properties. These latter properties are expressed in terms of PE conditions over some classes of smooth manifolds.

The RKH embedding method generates a distributed parameter system (DPS), and its associated estimates evolve in the generally infinite dimensional space $\mathbb{R}^d \times H$. There are many nontrivial questions about approximations and realizable implementations of the method, and some study of the convergence of finite dimensional approximations of solutions of the RKH embedding equations is given in [2,3]. In this short paper, we only consider the convergence of the estimates generated by the governing DPS in the infinite dimensional state space $\mathbb{R}^d \times H$ that defines the RKH embedding formulation. The current investigation can be viewed as providing insight and intuition into the structure of the solutions of the RKH embedding equations, which is much needed for the effective choice of approximating subspaces in practical implementations.

Email addresses: kurdila@vt.edu (Andrew J. Kurdila), jguo18@vt.edu (Jia Guo), saitejp@vt.edu (Sai Tej Paruchuri), paragsb@umich.edu (Parag Bobade).
Preprint submitted to Automatica, 27 September 2019.

1.1 Adaptive Estimation for Uncertain Nonlinear ODEs

A common setup for estimation of uncertain nonlinear systems starts with an ordinary differential equation that can be decomposed into known and unknown parts,

$$\dot{x}(t) = g_0(x(t)) + g(x(t)), \qquad x(0) = x_0, \tag{1}$$

with $x(t) \in \mathbb{R}^d$ for $t \in \mathbb{R}^+$, $g_0 : \mathbb{R}^d \to \mathbb{R}^d$ a known function, and $g : \mathbb{R}^d \to \mathbb{R}^d$ an unknown function. One important problem of adaptive estimation for such a nonlinear system is to use the full state observations $\{x(t)\}_{t \in \mathbb{R}^+}$ to construct an evolution law for a state estimate $\hat{x}(t)$ that approximates $x(t)$, in the sense that $\hat{x}(t) \to x(t)$ as $t \to \infty$. In the language of adaptive estimation this is referred to as convergence of state estimates. A canonical model estimator for the original equation might choose the evolution law for the estimate to be

$$\dot{\hat{x}}(t) = A\hat{x}(t) + g_0(x(t)) + \hat{g}(t, x(t)) - A x(t) \tag{2}$$

for a known matrix $A \in \mathbb{R}^{d \times d}$, although many alternatives exist. Here $\hat{g}(t) := \hat{g}(t, \cdot)$ is an estimate of the unknown function $g$. On defining the state error $\tilde{x}(t) = x(t) - \hat{x}(t)$ and the function error $\tilde{g}(t) := \tilde{g}(t, \cdot) := g(\cdot) - \hat{g}(t, \cdot)$, subtracting Equation 2 from Equation 1 gives the associated error equation

$$\dot{\tilde{x}}(t) = A\tilde{x}(t) + \tilde{g}(t, x(t)). \tag{3}$$

At a bare minimum then, adaptive estimation methods for the above uncertain nonlinear ODEs must guarantee that trajectories of this error equation converge to zero. It is usually considerably more difficult to guarantee that the time-varying function estimate $\hat{g}(t, \cdot) : \mathbb{R}^d \to \mathbb{R}^d$ converges in the sense that $\hat{g}(t, \cdot) \to g$ as $t \to \infty$. It is this latter problem that is the primary concern of this paper. To gain some appreciation of the issues and nuances arising in the function estimation problem, we consider two examples.
Figures 1 and 2 depict the phase portraits of the uncertain systems studied in Examples 1 and 2, respectively. These figures also include plots of the error in function estimates obtained by the RKH embedding technique with the kernel of the RKH space selected as described in Corollary 11.

Example 1. The first example is a case of a supercritical Hopf bifurcation, which can be found in many textbooks on dynamical systems [4,5]. The system equations are given by

$$\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = \begin{pmatrix} x_2 + x_1(1 - x_1^2 - x_2^2) \\ -x_1 + x_2(1 - x_1^2 - x_2^2) \end{pmatrix}. \tag{4}$$

Here we define $g_0(x) := \{x_2 + x_1(1 - x_1^2 - x_2^2),\ 0\}^T$ and $g(x) := \{0,\ -x_1 + x_2(1 - x_1^2 - x_2^2)\}^T$ in Equation 1. The figures below make clear that the positive limit set $\omega^+(x_0)$ is the circle $S^1$ for all nonzero $x_0 \in \mathbb{R}^2$ for this dynamical system (the origin is an equilibrium). When the method of RKH embedding is applied to this uncertain nonlinear system, we can obtain estimates $\hat{g}(t)$ of $g$ whose error is depicted in Figures 1(b,c). These estimates have been constructed from finite dimensional approximations as discussed in [2] using basis functions that are a collection of (extrinsic) Sobolev-Matern kernels discussed in Corollary 11 centered over the positive limit set.

Example 2. In this example, the dynamical system contains a homoclinic loop. The example is studied in detail in [4]. The governing equations are

$$\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = \begin{pmatrix} 2x_2 \\ 2x_1 - 3x_1^2 + \lambda x_2 (x_1^3 - x_1^2 + x_2^2) \end{pmatrix}. \tag{5}$$

In this example we define $g_0(x) := \{2x_2,\ 0\}^T$ and $g(x) := \{0,\ 2x_1 - 3x_1^2 + \lambda x_2(x_1^3 - x_1^2 + x_2^2)\}^T$ in Equation 1. Again, application of the RKH embedding method of adaptive estimation to this problem can yield approximations $\hat{g}(t)$ of $g$ with error depicted in Figure 2(b,c).
These estimates have been constructed from finite dimensional approximations as discussed in [2] using basis functions that are a collection of (extrinsic) Sobolev-Matern kernels discussed in Corollary 11 centered over the positive limit set.

Several observations about these two examples of application of the RKH embedding method are noteworthy and motivate this paper. In each case, the governing equations have the form of the nonlinear ODEs given in Equation 1, and they have error equations of the form in Equation 14 that is studied in detail in this paper. The first important observation to make from the examples is that the positive orbit $\Gamma^+(x_0) := \bigcup_{t \in \mathbb{R}^+} x(t)$ is the only data that is used to construct estimates of the unknown function. If we were interested in some offline, optimization-based estimate of an unknown function, it would come as no surprise that its estimates consist of functions that are supported on or near the prescribed data. We will see that, roughly speaking, convergence of the RKH embedding method is guaranteed by a newly introduced PE condition, and estimates of the unknown function are built over regions of the state space where trajectories are in some sense "concentrated." Here the notion of concentration is understood in terms of the positive limit set $\omega^+(x_0)$, which is known to attract trajectories if the orbit is precompact [6].

Moreover, both of the positive limit sets in the examples are striking and exhibit considerable structure: they can often be interpreted as manifolds. In Example 1, shown in Figure 1, global solutions of the governing equations exist for every initial condition in $\mathbb{R}^d$. All trajectories other than the equilibrium at the origin converge to the positive limit set, which happens to be the canonical connected, compact, Riemannian manifold $S^1$.
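
The claim that trajectories of Example 1 accumulate on $S^1$ can be checked numerically. The following is a minimal sketch, not the authors' code: it integrates Equation 4 with a hand-written classical Runge-Kutta scheme, and the step size, final time, and initial conditions are arbitrary choices made for illustration.

```python
import numpy as np

def hopf_field(x):
    """Right-hand side of Equation 4 (Example 1)."""
    s = 1.0 - x[0] ** 2 - x[1] ** 2
    return np.array([x[1] + x[0] * s, -x[0] + x[1] * s])

def rk4_flow(field, x0, t_final, dt=1e-2):
    """Integrate x' = field(x) with classical RK4 and return the whole trajectory."""
    n_steps = int(round(t_final / dt))
    traj = np.empty((n_steps + 1, len(x0)))
    traj[0] = x0
    for k in range(n_steps):
        x = traj[k]
        k1 = field(x)
        k2 = field(x + 0.5 * dt * k1)
        k3 = field(x + 0.5 * dt * k2)
        k4 = field(x + dt * k3)
        traj[k + 1] = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return traj

# One initial condition inside the unit circle and one outside (illustrative values).
inside = rk4_flow(hopf_field, np.array([0.1, 0.0]), t_final=30.0)
outside = rk4_flow(hopf_field, np.array([2.0, -1.5]), t_final=30.0)

radius_inside = np.linalg.norm(inside[-1])
radius_outside = np.linalg.norm(outside[-1])
print(radius_inside, radius_outside)  # both radii should be very close to 1
```

In polar coordinates the system reads $\dot{r} = r(1 - r^2)$, $\dot{\theta} = -1$, which is why both runs settle on the unit circle while continuing to rotate.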
The flows generated for various values of the parameter $\lambda$ in Example 2 exhibit more diverse qualitative limiting behaviors. As shown in Figure 2(a), when $\lambda = 0$ the homoclinic loop encircles a stable region. Trajectories inside this region are all closed periodic orbits. For these initial conditions, the positive limit sets are smooth, regularly embedded submanifolds of $\mathbb{R}^d$. The form of these embedded manifolds is not as simple as in Example 1, that is, they are not one of the well-known, "iconic" manifolds. When $\lambda < 0$, the equilibrium $(x_e, 0)$ for $x_e > 0$ becomes unstable. It can be shown that the homoclinic loop becomes the $\omega$-limit set of all the trajectories starting from this region [4].

In either case, the examples illustrate a phenomenon that is common to many uncertain estimation problems. While the observations $\Gamma^+(x_0) = \bigcup_{t \in \mathbb{R}^+} x(t)$ are contained in $\mathbb{R}^d$, there is an underlying set or manifold that supports, approximately supports, or attracts the observed trajectories. We are interested in this paper in understanding conditions that establish that the RKH embedding method "converges over" these underlying structures.

Fig. 1. Example 1: (a) phase portrait, (b) error in function estimate, (c) error contour.

Fig. 2. Example 2: (a) phase portrait, (b) error in function estimate, (c) error contour.

To frame our discussion of the RKH embedding method we briefly review the general strategy of "linear-in-parameters" (LIP) methods for adaptive estimation of uncertain nonlinear systems of ODEs. So-called LIP estimation might best be described as a part of the technical folklore for methods in adaptive estimation. This approach is ubiquitous in the adaptive estimation literature and is a well-known tool among researchers who study this topic.
It is safe to say that the most popular versions of adaptive estimation for the above type of uncertain nonlinear ODEs choose the function estimate in terms of a linear-in-parameters representation $\hat{g}(t, \cdot) = \sum_{k=1}^{n} \phi_k(\cdot)\alpha_k(t) = \Phi^T(\cdot)\alpha(t)$ with $\alpha_k(t)$ a time-varying parameter, $\phi_k(\cdot) : \mathbb{R}^d \to \mathbb{R}^d$ a function for $k = 1, \ldots, n$, the vector $\alpha(t) := \{\alpha_1(t), \ldots, \alpha_n(t)\}^T \in \mathbb{R}^n$, and the matrix of functions $\Phi^T(\cdot) = [\phi_1(\cdot), \ldots, \phi_n(\cdot)]$. Here the functions in $\Phi$ are known as the regressors, a common term arising from applications in nonlinear regression. If the unknown function $g$ has the representation $g = \sum_{k=1}^{n} \phi_k(\cdot)\alpha_k^* = \Phi^T(\cdot)\alpha^*$ for some unknown constants $\{\alpha_k^*\}_{k \le n}$, then the error in function estimates is $\tilde{g}(t, \cdot) = \Phi^T(\cdot)\tilde{\alpha}(t)$ with the parameter error $\tilde{\alpha}(t) := \alpha^* - \alpha(t) \in \mathbb{R}^n$. In this case the error in the function estimates satisfies $\tilde{g}(t) \to 0$ if the finite set of parameter errors converges, $\tilde{\alpha}(t) \to 0 \in \mathbb{R}^n$. It is for this reason that the task of estimating functions in the usual LIP framework reduces to questions of parameter convergence in $\mathbb{R}^n$.

One of the foundations of modern adaptive estimation for ODEs has been recognition of the fact that persistency of excitation conditions can be sufficient to guarantee parameter convergence. The notion of persistence of excitation in its conventional form, that is, as it pertains to the ODE error Equations 14, is defined next.

Definition 3. The regressors $\Phi$ are persistently excited by the positive orbit $\Gamma^+(x_0)$ if there are positive constants $\gamma_1$, $\gamma_2$, $T$, and $\Delta$ such that for each $t \ge T$,

$$\gamma_1 \|\alpha\|^2_{\mathbb{R}^n} \le \int_t^{t+\Delta} \alpha^\top \Phi\big(x(\tau)\big) \Phi\big(x(\tau)\big)^\top \alpha \, d\tau \le \gamma_2 \|\alpha\|^2_{\mathbb{R}^n} \tag{6}$$

for all $\alpha \in \mathbb{R}^n$.
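
Definition 3 says that the extreme eigenvalues of the windowed matrix $\int_t^{t+\Delta} \Phi(x(\tau))\Phi(x(\tau))^\top d\tau$ are bounded away from zero and infinity. The sketch below evaluates that matrix by the trapezoid rule for two regressor sets along the periodic orbit $x(t) = (\cos t, -\sin t)$ of Example 1. It assumes scalar-valued regressors (so $\Phi(x) \in \mathbb{R}^n$), and the function names and the two regressor choices are hypothetical, introduced only for illustration.

```python
import numpy as np

def excitation_gram(regressor, trajectory, t, window, n_quad=2000):
    """Trapezoid approximation of  int_t^{t+window} Phi(x(s)) Phi(x(s))^T ds."""
    s = np.linspace(t, t + window, n_quad + 1)
    vals = np.array([regressor(trajectory(si)) for si in s])  # shape (n_quad + 1, n)
    outer = vals[:, :, None] * vals[:, None, :]               # Phi Phi^T at each node
    ds = window / n_quad
    return ds * (outer.sum(axis=0) - 0.5 * (outer[0] + outer[-1]))

def circle_orbit(t):
    """Periodic solution of Equation 4 on the unit circle (clockwise rotation)."""
    return np.array([np.cos(t), -np.sin(t)])

def rich_regressors(x):
    # Two independent directions along the orbit: phi_1 = x_1, phi_2 = x_2.
    return np.array([x[0], x[1]])

def poor_regressors(x):
    # Linearly dependent regressors: phi_2 = 2 * phi_1.
    return np.array([x[0], 2.0 * x[0]])

gram_rich = excitation_gram(rich_regressors, circle_orbit, t=5.0, window=2.0 * np.pi)
gram_poor = excitation_gram(poor_regressors, circle_orbit, t=5.0, window=2.0 * np.pi)

eig_rich = np.linalg.eigvalsh(gram_rich)  # candidates for gamma_1 and gamma_2
eig_poor = np.linalg.eigvalsh(gram_poor)  # smallest eigenvalue vanishes: not PE
print(eig_rich, eig_poor)
```

For the first regressor set the matrix is $\pi I$, so Equation 6 holds with $\gamma_1 = \gamma_2 = \pi$ over a window of one period. For the second set the matrix has rank one, so no positive $\gamma_1$ exists.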
The papers [7,8,9,10,11] and a number of standard texts [12,13,14,15] on adaptive estimation make a careful study of the PE condition in Definition 3 and how it facilitates a proof that the parameter error $\tilde{\alpha}(t)$ converges to zero as $t \to \infty$.

In some cases it is too much to hope that all the parameters $\hat{\alpha}(t) := \{\hat{\alpha}_1(t), \ldots, \hat{\alpha}_n(t)\}$ in the approximations $\hat{g}(t, \cdot) = \sum_{k=1}^{n} \phi_k(\cdot)\hat{\alpha}_k(t)$ converge. A means of weakening the above PE condition introduces the notion of partial persistency of excitation. One version of the definition of a partial PE condition modifies the inequalities above and replaces them with the condition that

$$\gamma_1 \|P_V \alpha\|^2_{\mathbb{R}^n} \le \alpha^T \int_t^{t+\Delta} \Phi(x(\tau))\Phi^T(x(\tau)) \, d\tau \cdot \alpha \le \gamma_2 \|P_V \alpha\|^2_{\mathbb{R}^n}$$

with $P_V : \mathbb{R}^n \to \mathbb{R}^n$ a projection onto a linear subspace $V \subset \mathbb{R}^n$. This generalization can then be used to guarantee, as a special case, that only certain of the coefficient estimates converge, but not all.

As we will discuss in more detail shortly, the method of RKH embedding recasts the above adaptive estimation problem so that the state errors $\tilde{x}(t)$ and function errors $\tilde{g}(t) := \tilde{g}(t, \cdot)$ evolve in a product space having the form $\mathbb{R}^d \times \mathbb{H}$ with $\mathbb{H} = H^d$ a vector-valued RKH space of functions. The space $\mathbb{H}$ is known as the hypothesis space, and its selection is based on what class of priors or information seems relevant regarding the estimation problem at hand. The precise form of the PE definition in this paper, and the associated theorems that depend on it, are written for a model problem with the vector-valued function $g := Bf$ with $f : \mathbb{R}^d \to \mathbb{R}$ a scalar-valued function and $B \in \mathbb{R}^{d \times 1}$. This restriction does not seem too severe, simplifies the notation considerably, and conveys the underlying geometric relationships between orbits $\Gamma^+(x_0)$ of semiflows starting at $x_0 \in X$, persistency of sets $\Omega$, and RKH spaces $H_\Omega$.
Moreover, the extension to general vector-valued functions would proceed in principle along the same lines as the strategy in [16] used for consensus estimation.

1.2 Overview of New Results

In the papers [2,3] some of the standard questions regarding RKH embedding have been discussed, such as existence of solutions and well-posedness, continuous dependence on initial conditions, as well as stability and convergence of finite dimensional approximations. In this paper we focus primarily on building more intuition and insight regarding the newly introduced notion of persistency of excitation in the RKH embedding method. Starting with an RKH space $H_X = \overline{\mathrm{span}}\{K(x, \cdot) \mid x \in X\}$ of functions over $X$, we then define for some indexing set $\Omega \subseteq X$ the subspace $H_\Omega := \overline{\mathrm{span}}\{K(x, \cdot) \mid x \in \Omega\}$. Note carefully that functions in $H_\Omega$ are supported on $X$, which is why $\Omega$ is referred to as the indexing set. This space is related to, but distinct from, the space $R_\Omega(H_X)$ of restrictions of functions in $H_X$ to the subset $\Omega$. We have the following definition of persistence of excitation for the RKH error Equations 15.

Definition 4. The indexing set $\Omega$ and RKH space $H_\Omega$ are persistently excited by the orbit $\Gamma^+(x_0)$ if there are positive constants $\gamma_1$, $\gamma_2$, $T$, and $\Delta$ such that for each $t \ge T$,

$$\gamma_1 \|f\|^2_{H_\Omega} \le \int_t^{t+\Delta} \big( E^*_{x(\tau)} E_{x(\tau)} f, f \big)_{H_X} \, d\tau \le \gamma_2 \|f\|^2_{H_\Omega}. \tag{7}$$

Here $E_x : f \mapsto f(x)$ is the evaluation functional at $x$ and $E_x^*$ is its adjoint operator. The classical definition given above defines persistency of excitation for a specific set of regressors and the trajectory of a semidynamical system. The new PE condition holds for an indexing set $\Omega \subseteq X$, a space of functions $H_\Omega$, and a trajectory of a semidynamical system.
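
Since $(E_x^* E_x f, f)_{H_X} = (E_x f)^2 = f(x)^2$, the integrand in Equation 7 is simply the squared value of $f$ along the trajectory. On a finite dimensional subspace $\mathrm{span}\{K(c_i, \cdot)\} \subset H_\Omega$ with centers $c_i \in \Omega$, writing $f = \sum_i a_i K(c_i, \cdot)$ gives $\|f\|^2 = a^\top \mathbf{K} a$ and $\int f(x(\tau))^2 d\tau = a^\top \mathbf{G} a$, so the best constants $\gamma_1, \gamma_2$ on that subspace are the extreme generalized eigenvalues of the pair $(\mathbf{G}, \mathbf{K})$. The sketch below computes them under several assumptions that are not from the paper: a Gaussian kernel stands in for the Sobolev-Matern kernels, $\Omega = S^1$ is sampled at six centers, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, width=0.5):
    """Gaussian kernel matrix [K(x_i, y_j)]; a stand-in kernel chosen for illustration."""
    diff = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * width ** 2))

def pe_constants(centers, trajectory_points, dt):
    """Extreme generalized eigenvalues of (G, K) on span{K(c_i, .)}."""
    K = gaussian_kernel(centers, centers)                 # ||f||^2 = a^T K a
    k_tau = gaussian_kernel(trajectory_points, centers)   # row j holds K(c_i, x(tau_j))
    G = dt * k_tau.T @ k_tau                              # Riemann sum of int f(x(tau))^2 dtau
    L = np.linalg.cholesky(K)
    M = np.linalg.solve(L, np.linalg.solve(L, G).T)       # L^{-1} G L^{-T}
    eigs = np.linalg.eigvalsh(0.5 * (M + M.T))
    return eigs[0], eigs[-1]

# Six centers on the unit circle, playing the role of the indexing set Omega.
angles = np.linspace(0.0, 2.0 * np.pi, 6, endpoint=False)
centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# One period of the orbit of Example 1 on the circle, and an orbit resting at one point.
n_samples = 400
dt = 2.0 * np.pi / n_samples
tau = dt * np.arange(n_samples)
moving = np.stack([np.cos(tau), -np.sin(tau)], axis=1)
resting = np.tile(np.array([1.0, 0.0]), (n_samples, 1))

g1_moving, g2_moving = pe_constants(centers, moving, dt)
g1_resting, g2_resting = pe_constants(centers, resting, dt)
print(g1_moving, g2_moving, g1_resting, g2_resting)
```

The orbit that sweeps the whole circle yields a strictly positive lower constant on this subspace, while the orbit that sits at a single point of $\Omega$ yields a rank-one matrix $\mathbf{G}$ and a vanishing lower constant.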
It should be noted that the PE condition in Definition 4 is over a set $\Omega \subseteq X$, which may or may not be the entire state space $X$. In this sense it bears some resemblance to the partial PE conditions that are defined over subspaces of parameters in $\mathbb{R}^n$. The similarity in form is all the more apparent when we note that $\|\cdot\|_{H_\Omega} := \|P_\Omega(\cdot)\|_{H_X}$ with $P_\Omega$ the $H_X$-orthogonal projection onto $H_\Omega$: the closed subspace $H_\Omega$ is endowed with the norm it inherits from $H_X$. It should also be pointed out that both the set $\Omega$ and the kernel $K_X$ (that determines $H_X$, and therefore determines $H_\Omega \subseteq H_X$) are free to be selected when trying to apply the above PE condition in the method of RKH embedding.

Intuitively, we expect some kernels are more useful than others in the RKH embedding method, and one of the primary thrusts of this paper is to explore the alternatives. As we will see, we obtain a strong conclusion about what type of indexing sets $\Omega$ are PE when we restrict attention to kernels that define function spaces that are good at separating important subsets of $X$. There are many ways to think about how well the functions in a space $H_X$ separate points or sets. We find that one important class of RKH spaces consists of functions that feature a rich set of (possibly smooth) cut-off or bump functions. Our first primary result in Theorem 9 is that if the RKH space does indeed contain a rich collection of these functions, then we have the following implication,

$$\text{``}\Omega \text{ and } H_\Omega \text{ are PE''} \implies \Omega \subseteq \omega^+(x_0), \tag{8}$$

with $\omega^+(x_0)$ the positive limit set of a trajectory starting at $x_0$. In other words, in terms of the new definition of PE, if a trajectory $\Gamma^+(x_0)$ persistently excites a set $\Omega$, the set $\Omega$ is contained in the positive limit set $\omega^+(x_0)$.
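
The mechanism behind Equation 8 can be seen on Example 1. Take a smooth function supported in a small ball around a point that the trajectory visits only transiently; the windowed integral $\int_t^{t+\Delta} f(x(\tau))^2 d\tau$ is positive for early windows and vanishes for late ones, so no lower bound $\gamma_1 > 0$ can hold for all $t \ge T$. The sketch below uses the closed-form solution of Equation 4 in polar coordinates. The compactly supported function is a simplified stand-in for the cut-off functions of the text, and the center, radius, and window are illustrative assumptions.

```python
import numpy as np

def hopf_orbit(t, r0=0.1, theta0=0.0):
    """Closed-form solution of Equation 4: r' = r (1 - r^2), theta' = -1."""
    u = 1.0 + (1.0 / r0 ** 2 - 1.0) * np.exp(-2.0 * t)   # u = 1 / r^2
    r = 1.0 / np.sqrt(u)
    theta = theta0 - t
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

def cutoff(y, center, radius):
    """Smooth function equal to 1 at the center and 0 outside the ball of given radius."""
    q = np.sum((y - center) ** 2, axis=-1) / radius ** 2
    out = np.zeros_like(q)
    inside = q < 1.0
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - q[inside]))
    return out

def window_energy(t, window, center, radius, n_quad=4000):
    """Trapezoid approximation of  int_t^{t+window} f(x(tau))^2 dtau  for f = cutoff."""
    tau = np.linspace(t, t + window, n_quad + 1)
    vals = cutoff(hopf_orbit(tau), center, radius) ** 2
    dtau = window / n_quad
    return dtau * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

center = np.array([0.1, 0.0])   # the initial condition, a point that is not on S^1
early = window_energy(0.0, 2.0 * np.pi, center, radius=0.05)
late = window_energy(20.0, 2.0 * np.pi, center, radius=0.05)
print(early, late)  # early is positive, late is exactly zero
```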
The implication in Equation 8 provides novel insight into the structure of this type of persistently excited system: PE sets are not transient but rather consist of points whose neighborhoods are visited by the trajectory infinitely often. In fact, a bit more is actually required, as illustrated in Theorem 7: the "time of visitation" is bounded below in a certain sense. This intuition should be compared with the interpretation of the usual Definition 3: a vector signal $t \mapsto \Phi(x(t)) \in \mathbb{R}^n$ is (partially) PE if on average it visits all directions in (a subspace of) $\mathbb{R}^n$.

We should also emphasize at this point that while the intent of this paper is to inform and enhance our understanding of the RKH embedding method, the result in Equation 8 is not dependent on the fact that the trajectory under study $t \mapsto x(t) \in \mathbb{R}^d$ happens to be the solution of our model ODE problem in Equation 1. We have worked to express the condition in Equation 8 in very general terms. As long as the PE condition holds under the hypothesis described above, and $\Gamma^+(x_0)$ is the forward orbit of a continuous semiflow on the complete metric space $(X, d_X)$, we conclude that $\Omega \subseteq \omega^+(x_0)$.

It is then natural to ask how to choose kernels that exhibit the separation properties that enable the conclusion above. One approach, which we refer to as an intrinsic method, applies to cases in which $X = \Omega$ is in fact a compact, connected, smooth, Riemannian manifold $M$. Here we assume that $M$ is known and that the kernel over $M$ is known. Example 1 is the type of problem we have in mind here, where the positive limit set is a simple well-known manifold. Numerous intrinsic kernels can be defined over the circle, the sphere, or more generally homogeneous manifolds [17].
In this case we choose kernels that guarantee that the native space $H_M$ is in fact equivalent to a certain Sobolev space $W^{r,2}(M)$ for $r$ large enough. That such equivalences are possible follows from the Sobolev embedding theorem [18]. The Sobolev spaces defined over such a manifold $M$ can be shown to contain a rich family of smooth cutoff functions. In this framework, if the forward orbit $\Gamma^+(x_0)$ for some $t_0 \in \mathbb{R}$ of any continuous flow on $M$ is PE, Corollary 10 implies that the manifold is transitive. That is, it supports a flow that has a dense orbit. The study of when a particular manifold is transitive is of interest in its own right [19], so the new PE condition can be used to study whether a manifold is transitive.

While this is an interesting result, it is not usually strictly or directly applicable to understanding the convergence properties of the RKH embedding problem. There are two essential problems here. First, there are many problems where the positive limit set might be a nice smooth, compact, Riemannian manifold $\omega^+(x_0) = M$, but we do not know the form of the manifold a priori. In such cases defining the kernel in closed form to be used in analysis or approximation is impossible. Example 2 is of this type: the positive limit set is a smooth manifold, but it is not one over which catalogs of intrinsic kernels are defined. It is also possible, on the other hand, that we do know the exact form of the manifold, but it is not one of the standard manifolds like the circle, sphere, or torus. Even if we can in principle define the kernel through the fundamental solutions of certain elliptic differential operators on the manifold $M$ as in Corollary 10, it may be intractable to compute this fundamental solution for the manifold at hand. This problem can be as hard as, or perhaps harder than, the original estimation problem.
It should be kept in mind that the aim of the RKH embedding method is to carry out adaptive estimation of uncertain nonlinear ODEs. It is typically the case in such situations that the exact form of the positive limit set is unknown. That is, we are more interested in problems like Example 2, in contrast to Example 1. In this case, we assume that $M$ is an unknown, connected, smooth, (regularly) embedded submanifold of $\mathbb{R}^d$, and we resort to an extrinsic method. In this technique we build a well-defined kernel on $X$ for a large set $X$ that contains $M$, and then we define a kernel by restriction on the manifold $M \subseteq X$. A plethora of good kernels exist over the large space $X = \mathbb{R}^d$. Taking care to choose the kernel smooth enough, we obtain a kernel on $M$ defined by restriction. The expression for the kernel on $M$ is given in terms of the kernel on the larger space $X$, which is known. Corollary 11 then shows that $M = \omega^+(x_0)$ in this case. All of the numerical examples depicted in Figures 1 and 2 have been computed using this extrinsic method.

2 Notation

In this paper the symbols $\mathbb{N}^+$, $\mathbb{R}$, $\mathbb{R}^+$ denote the non-negative integers, real numbers, and non-negative real numbers, respectively. The expression $a \lesssim b$ means that there is a constant $c > 0$ that does not depend on $a, b$ such that $a \le c \cdot b$. The symbol $\gtrsim$ is defined similarly. The paper makes use of Lebesgue spaces and Sobolev spaces on subsets $\Omega$ of $\mathbb{R}^d$, and it also uses these spaces when they are defined more generally on measurable subsets $\Omega$ of certain Riemannian manifolds $M$. The norm on the Banach spaces $L^p(\Omega) := L^p_\mu(\Omega)$ of $\mu$-integrable functions over $\Omega \subseteq \mathbb{R}^d$ takes the familiar form $\|f\|^p_{L^p(\Omega)} := \int_\Omega |f(x)|^p \, d\mu$ with the measure $\mu$ on $\mathbb{R}^d$ for $1 \le p < \infty$, with the usual modification for $p = \infty$.
Recall that when $\Omega \subseteq \mathbb{R}^d$, the Sobolev space $W^{r,p}(\Omega)$ for a positive integer $r$ consists of functions that have weak derivatives of all orders less than or equal to $r$ in $L^p(\Omega)$, and the norm on these Banach spaces is usually written

$$\|f\|^p_{W^{r,p}(\Omega)} := \sum_{0 \le |\alpha| \le r} \int_\Omega \left| \frac{\partial^{|\alpha|} f}{\partial x^\alpha} \right|^p dx \tag{9}$$

with the summation taken over all multi-indices $\alpha = (\alpha_1, \ldots, \alpha_d)$, $|\alpha| = \sum_{i=1}^{d} \alpha_i$, and here the measure $\mu$ is selected to be Lebesgue measure $dx$. The Sobolev spaces for non-integer $r > 0$ are defined in terms of interpolation theory as discussed in [18]. A bit more detail is required to define the spaces $L^p(\Omega)$ and $W^{r,p}(\Omega)$ for $\Omega \subseteq M$, with $M$ a manifold. In this paper $M$ is always assumed to be a connected, complete Riemannian manifold with a positive injectivity radius and bounded geometry. See [20], Chapter 7 or [17,21] for a discussion of these properties. For purposes in this paper, it suffices to note that compact, connected Riemannian manifolds and $\mathbb{R}^d$ satisfy these conditions. For such a Riemannian manifold $M$ denote the metric $g$ and inner product $\langle \cdot, \cdot \rangle_{g,p}$ on the tangent space $T_p M$. We define the associated volume measure $d\mu$ on $M$, and its local representation in terms of the set of coordinates $(x^1, \ldots, x^d)$ is given by $d\mu(x) := \sqrt{\det(g)} \, dx^1 \ldots dx^d$. The norm $\|f\|_{L^p(\Omega)}$ has the same expression given above with the measure selected to be the usual volume measure on the manifold $M$. The Banach spaces $W^{r,p}(\Omega)$ for measurable subsets $\Omega \subseteq M$ are equipped with the norm

$$\|f\|^p_{W^{r,p}(\Omega)} := \sum_{j=0}^{r} \int_\Omega |\nabla^j f|^p_{g,p} \, d\mu(p) \tag{10}$$

for $1 \le p < \infty$, where $\nabla$ is the covariant derivative over $(M, g)$. When applied to a set $\Omega \subseteq M = \mathbb{R}^d$, the definitions over the manifold $M$ define norms that are equivalent to the usual ones for Sobolev spaces defined on subsets of $\mathbb{R}^d$.
As discussed in [17,21], in this case the expression in Equation 10 amounts to a simple reweighting of the derivative terms in Equation 9. The Sobolev spaces $W^{r,p}(M)$ for non-integer $r > 0$ are, as in the case above, defined via interpolation theory [20,22]. The non-integer spaces are crucial to the statement of trace theorems for Sobolev spaces, which are used in this paper to study the restrictions of functions that define certain RKH spaces.

3 Reproducing Kernel Hilbert (RKH) Spaces

In this paper we make use of several properties of real, scalar-valued RKH spaces. Also, the analysis below is readily extended to real, vector-valued RKH spaces $\mathbb{H} := H^k$ for $k \in \mathbb{N}$. See [16] for the case where this is carried out in the context of consensus estimation.

3.1 Basic Definitions and Constructions

An RKH space $H_X$ of functions that map a set $X \subseteq \mathbb{R}^d \to \mathbb{R}$ is defined in terms of a real-valued, continuous, symmetric, and positive type function $K_X : X \times X \to \mathbb{R}$ that is referred to as the kernel underlying the RKH space. The subscript on $K_X$ is used to emphasize the set over which the kernel, as well as the functions in $H_X$, are defined. When we say that $K_X$ is of positive type, this means that $\sum_{i,j=1}^{N} \alpha_i K_X(x_i, x_j) \alpha_j =: \alpha^T K_{X,N} \alpha \ge 0$ for all $\{x_1, \ldots, x_N\} \subset X$ and $\alpha := \{\alpha_1, \ldots, \alpha_N\}^T \in \mathbb{R}^N$, with the collocation matrix associated with $\{x_i\}_{1 \le i \le N}$ defined as $K_{X,N} := [K_X(x_i, x_j)] \in \mathbb{R}^{N \times N}$. So all the collocation matrices of a kernel of positive type are positive semidefinite. We say that the kernel is of strictly positive type if all of its collocation matrices $K_{X,N} := [K_X(x_i, x_j)]$ for distinct points $\{x_k\}_{1 \le k \le N}$ are strictly positive definite.
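
The distinction between positive type and strictly positive type can be seen directly from the eigenvalues of a collocation matrix. The sketch below compares a Gaussian kernel with the bilinear kernel $x^\top y$ at five distinct points of $\mathbb{R}^2$. The kernels, point set, and helper name are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def collocation_matrix(kernel, points):
    """Assemble K_{X,N} = [K(x_i, x_j)] for the given points."""
    n = len(points)
    return np.array([[kernel(points[i], points[j]) for j in range(n)]
                     for i in range(n)])

def gaussian(x, y):
    return np.exp(-np.sum((x - y) ** 2) / 2.0)

def bilinear(x, y):
    return float(x @ y)

# Five distinct points in R^2 (illustrative).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

eig_gauss = np.linalg.eigvalsh(collocation_matrix(gaussian, points))
eig_bilin = np.linalg.eigvalsh(collocation_matrix(bilinear, points))

print(eig_gauss)  # all strictly positive
print(eig_bilin)  # nonnegative, but only d = 2 of them are nonzero
```

The bilinear collocation matrix is a Gram matrix of vectors in $\mathbb{R}^2$, so its rank is at most 2 regardless of how many distinct points are used. It is of positive type but not of strictly positive type.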
The function $K_{X,x} := K_X(x, \cdot)$ is known as the kernel function centered at $x \in X$, and a candidate for the inner product of two such functions $K_{X,x}$, $K_{X,y}$ is defined to be $(K_{X,x}, K_{X,y})_{H_X} := K_X(x, y)$ for all $x, y \in X$. The RKH space $H_X$ is the closure of the finite span of the set of functions $\{K_{X,x} \mid x \in X\}$, that is,

$$H_X := \overline{\mathrm{span}}\{K_{X,x} \mid x \in X\} = \left\{ f : X \to \mathbb{R} \;\middle|\; f = \lim_{N \to \infty} \sum_{i=1}^{N} \alpha_{N,i} K_{X, x_{N,i}} \right\},$$

where $\alpha_{N,i} \in \mathbb{R}$ and $x_{N,i} \in X$. The closure above is taken with respect to the norm induced by the candidate inner product. The Hilbert space $(H_X, (\cdot, \cdot)_{H_X})$ above is also known as the native space induced by the kernel $K_X$. It is well known [23,24,25] that with this construction the reproducing property $(K_{X,x}, f)_{H_X} = f(x)$ holds for all $f \in H_X$ and $x \in X$. Any Hilbert space $H$ of functions is in fact an RKH space if all of the evaluation functionals that act on $H$ are bounded operators from $H \to \mathbb{R}$. If it is further known that for some positive constant $\bar{K}_X$ we have $\sup_{x \in X} K_X(x, x) \le \bar{K}_X < \infty$, then the evaluation operator $E_x : H_X \to \mathbb{R}$ given by $E_x := E_{H_X, x} : f \mapsto f(x)$ is a uniformly bounded linear operator, since $|f(x)| = |E_x f| = |(K_{X,x}, f)_{H_X}| \le \sqrt{K_X(x, x)} \, \|f\|_{H_X}$. This implies that $\|f\|_{C(X)} \lesssim \|f\|_{H_X}$, and therefore we have the continuous inclusion $H_X \hookrightarrow C(X)$. We will only consider kernels $K_X$ on $X$ for which such a constant $\bar{K}_X$ exists.

Later in the paper we also make extensive use of the closed subspaces $H_\Omega := \overline{\mathrm{span}}\{K_{X,x} \mid x \in \Omega\} \subseteq H_X$ for some subset $\Omega \subseteq X$. These spaces are important in understanding how the new PE condition is applied. One important fact is that we have the $H_X$-orthogonal decomposition $H_X = H_\Omega \oplus V_\Omega$ with $V_\Omega$ the kernel of the trace or restriction operator on the set $\Omega \subseteq X$, $V_\Omega := \{f \in H_X \mid R_\Omega f = f|_\Omega = 0\}$. That is, $f \in V_\Omega$ if and only if $f(x) = 0$ for all $x \in \Omega$.
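
For a finite indexing set $\Omega = \{c_1, \ldots, c_m\}$ the decomposition $H_X = H_\Omega \oplus V_\Omega$ can be computed explicitly: the $H_X$-orthogonal projection of $g$ onto $H_\Omega$ is the kernel interpolant of $g$ on $\Omega$, and the remainder vanishes on $\Omega$. The sketch below checks this, together with the pointwise bound $|f(x)| \le \sqrt{K_X(x,x)}\,\|f\|_{H_X}$, for $g = K_X(z, \cdot)$. The Gaussian kernel, the points, and the function names are assumptions made only for illustration.

```python
import numpy as np

def kernel(x, y, width=1.0):
    """Gaussian kernel matrix [K(x_i, y_j)] (an illustrative kernel with K(x, x) = 1)."""
    diff = np.atleast_2d(x)[:, None, :] - np.atleast_2d(y)[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * width ** 2))

omega = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # finite indexing set
z = np.array([[0.3, 0.4]])                                            # center outside omega

K_omega = kernel(omega, omega)
k_z = kernel(omega, z)[:, 0]
beta = np.linalg.solve(K_omega, k_z)   # coefficients of the projection onto H_Omega

def g(x):
    return kernel(x, z)[:, 0]          # g = K(z, .)

def projection(x):
    return kernel(x, omega) @ beta     # P_Omega g, an element of H_Omega

def remainder(x):
    return g(x) - projection(x)        # component of g in V_Omega

remainder_on_omega = remainder(omega)  # should vanish on the indexing set

# Norms computed from the reproducing property (K_x, K_y) = K(x, y).
norm_g_sq = kernel(z, z)[0, 0]
norm_proj_sq = beta @ K_omega @ beta
norm_rem_sq = norm_g_sq - 2.0 * beta @ k_z + norm_proj_sq

# Pointwise bound |f(x)| <= sqrt(K(x, x)) ||f|| applied to the remainder.
rng = np.random.default_rng(1)
test_points = rng.uniform(-2.0, 2.0, size=(200, 2))
bound_gap = np.sqrt(norm_rem_sq) - np.abs(remainder(test_points))
print(remainder_on_omega, norm_g_sq, norm_proj_sq + norm_rem_sq, bound_gap.min())
```

The two squared norms add up to $\|g\|^2_{H_X}$, as expected from an orthogonal decomposition.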
This orthogonal decomposition follows from the analysis in [24] and [23]. Finally, in some cases when we specifically discuss spaces derived from restrictions of functions in $H_X$ to a subset $\Omega$, we denote these RKH spaces as $R_\Omega(H_X)$.

3.2 Separation of Closed Sets by Reproducing Kernels

The current paper is interested in understanding how the use of an RKH space can make precise certain notions of convergence in adaptive estimation. We want to understand the geometric implications of the PE condition, that is, what it implies about the trajectories of the dynamical system and the PE set. Essentially, we will "test convergence" in $X$ of trajectories $x(t) \to \bar{x}$ by the condition that $f\big(x(t)\big) \to f(\bar{x})$ for all $f \in H_X$. As we will see, it can be important for understanding persistence that the space $H_X$ contain enough functions to separate, in a certain sense, the points of $X$. Here an example can illustrate the problem. It is known that it is always possible to induce a metric $d_K$ associated with the kernel $K_X$ as described in [26]. The problem is, our semiflows will be continuous with respect to some metric $d_X$, and the topology induced by $d_X$ may not be the same as that generated by $d_K$. In fact it is easy to come up with kernels for which this is the case. As noted in Remark 1 of [26], the bilinear kernel $k_X(x, y) := x^\top y$ for $x, y \in \mathbb{R}^d$ induces an RKH space $H_X$ for which the only subsets that can be separated are linear manifolds. In this specific case, $d_K$ induces a topology that is strictly coarser than the usual topology on $X := \mathbb{R}^d$. Specifically, the metric $d_K$ can be used to discriminate convergence to a particular line through the origin, but not convergence to a point on that line.
We will see that some useful geometric insights regarding the PE condition and positive orbits result if we do not allow the kernel to induce such a coarse topology. We would like the metric generated by the kernel to be equivalent to that on the state space. Reference [26] gives one example of a useful and simple separation property. An RKH space $H_X$ is said to separate a subset $A \subset X$ if for each $b \notin A$ there is a function $f \in H_X$ such that $f(a) = 0$ for all $a \in A$ and $f(b) \neq 0$. This condition can be used to prove that $d_K$ and $d_X$ define the same topology. However, we will employ an even stronger condition, one that is well-suited to the construction of native spaces that contain well-known classes of differentiable functions. We assume the existence of a rich family of bump functions in $H_X$. We say that $b_{r,x} : X \to [0,1]$ is a bump function on $(X, d_X)$ associated with the open ball $B_{r,x} = \{ y \in X \mid d_X(x,y) < r \}$ provided that 1) $b_{r,x} = 1$ on a neighborhood of $x$, and 2) $b_{r,x}$ is zero outside a compact set contained in $B_{r,x}$. It is immediate that if for any open ball $B_{r,x}$ there is an associated bump function $b_{r,x} \in H_X$, then the RKH space $H_X$ separates the $d_X$-closed subsets of $X$. We say that the space $H_X$ contains a rich family of bump functions if it contains a bump function $b_{r,x}$ for each open ball $B_{r,x}$. The construction of smooth bump functions on $X = \mathbb{R}^d$ is a classical exercise in analysis on manifolds; see [27], pages 49–51. In practice, the RKH space $H_X$ (even when $X \neq \mathbb{R}^d$) will be selected so that it contains them. See the proofs below of Corollaries 10 and 11.
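The two defining properties of a bump function can be made concrete on $X = \mathbb{R}^d$ with the Euclidean metric. The following is a minimal sketch of the classical construction based on $e^{-1/s}$. The function names and the inner and outer radius fractions (`inner`, `outer`) are illustrative assumptions. Whether such a function belongs to a particular native space $H_X$ depends on the kernel and is not checked here.

```python
import math

def _h(s):
    """Smooth function that vanishes for s <= 0 and is positive for s > 0."""
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def smooth_step(s):
    """C-infinity step: equals 0 for s <= 0 and 1 for s >= 1."""
    # The denominator never vanishes because s > 0 or 1 - s > 0 always holds.
    return _h(s) / (_h(s) + _h(1.0 - s))

def bump(y, center, r, inner=0.25, outer=0.75):
    """Bump function for the open Euclidean ball B_{r,center}.

    Equals 1 on the closed ball of radius inner*r (a neighborhood of the
    center) and 0 outside the closed ball of radius outer*r, which is a
    compact set contained in B_{r,center} whenever 0 < inner < outer < 1.
    """
    dist = math.dist(y, center)
    s = (outer * r - dist) / ((outer - inner) * r)
    return smooth_step(s)
```

For example, with `center = (0.0, 0.0)` and `r = 2.0`, the function equals one for points within distance 0.5 of the center and zero for points at distance 1.5 or more.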
4 The RKH Embedding Method

In this paper, we study a model problem of adaptive estimation for uncertain nonlinear systems governed by ordinary differential equations that have the form

\[
\dot{x}(t) = A x(t) + B f(x(t)), \qquad x(0) = x_0, \tag{11}
\]

with $A \in \mathbb{R}^{d\times d}$ a Hurwitz matrix, $B \in \mathbb{R}^{d\times 1}$, and $f : \mathbb{R}^d \to \mathbb{R}$. This equation is a special case of the general form in Equation 2, with $g : \mathbb{R}^d \to \mathbb{R}^d$ given by $g := Bf$. Methods for ensuring that this system of ODEs has local or global solutions are well-known [5], and in this paper we always assume that for each $x_0 \in X$ the equations have classical solutions on $\mathbb{R}^+$. In this equation, it is assumed that the matrices $A$ and $B$ are known, but the (nonlinear) function $f$ is unknown. The adaptive estimation problem considered in this paper uses the observations of the full state, $x(t)$ for all $t \ge 0$, to construct estimates $\hat{x}(t) \to x(t)$ and $\hat{f}(t) \to f$ as $t \to \infty$. While $f$ is unknown in our adaptive estimation problem, information about this function is reflected in the choice of a hypothesis space $H$ of functions to which $f$ belongs.

Perhaps the most familiar choice of hypothesis space $H$ is one that is finite dimensional, $H_n := \{ f = \sum_{i=1,\ldots,n} \alpha_i \phi_i \}$, with $\{\phi_i\}_{i=1}^n$ some fixed set of basis functions and $\phi_i : \mathbb{R}^d \to \mathbb{R}$ for $1 \le i \le n$. If we suppose for the moment that the unknown function $f = \sum_{i=1}^n \alpha_i^* \phi_i \in H_n$, then one canonical choice of an estimator is

\[
\begin{aligned}
\dot{\hat{x}}(t) &= A \hat{x}(t) + B \Phi^T(x(t)) \hat{\alpha}(t), \\
\dot{\hat{\alpha}}(t) &= \Gamma^{-1} \Phi(x(t)) B^T P \big( x(t) - \hat{x}(t) \big),
\end{aligned}
\]

with $\Phi := (\phi_1, \ldots, \phi_n)^T$ the vector of basis functions, $P \in \mathbb{R}^{d\times d}$ the symmetric positive definite solution of Lyapunov's equation $PA + A^T P = -Q$ for a user-designed symmetric positive definite matrix $Q \in \mathbb{R}^{d\times d}$, and $\Gamma \in \mathbb{R}^{n\times n}$ symmetric and positive definite. The sign of the learning law is the one consistent with the error equations below and with the RKH learning law in Equation 14.
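A small simulation may help fix ideas for the finite-dimensional estimator. The sketch below is an assumption-laden toy example rather than the authors' implementation: it takes $d = n = 1$, $A = -1$, $B = 1$, a single basis function $\phi(x) = \cos x$, $Q = 2$ (so that $P = 1$), $\Gamma = 1$, and forward Euler time stepping. The learning law is implemented with the sign for which $\dot{\tilde{\alpha}} = -\Gamma^{-1}\Phi B^T P \tilde{x}$, the form that makes $V = \tilde{x}^T P \tilde{x} + \tilde{\alpha}^T \Gamma \tilde{\alpha}$ nonincreasing.

```python
import math

def simulate(alpha_true=1.0, x0=0.0, xhat0=0.5, alphahat0=0.0,
             a=-1.0, b=1.0, q=2.0, gamma=1.0, dt=1e-3, t_final=40.0):
    """Forward-Euler simulation of a scalar plant and its adaptive estimator.

    Plant:      x'        = a x    + b alpha_true phi(x)
    Estimator:  xhat'     = a xhat + b phi(x) alphahat
    Learning:   alphahat' = (1/gamma) phi(x) b p (x - xhat)
    with phi(x) = cos(x) and p solving the scalar Lyapunov equation 2 p a = -q.
    """
    p = -q / (2.0 * a)
    x, xhat, alphahat = x0, xhat0, alphahat0
    for _ in range(int(round(t_final / dt))):
        phi = math.cos(x)                      # regressor along the true state
        err = x - xhat                         # state estimation error
        dx = a * x + b * alpha_true * phi
        dxhat = a * xhat + b * phi * alphahat
        dalphahat = (1.0 / gamma) * phi * b * p * err
        x += dt * dx
        xhat += dt * dxhat
        alphahat += dt * dalphahat
    return x, xhat, alphahat

x_end, xhat_end, alphahat_end = simulate()
print(abs(x_end - xhat_end), abs(alphahat_end - 1.0))
```

In this toy problem the state converges to the equilibrium $x = \cos x$, where the scalar regressor $\cos x(t)$ stays bounded away from zero, so both the state error and the parameter error decay. With several basis functions and a convergent trajectory, parameter convergence would generally fail in the classical setting.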
When the state error $\tilde{x}(t) = x(t) - \hat{x}(t)$ and the parameter error $\tilde{\alpha}(t) = \alpha^* - \hat{\alpha}(t)$ are defined, it can be shown directly that the errors satisfy the equations

\[
\begin{bmatrix} \dot{\tilde{x}}(t) \\ \dot{\tilde{\alpha}}(t) \end{bmatrix}
=
\begin{bmatrix} A & B\Phi^T(x(t)) \\ -\Gamma^{-1}\Phi(x(t)) B^T P & 0 \end{bmatrix}
\begin{bmatrix} \tilde{x}(t) \\ \tilde{\alpha}(t) \end{bmatrix}
+ E(t)
\]

for $E(t) := \{ B e(t), 0 \}^T$, with $e(t) = 0$ if $f \in H_n$. If it happens that $f \notin H_n$, then $e(t) := f(x(t)) - \sum_{i=1}^n \alpha_i^* \phi_i(x(t)) := f(x(t)) - f_n(x(t))$ with $f_n$ a suitable finite dimensional approximation of $f$. Precise conditions for the exponential stability of this system are a classical topic in adaptive estimation for uncertain ODEs. See [13,28] when $e(t) = 0$. When $e(t) \neq 0$, see [15,14,29] for related discussions of ultimate boundedness of errors.

In this paper, we are interested in a class of dynamical systems where the unknown function $f$ belongs to the RKH space $H$. The generic RKH space $H$ may be the full space $H_X$ or one of its closed subspaces $H_\Omega$ described in Section 3. The plant, estimator, and learning law for this case can be expressed as

\[
\begin{aligned}
\dot{x}(t) &= A x(t) + B E_{x(t)} f, && (12) \\
\dot{\hat{x}}(t) &= A \hat{x}(t) + B E_{x(t)} \hat{f}(t), && (13) \\
\dot{\hat{f}}(t) &= \Gamma^{-1} (B E_{x(t)})^* P \big( x(t) - \hat{x}(t) \big), && (14)
\end{aligned}
\]

where $x(t)$, $\hat{x}(t)$, $A$, $B$, and $P$ are defined as above. But the (nonlinear) functions $f$ and $\hat{f}(t)$ belong to the RKH space $H$, and $E_x : H \to \mathbb{R}$ is the evaluation functional that is defined as $E_x f = f(x)$ for all $x \in X$ and $f \in H$, so that $B E_x : H \to \mathbb{R}^d$. Furthermore, the term $\Gamma \in \mathcal{L}(H,H)$ in the above equation is a self-adjoint, positive definite linear operator. The error equation analogous to the classical case shown above is given by

\[
\begin{bmatrix} \dot{\tilde{x}}(t) \\ \dot{\tilde{f}}(t) \end{bmatrix}
=
\begin{bmatrix} A & B E_{x(t)} \\ -\Gamma^{-1} (B E_{x(t)})^* P & 0 \end{bmatrix}
\begin{bmatrix} \tilde{x}(t) \\ \tilde{f}(t) \end{bmatrix}
= \mathcal{A}(t) \begin{bmatrix} \tilde{x}(t) \\ \tilde{f}(t) \end{bmatrix}. \tag{15}
\]

Note that the above error equation evolves in $\mathbb{R}^d \times H$, as opposed to $\mathbb{R}^d \times \mathbb{R}^n$ in the classical adaptive estimator case. Some elementary conditions that guarantee the existence of solutions, as well as their continuous dependence on initial conditions, are given in [2,3]. In this paper we always assume that for each $x_0 \in X$ the equations admit a classical solution $t \mapsto (\tilde{x}(t), \tilde{f}(t)) \in \mathbb{R}^d \times H$ for $t \in \mathbb{R}^+$. The following theorem, which considerably simplifies the analysis in [2,3], shows that this is reasonable for many common choices of the RKH space $H$.

Theorem 5. Suppose that the RKH space $H$ is generated by a kernel $K_X : X \times X \to \mathbb{R}$ for which $\sup_{x\in X} K_X(x,x) \le \bar{K} < \infty$. Then for each $(\tilde{x}_0, \tilde{f}_0) \in \mathbb{R}^d \times H$ there is a unique solution of Equation 15 in $C^1([0,\infty), \mathbb{R}^d \times H)$.

PROOF. As discussed in Section 3, the hypotheses guarantee that the evaluation operator $E_x : H \to \mathbb{R}$ is linear and uniformly bounded in $x \in X$. It is immediate that $\lim_{t\to\infty} \frac{1}{t} \log^+ ( \|\mathcal{A}(t)\| ) = 0$ with $\log^+(\xi) := \max(0, \log(\xi))$. As discussed on page 211 of [30], the governing equations have a unique solution that is global in time.

The PE condition proves sufficient for convergence of the function estimates $\hat{f}(t) \to f$ generated by the RKH embedding method, much as in the conventional, finite dimensional case. The analysis of convergence of parameters (i.e., functions in our case) is notoriously long, so in this short paper we merely outline the proof in a special case. The full and lengthy details (for general $P$ and Hurwitz $A$) are given in [31].

We say that a family of functions $\mathcal{F}$ over a set $S$ is uniformly equi-continuous if for each $\epsilon > 0$ there is a $\delta_\epsilon > 0$ such that for all $f \in \mathcal{F}$ and $a, b \in S$, $|a - b| < \delta_\epsilon \Rightarrow \|f(a) - f(b)\| < \epsilon$.
Theorem 6. Suppose that $P = I$, $A$ is negative definite, $\mathcal{F} = \{ f(x(\cdot)) \mid f \in H, \|f\|_H = 1 \}$ is uniformly equi-continuous, and the trajectory $\Gamma^+(x_0)$ is persistently exciting in the sense of Definition 4. Then the solution $(\tilde{x}, \tilde{f})$ of Equation 15 satisfies $\lim_{t\to\infty} \tilde{x}(t) = 0$ and $\lim_{t\to\infty} \tilde{f}(t) = 0$.

PROOF. The proof that $\tilde{x}(t) \to 0$ follows along lines that are entirely analogous to the classical or finite dimensional case; see [3] for the details when the arguments are lifted to the infinite-dimensional state space $\mathbb{R}^d \times H$. The conclusion that $\tilde{f}(t) \to 0 \in H$ follows immediately from Theorem 3.4 of [32], provided that we can prove that there exist constants $T, \Delta, \delta, \gamma > 0$ such that for each $t \ge T$ and $f \in H$ with $\|f\|_H = 1$ there is an $s \in [t, t+\Delta]$ such that

\[
\|B\|_{\mathbb{R}^d} \left| \int_s^{s+\delta} E_{x(\tau)} f \, d\tau \right| = \left\| \int_s^{s+\delta} B E_{x(\tau)} f \, d\tau \right\|_{\mathbb{R}^d} > \gamma. \tag{16}
\]

The condition above can be shown to be equivalent to the PE Definition 4 provided that the integrand is smooth enough to eliminate the possibility of certain "rapid switching" behavior. The equivalence of conditions as in Equation 16 to those similar to Definition 4 in the classical, finite dimensional case has been studied in great detail. See [9] for a detailed discussion with excellent illustrative examples of pathological rapid switching in the finite dimensional case. In the case at hand, Equation 16 follows from the fact that $f(x(\cdot)) \in \mathcal{F}$, a family of uniformly equi-continuous functions. The lengthy details of the proof can be found in [31].

5 Semiflows and Persistence of Excitation (PE)

In this section we recall a few of the basic definitions of dynamical systems theory that will be essential to the analysis of this paper.
The aim is to be able to define persistence of excitation, not only for the model problem in Equations 2 or 11, but for more general evolutions on metric spaces. In particular we obtain a PE condition that can be applied to flows on Riemannian manifolds, which encompass a few of our examples. A continuous semiflow or semidynamical system on the complete metric space $(X, d_X)$ is defined in terms of a continuous semigroup $\{S(\tau)\}_{\tau \ge 0}$ on $X$. The manner in which systems of ODEs can generate such a semigroup, and thereby a semidynamical system, is well-studied [6,33]. The positive orbit $\Gamma^+(x_0)$ starting at $x_0$ is defined to be the set

\[
\Gamma^+(x_0) := \bigcup_{\tau \ge 0} S(\tau) x_0 \subseteq X.
\]

The positive limit set $\omega^+(x_0)$ associated with the initial condition $x_0$ is defined to be

\[
\omega^+(x_0) := \bigcap_{t \ge 0} \overline{\bigcup_{\tau \ge t} S(\tau) x_0},
\]

which is equivalently expressed as

\[
\omega^+(x_0) = \left\{ y \in X \;\middle|\; \exists\, t_k \to \infty \ \text{s.t.}\ \lim_{k\to\infty} S(t_k) x_0 = y \right\}.
\]

5.1 Persistence and Positive Limit Sets

The next few results illustrate simple and often intuitive relationships between persistently excited sets, positive orbits $\Gamma^+(x_0)$, and the positive limit set $\omega^+(x_0)$. We start with a simple result that illustrates an intuitive notion of what the new PE Definition 4 entails.

Theorem 7. Let $K : [0,\infty) \to \mathbb{R}$ be a monotone nonincreasing radial basis function and suppose the associated kernel $K_X(x,y) := K(d_X(x,y))$ induces the RKH space $H_X \hookrightarrow C(X)$. Fix $\epsilon > 0$ and define the measurable sets

\[
I_{t,\epsilon} = \left\{ \tau \in [t, t+\Delta] \;\middle|\; x(\tau) \in B_{\epsilon, x_\infty} \right\}
\]

for each $t \ge T_0$, with $B_{\epsilon,x_\infty}$ the open ball of radius $\epsilon$ centered at $x_\infty$. If the Lebesgue measure $\mu$ satisfies $\mu(I_{t,\epsilon}) \ge \gamma_\epsilon > 0$ for some constant $\gamma_\epsilon$ and all $t \ge T_0$, then the singleton indexing set $\Omega := \{x_\infty\}$ and the closed subspace $H_\Omega \subset H_X$ are persistently excited in the sense of Definition 4.
In particular, if $x(t) \to x_\infty$, the set $\Omega := \{x_\infty\}$ and the closed subspace $H_\Omega$ are persistently excited.

Before proving the above theorem, let us unpack its hypotheses to understand the relatively straightforward underlying idea. The set $I_{t,\epsilon}$ consists of the times in the interval $[t, t+\Delta]$ during which the trajectory $t \mapsto x(t)$ is within $\epsilon$ of the point $x_\infty$. The theorem says that if, in each interval $[t, t+\Delta]$, the trajectory spends at least an amount of time $\gamma_\epsilon$ in the ball of radius $\epsilon$ centered at $x_\infty$, then $\Omega = \{x_\infty\}$ and $H_\Omega$ are persistently excited.

PROOF. Without loss of generality, we assume that the kernel $K_X$ is normalized so that $K_X(x_\infty, x_\infty) = 1$. By definition $H_\Omega := \mathrm{span}\{K_{X,x_\infty}\}$ when $\Omega = \{x_\infty\}$, and for each $f \in H_\Omega$ with $f = \alpha K_{X,x_\infty}$ we have $\|f\|_{H_\Omega} = |\alpha| = |f(x_\infty)|$. Only the lower bound of the persistency definition is problematic, and we compute directly that

\[
\int_t^{t+\Delta} \left( E^*_{x(\tau)} E_{x(\tau)} f, f \right)_{H_X} d\tau
= \alpha^2 \int_t^{t+\Delta} K_X^2(x(\tau), x_\infty)\, d\tau
\ge \alpha^2 \left( \min_{0 \le \xi \le \epsilon} K^2(\xi) \right) \mu(I_{t,\epsilon})
\ge \gamma_\epsilon K_\epsilon \|f\|^2_{H_\Omega}
\]

with $K_\epsilon := \min_{0 \le \xi \le \epsilon} K^2(\xi)$.

Theorem 7 gives a direct interpretation of the persistency condition for a singleton $\Omega := \{x_\infty\}$ in terms of visitation to a neighborhood of $x_\infty$. It also suggests that there are many choices of kernels $K_X$ that induce PE spaces $H_\Omega$ for any convergent trajectory $x(t) \to x_\infty$. The monotonicity of the kernel required in the above theorem is satisfied for a host of common choices of RKH spaces; see Chapter 9 of [34] for the definition of completely monotone radial basis functions and kernels. This fact illustrates a significant difference with the conventional PE definition: there are many convergent trajectories that simply are not classically PE for a given set of regressors.
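The lower bound in the proof of Theorem 7 can be checked numerically on a toy trajectory. The sketch below is only an illustration under assumptions: the Gaussian radial profile $K(\xi) = e^{-\xi^2}$, the scalar trajectory $x(\tau) = x_\infty + e^{-\tau/2}\cos(3\tau)$, the window length $\Delta = 1$, and the left-endpoint Riemann sum are all hypothetical choices. It compares $\int_t^{t+\Delta} K^2(|x(\tau) - x_\infty|)\,d\tau$ with $K_\epsilon\,\mu(I_{t,\epsilon})$, where $K_\epsilon = K^2(\epsilon)$ by monotonicity.

```python
import math

def radial_profile(xi):
    """Gaussian radial profile: monotone nonincreasing on [0, inf), K(0) = 1."""
    return math.exp(-xi ** 2)

def window_check(trajectory, x_inf, t, delta, eps, dt=1e-3):
    """Left-endpoint Riemann sums over the window [t, t + delta].

    Returns (integral of K^2 along the trajectory,
             lower bound K_eps * mu(I_{t,eps}),
             approximate measure mu(I_{t,eps}) of the time spent in the ball).
    """
    n = int(round(delta / dt))
    integral = 0.0
    time_in_ball = 0.0
    for k in range(n):
        tau = t + k * dt
        dist = abs(trajectory(tau) - x_inf)
        integral += radial_profile(dist) ** 2 * dt
        if dist < eps:
            time_in_ball += dt
    k_eps = radial_profile(eps) ** 2  # min of K^2 over [0, eps] by monotonicity
    return integral, k_eps * time_in_ball, time_in_ball

x_inf = 1.0
trajectory = lambda tau: x_inf + math.exp(-0.5 * tau) * math.cos(3.0 * tau)

for t in (0.0, 5.0, 10.0):
    print(t, window_check(trajectory, x_inf, t, delta=1.0, eps=0.5))
```

Once the trajectory has entered the ball for good, the time in the ball equals the full window length, and the bound $K_\epsilon\,\mu(I_{t,\epsilon})$ stays fixed at $K_\epsilon \Delta$ for all later windows, which is the uniform lower bound the persistency definition asks for.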
We also note that if $X = \mathbb{R}^d$, there is a direct extension of this theorem to the finite set $\Omega := \{x_{\infty,1}, \ldots, x_{\infty,M}\}$; see Lemma 3.4 in [35].

We begin our study of the geometric nature of PE sets by noting that the forward orbit is always dense in PE sets.

Theorem 8. Let $H_X$ be an RKH space of functions over the domain $X$ and suppose that this RKH space includes a rich family of bump functions. If the PE condition in Definition 4 holds for a subset $\Omega \subseteq X$, then the forward orbit $\Gamma^+(x_0)$ is dense in $\Omega$, $\Omega \subseteq \overline{\Gamma^+(x_0)}$. That is, we have

\[
y \in \Omega \implies \exists \{t_k\}_{k\in\mathbb{N}} \ \text{with}\ \lim_{k\to\infty} S(t_k) x_0 = y.
\]

PROOF. Suppose to the contrary that there is a $y \in \Omega$ for which there is no such convergent sequence. This means that there is an open ball $B_{r,y}$ such that $\Gamma^+(x_0) \cap B_{r,y} = \emptyset$. But since we have assumed there is a rich collection of bump functions, there is a bump function $b_{r,y} \in H_X$ that satisfies $b_{r,y}(y) = 1$ and $b_{r,y}(x) = 0$ for all $x \notin C_{r,y}$, with $C_{r,y} \subset B_{r,y}$ a compact set. However, from Section 3, $H_X = H_\Omega \oplus V_\Omega$ with $V_\Omega = \{ f \in H_X \mid f(x) = 0, \ \forall x \in \Omega \}$. It follows that $b_{r,y} \in H_\Omega$. Since $\Gamma^+(x_0) \cap B_{r,y} = \emptyset$, the integral in Definition 4 is equal to zero,

\[
\int_t^{t+\Delta} \left( E^*_{x(\tau)} E_{x(\tau)} b_{r,y}, b_{r,y} \right)_{H_\Omega} d\tau = \int_t^{t+\Delta} |b_{r,y}(x(\tau))|^2 \, d\tau = 0
\]

for each $t \ge T$. Since $\|b_{r,y}\|_{H_\Omega} := \|P_\Omega b_{r,y}\|_{H_X} = \|b_{r,y}\|_{H_X} \gtrsim \|b_{r,y}\|_{C(X)} > 0$, this is a contradiction of the PE property in Definition 4 and the theorem is proven.

Note that Theorem 8 does not require that the sequence of times satisfies $t_k \to \infty$. Recall, on the other hand, that the positive limit set $\omega^+(x_0)$ is contained in the closure of all accumulation points of the orbit $\Gamma^+(x_0)$ for sequences of the form $\{S(\tau_k) x_0\}_{k\in\mathbb{N}}$ as $\tau_k \to \infty$. Next, we discuss a relationship between the positive limit set $\omega^+(x_0)$ and a PE space $H_\Omega$ over the indexing set $\Omega \subset X$ in Definition 4.
Theorem 9. Let $H_X$ be the RKH space of functions over $X$ and suppose that this RKH space includes a rich family of bump functions. If the PE condition in Definition 4 holds for $\Omega$, then $\Omega \subseteq \omega^+(x_0)$.

PROOF. The proof of this result is similar to the argument in Theorem 8, so we only outline it. For an arbitrary $y \in \Omega$ we build a sequence $\{x(t_k)\}_{k\in\mathbb{N}} := \{S(t_k) x_0\}_{k\in\mathbb{N}}$ such that $\lim_{k\to\infty} t_k = \infty$ and $\lim_{k\to\infty} S(t_k) x_0 = y$. Pick an arbitrary $y \in \Omega$ and fix $r_0 > 0$. Choose $t_0 \in (T, T+\Delta)$ such that $x(t_0) \in B_{r_0,y}$. Such a time must exist. If it did not, we could choose a bump function $b_{r_0,y}$ on $B_{r_0,y}$, as in the proof of Theorem 8, such that $\|b_{r_0,y}\|_{H_\Omega} > 0$, for which $\int_T^{T+\Delta} |b_{r_0,y}(x(\tau))|^2 \, d\tau = 0$ would follow from the condition that $B_{r_0,y} \cap \{ x(\tau) \mid \tau \in (T, T+\Delta) \} = \emptyset$. This is a contradiction of the PE condition. We can then set $r_1 = r_0/2$ and repeat this process, seeking a $t_1 \in (2T, 2T+\Delta)$ such that $x(t_1) \in B_{r_1,y}$, and so forth, to generate $\{t_k\}_{k\in\mathbb{N}_0}$ with $t_k \to \infty$ and $\{x(t_k)\}_{k\in\mathbb{N}_0}$ with $x(t_k) \to y$. These sequences satisfy the desired conditions above, and we must have $y \in \omega^+(x_0)$.

5.2 Persistence of Excitation for Semiflows on Manifolds

A careful reading of Definition 4 makes clear that it depends on the orbit $\Gamma^+(x_0)$ of a continuous semiflow $\{S(t)\}_{t\ge 0}$ on a complete metric space $(X, d_X)$, a subset $\Omega \subseteq X$, and an admissible kernel $K_X$ that defines the RKH space $H_X$ (and therefore also the closed subspace $H_\Omega$). Since it applies to subsets of complete metric spaces, it makes sense to consider much more general systems than the ODEs in the model Equations 2 or 11. For instance, we have the following result for semiflows on manifolds, the case when the state space $X = \Omega = M$ in the PE Definition 4.
Note that below the semigroup $S(t)$ that defines the positive orbit $\Gamma^+(x_0)$ is defined on all of $M = X$.

Corollary 10. Suppose that $M$ is a compact, connected, $d$-dimensional Riemannian manifold, and $K_M$ is a kernel that induces a native space $H_M$ whose norm $\|\cdot\|_{H_M}$ is equivalent to that of the Sobolev space $W^{r,2}(M)$. If the orbit $\Gamma^+(x_0)$ persistently excites $H_M$, then $\omega^+(x_0) = M$.

PROOF. We first show that there are indeed such kernels $K_M$ that induce a native space $H_M \approx W^{r,2}(M)$. The Sobolev embedding theorem on $\mathbb{R}^d$ states that $W^{r,2}(\mathbb{R}^d)$ is continuously embedded in $C(\mathbb{R}^d)$, $W^{r,2}(\mathbb{R}^d) \hookrightarrow C(\mathbb{R}^d)$, when $r > d/2$. As noted on page 1748 of [17], this fact can be used to conclude that $W^{r,2}(M)$ is continuously embedded in $C(M)$, $W^{r,2}(M) \hookrightarrow C(M)$. This means that we have $|E_x f| = |f(x)| \le \|f\|_{C(M)} \lesssim \|f\|_{W^{r,2}(M)}$ for each $x \in M$ and $f \in W^{r,2}(M)$. In other words, each evaluation functional $E_x : W^{r,2}(M) \to \mathbb{R}$ on $M$ is bounded. But $W^{r,2}(M)$ is a Hilbert space; boundedness of all its evaluation functionals implies that $W^{r,2}(M)$ is an RKH space. We define the Sobolev-Matern kernel $K^r_M$ of smoothness $r > 0$ to be the unique fundamental solution of the elliptic differential operator equation

\[
\sum_{1 \le \ell \le r} (\nabla^\ell)^* \nabla^\ell K^r_M = \delta,
\]

where $\nabla$ is the covariant derivative operator over the manifold $M$ and $\delta$ denotes the Dirac distribution. When we define the native space $H_M$ in terms of the Sobolev-Matern kernel $K^r_M$, we have $H_M \approx W^{r,2}(M)$ for the chosen range $r > d/2$. The details of this analysis are given in [17] for the case when $M$ is a smooth Riemannian manifold that satisfies our standing assumptions on $M$, or see reference [36] for the special case $M := \mathbb{R}^d$.

We next show that the RKH space $H_M$ defined in this way contains a rich family of (smooth) cutoff or bump functions.
This result is not surprising given what we know about Sobolev spaces on subsets of $\mathbb{R}^d$. One way to define $W^{r,2}(\Omega)$ is as the completion of $C^\infty(\Omega)$ in the Sobolev norm, so the space $C^\infty(\Omega)$ is dense in $W^{r,2}(\Omega)$. It is well-known that for any open ball contained in $\mathbb{R}^d$ there is a smooth cutoff function with compact support contained in that ball. This is a standard result in the study of manifolds and the construction of partitions of unity [27]. It follows that the Sobolev space $W^{r,2}(\Omega)$ contains a rich family of bump functions. The result extends more generally to Sobolev spaces $W^{r,2}(M)$ using the exponential map. The details of the proof are rather long, so we simply outline them below. (Particular examples of such a construction can be found in [17] on page 1749 and again on page 1751.) If $\hat{f}$ is a cutoff function on a ball $B_{x,r} \subset \mathbb{R}^d$, it is possible to construct an associated cutoff function on the image $\mathrm{Exp}_q(B_{x,r}) \subset M$ under the exponential map $\mathrm{Exp}_q : T_q M \to M$ from $f := \hat{f} \circ \mathrm{Exp}_q^{-1}$. Such an $f$ is always an element of $C^\infty(M)$, since $\hat{f}$ is just a smooth representation of $f$ with respect to a compatible $C^\infty$ chart. The only technical difficulty is showing that $f \in W^{r,2}(M)$. But this follows from Lemma 3.2 of [17], which states that the exponential map $\mathrm{Exp}_q$ induces a map $g \mapsto g \circ \mathrm{Exp}_q$ that is boundedly invertible from $W^{r,p}(\mathrm{Exp}_q(\Omega))$ to $W^{r,p}(\Omega)$ for any measurable set $\Omega \subseteq \mathbb{R}^d$. Alternatively, we can argue that $W^{r,2}(M)$ is the completion of $C^\infty(M)$ ([20], Section 7.4.5) with respect to the norm in Equation 10. We conclude that $W^{r,2}(M)$ contains a rich family of (smooth) cutoff functions. If the motion over the manifold $M$ satisfies the persistency condition in Definition 4, then Theorem 9 with $\Omega = M$ gives $M \subseteq \omega^+(x_0) \subseteq M$, that is, $M = \omega^+(x_0)$.
This example illustrates that the newly introduced persistency condition can be applicable, in principle, to the study of certain evolutions over smooth Riemannian manifolds. Still, the analysis in the example above is fairly abstract. Perhaps more importantly, it is not a simple task to come up with a closed form expression for the Sobolev-Matern kernel. Of course this can be done for some standard manifolds like $\mathbb{R}^d$, the circle, or a torus, since the Sobolev-Matern kernels can be written down for these cases. But it is not readily accomplished for an arbitrary manifold $M$. The definition of the space $H_M \approx W^{r,2}(M)$ is intrinsic here: it depends on the (usually unknown) domain of the manifold $M$, the atlas of charts used to define the manifold, and the covariant derivative operator intrinsic to the manifold.

We next discuss how it is possible to construct a kernel for $M$ that is extrinsic in the sense that it is defined by the restriction of some known kernel on a larger domain that contains $M$. This terminology is used in [22], which studies the approximation properties of spaces constructed in such a fashion. This line of attack is particularly useful for the study of unknown or uncertain dynamical systems via the RKH embedding method. The persistency of excitation condition is cast in terms of the kernel on the larger space in this case, which is assumed to have a known closed form expression. Carefully note that the forward orbit $\Gamma^+(x_0)$ in the following corollary is defined in terms of a semigroup $S(t) : M \to M$, but $M$ is a proper subset of $X$.

Corollary 11. Let $M$ be an $m$-dimensional, smooth, compact, (regularly) embedded submanifold of $X := \mathbb{R}^d$, and suppose that $\{S(t)\}_{t\in\mathbb{R}^+}$ defines a $d_M$-continuous semiflow on $M$ with $d_M$ the metric on $M$.
Denote by $K^r_X$ the Sobolev-Matern kernel on $X = \mathbb{R}^d$ for some $r > d/2$, define the kernel $K_M(\cdot,\cdot) = K^r_X|_M(\cdot,\cdot)$, and denote by $R_M(H_X)$ the RKH space generated by $K_M$. If the orbit $\Gamma^+(x_0)$ of the semiflow on $M$ persistently excites $R_M(H_X)$, then $M = \omega^+(x_0)$.

PROOF. The Matern-Sobolev kernels over $\mathbb{R}^d$ are given for $r > d/2$ by $K_X(x,y) = K(\|x-y\|_{\mathbb{R}^d})$ with

\[
K(\xi) = \frac{2^{1-(r-d/2)}}{\Gamma(r-d/2)} \, \xi^{\,r-d/2} \, B_{r-d/2}(\xi)
\]

for all $x, y \in \mathbb{R}^d$ and $\xi := \|x-y\|_{\mathbb{R}^d}$, with $B_{r-d/2}$ the modified Bessel function of the second kind of order $r - d/2$ ([22], page 1771, or [21], page 1957). As in the last example, we have $H_X(\mathbb{R}^d) \approx W^{r,2}(\mathbb{R}^d)$ under the condition that $r > d/2$. That the candidate kernel $K_M$ defined by the restriction $K_M(x,y) := K_X|_{M\times M}(x,y)$ for all $x, y \in M$ is in fact an admissible kernel for an RKH space follows from standard results on RKH spaces; see [23], Section 4.2, and [24], Sections 2.2.1-2.2.2. At this point we do not yet have a rigorous notion of exactly how smooth the restricted functions in $R_M(H_X)$ are, nor do we know whether the spaces $R_M(H_X)$ contain a rich set of cutoff functions. But from Lemma 4 of [22], we know that $R_M(H_X) = T(H_X) = T(W^{r,2}(\mathbb{R}^d))$, where $T$ is the trace operator $T : f \mapsto f|_M$. From Proposition 2 of [22], under the standing assumptions on $M$, the trace operator $T : f \mapsto f|_M$ is a continuous operator from $W^{r,2}(\mathbb{R}^d)$ onto $W^{r-(d-m)/2,2}(M)$ for $r > (d-m)/2$ and $1 \le m \le d$. In summary then, if we choose the kernel $K_X$ on $\mathbb{R}^d$ with a sufficiently large smoothness index $r$, we have

\[
R_M(H_X) = T(W^{r,2}(\mathbb{R}^d)) \approx W^{r-(d-m)/2,2}(M).
\]

This set of equivalences gives a precise notion of the smoothness or regularity of the restricted functions in the RKH space $R_M(H_X)$: the RKH space over $M$ is equivalent to the Sobolev space having smoothness $r - (d-m)/2$.
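The extrinsic construction is easy to try numerically because the kernel on $\mathbb{R}^d$ has a closed form. The sketch below is a hedged illustration, not the authors' code. It assumes $M$ is the unit circle in $\mathbb{R}^2$ (so $d = 2$, $m = 1$) and takes $r - d/2 = 3/2$, for which the formula above reduces to the standard closed form $K(\xi) = (1 + \xi)e^{-\xi}$. Here $r = 5/2$ and the restricted space has Sobolev smoothness $r - (d-m)/2 = 2$. The function names and the pure-Python Cholesky factorization are illustrative choices.

```python
import math

def matern_32(xi):
    """Matern profile for order r - d/2 = 3/2: K(xi) = (1 + xi) exp(-xi)."""
    return (1.0 + xi) * math.exp(-xi)

def restricted_kernel(theta1, theta2):
    """K_M(x, y) = K(||x - y||) for points x, y on the unit circle in R^2."""
    x = (math.cos(theta1), math.sin(theta1))
    y = (math.cos(theta2), math.sin(theta2))
    return matern_32(math.dist(x, y))

def gram_matrix(thetas):
    """Gram matrix of the restricted kernel at the given angles."""
    return [[restricted_kernel(s, t) for t in thetas] for s in thetas]

def cholesky(a):
    """Lower-triangular Cholesky factor; raises if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

thetas = [2.0 * math.pi * k / 8 for k in range(8)]  # 8 distinct points on M
gram = gram_matrix(thetas)
factor = cholesky(gram)  # succeeds when the Gram matrix is positive definite
```

The successful factorization is consistent with the restricted kernel being an admissible (positive definite) kernel on $M$. Only Euclidean distances in the ambient space are needed, with no charts or covariant derivatives.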
The remainder of the proof is now the same as in Corollary 10.

Note that the statement of persistence in Definition 4 is expressed in terms of the kernel $K_M := K_X|_{M\times M}$, which can be used for computations since a closed form for $K_X$ is known.

6 Conclusions

This paper derives sufficient conditions for the convergence of function estimates in the RKH embedding method that are based on the recently introduced notion of persistently excited indexing sets $\Omega$ and subspaces $H_\Omega$ of an RKH space $H_X$. The paper establishes that persistently excited sets are contained in the positive limit set if the RKH space has a rich collection of bump functions. We have also introduced both intrinsic and extrinsic methods for defining an appropriate RKH space in the event that the positive limit set is a certain type of smooth manifold. The extrinsic method seems particularly well-suited for the estimation of uncertain nonlinear systems since the form of the positive limit set is unknown.

The theoretical results of this paper establish that a reasonable choice of basis functions for practical finite dimensional approximations includes radial basis functions (defined in terms of the kernel of the RKH space) that are centered on or near the positive limit set. It remains an open question how to devise versions of the RKH embedding strategy that adaptively select the basis as estimation is carried out.

Acknowledgements

Andrew J. Kurdila would like to acknowledge the support of the Army Research Office under the award Distributed Consensus Learning for Geometric and Abstract Surfaces, ARO Grant W911NF-13-1-0407.

References

[1] Andrew Kurdila and Yu Lei. Adaptive control via embedding in reproducing kernel Hilbert spaces. In 2013 American Control Conference, pages 3384–3389. IEEE, 2013.
[2] Parag Bobade, Suprotim Majumdar, Savio Pereira, Andrew J Kurdila, and John B Ferris. Adaptive estimation for nonlinear systems using reproducing kernel Hilbert spaces. Advances in Computational Mathematics, 45(2):869–896, 2019.
[3] Parag Bobade, Suprotim Majumdar, Savio Pereira, Andrew J Kurdila, and John B Ferris. Adaptive estimation in reproducing kernel Hilbert spaces. In 2017 American Control Conference (ACC), pages 5678–5683. IEEE, 2017.
[4] Jack K Hale and Hüseyin Koçak. Dynamics and bifurcations, volume 3. Springer Science & Business Media, 2012.
[5] Hassan K Khalil. Nonlinear systems. Upper Saddle River, 2002.
[6] JA Walker. Abstract dynamical systems and evolution equations. In Dynamical Systems and Evolution Equations, pages 85–136. Springer, 1980.
[7] KS Narendra and P Kudva. Stable adaptive schemes for identification and control. IEEE Trans. System. Man Cybernet, SMC-4, 1974.
[8] Nahum Shimkin and Arie Feuer. Persistency of excitation in continuous-time systems. Systems & Control Letters, 9(3):225–233, 1987.
[9] K.S. Narendra and A.M. Annaswamy. Persistent excitation in adaptive systems. International Journal of Control, 45(1):127–160, 1987.
[10] JB Moore, R Horowitz, and W Messner. Functional persistence of excitation and observability for learning control systems. Journal of Dynamic Systems, Measurement, and Control, 114(3):500–507, 1992.
[11] Stephen Boyd and Shankar Sastry. On parameter convergence in adaptive control. Systems & Control Letters, 3(6):311–319, 1983.
[12] Shankar Sastry and Marc Bodson. Adaptive control: stability, convergence and robustness. Courier Corporation, 2011.
[13] Kumpati S Narendra and Anuradha M Annaswamy. Stable adaptive systems. Courier Corporation, 2012.
[14] Petros A Ioannou and Jing Sun. Robust adaptive control. Courier Corporation, 2012.
[15] Jay A Farrell and Marios M Polycarpou. Adaptive approximation based control: unifying neural, fuzzy and traditional adaptive approximation approaches, volume 48. John Wiley & Sons, 2006.
[16] Parag Bobade, Dimitra Panagou, and Andrew J Kurdila. Multi-agent adaptive estimation with consensus in reproducing kernel Hilbert spaces. In 2019 18th European Control Conference (ECC), pages 572–577. IEEE, 2019.
[17] Thomas Hangelbroek, Francis J Narcowich, and Joseph D Ward. Kernel approximation on manifolds I: bounding the Lebesgue constant. SIAM Journal on Mathematical Analysis, 42(4):1732–1760, 2010.
[18] R. A. Adams and John Fournier. Sobolev spaces, volume 140. Elsevier, 2003.
[19] Víctor Jiménez López, Gabriel Soler López, et al. Transitive flows on manifolds. Revista Matemática Iberoamericana, 20(1):107–130, 2004.
[20] Hans Triebel. Theory of Function Spaces, Volume 2. Birkhauser, 1992.
[21] Thomas Hangelbroek, F Narcowich, Christian Rieger, and J Ward. An inverse theorem for compact Lipschitz regions in R^d using localized kernel bases. Mathematics of Computation, 87(312):1949–1989, 2018.
[22] Edward Fuselier and Grady B Wright. Scattered data interpolation on embedded submanifolds with restricted positive definite kernels: Sobolev error estimates. SIAM Journal on Numerical Analysis, 50(3):1753–1776, 2012.
[23] Alain Berlinet and Christine Thomas-Agnan. Reproducing kernel Hilbert spaces in probability and statistics. Springer Science & Business Media, 2011.
[24] Saburou Saitoh and Yoshihiro Sawano. Theory of reproducing kernels and applications. Springer, 2016.
[25] Vern I Paulsen and Mrinal Raghupathi. An introduction to the theory of reproducing kernel Hilbert spaces, volume 152. Cambridge University Press, 2016.
[26] Ernesto De Vito, Lorenzo Rosasco, and Alessandro Toigo. Learning sets with separating kernels. Applied and Computational Harmonic Analysis, 37(2):185–217, 2014.
[27] John M Lee. Introduction to smooth manifolds. Springer, 2001.
[28] AP Morgan and KS Narendra. On the stability of nonautonomous differential equations x=a+b(t)x, with skew symmetric matrix b(t). SIAM Journal on Control and Optimization, 15(1):163–176, 1977.
[29] Naira Hovakimyan and Chengyu Cao. L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. SIAM, 2010.
[30] Luis Barreira and Claudia Valls. Stability of nonautonomous differential equations in Hilbert spaces. Journal of Differential Equations, 217(1):204–248, 2005.
[31] Jia Guo, Sai Tej Paruchuri, and Andrew J Kurdila. Persistence of excitation in continuously embedded reproducing kernel Hilbert space. In (submitted to) 2020 American Control Conference (ACC). IEEE, 2020.
[32] J. Baumeister, W. Scondo, M.A. Demetriou, and I. G. Rosen. On-line parameter estimation for infinite dimensional dynamical systems. SIAM Journal of Control and Optimisation, 35(2):678–713, 1997.
[33] Stephen H Saperstone. Semidynamical systems in infinite dimensional spaces, volume 37. Springer Science & Business Media, 2012.
[34] Holger Wendland. Scattered data approximation, volume 17. Cambridge University Press, 2004.
[35] AJ Kurdila, Francis J Narcowich, and Joseph D Ward. Persistency of excitation in identification using radial basis function approximants. SIAM Journal on Control and Optimization, 33(2):625–642, 1995.
[36] Roland Opfer. Multiscale kernels. Advances in Computational Mathematics, 25(4):357–380, 2006.
