Optimal stopping for the predictive maintenance of a structure subject to corrosion

Benoîte de Saporta 1,2, François Dufour 1, Huilong Zhang 1, and Charles Elegbede 3

1 Université de Bordeaux, IMB, CNRS UMR 5251, INRIA Bordeaux Sud Ouest team CQFD
2 Université de Bordeaux, GREThA, CNRS UMR 5113
3 Astrium

March 13, 2021 DRAFT

Abstract

We present a numerical method to compute the optimal maintenance time for a complex dynamic system, applied to an example of maintenance of a metallic structure subject to corrosion. An arbitrarily early intervention may be uselessly costly, but a late one may lead to a partial or complete failure of the system, which has to be avoided. One must therefore find a balance between these two overly simple maintenance policies. To achieve this aim, we model the system by a stochastic hybrid process. The maintenance problem thus corresponds to an optimal stopping problem. We propose a numerical method to solve the optimal stopping problem and optimize the maintenance time for this kind of process.

Index Terms

Dynamic reliability, predictive maintenance, piecewise-deterministic Markov processes, optimal stopping times, optimization of maintenance.

I. INTRODUCTION

A complex system is inherently sensitive to failures of its components. We must therefore determine maintenance policies in order to maintain an acceptable operating condition. The optimization of maintenance is a very important problem in the analysis of complex systems. It determines when maintenance tasks should be performed on the system. These intervention dates should be chosen to optimize a cost function, that is to say, to maximize a performance function or, equivalently, to minimize a loss function. Moreover, this optimization must take into account the random nature of failures and the random evolution and dynamics of the system.
The theoretical study of maintenance optimization is also a crucial step in optimizing the design of the system and in studying its service life before the first maintenance.

We consider here an example of maintenance related to an aluminum structure subject to corrosion. This example was provided by Astrium. It concerns a small structure within a strategic ballistic missile. The missile is stored successively in a workshop, in a nuclear submarine missile launcher in operation, or in the submarine in dry-dock. These various environments are more or less corrosive, and the structure is inspected with a given periodicity. It is designed to withstand potentially long storage durations. The safety requirement is very strong. The mechanical stress exerted on the structure depends in part on its thickness. A loss of thickness causes overstress and therefore increases the risk of rupture. It is thus crucial to control the evolution of the thickness of the structure over time, and to intervene before failure.

The only maintenance operation we consider here is the complete replacement of the structure. We do not allow partial repairs. Mathematically, this preventive maintenance problem corresponds to a stochastic optimal stopping problem, as explained for example in the book of Aven and Jensen [1]. It is a difficult problem because, on the one hand, the structure spends random times in each environment and, on the other hand, the corrosiveness of each environment is also assumed to be random within a given range. In addition, we search for an optimal maintenance date adapted to the particular history of each structure, and not an average one. We also want to be able to update the predicted maintenance date given the past history of the corrosion process.

To solve this maintenance problem, we propose to model the system by a piecewise-deterministic Markov process (PDMP).
PDMPs are a class of stochastic hybrid processes introduced by Davis [3] in the 1980s. These processes have two components: a Euclidean component that represents the physical system (e.g. temperature, pressure, thickness loss) and a discrete component that describes its regime of operation and/or its environment. Starting from a state $x$ and mode $m$ at the initial time, the process follows a deterministic trajectory given by the laws of physics until a jump time that can be either random (e.g. it corresponds to a component failure or a change of environment) or deterministic (when a physical quantity reaches a certain threshold, for example when the pressure reaches a critical value that triggers a valve). The process then restarts from a new state and a new mode of operation, and so on. This defines a Markov process. Such processes can naturally take into account the dynamic and uncertain aspects of the evolution of the system. A subclass of these processes was introduced by Devooght [5] for an application in the nuclear field. The general model was introduced in dynamic reliability by Dutuit and Dufour [6].

The theoretical problem of optimal stopping for PDMPs is well understood, see e.g. Gugerli [7]. However, there are surprisingly few works in the literature presenting practical algorithms to compute the optimal cost and optimal stopping time. To the best of our knowledge, only Costa and Davis [2] have presented an algorithm for calculating these quantities for PDMPs. Yet, as illustrated above, it is crucial to have an efficient numerical tool to compute the optimal maintenance time in practical cases. The purpose of this paper is to adapt the general algorithm recently proposed by the authors in [4] to this special case of maintenance and to show its practical power.
More precisely, we present a method to compute the optimal cost as well as a quasi-optimal stopping rule, that is, the date when the maintenance should be performed. As a byproduct of our procedure, we also obtain the distribution of the optimal maintenance dates and can compute dates such that the probability of performing a maintenance before this date is below a prescribed threshold.

The remainder of this paper is organized as follows. In section II, we present in more detail the example of corrosion of the metallic structure that we are interested in, as well as the framework of PDMPs. In section III, we briefly recall the formulation of the optimal stopping problem for PDMPs and its theoretical solution. In section IV, we detail the four main steps of the algorithm. In section V, we present the numerical results obtained on the example of corrosion. Finally, in section VI, we present a conclusion and perspectives.

II. MODELING

Throughout this paper, our approach will be illustrated on an example of maintenance of a metallic structure subject to corrosion. This example was proposed by Astrium. As explained in the introduction, it is a small homogeneous aluminum structure within a strategic ballistic missile. The missile is stored for potentially long times in more or less corrosive environments. The mechanical stress exerted on the structure depends in part on its thickness. A loss of thickness causes overstress and therefore increases the risk of rupture. It is thus crucial to control the evolution of the thickness of the structure over time, and to intervene before failure.

Let us describe more precisely the usage profile of the missile. It is stored successively in three different environments: the workshop, the submarine in operation and the submarine in dry-dock. This is because the structure must be equipped and used in a given order. Then it goes back to the workshop, and so on.
The missile stays in each environment for a random duration with exponential distribution, whose parameter depends on the environment. At the beginning of its service time, the structure is treated against corrosion. The period of effectiveness of this protection is also random, with a Weibull distribution. The thickness loss only begins when this initial protection is gone. The degradation law for the thickness loss then depends on the environment through two parameters: a deterministic transition period and a random corrosion rate uniformly distributed within a given range. Typically, the workshop and dry-dock are the most corrosive environments. The randomness of the corrosion rate accounts for small variations and uncertainties in the corrosiveness of each environment.

We model this degradation process by a 3-dimensional PDMP $(X_t)$ with 3 modes corresponding to the three different environments. Before giving the detailed parameters of this process, we briefly present general PDMPs.

A. Definition of piecewise-deterministic Markov processes

Piecewise-deterministic Markov processes (PDMPs) are a general class of hybrid processes. Let $M$ be the finite set of the possible modes of the system. In our example, the modes correspond to the various environments. For each mode $m$ in $M$, let $E_m$ be an open subset of $\mathbb{R}^d$. A PDMP is defined from three local characteristics $(\Phi, \lambda, Q)$ where

• the flow $\Phi : M \times \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$ is continuous and, for all $s, t \ge 0$, one has $\Phi(\cdot, \cdot, t + s) = \Phi(\Phi(\cdot, \cdot, s), t)$. It describes the deterministic trajectory of the process between jumps. For all $(m, x)$ in $M \times E_m$, we set
$$t^*(m, x) = \inf\{t > 0 : \Phi(m, x, t) \in \partial E_m\},$$
the time to reach the boundary of the domain starting from $x$ in mode $m$.

• the jump intensity $\lambda$ characterizes the frequency of jumps. For all $(m, x)$ in $M \times E_m$ and $t \le t^*(m, x)$, we set
$$\Lambda(m, x, t) = \int_0^t \lambda(\Phi(m, x, s))\, ds.$$
• the Markov kernel $Q$ represents the transition measure of the process and selects the new location after each jump.

The trajectory $X_t = (m_t, x_t)$ of the process can then be defined iteratively. We start with an initial point $X_0 = (k_0, y_0)$ with $k_0 \in M$ and $y_0 \in E_{k_0}$. The first jump time $T_1$ is determined by
$$\mathbb{P}_{(k_0, y_0)}(T_1 > t) = \begin{cases} e^{-\Lambda(k_0, y_0, t)} & \text{if } t < t^*(k_0, y_0), \\ 0 & \text{if } t \ge t^*(k_0, y_0). \end{cases}$$
On the interval $[0, T_1)$, the process follows the deterministic trajectory $m_t = k_0$ and $x_t = \Phi(k_0, y_0, t)$. At the random time $T_1$, a jump occurs. Note that a jump can be either a discontinuity in the Euclidean variable $x_t$ or a change of mode. The process restarts at a new mode and/or position $X_{T_1} = (k_1, y_1)$, according to the distribution $Q_{k_0}(\Phi(k_0, y_0, T_1), \cdot)$. We then select in a similar way an inter-jump time $T_2 - T_1$, and on the interval $[T_1, T_2)$ the process follows the path $m_t = k_1$ and $x_t = \Phi(k_1, y_1, t - T_1)$. A PDMP is thereby constructed iteratively, see Figure 1 for an illustration.

Fig. 1. An example of a path for a PDMP until the second jump. The first jump is random. The second jump is deterministic because the process has reached the boundary of the domain.

Let $Z_0 = X_0$ and, for $n \ge 1$, let $Z_n = X_{T_n}$ be the location and mode of the process after each jump. Let $S_0 = 0$, $S_1 = T_1$ and, for $n \ge 2$, let $S_n = T_n - T_{n-1}$ be the inter-jump times between two consecutive jumps. Then $(Z_n, S_n)$ is a Markov chain, which is the only source of randomness of the PDMP and contains all the information on its random part. Indeed, if one knows the jump times and the positions after each jump, one can reconstruct the deterministic part of the trajectory between jumps. This very important property of PDMPs is the basis of our numerical procedure.

B. Example of corrosion of metallic structure

We can now return to our example of corrosion of a structure and give the characteristics of the PDMP modeling the thickness loss. The finite set of modes is $M = \{1, 2, 3\}$, where mode 1 corresponds to the workshop environment, mode 2 to the submarine in operation and mode 3 to the dry-dock. Although the thickness loss is a one-dimensional process, one needs a three-dimensional PDMP to model its evolution, because it must also take into account all the sources of randomness, that is, the duration of the initial protection and the corrosion rate in each environment.

The corrosion process $(X_t)$ is defined by
$$X_t = (m_t, d_t, \gamma_t, \rho_t) \in \{1, 2, 3\} \times \mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+,$$
where $m_t$ is the environment at time $t$, $d_t$ is the thickness loss at time $t$, $\gamma_t$ is the remainder of the initial protection at time $t$ and $\rho_t$ is the corrosion rate of the current environment at time $t$.

Originally, at time 0, one has $X_0 = (1, 0, \gamma_0, \rho_0)$, which means that the missile is in the workshop and the structure has not started corroding yet. The original protection $\gamma_0$ is drawn according to a Weibull distribution function
$$F(t) = 1 - \exp\left(-\left(\frac{t}{\beta}\right)^{\alpha}\right)$$
with $\alpha = 2.5$ and $\beta = 11800$ hours. The corrosion rate in the workshop is drawn according to a uniform distribution on $[10^{-6}, 10^{-5}]$ mm/hour. The time $T_1$ spent in the workshop is drawn according to an exponential distribution with parameter $\lambda_1$, corresponding to a mean duration of 17520 hours. At time $t$ between time 0 and time $T_1$, the remainder of the protection is simply $\gamma_t = \max\{0, \gamma_0 - t\}$, $\rho_t$ is constant and equal to $\rho_0$, and the thickness loss $d_t$ is given by
$$d_t = \begin{cases} 0 & \text{if } t \le \gamma_0, \\ \rho_0 \left( t - (\gamma_0 + \eta_1) + \eta_1 \exp\left(-\dfrac{t - \gamma_0}{\eta_1}\right) \right) & \text{if } t > \gamma_0, \end{cases} \tag{1}$$
where $\eta_1 = 30000$ hours.

At time $T_1$, a jump occurs, which means there is a change of environment and a new corrosion rate is drawn for the new environment.
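A short worked check may help in reading equation (1) and the role of the transition period $\eta_1$. It uses only the symbols already defined; the numerical value takes the upper end of the workshop range, $\rho_0 = 10^{-5}$ mm/hour, purely as an illustration.

```latex
\begin{align*}
d_{\gamma_0} &= \rho_0\bigl(\gamma_0 - (\gamma_0 + \eta_1) + \eta_1 e^{0}\bigr) = 0
  && \text{(no jump in the loss when the protection ends)},\\
\frac{\mathrm{d}}{\mathrm{d}t}\, d_t &= \rho_0\bigl(1 - e^{-(t-\gamma_0)/\eta_1}\bigr)
  && \text{(the slope grows from } 0 \text{ towards } \rho_0 \text{)},\\
d_t &\approx \rho_0\,(t - \gamma_0 - \eta_1)
  && \text{when } t - \gamma_0 \gg \eta_1,\\
d_{\gamma_0 + \eta_1} &= \frac{\rho_0\,\eta_1}{e}
  = \frac{10^{-5} \times 30000}{e} \approx 0.11 \text{ mm}.
\end{align*}
```

In other words, $\eta_1$ sets the time scale over which corrosion ramps up to its nominal rate $\rho_0$ once the protection is gone.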
The other two components of the process $(X_t)$, namely the remainder of the protection $\gamma_t$ and the thickness loss $d_t$, naturally evolve continuously. Therefore, one has $m_{T_1} = 2$, and $\gamma_{T_1} = 0$ if $\gamma_0 < T_1$, $\gamma_{T_1} = \gamma_0 - T_1$ otherwise; that is to say that once the initial protection is gone, it no longer has any effect. The new rate $\rho_{T_1}$ is drawn according to a uniform distribution on $[10^{-7}, 10^{-6}]$ mm/hour. The process continues to evolve in the same way until the next change of environment occurring at time $T_2$. Between $T_1$ and $T_2$, just replace $\rho_0$ by $\rho_{T_1}$, $\gamma_0$ by $\gamma_{T_1}$, $\eta_1$ by $\eta_2 = 200000$ hours and $t$ by $t - T_1$ in equation (1).

The process visits the 3 environments successively, always in the same order 1, 2 and 3, and then returns to environment 1. The time spent in environment $i$ is a random variable exponentially distributed with parameter $\lambda_i$, corresponding to mean durations of 17520 hours in environment 1, 131400 hours in environment 2 and 8760 hours in environment 3. The thickness loss evolves continuously according to equation (1) with suitably changed parameters. The period of transition in mode $i$ is $\eta_i$, with $\eta_1 = 30000$ hours, $\eta_2 = 200000$ hours and $\eta_3 = 40000$ hours. The corrosion rate $\rho_i$, expressed in mm per hour, is drawn at each change of environment. In environments 1 and 3, it follows a uniform distribution on $[10^{-6}, 10^{-5}]$ and in environment 2, it follows a uniform distribution on $[10^{-7}, 10^{-6}]$. Figure 2 shows examples of simulated trajectories of the thickness loss. The slope changes correspond to changes of environment. The observed dispersion is characteristic of the random nature of the phenomenon. Note that the various physical parameters were given by Astrium and will not be discussed here.

Fig. 2. Examples of trajectories of thickness loss over time: (a) one trajectory, (b) 100 trajectories.
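The model described above can be turned into a small simulation of the state of the process at each change of environment. The following is a minimal sketch under stated assumptions, not the authors' implementation. It reads the values 17520, 131400 and 8760 as mean sojourn times in hours, and it assumes that the loss accrued in each environment (equation (1) with the substitutions described in the text) is added to the loss already accumulated, so that the thickness loss stays continuous. The names `loss_increment` and `simulate_jumps` are hypothetical.

```python
import math
import random

# Values quoted in the text. Times are in hours, corrosion rates in mm/hour.
# Assumption: 17520, 131400 and 8760 are read as MEAN sojourn times.
MEAN_STAY = {1: 17520.0, 2: 131400.0, 3: 8760.0}
ETA = {1: 30000.0, 2: 200000.0, 3: 40000.0}  # transition periods
RATE_RANGE = {1: (1e-6, 1e-5), 2: (1e-7, 1e-6), 3: (1e-6, 1e-5)}
WEIBULL_SHAPE, WEIBULL_SCALE = 2.5, 11800.0  # alpha, beta


def loss_increment(rho, gamma, eta, s):
    """Loss accrued after s hours in one environment, following equation (1).

    gamma is the protection remaining on entry. expm1 is used only for
    numerical accuracy: u + eta*expm1(-u/eta) == u - eta + eta*exp(-u/eta).
    """
    if s <= gamma:
        return 0.0
    u = s - gamma
    return rho * (u + eta * math.expm1(-u / eta))


def simulate_jumps(n_jumps, rng):
    """Return a list of (jump time, new mode, loss, remaining protection)."""
    mode, t, d = 1, 0.0, 0.0
    # random.weibullvariate takes (scale, shape), in that order.
    gamma = rng.weibullvariate(WEIBULL_SCALE, WEIBULL_SHAPE)
    chain = []
    for _ in range(n_jumps):
        rho = rng.uniform(*RATE_RANGE[mode])           # rate for this stay
        stay = rng.expovariate(1.0 / MEAN_STAY[mode])  # exponential sojourn
        # Assumption: increments add up, so the loss stays continuous.
        d += loss_increment(rho, gamma, ETA[mode], stay)
        gamma = max(0.0, gamma - stay)
        t += stay
        mode = mode % 3 + 1                            # cycle 1 -> 2 -> 3 -> 1
        chain.append((t, mode, d, gamma))
    return chain


if __name__ == "__main__":
    for t, mode, d, gamma in simulate_jumps(6, random.Random(0)):
        print(f"t={t:10.0f} h  mode={mode}  loss={d:.4f} mm  protection={gamma:.0f} h")
```

Only the values at the jump times are recorded here, since the path between two jumps can be recovered from equation (1).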
The missile is inspected and the thickness loss of the structure under study is measured at each change of environment. Note that the structure is small enough for only one measurement point to be significant. The structure is considered unusable if the loss of thickness reaches 0.2 mm. The optimal maintenance time must therefore occur before this critical threshold is reached, which could cause the collapse of the structure, but not too soon, which would be unnecessarily expensive. It should also only use the available measurements of the thickness loss.

III. OPTIMAL STOPPING PROBLEM

We now briefly formulate the general mathematical problem of optimal stopping corresponding to our maintenance problem. Let $z = (k_0, y_0)$ be the starting point of the PDMP $(X_t)$. Let $\mathcal{M}_N$ be the set of all stopping times $\tau$ for the natural filtration of the PDMP $(X_t)$ satisfying $\tau \le T_N$, that is to say that the intervention takes place before the $N$th jump of the process. The $N$th jump represents the horizon of our maintenance problem, that is to say that we impose to intervene no later than the $N$th change of environment. The choice of $N$ is discussed below. Let $g$ be the cost function to optimize. Here, $g$ is a reward function that we want to maximize. The optimization problem to solve is the following:
$$v(z) = \sup_{\tau \in \mathcal{M}_N} \mathbb{E}_z[g(X_\tau)].$$
The function $v$ is called the value function of the problem and represents the maximum performance that can be achieved. Solving the optimal stopping problem consists firstly in calculating the value function, and secondly in finding a stopping time $\tau$ that achieves this maximum. This stopping time is important from the application point of view since it corresponds to the optimum time for maintenance. In general, such an optimal stopping time does not exist. We then define $\varepsilon$-optimal stopping times as those achieving the optimal value minus $\varepsilon$, i.e. $v(z) - \varepsilon$.
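Written out in symbols, and using $\tau_\varepsilon$ as a notation introduced here only for illustration, the requirement on an $\varepsilon$-optimal stopping time reads as follows. The upper bound is simply the definition of $v$ as a supremum.

```latex
\[
  \tau_\varepsilon \in \mathcal{M}_N
  \quad\text{and}\quad
  v(z) - \varepsilon \;\le\; \mathbb{E}_z\bigl[g(X_{\tau_\varepsilon})\bigr] \;\le\; v(z).
\]
```

Such a stopping time exists for every $\varepsilon > 0$ whenever $v(z)$ is finite, even when the supremum itself is not attained.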
Under fairly weak regularity conditions, Gugerli has shown in [7] that the value function $v$ can be calculated iteratively as follows. Set $v_N = g$, the reward function, and iterate an operator $L$ backwards. The function $v_0$ thus obtained is equal to the value function $v$.
$$\begin{cases} v_N = g, \\ v_k = L(v_{k+1}, g), & 0 \le k \le N - 1. \end{cases}$$
The operator $L$ is a complex operator which involves a continuous maximization, conditional expectations and indicator functions, even if the cost function $g$ is very regular.

L(w, g)(z) ≡ sup_{u ≤ t*(z)} { E[ w(Z_1) 1_{S_1 …
