Dynamic Contract Design for Systemic Cyber Risk Management of Interdependent Enterprise Networks
The interconnectivity of cyber and physical systems and Internet of things has created ubiquitous concerns of cyber threats for enterprise system managers. It is common that the asset owners and enterprise network operators need to work with cybersec…
Authors: Juntao Chen, Quanyan Zhu, Tamer Başar
Abstract

The interconnectivity of cyber and physical systems and the Internet of things has created ubiquitous concerns of cyber threats for enterprise system managers. It is common that the asset owners and enterprise network operators need to work with cybersecurity professionals to manage the risk by remunerating them for their efforts that are not directly observable. In this paper, we use a principal-agent framework to capture the service relationships between the two parties, i.e., the asset owner (principal) and the cyber risk manager (agent). Specifically, we consider a dynamic systemic risk management problem with asymmetric information where the principal can only observe cyber risk outcomes of the enterprise network rather than directly the efforts that the manager expends on protecting the resources. Under this information pattern, the principal aims to minimize the systemic cyber risks by designing a dynamic contract that specifies the compensation flows and the anticipated efforts of the manager by taking into account his incentives and rational behaviors. We formulate a bi-level mechanism design problem for dynamic contract design within the framework of a class of stochastic differential games. We show that the principal has rational controllability of the systemic risk by designing an incentive compatible estimator of the agent's hidden efforts. We characterize the optimal solution by reformulating the problem as a stochastic optimal control program which can be solved using dynamic programming.
We further investigate a benchmark scenario with complete information and identify conditions that yield zero information rent and lead to a new certainty equivalence principle for principal-agent problems. Finally, case studies over networked systems are carried out to illustrate the theoretical results obtained.

J. Chen and Q. Zhu, Department of Electrical and Computer Engineering, Tandon School of Engineering, New York University, Brooklyn, NY 11201, USA. E-mail: {jc6412,qz494}@nyu.edu

T. Başar, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. E-mail: basar1@illinois.edu

Keywords Systemic Risk · Dynamic Contracts · Differential Games · Internet of Things · Economics of Cybersecurity

1 Introduction

Cybersecurity is a critical issue in modern enterprise networks due to the adoption of advanced technologies, e.g., Internet of things (IoT), cloud and data centers, and supervisory control and data acquisition (SCADA) systems, which create abundant surfaces for cyber attacks [19, 37, 46]. Due to the interconnections between nodes in the network, the cyber risk can propagate and escalate into systemic risks, which have been a major contributor to the massive spreading of Mirai botnets, phishing messages, and ransomware, causing information breaches and financial losses. In addition, systemic risks are highly dynamic by nature as the network faces a continuous flow of cybersecurity incidents. Hence, it becomes critical for the network and asset owner to protect resources from cyber attacks.

The complex interdependencies between nodes and the fast-evolving nature of threats have made it challenging to mitigate systemic risks of enterprise networks, and doing so requires expert knowledge from cyber domains. The asset owners or system operators need to delegate tasks of risk management, including security hardening and risk mitigation, to security professionals. As depicted in Fig.
1, the owner can be viewed as a principal who employs a security professional to fulfill tasks that include monitoring the network, patching the software and devices, and recovering machines from failures. The security professional can be viewed as an agent whose efforts are remunerated by the principal. This principal-agent type of interaction models the service relationships between the two parties. The effort of the agent can be measured by the hours he spends on the security tasks. Moreover, the amount of allocated effort has a direct impact on the systemic cyber risk. For example, with more frequent scans of suspicious files and the Internet traffic at each node, the cyber risk becomes low and less likely to spread. The agent plays an important role in systemic risk as he can determine the amount of his effort and the way of distributing efforts on protecting nodes over the network. Hence, it is essential for the principal to incentivize the agent to distribute his resources desirably to protect the network.

In the cyber risk management of enterprise networks, one distinction is the principal's lack of knowledge about the effort spent by the agent. The principal is only able to observe risk outcomes, e.g., the denial or failures of services and conspicuous performance degradation. Moreover, due to the randomness in the cyber network, e.g., the biased assessment of risks and the unknown attack behaviors, the cyber risk evolves under uncertainties, making it difficult for the principal to infer the exact effort of the agent from the observations.
This type of incomplete information structure is called moral hazard in contracts, under which the asset owner aims to minimize the systemic cyber risk by providing sufficient incentives to the risk manager through a dynamic contract that specifies the compensation flows and suggested effort, while the risk manager's objective is to maximize his payoff with minimum effort by responding to the agreed contract.

Fig. 1 Systemic cyber risk management for enterprise network. The asset owner (principal) delegates the risk management tasks, e.g., network monitoring and software patching, to security professionals (agents) by designing a contract which specifies the remuneration schemes. The amount of remuneration is directly related to the systemic risk outcome of the network.

The dynamic principal-agent problem has an asymmetric information structure in which the risk manager determines his effort over time, while this effort is hidden to or unobservable by the asset owner. This information structure makes the contract design a challenging decision making problem. Conventional methods to address problems of incomplete information include the information state based separation principle [34, 35] and the belief update scheme [23]. However, these methods cannot be directly applied to design an optimal contract for the players. To address this challenge, we develop a systematic solution methodology which includes an estimation phase, a verification phase, and a control phase. Specifically, we first anticipate the risk manager's optimal effort based on the systemic risk outcome by designing an estimator for the principal. Then, we show that the principal has rational controllability of the systemic risk by verifying that the estimated effort is incentive compatible.
Finally, we transform the problem using decision variables that adapt to the principal's information set and obtain the solution by solving a reformulated standard stochastic control program.

The optimal dynamic mechanism design (ODMD) includes the compensation flows and the suggested effort. The designed optimal dynamic contract includes the compensations for direct cost of effort, discounted future revenue, cyber risk uncertainties, as well as incentive provisions. Furthermore, under the incentive compatible contract, the risk manager's behavior is strategically neutral in the sense that his current action depends solely on the present stage's cost. The policies of the optimal contract can be determined by solving a stochastic optimal control problem. Under mild conditions, the decision variables associated with the suggested effort and the compensation can be solved in parallel, leading to a separation principle for dynamic mechanism design. As a benchmark problem for comparison, we further investigate the dynamic contract under full information where the principal can fully observe the agent's effort. In general cases, we show that there is a positive information rent quantifying the difference of the principal's objective value between the contracts designed under incomplete information and full information. In addition, we identify conditions under which the information rent degenerates to zero, yielding a certainty equivalence principle in which the mechanism designs under full and asymmetric information become identical. For example, the hidden-action impact is absent in the linear quadratic (LQ) case where the principal achieves a perfect estimation and control of the risk manager's dynamic effort.

The incentive provided by the principal to the agent is critical for mitigating the cyber risk. Without sufficient control effort, the risk would grow and propagate over the network.
Under the optimal dynamic contract, both the systemic cyber risk and adopted effort decrease over time. Moreover, the effort converges to a positive constant and the systemic risk can remain at a low level. Furthermore, a higher network connectivity requires the agent to spend more effort to reduce the systemic cyber risk. In the linear quadratic (LQ) scenario, we observe that the nodes in the cyber network have self-accountability, i.e., the amount of effort allocated on each node depends only on its risk influences on other nodes and is independent of exogenous risks coming from neighboring nodes. This observation enables large-scale implementation of a distributed risk mitigation policy by determining the outer degrees of the nodes.

The contributions of this work are summarized as follows.
1) We formulate a dynamic mechanism design problem for systemic cyber risk management of enterprise networks under hidden-action type of incomplete information.
2) We provide a systematic methodology to characterize the optimal mechanism design by transforming the problem into a stochastic optimal control problem with compatible information structures.
3) We define the concept of "rational controllability" to capture the feature of indirect control of cyber risks by the principal, and identify the explicit conditions under which the designed dynamic contract is incentive compatible.
4) We identify a separation principle for dynamic contract design under mild conditions, where the estimation variable capturing the suggested risk management effort and the control variable specifying the compensation can be determined separately.
5) We reveal a certainty equivalence principle for a class of dynamic mechanism design problems where the information rent is zero, i.e., the contracts designed under asymmetric and full information cases coincide.
6) We observe that larger enterprise network connectivity and risk dependency strength require the principal to provide more incentives to the agent. Under the optimal contract in the LQ case, the allocated effort depends on the nodes' outer degree, leading to a self-accountable and distributed risk mitigation scheme.

1.1 Related Work

Cybersecurity becomes a critical issue due to the large-scale deployment of smart devices and their integration with information and communication technologies (ICTs) [42, 46]. Hence, security risk management is an important task which has been investigated in different research fields, such as communications and infrastructures [17, 55], cloud computing [48], and IoT [21]. The interconnections between nodes and devices make risk management a challenging problem as the cyber risk can propagate and escalate into systemic risk [28], and hence the interdependent security risk analysis is necessary [18]. Managing systemic risk is nontrivial, as demonstrated in financial systems [11], critical infrastructures [24], and communication networks [22]. In a network with a small number of agents, graph-theoretic methods have been widely adopted to model the strategic interactions and risk interdependencies between agents [11, 27]. When the number of nodes becomes large, [15] has proposed a mean-field game approach where a representative agent captures the system dynamics. Different from [1, 26], which minimize the static systemic risk at equilibrium, we focus in this paper on a mechanism design problem that can reduce the systemic risks by understanding the system dynamics.

Dynamic games of incomplete or imperfect information have been studied within the context of different classes of games, such as repeated games [3], differential games [13], and stochastic games [54].
Many types of information structures that entail incomplete or imperfect information have been investigated in the literature, such as partial or noisy measurements of system states [4, 8, 33–35], and asymmetric information for the players [14, 31, 32]. Approaches to control and optimization under classical information structures, also extended to games, include the information state based separation principle [16, 34, 35], belief updates on players' private information [23], generalized belief states of agents [33], and control over networks [52]. Decision-making under nonclassical information structures has also been studied (e.g., [5, 9, 47]), where the players, coupled through the system dynamics and/or the performance indices, do not share the same information and could be memoryless. Our bi-level dynamic mechanism design problem exhibits a unique information structure in that the principal delegates the risk control tasks to the agent without observing the applied control effort, while the agent has complete information of the system, which leads to informational asymmetry.

Dynamic mechanism design has been studied broadly [2, 30]. In [25], the authors have provided a comprehensive summary of dynamic contract design based on the stochastic maximum principles where solving forward-backward stochastic differential equations (FBSDEs) becomes necessary. Instead of controlling the output density using the Girsanov transform, which has an indirect interpretation in applications [45, 50], the authors in [10, 44] directly control the system output and adopt an alternative approach by regarding the agent's future payoff as a variable in the stochastic control dynamics. Our approach in this paper adopts the agent's current payoff as a state variable, which is different from the above discussed methods.
The purpose of this work is to develop risk management solutions for networked systems using a remuneration scheme that combines intermediate and terminal compensations. We will develop a systematic solution methodology for this class of problems by capturing the systemic cyber risk dynamics and provide principles for optimal mechanism design.

The current work is different from the preliminary version [20] in multiple aspects. First and foremost, in [20], the risk management policy is designed only for the LQ framework, while the current one extends the model to arbitrarily general scenarios. Thus, the analysis and derived results in this work are much more fundamental by focusing on a broader class of dynamic contract design problems. Second, we additionally investigate the dynamic contract design under full information for comparison and obtain a new certainty equivalence principle for a number of scenarios. Third, we provide comprehensive motivations for the established dynamic risk model in the problem formulation, and include discussion and illustration on the timing of events during contract design. Fourth, the introduction section is substantially expanded, including depiction of risk management for enterprise networks, background on systemic risks, and description of explicit contributions of the work. Fifth, we enrich the related work section completely by discussing more literature and highlighting the differences. Sixth, we include a higher number of case studies to thoroughly illustrate the dynamic contract design principles for systemic cyber risk management in enterprise networks.

1.2 Organization of the Paper

The paper is organized as follows. We formulate the systemic cyber risk management problem in Section 2. Section 3 analyzes the dynamic contract forms and the incentive constraints. Section 4 reformulates the principal's problem and solves a linear quadratic case explicitly.
Section 5 presents a complete-information benchmark scenario for comparison. Section 6 presents examples to illustrate the dynamic contract design for systemic risk management. Section 7 concludes the paper.

2 Problem Formulation

This section formulates the dynamic systemic cyber risk management problem of enterprise networks under asymmetric information using a principal-agent framework, and presents an overview of the adopted methodology.

2.1 Systemic Cyber Risk Management

An enterprise network is comprised of a set $\mathcal{N}$ of nodes, where $\mathcal{N} = \{1, 2, \ldots, N\}$. Due to the interdependencies among different nodes and the fast changing nature of the threats, mitigating the systemic cyber risk is a challenging task which requires expertise from cybersecurity professionals. For example, reducing the enterprise network vulnerability requires constant monitoring of the Internet traffic into and out of the system, regular patching and updating of the device software, and continuous traffic scanning for intrusion detection. The principal¹ can delegate the risk management tasks over a time period $[0, T]$ to a professional manager.

¹ The principal refers to the network/asset owner, and the agent refers to the risk manager or security professional, terms that are used interchangeably.

The cyber risk of each node depends on the level of compliance with security criteria, the number of vulnerabilities of the software and hardware assets, the system configurations, and the concerned threat models [43]. The risk also evolves over time as the enterprise node constantly updates its software, introduces new functionalities, and interconnects with other nodes. We let $Y^i_t \in \mathbb{R}$ be the state of node $i \in \mathcal{N}$ to capture the risk of each node that maps the system configurations at time $t$ and the threat models to the associated risk. For example, under the advanced persistent threat (APT) type of cyber attacks, one can assess the node's risk using the FlipIt game model in which the defender strategically configures the system by reclaiming the control of the node with some frequencies [49]. The FlipIt game outcome yields the node's risk, which is the expected proportion of time that the node may be compromised by the adversary. As the nodes in the enterprise network are connected, their risks become interdependent. We use an $N \times N$ real matrix $A$ with non-negative entries to model the influence of node $i$ on node $j$, $i, j \in \mathcal{N}$. The diagonal entries in $A$ represent the strength of internal risk evolution, and the off-diagonal entries capture the risk influence magnitude between nodes [21, 41]. For convenience, the risk profile of the network is denoted by $Y_t = [Y^1_t, Y^2_t, \cdots, Y^N_t]$. The dynamics of the risk profile describes the evolution of the systemic risk of the whole network.

Fig. 2 Systemic cyber risk management of an enterprise network containing two nodes. The cyber risk at node $i$ is denoted by $Y^i_t$ and the applied risk manager's effort is $E^i_t$, $i \in \{1, 2\}$. The cyber risk at each node depends on its system configuration, the attack model, and the risk manager's effort. Note that the cyber risk can propagate due to the connections between nodes.

To manage the risk profile, the risk manager can apply effort continuously over the time period $[0, T]$. Specifically, at every time $t \in [0, T]$, the risk manager can spend effort $E_t \in \mathcal{E} \subseteq \mathbb{R}^N_+$ on the nodes that mitigates the systemic cyber risk, where $\mathcal{E}$ is a compact set. As mentioned above, the effort can be measured by the amount of time the risk manager spends on monitoring the cyberspace of the enterprise network and by the effectiveness of that monitoring. The amount of reduced risk is monotonically increasing with the allocated effort $E_t$ [40].
This fact is reflected by many security practices, e.g., frequently scanning and analyzing the log files as well as timely patching of the software can reduce the probability of successful cyber compromise by the adversary. Another critical factor to be considered is that the cyber risk faces uncertainties due to the randomness in the cyber network, e.g., the biased assessment and measurement of risk losses and under-modelling of random cyber threats [39]. Similar to [15], we use an $N$-dimensional standard Brownian motion $B_t$, defined on the complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$, to model the risk uncertainties on nodes.

Fig. 3 Timeline of the dynamic contract design for systemic cyber risk management. [Figure: a timeline split into a contracting stage and an execution stage. Period 1: the principal designs the contract $\{p_t, c_T\}$, $t \in [0, T]$, and the suggested effort $E_t$, $t \in [0, T]$. Period 2: the agent decides whether to accept the contract or not. Period 3: if the agent accepts the contract, he spends effort in the cyber risk management. Period 4: based on the cyber risk $Y_t$, the agent is dynamically remunerated with $p_t$ for his effort according to the agreed contract. Period 5: after finishing the task, the agent is further remunerated with $c_T$.]

For clarity, Fig. 2 depicts an example of cyber risk management of the enterprise network containing two interdependent nodes. Each node stands for a subnetwork with its own system configuration, and the adversary can target different assets, e.g., application servers and workstations. The risk manager applies efforts $E^1_t$ and $E^2_t$ to node 1 and node 2 continuously to reduce the cyber risks $Y^1_t$ and $Y^2_t$, respectively. The interdependency between the two nodes is captured by the factor $A_{12} = A_{21}$.
In sum, we focus on a model of systemic cyber risk evolution described by the following stochastic differential equation (SDE):

$$dY_t = A Y_t\,dt - E_t\,dt + \Sigma_t(Y_t)\,dB_t, \quad Y_0 = y_0, \tag{1}$$

where $y_0 \in \mathbb{R}^N_+$ is a known positive vector denoting the initial systemic risk. Let $\mathcal{D}^{N \times N}_+$ denote the space of diagonal real matrices with positive elements. Then, $\Sigma_t : \mathbb{R}^N \to \mathcal{D}^{N \times N}_+$ captures the volatility of cyber risks in the network. Here, the diffusion coefficient $\Sigma_t(Y_t)$ indicates that the magnitude of uncertainty can be related to the dynamic risk of each node. We assume that the entries in $\Sigma_t(Y_t)$ are bounded, satisfying $\int_0^T \|\Sigma_t(Y_t)\mathbf{1}_N\|^2\,dt \le C_1$ almost surely, where $C_1$ is a positive constant, $\|\cdot\|$ denotes the standard Euclidean norm, and $\mathbf{1}_N$ is an $N$-dimensional vector with all ones. Furthermore, the risk manager's effort $E_t$ satisfies the condition $\int_0^T |E_t|\,dt \le C_2$ almost surely, where $C_2$ is a positive constant. Since the manager can apply effort to every node through $E_t$, the systemic risk level $Y_t$ is fully manageable in the sense that more effort on each node reduces its cyber risk more significantly. Note that the model in (1) captures the characteristics of systemic cyber risks of enterprise networks, and it is also adopted in various other risk management scenarios including cyber-physical industrial control systems [53] and financial networks [29].

As shown in Fig. 3, the dynamic contract design for cyber risk management can be broken into two stages, namely the contracting stage and the execution stage. In the contracting stage, the principal first provides to the agent a dynamic contract that specifies the payment rules for the risk management and the suggested/anticipated effort. Then, the agent chooses to accept the contract or not based on the provided benefits.
If the agent accepts, then at the execution stage he needs to determine the adopted effort $E_t$ to reduce the systemic cyber risk. During the task, the principal observes the dynamic risk outcome $Y_t$ and pays $p_t \in \mathcal{P} \subseteq \mathbb{R}_+$ compensation to the agent according to the agreed contract, where $\mathcal{P}$ is a compact set. After completing the task, the agent also receives a terminal payment $c_T \in \mathbb{R}_+$ which finalizes the contract. Therefore, the principal needs to decide on the payment process $\{p_t\}_{0 \le t \le T}$ as well as the final compensation $c_T$ by observing the systemic risks.

Note that the effort level $E_t$, $t \in [0, T]$, is hidden information of the agent, which corresponds to the hidden-action scenario, or moral hazard, in contract theory. This feature is a reflection of the fact that the principal (asset owner) of the enterprise network cares about the cyber risk outcome $Y_t$ rather than the implicit effort $E_t$ adopted by the risk manager. Furthermore, we denote the principal's information set by $\mathcal{Y}_t$, representing the augmented filtration generated by $\{Y_s\}_{0 \le s \le t}$. The agent's information set is denoted by $\mathcal{A}_t$, including $\{Y_s\}_{0 \le s \le t}$ and $\{B_s\}_{0 \le s \le t}$. Note that for the agent, knowing $\{Y_s\}_{0 \le s \le t}$ or $\{B_s\}_{0 \le s \le t}$ is equivalent, as he can determine one based on the other using also his effort process $\{E_s\}_{0 \le s \le t}$. Specifically, at time $t$, the principal's knowledge includes only the path of $Y_s$, $0 \le s \le t$. In comparison, the agent can observe every term in the system, including the principal's information as well as the path of $B_s$, $0 \le s \le t$. The principal observes the risk outcome $Y_t$, and his goal is to reduce the systemic risk by providing incentives to the manager. Therefore, the principal has no direct control of the systemic risk, and the difficulty he faces is in designing an efficient remuneration scheme based only on the limited observable information.
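To make this information pattern concrete, the following Python sketch simulates the risk dynamics (1) with an Euler-Maruyama discretization and then shows how the agent, who knows his own effort, can back out the Brownian increments from the observed risk path. The two-node matrix, the constant effort, the constant diagonal volatility, and the function names `simulate_risk` and `recover_noise` are illustrative assumptions and are not taken from the paper; in particular, the paper allows a state-dependent $\Sigma_t(Y_t)$, which is replaced here by a constant for simplicity.

```python
import math
import random


def simulate_risk(A, effort, sigma, y0, T=1.0, steps=1000, seed=0):
    """Euler-Maruyama path of dY = A Y dt - E dt + Sigma dB.

    Assumptions (illustrative only): constant effort vector `effort`
    and constant diagonal volatility given by the list `sigma`.
    """
    rng = random.Random(seed)
    dt = T / steps
    n = len(y0)
    path = [list(y0)]
    noise = []
    for _ in range(steps):
        y = path[-1]
        dB = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
        y_next = [
            y[i]
            + (sum(A[i][j] * y[j] for j in range(n)) - effort[i]) * dt
            + sigma[i] * dB[i]
            for i in range(n)
        ]
        path.append(y_next)
        noise.append(dB)
    return path, noise, dt


def recover_noise(path, A, effort, sigma, dt):
    """Agent's view: recover dB from the observed Y and his own effort E."""
    n = len(path[0])
    recovered = []
    for y, y_next in zip(path, path[1:]):
        drift = [
            (sum(A[i][j] * y[j] for j in range(n)) - effort[i]) * dt
            for i in range(n)
        ]
        recovered.append(
            [(y_next[i] - y[i] - drift[i]) / sigma[i] for i in range(n)]
        )
    return recovered


# Hypothetical two-node network with symmetric coupling A_12 = A_21.
A = [[0.5, 0.2], [0.2, 0.5]]
E = [1.0, 1.0]
SIGMA = [0.1, 0.1]
path, noise, dt = simulate_risk(A, E, SIGMA, [1.0, 1.0])
recovered = recover_noise(path, A, E, SIGMA, dt)
max_gap = max(
    abs(r - b) for rr, bb in zip(recovered, noise) for r, b in zip(rr, bb)
)
print("largest gap between recovered and true increments:", max_gap)
```

Without the effort argument, the same computation would only return the combined term $-E_t\,dt + \Sigma\,dB_t$, so an observer of $Y_t$ alone cannot separate effort from noise. This is the reason the principal has to rely on incentives rather than on direct inference.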
Next, we rewrite the $\mathcal{Y}_T$-measurable terminal payment as $c_T = \int_0^T dc_t + c_0$ to facilitate the contract analysis, where $c_t$ has an interpretation of cumulative payment during $[0, t]$, and $c_0$ is a constant to be determined. Note that $c_0$ is a virtual initial payment and the agent receives it not at initial time 0, but rather at the terminal time $T$, which is captured by the term $c_T$. The evolution of the aggregated equivalent $\mathcal{Y}_t$-measurable financial income process $M_t$ of the cyber risk manager can be described by

$$dM_t = dc_t + p_t\,dt. \tag{2}$$

The cyber risk manager's cost function is:

$$J_A\big(\{E_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) = \mathbb{E}\left[\int_0^T e^{-rt} f_A(t, p_t, E_t)\,dt + e^{-rT} h_A(M_T)\right], \tag{3}$$

where $\mathbb{E}$ is the expectation operator, $r \in \mathbb{R}_+$ is a discount factor, $f_A : [0, T] \times \mathbb{R}_+ \times \mathcal{E} \to \mathbb{R}$ is the running cost, and $h_A : \mathbb{R}_+ \to \mathbb{R}_-$ is the terminal cost. The function $f_A$ is (implicitly) composed of two terms: the cost of spending effort $E_t$ in risk management, and the received compensation $p_t$ from the principal. Note that the final compensation $c_T$ is incorporated into $h_A(M_T)$. Assumptions we make on the two additive terms of the cost functions are as follows.

Assumption 1 The running cost function $f_A(t, p_t, E_t)$ is uniformly continuous and differentiable in $p_t$ and $E_t$. Further, it is monotonically decreasing in $p_t$, and monotonically increasing and strictly convex in $E_t$. The terminal cost function $h_A(M_T)$ is a continuously differentiable, convex, and monotonically decreasing function.

The principal's cost function, on the other hand, is specified as:

$$J_P\big(\{p_t\}_{0 \le t \le T}, c_T\big) = \mathbb{E}\left[\int_0^T e^{-rt} f_P(t, Y_t, p_t)\,dt + e^{-rT}\big(c_T + h_P(Y_T)\big)\right], \tag{4}$$

where $f_P : [0, T] \times \mathbb{R}^N \times \mathcal{P} \to \mathbb{R}$ is the running cost, and $h_P : \mathbb{R}^N \to \mathbb{R}$ denotes the terminal cost. The function $f_P$ captures the instantaneous cost of dynamic systemic risk and the payment to the agent.
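As a numerical illustration of the cost functionals (3) and (4), the Python sketch below evaluates their discretized versions along a single path. The functional forms $f_A = \tfrac{1}{2}\|E\|^2 - p$, $h_A(M) = -M$, $f_P = p + \sum_i Y^i$, and $h_P(Y) = \sum_i Y^i$ are hypothetical choices made only because they are monotone in the directions the paper assumes; they are not the paper's cost functions. The sketch also assumes $M_0 = c_0$, so that $M_T = c_T + \int_0^T p_t\,dt$ by (2), and all function names are illustrative. Averaging the returned values over many simulated paths would approximate the expectations in (3) and (4).

```python
import math


def discounted_cost(running, terminal, r, dt):
    """Left Riemann sum of e^{-rt} * running(t) dt plus e^{-rT} * terminal."""
    total = sum(math.exp(-r * k * dt) * x * dt for k, x in enumerate(running))
    T = len(running) * dt
    return total + math.exp(-r * T) * terminal


def agent_cost(efforts, payments, c_T, r, dt):
    """Single-path version of (3) under hypothetical f_A and h_A."""
    # Hypothetical running cost: f_A = 0.5 * ||E||^2 - p.
    running = [
        0.5 * sum(e * e for e in E) - p for E, p in zip(efforts, payments)
    ]
    # Assumes M_0 = c_0, hence M_T = c_T + integral of p_t dt.
    M_T = c_T + sum(p * dt for p in payments)
    # Hypothetical terminal cost: h_A(M) = -M.
    return discounted_cost(running, -M_T, r, dt)


def principal_cost(path, payments, c_T, r, dt):
    """Single-path version of (4) under hypothetical f_P and h_P."""
    # Hypothetical running cost: f_P = p + sum_i Y^i.
    running = [p + sum(y) for y, p in zip(path, payments)]
    # Hypothetical terminal cost: h_P(Y_T) = sum_i Y_T^i.
    return discounted_cost(running, c_T + sum(path[-1]), r, dt)


# Usage on a flat, hand-written two-node path (100 steps of size 0.01).
steps, dt, r = 100, 0.01, 0.1
efforts = [[1.0, 1.0]] * steps
path = [[1.0, 1.0]] * (steps + 1)
for pay in (1.0, 2.0):
    payments = [pay] * steps
    print(
        pay,
        agent_cost(efforts, payments, 0.0, r, dt),
        principal_cost(path, payments, 0.0, r, dt),
    )
```

Raising the payment rate lowers the agent's cost and raises the principal's cost in this sketch, which is the tension the contract has to balance.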
Assumption 2 The running cost for the principal, $f_P(t, Y_t, p_t)$, is uniformly continuous and differentiable in $Y_t$ and $p_t$. Further, it is monotonically increasing in $p_t$ and $Y_t$. The terminal cost for the principal, $h_P(Y_T)$, is a continuously differentiable and monotonically increasing function.

2.2 Dynamic Principal-Agent Model

In cyber risk management, the principal contracts with the agent over $[0, T]$. For a given contract, the risk manager is strategic in minimizing the net cost. This rational behavior can be captured by the following definition.

Definition 1 (Incentive Compatibility) Under a given payment process $\{p_t\}_{0 \le t \le T}$ and terminal compensation $c_T$ of the principal, the effort trajectory $\{E^*_t\}_{0 \le t \le T}$ of the agent is incentive compatible (IC) if it optimizes the cost function (3), i.e.,

$$J_A\big(\{E^*_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) \le J_A\big(\{E_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big), \quad \forall E_t \in \mathcal{E},\ t \in [0, T]. \tag{5}$$

The asset owner needs to provide sufficient incentives for the agent to fulfill the task of risk management, and this fact is captured through individual rationality as follows.

Definition 2 (Individual Rationality) The agent's policy is individually rational (IR) if the effort trajectory $\{E^*_t\}_{0 \le t \le T}$ leads to satisfaction of

$$J_A\big(\{E^*_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) = \inf_{E_t \in \mathcal{E}} J_A\big(\{E_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) \le J_A, \tag{6}$$

where $J_A$ on the right-hand side (written without arguments) is a predetermined non-positive constant.

Note that the non-positiveness of $J_A$ ensures that fulfilling the risk management tasks is profitable for the risk manager.

We next provide precise formulations of the problems faced by the agent and the principal.
Under a contract $\{\{p_t\}_{0 \le t \le T}, c_T\}$, the agent minimizes his total cost by solving the following problem:

$$(\mathrm{O\text{-}A}): \quad \min_{E_t \in \mathcal{E},\ t \in [0, T]} J_A\big(\{E_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big)$$

subject to the stochastic dynamics (1) and the payment process (2).

By taking into account the IC and IR constraints, the principal addresses the following optimization problem:

$$(\mathrm{O\text{-}P}): \quad \min_{p_t \in \mathcal{P},\ t \in [0, T],\ c_T} J_P\big(\{p_t\}_{0 \le t \le T}, c_T\big)$$

subject to the stochastic dynamics (1), IC (5), and IR (6).

Note that the designed contract terms $\{p_t\}_{0 \le t \le T}$ and $c_T$ should adapt to the information available to the principal in view of the underlying incomplete information. Denote the solution to (O-P) by $\{p^*_t\}_{0 \le t \le T}$ and $c^*_T$. We present the solution concept of the formulated problem as follows.

Definition 3 (Optimal Dynamic Mechanism Design (ODMD)) The ODMD consists of the contract $\{\{p^*_t\}_{0 \le t \le T}, c^*_T\}$ as well as the effort process $\{E^*_t\}_{0 \le t \le T}$ that solve the problems (O-P) and (O-A), respectively. In addition, the compensation processes $p^*_t$ and $c^*_T$ are adapted to $\mathcal{Y}_t$ and $\mathcal{Y}_T$, respectively, and the risk manager's effort $E^*_t$ is adapted to $\mathcal{A}_t$.

Remark: ODMD captures the bi-level interdependent decision making of the principal and the agent, which is a Stackelberg differential game with a nonstandard information structure. Since the principal (leader) delegates the control task to the agent (follower) but cannot observe his adopted action, ODMD features the limited nature of the principal's information.

Due to the hidden effort of the risk manager, (O-P) is not a classical stochastic optimal control problem. Specifically, the principal only observes the cyber risk outcome rather than the effort, which has to be incentivized.
To address this challenge brought about by the presence of asymmetric information, we adopt a systematic approach to design an incentive compatible and optimal mechanism.

2.3 Overview of the Methodology

We present an overview of the steps involved in our derivation, with details worked out in the following sections. The principal first estimates the risk manager's effort based on the systemic risk output (estimation phase), then verifies that the estimated effort is incentive compatible (verification phase), and finally designs an optimal compensation scheme under the incentive compatible estimator (control phase).

Our goal is to transform the problem using variables that are adapted to the principal's information set. To this end, the principal first assumes that the agent behaves optimally with effort level $E_t^*$ (even though the principal does not know its exact value) and calculates the corresponding cost of the agent. Equivalently, the principal anticipates that the agent implements an effort $E_t^*$ that satisfies the IC constraint. Then, the principal designs the form of the terminal payment using the estimated agent's cost (Section 3.1). The agent responds to the contract strategically through his best effort $E_t^o$. When the anticipated $E_t^*$ coincides with $E_t^o$, $E_t^*$ is an incentive compatible estimator and the principal succeeds in having the agent implement $E_t^*$ (Section 3.2). Therefore, the principal can determine the optimal payment $p_t^*$ based on $E_t^*$ by solving a standard stochastic optimal control problem (Section 4.2).

3 Analysis of Risk Manager's Incentives

We first provide a form of the terminal payment contract term and then focus on deriving an incentive compatible estimator of the cyber risk manager's effort.

3.1 Terminal Payment Analysis

We first present the following result on the IR constraint.
Lemma 1 The IR constraint holds as an equality, i.e.,

$$J_A\big(\{E_t^*\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) = J_A. \tag{7}$$

Proof If $J_A(\{E_t^*\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T) < J_A$, the designed contract is not optimal, as the principal can further reduce his cost by paying less to the agent. □

Next, we express the agent's cost under the principal's information set $\mathcal{Y}_t$, using the property that the agent chooses an optimal $E_t^*$, and then use the principal's estimate of the agent's cost to characterize the cumulative payment process. We introduce a new variable $W_t$ representing the expected future cost of the agent anticipated by the principal:

$$W_t = \mathbb{E}\left[\int_t^T e^{-r(s-t)} f_A(s, p_s, E_s^*)\,ds + e^{-r(T-t)} h_A(M_T) \,\middle|\, \mathcal{Y}_t\right]. \tag{8}$$

Note that $W_t$ is evaluated under the information available to the principal at time $t$. Thus, the total expected cost of the agent under the information $\mathcal{Y}_t$ can be expressed as

$$U_t = \mathbb{E}\left[\int_0^T e^{-rt} f_A(t, p_t, E_t)\,dt + e^{-rT} h_A(M_T) \,\middle|\, \mathcal{Y}_t,\ E_t = E_t^*\right] = \int_0^t e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rt} W_t. \tag{9}$$

We further have the conditions $U_0 = W_0 = J_A$ and $W_T = h_A(M_T)$. The effort $E_t = E_t^*$ indicates that the agent behaves optimally under a given contract.

Proposition 1 The total expected cost of the agent, $U_t$, is a martingale under $\mathcal{Y}_t$. In addition, there exists an $N$-dimensional progressively measurable process $\zeta_t$ such that

$$dU_t = e^{-rt} \zeta_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + E_t^*\,dt\right), \tag{10}$$

where the superscript $\mathsf{T}$ denotes the transpose operator.

Proof First, for $\tau \le t$, we have

$$\begin{aligned}
\mathbb{E}[U_t \mid \mathcal{Y}_\tau] &= \mathbb{E}\left[\int_0^\tau e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-r\tau} W_\tau \,\middle|\, \mathcal{Y}_\tau\right] + \mathbb{E}\left[\int_\tau^t e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rt} W_t - e^{-r\tau} W_\tau \,\middle|\, \mathcal{Y}_\tau\right] \\
&= U_\tau + \mathbb{E}\left[\int_\tau^t e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rt} W_t \,\middle|\, \mathcal{Y}_\tau\right] - e^{-r\tau} W_\tau.
\end{aligned} \tag{11}$$

Then, using (8), we obtain

$$\mathbb{E}\left[\int_\tau^t e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rt} W_t \,\middle|\, \mathcal{Y}_\tau\right] = \mathbb{E}\left[\int_\tau^T e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rT} h_A(M_T) \,\middle|\, \mathcal{Y}_\tau\right] = e^{-r\tau} W_\tau. \tag{12}$$

Hence, $\mathbb{E}[U_t \mid \mathcal{Y}_\tau] = U_\tau$, and $U_t$ is a $\mathcal{Y}_t$-measurable martingale. Applying the martingale representation theorem [36] yields (10). □

Based on Proposition 1, we obtain the following lemma, which facilitates the design of the terminal payment term in the optimal contract.

Lemma 2 The aggregate equivalent income process $M_t$ evolves according to:

$$dM_t = \frac{r h_A(M_t)}{h_A'(M_t)}\,dt - \frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)}\,dt + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + E_t^*\,dt\right) - \frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)}\,dt. \tag{13}$$

Proof Differentiating (9) and substituting (10), we obtain

$$dU_t = e^{-rt} f_A(t, p_t, E_t^*)\,dt - r e^{-rt} W_t\,dt + e^{-rt}\,dW_t \ \Rightarrow\ dW_t = r W_t\,dt - f_A(t, p_t, E_t^*)\,dt + \zeta_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + E_t^*\,dt\right). \tag{14}$$

Since $W_T = h_A(M_T)$, we adopt the form $W_t = h_A(M_t)$ and aim to characterize the contract that yields this form. Then, we have $J_A = h_A(M_0) = h_A(c_0)$. Further, (14) indicates that

$$h_A'(M_t)\,dM_t + \frac{1}{2} h_A''(M_t) \chi_t^2\,dt = r h_A(M_t)\,dt - f_A(t, p_t, E_t^*)\,dt + \zeta_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + E_t^*\,dt\right), \tag{15}$$

where $\chi_t$ is the volatility of the process $M_t$. Matching the volatility terms in (15) gives $h_A'^{\,2}(M_t) \chi_t^2 = \zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t$. Then, (15) yields the result. □

Remark: Note that (10) includes information on the cyber risk dynamics (1). Thus, (13) can be seen as a modified stochastic dynamic system of the agent with $M_t$ as a new state variable. In addition, $\zeta_t$ can be interpreted as the principal's control over the agent's revenue. Another point to be highlighted is the role of $p_t$ in (13).
Here, $p_t$ is not optimal yet and its value needs to be further determined by the principal. For now, we can view $p_t$ as an exogenous variable that enters the constructed dynamic contract form (13). In addition, the feedback structure of the dynamic contract on $Y_t$ is reflected by the cumulative payment term $c_t$ shown later in Lemma 3.

Interpretation of Dynamic Contract: The dynamic contract determines the risk manager's revenue in (13), which includes four separate terms. The first term, $\frac{r h_A(M_t)}{h_A'(M_t)}\,dt$, indicates that the risk manager's payoff should be increased to compensate for the discounted future revenue. The second term, $-\frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)}\,dt$, is an offset of the direct cost of the agent's effort. The third term, $\frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} (dY_t - A Y_t\,dt + E_t^*\,dt)$, is an incentive term, which captures the agent's benefit from spending effort in risk management. Here, the agent's real effort enters through the $dY_t$ term. The last term, $-\frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)}\,dt$, is a risk compensation term (the manager is risk-averse), capturing the fact that the risk manager faces uncertainties in the performance outcome due to the Brownian motion.

For completeness, we present the cumulative payment process $c_t$ in the following lemma.

Lemma 3 The cumulative payment process $c_t$ evolves according to:

$$dc_t = \frac{r h_A(M_t)}{h_A'(M_t)}\,dt - \frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)}\,dt + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + E_t^*\,dt\right) - \frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)}\,dt - p_t\,dt. \tag{16}$$

Proof The result can be directly obtained from (2) and Lemma 2. □

Lemma 3 characterizes the cumulative payment process $c_t$ with initial value $c_0$ given by $h_A(c_0) = J_A$.
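The four-term reading of (13) and the payment increment (16) can be checked numerically over a single time step. The following is a minimal sketch, not the authors' implementation: the function name `contract_increment`, its argument names, and all parameter values are hypothetical, and the example assumes a linear terminal cost $h_A(M) = -M$, for which the risk compensation term vanishes.

```python
import numpy as np

def contract_increment(M, Y, dY, A, E_star, zeta, Sigma, p, f_A_value, r, dt, h, dh, d2h):
    """Evaluate the four terms of dM_t in (13) and dc_t in (16) over one step dt.

    h, dh, d2h are callables for h_A, h_A', h_A''; f_A_value is f_A(t, p_t, E_t^*).
    """
    hp = dh(M)
    discount = r * h(M) / hp * dt                      # first term of (13)
    cost_offset = -f_A_value / hp * dt                 # second term of (13)
    surprise = dY - (A @ Y) * dt + E_star * dt         # dY_t - A Y_t dt + E_t^* dt
    incentive = float(zeta @ surprise) / hp            # third term of (13)
    quad = float(zeta @ Sigma @ Sigma.T @ zeta)        # zeta^T Sigma Sigma^T zeta
    risk = -0.5 * d2h(M) / hp * quad / hp**2 * dt      # fourth term of (13)
    dM = discount + cost_offset + incentive + risk
    return {
        "discount": discount,
        "cost_offset": cost_offset,
        "incentive": incentive,
        "risk_compensation": risk,
        "dM": dM,
        "dc": dM - p * dt,                             # (16): dc_t = dM_t - p_t dt
    }

# Hypothetical two-node example with linear terminal cost h_A(M) = -M.
A = np.array([[0.1, 0.05], [0.02, 0.1]])
Y = np.array([1.0, 2.0])
E_star = np.array([0.3, 0.2])
zeta = np.array([0.5, 0.4])
Sigma = 0.2 * np.diag(Y)
dt = 0.01
dY_expected = (A @ Y - E_star) * dt  # noise-free risk increment under effort E_star

terms = contract_increment(
    M=1.0, Y=Y, dY=dY_expected, A=A, E_star=E_star, zeta=zeta, Sigma=Sigma,
    p=0.0, f_A_value=0.1, r=0.05, dt=dt,
    h=lambda m: -m, dh=lambda m: -1.0, d2h=lambda m: 0.0,
)
```

When the observed risk increment equals its anticipated drift, the incentive term is zero, so the agent is rewarded or penalized only for deviations of the risk outcome from what the anticipated effort would produce.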
We focus on the class of contracts in (16), and aim to determine the optimal variables ($\zeta_t$ and $p_t$) to minimize the principal's cost. Note that (16) is adapted to the principal's information set $\mathcal{Y}_t$, since the principal observes $M_t$ and $Y_t$, determines $p_t$ and $\zeta_t$, and anticipates $E_t^*$. In addition, this payment process is directly related to the actual effort that the agent adopts, captured by $dY_t$. The variable $\zeta_t$ can be further interpreted as the sensitivity (or gain) of the contract payment to the risk difference under the agent's optimal and actual efforts. In addition, since $W_t = h_A(M_t)$, based on (8) and (9), we obtain

$$U_t = \mathbb{E}\left[\int_0^T e^{-rt} f_A(t, p_t, E_t)\,dt + e^{-rT} h_A(M_T) \,\middle|\, \mathcal{A}_t\right] = \int_0^t e^{-rs} f_A(s, p_s, E_s^*)\,ds + e^{-rt} h_A(M_t), \tag{17}$$

where the conditional expectation on $\mathcal{A}_t$ admits the same value as that on $\mathcal{Y}_t$. Proposition 1 indicates that $U_t$ is a martingale. Then, the expected value of $e^{-rt} h_A(M_t)$ in (17) is zero, which confirms the zero expected future cost of the agent.

3.2 Incentive Analysis of Cyber Risk Manager

Recall that the principal suggests an optimal effort process $E_t^*$ by assuming that the agent behaves optimally. However, the agent can determine his actual effort $E_t$ that minimizes the cost $J_A$ based on $\mathcal{A}_t$, which might not be the same as the $E_t^*$ that the principal suggests. Thus, the next important problem for the principal is to determine an incentive compatible contract. To achieve this goal, the principal determines the process $\zeta_t$ and the payment $p_t$ strategically to control the agent's actual effort $E_t$.

Denote by $V_a(t, M_t)$ the agent's value function with terminal condition $V_a(T, M_T) = h_A(M_T)$.
The dynamic programming principle ensures that the risk management effort is optimal if it satisfies the following dynamic programming equation, for $t \le s \le T$:

$$e^{-rt} V_a(t, M_t) = \min_{E_t} \mathbb{E}\left\{\int_t^s e^{-ru} f_A(u, p_u, E_u)\,du + e^{-rs} V_a(s, M_s)\right\}.$$

Then, using (1), (2), and (16), the cyber risk manager's revenue can be expressed as:

$$dM_t = \frac{r h_A(M_t)}{h_A'(M_t)}\,dt - \frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)}\,dt + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} (E_t^* - E_t)\,dt - \frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)}\,dt + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t. \tag{18}$$

We rewrite the risk manager's problem as follows:

$$(O\text{-}A'): \quad \min_{E_t \in \mathcal{E},\ t \in [0, T]} J_A\big(\{E_t\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big)$$

subject to the stochastic dynamics (18) and the payment process (2).

The Hamilton-Jacobi-Bellman (HJB) equation associated with the stochastic optimal control problem $(O\text{-}A')$ is

$$\begin{aligned}
\min_{E_t} \Bigg[ &\frac{1}{2} \frac{\partial^2 V_a}{\partial M_t^2} \frac{1}{h_A'^{\,2}(M_t)} \zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t + \frac{\partial V_a}{\partial M_t} \bigg( \frac{r h_A(M_t)}{h_A'(M_t)} - \frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)} + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} (E_t^* - E_t) \\
&- \frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)} \bigg) + f_A(t, p_t, E_t) \Bigg] + \frac{\partial V_a}{\partial t} = r V_a, \quad V_a(T, M_T) = h_A(M_T).
\end{aligned} \tag{19}$$

Based on the candidate value function $V_a(t, M_t) = h_A(M_t)$, the second-order condition of (19) is satisfied. Then, the optimal solution to $(O\text{-}A')$ is

$$E_t^o = \arg\max_{E_t} \left[\frac{\partial V_a}{\partial M_t} \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} E_t - f_A(t, p_t, E_t)\right] = \arg\max_{E_t} \left[\zeta_t^{\mathsf{T}} E_t - f_A(t, p_t, E_t)\right]. \tag{20}$$

For a given contract, $E_t^o$ is the optimal effort of the agent. Then, when the anticipated effort $E_t^*$ of the principal coincides with $E_t^o$, i.e., $E_t^* = E_t^o$, the provided contract is IC and $E_t^*$ is implemented. The following theorem captures this result.
Theorem 1 When the compensation process in the contract is specified by (16), the IC constraint is satisfied, i.e., $E_t^*$ is implemented as expected by the principal, if and only if the following condition holds:

$$E_t^* = \arg\max_{E_t} \left[\zeta_t^{\mathsf{T}} E_t - f_A(t, p_t, E_t)\right], \tag{21}$$

where $\zeta_t$ is adapted to the information $\mathcal{Y}_t$ available to the principal.

Proof We verify that $E_t^*$ is implemented by the agent. For an arbitrary process $\{E_t\}_{0 \le t \le T}$, we define a variable

$$\tilde{U}_t = \int_0^t e^{-rs} f_A(s, p_s, E_s)\,ds + e^{-rt} h_A(M_t),$$

where $M_t$ is given by (18). Note that the HJB equation associated with $(O\text{-}A')$ can also be written as $0 = \min_{E_t} \mathbb{E}[d\tilde{U}_t \mid \mathcal{A}_t]$. Then, when $E_t \ne E_t^*$, the drift term of $\tilde{U}_t$ is positive, which yields $\tilde{U}_t < \mathbb{E}[\tilde{U}_T \mid \mathcal{A}_t]$. Hence, at time $t$, the expected total cost of the risk manager is greater than $\tilde{U}_t$. When $E_t = E_t^*$, we have $\mathbb{E}[d\tilde{U}_t \mid \mathcal{A}_t] = 0$, and thus $\tilde{U}_t = \mathbb{E}[\tilde{U}_T \mid \mathcal{A}_t]$. This verifies that $E_t^*$ is the incentive compatible optimal decision of the risk manager, under which his total expected cost attains its lower bound. □

Based on Theorem 1, the principal can indirectly manipulate the implemented effort of the agent by determining the variables $\zeta_t$ and $p_t$ jointly. Hence, under (21), the suggested effort $E_t^*$ is incentive compatible.

Remark: From (21), we can see that the risk manager's behavior is strategically neutral. Specifically, at time $t$, the risk manager decides on the optimal effort $E_t^*$ based only on the current cost (term $f_A(t, p_t, E_t)$) and benefit (term $\zeta_t^{\mathsf{T}} E_t$) instead of forward-looking variables. This neutral behavior is consistent with the fact that a larger current effort does not induce a higher payoff for the agent after time $t$, since, as shown in (17), the expected future cost over time $(t, T]$ is zero due to the martingale property.
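The IC condition (21) is a static maximization at each time instant, which makes it easy to check numerically. The sketch below assumes a quadratic effort cost $\frac{1}{2} E^{\mathsf{T}} R E$ with a symmetric positive-definite $R$ (the form adopted in the LQ setting of Section 4.4); the payment part of $f_A$ does not depend on $E_t$ and is therefore omitted. The function names and numerical values are hypothetical.

```python
import numpy as np

def ic_effort(zeta, R):
    """Maximizer of zeta^T E - 0.5 E^T R E, i.e. the solution of R E = zeta."""
    return np.linalg.solve(R, zeta)

def agent_objective(E, zeta, R):
    """Instantaneous benefit minus effort cost appearing in (21)."""
    return float(zeta @ E - 0.5 * E @ R @ E)

# Hypothetical two-node example.
R = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric positive definite
zeta = np.array([1.0, 0.4])              # contract sensitivity chosen by the principal
E_star = ic_effort(zeta, R)

# Any deviation d from E_star lowers the objective by 0.5 d^T R d > 0.
rng = np.random.default_rng(0)
deviations = rng.normal(scale=0.3, size=(1000, 2))
best = agent_objective(E_star, zeta, R)
smallest_loss = min(best - agent_objective(E_star + d, zeta, R) for d in deviations)
```

Because the objective is strictly concave under this assumption, the principal can steer the implemented effort simply by choosing $\zeta_t$, which is the mechanism Theorem 1 formalizes.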
4 The Principal's Problem: Optimal Dynamic Systemic Cyber Risk Management

Our next goal is to characterize the dynamic contracts designed by the principal. Furthermore, we present a separation principle and explicit solutions to an LQ case in this section.

4.1 Rational Controllability

The controllability of the cyber risk is critical to the principal. To account for the incentives in the management of risk, we have the following definition.

Definition 4 (Rational Controllability) The dynamic systemic cyber risk is rationally controllable if the principal can provide incentives $\{p_t\}_{0 \le t \le T}$ and $c_T$ such that the risk manager's effort $\{E_t\}_{0 \le t \le T}$ coincides with the one suggested by the principal.

In ODMD, rational controllability indicates that under $\{\{p_t^*\}_{0 \le t \le T}, c_T^*\}$, the best-response behavior $\{E_t^*\}_{0 \le t \le T}$ of the agent is the same as the principal's predicted effort. The unique feature of rational controllability is that the principal cannot control the cyber risk directly but can rely on other terms to infer the rational behavior of the agent, which further influences the applied effort in risk management. Corollary 1 later captures this result.

4.2 Stochastic Optimal Control Reformulation

Knowing that the cyber risk manager behaves strategically, the principal aims to implement $E_t^*$, and thus (16) becomes

$$dc_t = \frac{r h_A(M_t)}{h_A'(M_t)}\,dt - \frac{f_A(t, p_t, E_t^*)}{h_A'(M_t)}\,dt - \frac{1}{2} \frac{h_A''(M_t)}{h_A'(M_t)} \frac{\zeta_t^{\mathsf{T}} \Sigma_t(Y_t) \Sigma_t(Y_t)^{\mathsf{T}} \zeta_t}{h_A'^{\,2}(M_t)}\,dt - p_t\,dt + \frac{1}{h_A'(M_t)} \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t. \tag{22}$$

Instead of dealing with the complex revenue dynamics (18) of the risk manager, we deal with its equivalent counterpart $dh_t$ shown in Theorem 2 below, which is much simpler. We reformulate the principal's problem as a standard stochastic optimal control problem as follows.
Theorem 2 The principal's problem is reformulated as a stochastic optimal control problem as follows:

$$(O\text{-}P'): \quad \min_{p_t \in \mathcal{P},\ \zeta_t} \mathbb{E}\left[\int_0^T e^{-rt} \left(f_P(t, Y_t, p_t) - e^{-r(T-t)} p_t\right) dt + e^{-rT} \left(h_P(Y_T) + h_A^{-1}(h_T)\right)\right]$$

such that

$$\begin{aligned}
dY_t &= A Y_t\,dt - E_t^*\,dt + \Sigma_t(Y_t)\,dB_t, \quad Y_0 = y_0, \\
dh_t &= r h_t\,dt - f_A(t, p_t, E_t^*)\,dt + \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t, \quad h_0 = J_A, \\
E_t^* &= \arg\max_{E_t} \left[\zeta_t^{\mathsf{T}} E_t - f_A(t, p_t, E_t)\right].
\end{aligned}$$

Proof Recall that the expected cost of the cyber risk manager is equal to $W_t = h_A(M_t)$. Then, under the optimal risk management effort and denoting $h_t = h_A(M_t)$, we obtain $dh_t = r h_t\,dt - f_A(t, p_t, E_t^*)\,dt + \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t$, $h_0 = J_A$. In addition, based on $dc_t = dM_t - p_t\,dt$, we have $c_T = M_T - \int_0^T p_t\,dt$. Since $M_T = h_A^{-1}(h_T)$, we have $e^{-rT} c_T = e^{-rT} h_A^{-1}(h_T) - \int_0^T e^{-rt} e^{-r(T-t)} p_t\,dt$. Thus, the cost function of the principal can be rewritten as

$$\mathbb{E}\left[\int_0^T e^{-rt} \left(f_P(t, Y_t, p_t) - e^{-r(T-t)} p_t\right) dt + e^{-rT} \left(h_P(Y_T) + h_A^{-1}(h_T)\right)\right],$$

which yields the result. □

In the investigated incomplete information situations, the principal preserves the indirect controllability of the systemic risk $Y_t$ by estimating the agent's effort $E_t^*$ as well as specifying the contract terms $p_t$, $c_T$ and the process $\zeta_t$.

Corollary 1 By providing incentives $\{\{p_t\}_{0 \le t \le T}, c_T\}$ and specifying the process $\{\zeta_t\}_{0 \le t \le T}$, the dynamic systemic cyber risk is rationally controllable, and the incentive compatible effort follows (21). The optimal $\{p_t^*\}_{0 \le t \le T}$ and $\{\zeta_t^*\}_{0 \le t \le T}$ can be obtained from Theorem 2.

Proof The result directly follows from Theorems 1 and 2. □

Remark: Theorem 2 presents a standard optimal control problem for the principal, for which the existence and uniqueness of solutions have been well studied [51].
With $f_P$, $h_P$, $f_A$, and $h_A$ satisfying the conditions in Assumptions 1 and 2, and the corresponding coefficients in the functions selected so that $(O\text{-}P')$ is feasible, the control problem can be solved efficiently by numerical methods [38]. Therefore, the ODMD for the systemic risk management problem, i.e., $E_t^*$, $c_T^*$, and $p_t^*$, can be determined from (21), (22), and Theorem 2, respectively.

4.3 Separation Principle

We next present a separation principle for the asset owner in determining the compensation $p_t$ and the auxiliary parameter $\zeta_t$. First, we make assumptions on the separability of the cost functions.

(S1): The agent's running cost can generally be separated into two parts, one for the effort and one for the payment. Accordingly, we take $f_A(t, p_t, E_t)$ to be in the form

$$f_A(t, p_t, E_t) = f_{A,E}(E_t) - f_{A,p}(p_t), \tag{23}$$

where $f_{A,E}: \mathcal{E} \to \mathbb{R}_+$ is monotonically increasing, continuously differentiable and strictly convex, i.e., $f_{A,E}'(E_t) > 0$ and $f_{A,E}''(E_t) > 0$, and $f_{A,p}: \mathcal{P} \to \mathbb{R}_+$. Then, the constraint $E_t^* = \arg\max_{E_t} [\zeta_t^{\mathsf{T}} E_t - f_A(t, p_t, E_t)]$ can be simplified to

$$E_t^* = (f_{A,E}')^{-1}(\zeta_t). \tag{24}$$

(S2): We also assume that the principal's running cost takes the form

$$f_P(t, Y_t, p_t) = f_{P,Y}(Y_t) + f_{P,p}(p_t), \tag{25}$$

where $f_{P,Y}: \mathbb{R}^N \to \mathbb{R}$ and $f_{P,p}: \mathcal{P} \to \mathbb{R}_+$ are monotonically increasing and continuously differentiable.

The inverse function $h_A^{-1}$ plays a role in the principal's objective. We further have the following assumption.

(L1): The agent's terminal cost function $h_A$ is linear, i.e., $h_A(M_T) = \gamma M_T$, where $\gamma < 0$.

Then, we have the following separation principle.
Theorem 3 Under conditions (S1), (S2), and (L1), the principal's problem $(O\text{-}P')$ can be separated into two subproblems with respect to the decision variables $\zeta_t$ and $p_t$ as:

$$(SP1): \quad \min_{\zeta_t} \mathbb{E}\left[\int_0^T e^{-rt} \left(f_{P,Y}(Y_t) - \frac{1}{\gamma} f_{A,E}\big((f_{A,E}')^{-1}(\zeta_t)\big)\right) dt + e^{-rT} h_P(Y_T) + \frac{1}{\gamma} \int_0^T e^{-rt} \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t\right]$$

such that $dY_t = A Y_t\,dt - (f_{A,E}')^{-1}(\zeta_t)\,dt + \Sigma_t(Y_t)\,dB_t$, $Y_0 = y_0$.

$$(SP2): \quad \min_{p_t \in \mathcal{P}} \int_0^T e^{-rt} \left(f_{P,p}(p_t) - e^{-r(T-t)} p_t + \frac{1}{\gamma} f_{A,p}(p_t)\right) dt.$$

Proof From the constraint $dh_t = r h_t\,dt - f_{A,E}\big((f_{A,E}')^{-1}(\zeta_t)\big)\,dt + f_{A,p}(p_t)\,dt + \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t$, we obtain

$$h_t = e^{rt} h_0 - \int_0^t e^{r(t-s)} \left[f_{A,E}\big((f_{A,E}')^{-1}(\zeta_s)\big) - f_{A,p}(p_s)\right] ds + \int_0^t e^{r(t-s)} \zeta_s^{\mathsf{T}} \Sigma_s(Y_s)\,dB_s.$$

Thus, the principal's problem can be rewritten as

$$\begin{aligned}
\min_{p_t \in \mathcal{P},\ \zeta_t} \mathbb{E}\Bigg[ &\int_0^T e^{-rt} \left(f_{P,Y}(Y_t) + f_{P,p}(p_t) - e^{-r(T-t)} p_t\right) dt + e^{-rT} \bigg[ h_P(Y_T) + h_A^{-1}\bigg( e^{rT} J_A - \int_0^T e^{r(T-s)} f_{A,E}\big((f_{A,E}')^{-1}(\zeta_s)\big)\,ds \\
&+ \int_0^T e^{r(T-s)} f_{A,p}(p_s)\,ds + \int_0^T e^{r(T-s)} \zeta_s^{\mathsf{T}} \Sigma_s(Y_s)\,dB_s \bigg) \bigg] \Bigg]
\end{aligned}$$

such that $dY_t = A Y_t\,dt - (f_{A,E}')^{-1}(\zeta_t)\,dt + \Sigma_t(Y_t)\,dB_t$, $Y_0 = y_0$. Since $h_A^{-1}(x) = x/\gamma$ is linear under (L1), the objective splits into a part depending only on $\zeta_t$ and a part depending only on $p_t$, and the decomposition of the problem follows. □

Remark: $\zeta_t$ can be regarded as an estimation variable since it determines the anticipated effort $E_t^*$. The payment $p_t$ is a control variable that manipulates the risk manager's incentives and is determined at the control phase. Under appropriate conditions, these estimation and control variables can be designed separately, yielding a separation principle in dynamic contract design for systemic risk management.

To obtain more insights, we next focus on a class of models where the value function of the principal and the ODMD can be explicitly characterized.
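As a small illustration of how $(SP2)$ can be solved pointwise in time, suppose (as an assumption made here, matching the linear valuations of the LQ specialization) that $f_{P,p}(p) = \delta_P p$, $f_{A,p}(p) = \delta_A p$, and $\gamma = -1$. The integrand of $(SP2)$ is then linear in $p_t$ with coefficient $\delta_P - \delta_A - e^{-r(T-t)}$, and only the sign of this coefficient matters. The function names and parameter values below are hypothetical.

```python
import math

def sp2_coefficient(t, T, r, delta_P, delta_A):
    """Coefficient of p_t in the (SP2) integrand for linear valuations and gamma = -1."""
    return delta_P - delta_A - math.exp(-r * (T - t))

def switching_time(T, r, delta_P, delta_A):
    """Time in (0, T) at which the coefficient changes sign, or None if there is none."""
    gap = delta_P - delta_A
    if gap <= 0.0 or gap >= 1.0:
        return None                      # coefficient keeps one sign on [0, T)
    t_star = T + math.log(gap) / r       # solves exp(-r (T - t)) = gap
    return t_star if t_star > 0.0 else None

# Hypothetical parameters: horizon, discount rate, and payment valuations.
T, r = 10.0, 0.1
delta_P, delta_A = 0.9, 0.4
t_star = switching_time(T, r, delta_P, delta_A)
early = sp2_coefficient(0.0, T, r, delta_P, delta_A)   # sign before t_star
late = sp2_coefficient(T, T, r, delta_P, delta_A)      # sign at the horizon
```

A positive coefficient means that any positive payment increases the principal's cost, so $p_t = 0$ is optimal at that instant; a negative coefficient pushes $p_t$ to the upper boundary of $\mathcal{P}$.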
4.4 ODMD in LQ Setting

In the LQ setting, the cost functions take the forms $f_{A,E}(E_t) = \frac{1}{2} E_t^{\mathsf{T}} R_t E_t$ and $f_{A,p}(p_t) = \delta_A p_t$, where $R_t$ is a positive-definite $N \times N$ symmetric matrix and $\delta_A$ is a positive constant. Then we obtain

$$E_t^* = (f_{A,E}')^{-1}(\zeta_t) = R_t^{-1} \zeta_t. \tag{26}$$

Further, we consider $h_P(Y_T) = \rho^{\mathsf{T}} Y_T$, where $\rho \in \mathbb{R}_+^N$ maps the cyber risks to monetary loss, and $f_P(t, Y_t, p_t) = \rho^{\mathsf{T}} Y_t + \delta_P p_t$, where $\delta_P$ is a positive constant. In addition, $h_A(M_T) = -M_T$ and $\Sigma_t(Y_t) = D_t \cdot \mathrm{diag}(Y_t)$, where $D_t \in \mathbb{R}^{N \times N}$ and $\mathrm{diag}(\cdot)$ maps a vector to the diagonal matrix with that vector on its diagonal. The principal's problem becomes:

$$\min_{p_t \in \mathcal{P},\ \zeta_t} \mathbb{E}\left[\int_0^T e^{-rt} \left(\rho^{\mathsf{T}} Y_t + \delta_P p_t - e^{-r(T-t)} p_t\right) dt + e^{-rT} \left(\rho^{\mathsf{T}} Y_T - h_T\right)\right]$$

such that

$$\begin{aligned}
dY_t &= (A Y_t - R_t^{-1} \zeta_t)\,dt + D_t \cdot \mathrm{diag}(Y_t)\,dB_t, \quad Y_0 = y_0, \\
dh_t &= \left(r h_t - \frac{1}{2} \zeta_t^{\mathsf{T}} R_t^{-1} \zeta_t + \delta_A p_t\right) dt + \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t, \quad h_0 = J_A.
\end{aligned}$$

The principal aims to maximize $h_T$, which is equivalent to minimizing the agent's total revenue based on the relationship $h_T = -M_T$. The principal also accounts for the agent's participation constraint by setting $h_0 = W_0 = J_A$, ensuring that the cyber risk manager has sufficient incentive to fulfill the task.

Since

$$e^{-rT} h_T = h_0 - \int_0^T e^{-rs} \left(\frac{1}{2} \zeta_s^{\mathsf{T}} R_s^{-1} \zeta_s - \delta_A p_s\right) ds + \int_0^T e^{-rs} \zeta_s^{\mathsf{T}} D_s \cdot \mathrm{diag}(Y_s)\,dB_s,$$

and the stochastic integral has zero expectation, the principal's problem can be rewritten as:

$$\min_{p_t \in \mathcal{P},\ \zeta_t} \mathbb{E}\left[\int_0^T e^{-rt} \left(\rho^{\mathsf{T}} Y_t + (\delta_P - \delta_A) p_t - e^{-r(T-t)} p_t + \frac{1}{2} \zeta_t^{\mathsf{T}} R_t^{-1} \zeta_t\right) dt + e^{-rT} \rho^{\mathsf{T}} Y_T - J_A\right]$$

such that $dY_t = (A Y_t - R_t^{-1} \zeta_t)\,dt + D_t \cdot \mathrm{diag}(Y_t)\,dB_t$, $Y_0 = y_0$.

According to Theorem 3, the separation principle holds in the LQ case. To determine the optimal $p_t$, we solve the following unconstrained optimization problem:

$$\min_{p_t \in \mathcal{P}} \int_0^T e^{-rt} \left(\delta_P - \delta_A - e^{-r(T-t)}\right) p_t\,dt.$$

Depending on the values of the parameters $\delta_P$ and $\delta_A$, we obtain the following results.
If $\delta_P - \delta_A \ge 1$, there is no intermediate payment, i.e., $p_t = 0$, $\forall t \in [0, T]$. In this regime, the principal has a higher valuation on the monetary payment than the agent does. In other words, it is relatively hard to incentivize the agent to perform the risk management. When $\delta_P - \delta_A \le 0$, i.e., the principal focuses more on cyber risk reduction than on the expenditure for incentivizing the agent, the optimal $p_t$ is positively unbounded. However, in this regime, the terminal payment $c_T$ is negatively unbounded based on (22). This contract corresponds to the scenario where the risk manager receives a large amount of intermediate payment during the task and returns it to the principal after finishing the task, which is not practical. Under $0 < \delta_P - \delta_A < 1$, the intermediate compensation is either 0 or unbounded depending on the time index. Hence, to design a practical contract, we focus on the regime in which the intermediate payment is zero, and the risk manager receives a positive terminal payment $c_T$.

To obtain the optimal $\{\zeta_t^*\}_{0 \le t \le T}$, we assume that the process $\zeta_t$, $t \in [0, T]$, is non-anticipative, which can be verified after obtaining the solution $\zeta_t^*$. Then, the problem can be further simplified to:

$$\min_{\zeta_t} \mathbb{E}\left[\int_0^T e^{-rt} \left(\rho^{\mathsf{T}} Y_t + \frac{1}{2} \zeta_t^{\mathsf{T}} R_t^{-1} \zeta_t\right) dt + e^{-rT} \rho^{\mathsf{T}} Y_T - J_A\right]$$

such that $dY_t = (A Y_t - R_t^{-1} \zeta_t)\,dt + D_t \cdot \mathrm{diag}(Y_t)\,dB_t$, $Y_0 = y_0$.

The following theorem provides the optimal solution $\zeta_t^*$.

Theorem 4 In the LQ case, the optimal solution to the principal's problem is given by

$$\zeta_t^* = K_t, \tag{27}$$

where $K_t$ satisfies, and is the unique solution to,

$$\dot{K}_t + (A - rI)^{\mathsf{T}} K_t + \rho = 0, \quad K_T = \rho. \tag{28}$$

Furthermore, the minimum cost of the principal is given by

$$J_p^* = K_0^{\mathsf{T}} y_0 + m_0 - J_A, \tag{29}$$

where $m_0$ is obtained uniquely from

$$\dot{m}_t - r m_t - \frac{1}{2} K_t^{\mathsf{T}} R_t^{-1} K_t = 0, \quad m_T = 0. \tag{30}$$

Proof Without loss of generality, we solve the optimal control problem by ignoring the constant term $J_A$ in the cost function. The HJB equation

$$\min_{\zeta_t} \left[\frac{1}{2} \mathrm{tr}\left(\frac{\partial^2 V_p}{\partial Y_t^2} D_t \cdot \mathrm{diag}(Y_t) \cdot \mathrm{diag}(Y_t) D_t^{\mathsf{T}}\right) + \frac{\partial V_p}{\partial Y_t} \left(A Y_t - R_t^{-1} \zeta_t\right) + \rho^{\mathsf{T}} Y_t + \frac{1}{2} \zeta_t^{\mathsf{T}} R_t^{-1} \zeta_t\right] + \frac{\partial V_p}{\partial t} = r V_p, \quad V_p(T, Y_T) = \rho^{\mathsf{T}} Y_T, \tag{31}$$

yields the first-order condition $\zeta_t^* = \frac{\partial V_p}{\partial Y_t}$. Assume that the value function takes the form $V_p(t, Y) = \frac{1}{2} Y^{\mathsf{T}} S_t Y + K_t^{\mathsf{T}} Y + m_t$, where $S_t$ is an $N \times N$ symmetric matrix with continuously differentiable entries, $K_t$ is a continuously differentiable $N$-dimensional vector, and $m_t$ is a continuously differentiable function. Then, we obtain $\zeta_t^* = S_t Y_t + K_t$. Substituting $\zeta_t^*$ into the HJB equation yields

$$\begin{aligned}
&\frac{1}{2} \mathrm{tr}\left(S_t D_t \cdot \mathrm{diag}(Y_t) \cdot \mathrm{diag}(Y_t) D_t^{\mathsf{T}}\right) + (S_t Y_t + K_t)^{\mathsf{T}} \left(A Y_t - R_t^{-1} S_t Y_t - R_t^{-1} K_t\right) + \rho^{\mathsf{T}} Y_t + \frac{1}{2} (S_t Y_t + K_t)^{\mathsf{T}} R_t^{-1} (S_t Y_t + K_t) \\
&\quad = r \left(\frac{1}{2} Y_t^{\mathsf{T}} S_t Y_t + K_t^{\mathsf{T}} Y_t + m_t\right) - \frac{1}{2} Y_t^{\mathsf{T}} \dot{S}_t Y_t - \dot{K}_t^{\mathsf{T}} Y_t - \dot{m}_t, \quad V_p(T, Y_T) = \rho^{\mathsf{T}} Y_T.
\end{aligned} \tag{32}$$

Denote by $I$ the $N$-dimensional identity matrix and by $e_i$ the $N$-dimensional vector whose $i$-th element is 1 and the others are zero. Matching the coefficients in (32) further yields the following coupled ordinary differential equations (ODEs):

$$\dot{S}_t + S_t A + A^{\mathsf{T}} S_t - r S_t - S_t R_t^{-1} S_t + \frac{1}{2} \sum_{i=1}^N e_i e_i^{\mathsf{T}} D_t^{\mathsf{T}} S_t D_t = 0, \quad S_T = 0, \tag{33}$$

$$\dot{K}_t + (A - R_t^{-1} S_t - rI)^{\mathsf{T}} K_t + \rho = 0, \quad K_T = \rho, \tag{34}$$

$$\dot{m}_t - r m_t - \frac{1}{2} K_t^{\mathsf{T}} R_t^{-1} K_t = 0, \quad m_T = 0. \tag{35}$$

Here, (33) is a matrix Riccati equation. However, based on the terminal condition $S_T = 0$, we see that the unique solution to (33) is $S_t = 0$, $\forall t$. Therefore, a linear value function $V_p(t, Y) = K_t^{\mathsf{T}} Y + m_t$ is sufficient. Then, the ODEs (34) and (35) reduce to (28) and (30), respectively, which, being linear, admit unique solutions.
□

We then obtain the explicit form of the optimal dynamic contract in the subsequent lemma.

Lemma 4 In the LQ case, the optimal dynamic contract designed by the principal is given by

$$dc_t = \left(r c_t + \frac{1}{2} K_t^{\mathsf{T}} R_t^{-1} K_t\right) dt - K_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt + R_t^{-1} K_t\,dt\right) = \left(r c_t - \frac{1}{2} K_t^{\mathsf{T}} R_t^{-1} K_t\right) dt - K_t^{\mathsf{T}} \left(dY_t - A Y_t\,dt\right), \tag{36}$$

with $c_0 = -J_A > 0$, and $K_t$ is given by (28). The intermediate payment $p_t$ degenerates to zero, and the anticipated effort of the agent under the optimal contract is $E_t^* = R_t^{-1} K_t$.

Proof The result follows from Theorems 1, 4, and (22). □

Remark: As shown in Lemma 4, the cyber risk volatility $\Sigma_t(Y_t)$ does not impact the optimal dynamic contract design, since the principal's expected cost is linear in the systemic risk $Y_t$. When one of the functions $f_P$, $h_A$ and $h_P$ is not linear, the volatility $\Sigma_t(Y_t)$ will play a role in the contract design when solving the problem presented in Theorem 2.

Even though the optimal dynamic contract does not depend on the cyber risk volatility in the LQ case, the risk volatility influences the realized compensation during contract implementation.

Corollary 2 The terminal compensation of the risk manager has a larger variance when there are more complex interdependencies of risk uncertainties between nodes.

Corollary 2 will be further illustrated through case studies in Section 6.

5 Benchmark Scenario: Systemic Cyber Risk Management under Full Information

In the full-information case, the principal observes the efforts that the cyber risk manager implements. We first solve the team problem in which the agent cooperates with the principal. The principal's cost under the team optimal solution is the best that he can achieve. Then, we aim to design a dynamic contract mechanism under which the agent will adopt the same policy as the team optimal one. In the cooperative case, the contract only needs to guarantee the participation constraint.
Then, the principal's problem can be formulated as follows:

$$(O\text{-}B): \quad \min_{p_t \in \mathcal{P},\ c_T,\ E_t \in \mathcal{E}} \mathbb{E}\left[\int_0^T e^{-rt} f_P(t, Y_t, p_t)\,dt + e^{-rT} \left(c_T + h_P(Y_T)\right)\right]$$

such that

$$\begin{aligned}
dY_t &= A Y_t\,dt - E_t\,dt + \Sigma_t(Y_t)\,dB_t, \quad Y_0 = y_0, \\
J_A\big(\{E_t^*\}_{0 \le t \le T}; \{p_t\}_{0 \le t \le T}, c_T\big) &= J_A.
\end{aligned}$$

As in the asymmetric information scenario, it is more convenient to deal with the dynamics of the cyber risk manager's expected cost. By designing the contract, the principal only needs to ensure the participation of the agent. Then, the principal's problem can be rewritten as follows:

$$(O\text{-}B'): \quad \min_{p_t \in \mathcal{P},\ \zeta_t,\ E_t \in \mathcal{E}} \mathbb{E}\left[\int_0^T e^{-rt} \left(f_P(t, Y_t, p_t) - e^{-r(T-t)} p_t\right) dt + e^{-rT} \left(h_P(Y_T) + h_A^{-1}(h_T)\right)\right]$$

such that

$$\begin{aligned}
dY_t &= A Y_t\,dt - E_t\,dt + \Sigma_t(Y_t)\,dB_t, \quad Y_0 = y_0, \\
dh_t &= r h_t\,dt - f_A(t, p_t, E_t)\,dt + \zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t, \quad h_0 = J_A.
\end{aligned}$$

With full observation of $Y_t$ and $E_t$, $\zeta_t$ can be chosen freely, and $E_t$ can be seen as a control variable of the principal. Note that the IC constraint (21) does not enter into $(O\text{-}B')$. In addition, the equivalent terminal payment process $c_t$ admits the same form as (22). $(O\text{-}B')$ is a standard stochastic optimal control problem which can be solved efficiently.

To quantify the efficiency of the dynamic contract designed in Section 4, we have the following definition.

Definition 5 (Information Rent) Denote the solutions to $(O\text{-}A)$ and $(O\text{-}P)$ by $\{E_t^*\}_{0 \le t \le T}$ and $\{\{p_t^*\}_{0 \le t \le T}, c_T^*\}$, respectively. Further, denote the solution to $(O\text{-}B)$ by $\{\{p_t^b\}_{0 \le t \le T}, c_T^b, \{E_t^b\}_{0 \le t \le T}\}$. Then, the information rent is given by

$$IR = J_P\big(\{p_t^*\}_{0 \le t \le T}, c_T^*\big) - J_P\big(\{p_t^b\}_{0 \le t \le T}, c_T^b\big). \tag{37}$$

Intuitively, information rent quantifies the difference between the principal's costs with optimal mechanisms designed under incomplete and full information.
We have the following result on information rent.

Corollary 3 The optimal cost of the principal under full information is no larger than the one under asymmetric information. Hence, $IR \ge 0$.

Proof In contrast to the optimal $\{E_t^*\}_{0 \le t \le T}$ in $(O\text{-}P')$, the implemented effort $\{E_t^b\}_{0 \le t \le T}$ in $(O\text{-}B')$ does not depend on the variables $\zeta_t$ and $p_t$. Thus, $(O\text{-}B')$ admits a larger feasible solution space, which yields the result. □

5.1 LQ Setting: Certainty Equivalence Principle

To further characterize the optimal contracts under full information and quantify the information rent, we investigate a class of special scenarios. Specifically, we take the functions to have the same forms as in Section 4.4. The principal's problem can then be written as

$$\min_{p_t \in \mathcal{P},\ E_t \in \mathcal{E}} \mathbb{E}\left[\int_0^T e^{-rt} \left(\rho^{\mathsf{T}} Y_t + (\delta_P - \delta_A) p_t - e^{-r(T-t)} p_t + \frac{1}{2} E_t^{\mathsf{T}} R_t E_t\right) dt + e^{-rT} \rho^{\mathsf{T}} Y_T - J_A\right]$$

such that $dY_t = (A Y_t - E_t)\,dt + D_t \cdot \mathrm{diag}(Y_t)\,dB_t$, $Y_0 = y_0$.

Note that $\zeta_t$ does not appear in the optimization problem. However, $\zeta_t$ enters the designed contract (22) through the term $-\zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t$. In long-term contracting, when $T$ is relatively large, the expected value of $-\zeta_t^{\mathsf{T}} \Sigma_t(Y_t)\,dB_t$ is zero irrespective of $\zeta_t$. Hence, the principal can set $\zeta_t = 0$ to reduce the contract complexity.

Similar to the analysis in Section 4.4, we focus on the regime where the intermediate payment flow $p_t$ is zero, to avoid the unrealistic situation of a negative terminal payment. We obtain the following lemma characterizing the certainty equivalence principle.

Lemma 5 In the LQ setting, $IR = 0$, which reveals the certainty equivalence principle, i.e., the optimal contracts designed under incomplete information are as efficient as those designed under complete information.
Proof By letting $E_t$ play the role of $R_t^{-1}\zeta_t$, we see that the problem is reduced to the one in Section 4.4. Hence, the minimum cost of the principal in the full information case is the same as that under incomplete information. □

Remark: When the agent's terminal cost function $h_A$ is not linear, $h_A^{-1}(h_T)$ will not be linear in $h_T$. Thus, the decision variable $\zeta_t$ remains in the principal's objective function. Then, the contract design under full information becomes more efficient, as there is no dependency between $\zeta_t$ and $E_t$ introduced by the IC constraint.

In the LQ case, the team optimal contract is summarized as follows.

Lemma 6 In the LQ setting, the team optimal dynamic contract is

$$dc_t^b = \Big(r c_t^b + \tfrac{1}{2}K_t^T R_t^{-1} K_t\Big)dt,\quad E_t^b = R_t^{-1}K_t, \tag{38}$$

with $c_0^b = -J_A > 0$, and $K_t$ is given by (28). The intermediate payment is zero.

Proof The result follows immediately from Theorem 4 and (22) with $\zeta_t = 0$. □

The following lemma provides a mechanism that leads to implementation of the team optimal solution presented in Lemma 6 without forcing the agent to follow $E_t^b$.

Lemma 7 In the LQ setting, the implementable optimal dynamic contract designed by the principal under full information is

$$dc_t = \Big(r c_t - \tfrac{1}{2}K_t^T R_t^{-1} K_t + K_t^T E_t\Big)dt, \tag{39}$$

with $c_0 = -J_A > 0$ and $K_t$ given by (28). The intermediate payment is zero, and the agent's best response is $E_t = R_t^{-1}K_t$.

Proof Similar to the methodologies proposed in [6, 7, 12], we let the contract take the following form:

$$dc_t = \Big(r c_t + \tfrac{1}{2}K_t^T R_t^{-1} K_t\Big)dt + \Gamma_t^T\big(E_t - R_t^{-1}K_t\big)dt, \tag{40}$$

where $\Gamma_t$ is an $N$-dimensional vector to be determined. The second term $\Gamma_t^T(E_t - R_t^{-1}K_t)\,dt$ is introduced to penalize the agent when his action deviates from $R_t^{-1}K_t$. The agent solves his problem by responding to this announced contract from the principal.
Similar to $(O\text{-}A')$ and using $V^a(t, c_t) = h_A(c_t) = -c_t$, we obtain the corresponding HJB equation as

$$\min_{E_t}\left\{\frac{\partial V^a}{\partial c_t}\Big[r c_t + \tfrac{1}{2}K_t^T R_t^{-1} K_t + \Gamma_t^T\big(E_t - R_t^{-1}K_t\big)\Big] + f_A(t, p_t, E_t) + \frac{\partial V^a}{\partial t}\right\} = r V^a,\quad V^a(T, c_T) = -c_T.$$

The optimal solution of the agent is achieved at

$$E_t^o = \arg\min_{E_t}\Big\{-\Gamma_t^T E_t + \tfrac{1}{2}E_t^T R_t E_t\Big\},$$

which yields $E_t^o = R_t^{-1}\Gamma_t$. Based on Lemma 6, we choose $\Gamma_t = K_t$, and thus the agent implements the team optimal solution $E_t^b$. Further, (40) degenerates to the one in (38). □

Remark: In the LQ setting under full information and incomplete information, the optimal contract and the manager's behavior do not depend on the risk volatility $\Sigma_t(Y_t)$ of the network. The reason is that the cost function of the principal is linear in the systemic risk $Y_t$. Hence, the expectation of the risk volatility term is zero, and $\Sigma_t(Y_t)$ does not play a role in the optimal dynamic contract. This fact in turn corroborates the zero information rent in the LQ setting due to the removal of risk uncertainty.

A more general class of scenarios satisfying the certainty equivalence principle that leads to zero information rent is summarized as follows.

Corollary 4 When $f_P(t, \phi, p_t)$, $h_P(\phi)$ and $h_A(\phi)$ are linear in the argument $\phi$, then $IR = 0$, and the optimal contracts under full information and incomplete information coincide.

Proof The linearity of the functions removes the effects of risk uncertainties on the performance of the principal and the agent, which leads to a zero information rent. □

6 Case Studies

We demonstrate, in this section, the optimal design principles of dynamic contracts for systemic cyber risk management of enterprise networks through examples.
Specifically, we first utilize a case study with one node to show that the dynamic contracts can successfully mitigate the systemic risk over a long period of time. Then, we investigate an enterprise network with a set of interconnected nodes to reveal the network effects in systemic risk management through dynamic contracts and discover a distributed way of mitigating the systemic risks.

6.1 One-Node System Case

First, we consider a one-dimensional case in which the enterprise network contains only one node, i.e., $Y_t$ is a scalar. Therefore, the risk manager protects the system by directing the security resources to this node. Note that for the LQ setting, the coupled ODEs in Theorem 4 admit the unique solutions:

$$K_t = \frac{\rho}{A - r}\Big[(A - r + 1)e^{(A-r)(T-t)} - 1\Big],\quad m_t = \frac{K_t^2}{2 r R_t}\Big(e^{-r(T-t)} - 1\Big). \tag{41}$$

Therefore, based on Lemma 4, the optimal effort of the risk manager is

$$E_t^* = R_t^{-1}\zeta_t^* = \frac{\rho}{R_t (A - r)}\Big[(A - r + 1)e^{(A-r)(T-t)} - 1\Big], \tag{42}$$

and the optimal compensation becomes

$$dc_t = \Big(r c_t - \frac{K_t^2}{2R_t} + A K_t Y_t\Big)dt - K_t\,dY_t,\quad c_0 = -J_A. \tag{43}$$

If the risk manager accepts this optimal contract, then the principal's expected minimum cost is equal to $J_P^* = K_0^T y_0 + m_0 - J_A$.

To illustrate the optimal mechanism design, we choose specific values for the parameters in Section 4.4: $\rho = 5$ k\$/unit, $r = 0.3$, $R_t = 1.5$ k\$/unit$^2$, $T = 1$ year, $y_0 = 5$ unit, and $J_A = -10$ k\$. Figure 4 shows the results for varying values of the parameter $A$. Note that a larger $A$ indicates a single-node system that is more vulnerable and whose cyber risk is harder to mitigate. From Fig. 4, we find that with a larger $A$, the system requires more effort from the risk manager to bring the cyber risk down to a relatively low level. In all cases, the effort decreases as time increases, and finally converges to a positive constant $\rho / R_t$.
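The closed-form expressions (41) and (42) can be evaluated directly. The following is a minimal sketch, assuming the parameter values listed above and a constant $R_t = R$; the function names `gain_K` and `effort` and the sample values of $A$ are hypothetical choices for illustration, since the values of $A$ used in Fig. 4 are not stated in the text.

```python
import math

def gain_K(t, A, r, rho, T):
    """Scalar K_t from (41); assumes A != r so the division is defined."""
    a = A - r
    return rho / a * ((a + 1.0) * math.exp(a * (T - t)) - 1.0)

def effort(t, A, r, rho, R, T):
    """Optimal effort E*_t = K_t / R from (42), with constant R (assumption)."""
    return gain_K(t, A, r, rho, T) / R

# Parameter values taken from the one-node case study.
rho, r, R, T = 5.0, 0.3, 1.5, 1.0

# Hypothetical values of A, chosen only to illustrate the trend.
for A in (1.0, 2.0, 3.0):
    path = [effort(k * T / 4, A, r, rho, R, T) for k in range(5)]
    print(f"A = {A}: effort at t = 0, T/4, ..., T ->",
          [round(e, 3) for e in path])
```

In this sketch the effort is larger for larger $A$, decreases in $t$, and equals $\rho / R$ at $t = T$, since $K_T = \rho$ in (41).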
This phenomenon indicates that when the system risk is high, the agent should spend more effort in risk management. When the risk is reduced to a relatively low level and the system becomes secure, less effort is preferable as the risk will not grow. In addition, the corresponding terminal compensation $c_T$ increases with the amount of effort spent.

6.2 Network Case

We next investigate cyber risk management over enterprise networks and characterize the interdependencies between nodes. The unique solutions to the ODEs in Theorem 4 are then as follows:

$$K_t = \big[(A - rI)^T\big]^{-1}\Big[\big((A - rI)^T + I\big)e^{(A - rI)^T (T-t)} - I\Big]\rho, \tag{44}$$

$$m_t = \frac{K_t^T R_t^{-1} K_t}{2r}\Big(e^{-r(T-t)} - 1\Big). \tag{45}$$

The optimal effort of the risk manager is

$$E_t^* = R_t^{-1}\big[(A - rI)^T\big]^{-1}\Big[\big((A - rI)^T + I\big)e^{(A - rI)^T (T-t)} - I\Big]\rho,$$

and the optimal compensation follows (36).

We first consider a cyber network containing two connected nodes. The system parameters are chosen as $\rho = [5; 5]$ k\$/unit, $r = 0.3$, $R_t = [1.5, 0; 0, 1.5]$ k\$/unit$^2$, $T = 1$ year, $y_0 = [5; 5]$ unit, and $J_A = -10$ k\$. Moreover, we compare three scenarios in terms of network interdependencies. Specifically, we have case 1: $A = [2, 0.2; 0, 2]$, case 2: $A = [2, 0.5; 0, 2]$, and case 3: $A = [2, 0.8; 0, 2]$. Figure 5 shows the results, where we denote by $E_t^{i*}$ and $Y_t^i$ the effort and the corresponding risk of node $i$, $i = 1, 2$, respectively. Similar to the single-node case, both the effort and the systemic risk decrease over time. Specifically, the dynamic effort converges to $R_t^{-1}\rho$, which can be verified directly from the analytical expression. Comparing $E_t^{1*}$ with $E_t^{2*}$, we find that the risk manager should spend more effort on the nodes which can heavily influence other nodes.
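The two-node comparison can be reproduced numerically from (44). The following is a minimal sketch in plain Python, assuming a diagonal $R_t$ as in the example. Instead of forming the matrix inverse and matrix exponential explicitly, it uses the equivalent power series $K_t = \sum_{k \ge 0} M^k \rho\,\big(s^k/k! + s^{k+1}/(k+1)!\big)$ with $M = (A - rI)^T$ and $s = T - t$, which follows from expanding $e^{Ms} + M^{-1}(e^{Ms} - I)$. This series form is a substitution made here for convenience, and the function name `gain_K` and the truncation length `terms` are hypothetical.

```python
def gain_K(A, r, rho, s, terms=60):
    """K at time-to-go s = T - t, from the series form of (44).

    K = sum_k M^k rho * (s^k/k! + s^(k+1)/(k+1)!), with M = (A - r I)^T.
    """
    n = len(rho)
    # Build M = (A - r I)^T entry by entry.
    M = [[A[j][i] - (r if i == j else 0.0) for j in range(n)] for i in range(n)]
    v = list(rho)      # v holds M^k rho, starting with k = 0
    K = [0.0] * n
    coef = 1.0         # coef holds s^k / k!
    for k in range(terms):
        w = coef * (1.0 + s / (k + 1))   # s^k/k! + s^(k+1)/(k+1)!
        for i in range(n):
            K[i] += w * v[i]
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        coef *= s / (k + 1)
    return K

# Parameter values from the two-node case study; R_t is diagonal here.
rho, r, T = [5.0, 5.0], 0.3, 1.0
R_diag = [1.5, 1.5]

for label, a12 in (("case 1", 0.2), ("case 2", 0.5), ("case 3", 0.8)):
    A = [[2.0, a12], [0.0, 2.0]]
    for t in (0.0, T):
        K = gain_K(A, r, rho, T - t)
        E = [K[i] / R_diag[i] for i in range(2)]   # E*_t = R^{-1} K_t
        print(label, "t =", t, "effort =", [round(e, 3) for e in E])
```

In this sketch the effort on node 1 is identical across the three cases, while the effort on node 2 grows with the off-diagonal entry of $A$; at $t = T$ both efforts equal the entries of $R_t^{-1}\rho$.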
Even though there is no risk influence from node 1 to node 2, the optimal effort $E_t^{2*}$ increases as the influence strength from node 2 to node 1 becomes larger. This phenomenon is consistent with the idea of controlling the origin to constrain the propagation of cyber risks. Furthermore, the value of $E_t^{2*}$ indicates that a higher network connectivity requires more effort to mitigate the systemic cyber risk.

Fig. 4 (a) Effort, (b) systemic cyber risk, (c) cumulative payment. Panels (a), (b), and (c) show the effort, the cyber risk and the terminal payment under the optimal contract. The terminal compensation $c_T$ increases with the spent effort of the risk manager.

Fig. 5 (a) Effort, (b) systemic cyber risk, (c) cumulative payment. Panels (a), (b), and (c) show the effort, the systemic risk and the terminal payment under the optimal contract. Case 1: $A = [2, 0.2; 0, 2]$; Case 2: $A = [2, 0.5; 0, 2]$; Case 3: $A = [2, 0.8; 0, 2]$. A higher network connectivity requires more effort to mitigate the systemic cyber risk.

We next investigate a 4-node system where the network structures are shown in Fig. 6. The system parameters are the same as those in the 2-node case except for the matrix $A$. The diagonal entries in $A$ are all equal to 2 and the off-diagonal entries that correspond to a link are all equal to 0.2. Figure 7 shows the results under the optimal mechanism. The risk manager spends more effort on node 1 in cases 2 and 3 than in case 1, as the risk of node 1 can propagate to node 4 in the former two cases. Another key observation is that the amount of effort allocated to each node depends mainly on its risk influence on other nodes (the node's outer degree) rather than on the exogenous risks, yielding a self-accountable risk mitigation scheme. For example, even though node 4 impacts node 2 in case 3, the risk management efforts on node 2 are close in cases 2 and 3. A similar pattern can be seen on node 4 in cases 1 and 2. This observation provides a distributed method of risk management which reduces the complexity of decision-making by simplifying the network structures and classifying the nodes based on their outer degrees. By comparing the three cases, we also conclude that more complex cyber interdependencies induce a higher security investment cost on the principal.

Fig. 6 Three different structures of enterprise network. The risk influence strengths are the same, admitting a value of 0.2 in matrix $A$.

Fig. 7 (a) Effort, (b) systemic cyber risk, (c) cumulative payment. Panels (a), (b), and (c) show the effort, the systemic risk and the terminal payment under the optimal contract. Each node is self-accountable for its risk influence on others.

Note that in the above case studies, all variables were evaluated under the expectation with respect to the cyber risk uncertainty. As shown in Corollary 2, even though the expected compensation is independent of the network risk uncertainty, the actual compensation during contract implementation is influenced by the volatility term $\Sigma_t(Y_t)$. We present two scenarios in Fig. 8, where Fig. 8(a) and Fig. 8(b) are the compensation realizations under $\Sigma_t(Y_t) = I$ and $\Sigma_t(Y_t) = [1, 1, 0, 0; 0, 1, 1, 0; 0, 0, 1, 1; 1, 0, 0, 1]$, respectively. When the nodes' risks face more sources of uncertainty, as in Fig. 8(b), the corresponding payment exhibits a larger variance compared with the one in Fig. 8(a), which is consistent with the result of Corollary 2.

Fig. 8 (a) and (b) depict the optimal terminal payment under different risk volatility structures. The risk volatility of nodes is independent in (a), while the influence of risk volatility in (b) admits a cycle structure as in case 2 of Fig. 6. The results indicate that a larger interdependency of cyber risk volatility yields compensation schemes with a larger variance.

7 Conclusion

In this paper, we have addressed the problem of dynamic systemic cyber risk management of enterprise networks, where the principal provides contractual incentives to the manager, which include the compensations of direct cost of effort and indirect cost from risk uncertainties. This has involved a stochastic Stackelberg differential game with asymmetric information in a principal-agent setting. Under the optimal incentive compatible scheme we have designed, the principal has rational controllability of the systemic risk where the suggested and adopted efforts coincide, and the risk manager's behavior is strategically neutral, depending only on the current net cost. Under mild conditions, we have obtained a separation principle where the effort estimation and the remuneration design can be separately achieved. We have further revealed a certainty equivalence principle for a class of dynamic mechanism design problems where the information rent is equal to zero. Through case studies, we have identified the network effects in the systemic risk management where the connectivity and the node's outer degree play an important role in the decision making. Future work on this topic would consider cyber risk management of enterprise networks under Markov jump risk dynamics.

References

1. Acemoglu, D., Ozdaglar, A., Tahbaz-Salehi, A.: Systemic risk and stability in financial networks. American Economic Review 105(2), 564–608 (2015)
2. Athey, S., Segal, I.: An efficient dynamic mechanism. Econometrica 81(6), 2463–2485 (2013)
3. Aumann, R.J., Maschler, M., Stearns, R.E.: Repeated games with incomplete information. MIT Press (1995)
4. Başar, T.: An equilibrium theory for multiperson decision making with multiple probabilistic models. IEEE Transactions on Automatic Control 30(2), 118–132 (1985)
5. Bansal, R., Başar, T.: Stochastic teams with nonclassical information revisited: When is an affine law optimal? IEEE Transactions on Automatic Control 32(6), 554–559 (1987)
6. Başar, T.: Affine incentive schemes for stochastic systems with dynamic information. SIAM Journal on Control and Optimization 22(2), 199–210 (1984)
7. Başar, T.: Stochastic incentive problems with partial dynamic information and multiple levels of hierarchy. European Journal of Political Economy 5(2-3), 203–217 (1989)
8. Başar, T.: Stochastic differential games and intricacy of information structures. In: Dynamic Games in Economics, pp. 23–49. Springer, Berlin, Heidelberg (2014)
9. Başar, T., Bansal, R.: Optimum design of measurement channels and control policies for linear-quadratic stochastic systems. European Journal of Operational Research 73(2), 226–236 (1994)
10. Biais, B., Mariotti, T., Rochet, J.C., Villeneuve, S.: Large risks, limited liability, and dynamic moral hazard. Econometrica 78(1), 73–118 (2010)
11. Bisias, D., Flood, M., Lo, A.W., Valavanis, S.: A survey of systemic risk analytics. Annu. Rev. Financ. Econ. 4(1), 255–296 (2012)
12. Cansever, D.H., Başar, T.: On stochastic incentive control problems with partial dynamic information. Systems & Control Letters 6(1), 69–75 (1985)
13. Cardaliaguet, P.: Differential games with asymmetric information. SIAM Journal on Control and Optimization 46(3), 816–838 (2007)
14. Cardaliaguet, P., Rainer, C.: On a continuous-time game with incomplete information. Mathematics of Operations Research 34(4), 769–794 (2009)
15. Carmona, R., Fouque, J.P., Sun, L.H.: Mean field games and systemic risk. Communications in Mathematical Sciences 13(4), 911–933 (2015)
16. Charalambous, C.D.: The role of information state and adjoint in relating nonlinear output feedback risk-sensitive control and dynamic games. IEEE Transactions on Automatic Control 42(8), 1163–1170 (1997)
17. Chen, J., Touati, C., Zhu, Q.: A dynamic game approach to strategic design of secure and resilient infrastructure network. IEEE Transactions on Information Forensics and Security, To Appear (2019). DOI 10.1109/TIFS.2019.2924130
18. Chen, J., Touati, C., Zhu, Q.: Optimal secure two-layer IoT network design. IEEE Transactions on Control of Network Systems, To Appear (2019). DOI 10.1109/TCNS.2019.2906893
19. Chen, J., Zhu, Q.: Security as a service for cloud-enabled internet of controlled things under advanced persistent threats: a contract design approach. IEEE Transactions on Information Forensics and Security 12(11), 2736–2750 (2017)
20. Chen, J., Zhu, Q.: A linear quadratic differential game approach to dynamic contract design for systemic cyber risk management under asymmetric information. In: 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 575–582 (2018)
21. Chen, J., Zhu, Q.: Interdependent strategic security risk management with bounded rationality in the Internet of things. IEEE Transactions on Information Forensics and Security 14(11), 2958–2971 (2019)
22. Cherdantseva, Y., Burnap, P., Blyth, A., Eden, P., Jones, K., Soulsby, H., Stoddart, K.: A review of cyber security risk assessment methods for SCADA systems. Computers & Security 56, 1–27 (2016)
23. Cho, I.K., Kreps, D.M.: Signaling games and stable equilibria. The Quarterly Journal of Economics 102(2), 179–221 (1987)
24. Crowther, K.G., Haimes, Y.Y.: Application of the inoperability input–output model (IIM) for systemic risk assessment and management of interdependent infrastructures. Systems Engineering 8(4), 323–341 (2005)
25. Cvitanic, J., Zhang, J.: Contract Theory in Continuous-Time Models. Springer (2013)
26. Eisenberg, L., Noe, T.H.: Systemic risk in financial systems. Management Science 47(2), 236–249 (2001)
27. Elliott, M., Golub, B., Jackson, M.O.: Financial networks and contagion. American Economic Review 104(10), 3115–53 (2014)
28. Fouque, J.P., Langsam, J.A.: Handbook on Systemic Risk. Cambridge University Press (2013)
29. Garnier, J., Papanicolaou, G., Yang, T.W.: Diversification in financial networks may increase systemic risk. Handbook on Systemic Risk p. 432 (2013)
30. Gershkov, A., Moldovanu, B.: Dynamic Allocation and Pricing: A Mechanism Design Approach, vol. 9. MIT Press (2014)
31. Gupta, A., Langbort, C., Başar, T.: Dynamic games with asymmetric information and resource constrained players with applications to security of cyberphysical systems. IEEE Transactions on Control of Network Systems 4(1), 71–81 (2016)
32. Gupta, A., Nayyar, A., Langbort, C., Başar, T.: Common information based Markov perfect equilibria for linear–Gaussian games with asymmetric information. SIAM Journal on Control and Optimization 52(5), 3228–3260 (2014)
33. Hansen, E.A., Bernstein, D.S., Zilberstein, S.: Dynamic programming for partially observable stochastic games. In: AAAI, vol. 4, pp. 709–715 (2004)
34. James, M.R., Baras, J.: Partially observed differential games, infinite-dimensional Hamilton–Jacobi–Isaacs equations, and nonlinear H∞ control. SIAM Journal on Control and Optimization 34(4), 1342–1364 (1996)
35. James, M.R., Baras, J.S., Elliott, R.J.: Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems. IEEE Transactions on Automatic Control 39(4), 780–792 (1994)
36. Karatzas, I., Shreve, S.: Brownian Motion and Stochastic Calculus. Springer (2012)
37. Knowles, W., Prince, D., Hutchison, D., Disso, J.F.P., Jones, K.: A survey of cyber security management in industrial control systems. International Journal of Critical Infrastructure Protection 9, 52–80 (2015)
38. Kushner, H.J.: Numerical methods for stochastic control problems in continuous time. SIAM Journal on Control and Optimization 28(5), 999–1048 (1990)
39. Li, J., Ou, X., Rajagopalan, R.: Uncertainty and risk management in cyber situational awareness. In: Cyber Situational Awareness, pp. 51–68. Springer (2010)
40. Miura-Ko, R.A., Yolken, B., Bambos, N., Mitchell, J.: Security investment games of interdependent organizations. In: Annual Allerton Conference on Communication, Control, and Computing, pp. 252–260 (2008)
41. Nguyen, K.C., Alpcan, T., Başar, T.: Stochastic games for security in networks with interdependent nodes. In: IEEE Conference on Game Theory for Networks, pp. 697–703 (2009)
42. Pawlick, J., Chen, J., Zhu, Q.: iSTRICT: An interdependent strategic trust mechanism for the cloud-enabled Internet of controlled things. IEEE Transactions on Information Forensics and Security 14(6), 1654–1669 (2019)
43. Refsdal, A., Solhaug, B., Stølen, K.: Cyber-risk management. In: Cyber-Risk Management, pp. 33–47. Springer (2015)
44. Sannikov, Y.: A continuous-time version of the principal-agent problem. The Review of Economic Studies 75(3), 957–984 (2008)
45. Schättler, H., Sung, J.: The first-order approach to the continuous-time principal–agent problem with exponential utility. Journal of Economic Theory 61(2), 331–371 (1993)
46. Sicari, S., Rizzardi, A., Grieco, L.A., Coen-Porisini, A.: Security, privacy and trust in Internet of things: The road ahead. Computer Networks 76, 146–164 (2015)
47. Srikant, R., Başar, T.: Asymptotic solutions to weakly coupled stochastic teams with nonclassical information. IEEE Transactions on Automatic Control 37(2), 163–173 (1992)
48. Takabi, H., Joshi, J.B., Ahn, G.J.: Security and privacy challenges in cloud computing environments. IEEE Security & Privacy 8(6), 24–31 (2010)
49. Van Dijk, M., Juels, A., Oprea, A., Rivest, R.L.: Flipit: The game of stealthy takeover. Journal of Cryptology 26(4), 655–713 (2013)
50. Williams, N.: A solvable continuous time dynamic principal–agent model. Journal of Economic Theory 159, 989–1015 (2015)
51. Yong, J., Zhou, X.Y.: Stochastic controls: Hamiltonian systems and HJB equations, vol. 43. Springer (1999)
52. Yüksel, S., Başar, T.: Stochastic networked control systems: Stabilization and optimization under information constraints. In: Systems & Control: Foundations and Applications Series. Birkhäuser, Boston, MA (2013)
53. Zhu, Q., Başar, T.: A dynamic game-theoretic approach to resilient control system design for cascading failures. In: Proceedings of the 1st International Conference on High Confidence Networked Systems, pp. 41–46. ACM (2012)
54. Zhu, Q., Tembine, H., Başar, T.: Heterogeneous learning in zero-sum stochastic games with incomplete information. In: IEEE Conference on Decision and Control (CDC), pp. 219–224 (2010)
55. Zhu, Q., Yuan, Z., Song, J.B., Han, Z., Başar, T.: Interference aware routing game for cognitive radio multi-hop networks. IEEE Journal on Selected Areas in Communications 30(10), 2006–2015 (2012)