Nonzero-sum Adversarial Hypothesis Testing Games

We study nonzero-sum hypothesis testing games that arise in the context of adversarial classification, in both the Bayesian as well as the Neyman-Pearson frameworks. We first show that these games admit mixed strategy Nash equilibria, and then we examine some interesting concentration phenomena of these equilibria. Our main results are on the exponential rates of convergence of classification errors at equilibrium, which are analogous to the well-known Chernoff-Stein lemma and Chernoff information that describe the error exponents in the classical binary hypothesis testing problem, but with parameters derived from the adversarial model. The results are validated through numerical experiments.

Authors: Sarath Yasodharan (Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560 012, India; sarath@iisc.ac.in), Patrick Loiseau (Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG & MPI-SWS, 700 avenue Centrale, Domaine Universitaire, 38400 St Martin d'Hères, France; patrick.loiseau@inria.fr)

1 Introduction

Classification is a simple but important task that has numerous applications in a variety of domains such as computer vision or security. A traditional assumption used in the design of classification algorithms is that the input data is generated without knowledge of the classifier being used, and hence that the data distribution is independent of the classification algorithm. This assumption is no longer valid in the presence of an adversary, as an adversarial agent can learn the classifier and deliberately alter the data so that the classifier makes an error. This is the case in particular in security applications, where the classifier's goal is to detect the presence of an adversary from the data it observes.

Adversarial classification has been studied in two main settings.
The first focuses on adversarial versions of a standard classification task in machine learning, where the adversary attacks the classifier (defender/decision maker) by directly choosing vectors from a given set of data vectors; the second focuses on adversarial hypothesis testing, where the adversary (attacker) chooses a distribution from a set of distributions and independent data samples are generated from this distribution. The main differences of the latter framework from the former are that: (i) the adversary only gets to choose a distribution (rather than the actual attack vector) and data is generated independently from this distribution, and (ii) the defender makes a decision only once, after it observes a whole data sequence, instead of making a decision for each individual data sample it receives. Both of these frameworks have applications in a variety of domains, but prior literature has mainly focused on the first setting; see Section 1.1 for a description of the related literature.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

In this paper, we focus on the setting of adversarial hypothesis testing. To model the interaction between the attacker and defender, we formulate a nonzero-sum two-player game between the adversary and the classifier, where the adversary picks a distribution from a given set of distributions and data is generated independently from that distribution (a non-attacker always generates data from a fixed distribution). The defender, on his side, makes a decision based on the observation of n data points. Our model can also be viewed as a game-theoretic extension of the classical binary hypothesis testing problem where the distribution under the alternative hypothesis is chosen by an adversary.
Based on our game model, we are then able to extend to the adversarial setting the main results of the classical hypothesis testing problem (see Section 2) on the form of the best decision rule and on the rates of decrease of classification errors. More specifically, our contributions can be summarized as follows:

1. We propose nonzero-sum games to model adversarial hypothesis testing problems in a flexible manner.
2. We show the existence of mixed strategy Nash equilibria in which the defender employs certain likelihood ratio tests similar to those used in the classical binary hypothesis testing problem.
3. We show that the classification errors under all Nash equilibria of our hypothesis testing games decay exponentially fast in the number of data samples. We analytically obtain these error exponents, and it turns out that they are the same as those arising in a certain classical hypothesis testing problem, with parameters derived from the adversarial model.
4. We illustrate the results, in particular the importance of some assumptions, using simulations.

Throughout our analysis, an important difficulty lies in the fact that the strategy spaces of both players are uncountable; we believe, however, that this is an important feature of the model needed for it to be realistic.

1.1 Related Work

Adversarial classification and the security of machine learning have been studied extensively in the past decade, see e.g. [9, 21, 4, 15, 19, 24]; here we focus only on game-theoretic approaches to the problem. Note that, besides the adversarial learning problem, game theory has been successfully used to tackle several other security problems, such as the allocation of monitoring resources to protect targets, see e.g. [8, 18]. We review here only papers relating to classification.

A number of game-theoretic models have appeared in the past decade to study the adversarial classification problem in the classical setting of classification tasks in machine learning.
[9] studies the best response in an adversarial classification game, where the adversary is allowed to alter training data. A number of zero-sum game models were also proposed where the attacker is restricted in the amount of modifications he can make to the training set, see [17, 31, 30]. [6] studies the problem of choosing the best linear classifier in the presence of an adversary (a similar model is also studied in [7]) using a nonzero-sum game, and shows the existence of a unique pure strategy Nash equilibrium. Similar to our formulation, the strategy sets in this case are uncountable, and therefore showing the existence and uniqueness of the Nash equilibrium requires some work. However, in our formulation, a Nash equilibrium in pure strategies may not always exist, which makes the subsequent analysis of error exponents more difficult. [20] studies an adversarial classification game where the utilities of the players are defined using ROC curves. The authors study Nash equilibria for their model and provide numerical discretization techniques to compute the equilibria. [12] studies a nonzero-sum adversarial classification game where the defender has no restriction on the classifier, but the attacker is limited to a finite set of vectors. The authors show that the defender can, at equilibrium, use only a small subset of "threshold classifiers" and characterize the equilibrium through linear programming techniques. In our model, the utility functions share similarities with those of [12], but we work in the hypothesis testing framework and with uncountable action sets, which completely modifies the analysis. Several studies have appeared recently on "strategic classification", where the objective of the attacker(s) is to improve the classification outcome in his own direction, see [14, 11].

On the other hand, adversarial hypothesis testing has been studied by far fewer authors.
[2] studies a source identification game in the presence of an adversary, where the classifier needs to distinguish between two source distributions P_0 and P_1, and the adversary can corrupt samples from P_0 before they reach the classifier. They show that the game has an asymptotic Nash equilibrium when the number of samples becomes large, and compute the error exponent associated with the false negative probability. [3] and [29] study further extensions of this framework. A (non game-theoretic) hypothesis testing problem in an adversarial setting has been studied by [5], which is the closest to our work. There, there are two sets of probability distributions, and nature outputs a fixed number of independent samples generated using distributions from either one of these two sets. The goal of the classifier is to detect the true state of nature. The authors derive error exponents associated with the classification error, in both the Bayesian and Neyman-Pearson frameworks, using a worst-case maxmin analysis. Although we restrict to i.i.d. samples and let the non-attacker play a single distribution, we believe that our nonzero-sum game model with flexible utilities can better capture the interaction between adversary and classifier. There also exists extensive prior work within the statistics literature [16] on minimax hypothesis testing, which relates to our paper, but we defer a discussion of how our work differs from it until after we have exposed the details of our model. Game-theoretic models were also used to study adversarial classification in a sequential setting, see [28, 1, 22], but with very different techniques and results.

2 Basic Setup and Hypothesis Testing Background

In this section, we present the basic setup and results in classical binary hypothesis testing. Throughout the paper, we consider an alphabet set X that we assume finite.
In the classical hypothesis testing problem, we are given two distributions p and q, and a realization of a sequence of independent and identically distributed random variables X_1, ..., X_n, which are distributed as either p (under hypothesis H_0) or q (under hypothesis H_1). Our goal is to distinguish between the two alternatives:

H_0: X_1, X_2, ..., X_n i.i.d. ∼ p   versus   H_1: X_1, X_2, ..., X_n i.i.d. ∼ q.

In this setting, we could make two possible types of errors: (i) we declare H_1 whereas the true state of nature is H_0 (Type I error, or false alarm), and (ii) we declare H_0 whereas the true state of nature is H_1 (Type II error, or missed detection). Note that one can make one of these errors arbitrarily small at the expense of the other by always declaring H_0 or H_1. The trade-off between the two types of errors can be captured using two frameworks. If we have knowledge of the prior probabilities of the two hypotheses, then we can seek a decision rule that minimizes the average probability of error (this is the Bayesian framework). On the other hand, if we do not have any information on the prior probabilities, then we can fix ε > 0 and seek a decision rule that minimizes the Type II error among all decision rules whose Type I error is at most ε (this is the Neyman-Pearson framework). In both of these frameworks, it can be shown that the optimal test is a likelihood ratio test: given x^n = (x_1, ..., x_n), we compute the likelihood ratio q(x^n)/p(x^n) and compare it to a threshold to make a decision (with possible randomization at the boundary in the Neyman-Pearson framework). Here, p(x^n) (resp. q(x^n)) denotes the probability of observing the n-length word x^n under the distribution p (resp. q). See Sections II.B and II.D in [25] for an introduction to hypothesis testing.

For large enough n, by the law of large numbers, the fraction of occurrences of symbol i in an observation x^n is very close to p(i) (resp.
q(i)) under H_0 (resp. under H_1), for each i ∈ X. Therefore, one anticipates that the probability of a correct decision is very close to 1 for large enough n. Hence, one can study the rate at which the errors go to 0 as n becomes large. It is known that, under both frameworks, the error decays exponentially in n. In the Bayesian framework, the error exponent associated with the average probability of error is −Λ*_0(0), where Λ*_0(·) is the Fenchel-Legendre transform of the log-moment generating function of the random variable log(q(X)/p(X)) under H_0, i.e., when X ∼ p. In the Neyman-Pearson case, the error exponent associated with the Type II error is −D(p||q), where D is the relative entropy functional. These error exponents are known as the Chernoff information and the Chernoff-Stein lemma, respectively (see Section 3.4 in [10] for the analysis of error exponents).

In this work, we propose extensions of the classical hypothesis testing framework to an adversarial scenario modeled as a game, both in the Bayesian and in the Neyman-Pearson frameworks, and we investigate how the corresponding results are modified. Due to space constraints, we present only the model and results for the Bayesian framework in the main body of the paper. The corresponding analysis for the Neyman-Pearson framework follows similar ideas and is relegated to Appendix A. The proofs of all results presented in the paper (and in Appendix A) can be found in Appendix B.

3 Hypothesis Testing Game in the Bayesian Framework

In this section, we formulate a one-shot adversarial hypothesis testing game in the Bayesian framework, motivated by security problems where there might be an attacker who modifies the data distribution and a defender who tries to detect the presence of the attacker.
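As a concrete illustration of the two classical exponents recalled in Section 2, both quantities are directly computable on a finite alphabet. The following is a minimal sketch; the distributions p and q are arbitrary illustrative choices, not taken from the paper.

```python
import math

def kl(mu, nu):
    """Relative entropy D(mu || nu) in nats, for distributions on a finite alphabet."""
    return sum(m * math.log(m / v) for m, v in zip(mu, nu) if m > 0)

# Illustrative pair of distributions on X = {0, 1, 2} (values are arbitrary).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# Chernoff-Stein lemma: with the Type I error held below eps, the best achievable
# Type II error decays as exp(-n * D(p || q)).
stein_exponent = kl(p, q)
```

For the Bayesian exponent −Λ*_0(0) (the Chernoff information), one would additionally minimize the log-moment generating function of log(q(X)/p(X)) over the tilting parameter, as done in the sketch accompanying Theorem 4.1 below is not assumed here; this block only computes the Neyman-Pearson (Stein) exponent.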
Game-theoretic modeling of such problems has found great success in understanding the behavior of the agents via equilibrium analysis in many applications; see Section 1.1. We first present the model, and then elaborate on its motivations and on how it relates to related work in statistics.

3.1 Problem Formulation

Let X = {0, 1, ..., d−1} denote the alphabet set with cardinality d, and let M_1(X) denote the space of probability distributions on X. Fix n ≥ 1. The game is played as follows. There are two players: the external agent and the defender. The external agent can either be a non-attacker or an attacker. In the Bayesian framework, we assume that the external agent is an attacker with probability θ and a non-attacker (normal user) with probability 1−θ. The non-attacker is not strategic and does not have any adversarial objective. If the external agent is a non-attacker, she generates n samples independently from the distribution p. If the external agent is an attacker, she picks a distribution q from a set of distributions Q ⊆ M_1(X) and generates n samples independently from q. The defender, upon observing the n-length word generated by the external agent, wants to detect the presence of the attacker.

Throughout the paper, a decision rule implemented by the defender is denoted by ϕ: X^n → [0, 1], with the interpretation that ϕ(x^n) is the probability with which hypothesis H_1 is accepted (i.e., the presence of an adversary is declared) when the defender observes the n-length word x^n = (x_1, ..., x_n). We say that a decision rule ϕ is deterministic if ϕ(x^n) ∈ {0, 1} for all x^n ∈ X^n. To define the game, let the attacker's strategy set be Q ⊆ M_1(X), and that of the defender be Φ_n = {ϕ: X^n → [0, 1]}, the set of all randomized decision rules on n-length words.

To define the utilities, consider the attacker first.
We assume that there is a cost associated with choosing a distribution from Q, which we model using a cost function c: Q → R_+. The goal of the attacker is to fool the defender as much as possible, i.e., to maximize the probability that the defender classifies an n-length word as coming from the non-attacker when it is actually generated by the attacker. To capture this, the utility of the attacker when she plays the pure strategy q ∈ Q and the defender plays the pure strategy ϕ ∈ Φ_n is defined as

u^A_n(q, ϕ) = ∑_{x^n} (1 − ϕ(x^n)) q(x^n) − c(q),   (3.1)

where q(x^n) denotes the probability of observing the n-length word x^n when the symbols are generated independently from the distribution q.

For the defender, the goal is to minimize the classification error. As in the classical hypothesis testing problem, there are two types of errors: (i) the external agent is actually a non-attacker but the defender declares that there is an attack (Type I error, or false alarm), and (ii) the external agent is an attacker but the defender declares that there is no attack (Type II error, or missed detection). The goal of the defender is to minimize a weighted sum of the above two types of errors. After suitable normalization, we define the utility of the defender as

u^D_n(q, ϕ) = − ( ∑_{x^n} (1 − ϕ(x^n)) q(x^n) + γ ∑_{x^n} ϕ(x^n) p(x^n) ),   (3.2)

where γ > 0 is a constant that captures the exogenous probability of attack (i.e., θ), as well as the relative weights given to the error terms. We denote our Bayesian hypothesis testing game with utility functions (3.1) and (3.2) by G_B(d, n). With a slight abuse of notation, we denote by u^A_n(σ^A_n, σ^D_n) and u^D_n(σ^A_n, σ^D_n) the utilities of the players under a mixed strategy profile (σ^A_n, σ^D_n), where σ^A_n ∈ M_1(Q) and σ^D_n ∈ M_1(Φ_n).
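For small n, the utilities (3.1) and (3.2) can be evaluated exactly by enumerating X^n. The following minimal sketch does this on a binary alphabet; the particular distributions, the threshold rule, and the cost value are hypothetical illustrations, not parameters from the paper.

```python
import itertools

def word_prob(dist, word):
    """Probability of an n-length word under i.i.d. sampling from dist."""
    prob = 1.0
    for symbol in word:
        prob *= dist[symbol]
    return prob

def utilities(q, p, phi, n, gamma, cost):
    """Attacker utility (3.1) and defender utility (3.2) by enumeration over X^n."""
    d = len(p)
    miss = 0.0         # sum of (1 - phi(x^n)) q(x^n): the Type II error term
    false_alarm = 0.0  # sum of phi(x^n) p(x^n): the Type I error term
    for word in itertools.product(range(d), repeat=n):
        accept_h1 = phi(word)
        miss += (1.0 - accept_h1) * word_prob(q, word)
        false_alarm += accept_h1 * word_prob(p, word)
    u_attacker = miss - cost                      # equation (3.1)
    u_defender = -(miss + gamma * false_alarm)    # equation (3.2)
    return u_attacker, u_defender

# Hypothetical binary example: p = Bernoulli(0.5), attacker plays q = Bernoulli(0.8),
# defender declares H1 iff the word contains at least 4 ones (deterministic rule).
p, q, n, gamma = [0.5, 0.5], [0.2, 0.8], 5, 1.0
phi = lambda word: 1.0 if sum(word) >= 4 else 0.0
u_a, u_d = utilities(q, p, phi, n, gamma, cost=0.1)  # c(q) = 0.1 is hypothetical
```

For larger n one would estimate the two error terms by Monte Carlo instead of enumeration, since |X^n| = d^n grows exponentially.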
For our analysis of the game G_B(d, n), we will make use of the following assumptions:

(A1) Q is a closed subset of M_1(X), and p ∉ Q.
(A2) p(i) > 0 for all i ∈ X. Furthermore, for each q ∈ Q, q(i) > 0 for all i ∈ X.
(A3) c is continuous on Q, and there exists a unique q* ∈ Q such that q* = argmin_{q ∈ Q} c(q).
(A4) The point p is distant from the set Q relative to the point q*, i.e., {μ ∈ M_1(X) : D(μ||p) ≤ D(μ||q*)} ∩ Q = ∅, where D(μ||ν) = ∑_{i∈X} μ(i) log(μ(i)/ν(i)), for μ, ν ∈ M_1(X), denotes the relative entropy between the distributions μ and ν.

Note that (A1) and (A2) are very natural. In (A2), if p(i) = 0 for some i ∈ X and q(i) > 0 for some q ∈ Q, then the adversary will never pick q, as the defender can easily detect the presence of the attacker by looking for the element i. On the other hand, if p(i) = 0 and q(i) = 0 for all q ∈ Q, we may consider a new alphabet set without i. In (A3), continuity of the cost function c is natural, and we do not assume any extra condition other than the requirement that there is a unique minimizer. Assumption (A4) is used to show a certain property of the equilibrium strategy of the defender, which is later used in the study of the error exponents associated with the classification error. Specifically, Assumption (A4) is used in the proofs of Lemma 4.4, Lemma 4.5 and Theorem 4.1; all other results are valid without this assumption. We will further discuss the role of (A3) and (A4) in Section 4.3, after Theorem 4.1.

3.2 Model discussion

Our setting is that of adversarial hypothesis testing, where the attacker chooses a distribution and points are then generated i.i.d. according to it.
This is a reasonable model in applications such as multimedia forensics (where one tries to determine whether an attacker has tampered with an image from signals that can be modeled as random variables following an image-dependent distribution) or biometrics (where again one tries to detect from random signals whether the perceived signals come from the characteristics of a given individual or from tampered characteristics); see more details about these applications in [2, 3, 29]. In such applications, it is reasonable that different ways of tampering have different costs for the attacker, and that one can estimate those costs for a given application at least to some extent. Modeling the attacker's utility via a cost function is classical in other settings, for instance in adversarial classification [12, 28, 6], and experiments with real-world applications where a reasonable cost function can be estimated have been done, for instance, in [6].

Our setting is very similar to that of a composite hypothesis testing framework, where nature picks a distribution from a given set and generates independent samples from it. However, in such problems, one does not model a utility function for nature/the statistician, and one is often interested in the existence and properties of uniformly most powerful or locally most powerful tests (depending on the Bayesian or frequentist approach; see Section II.E in [25]). In contrast, here we explicitly model the utility functions of the agents and investigate the behavior at Nash equilibrium using very different analysis, which is more natural in adversarial settings where two rational agents interact. Our setting also coincides with the well-studied setting of minimax testing [16] when c(q) = 0 for all q ∈ Q (and hence every q is a minimizer of c). Note, however, that this case is not included in our model due to Assumption (A3); rather, we study the opposite extreme where c has a unique minimizer.
Our results are not an easy extension of the classical results, because our game is now a nonzero-sum game (whereas the minimax setting corresponds to a zero-sum game). We therefore cannot inherit any of the nice properties of zero-sum games; in particular, we cannot compute the Nash equilibrium (NE), and we instead have to prove properties of the NE (e.g., concentration) without being able to explicitly compute it. In fact, our results too are quite different, since we show that the error rate is the same as that of a simple test where H_1 would contain only q*, which differs from the classical minimax case.

Finally, in our model we fix the sample size n, i.e., the defender makes a decision only after observing all n samples. We restrict to this simpler setting since it has applications in various domains (see Section 1.1), and understanding the equilibrium of such games already leads to interesting and non-trivial results. We leave the study of a sequential model, where the defender has the flexibility to choose the number of samples for decision making, as future work.

4 Main Results

4.1 Mixed Strategy Nash Equilibrium for G_B(d, n)

We first examine the Nash equilibrium of G_B(d, n). Note that the strategy sets of both the attacker and the defender are uncountable; hence it is a priori not clear whether our game has a Nash equilibrium. Towards this, we equip the set Φ_n of all randomized decision rules with the sup-norm metric, i.e., d_n(ϕ_1, ϕ_2) = max_{x^n ∈ X^n} |ϕ_1(x^n) − ϕ_2(x^n)| for ϕ_1, ϕ_2 ∈ Φ_n. It is easy to see that the set Φ_n endowed with this metric is a compact metric space. We also equip M_1(X) with the usual Euclidean topology on R^d, and equip Q with the subspace topology. Also, for studying the mixed extension of the game, we equip the spaces M_1(Q) and M_1(Φ_n) with their corresponding weak topologies. Product spaces are always equipped with the corresponding product topology.
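Since X is finite, the sup-norm distance d_n between two decision rules is a maximum over the finitely many words in X^n and is directly computable. A small sketch; the two decision rules below are hypothetical illustrations.

```python
import itertools

def sup_norm_distance(phi1, phi2, d, n):
    """d_n(phi1, phi2) = max over all n-length words x^n of |phi1(x^n) - phi2(x^n)|."""
    return max(abs(phi1(w) - phi2(w))
               for w in itertools.product(range(d), repeat=n))

# Two simple deterministic rules on binary words of length 3 (illustrative):
always_accept = lambda w: 1.0                      # always declare H1
majority = lambda w: 1.0 if sum(w) >= 2 else 0.0   # declare H1 iff a majority of ones
dist = sup_norm_distance(always_accept, majority, d=2, n=3)
```

The two rules disagree on any word with fewer than two ones, so their distance is 1, the maximum possible value in this metric.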
We begin with a simple continuity property of the utility functions.

Lemma 4.1. Assume (A1)-(A3). Then, the utility functions u^A_n and u^D_n are continuous on Q × Φ_n.

We now show the main result of this subsection, namely the existence and a partial characterization of a NE for our hypothesis testing game.

Proposition 4.1. Assume (A1)-(A3). Then, there exists a mixed strategy Nash equilibrium for G_B(d, n). If (σ̂^A_n, σ̂^D_n) is a NE, then so is (σ̂^A_n, ϕ̂_n), where ϕ̂_n is the likelihood ratio test given by

ϕ̂_n(x^n) = 1, if q_{σ̂^A_n}(x^n) − γ p(x^n) > 0;
ϕ̂_n(x^n) = ϕ_{σ̂^D_n}(x^n), if q_{σ̂^A_n}(x^n) − γ p(x^n) = 0;   (4.1)
ϕ̂_n(x^n) = 0, if q_{σ̂^A_n}(x^n) − γ p(x^n) < 0,

where q_{σ̂^A_n}(x^n) = ∫ q(x^n) σ̂^A_n(dq) and ϕ_{σ̂^D_n}(x^n) = ∫ ϕ(x^n) σ̂^D_n(dϕ).

The existence of a NE follows from Glicksberg's fixed point theorem (see e.g. Corollary 2.4 in [26]); for the form of the defender's equilibrium strategy, we have to examine the utility function u^D_n.

Remark 4.1. Note that we have considered randomization over Φ_n to show the existence of a NE. Once this is established, we can then show the form of the strategy ϕ̂_n of the defender at equilibrium; the existence of a NE is not clear if we do not consider randomization over Φ_n.

Remark 4.2. Note that the distribution q_{σ̂^A_n} on X^n cannot necessarily be written as an n-fold product distribution of some element of M_1(X). Therefore, the test ϕ̂_n is slightly different from the usual likelihood ratio test that appears in the classical hypothesis testing problem, where samples are generated independently.

Remark 4.3. Apart from the conditions of the above proposition, a sufficient condition for the existence of a pure strategy Nash equilibrium is that the utilities are individually quasiconcave, i.e., u^A_n(·, ϕ) is quasiconcave for all ϕ ∈ Φ_n, and u^D_n(q, ·) is quasiconcave for all q ∈ Q.
However, it is easy to check that the Type II error term is not quasiconcave in the attacker's strategy, and hence the utility of the attacker is not quasiconcave. A pure strategy Nash equilibrium is therefore not guaranteed to exist; see the numerical experiments in Appendix A.

Remark 4.4. Proposition 4.1 does not provide any information about the structure of the attacker's strategy at a NE. We believe that obtaining the complete structure of a NE and computing it is a difficult problem in general, because the strategy spaces of both players are uncountable (and there is no pure-strategy NE in general), so we cannot use the standard techniques for finite games. However, we emphasize that we are able to obtain error exponents at an equilibrium (see Theorem 4.1) without explicitly computing the structure of a NE. Also, one could study the Stackelberg equilibrium of our game G_B(d, n) to help address computational issues, although we note that most of the security games literature using Stackelberg games assumes finite action spaces (see, for example, [18]); we do not address the study of Stackelberg equilibrium in this paper.

4.2 Concentration Properties of Equilibrium

We now study some concentration properties of the mixed strategy Nash equilibrium of the game G_B(d, n) for large n. The results in this section will be used later to show the exponential convergence of the classification error at equilibrium.

Let e_n denote the classification error, i.e., e_n(q, ϕ) = −u^D_n(q, ϕ), for q ∈ Q, ϕ ∈ Φ_n. We begin with the following lemma, which asserts that the error at equilibrium is small for large enough n.

Lemma 4.2. Assume (A1)-(A3). Let (σ̂^A_n, σ̂^D_n)_{n≥1} be a sequence such that, for each n ≥ 1, (σ̂^A_n, σ̂^D_n) is a mixed strategy Nash equilibrium for G_B(d, n). Then, e_n(σ̂^A_n, σ̂^D_n) → 0 as n → ∞.
The main idea in the proof is to let the defender play a decision rule whose acceptance set is a small neighborhood around the point p, and then bound e_n(σ̂^A_n, σ̂^D_n) using the error of this strategy.

We now show that the mixed strategy σ̂^A_n of the attacker converges weakly to the point mass at q* (denoted by δ_{q*}) as n → ∞. This is a consequence of the fact that q* is the minimizer of c, and hence, for large enough n, the attacker does not gain much by deviating from the point q*.

Lemma 4.3. Assume (A1)-(A3), and let (σ̂^A_n, σ̂^D_n)_{n≥1} be as in Lemma 4.2. Then, σ̂^A_n → δ_{q*} weakly as n → ∞.

Note that it is not clear from the above lemma that the equilibrium strategy σ̂^A_n of the attacker is supported on a small neighborhood around q* for large enough n. By playing a strategy q far from q*, the attacker could still achieve u^A_n(q, σ̂^D_n) = u^A_n(σ̂^A_n, σ̂^D_n), since the error term in u^A_n could compensate for the possible loss of utility from the cost term. We now proceed to show that this cannot happen under Assumption (A4). We first argue that the equilibrium error is small even when the attacker deviates from her equilibrium strategy.

Lemma 4.4. Assume (A1)-(A4), and let (σ̂^A_n, σ̂^D_n)_{n≥1} be as in Lemma 4.2. Then, sup_{q∈Q} e_n(q, σ̂^D_n) → 0 as n → ∞.

We are now ready to show the concentration of the attacker's equilibrium:

Lemma 4.5. Assume (A1)-(A4), and let (σ̂^A_n, σ̂^D_n)_{n≥1} be as in Lemma 4.2. Let (q_n)_{n≥1} be a sequence such that q_n ∈ supp(σ̂^A_n) for each n ≥ 1. Then, q_n → q* as n → ∞.

The above concentration phenomenon is a consequence of the uniqueness of q* and of Assumption (A4). The main idea in the proofs of Lemma 4.4 and Lemma 4.5 is essentially to show that, for large enough n, the acceptance region of H_0 under (any) mixed strategy Nash equilibrium does not intersect the set Q.
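Whether (A4) holds for a given triple (p, q*, Q) can be checked numerically. The sketch below does a simple grid search on a binary alphabet, identifying each distribution with its mass on the symbol 1, and uses the parameter choices of the two experiments in Section 5.

```python
import math

def kl(mu, nu):
    """Relative entropy D(mu || nu) in nats on a finite alphabet."""
    return sum(m * math.log(m / v) for m, v in zip(mu, nu) if m > 0)

def a4_holds(p1, qstar1, q_interval, grid=10001):
    """Check (A4) on a binary alphabet: no mu in Q may satisfy
    D(mu || p) <= D(mu || q*). Distributions are given by their mass on symbol 1,
    and Q is the interval q_interval of such masses."""
    lo, hi = q_interval
    for k in range(grid):
        mu1 = k / (grid - 1)
        if not (lo <= mu1 <= hi):
            continue  # only points of Q matter in the intersection
        mu = [1 - mu1, mu1]
        p = [1 - p1, p1]
        qs = [1 - qstar1, qstar1]
        if kl(mu, p) <= kl(mu, qs):
            return False  # a point of Q is at least as close (in D) to p as to q*
    return True

# Parameters of the two experiments in Section 5 (binary alphabet, p = 0.5):
first = a4_holds(0.5, 0.8, (0.7, 0.9))   # first experiment: (A4) holds
second = a4_holds(0.5, 0.9, (0.6, 0.9))  # second experiment: (A4) fails
```

On a binary alphabet the boundary {μ : D(μ||p) = D(μ||q*)} is a single point (around 0.66 for the first experiment, consistent with Section 5), so the grid search simply tests on which side of it the interval Q lies.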
If we do not assume (A4), then the decision region at equilibrium could intersect Q, and we may not have the concentration property of the above lemma (we still have the convergence property of Lemma 4.3, though, which does not use (A4)).

4.3 Error Exponents

With the results on the concentration properties of the equilibrium from the previous section, we are now ready to examine the error exponent associated with the classification error at equilibrium. Let Λ_0 denote the log-moment generating function of the random variable log(q*(X)/p(X)) under H_0, i.e., when X ∼ p:

Λ_0(λ) = log ∑_{i∈X} exp( λ log(q*(i)/p(i)) ) p(i),   λ ∈ R.

Define its Fenchel-Legendre transform Λ*_0(x) = sup_{λ∈R} {λx − Λ_0(λ)}, x ∈ R. Our main result in the paper (for the Bayesian case) is the following theorem.

Theorem 4.1. Assume (A1)-(A4), and let (σ̂^A_n, σ̂^D_n)_{n≥1} be as in Lemma 4.2. Then,

lim_{n→∞} (1/n) log e_n(σ̂^A_n, σ̂^D_n) = −Λ*_0(0).

Our approach to showing this result is to obtain asymptotic lower and upper bounds for the classification error at equilibrium, e_n(σ̂^A_n, σ̂^D_n). Since we do not have much information about the structure of the equilibrium, we first let one of the players deviate from their equilibrium strategy, so that we can estimate the error corresponding to the new pair of strategies, and then use these estimates to compute the error rate at equilibrium. The lower bound follows easily by letting the attacker play the strategy q* and using the error exponent of the classical hypothesis testing problem between p and q*. For the upper bound, we let the defender play a specific deterministic decision rule, and make use of the concentration properties of the attacker's equilibrium from Section 4.2.

Thus, we see that the error exponent is the same as that of the classical hypothesis testing problem of X_1, ..., X_n i.i.d. ∼ p versus X_1, ..., X_n i.i.d. ∼ q* (see Corollary 3.4.6 in [10]). That is, for large values of n, the adversarial hypothesis testing game is not much different, in terms of classification error, from the above classical setting (whose parameters are derived from the adversarial setting). We emphasize that we have not used any property of the specific structure of the mixed strategy Nash equilibrium in obtaining the error exponent associated with the classification error, and hence Theorem 4.1 is valid for any NE. We believe that obtaining the actual structure of a NE is a difficult problem, as the strategy spaces are infinite and the utility functions do not possess any monotonicity properties in general. For the numerical computation of error exponents in a simple case, see Section 5.

We conclude this section by discussing the role of Assumptions (A3) and (A4). We used (A4) to obtain the concentration of equilibrium in Lemma 4.5. Without this assumption, Theorem 4.1 is not valid; see Section 5 for numerical counter-examples. Also, in our setting, unlike classical minimax testing, it is not clear whether it is always true that the error goes to 0 as the number of samples becomes large, nor whether the attacker should always play a point close to q* at equilibrium. It could be that playing a point far from q* is better, if she can compensate the loss from c with the error term. In fact, this is what happens when (A4) is not satisfied, since there is then a partial overlap of the decision region of the defender with the set Q. Regarding (A3), when c has multiple minimizers, our analysis can only tell us that the equilibrium of the attacker is supported around the set of minimizers for large enough n; to study error exponents in such cases, one has to carry out a finer analysis characterizing the attacker's equilibrium.
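The exponent Λ*_0(0) in Theorem 4.1 can be evaluated by a one-dimensional search. A minimal pure-Python sketch for a binary alphabet: since Λ_0 is convex with Λ_0(0) = Λ_0(1) = 0, its infimum over R is attained in [0, 1], so a grid search over that interval suffices. The parameters below are those of the first experiment in Section 5.

```python
import math

def log_mgf(lmbda, p, qstar):
    """Lambda_0(lambda): log-MGF of log(q*(X)/p(X)) under X ~ p,
    i.e. log of sum_i p(i)^(1-lambda) * q*(i)^lambda."""
    return math.log(sum(pi * math.exp(lmbda * math.log(qi / pi))
                        for pi, qi in zip(p, qstar)))

def chernoff_exponent(p, qstar, grid=20001):
    """Lambda_0^*(0) = -inf_lambda Lambda_0(lambda), by grid search on [0, 1]."""
    return -min(log_mgf(k / (grid - 1), p, qstar) for k in range(grid))

# Binary setting of the first experiment in Section 5:
# p = Bernoulli(0.5), q* = Bernoulli(0.8).
exponent = chernoff_exponent([0.5, 0.5], [0.2, 0.8])
```

This yields a value close to the 0.054 reported in Section 5 (a direct computation gives roughly 0.053 nats; the small gap is a matter of rounding in the reported figure).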
All in all, using (A3) and (A4) allows us to establish interesting concentration properties of the equilibrium (which are not a priori clear) and the error exponents associated with the classification error without characterizing a NE; hence we believe that these assumptions serve as a good starting point.

5 Numerical Experiments

In this section, we present two numerical examples in the Bayesian formulation to illustrate the result in Theorem 4.1 and the importance of Assumption (A4). Due to space limitations, additional experiments in the Bayesian formulation are relegated to Appendix C; they illustrate (a) best response strategies of the players, (b) the existence of a pure strategy Nash equilibrium for large values of $n$ as suggested by Lemma 4.5, and (c) the importance of Assumption (A4).¹

We illustrate the result in Theorem 4.1 numerically in the following setting. We fix $\mathcal{X} = \{0, 1\}$ (i.e., $d = 2$); each probability distribution on $\mathcal{X}$ is represented by the probability that it assigns to the symbol $1$, and hence $M_1(\mathcal{X})$ is viewed as the unit interval. We fix $p = 0.5$. For numerical computations, we discretize the set $Q$ into 100 equally spaced points, and we only consider deterministic threshold-based decision rules for the defender. To compute a NE, we solve the linear programs associated with the attacker as well as the defender for the zero-sum equivalent game of $G_B(2, n)$. For the function $c(q) = |q - q^*|$ with $q^* = 0.8$, Figure 1(a) shows the error exponent at the NE computed by the above procedure as a function of the number of samples, from $n = 10$ to $n = 300$ in steps of 10. As suggested by Theorem 4.1, we see that the error exponents approach the value $\Lambda_0^*(0) = 0.054$ (the boundary of the decision region is around the point $q = 0.66$, and $D(q \| p) \approx D(q \| q^*) \approx 0.054$).

We now consider an example which demonstrates that the result on the error exponent in Theorem 4.1 may not be valid if Assumption (A4) is not satisfied.
In this experiment, we consider the case where $Q = [0.6, 0.9]$ and $q^* = 0.9$. Note that the present setting does not satisfy Assumption (A4). Figure 1(b) shows the error exponent at the equilibrium as a function of $n$, from $n = 100$ to $n = 400$ in steps of 100, for the cost function $c(q) = 3|q - q^*|$. From this plot, we see that the error exponents converge to somewhere around $0.032$, whereas $\Lambda_0^*(0) \approx 0.111$.

6 Concluding Remarks

In this paper, we studied hypothesis testing games that arise in the context of adversarial classification. We showed that, at equilibrium, the strategy of the classifier is to use a likelihood ratio test. We also examined the exponential rate of decay of the classification error at equilibrium and showed that it is the same as that of a classical testing problem with parameters derived from the adversarial model. Throughout the paper, we assumed that the alphabet $\mathcal{X}$ is finite. This is a reasonable assumption in applications that deal with digital signals such as image forensics (an important application for adversarial hypothesis testing); it is also a good starting point because even in this case, our analysis of the error exponents is nontrivial. Making $\mathcal{X}$ countable or uncountable would make the space $M_1(\mathcal{X})$ infinite dimensional, and the analysis of error exponents would become more difficult (e.g., the continuity of relative entropy, which we use crucially in our analysis, no longer holds in this case), but the case of a general state space $\mathcal{X}$ is an interesting future direction.

¹Appendix C also contains numerical experiments in the Neyman-Pearson formulation presented in Appendix A. The code used for our simulations is available at https://github.com/sarath1789/ahtg_neurips2019.

Figure 1: Error exponents as a function of $n$. (a) $Q = [0.7, 0.9]$, $c(q) = |q - 0.8|$; (b) $Q = [0.6, 0.9]$, $c(q) = 3|q - 0.9|$.
Finding the exact structure of the equilibrium for our hypothesis testing games is a challenging future direction. This will also shed some light on the error exponent analysis for the case when Assumption (A4) is not satisfied. Another interesting future direction is to examine the hypothesis testing game in the sequential detection context, where the defender can also decide the number of data samples used for classification. In such a setting, an important question is whether the optimal strategy of the classifier is to use a standard sequential probability ratio test.

Acknowledgments

The first author is partially supported by the Cisco-IISc Research Fellowship grant. The work of the second author was supported in part by the French National Research Agency (ANR) through the "Investissements d'avenir" program (ANR-15-IDEX-02) and through grant ANR-16-TERC0012; and by the Alexander von Humboldt Foundation.

References

[1] Bao, N., Kreidl, P., and Musacchio, J. (2011). Binary hypothesis testing game with training data. In Proceedings of the 2nd International Conference on Game Theory for Networks (GameNets), pages 265–280.

[2] Barni, M. and Tondi, B. (2013). The source identification game: An information-theoretic perspective. IEEE Transactions on Information Forensics and Security, 8(3):450–463.

[3] Barni, M. and Tondi, B. (2014). Binary hypothesis testing game with training data. IEEE Transactions on Information Theory, 60(8):4848–4866.

[4] Barreno, M., Nelson, B., Joseph, A. D., and Tygar, J. D. (2010). The security of machine learning. Machine Learning, 81(2):121–148.

[5] Brandão, F. G. S. L., Harrow, A. W., Lee, J. R., and Peres, Y. (2014). Adversarial hypothesis testing and a quantum Stein's lemma for restricted measurements. In Proceedings of the 5th Innovations in Theoretical Computer Science Conference (ITCS), pages 183–194.

[6] Brückner, M., Kanzow, C., and Scheffer, T. (2012).
Static prediction games for adversarial learning problems. The Journal of Machine Learning Research, 13(1):2617–2654.

[7] Brückner, M. and Scheffer, T. (2011). Stackelberg games for adversarial prediction problems. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 547–555.

[8] Chen, L. and Leneutre, J. (2009). A game theoretical framework on intrusion detection in heterogeneous networks. IEEE Transactions on Information Forensics and Security, 4(2):165–178.

[9] Dalvi, N., Domingos, P., Mausam, Sanghai, S., and Verma, D. (2004). Adversarial classification. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 99–108.

[10] Dembo, A. and Zeitouni, O. (2010). Large Deviations: Techniques and Applications. Springer-Verlag, Berlin Heidelberg, 2nd edition.

[11] Dong, J., Roth, A., Schutzman, Z., Waggoner, B., and Wu, Z. S. (2018). Strategic classification from revealed preferences. In Proceedings of the 2018 ACM Conference on Economics and Computation (EC), pages 55–70.

[12] Dritsoula, L., Loiseau, P., and Musacchio, J. (2017). A game-theoretic analysis of adversarial classification. IEEE Transactions on Information Forensics and Security, 12(12):3094–3109.

[13] Ferguson, T. S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York.

[14] Hardt, M., Megiddo, N., Papadimitriou, C., and Wootters, M. (2016). Strategic classification. In Proceedings of the 7th Innovations in Theoretical Computer Science Conference (ITCS), pages 111–122.

[15] Huang, L., Joseph, A. D., Nelson, B., Rubinstein, B. I. P., and Tygar, J. D. (2011). Adversarial machine learning. In Proceedings of the 2011 ACM Workshop on Artificial Intelligence and Security (AISec), pages 43–58.

[16] Ingster, Y. I. and Suslina, I. A. (2003).
Nonparametric Goodness-of-Fit Testing Under Gaussian Models, volume 169 of Lecture Notes in Statistics. Springer-Verlag, New York.

[17] Kantarcioglu, M., Xi, B., and Clifton, C. (2011). Classifier evaluation and attribute selection against active adversaries. Data Mining and Knowledge Discovery, 22(1):291–335.

[18] Korzhyk, D., Yin, D., Kiekintveld, C., Conitzer, V., and Tambe, M. (2011). Stackelberg vs. Nash in security games: An extended investigation of interchangeability, equivalence, and uniqueness. Journal of Artificial Intelligence Research, 41(2):297–327.

[19] Li, B. and Vorobeychik, Y. (2015). Scalable optimization of randomized operational decisions in adversarial classification settings. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), pages 599–607.

[20] Lisý, V., Kessl, R., and Pevný, T. (2014). Randomized operating point selection in adversarial classification. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pages 240–255.

[21] Lowd, D. and Meek, C. (2005). Adversarial learning. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD), pages 641–647.

[22] Lye, K. and Wing, J. M. (2005). Game strategies in network security. International Journal of Information Security, 4(1-2):71–86.

[23] Osborne, M. (2003). An Introduction to Game Theory. Oxford University Press, 1st edition.

[24] Papernot, N., McDaniel, P., Sinha, A., and Wellman, M. (2018). Towards the science of security and privacy in machine learning. In Proceedings of the 3rd IEEE European Symposium on Security and Privacy.

[25] Poor, H. (1994). An Introduction to Signal Detection and Estimation. Springer-Verlag, New York, 2nd edition.

[26] Reny, P. J. (2005). Non-cooperative games: Equilibrium existence.
In The New Palgrave Dictionary of Economics. Palgrave Macmillan.

[27] Shiryaev, A. N. (2016). Probability-1. Springer-Verlag, New York, 3rd edition.

[28] Soper, B. and Musacchio, J. (2015). A non-zero-sum, sequential detection game. In Proceedings of the Allerton Conference on Communication, Control, and Computing, pages 361–371.

[29] Tondi, B., Merhav, N., and Barni, M. (2019). Detection games under fully active adversaries. Entropy, 21(1):23.

[30] Zhou, Y. and Kantarcioglu, M. (2014). Adversarial learning with Bayesian hierarchical mixtures of experts. In Proceedings of the 2014 SIAM International Conference on Data Mining (SDM), pages 929–937.

[31] Zhou, Y., Kantarcioglu, M., Thuraisingham, B., and Xi, B. (2012). Adversarial support vector machine learning. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 1059–1067.

A Hypothesis Testing Game: Neyman-Pearson Formulation

In this section, we study the Neyman-Pearson version of the hypothesis testing problem. The presentation of the results in this section parallels the Bayesian case, and we use the same notation as in the Bayesian formulation.

A.1 Problem Formulation

In the Neyman-Pearson point of view, we do not assume any knowledge of the probability that the external agent is an attacker. Fix $\varepsilon > 0$. As before, the strategy set of the attacker is the set $Q$. For the defender, motivated by the Neyman-Pearson approach to the classical hypothesis testing problem, we define the strategy set as the set of all randomized decision rules on $n$-length words whose false alarm probability is at most $\varepsilon$, i.e.,
$$\Phi_n = \{ \phi : \mathcal{X}^n \to [0, 1] : P^{FA}_n(\phi) \leq \varepsilon \},$$
where $P^{FA}_n(\phi) = \sum_{x^n} \phi(x^n) p(x^n)$ denotes the false alarm probability under the decision rule $\phi$. We now define the utilities.
Similar to the Bayesian framework, the utility of the attacker is defined as
$$u^A_n(q, \phi) = \sum_{x^n} (1 - \phi(x^n)) q(x^n) - c(q). \quad \text{(A.1)}$$
Since we have already constrained the strategy set of the defender by imposing an upper bound on the Type I error, the utility of the defender is defined as
$$u^D_n(q, \phi) = -\sum_{x^n} (1 - \phi(x^n)) q(x^n), \quad \text{(A.2)}$$
which captures the Type II error. We denote our two-player Neyman-Pearson hypothesis testing game with utility functions (A.1) and (A.2) by $G_{NP}(\varepsilon, d, n)$. As in the Bayesian case, we will use Assumptions (A1)-(A4) throughout the analysis of the Neyman-Pearson case.

A.2 Mixed Strategy Nash Equilibrium for $G_{NP}(\varepsilon, d, n)$

We now examine the mixed strategy Nash equilibrium for the game $G_{NP}(\varepsilon, d, n)$. As in the Bayesian framework, we endow the set $Q$ with the standard Euclidean topology on $\mathbb{R}^d$ and the set $\Phi_n$ with the sup-norm metric. The following lemma asserts compactness of the strategy space of the defender.

Lemma A.1. The set $\Phi_n$ equipped with the metric $d_n$ is a compact metric space.

We also have joint continuity of the utility functions of both players, which can be proved similarly to Lemma 4.1.

Lemma A.2. Assume (A1)-(A3). Then, the utility functions $u^A_n$ and $u^D_n$ are continuous on $Q \times \Phi_n$.

We are now ready to show the existence of a mixed strategy Nash equilibrium.

Proposition A.1. Assume (A1)-(A3). Then, there exists a mixed strategy Nash equilibrium for $G_{NP}(\varepsilon, d, n)$. If $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ is a NE, then $\hat{\sigma}^D_n$ is the point mass at the most powerful $\varepsilon$-level Neyman-Pearson test for $X_1, \ldots, X_n$ i.i.d. $\sim p$ versus $(X_1, \ldots, X_n) \sim q_{\hat{\sigma}^A_n}$.

The existence of a mixed strategy equilibrium follows easily from the compactness of the strategy spaces and the continuity of the utility functions. To show the specific form of the defender's equilibrium strategy, we appeal to the Neyman-Pearson lemma.
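Concretely, in the binary case ($d = 2$, larger alternative $q > p$), the likelihood ratio is monotone in the number of 1's, so the most powerful $\varepsilon$-level test is a randomized threshold on that count. A minimal sketch of the classical construction (illustrative helper names; this is not the paper's code):

```python
import math

def np_threshold_test(n, p, eps):
    """Most powerful eps-level test of H0: Bernoulli(p)^n against a
    Bernoulli(q) alternative with q > p: reject H0 if the count k of 1's
    exceeds t, and randomize with probability rho when k == t."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    tail = 0.0  # P(K > t) under H0
    for t in range(n, -1, -1):
        if tail + pmf[t] > eps:
            rho = (eps - tail) / pmf[t]   # use up exactly the remaining level
            return t, rho
        tail += pmf[t]
    return 0, 1.0  # only reached if eps >= 1: always reject

def false_alarm(n, p, t, rho):
    """Type I error of the randomized threshold test (t, rho)."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    return sum(pmf[k] for k in range(t + 1, n + 1)) + rho * pmf[t]

t, rho = np_threshold_test(10, 0.5, 0.05)
```

The randomization at $k = t$ is what makes the false alarm probability exactly $\varepsilon$, matching the property $P^{FA}_n(\hat{\phi}_n) = \varepsilon$ used in the proof of Proposition A.1.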
Let $(\hat{\sigma}^A_n, \hat{\phi}_n)$ denote a NE given by Proposition A.1.

A.3 Characterization of the Equilibrium in the Binary Case

Consider the game $G_{NP}(\varepsilon, d, n)$ in the binary case, i.e., $d = 2$. Here, there are some interesting monotonicity properties of the utility functions that allow us to obtain a pure strategy Nash equilibrium for $G_{NP}(\varepsilon, 2, n)$, in which the defender plays a threshold-based test, i.e., declares the presence of an adversary whenever the number of 1's in the observation exceeds a threshold:

Lemma A.3. Assume (A1)-(A3). Then, the defender admits a strictly dominant strategy, and there exists a pure strategy Nash equilibrium for $G_{NP}(\varepsilon, 2, n)$.

Remark A.1. The monotonicity alluded to above is a consequence of the fact that $u^D_n$ captures only the Type II error. In the Bayesian framework, we do not have this monotonicity in $u^D_n$, due to the presence of both the Type I and Type II errors in $u^D_n$; hence the existence of a pure strategy Nash equilibrium in the binary case cannot be guaranteed in the Bayesian framework.

A.4 Concentration Properties of the Equilibrium

In this section, we study some concentration properties of the equilibrium. We have the following two lemmas, which can be proved similarly to the corresponding lemmas for the Bayesian formulation in Section 4.2.

Lemma A.4. Assume (A1)-(A3). Let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \geq 1}$ be a sequence such that, for each $n \geq 1$, $(\hat{\sigma}^A_n, \hat{\phi}_n)$ is a mixed strategy Nash equilibrium for $G_{NP}(\varepsilon, d, n)$. Then, $e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \to 0$ as $n \to \infty$.

Lemma A.5. Assume (A1)-(A3), and let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \geq 1}$ be as in Lemma A.4. Then $\hat{\sigma}^A_n \to \delta_{q^*}$ weakly as $n \to \infty$.

We also have that the error at equilibrium goes to 0 even when the attacker deviates from her equilibrium strategy.

Lemma A.6. Assume (A1)-(A4), and let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \geq 1}$ be as in Lemma A.4. Then, $\sup_{q \in Q} e_n(q, \hat{\phi}_n) \to 0$ as $n \to \infty$.
The main idea in the proof of the above lemma is to show that the acceptance region of $H_0$ under any equilibrium does not intersect the set $Q$. With this lemma at hand, we now have the following concentration property for the support of the attacker's equilibrium strategy, which can be proved similarly to Lemma 4.5 in the Bayesian formulation.

Lemma A.7. Assume (A1)-(A4), and let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \geq 1}$ be as in Lemma A.4. Let $(q_n)_{n \geq 1}$ be a sequence such that $q_n \in \mathrm{supp}(\hat{\sigma}^A_n)$ for each $n \geq 1$. Then, $q_n \to q^*$ as $n \to \infty$.

Remark A.2. Note that, in one dimension ($d = 1$), the acceptance region of an optimal Neyman-Pearson test for a fixed alternative $q$ will be a "vanishingly small neighborhood of the null distribution $p$"; while it can still intersect $Q$ for finite $n$, it may not for large enough $n$, so that Lemma A.6 may always hold. However, it is unclear how this might generalize to higher dimensions. Intuitively, in higher dimensions, the acceptance region may become close to $p$ only from certain directions. We also note that our proof of Lemma A.6 actually uses Assumption (A4) and not a weaker version of it (see the expression for $\Gamma_n$ in the proof of Lemma A.6). Therefore, we believe that (A4) is needed in higher dimensions even in the Neyman-Pearson case, although it is possible that a weaker assumption would suffice in one dimension.

A.5 Error Exponents

Our main result in the Neyman-Pearson formulation is the following theorem.

Theorem A.1. Assume (A1)-(A4), and let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \geq 1}$ be as in Lemma A.4. Then,
$$\lim_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\phi}_n) = -D(p \| q^*).$$
Again, we note that the error exponent is the same as that of the classical Neyman-Pearson hypothesis testing problem between $p$ and $q^*$.

B Proofs

B.1 Proof of Lemma 4.1

Since we are on a metric space, it suffices to show sequential continuity. Let $\{(q_k, \phi_k), k \geq 1\}$ be a sequence such that $(q_k, \phi_k) \to (q, \phi)$ as $k \to \infty$.
First, consider $u^D_n$. Notice that, for each $x^n$, we have $q_k(x^n) \to q(x^n)$ and $\phi_k(x^n) \to \phi(x^n)$ as $k \to \infty$. Therefore, $q_k(x^n) \phi_k(x^n) \to q(x^n) \phi(x^n)$, which yields
$$\lim_{k \to \infty} \sum_{x^n} q_k(x^n) \phi_k(x^n) = \sum_{x^n} q(x^n) \phi(x^n).$$
Similarly, we also have
$$\lim_{k \to \infty} \sum_{x^n} p(x^n) \phi_k(x^n) = \sum_{x^n} p(x^n) \phi(x^n).$$
Therefore, $u^D_n(q_k, \phi_k) \to u^D_n(q, \phi)$ as $k \to \infty$, which proves the continuity of the utility of the defender. Using similar arguments, together with the continuity of the cost function $c$ on $Q$, we see that $u^A_n(q_k, \phi_k) \to u^A_n(q, \phi)$ as $k \to \infty$, which shows the continuity of the utility of the attacker.

B.2 Proof of Proposition 4.1

$G_B(d, n)$ is a two-player game with compact strategy spaces. Also, by Lemma 4.1, the utilities (in pure strategies) of both the attacker and the defender are jointly continuous on $Q \times \Phi_n$. Therefore, an application of the Glicksberg fixed point theorem (see, for example, Corollary 2.4 in [26]) tells us that there exists a mixed strategy Nash equilibrium (NE) for the adversarial hypothesis testing game $G_B(d, n)$.

We now show the structure of the equilibrium strategy of the defender. Note that, for any $\phi \in \Phi_n$,
$$u^D_n(\hat{\sigma}^A_n, \phi) = -1 + \sum_{x^n} \phi(x^n) \left( q_{\hat{\sigma}^A_n}(x^n) - \gamma p(x^n) \right),$$
where $q_{\hat{\sigma}^A_n}(x^n) = \int q(x^n) \, \hat{\sigma}^A_n(dq)$. Therefore, using the characterization of a NE (see Proposition 140.1 in [23]), it follows that for any $\phi \in \mathrm{supp}(\hat{\sigma}^D_n)$, we have
$$\phi(x^n) = \begin{cases} 1, & \text{if } q_{\hat{\sigma}^A_n}(x^n) - \gamma p(x^n) > 0, \\ 0, & \text{if } q_{\hat{\sigma}^A_n}(x^n) - \gamma p(x^n) < 0. \end{cases}$$
Now, define $\hat{\phi}_n$ such that $\hat{\phi}_n(x^n) = \int \phi(x^n) \, \hat{\sigma}^D_n(d\phi)$ whenever $x^n$ is such that $q_{\hat{\sigma}^A_n}(x^n) - \gamma p(x^n) = 0$, and such that it satisfies the above condition when $x^n$ is such that $q_{\hat{\sigma}^A_n}(x^n) - \gamma p(x^n) \neq 0$. Consider the strategy profile $(\hat{\sigma}^A_n, \hat{\phi}_n)$ where the defender plays the pure strategy $\hat{\phi}_n$.
By the choice of $\hat{\phi}_n$, we see that $u^A_n(q, \hat{\phi}_n) = u^A_n(q, \hat{\sigma}^D_n)$ for all $q \in Q$, and $u^D_n(\hat{\sigma}^A_n, \phi) \leq u^D_n(\hat{\sigma}^A_n, \hat{\phi}_n)$ for any $\phi \in \Phi_n$. Therefore, using the characterization of a NE, we see that $(\hat{\sigma}^A_n, \hat{\phi}_n)$ is a NE. This completes the proof of the proposition.

B.3 Proof of Lemma 4.2

By Assumption (A1), there exists $\delta > 0$ such that $B(p, \delta) \cap Q = \emptyset$, where $B(p, \delta)$ denotes the open ball of radius $\delta$ centered at $p$. Let $\phi_\delta$ denote the deterministic decision rule whose rejection region is the set $B(p, \delta)^c$, i.e., $\phi_\delta(x^n) = 1$ whenever $P_{x^n} \in B(p, \delta)^c$ and $\phi_\delta(x^n) = 0$ otherwise, where $P_{x^n} \in M_1(\mathcal{X})$ denotes the type of $x^n$, i.e., $P_{x^n}(i) = \frac{1}{n} \sum_{k=1}^n 1\{x_k = i\}$. Since $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ is a Nash equilibrium and $e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) = -u^D_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$, we see that
$$e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \leq e_n(\hat{\sigma}^A_n, \phi_\delta), \quad \text{(B.1)}$$
where $(\hat{\sigma}^A_n, \phi_\delta)$ denotes the strategy profile in which the attacker plays the mixed strategy $\hat{\sigma}^A_n$ and the defender plays the pure strategy $\phi_\delta$. We now proceed to bound the error term $e_n(\hat{\sigma}^A_n, \phi_\delta)$. We have
$$e_n(\hat{\sigma}^A_n, \phi_\delta) = \int \left[ \sum_{x^n} (1 - \phi_\delta(x^n)) q(x^n) + \gamma \phi_\delta(x^n) p(x^n) \right] \hat{\sigma}^A_n(dq) = \int q(P_{x^n} \in B(p, \delta)) \, \hat{\sigma}^A_n(dq) + \gamma \, p(P_{x^n} \in B(p, \delta)^c).$$
We bound the first term using a simple upper bound on the probability of observing a given type under a given distribution (see, for example, Lemma 2.1.9 in [10]). Let $\mathcal{P}_n$ denote the set of all possible types of an $n$-length word. For any $q \in Q$, we have
$$q(P_{x^n} \in B(p, \delta)) \leq \sum_{\nu \in B(p, \delta) \cap \mathcal{P}_n} e^{-n D(\nu \| q)} \leq |B(p, \delta) \cap \mathcal{P}_n| \, e^{-n \inf_{\nu \in B(p, \delta)} D(\nu \| q)} \leq (n + 1)^d e^{-n \inf_{\nu \in B(p, \delta), q \in Q} D(\nu \| q)},$$
where the last inequality follows since $|\mathcal{P}_n| \leq (n + 1)^d$. Therefore,
$$e_n(\hat{\sigma}^A_n, \phi_\delta) \leq (n + 1)^d e^{-n \inf_{\nu \in B(p, \delta), q \in Q} D(\nu \| q)} + \gamma \, p(P_{x^n} \in B(p, \delta)^c).$$
The first term above goes to 0 as $n \to \infty$, since $\inf_{\nu \in B(p, \delta), q \in Q} D(\nu \| q) > 0$. Also, by the weak law of large numbers, $P_{x^n}$ converges to $p$ in probability under the null hypothesis $H_0$; therefore, $p(P_{x^n} \in B(p, \delta)^c) \to 0$. Hence, we conclude that $e_n(\hat{\sigma}^A_n, \phi_\delta) \to 0$ as $n \to \infty$. Combining this with (B.1) completes the proof of the lemma.

B.4 Proof of Lemma 4.3

From Lemma 4.2, we have $e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \to 0$ as $n \to \infty$. Since $u^A_n(q^*, \hat{\sigma}^D_n) \geq -c(q^*)$ and since $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ is a NE for all $n \geq 1$, it follows that
$$\int c(q) \, \hat{\sigma}^A_n(dq) \to c(q^*) \text{ as } n \to \infty.$$
Since $(\hat{\sigma}^A_n)_{n \geq 1}$ is a sequence of probability measures on the compact space $Q$, by Prohorov's theorem (see Theorem 1, Section 2 in Chapter 3 of [27]), there exists a weakly convergent subsequence, say $(n_k)_{k \geq 1}$. Let $\mu$ denote the weak limit of $(\hat{\sigma}^A_{n_k})_{k \geq 1}$. Then we have
$$c(q^*) = \lim_{k \to \infty} \int c(q) \, \hat{\sigma}^A_{n_k}(dq) = \int c(q) \, \mu(dq), \quad \text{(B.2)}$$
where the last equality follows from weak convergence. We now claim that $\mu = \delta_{q^*}$. Suppose not. Then there exists $\varepsilon > 0$ such that $\mu(B(q^*, \varepsilon)^c) > 0$. By Assumption (A3), for the above $\varepsilon$, there exists $\delta > 0$ such that $c(q) > c(q^*) + \delta$ whenever $q \in B(q^*, \varepsilon)^c$. Therefore,
$$\int_Q c(q) \, \mu(dq) = \int_{B(q^*, \varepsilon)} c(q) \, \mu(dq) + \int_{B(q^*, \varepsilon)^c} c(q) \, \mu(dq) \geq c(q^*) \mu(B(q^*, \varepsilon)) + (c(q^*) + \delta) \mu(B(q^*, \varepsilon)^c) = c(q^*) + \delta \mu(B(q^*, \varepsilon)^c) > c(q^*),$$
which contradicts (B.2). Therefore, $\mu(B(q^*, \varepsilon)^c) = 0$ for every $\varepsilon > 0$, and hence $\mu = \delta_{q^*}$. Since $\mu$ does not depend on the subsequence $(n_k)_{k \geq 1}$, the whole sequence $(\hat{\sigma}^A_n)_{n \geq 1}$ converges to $\delta_{q^*}$ (see Lemma 1, Section 3 in Chapter 3 of [27]). This completes the proof of the lemma.
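The type-probability upper bound used in the proof of Lemma 4.2 (Lemma 2.1.9 in [10]) can be checked numerically in the binary case, where the exact ball probability is a binomial sum. A minimal sketch, with illustrative numbers echoing Section 5 ($p = 0.5$, $q = 0.8$, $\delta = 0.1$, $d = 2$; these choices are an assumption, as are the helper names):

```python
import math

def kl(a, b):
    """D(Bernoulli(a) || Bernoulli(b)) in nats."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def type_ball_prob(n, q, p, delta):
    """Exact P(P_{x^n} in the closed ball B(p, delta)) when X_i ~ Bernoulli(q):
    sum the binomial pmf over the types k/n within delta of p."""
    return sum(math.comb(n, k) * q**k * (1 - q)**(n - k)
               for k in range(n + 1) if abs(k / n - p) <= delta)

p, q, delta, d = 0.5, 0.8, 0.1, 2   # ball and q disjoint, as (A1) requires
# D(.||q) is convex in its first argument, so its infimum over the ball is
# attained at the endpoint nearest q.
rate = min(kl(p - delta, q), kl(p + delta, q))
bounds_hold = all(
    type_ball_prob(n, q, p, delta) <= (n + 1)**d * math.exp(-n * rate)
    for n in (20, 50, 100, 200))
```

The polynomial factor $(n+1)^d$ makes the bound loose for small $n$, but it does not affect the exponential rate, which is all the proof needs.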
To prove Lemma 4.4, we need the following lemma, which asserts uniform convergence of integrals of the relative entropy functional with respect to the equilibrium strategy of the attacker.

Lemma B.1. Let $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)_{n \geq 1}$ be as in Lemma 4.2. Then,
$$\sup_{\mu \in M_1(\mathcal{X})} \left| \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) - D(\mu \| q^*) \right| \to 0 \text{ as } n \to \infty.$$

Proof. Fix $\varepsilon > 0$ and $\mu \in M_1(\mathcal{X})$. Then, using the uniform continuity of the relative entropy function on $M_1(\mathcal{X}) \times Q$, there exists $\delta > 0$ such that $\sup_{q \in Q} |D(\mu \| q) - D(\mu' \| q)| < \varepsilon$ for all $\mu' \in B(\mu, \delta)$. Therefore, for all $\mu' \in B(\mu, \delta)$ and all $n \geq 1$,
$$\left| \int D(\mu' \| q) \, \hat{\sigma}^A_n(dq) - \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) \right| \leq \int |D(\mu' \| q) - D(\mu \| q)| \, \hat{\sigma}^A_n(dq) \leq \varepsilon.$$
Also, using the weak convergence of $(\hat{\sigma}^A_n)_{n \geq 1}$ to the point mass at $q^*$, there exists $N_\mu \geq 1$ such that
$$\left| \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) - D(\mu \| q^*) \right| \leq \varepsilon \text{ for all } n \geq N_\mu.$$
Note that the sets $B(\mu, \delta)$, $\mu \in M_1(\mathcal{X})$, form an open cover of $M_1(\mathcal{X})$. By compactness of the space $M_1(\mathcal{X})$, extract a finite subcover $B(\mu_i, \delta)$, $1 \leq i \leq k$. Put $N = \max\{N_{\mu_1}, \ldots, N_{\mu_k}\}$. Then, for all $n \geq N$,
$$\left| \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) - D(\mu \| q^*) \right| \leq \left| \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) - \int D(\mu_i \| q) \, \hat{\sigma}^A_n(dq) \right| + \left| \int D(\mu_i \| q) \, \hat{\sigma}^A_n(dq) - D(\mu_i \| q^*) \right| + |D(\mu_i \| q^*) - D(\mu \| q^*)| \leq 3\varepsilon,$$
where $\mu_i$ is such that $\mu \in B(\mu_i, \delta)$. The result now follows since $\varepsilon$ and $\mu$ are arbitrary.

B.5 Proof of Lemma 4.4

Recall the decision rule $\hat{\phi}_n$ from Proposition 4.1. Note that if $H_0$ is accepted under $\hat{\phi}_n$ when the defender observes $x^n$, then we have $\frac{\int q(x^n) \, \hat{\sigma}^A_n(dq)}{p(x^n)} \leq \gamma$ (note that there could be randomization when equality holds). By Proposition 4.1, $(\hat{\sigma}^A_n, \hat{\phi}_n)$ is a Nash equilibrium, and $e_n(q, \hat{\sigma}^D_n) = e_n(q, \hat{\phi}_n)$ for all $q \in Q$.
Therefore it suffices to show that $\sup_{q \in Q} e_n(q, \hat{\phi}_n) \to 0$ as $n \to \infty$. Note that the acceptance region of $H_0$ under the decision rule $\hat{\phi}_n$ is type-based, i.e., for every $n$-length word $x^n$, $\hat{\phi}_n(x^n)$ depends only on $P_{x^n}$. Therefore, if $H_0$ is accepted when the defender observes $x^n$, the type $P_{x^n}$ must belong to the following subset of $M_1(\mathcal{X})$:
$$\left\{ P_{x^n} : \log \int \frac{q(x^n)}{p(x^n)} \, \hat{\sigma}^A_n(dq) \leq \log \gamma \right\}.$$
Define
$$\Gamma_n = \left\{ P_{x^n} : \int \log\left( \frac{q(x^n)}{p(x^n)} \right) \hat{\sigma}^A_n(dq) \leq \log \gamma \right\}.$$
Notice that, by Jensen's inequality, the acceptance region of $H_0$ under the decision rule $\hat{\phi}_n$ is a subset of the above set $\Gamma_n$. Also, it is easy to check that
$$\Gamma_n = \left\{ \mu \in M_1(\mathcal{X}) : D(\mu \| p) - \int D(\mu \| q) \, \hat{\sigma}^A_n(dq) \leq \frac{\log \gamma}{n} \right\} \cap \mathcal{P}_n.$$
We now show that the set $\Gamma_n$ does not intersect the set $Q$ for large enough $n$. First, notice that the set $\{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \leq D(\mu \| q^*)\}$ is closed in $M_1(\mathcal{X})$. Therefore, by Assumption (A4), there exists $\eta > 0$ such that $Q_\eta \cap \{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \leq D(\mu \| q^*)\} = \emptyset$, where $Q_\eta = \{\mu \in M_1(\mathcal{X}) : \inf_{q \in Q} \|\mu - q\| \leq \eta\}$ is the $\eta$-expansion of the set $Q$. We show that there exists $N \geq 1$ such that $Q_\eta \cap \Gamma_n = \emptyset$ for all $n \geq N$. Suppose not; then we can find a sequence $(\mu_n)_{n \geq 1}$ such that $\mu_n \in Q_\eta$ and $\mu_n \in \Gamma_n$ for all $n \geq 1$. Since $Q_\eta$ is compact, we can find a subsequence $(n_k)_{k \geq 1}$ along which $\mu_n$ converges; let $\mu = \lim_{k \to \infty} \mu_{n_k} \in Q_\eta$. Since $\mu_n \in \Gamma_n$ for all $n \geq 1$, using Lemma B.1, we see that $\mu$ satisfies $D(\mu \| p) \leq D(\mu \| q^*)$. This contradicts the fact that $Q_\eta$ does not intersect $\{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \leq D(\mu \| q^*)\}$, and hence there exists $N \geq 1$ such that $Q_\eta \cap \Gamma_n = \emptyset$ for all $n \geq N$. By the law of large numbers, we have $\sup_{q \in Q} q(P_{x^n} \notin B(q, \eta)) \to 0$ and $p(P_{x^n} \notin \Gamma_n) \to 0$ as $n \to \infty$. But notice that
$$e_n(q, \hat{\phi}_n) \leq q(P_{x^n} \in \Gamma_n) + \gamma \, p(P_{x^n} \notin \Gamma_n) \leq q(P_{x^n} \notin B(q, \eta)) + \gamma \, p(P_{x^n} \notin \Gamma_n)$$
for all $q \in Q$ and $n \geq N$.
Therefore,
$$\sup_{q \in Q} e_n(q, \hat{\phi}_n) \leq \sup_{q \in Q} q(P_{x^n} \notin B(q, \eta)) + \gamma \, p(P_{x^n} \notin \Gamma_n) \to 0 \text{ as } n \to \infty.$$

B.6 Proof of Lemma 4.5

Fix $\varepsilon > 0$. By Lemma 4.4, there exists $N_\varepsilon$ such that $e_n(q_n, \hat{\sigma}^D_n) \leq \varepsilon$ for all $n \geq N_\varepsilon$. Therefore, $u^A_n(q_n, \hat{\sigma}^D_n) \leq \varepsilon - c(q_n)$ for all $n \geq N_\varepsilon$. However, by playing the pure strategy $q^*$, the attacker's utility is $u^A_n(q^*, \hat{\sigma}^D_n) \geq -c(q^*)$ for all $n \geq 1$. Since $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ is a Nash equilibrium and $q_n \in \mathrm{supp}(\hat{\sigma}^A_n)$, we must have $u^A_n(q_n, \hat{\sigma}^D_n) \geq u^A_n(q^*, \hat{\sigma}^D_n)$ for all $n \geq N_\varepsilon$. That is, $c(q_n) \leq c(q^*) + \varepsilon$ for all $n \geq N_\varepsilon$. Thus, it follows that $c(q_n) \to c(q^*)$ as $n \to \infty$. Using the definition of $q^*$, we see that $q_n \to q^*$ as $n \to \infty$.

B.7 Proof of Theorem 4.1

First, we obtain the asymptotic lower bound. Towards this, we consider an equivalent zero-sum game for $G_B(d, n)$. For $q \in Q$ and $\phi \in \Phi_n$, define
$$u^{eq}_n(q, \phi) = \sum_{x^n} (1 - \phi(x^n)) q(x^n) + \gamma \sum_{x^n} \phi(x^n) p(x^n) - c(q).$$
Observe that, as far as the attacker is concerned, for any $\phi \in \Phi_n$, maximizing $u^A_n(\cdot, \phi)$ is the same as maximizing $u^{eq}_n(\cdot, \phi)$, as the extra term present in $u^{eq}_n$ does not depend on the attacker's strategy. Similarly, for any $q \in Q$, maximizing the defender's utility function $u^D_n(q, \cdot)$ is the same as minimizing $u^{eq}_n(q, \cdot)$, as the cost function $c$ does not depend on the defender's strategy. Therefore, we see that $G_B(d, n)$ is best-response equivalent to a two-player zero-sum game (with the attacker as the first player and the defender as the second player) with the above utility for the first player. Hence, any equilibrium of the original game is also an equilibrium of the zero-sum equivalent game with utility function $u^{eq}_n$ (see Definition 4 in [12] and the remarks before Theorem 2 there).
Consider the strategy profile $(q^*, \hat{\sigma}^D_n)$, i.e., the attacker plays the pure strategy $q^*$ and the defender plays the mixed strategy $\hat{\sigma}^D_n$ that comes from the equilibrium. By the definition of a Nash equilibrium and the equivalence of $G_B(d, n)$ with the above zero-sum game, we have
$$u^{eq}_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \geq u^{eq}_n(q^*, \hat{\sigma}^D_n), \quad \text{(B.3)}$$
where $u^{eq}_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ denotes the utility in the mixed extension of the equivalent zero-sum game. Define the deterministic decision rule $\phi^*_n$ by
$$\phi^*_n(x^n) = \begin{cases} 1, & \text{if } \frac{q^*(x^n)}{p(x^n)} \geq \gamma, \\ 0, & \text{otherwise}. \end{cases}$$
It is easy to see that $\phi^*_n$ minimizes $e_n(q^*, \cdot)$. Writing the probabilities $p(x^n)$ and $q^*(x^n)$ in terms of $P_{x^n}$, it is easy to check that the acceptance region of $\phi^*_n$ is given by
$$\Gamma^*_n = \left\{ \nu \in M_1(\mathcal{X}) \cap \mathcal{P}_n : D(\nu \| q^*) - D(\nu \| p) > -\frac{\log \gamma}{n} \right\},$$
i.e., $\phi^*_n(x^n) = 0$ whenever $P_{x^n} \in \Gamma^*_n$, and $\phi^*_n(x^n) = 1$ otherwise. Noting that $u^{eq}_n(q, \phi) = e_n(q, \phi) - c(q)$, (B.3) becomes
$$e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \geq \int \sum_{x^n} \left( (1 - \phi(x^n)) q^*(x^n) + \gamma \phi(x^n) p(x^n) \right) \hat{\sigma}^D_n(d\phi) - c(q^*) + \int c(q) \, \hat{\sigma}^A_n(dq) \geq \int \sum_{x^n} \left( (1 - \phi(x^n)) q^*(x^n) + \gamma \phi(x^n) p(x^n) \right) \hat{\sigma}^D_n(d\phi) \geq \sum_{x^n} \left( (1 - \phi^*_n(x^n)) q^*(x^n) + \gamma \phi^*_n(x^n) p(x^n) \right), \quad \text{(B.4)}$$
where the second inequality follows from the definition of $q^*$, and the last inequality follows from the optimality of $\phi^*_n$. The quantity on the right-hand side of the last inequality is the minimum Bayesian error for the following standard binary hypothesis testing problem: under the null hypothesis, each symbol in $x^n$ is generated independently from $p$, and under the alternative hypothesis, each symbol is generated independently from $q^*$. It is well known (see, for example, Corollary 3.4.6 in [10]) that
$$\liminf_{n \to \infty} \frac{1}{n} \log e_n(q^*, \phi^*_n) \geq -\Lambda_0^*(0),$$
and hence, from (B.3) and (B.4), it follows that
$$\liminf_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \geq -\Lambda_0^*(0).$$
(B.5)

We now proceed to show the upper bound. Define the decision rule $\phi^0_n$ by
$$\phi^0_n(x^n) = \begin{cases} 1, & \text{if } \frac{q^*(x^n)}{p(x^n)} \geq 1, \\ 0, & \text{otherwise}. \end{cases}$$
Similar to the decision rule $\phi^*_n$, the acceptance region of $\phi^0_n$ can be written as $\Gamma^0 = \{\nu \in M_1(\mathcal{X}) : D(\nu \| q^*) - D(\nu \| p) > 0\}$, i.e., $\phi^0_n(x^n) = 0$ if $P_{x^n} \in \Gamma^0$, and $\phi^0_n(x^n) = 1$ otherwise. By the definition of a Nash equilibrium, and noting that $u^D_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) = -e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$, we have
$$e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \leq e_n(\hat{\sigma}^A_n, \phi^0_n), \quad \text{(B.6)}$$
where $(\hat{\sigma}^A_n, \phi^0_n)$ denotes the strategy profile in which the attacker plays the mixed strategy $\hat{\sigma}^A_n$ that comes from the equilibrium and the defender plays the pure strategy $\phi^0_n$. We have
$$e_n(\hat{\sigma}^A_n, \phi^0_n) = \int \sum_{x^n} \left( (1 - \phi^0_n(x^n)) q(x^n) + \gamma \phi^0_n(x^n) p(x^n) \right) \hat{\sigma}^A_n(dq) = \int q(P_{x^n} \in \Gamma^0) \, \hat{\sigma}^A_n(dq) + \gamma \, p(P_{x^n} \in (\Gamma^0)^c).$$
Consider the first term. Using the upper bound on the probability of observing a given type under a given distribution (Lemma 2.1.9 in [10]), we have
$$q(P_{x^n} \in \Gamma^0) \leq (n + 1)^d e^{-n \inf_{\nu \in \Gamma^0} D(\nu \| q)}.$$
Fix $\varepsilon > 0$. Since the relative entropy is jointly uniformly continuous on $\Gamma^0 \times Q$, there exists $\delta > 0$ such that $D(\nu \| q) \geq D(\nu \| q^*) - \varepsilon$ for all $\nu \in \Gamma^0$ whenever $\|q - q^*\|_2 < \delta$. For the above $\delta$, by Lemma 4.5, there exists $N_\delta$ such that, for all $n \geq N_\delta$, $\|q - q^*\|_2 < \delta$ whenever $q \in \mathrm{supp}(\hat{\sigma}^A_n)$. Therefore, for all $n \geq N_\delta$ and $\nu \in \Gamma^0$, we have $D(\nu \| q) \geq D(\nu \| q^*) - \varepsilon$ for all $q \in \mathrm{supp}(\hat{\sigma}^A_n)$, and hence, for all $n \geq N_\delta$,
$$q(P_{x^n} \in \Gamma^0) \leq (n + 1)^d e^{-n \left( \inf_{\nu \in \Gamma^0} D(\nu \| q^*) - \varepsilon \right)} \text{ for all } q \in \mathrm{supp}(\hat{\sigma}^A_n).$$
For the second term, using Lemma 2.1.9 in [10], we have
$$p(P_{x^n} \in (\Gamma^0)^c) \leq e^{-n \inf_{\nu \in (\Gamma^0)^c} D(\nu \| p)}.$$
It can be easily shown (see, for example, Exercise 3.4.14(b) in [10]) that
$$\inf_{\nu \in \Gamma^0} D(\nu \| q^*) = \inf_{\nu \in (\Gamma^0)^c} D(\nu \| p) = \Lambda_0^*(0).$$
Hence, the above implies that
\[
\limsup_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \phi^0_n) \le -\Lambda^*_0(0) + \varepsilon.
\]
Letting $\varepsilon \to 0$, we get $\limsup_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \phi^0_n) \le -\Lambda^*_0(0)$. Therefore, from (B.6) and the above inequality, we have
\[
\limsup_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) \le -\Lambda^*_0(0). \tag{B.7}
\]
The theorem now follows from (B.5) and (B.7).

B.8 Proof of Lemma A.1

We show sequential compactness of $\Phi_n$. Let $(\phi_k)_{k \ge 1}$ be a sequence in $\Phi_n$, and let $x^n_1, \ldots, x^n_{2^n}$ denote the elements of $\mathcal{X}^n$. Since $\phi_k(x^n) \in [0,1]$ for all $x^n \in \mathcal{X}^n$, there exists a subsequence $(k^{(1)}_l)_{l \ge 1}$ along which $\phi_{k^{(1)}_l}(x^n_1)$ converges. We can then extract a further subsequence $(k^{(2)}_l)_{l \ge 1}$ of $(k^{(1)}_l)_{l \ge 1}$ along which $\phi_{k^{(2)}_l}(x^n_2)$ converges. Repeating this procedure $2^n$ times, we obtain a subsequence $(k_l)_{l \ge 1}$ along which $\phi_{k_l}(x^n)$ converges for all $x^n \in \mathcal{X}^n$. Put $\phi(x^n) = \lim_{l \to \infty} \phi_{k_l}(x^n)$, $x^n \in \mathcal{X}^n$. It is clear that $d_n(\phi_{k_l}, \phi) \to 0$ as $l \to \infty$, and we have
\[
P^{FA}_n(\phi) = \sum_{x^n} \phi(x^n)\, p(x^n) = \sum_{x^n} \lim_{l \to \infty} \phi_{k_l}(x^n)\, p(x^n) = \lim_{l \to \infty} \sum_{x^n} \phi_{k_l}(x^n)\, p(x^n) = \lim_{l \to \infty} P^{FA}_n(\phi_{k_l}) \le \varepsilon,
\]
since $P^{FA}_n(\phi_{k_l}) \le \varepsilon$ for all $l \ge 1$. This shows that the space $\Phi_n$ equipped with the metric $d_n$ is sequentially compact, and hence compact.

B.9 Proof of Proposition A.1

By Lemma A.1, the strategy space of the defender is compact. Also, the strategy space of the attacker is compact under the standard Euclidean topology on $\mathbb{R}^d$. By Lemma A.2, the utility functions of both players are jointly continuous. Therefore, by the Glicksberg fixed point theorem (see, for example, Corollary 2.4 in [26]), there exists a mixed strategy Nash equilibrium for the game $G(\varepsilon, d, n)$.

We now show the structure of the equilibrium strategy of the defender. Let $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ denote a mixed strategy Nash equilibrium of $G(\varepsilon, d, n)$.
By the property of Nash equilibrium, we have that $e_n(\hat{\sigma}^A_n, \phi) = e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ for all $\phi \in \operatorname{supp}(\hat{\sigma}^D_n)$. We claim that $P^{FA}_n(\phi) = \varepsilon$ for all $\phi \in \operatorname{supp}(\hat{\sigma}^D_n)$. If there exists $\phi \in \operatorname{supp}(\hat{\sigma}^D_n)$ with $P^{FA}_n(\phi) < \varepsilon$, then we can find $x^n_0 \in \mathcal{X}^n$ with $\phi(x^n_0) = 0$ and $\delta > 0$ such that the decision rule $\phi'$ defined by $\phi'(x^n) = \phi(x^n)$ for all $x^n \ne x^n_0$ and $\phi'(x^n_0) = \delta$ satisfies $P^{FA}_n(\phi') \le \varepsilon$ and $e_n(\hat{\sigma}^A_n, \phi') < e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$. This contradicts the fact that $(\hat{\sigma}^A_n, \hat{\sigma}^D_n)$ is a Nash equilibrium, which proves our claim.

But note that
\[
e_n(\hat{\sigma}^A_n, \hat{\sigma}^D_n) = \int \sum_{x^n} (1 - \phi(x^n))\, q(x^n)\, \hat{\sigma}^A_n(dq)\, \hat{\sigma}^D_n(d\phi) = \sum_{x^n} \int \left( \int (1 - \phi(x^n))\, q(x^n)\, \hat{\sigma}^A_n(dq) \right) \hat{\sigma}^D_n(d\phi) = \sum_{x^n} \int (1 - \phi(x^n))\, q_{\hat{\sigma}^A_n}(x^n)\, \hat{\sigma}^D_n(d\phi),
\]
where $q_{\hat{\sigma}^A_n} \in M_1(\mathcal{X}^n)$ is given by $q_{\hat{\sigma}^A_n}(x^n) = \int q(x^n)\, \hat{\sigma}^A_n(dq)$. That is, when the attacker plays the Nash equilibrium strategy $\hat{\sigma}^A_n$, the defender faces the problem of distinguishing between the two alternatives: (i) $(X_1, \ldots, X_n)$ is generated i.i.d. from $p$, versus (ii) $(X_1, \ldots, X_n)$ is generated from $q_{\hat{\sigma}^A_n}$. By the Neyman-Pearson lemma, there exists a Neyman-Pearson decision rule $\hat{\phi}_n \in \Phi_n$ such that $P^{FA}_n(\hat{\phi}_n) = \varepsilon$ and $e_n(\hat{\sigma}^A_n, \cdot)$ is minimized by $\hat{\phi}_n$ on $\Phi_n$. Since every $\phi \in \operatorname{supp}(\hat{\sigma}^D_n)$ minimizes $e_n(\hat{\sigma}^A_n, \cdot)$ and satisfies $P^{FA}_n(\phi) = \varepsilon$, and since each $x^n \in \mathcal{X}^n$ has positive probability of being observed under both $H_0$ and $H_1$, an application of the uniqueness part of the Neyman-Pearson lemma (see, for example, Section 5.1 in [13]) yields that $\hat{\sigma}^D_n = \delta_{\hat{\phi}_n}$. This completes the proof.

B.10 Proof of Lemma A.3

Recall the definition of a Neyman-Pearson decision rule.
In the binary case, since comparing the ratio $\frac{q(x^n)}{p(x^n)}$ to a threshold is the same as comparing the number of $1$'s in the $n$-length word $x^n$ to some other threshold, any Neyman-Pearson decision rule $\phi$ must necessarily be of the following form:
\[
\phi(x^n) = \begin{cases} 0, & \text{if } P_{x^n}(1) \in \{0, \frac{1}{n}, \ldots, \frac{k}{n}\}, \\ \pi, & \text{if } P_{x^n}(1) = \frac{k+1}{n}, \\ 1, & \text{if } P_{x^n}(1) \in \{\frac{k+2}{n}, \ldots, 1\}, \end{cases} \tag{B.8}
\]
for some $\pi \in [0,1]$ and $0 \le k \le n$. Here, $P_{x^n}(1)$ denotes the fraction of $1$'s in $x^n$. The false alarm probability of the above decision rule is
\[
P^{FA}_n(\phi) = p\left( P_{x^n}(1) \in \left\{ \tfrac{k+2}{n}, \ldots, 1 \right\} \right) + \pi\, p\left( P_{x^n}(1) = \tfrac{k+1}{n} \right).
\]
Since every $n$-length word $x^n$ has positive probability under the distribution $p$, there exist unique $k$ and $\pi$ such that $P^{FA}_n(\phi) = \varepsilon$. Let $\hat{\phi}_n$ denote this Neyman-Pearson decision rule. Then, by the Neyman-Pearson lemma (see, for example, Proposition II.D.1 in [25]), we have
\[
\hat{\phi}_n = \arg\max_{\phi \in \Phi_n} u^D_n(q, \phi) \quad \text{for all } q \in Q.
\]
Thus, the defender has a unique strictly dominant strategy. Using the continuity of $c$, and the continuity of the Type II error term in the attacker's strategy, we see that $u^A_n(\cdot, \hat{\phi}_n)$ is continuous on $Q$, and hence attains a maximum. Therefore, letting the attacker play a pure strategy $\hat{q}_n$ that maximizes $u^A_n(\cdot, \hat{\phi}_n)$ yields a pure strategy Nash equilibrium $(\hat{q}_n, \hat{\phi}_n)$.

To prove Lemma A.6, we need the following lemma, which can be proved similarly to Lemma B.1.

Lemma B.2. Let $(\hat{\sigma}^A_n, \hat{\phi}_n)_{n \ge 1}$ be as in Lemma A.4. Then
\[
\sup_{\mu \in M_1(\mathcal{X})} \left| \int D(\mu \| q)\, \hat{\sigma}^A_n(dq) - D(\mu \| q^*) \right| \to 0 \text{ as } n \to \infty.
\]
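The unique pair $(k, \pi)$ in (B.8) can be computed by scanning the binomial tail under $p$. The following is a minimal illustrative sketch (the function name and interface are ours, not from the paper): it raises an alarm when the number of $1$'s is at least $k+2$ and randomizes with probability $\pi$ when it equals $k+1$, so that the false alarm probability under i.i.d. Bernoulli($p$) samples is exactly $\varepsilon$.

```python
from math import comb

def np_threshold(p: float, n: int, eps: float):
    """Find the unique (k, pi) of the binary Neyman-Pearson rule (B.8):
    alarm when the count of 1's is >= k + 2, randomize with probability pi
    when the count equals k + 1, so that the false-alarm probability under
    n i.i.d. Bernoulli(p) samples is exactly eps."""
    # Binomial pmf under H0: p(count = j) = C(n, j) p^j (1 - p)^(n - j)
    pmf = [comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)]
    tail = 0.0  # running value of p(count >= j + 1), scanned from the top
    for j in range(n, -1, -1):
        if tail + pmf[j] > eps:
            # deterministic alarm region is {j + 1, ..., n}; randomize at j
            return j - 1, (eps - tail) / pmf[j]
        tail += pmf[j]
    return -1, 0.0  # eps >= 1: alarm on every word

k, pi = np_threshold(p=0.5, n=20, eps=0.1)
```

For instance, with $p = 0.5$, $n = 20$, $\varepsilon = 0.1$, the scan stops at the count whose tail first exceeds $0.1$, and the returned $\pi$ makes the false alarm probability exactly $0.1$.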
B.11 Proof of Lemma A.6

Let $\gamma_n$ denote the threshold and $\pi$ the randomization used in the decision rule $\hat{\phi}_n$, i.e., $\hat{\phi}_n$ is of the form
\[
\hat{\phi}_n(x^n) = \begin{cases} 1, & \text{if } \frac{q_{\hat{\sigma}^A_n}(x^n)}{p(x^n)} > \gamma_n, \\ \pi, & \text{if } \frac{q_{\hat{\sigma}^A_n}(x^n)}{p(x^n)} = \gamma_n, \\ 0, & \text{if } \frac{q_{\hat{\sigma}^A_n}(x^n)}{p(x^n)} < \gamma_n. \end{cases}
\]
We first claim that $\limsup_{n \to \infty} \gamma_n \le 1$. Since $P^{FA}_n(\hat{\phi}_n) = \varepsilon$, we have
\[
p\left( \frac{q_{\hat{\sigma}^A_n}(x^n)}{p(x^n)} \ge \gamma_n \right) \ge \varepsilon. \tag{B.9}
\]
But, using the formula for the probability of observing an $n$-length word under a given distribution (see, for example, Lemma 2.1.6 in [10]), we have
\[
q_{\hat{\sigma}^A_n}(x^n) = \int q(x^n)\, \hat{\sigma}^A_n(dq) = \int e^{-n (H(P_{x^n}) + D(P_{x^n} \| q))}\, \hat{\sigma}^A_n(dq), \qquad p(x^n) = e^{-n (H(P_{x^n}) + D(P_{x^n} \| p))}.
\]
Therefore,
\[
p\left( \frac{q_{\hat{\sigma}^A_n}(x^n)}{p(x^n)} \ge \gamma_n \right) = p\left( \int e^{n (D(P_{x^n} \| p) - D(P_{x^n} \| q))}\, \hat{\sigma}^A_n(dq) \ge \gamma_n \right) \le p\left( e^{n (D(P_{x^n} \| p) - \inf_{q \in Q} D(P_{x^n} \| q))} \ge \gamma_n \right).
\]
By Assumption (A1), we can choose $\delta > 0$ such that $D(\mu \| p) < \inf_{q \in Q} D(\mu \| q)$ for all $\mu \in B(p, \delta)$. Thus,
\[
p\left( e^{n (D(P_{x^n} \| p) - \inf_{q \in Q} D(P_{x^n} \| q))} \ge \gamma_n \right) = p\left( e^{n (D(P_{x^n} \| p) - \inf_{q \in Q} D(P_{x^n} \| q))} \ge \gamma_n,\ P_{x^n} \in B(p, \delta) \right) + p\left( e^{n (D(P_{x^n} \| p) - \inf_{q \in Q} D(P_{x^n} \| q))} \ge \gamma_n,\ P_{x^n} \notin B(p, \delta) \right).
\]
By the law of large numbers, $p(P_{x^n} \notin B(p, \delta)) \to 0$, and hence the second term above goes to $0$ as $n \to \infty$. Suppose that $\limsup_{n \to \infty} \gamma_n > 1$; then there exists a subsequence $(n_k)_{k \ge 1}$ such that $\gamma_{n_k} > 1$ for all $k \ge 1$. Along this subsequence, the first term above becomes
\[
p\left( e^{n_k (D(P_{x^{n_k}} \| p) - \inf_{q \in Q} D(P_{x^{n_k}} \| q))} \ge \gamma_{n_k},\ P_{x^{n_k}} \in B(p, \delta) \right) \le p\left( e^{n_k (D(P_{x^{n_k}} \| p) - \inf_{q \in Q} D(P_{x^{n_k}} \| q))} > 1,\ P_{x^{n_k}} \in B(p, \delta) \right),
\]
which goes to $0$ as $k \to \infty$ by the choice of $\delta$. This implies that $p\bigl( q_{\hat{\sigma}^A_n}(x^n) / p(x^n) \ge \gamma_n \bigr) \to 0$ along this subsequence, which contradicts (B.9). Therefore, we must have $\limsup_{n \to \infty} \gamma_n \le 1$.
We now argue that, for some $\eta > 0$, the acceptance set of $H_0$ under $\hat{\phi}_n$ does not intersect the set $Q_\eta$. Towards this, consider the set
\[
\Gamma_n = \left\{ P_{x^n} : \int \log \frac{q(x^n)}{p(x^n)}\, \hat{\sigma}^A_n(dq) \le \log \gamma_n \right\}.
\]
Notice that, by Jensen's inequality, the acceptance region of $H_0$ under the decision rule $\hat{\phi}_n$ is a subset of the above set $\Gamma_n$. Also, it is easy to check that
\[
\Gamma_n = \left\{ \mu \in M_1(\mathcal{X}) : D(\mu \| p) - \int D(\mu \| q)\, \hat{\sigma}^A_n(dq) \le \frac{\log \gamma_n}{n} \right\} \cap \mathcal{P}_n.
\]
We now show that $\Gamma_n$ does not intersect $Q_\eta$ for large enough $n$. First, notice that the set $\{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \le D(\mu \| q^*)\}$ is closed in $M_1(\mathcal{X})$. Therefore, by Assumption (A4), there exists $\eta > 0$ such that $\{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \le D(\mu \| q^*)\} \cap Q_\eta = \emptyset$. We show that there exists $N \ge 1$ such that $Q_\eta \cap \Gamma_n = \emptyset$ for all $n \ge N$. Suppose not; then we can find a sequence $(\mu_n)_{n \ge 1}$ such that $\mu_n \in Q_\eta$ and $\mu_n \in \Gamma_n$ for all $n \ge 1$. Since $Q_\eta$ is compact, we can find a subsequence $(n_k)_{k \ge 1}$ along which $\mu_{n_k}$ converges; let $\mu = \lim_{k \to \infty} \mu_{n_k} \in Q_\eta$. Since $\mu_n \in \Gamma_n$ for all $n \ge 1$, using Lemma B.2 and the fact that $\limsup_{n \to \infty} \gamma_n \le 1$, we see that $\mu$ satisfies $D(\mu \| p) \le D(\mu \| q^*)$. This contradicts the fact that $\{\mu \in M_1(\mathcal{X}) : D(\mu \| p) \le D(\mu \| q^*)\} \cap Q_\eta = \emptyset$. Hence, there exists $N \ge 1$ such that $Q_\eta \cap \Gamma_n = \emptyset$ for all $n \ge N$.

By the law of large numbers, we have $\sup_{q \in Q} q(P_{x^n} \notin B(q, \eta)) \to 0$ as $n \to \infty$. But notice that
\[
e_n(q, \hat{\phi}_n) \le q(P_{x^n} \in \Gamma_n) \le q(P_{x^n} \notin B(q, \eta)) \quad \text{for all } q \in Q \text{ and } n \ge N,
\]
since $B(q, \eta) \subseteq Q_\eta$ and $\Gamma_n \cap Q_\eta = \emptyset$ for $n \ge N$. Therefore,
\[
\sup_{q \in Q} e_n(q, \hat{\phi}_n) \le \sup_{q \in Q} q(P_{x^n} \notin B(q, \eta)) \to 0 \text{ as } n \to \infty.
\]

B.12 Proof of Theorem A.1

We proceed through similar steps as in the proof of Theorem 4.1. To show the lower bound, we let the attacker play the pure strategy $q^*$ instead of her equilibrium strategy $\hat{\sigma}^A_n$ for all $n \ge 1$.
Since $u^A_n(q, \phi) = e_n(q, \phi) - c(q)$, and since $(\hat{\sigma}^A_n, \hat{\phi}_n)$ is a Nash equilibrium for $G^{NP}(\varepsilon, d, n)$, we see that
\[
e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \ge e_n(q^*, \hat{\phi}_n) - c(q^*) + \int c(q)\, \hat{\sigma}^A_n(dq) \ge e_n(q^*, \hat{\phi}_n) \ge e_n(q^*, \phi^*_n),
\]
where $\phi^*_n$ denotes the best $\varepsilon$-level Neyman-Pearson test for distinguishing $p$ from $q^*$ from $n$ independent samples. Here, the second inequality follows from the definition of $q^*$, and the last inequality follows from the optimality of the Neyman-Pearson test $\phi^*_n$. Hence, using Stein's lemma (see, for example, Lemma 3.4.7 in [10]), we see that
\[
\liminf_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \ge -D(p \| q^*). \tag{B.10}
\]
We now show the upper bound. Fix $0 < \delta < 1$ such that $B(p, \delta) \cap Q = \emptyset$, and consider the deterministic decision rule $\phi_\delta$ with acceptance region $B(p, \delta)$, i.e., $\phi_\delta(x^n) = 0$ whenever $P_{x^n} \in B(p, \delta)$ and $\phi_\delta(x^n) = 1$ otherwise. To obtain the upper bound, we let the defender play the strategy $\phi_\delta$ for all $n \ge 1$. Since $(\hat{\sigma}^A_n, \hat{\phi}_n)$ is a Nash equilibrium, and $u^D_n(q, \phi) = -e_n(q, \phi)$, we have
\[
e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \le e_n(\hat{\sigma}^A_n, \phi_\delta) = \int q(P_{x^n} \in \Gamma_\delta)\, \hat{\sigma}^A_n(dq) \le \int (n+1)^d\, e^{-n \inf_{\nu \in \Gamma_\delta} D(\nu \| q)}\, \hat{\sigma}^A_n(dq), \tag{B.11}
\]
where $\Gamma_\delta = B(p, \delta)$ denotes the acceptance region of $\phi_\delta$, and the last inequality follows from the upper bound in Lemma 2.1.9 in [10]. By Lemma A.7 and the uniform continuity of $D(\cdot \| \cdot)$ on $\Gamma_\delta \times Q$, there exists $N_\delta \ge 1$ such that $D(\nu \| q) \ge D(\nu \| q^*) - \delta$ for all $\nu \in \Gamma_\delta$, $q \in \operatorname{supp}(\hat{\sigma}^A_n)$ and $n \ge N_\delta$. Therefore, (B.11) implies that
\[
\limsup_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \le -\inf_{\nu \in \Gamma_\delta} D(\nu \| q^*) + \delta.
\]
Letting $\delta \to 0$ and using the continuity of $D(\cdot \| q^*)$ on $M_1(\mathcal{X})$, we get
\[
\limsup_{n \to \infty} \frac{1}{n} \log e_n(\hat{\sigma}^A_n, \hat{\phi}_n) \le -D(p \| q^*). \tag{B.12}
\]
The result now follows from (B.10) and (B.12).

Figure 2: Best response plots for $c(q) = |q - 0.8|$. (a) $n = 200$. (b) $n = 250$.

Figure 3: Finer plots of the attacker revenue around $q^*$ for specific defender thresholds. (a) $c(q) = |q - 0.8|$, $n = 250$, defender plays threshold 166. (b) $c(q) = (q - 0.8)^2$, $n = 800$, defender plays threshold 529.

C Additional Numerical Experiments

C.1 Bayesian Formulation

As explained in Section 5, we fix $\mathcal{X} = \{0, 1\}$ and $p = 0.5$. For numerical computations, we discretize the set $Q$ into 100 equally spaced points, and we consider only deterministic threshold-based decision rules for the defender.

We first examine the best responses of the players. We fix $Q = [0.7, 0.9]$ and the cost function $c(q) = |q - q^*|$ where $q^* = 0.8$. Figure 2(a) shows the best responses of the players for $n = 200$. The $x$-axis shows the strategy space of the attacker and the $y$-axis shows the defender's threshold. The blue curve plots the best response of the defender, and the red curve plots the best response of the attacker for 20 thresholds around the best-response threshold corresponding to $q^*$. As we see from the figure, the two curves do not intersect (the best threshold against $0.8$ is $133$, whereas the best value of $q$ against threshold $133$ is $0.7$), which suggests that there is no pure strategy equilibrium in this case. Figure 2(b) plots the best response curves for $n = 250$. We see that the two curves intersect (the point of intersection is where the attacker plays $0.8$ and the defender plays the threshold $166$). However, this does not by itself mean that there is a pure equilibrium, as our discretization may not capture the exact value of the attacker's strategy. To check whether this is the case, we plot the attacker revenue when the defender plays the threshold $166$ over a finer grid around $q^*$ ($1000$ equally spaced points on an interval of length $1/(100 n)$ around $q^*$), which is shown in Figure 3(a). From this, we observe that the maximum of the attacker's utility is indeed attained at the point $q = 0.8$.
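The best-response computations described above can be sketched in a few lines. The exact weighting of the two error types in the Bayesian error $e_n$ is not restated in this appendix, so the sketch below assumes an equally weighted sum of false-alarm and miss probabilities for the defender, and utility "miss probability minus cost" for the attacker; all function names are illustrative, not from the paper.

```python
from math import comb

def binom_cdf(theta: float, n: int, t: int) -> float:
    """P(count <= t) when each of n bits is 1 with probability theta."""
    return sum(comb(n, j) * theta**j * (1 - theta)**(n - j)
               for j in range(t + 1))

def defender_best_response(q: float, n: int, p: float = 0.5) -> int:
    """Threshold minimizing the (assumed equally weighted) sum of
    false-alarm and miss probabilities against a fixed attacker q."""
    return min(range(n + 1),
               key=lambda t: (1 - binom_cdf(p, n, t)) + binom_cdf(q, n, t))

def attacker_best_response(t: int, n: int, cost, grid) -> float:
    """Attacker utility (assumed): miss probability minus cost c(q)."""
    return max(grid, key=lambda q: binom_cdf(q, n, t) - cost(q))

n = 200
grid = [0.7 + 0.2 * i / 99 for i in range(100)]  # 100 points in [0.7, 0.9]
cost = lambda q: abs(q - 0.8)                    # linear cost around q* = 0.8
t = defender_best_response(0.8, n)
qbr = attacker_best_response(t, n, cost, grid)
```

Iterating these two maps over the discretized grid produces best-response curves of the kind plotted in Figure 2; under the equal-weight assumption, the defender's best threshold against $q = 0.8$ at $n = 200$ lands near the value $133$ quoted above, though the paper's exact utility weights may differ.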
The observation that the attacker's utility is maximized at $q = 0.8$ suggests that there is a pure strategy Nash equilibrium in which the attacker plays $0.8$ and the defender plays the threshold $166$, though we could not prove this analytically.

Similarly to the best response plots in Figure 2, we plot the best responses for the quadratic cost function $c(q) = (q - q^*)^2$ where $q^* = 0.8$. Figure 4(a) shows the best response plots for $n = 700$. Here, the curves do not intersect (the best threshold against $0.8$ is $463$, whereas the best value of $q$ against threshold $463$ is $0.7$), which suggests that there is no pure equilibrium for the game. Figure 4(b) shows the best response plots for $n = 800$. Here, we see that the curves intersect (where the attacker plays $0.8$ and the defender plays the threshold $529$). As before, Figure 3(b) shows a finer plot of the utility of the attacker around the point $q^*$. We see that the utility of the attacker is indeed maximized at $0.8$, which suggests that there is a pure strategy equilibrium in which the attacker plays the strategy $0.8$ and the defender plays the threshold $529$.

Figure 4: Best response plots for the cost function $c(q) = (q - 0.8)^2$. (a) $n = 700$. (b) $n = 800$.

Figure 5: Error exponents as a function of $n$. (a) $c(q) = 2|q - 0.9|$. (b) $c(q) = (q - 0.9)^2$.

From these experiments for the linear as well as the quadratic cost functions, we see that, as expected, there is little incentive for the attacker to deviate much from the point $q^*$, since for large values of $n$ the error term in the attacker's utility contributes little to the overall revenue. However, in the second case, since the cost function has zero derivative at $q^*$, it is not a priori clear whether a slight deviation from $q^*$ can increase the error term by more than the (second-order) increase in the cost function, so that the overall utility of the attacker increases. Therefore, the existence of a pure strategy Nash equilibrium with the attacker's strategy equal to $q^*$ is surprising in this case.
However, in the first case, since the left and right derivatives of the cost function at $q^*$ are non-zero, the first-order increase in the cost dominates the possible increase in the error term under a slight deviation from $q^*$, and hence it is reasonable to expect the existence of a pure strategy equilibrium at $q^*$ for large $n$. Comparing the two cost functions, a much larger value of $n$ is needed in the second case for a pure equilibrium at $q^*$ to appear, since the cost grows much more slowly in the second case as we move away from the point $q^*$.

We now give two more examples that do not satisfy Assumption (A4) and whose error exponents differ from that of Theorem 4.1. As before, $Q = [0.6, 0.9]$ and $q^* = 0.9$; recall that Assumption (A4) is not satisfied in this case. We consider the linear cost function $c(q) = 2|q - q^*|$ and the quadratic cost function $c(q) = (q - q^*)^2$. Figures 5(a) and 5(b) show the error exponent at equilibrium as a function of $n$, from $n = 100$ to $n = 400$ in steps of $100$. From these plots, we see that the error exponents converge to around $0.023$ and $0.011$, respectively, for the above two cost functions, whereas the value of $\Lambda^*_0(0)$ is around $0.111$.

C.2 Neyman-Pearson Formulation

We fix $\varepsilon = 0.1$ and consider the piecewise linear cost function $c(q) = |q - q^*|$ on $Q = [0.7, 0.8]$ where $q^* = 0.8$. As guaranteed by Lemma A.3, there exists a pure strategy Nash equilibrium for $G^{NP}(\varepsilon, 2, n)$ for each $n \ge 1$. We first compute the dominant decision rule of the defender by finding the appropriate values of the threshold and the randomization. Once this is done, we compute the equilibrium by finding the best response of the attacker to this dominant strategy of the defender (as before, we discretize the set $Q$ into 100 equally spaced points).
We repeat the experiment for different values of $n$, and for the quadratic cost function $c(q) = 0.001 (q - 0.8)^2$. Figures 6 and 7 show the results for the above two cost functions.

Figure 6: $c(q) = |q - 0.8|$. (a) Attacker's strategy as a function of $n$. (b) Error exponent as a function of $n$.

Figure 7: $c(q) = 0.001 (q - 0.8)^2$. (a) Attacker's strategy as a function of $n$. (b) Error exponent as a function of $n$.

Since the former cost function increases much faster than the latter as we move away from the point $q^*$, the attacker has much more incentive to play a strategy away from $q^*$ in the second case than in the first. This is reflected in the equilibrium strategy of the attacker; from Figures 6(a) and 7(a), we see that it takes much larger values of $n$ for the equilibrium strategy of the attacker to become equal to $q^*$ in the second case than in the first. From Figures 6(b) and 7(b), we see that the error exponents approach the limiting value $D(p \| q^*) = 0.223$.
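The limiting value $D(p \| q^*) = 0.223$ quoted above is easy to check numerically for the Bernoulli model (a small sketch; the helper name is ours):

```python
from math import log

def kl_bernoulli(a: float, b: float) -> float:
    """Relative entropy D(Bern(a) || Bern(b)) in nats."""
    return a * log(a / b) + (1 - a) * log((1 - a) / (1 - b))

# Limiting error exponent for p = 0.5, q* = 0.8 (Neyman-Pearson experiments)
exponent = kl_bernoulli(0.5, 0.8)   # ≈ 0.223
```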
