Stackelberg Contention Games in Multiuser Networks
Interactions among selfish users sharing a common transmission channel can be modeled as a non-cooperative game using the game theory framework. When selfish users choose their transmission probabilities independently without any coordination mechani…
Authors: Jaeok Park, Mihaela van der Schaar
Stackelberg Contention Games in Multiuser Networks
Jaeok Park∗ and Mihaela van der Schaar†

Abstract
Interactions among selfish users sharing a common transmission channel can be modeled as a non-cooperative game using the game theory framework. When selfish users choose their transmission probabilities independently without any coordination mechanism, Nash equilibria usually result in a network collapse. We propose a methodology that transforms the non-cooperative game into a Stackelberg game. Stackelberg equilibria of the Stackelberg game can overcome the deficiency of the Nash equilibria of the original game. A particular type of Stackelberg intervention is constructed to show that any positive payoff profile feasible with independent transmission probabilities can be achieved as a Stackelberg equilibrium payoff profile. We discuss criteria to select an operating point of the network and informational requirements for the Stackelberg game. We relax the requirements and examine the effects of relaxation on performance.

1 Introduction
In wireless communication networks, multiple users often share a common channel and contend for access. To resolve the contention problem, many different medium access control (MAC) protocols have been devised and used. Recently, the selfish behavior of users in MAC protocols has been studied using game theory. There have been attempts to understand the existing MAC protocols as the local utility maximizing behavior of selfish users by reverse-engineering the current protocols (e.g., [1]). It has also been investigated whether existing protocols are vulnerable to the existence of selfish users who pursue their self-interest in a non-cooperative manner. Non-cooperative behavior often leads to inefficient outcomes.
For example, in the 802.11 distributed MAC protocol, DCF, and its enhanced version, EDCF, competition among selfish users can lead to an inefficient use of the shared channel in Nash equilibria [2]. Similarly, a prisoner's dilemma phenomenon arises in a non-cooperative game for a generalized version of slotted-Aloha protocols [3].

∗ Department of Economics, University of California, Los Angeles (UCLA), Los Angeles, CA 90095-1477, USA (e-mail: jpark31@ucla.edu)
† Department of Electrical Engineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095-1594, USA (e-mail: mihaela@ee.ucla.edu)

In general, if a game has Nash equilibria yielding low payoffs for the players, it will be desirable for them to transform the game to extend the set of equilibria to include better outcomes [4]. The same idea can be applied to the game played by selfish users who compete for access to a common medium. If competition among selfish users brings about a network collapse, then it is beneficial for them to design a device which provides incentives to behave cooperatively. Game theory [4] discusses three types of transformation: 1) games with contracts, 2) games with communication, and 3) repeated games.

A game is said to be with contracts if the players of the game can communicate and bargain with each other, and enforce the agreement with a binding contract. The main obstacle to applying this approach to wireless networking is the distributed nature of wireless networks. To reach an agreement, users should know the network system and be able to communicate with each other. They should also be able to enforce the agreed plan.

A game with communication is one in which players can communicate with each other through a mediator but cannot write a binding contract. In this case, a correlated equilibrium is predicted to be played.
[5] studies correlated equilibria using a coordination mechanism in a slotted Aloha-type scenario. Unlike the first approach, this does not require that the actions of players be enforceable. However, to apply this approach to the medium access problem, signals need to be conveyed from a mediator to all users, and users need to know the correct meanings of the signals.

A repeated game is a dynamic game in which the same game is played repeatedly by the same players over finite or infinite periods. Repeated interactions among the same players enable them to sustain cooperation by punishing deviations in subsequent periods. A main challenge of applying the idea of repeated games to wireless networks is that the users should keep track of their past observations and be able to detect deviations and to coordinate their actions in order to punish deviating users.

Besides the three approaches above, another approach widely applied to communication networks is pricing [6]. A central entity charges prices to users in order to control their utilization of the network. Nash equilibria with pricing schemes in an Aloha network are analyzed in [7, 8]. Implementing a pricing scheme requires the central entity to have relevant system information as well as users' benefits and costs, which are often their private information. Eliciting private information often results in an efficiency loss in the presence of the strategic behavior of users, as shown in [9]. Even in the case where the entity has all the relevant information, prices need to be computed and communicated to the users.

In this paper, we propose yet another approach using a Stackelberg game. We introduce a network manager as an additional user and make him access the medium according to a certain rule.
Unlike the Stackelberg game of [10], in which the manager (the leader) chooses a certain strategy before users (followers) make their decisions, in the proposed Stackelberg game he sets an intervention rule first and then implements his intervention after users choose their strategies. Alternatively, the proposed Stackelberg game can be considered as a generalized Stackelberg game in which there are multiple leaders (users) and a single follower (the manager), and the leaders know the response of the follower to their decisions correctly. With appropriate choices of intervention rules, the manager can shape the incentives of users in such a way that their selfish behavior results in cooperative outcomes.

In the context of cognitive radio networks, [11] proposes a related Stackelberg game in which the owner of a licensed frequency band (the leader) can charge a virtual price for using the frequency band to cognitive radios (followers). The virtual price signals the extent to which cognitive radios can exploit the licensed frequency band. However, since prices are virtual, selfish users may ignore prices when they make decisions if they can gain by doing so. In contrast, in the Stackelberg game of this paper, the intervention of the manager is not virtual: it results in a reduction of throughput, which selfish users certainly care about. Hence, the intervention method provides better grounds for the network manager to deal with the selfish behavior of users.

[12] and [13] use game theoretic models to study random access. Their approach is to capture the information and implementation constraints using the game theoretic framework and to specify utility functions so that a desired operating point is achieved at a Nash equilibrium.
If conditions under which a certain type of dynamic adjustment play converges to the Nash equilibrium are met, such a strategy update mechanism can be used to derive a distributed algorithm that converges to the desired operating point. However, this control-theoretic approach to game theory assumes that users are obedient. In this paper, our main concern is the selfish behavior of users who have innate objectives. Because we start from natural utility functions and affect them by devising an intervention scheme, we are in a better position to deal with selfish users. Furthermore, the idea of intervention can potentially lead to a distributed algorithm to achieve a desired operating point.

By formulating the medium access problem as a non-cooperative game, we show the following main results:
1. Because the Nash equilibria of the non-cooperative game are inefficient and/or unfair, we transform the original game into a Stackelberg game, in which any feasible outcome with independent transmission probabilities can be achieved as a Stackelberg equilibrium.
2. A particular form of a Stackelberg intervention strategy, called total relative deviation (TRD)-based intervention, is constructed and used to achieve any feasible outcome with independent transmission probabilities.
3. The additional amount of information flows required for the transformation is relatively moderate, and it can be further reduced without large efficiency losses.

The rest of this paper is organized as follows. Section 2 introduces the model and formulates it as a non-cooperative game called the contention game. Nash equilibria of the contention game are characterized, and it is shown that they typically yield suboptimal performance.
In Section 3, we transform the contention game into another related game called the Stackelberg contention game by introducing an intervening manager. We show that the manager can implement any transmission probability profile as a Stackelberg equilibrium using a class of intervention functions. Section 4 discusses natural candidates for the target transmission probability profile selected by the manager. In Section 5, we discuss the flows of information required for our results and examine the implications of some relaxations of the requirements on performance. Section 6 provides numerical results, and Section 7 concludes the paper.

2 Contention Game Model
We consider a simple contention model in which multiple users share a communication channel as in [14]. A user represents a transmitter-receiver pair. Time is divided into slots of the same duration. Every user has a packet to transmit and can send the packet or wait. If there is only one transmission, the packet is successfully transmitted within the time slot. If more than one user transmits a packet simultaneously in a slot, a collision occurs and no packet is transmitted. We summarize the assumptions of our contention model.
1. A fixed set of users interacts over a given period of time (or a session).
2. Time is divided into multiple slots, and slots are synchronized.
3. A user always has a packet to transmit in every slot.
4. The transmission of a packet is completed within a slot.
5. A user transmits its packet with the same probability in every slot. There is no adjustment in the transmission probabilities during the session. This excludes coordination among users, for example, using time division multiplexing.
6. There is no cost of transmitting a packet.

We formulate the medium access problem as a non-cooperative game to analyze the behavior of selfish users.
We denote the set of users by $N = \{1, \ldots, n\}$. Because we assume that a user uses the same transmission probability over the entire session, the strategy of a user is its transmission probability, and we denote the strategy of user $i$ by $p_i$ and the strategy space of user $i$ by $P_i = [0, 1]$ for all $i \in N$. Once the users decide their transmission probabilities, a strategy profile can be constructed. The users transmit their packets independently according to their transmission probabilities, and thus the strategy profile determines the probability of a successful transmission by user $i$ in a slot. A strategy profile can be written as a vector $p = (p_1, \ldots, p_n)$ in $P = P_1 \times \cdots \times P_n$, the set of strategy profiles. The payoff function of user $i$, $u_i : P \to \mathbb{R}$, is defined as
$$u_i(p) = k_i p_i \prod_{j \neq i} (1 - p_j), \tag{1}$$
where $k_i > 0$ measures the value of transmission of user $i$ and $p_i \prod_{j \neq i} (1 - p_j)$ is the probability of successful transmission by user $i$.

We define the contention game by the tuple $\Gamma = \langle N, (P_i), (u_i) \rangle$. If the users choose their transmission probabilities taking others' transmission probabilities as given, then the resulting outcome can be described by the solution concept of Nash equilibrium [4]. We first characterize the Nash equilibria of the contention game.

Proposition 1. A strategy profile $p \in P$ is a Nash equilibrium of the contention game $\Gamma$ if and only if $p_i = 1$ for at least one $i$.

Proof: In the contention game, the best response correspondence of user $i$ takes one of two forms: $b_i(p_{-i}) = \{1\}$ if $\prod_{j \neq i} (1 - p_j) > 0$ and $b_i(p_{-i}) = [0, 1]$ if $\prod_{j \neq i} (1 - p_j) = 0$. Suppose that user $i$ chooses $p_i = 1$. Then it is playing its best response, and every other user $j$ faces $\prod_{l \neq j} (1 - p_l) = 0$, so that any $p_j$ is a best response for user $j$. This establishes the sufficiency part.
To prove the necessity part, suppose that $p$ is a Nash equilibrium and $p_i < 1$ for all $i \in N$. Since $\prod_{j \neq i} (1 - p_j) > 0$, $p_i$ is not a best response to $p_{-i}$, which is a contradiction.

If a Nash equilibrium $p$ has only one user $i$ such that $p_i = 1$, then $u_i(p) > 0$ and $u_j(p) = 0$ for all $j \neq i$, where $u_i(p)$ can be as large as $k_i$. If there are at least two users with transmission probability equal to 1, then we have $u_i(p) = 0$ for all $i \in N$. Let $\mathcal{U}_i = \{u \in \mathbb{R}^n : u_i \in [0, k_i],\ u_j = 0 \ \forall j \neq i\}$. Then, the set of Nash equilibrium payoffs is given by
$$\mathcal{U}(NE) = \bigcup_{i=1}^{n} \mathcal{U}_i. \tag{2}$$
Given the game $\Gamma$, we can define the set of feasible payoffs by
$$\mathcal{U} = \{(u_1(p), \ldots, u_n(p)) : p \in P\}. \tag{3}$$
A payoff profile $u$ in $\mathcal{U}$ is Pareto efficient if there is no other element $v$ in $\mathcal{U}$ such that $v \geq u$ and $v_i > u_i$ for at least one user $i$. We also call a strategy profile $p$ Pareto efficient if $u(p) = (u_1(p), \ldots, u_n(p))$ is a Pareto efficient payoff profile. Let $\mathcal{U}(PE)$ be the set of Pareto efficient payoffs.

[Figure 1: Payoff profiles with two homogeneous users with $k_1 = k_2 = 1$, plotted on axes $u_1$ and $u_2$. (a) Coordinated access: the set of feasible payoffs when coordination between two users is possible. (b) Nash equilibria: the set of Nash equilibrium payoffs. (c) Independent random access: the set of feasible payoffs with independent transmission probabilities.]

There are $n$ points in $\mathcal{U}(NE) \cap \mathcal{U}(PE)$, namely, $u$ such that $u_i = k_i$ and $u_j = 0$ for all $j \neq i$, for $i = 1, \ldots, n$. These are the corner points of $\mathcal{U}(PE)$ in which only one user receives a positive payoff. Therefore, Nash equilibrium payoff profiles are either inefficient or unfair.
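As a minimal numerical sketch of the payoff function (1) and of Proposition 1, the following Python code evaluates the payoff profile and tests the Nash condition through the best response correspondence used in the proof. The function names `payoffs` and `is_nash` and the example profiles are illustrative choices, not part of the paper.

```python
from math import prod


def payoffs(p, k):
    """Payoff profile (u_1(p), ..., u_n(p)) from equation (1)."""
    n = len(p)
    return [
        k[i] * p[i] * prod(1 - p[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def is_nash(p):
    """Check the Nash condition via the best response correspondence.

    User i's best response is {1} when prod_{j != i} (1 - p_j) > 0 and
    all of [0, 1] when that product is 0.
    """
    n = len(p)
    for i in range(n):
        others_idle = prod(1 - p[j] for j in range(n) if j != i)
        if others_idle > 0 and p[i] != 1:
            return False  # user i would gain by raising p_i to 1
    return True


k = [1.0, 1.0]  # two homogeneous users, as in Figure 1
for profile in ([1.0, 0.3], [1.0, 1.0], [0.5, 0.5]):
    print(profile, is_nash(profile), payoffs(profile, k))
```

In this sketch the first two profiles pass the check and the third does not, in line with Proposition 1. The profile in which both users transmit with probability 1 gives zero payoff to both.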
Moreover, since $p_i = 1$ is a weakly dominant strategy for every user $i$, in the sense that $u_i(1, p_{-i}) \geq u_i(p)$ for all $p \in P$, the most likely Nash equilibrium is the one in which $p_i = 1$ for all $i \in N$. At the most likely Nash equilibrium, every user always transmits its packet, and as a result no packet is successfully transmitted. Hence, the selfish behavior of the users is likely to lead to a network collapse, which gives zero payoff to every user, as argued also in [15].

Figure 1 presents the payoff spaces of two homogeneous users with $k_1 = k_2 = 1$. If coordination between the two users is possible, they can achieve any payoff profile in the dark area of Figure 1(a). For example, $(1/2, 1/2)$ can be achieved by arranging user 1 to transmit only in odd-numbered slots and user 2 only in even-numbered slots. This kind of coordination can be supported through direct communications among the users or mediated communications. However, if such coordination is not possible and each user has to choose one transmission probability, Nash equilibria yield the payoff profiles in Figure 1(b). The set of feasible payoffs of the contention game is shown as the dark area of Figure 1(c). The set of Pareto-efficient payoff profiles is the frontier of that area. The lack of coordination makes the set of feasible payoffs smaller, reducing the area of Figure 1(a) to that of Figure 1(c). Because the typical Nash equilibrium payoff is $(0, 0)$, the next section develops a transformation of the contention game, and the set of equilibria of the resulting Stackelberg game is shown to expand to the entire area of Figure 1(c).

3 Stackelberg Contention Game
We introduce a network manager as a special kind of user in the contention game and call him user 0. As a user, the manager can access the channel with a certain transmission probability.
However, the manager is different from the users in that he can choose his transmission probability depending on the transmission probabilities of the users. This ability of the manager enables him to act as the police. If the users access the channel excessively, the manager can intervene and punish them by choosing a high transmission probability, thus reducing the success rates of the users.

Formally, the strategy of the manager is an intervention function $g : P \to [0, 1]$, which gives his transmission probability $p_0 = g(p)$ when the strategy profile of the users is $p$. $g(p)$ can be interpreted as the level of intervention or punishment by the manager when the users choose $p$. Note that the level of intervention by the manager is the same for every user. We assume that the manager has a specific "target" strategy profile $\tilde{p}$, that his transmission has no value to him (as well as to others), and that he is benevolent. One representation of his objective is the payoff function of the following form:
$$u_0(g, p) = \begin{cases} 1 - g(p) & \text{if } p = \tilde{p}, \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$
This payoff function means that the manager wants the users to operate at the target strategy profile $\tilde{p}$ with the minimum level of intervention.

We call the transformed game the Stackelberg contention game because the manager chooses his strategy $g$ before the users make their decisions on the transmission probabilities. In this sense, the manager can be thought of as a Stackelberg leader and the users as followers. The specific timing of the Stackelberg contention game can be outlined as follows:
1. The network manager determines his intervention function.
2. Knowing the intervention function of the manager, the users choose their transmission probabilities simultaneously.
3. Observing the strategy profile of the users, the manager determines the level of intervention using his intervention function.
4. The transmission probabilities of the manager and the users determine their payoffs.

Timing 1 happens before the session starts. Timing 2 occurs at the beginning of the session, whereas timing 3 occurs when the manager knows the transmission probabilities of all the users. Therefore, there is a time lag between the time when the session begins and when the manager starts to intervene. Payoffs can be calculated as the probability of successful transmission averaged over the entire session, multiplied by valuation. If the interval between timing 2 and timing 3 is short relative to the duration of the session, the payoff of user $i$ can be approximated as the payoff during the intervention using the following payoff function:
$$u_i(g, p) = k_i p_i (1 - g(p)) \prod_{j \neq i} (1 - p_j). \tag{5}$$

[Figure 2: Schematic illustration of (a) the contention game and (b) the Stackelberg contention game. Both panels show transmitters T1, ..., Tn sending to receivers R1, ..., Rn over a common channel with probabilities $p_1, \ldots, p_n$; panel (b) adds the manager M, who transmits with probability $p_0 = g(p)$. (i), (ii), and (iii) represent the order of moves in the Stackelberg contention game, and the dotted arrows represent the flows of information required for the Stackelberg contention game.]

The transformation of the contention game into the Stackelberg contention game is schematically shown in Figure 2. The figure shows that the main role of the manager is to set the intervention rule and to implement it.
The users still behave non-cooperatively, maximizing their payoffs, and the intervention of the manager affects their selfish behavior even though the manager neither directly controls their behavior nor continuously communicates with the users to convey coordination or price signals.

In the Stackelberg routing game of [10], the strategy spaces of the manager and a user coincide. If that is the case in the Stackelberg contention game, i.e., if the manager chooses a single transmission probability before the users choose theirs, then this intervention only makes the channel lossy but does not provide incentives for users not to choose the maximum possible transmission probability. Hence, in order to provide an incentive to choose a smaller transmission probability, the manager needs to vary his transmission probability depending on the transmission probabilities of the users.

A Stackelberg game is analyzed using a backward induction argument. The leader predicts the Nash equilibrium behavior of the followers given his strategy and chooses the best strategy for him. The same argument can be applied to the Stackelberg contention game. Once the manager decides his strategy $g$ and commits to implement his transmission probability according to $g$, the rest of the Stackelberg contention game (timing 2–4) can be viewed as a non-cooperative game played by the users. Given the intervention function $g$, the payoff function of user $i$ can be written as
$$\tilde{u}_i(p; g) = k_i p_i (1 - g(p)) \prod_{j \neq i} (1 - p_j). \tag{6}$$
In essence, the role of the manager is to change the non-cooperative game that the users play from the contention game $\Gamma$ to a new game $\Gamma_g = \langle N, (P_i), (\tilde{u}_i(\cdot\,; g)) \rangle$, which we call the contention game with intervention $g$.
Understanding the non-cooperative behavior of the users given the intervention function $g$, the manager will choose $g$ that maximizes his payoff. We now define an equilibrium concept for the Stackelberg contention game.

Definition 1. An intervention function of the manager $g$ and a profile of the transmission probabilities of the users $\hat{p} = (\hat{p}_1, \ldots, \hat{p}_n)$ constitute a Stackelberg equilibrium if (i) $\hat{p}$ is a Nash equilibrium of the contention game with intervention $g$ and (ii) $\hat{p} = \tilde{p}$ and $g(\hat{p}) = 0$.

Combining (i) and (ii), an equivalent definition is that $(g, \tilde{p})$ is a Stackelberg equilibrium if $\tilde{p}$ is a Nash equilibrium of $\Gamma_g$ and $g(\tilde{p}) = 0$. Condition (i) says that once the manager chooses his strategy, the users will play a Nash equilibrium strategy profile in the resulting game, and condition (ii) says that expecting the Nash equilibrium strategy profile of the users, the manager chooses his strategy that achieves his objective.

3.1 Stackelberg Equilibrium with TRD-based Intervention
As we have mentioned earlier, the manager can choose only one level of intervention that affects the users equally. A question that arises is which strategy profile the manager can implement as a Stackelberg equilibrium with one level of intervention for every user. We answer this question constructively. We propose a specific form of an intervention function with which the manager can attain any strategy profile $\tilde{p}$ with $0 < \tilde{p}_i < 1$ for all $i$. The basic idea of this result is that because the strategy of the manager is not a single intervention level but a function whose value depends on the strategies of the users, he can discriminate among the users by reacting differently to their transmission probabilities in choosing the level of intervention.
Therefore, even though the realized level of intervention is the same for every user, the manager can induce the users to choose different transmission probabilities.

To construct such an intervention function, we first define the total relative deviation (TRD) of $p$ from $\tilde{p}$ by
$$h(p) = \sum_{i=1}^{n} \frac{p_i - \tilde{p}_i}{\tilde{p}_i} = \frac{p_1}{\tilde{p}_1} + \cdots + \frac{p_n}{\tilde{p}_n} - n. \tag{7}$$
Since $g$ determines the transmission probability of the manager, its range should lie in $[0, 1]$. To satisfy this constraint, we define the TRD-based intervention function by
$$g^*(p) = [h(p)]_0^1, \tag{8}$$
where the operator $[x]_a^b = \min\{\max\{x, a\}, b\}$ is used to obtain the "trimmed" value of TRD between 0 and 1.

The TRD-based intervention can be interpreted in the following way. The manager sets the target at $\tilde{p}$. As long as the users choose small transmission probabilities so that the TRD of $p$ from $\tilde{p}$ does not exceed zero, the manager does not intervene. If it is larger than zero, the manager will respond to a one-unit increase in $p_i$ by increasing $p_0$ by $1/\tilde{p}_i$ units until the TRD reaches 1. The manager determines the degree of punishment based on the target transmission probability profile. If he wants a user to transmit with a low probability, then his punishment against its deviation is strong.

Proposition 2. $(g^*, \tilde{p})$ constitutes a Stackelberg equilibrium.

Proof: We need to check two things. First, $\tilde{p}$ is a Nash equilibrium of $\Gamma_{g^*}$. Second, $g^*(\tilde{p}) = 0$. It is straightforward to confirm the second. To show the first, the payoff function of user $i$ given others' strategies $\tilde{p}_{-i}$ is
$$\tilde{u}_i(p_i, \tilde{p}_{-i}; g^*) = k_i p_i \left(1 - g^*(p_i, \tilde{p}_{-i})\right) \prod_{j \neq i} (1 - \tilde{p}_j) \tag{9}$$
$$= \begin{cases} 0 & \text{if } p_i > 2\tilde{p}_i, \\ k_i p_i \left(2 - \dfrac{p_i}{\tilde{p}_i}\right) \prod_{j \neq i} (1 - \tilde{p}_j) & \text{if } \tilde{p}_i \leq p_i \leq 2\tilde{p}_i, \\ k_i p_i \prod_{j \neq i} (1 - \tilde{p}_j) & \text{if } p_i < \tilde{p}_i. \end{cases} \tag{10}$$
It can be seen from the above expression that $\tilde{u}_i(p_i, \tilde{p}_{-i}; g^*)$ is increasing on $p_i < \tilde{p}_i$, reaches a peak at $p_i = \tilde{p}_i$, is decreasing on $\tilde{p}_i < p_i < 2\tilde{p}_i$, and then stays at 0 on $p_i \geq 2\tilde{p}_i$. Therefore, user $i$'s best response to $\tilde{p}_{-i}$ is $\tilde{p}_i$ for all $i$, and thus $\tilde{p}$ constitutes a Nash equilibrium of the contention game with TRD-based intervention, $\Gamma_{g^*}$.

Corollary 1. Any feasible payoff profile $u \in \mathcal{U}$ of the contention game with $u_i > 0$ for all $i \in N$ can be achieved by a Stackelberg equilibrium.

Corollary 1 resembles the Folk theorem of repeated games [4] in that it claims that any feasible outcome can be attained as an equilibrium. Incentives not to deviate from a certain operating point are provided by the manager's intervention in the Stackelberg contention game, while in a repeated game players do not deviate since a deviation is followed by punishment from other players.

3.2 Nash Equilibria of the Contention Game with TRD-based Intervention
In Proposition 2, we have seen that $\tilde{p}$ is a Nash equilibrium of the contention game with TRD-based intervention. However, if other Nash equilibria exist, the outcome may be different from the one that the manager intends. In fact, any strategy profile $p$ with $p_i = 1$ for at least one $i$ is still a Nash equilibrium of $\Gamma_{g^*}$. The following proposition characterizes the set of Nash equilibria of $\Gamma_{g^*}$ that are different from those of $\Gamma$.

Proposition 3. Consider a strategy profile $\hat{p}$ with $\hat{p}_i < 1$ for all $i \in N$. $\hat{p}$ is a Nash equilibrium of the contention game with TRD-based intervention if and only if either
$$\text{(i)} \quad \hat{p} = \tilde{p} \tag{11}$$
or
$$\text{(ii)} \quad \sum_{j \neq i} \frac{\hat{p}_j - \tilde{p}_j}{\tilde{p}_j} \geq 2 \quad \text{for all } i = 1, \ldots, n. \tag{12}$$
Proof: See Appendix A.

Transforming $\Gamma$ to $\Gamma_{g^*}$ does not eliminate the Nash equilibria of the contention game.
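The construction in (7) and (8) and the best response claim in the proof of Proposition 2 can be checked numerically. The sketch below is a hedged illustration: the function names, the target profile, the valuations, and the grid search are assumptions made for this example and are not taken from the paper. It computes the TRD-based intervention level and searches a uniform grid for user $i$'s best response when all other users play the target.

```python
from math import prod


def trd(p, target):
    """Total relative deviation h(p) of p from the target, equation (7)."""
    return sum((pi - ti) / ti for pi, ti in zip(p, target))


def g_star(p, target):
    """TRD-based intervention level, equation (8): h(p) trimmed to [0, 1]."""
    return min(max(trd(p, target), 0.0), 1.0)


def payoff_with_intervention(i, p, target, k):
    """Payoff of user i in the contention game with intervention g*, equation (6)."""
    others = prod(1 - p[j] for j in range(len(p)) if j != i)
    return k[i] * p[i] * (1 - g_star(p, target)) * others


def best_response_on_grid(i, target, k, steps=1000):
    """Best p_i on a uniform grid when all other users play the target."""
    best_p, best_u = 0.0, -1.0
    for s in range(steps + 1):
        p = list(target)
        p[i] = s / steps
        u = payoff_with_intervention(i, p, target, k)
        if u > best_u:
            best_p, best_u = p[i], u
    return best_p


target = [0.2, 0.3, 0.5]  # illustrative target profile
k = [1.0, 2.0, 0.5]       # illustrative valuations
print(g_star(target, target))
print([best_response_on_grid(i, target, k) for i in range(len(target))])
```

A grid search is used only to keep the sketch short. The piecewise form (10) gives the same answer analytically, since the payoff rises up to the target value and falls beyond it.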
Rather, the set of Nash equilibria expands to include two classes of new equilibria. The first Nash equilibrium of Proposition 3 is the one that the manager intends the users to play. The second class of Nash equilibria are those in which the sum of relative deviations of other users is already so large that no matter how small a transmission probability user $i$ chooses, the level of intervention stays the same at 1. Since $\tilde{p}$ is chosen to satisfy $0 < \tilde{p}_i < 1$ for all $i$ and $g^*$ satisfies $g^*(\tilde{p}) = 0$, it follows that $\tilde{u}_i(\tilde{p}) > 0$ for all $i$.¹ For the second class of Nash equilibria in Proposition 3, $\tilde{u}_i(\hat{p}) = 0$ for all $i$ because $g^*(\hat{p}) = 1$. Therefore, the payoff profile of the second class of Nash equilibria is Pareto dominated by that of the intended Nash equilibrium in that the intended Nash equilibrium yields a higher payoff for every user compared to the second class of Nash equilibria.

¹ Since we mostly consider the TRD-based intervention function $g^*$, we will use $\tilde{u}_i(\tilde{p})$ instead of $\tilde{u}_i(\tilde{p}; g^*)$ when there is no confusion.

The same conclusion holds for Nash equilibria with more than one user with transmission probability 1 because every user gets zero payoff. Finally, the remaining Nash equilibria are those with exactly one user with transmission probability 1. Suppose that $p_i = 1$. Then the highest payoff for user $i$ is achieved when $p_j = 0$ for all $j \neq i$. Denoting this strategy profile by $e_i$, the payoff profile of $e_i$ is Pareto dominated by that of $\tilde{p}$ if $1 - g^*(e_i) = 1 + n - \frac{1}{\tilde{p}_i} < \tilde{p}_i \prod_{j \neq i} (1 - \tilde{p}_j)$.

3.3 Reaching the Stackelberg Equilibrium
We have seen that there are multiple Nash equilibria of the contention game with TRD-based intervention and that the Nash equilibrium $\tilde{p}$ in general yields higher payoffs to the users than other Nash equilibria.
If the users are aware of the welfare properties of different Nash equilibria, they will tend to select $\tilde{p}$. Suppose that the users play the second class of Nash equilibria in Proposition 3 for some reason. If the Stackelberg contention game is played repeatedly and the users anticipate that the strategy profile of the other users will be the same as that of the last period, then it can be shown that under certain conditions there is a sequence of intervention functions convergent to $g^*$ that the manager can employ to have the users reach the intended Nash equilibrium $\tilde{p}$, thus approaching the Stackelberg equilibrium.

Proposition 4. Suppose that at $t = 0$ the manager chooses the intervention function $g^*$ and that the users play a Nash equilibrium $\hat{p}^0$ of the second class. Without loss of generality, the users are enumerated so that the following holds:
$$\frac{\hat{p}_1^0}{\tilde{p}_1} \leq \frac{\hat{p}_2^0}{\tilde{p}_2} \leq \cdots \leq \frac{\hat{p}_{n-1}^0}{\tilde{p}_{n-1}} \leq \frac{\hat{p}_n^0}{\tilde{p}_n}. \tag{13}$$
Suppose further that for each $i$, either $\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} < 2$ or $\frac{\hat{p}_i^0}{\tilde{p}_i} \leq 1$ holds. For $t \geq 1$, define
$$c_t = \sum_{j \neq n} \frac{\hat{p}_j^{t-1}}{\tilde{p}_j} + 1. \tag{14}$$
Assume that the manager employs the intervention function $g_t(p) = [h_t(p)]_0^1$, where
$$h_t(p) = \frac{p_1}{\tilde{p}_1} + \cdots + \frac{p_n}{\tilde{p}_n} - c_t, \tag{15}$$
and that user $i$ chooses $\hat{p}_i^t$ as a best response to $\hat{p}_{-i}^{t-1}$ given $g_t$. Then $\lim_{t \to \infty} \hat{p}_i^t = \tilde{p}_i$ for all $i = 1, \ldots, n$ and $\lim_{t \to \infty} c_t = n$.

Proof: See Appendix B.

The reason that no user has an incentive to deviate from the second class of Nash equilibria is that since others use high transmission probabilities, the TRD is over 1 no matter what transmission probability a user chooses. Since the punishment level is always 1, a reduction of the transmission probability by a user is not rewarded by a decreased level of intervention.
If the relative deviations of $p_i$ from $\tilde{p}_i$ are not too dispersed, the manager can successively adjust down the effective range of punishment so that he can react to the changes in the strategies of the users. Proposition 4 shows that this procedure succeeds in having the strategy profile of the users converge to the intended Nash equilibrium.

4 Target Selection Criteria of the Manager

So far we have assumed that the manager has a target strategy profile $\tilde{\mathbf{p}}$ and examined whether he can find an intervention function that implements it as a Stackelberg equilibrium. This section discusses selection criteria that the manager can use to choose the target strategy profile. To address this issue, we rely on cooperative game theory because a reasonable choice of the manager should have a close relationship to the likely outcome of bargaining among the users if bargaining were possible for them [4]. The absence of communication opportunities among the users prevents them from engaging in bargaining or from directly coordinating with each other.

4.1 Nash Bargaining Solution

The pair $(F, \mathbf{v})$ is an $n$-person bargaining problem where $F$ is a closed and convex subset of $\mathbb{R}^n$, representing the set of feasible payoff allocations, and $\mathbf{v} = (v_1, \ldots, v_n)$ is the disagreement payoff allocation. Suppose that there exists $\mathbf{y} \in F$ such that $y_i > v_i$ for every $i$.

Definition 2 $\mathbf{x}$ is the Nash bargaining solution for an $n$-person bargaining problem $(F, \mathbf{v})$ if it is the unique Pareto efficient vector that solves

$$\max_{\mathbf{x} \in F,\ \mathbf{x} \geq \mathbf{v}} \prod_{i=1}^{n} (x_i - v_i). \tag{16}$$

Consider the contention game $\Gamma$. $(\mathcal{U}, \mathbf{0})$ can be regarded as an $n$-person bargaining problem where $\mathcal{U}$ is defined in (3) and $\mathbf{0}$ is the disagreement point. The vector $\mathbf{0}$ is the natural disagreement point because it is a Nash equilibrium payoff as well as the minimax value for each user.
The only departure from the standard theory is that the set of feasible payoffs $\mathcal{U}$ is not convex.[2] However, we can carry the definition of the Nash bargaining solution over to our setting as in [15]. Since the manager knows the structure of the contention game, he can calculate the Nash bargaining solution $\mathbf{u}$ for $(\mathcal{U}, \mathbf{0})$ and find the strategy profile $\tilde{\mathbf{p}}$ that yields $\mathbf{u}$. Then the manager can implement $\tilde{\mathbf{p}}$ by choosing $g^*$ based on $\tilde{\mathbf{p}}$. Notice that the presence of the manager does not decrease the payoffs of the users because $g^*(\tilde{\mathbf{p}}) = 0$. The Nash bargaining solution for $(\mathcal{U}, \mathbf{0})$ has the following simple form.

Proposition 5 $\frac{(n-1)^{n-1}}{n^n} (k_1, \ldots, k_n)$ is the Nash bargaining solution for $(\mathcal{U}, \mathbf{0})$, and it is attained by $p_i = \frac{1}{n}$ for all $i = 1, \ldots, n$.

Proof: The maximization problem in the definition of the Nash bargaining solution can be written as

$$\max_{\mathbf{u} \in \mathcal{U},\ \mathbf{u} \geq \mathbf{0}} \prod_{i=1}^{n} u_i. \tag{17}$$

Since any $\mathbf{u} \in \mathcal{U}$ satisfies $\mathbf{u} \geq \mathbf{0}$, the above problem can be expressed in terms of $\mathbf{p}$:

$$\max_{\mathbf{p} \in P} \left( \prod_{i=1}^{n} k_i \right) \prod_{i=1}^{n} p_i (1 - p_i)^{n-1}. \tag{18}$$

The logarithm of the objective function is strictly concave in $\mathbf{p}$, and the first-order optimality condition gives $p_i = \frac{1}{n}$ for all $i = 1, \ldots, n$.

[2] We do not allow public randomization among users, which requires coordination among them.

The Nash bargaining solution for $(\mathcal{U}, \mathbf{0})$ treats every user equally in that it specifies the same transmission probability for every user. Therefore, the manager does not need to know the vector of the values of transmission $\mathbf{k} = (k_1, \ldots, k_n)$ to implement the Nash bargaining solution. The Nash bargaining solution coincides with the Kalai-Smorodinsky solution [16] because the maximum payoff for user $i$ is $k_i$ and the Nash bargaining solution is the unique efficient payoff profile in which each user receives a payoff proportional to its maximum feasible payoff.
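Proposition 5 can be sanity-checked numerically. The sketch below is illustrative only: the number of users, the valuations, and the random search are assumptions made for the example. It compares the payoffs at $p_i = 1/n$ with the closed form and confirms that no randomly drawn profile attains a larger Nash product.

```python
import random
from math import log, prod

def payoffs(p, k):
    """Contention-game payoffs u_i = k_i * p_i * prod_{j != i}(1 - p_j)."""
    n = len(p)
    return [k[i] * p[i] * prod(1.0 - p[j] for j in range(n) if j != i)
            for i in range(n)]

def log_nash_product(p, k):
    """Logarithm of the Nash product with disagreement point 0."""
    return sum(log(u) for u in payoffs(p, k))

# Hypothetical example: four users with unequal valuations.
n = 4
k = [1.0, 2.0, 3.0, 4.0]
symmetric = [1.0 / n] * n
closed_form = [(n - 1) ** (n - 1) / n ** n * ki for ki in k]

# Random search over interior profiles; a seeded generator keeps the run repeatable.
rng = random.Random(0)
best = log_nash_product(symmetric, k)
beaten = any(
    log_nash_product([rng.uniform(0.01, 0.99) for _ in range(n)], k) > best + 1e-12
    for _ in range(20000)
)
```

The valuations only add a constant to the logarithm of the Nash product, which is why the maximizer does not depend on $\mathbf{k}$.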
If the manager wants to treat the users with discrimination, he can use the generalized Nash product

$$\prod_{i=1}^{n} (x_i - v_i)^{\omega_i} \tag{19}$$

as the maximand to find a nonsymmetric Nash bargaining solution, where $\omega_i > 0$ represents the weight for user $i$. One example of the weights is the valuations of the users.[3] The nonsymmetric Nash bargaining solution for $(\mathcal{U}, \mathbf{0})$ can be shown to be achieved by $p_i = \omega_i / \sum_{j} \omega_j$ for all $i$, using a method similar to the proof of Proposition 5.

[3] If $k_i$ is private information, it would be interesting to construct a mechanism that induces users to reveal their true values $k_i$.

4.2 Coalition-Proof Strategy Profile

If some of the users can communicate and collude effectively, the network manager may want to choose a strategy profile that is self-enforcing even in the presence of coalitions. Since we define a user as a transmitter-receiver pair, collusion may occur when a single transmitter sends packets to several destinations and controls the transmission probabilities of several users. Given the set of users $N = \{1, \ldots, n\}$, a coalition is any nonempty subset $S$ of $N$. Let $\mathbf{p}_S$ be the strategy profile of the users in $S$.

Definition 3 $\tilde{\mathbf{p}}$ is coalition-proof with respect to a coalition $S$ in a non-cooperative game $\langle N, [0,1]^N, (u_i) \rangle$ if there does not exist $\mathbf{p}_S \in [0,1]^S$ such that $u_i(\mathbf{p}_S, \tilde{\mathbf{p}}_{-S}) \geq u_i(\tilde{\mathbf{p}})$ for all $i \in S$ and $u_i(\mathbf{p}_S, \tilde{\mathbf{p}}_{-S}) > u_i(\tilde{\mathbf{p}})$ for at least one user $i \in S$.

By definition, $\tilde{\mathbf{p}}$ is coalition-proof with respect to the grand coalition $S = N$ if and only if $\mathbf{u}(\tilde{\mathbf{p}}) = (u_1(\tilde{\mathbf{p}}), \ldots, u_n(\tilde{\mathbf{p}}))$ is Pareto efficient. If $\tilde{\mathbf{p}}$ is a Nash equilibrium, then it is coalition-proof with respect to any one-person "coalition." The non-cooperative game of our interest is the contention game with TRD-based intervention $g^*$.
Proposition 6 $\tilde{\mathbf{p}}$ is coalition-proof with respect to a two-person coalition $S = \{i, j\}$ in the contention game with TRD-based intervention $g^*$ if and only if $\tilde{p}_i + \tilde{p}_j \leq 1$.

Proof: See Appendix C.

The proof of Proposition 6 shows that if $\tilde{p}_i + \tilde{p}_j > 1$ then users $i$ and $j$ can jointly reduce their transmission probabilities to increase their payoffs at the same time. For example, suppose that users 1 and 2 are controlled by the same transmitter and that the manager selects the target $\tilde{\mathbf{p}}$ with $\tilde{p}_1 = 0.3$ and $\tilde{p}_2 = 0.8$. Then $\tilde{u}_1(\tilde{\mathbf{p}}) = 0.06\, k_1 \prod_{j \neq 1,2} (1 - \tilde{p}_j)$ and $\tilde{u}_2(\tilde{\mathbf{p}}) = 0.56\, k_2 \prod_{j \neq 1,2} (1 - \tilde{p}_j)$. Suppose that the two users jointly deviate to $(p_1, p_2) = (0.25, 0.75)$. Then the new payoffs are $\tilde{u}_1(p_1, p_2, \tilde{\mathbf{p}}_{N \setminus \{1,2\}}) = 0.0625\, k_1 \prod_{j \neq 1,2} (1 - \tilde{p}_j)$ and $\tilde{u}_2(p_1, p_2, \tilde{\mathbf{p}}_{N \setminus \{1,2\}}) = 0.5625\, k_2 \prod_{j \neq 1,2} (1 - \tilde{p}_j)$, which is strictly better for both users. A simultaneous decrease in $p_i$ and $p_j$ also increases the payoffs of all the users not belonging to the coalition, which implies that a target $\tilde{\mathbf{p}}$ with $\tilde{p}_i + \tilde{p}_j > 1$ is not Pareto efficient. This observation leads to the following corollary.

Corollary 2 If $\tilde{\mathbf{p}}$ is Pareto efficient in the contention game with TRD-based intervention $g^*$, then it is coalition-proof with respect to any two-person coalition.

In fact, we can generalize the above corollary and provide a stronger statement.

Proposition 7 $\tilde{\mathbf{p}}$ is Pareto efficient in the contention game with TRD-based intervention $g^*$ if and only if it is coalition-proof with respect to any coalition.

Proof: See Appendix D.

5 Informational Requirement and Its Relaxation

We have introduced and analyzed the contention game and the Stackelberg contention game with TRD-based intervention.
In this section we discuss what the players of each game need to know in order to play the corresponding equilibrium.

5.1 Contention Game and Nash Equilibrium

In a general non-cooperative game, each user needs to know, or predict correctly, the strategy profile of the others in order to find its best response strategy. In the contention game with the payoff function $u_i(\mathbf{p}) = k_i p_i \prod_{j \neq i} (1 - p_j)$, it suffices for user $i$ to know the sign of $\prod_{j \neq i} (1 - p_j)$, i.e., whether it is positive or zero, to calculate its best response. On the other hand, $p_i = 1$ is a weakly dominant strategy for any user $i$, which means that setting $p_i = 1$ is weakly better no matter what strategies the other users choose. Hence, the Nash equilibrium $\mathbf{p} = (1, \ldots, 1)$ does not require any knowledge of others' strategies.

5.2 Stackelberg Contention Game with TRD-based Intervention and Stackelberg Equilibrium

Considering the timing of the Stackelberg contention game outlined in Section 3, we can list the following requirements on the manager and the users for the Stackelberg equilibrium to be played.

Requirement M. Once the users choose the transmission probabilities, the manager observes the strategy profile of the users.

The manager needs to decide the level of intervention as a function of the transmission probabilities of the users. If the manager can distinguish the access of each user and has sufficiently many observations to determine the transmission probability of each user, then this requirement will be satisfied. If the manager can observe the channel state (idle, success, collision) and identify the users of successfully transmitted packets, he can estimate the transmission probability of each user in the following way. First, he can obtain an estimate of $\prod_{i \in N} (1 - p_i)$ by calculating the frequency of idle slots, called $q_{\mathrm{idle}}$.
Second, he can obtain an estimate of $p_i \prod_{j \neq i} (1 - p_j)$ by calculating the frequency of slots in which user $i$ succeeds in transmitting its packet, called $q_i$. Finally, an estimate of $p_i$ can be obtained by solving $\frac{p_i}{1 - p_i} = \frac{q_i}{q_{\mathrm{idle}}}$ for $p_i$, which gives $p_i = \frac{q_i}{q_i + q_{\mathrm{idle}}}$.

Requirement U. User $i$ knows $g^*$ (and thus $\tilde{\mathbf{p}}$) and $\mathbf{p}_{-i}$ when it chooses its transmission probability.

Requirement U is sufficient for the Nash equilibrium of the contention game with TRD-based intervention to be played by the users. User $i$ can find its best response strategy by maximizing $\tilde{u}_i$ given $g^*$ and $\mathbf{p}_{-i}$. In fact, a weaker requirement is compatible with the Nash equilibrium of the contention game with TRD-based intervention. Suppose that user $i$ knows the form of the intervention function $g^*$ and the value of $\tilde{p}_i$, and observes the intervention level $p_0$. The profile $\tilde{\mathbf{p}}$ embedded in the TRD-based intervention function $g^*$ can be thought of as a strategy profile recommended by the manager (thus the communication from the manager to the users occurs indirectly through the function $g^*$). Even though user $i$ does not know the strategies recommended to the other users, i.e., the values of $\tilde{p}_j$, $j \neq i$, it knows its own recommended transmission probability. From the form of the intervention function, user $i$ can derive that it is in its best interest to follow the recommendation as long as all the other users follow their recommended strategies. Observing $p_0 = 0$ confirms its belief that the other users play the recommended strategies, and it has no reason to deviate.

The users can acquire knowledge of the intervention function $g^*$ in one of three ways: (i) known protocol, (ii) announcement, and (iii) learning. The first method is effective in the case where a certain network manager operates in a certain channel (for example, a frequency band).
The community of users will know the protocol (or intervention function) used by the manager. This method does not require any information exchange between the manager and the users. Neither teaching by the manager nor learning by the users is necessary. However, there is inflexibility in choosing an intervention function, and the manager cannot change his target strategy profile frequently. Nevertheless, this is the method most often used in current wireless networks, where users appertain to a predetermined class of known and homogeneous protocols. The second method allows the manager to make the users know $g^*$ directly, which includes information on the target $\tilde{\mathbf{p}}$. The manager will execute his intervention according to the announced intervention function because the Stackelberg equilibrium $(g^*, \tilde{\mathbf{p}})$ achieves his objective. However, it requires explicit message delivery from the manager to the users, which is sometimes costly or may even be impossible in practice. Finally, if the Stackelberg contention game is played repeatedly with the same intervention function, the users may be able to recover the form of the intervention function chosen by the manager based on their observations of $(p_0, \mathbf{p})$, for example, using learning techniques developed in [17, 18, 19]. However, this process may take a long time, and the users may not be able to collect enough data to find out the true functional form if there is limited experimentation by the users.

Remark. If users are obedient, the manager can use centralized control by communicating $\tilde{p}_i$ to user $i$. The additional communication and estimation overhead required for the Stackelberg equilibrium can be considered as a cost incurred to deal with the selfish behavior of users, or to provide incentives for users to follow $\tilde{\mathbf{p}}$.
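The estimation procedure described under Requirement M can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: slot outcomes are encoded as the string "idle", the string "collision", or the integer index of the successful user, the manager is assumed not to intervene while it collects observations, and the function names and the simulated profile are hypothetical.

```python
import random

def solve_probability(q_i, q_idle):
    """Solve p/(1 - p) = q_i/q_idle for p, i.e. p = q_i / (q_i + q_idle)."""
    return q_i / (q_i + q_idle)

def estimate_probabilities(slots, n):
    """Estimate each user's transmission probability from observed slot outcomes."""
    total = len(slots)
    q_idle = sum(1 for s in slots if s == "idle") / total
    estimates = []
    for i in range(n):
        q_i = sum(1 for s in slots if s == i) / total  # success frequency of user i
        estimates.append(solve_probability(q_i, q_idle))
    return estimates

def simulate_slots(p, num_slots, rng):
    """Generate slot outcomes when users transmit independently with probabilities p."""
    slots = []
    for _ in range(num_slots):
        senders = [i for i, pi in enumerate(p) if rng.random() < pi]
        if not senders:
            slots.append("idle")
        elif len(senders) == 1:
            slots.append(senders[0])
        else:
            slots.append("collision")
    return slots

# Hypothetical example: three users, 100000 observed slots, seeded for repeatability.
true_p = [0.2, 0.3, 0.4]
observed = simulate_slots(true_p, 100000, random.Random(0))
estimates = estimate_probabilities(observed, len(true_p))
```

The estimate is a ratio of two observed frequencies, so its accuracy depends on seeing enough idle and success slots; with very aggressive users both become rare and far more observations are needed.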
5.3 Limited Observability of the Manager

The construction of the TRD-based intervention function assumes that the manager can observe or estimate the transmission probabilities of the users correctly. In real applications, the manager may not be able to observe the exact choice made by each user. We consider several scenarios under which the manager has limited observability and examine how the TRD-based intervention function can be modified in those scenarios.

5.3.1 Quantized Observation

Let $\mathcal{I} = \{I_0, I_1, \ldots, I_m\}$ be a set of intervals which partition $[0, 1]$. We assume that each interval contains its right end point. For simplicity, we will consider intervals of the same length. That is, $\mathcal{I} = \left\{ \{0\}, \left(0, \frac{1}{m}\right], \left(\frac{1}{m}, \frac{2}{m}\right], \ldots, \left(\frac{m-1}{m}, 1\right] \right\}$, and we call $I_0 = \{0\}$ and $I_r = \left(\frac{r-1}{m}, \frac{r}{m}\right]$ for all $r = 1, \ldots, m$.

[Figure 3: Payoffs that can be achieved by the manager with quantized observation, plotted in the $(u_1, u_2)$ plane. (a) $m = 5$. (b) $m = 15$.]

Suppose that the manager only observes which interval in $\mathcal{I}$ each $p_i$ belongs to. In other words, the manager observes $r_i$ instead of $p_i$ such that $p_i \in I_{r_i}$. In this case, the level of intervention is calculated based on $\mathbf{r} = (r_1, \ldots, r_n)$ rather than $\mathbf{p}$. This means that, given $\mathbf{p}_{-i}$, $p_0$ would be the same for any $p_i, p'_i$ if $p_i$ and $p'_i$ belong to the same $I_r$. Since any $p_i \in \left(\frac{r-1}{m}, \frac{r}{m}\right)$ is weakly dominated by $p_i = \frac{r}{m}$, the users will choose their transmission probabilities at the right end points of the intervals in $\mathcal{I}$. This in turn will affect the choice of a target by the manager. The manager will be restricted to choosing $\tilde{\mathbf{p}}$ such that $\tilde{p}_i \in \left\{ \frac{1}{m}, \ldots, \frac{m-1}{m} \right\}$ for all $i \in N$.
Then the manager can implement $\tilde{\mathbf{p}}$ with the intervention function $g(\mathbf{r}) = g^*(\mathbf{p})$, where $p_i$ is set equal to $\frac{r_i}{m}$. In summary, the quantized observation of $\mathbf{p}$ restricts the choice of $\tilde{\mathbf{p}}$ by the manager from $(0, 1)^N$ to $\left\{ \frac{1}{m}, \ldots, \frac{m-1}{m} \right\}^N$.

Figure 3 shows the payoff profiles that can be achieved by the manager with quantized observation. When the number of intervals is moderately large, the manager has many options near or on the Pareto efficiency boundary.

5.3.2 Noisy Observation

We modify the Stackelberg contention game to analyze the case where the manager observes noisy signals of the transmission probabilities of the users. Let $P_i = [\epsilon, 1 - \epsilon]$ be the strategy space of user $i$, where $\epsilon$ is a small positive number. We assume that the users can observe the strategy profile $\mathbf{p}$, but the manager observes a noisy signal of $\mathbf{p}$. The manager observes $p^o_i$ instead of $p_i$, where $p^o_i$ is uniformly distributed on $[p_i - \epsilon, p_i + \epsilon]$, independently over $i \in N$. Suppose that the manager chooses a target $\tilde{\mathbf{p}}$ such that $\tilde{p}_i \in [2\epsilon, 1 - 2\epsilon]$. The expected payoff of user $i$ when the manager uses an intervention function $g$ is

$$E[\tilde{u}_i(\mathbf{p}; g) \mid \mathbf{p}] = k_i p_i \prod_{j \neq i} (1 - p_j) \left( 1 - E[g(\mathbf{p}^o) \mid \mathbf{p}] \right). \tag{20}$$

Hence, the intervention function is effectively $E[g(\mathbf{p}^o) \mid \mathbf{p}]$ instead of $g(\mathbf{p})$ when the manager observes $\mathbf{p}^o$.

[Figure 4: Payoffs that can be achieved by the manager with noisy observation, plotted in the $(u_1, u_2)$ plane. (a) $\epsilon = 0.1$. (b) $\epsilon = 0.01$.]
If $\hat{\mathbf{p}}$ is a Nash equilibrium of the contention game with intervention $g$ when $\mathbf{p}$ is perfectly observable to the manager and $E[g(\mathbf{p}^o) \mid \mathbf{p}] = g(\mathbf{p})$ for all $\mathbf{p}$ such that $\max_{i \in N} |p_i - \hat{p}_i| \leq \epsilon$, then $\hat{\mathbf{p}}$ will still be a Nash equilibrium of the contention game with intervention $g$ when the manager observes a noisy signal of the strategy profile of the users. Consider the TRD-based intervention function $g^*$. Since $g^*(\mathbf{p}) \geq 0$ for all $\mathbf{p} \in P$ and $h(\mathbf{p}^o) > 0$ with a positive probability when $\mathbf{p} = \tilde{\mathbf{p}}$, $E[g^*(\mathbf{p}^o) \mid \tilde{\mathbf{p}}] > 0$ whereas $g^*(\tilde{\mathbf{p}}) = 0$. Since $g^*$ is kinked at $\tilde{\mathbf{p}}$, the noise in $\mathbf{p}^o$ will distort the incentives of the users to choose $\tilde{\mathbf{p}}$.

The manager can implement his target $\tilde{\mathbf{p}}$ at the expense of intervention with a positive probability. If the manager adopts the following intervention function

$$g(\mathbf{p}) = \sum_{i \in N} \frac{\frac{1}{1 + \epsilon q} p_i - \tilde{p}_i}{\tilde{p}_i} + \frac{(n + 1) \epsilon q}{1 + \epsilon q}, \tag{21}$$

where $q = \sum_{i \in N} \frac{1}{\tilde{p}_i}$, then $\tilde{\mathbf{p}}$ is a Nash equilibrium of the contention game with intervention $g$, but the average level of intervention at $\tilde{\mathbf{p}}$ is

$$E[g(\mathbf{p}^o) \mid \tilde{\mathbf{p}}] = g(\tilde{\mathbf{p}}) = \frac{\epsilon q}{1 + \epsilon q} > 0, \tag{22}$$

which can be thought of as the efficiency loss due to the noise in observations.

Figure 4 illustrates the set of payoff profiles that can be achieved with the intervention function given by (21). As the size of the noise gets smaller, the set expands to approach the Pareto efficiency boundary.

5.3.3 Observation on the Aggregate Probability

We consider the case where the manager can observe only the frequency of the slots that are not accessed by any user. If the users transmit their packets according to $\mathbf{p}$, then the manager observes only the aggregate probability $\prod_{i \in N} (1 - p_i)$. In this scenario, the intervention function that the manager chooses has to be a function of $\prod_{i \in N} (1 - p_i)$, and this implies that the manager cannot discriminate among the users.
The TRD-based intervention function $g^*$ allows the manager to use different reactions to each user's deviation. In the effective region where the TRD is between 0 and 1, a one-unit increase in $p_i$ results in an increase of $\frac{1}{\tilde{p}_i}$ units in $p_0$. However, this kind of discrimination through the structure of the intervention function is impossible when the manager cannot observe individual transmission probabilities. This limitation forces the manager to treat the users equally, and the target has to be chosen such that $\tilde{p}_i = \tilde{p}$ for all $i \in N$. If the manager uses the following intervention function,

$$g(\mathbf{p}) = \left[ \frac{1}{\tilde{p} (1 - \tilde{p})^{n-1}} \left( (1 - \tilde{p})^n - \prod_{i \in N} (1 - p_i) \right) \right]_0^1 \tag{23}$$

then he can implement $\tilde{\mathbf{p}} = (\tilde{p}, \ldots, \tilde{p})$ with $g(\tilde{\mathbf{p}}) = 0$ as a Stackelberg equilibrium. Hence, if the manager only observes the aggregate probability, he cannot set the target transmission probabilities differently across users.

Figure 5 shows the payoff profiles achieved with symmetric strategy profiles, which can be implemented by the manager who observes the aggregate probability.

5.4 Limited Observability of the Users and Conjectural Equilibrium

We now relax Requirement U and assume that user $i$ can observe only the aggregate probability $(1 - p_0) \prod_{j \neq i} (1 - p_j)$. Even though the users do not know the exact form of the intervention function of the manager, they are aware of the dependence of $p_0$ on their transmission probabilities and try to model this dependence based on their observations $\left(p_i, (1 - p_0) \prod_{j \neq i} (1 - p_j)\right)$. Specifically, user $i$ builds a conjecture function $f_i : [0, 1] \to [0, 1]$, which means that user $i$ conjectures that the value of $(1 - p_0) \prod_{j \neq i} (1 - p_j)$ will be $f_i(p_i)$ if it chooses $p_i$. The equilibrium concept appropriate in this context is the conjectural equilibrium first introduced by Hahn [20].
[Figure 5: Payoffs that can be achieved by the manager who observes only the aggregate probability, plotted in the $(u_1, u_2)$ plane. (a) Homogeneous users with $k_1 = k_2 = 1$. (b) Heterogeneous users with $k_1 = 1$ and $k_2 = 2$.]

Definition 4 A strategy profile $\hat{\mathbf{p}}$ and a profile of conjectures $(f_1, \ldots, f_n)$ constitute a conjectural equilibrium of the contention game with intervention $g$ if

$$k_i \hat{p}_i f_i(\hat{p}_i) \geq k_i p_i f_i(p_i) \quad \text{for all } p_i \in P_i \tag{24}$$

and

$$f_i(\hat{p}_i) = (1 - g(\hat{\mathbf{p}})) \prod_{j \neq i} (1 - \hat{p}_j) \tag{25}$$

for all $i \in N$.

The first condition states that $\hat{p}_i$ is optimal given user $i$'s conjecture $f_i$, and the second condition says that its conjecture is consistent with its observation. It can be seen from this definition that the conjectural equilibrium is a generalization of Nash equilibrium, in that any Nash equilibrium is a conjectural equilibrium with every user holding the correct conjecture given others' strategies. On the other hand, the concept can be very permissive: in the game we consider, for any strategy profile $\hat{\mathbf{p}} \in P$, there exists a conjecture profile $(f_1, \ldots, f_n)$ that constitutes a conjectural equilibrium. For example, we can set $f_i(p_i) = (1 - g(\hat{\mathbf{p}})) \prod_{j \neq i} (1 - \hat{p}_j)$ if $p_i = \hat{p}_i$ and 0 otherwise.

Since the TRD-based intervention function $g^*$ is linear in each $p_i$, it is natural for the users to adopt a conjecture function of the linear form. Let us assume that conjecture functions are of the following trimmed linear form:

$$f_i(p_i) = [a_i - b_i p_i]_0^1 \tag{26}$$

for some $a_i, b_i > 0$.

We say that a conjecture function $f_i$ is linearly consistent at $\hat{\mathbf{p}}$ if it is locally correct up to the first derivative at $\hat{\mathbf{p}}$, i.e., $f_i(\hat{p}_i) = (1 - g(\hat{\mathbf{p}})) \prod_{j \neq i} (1 - \hat{p}_j)$ and $f'_i(\hat{p}_i) = - \frac{\partial g(\hat{\mathbf{p}})}{\partial p_i} \prod_{j \neq i} (1 - \hat{p}_j)$.
Since the TRD-based intervention function $g^*$ is linear in each $p_i$, the conjecture function $f^*_i(p_i) \triangleq (1 - g^*(p_i, \tilde{\mathbf{p}}_{-i})) \prod_{j \neq i} (1 - \tilde{p}_j)$ is linearly consistent at $\tilde{\mathbf{p}}$, and $\tilde{\mathbf{p}}$ and $(f^*_1, \ldots, f^*_n)$ constitute a conjectural equilibrium. Therefore, as long as the users use linearly consistent conjectures, limited observability of the users does not affect the final outcome. To build linearly consistent conjectures, however, the users need to experiment and collect data using local deviations from the equilibrium point in a repeated play of the Stackelberg contention game. A loss in performance may result during this learning phase.

6 Illustrative Results

6.1 Homogeneous Users

We assume that the users are homogeneous with $k_i = 1$ for all $i \in N$. Given a transmission probability profile $\mathbf{p}$, the system utilization ratio can be defined as the probability of successful transmission in a given slot:

$$\tau(\mathbf{p}) = \sum_{i \in N} p_i \prod_{j \neq i} (1 - p_j). \tag{27}$$

Note that the maximum system utilization ratio is 1, which occurs when only one user transmits with probability 1 while the others never transmit. Table 1 shows the individual payoffs and the system utilization ratios for 3, 10, and 100 users when the manager implements the target at the symmetric efficient strategy profile $\tilde{\mathbf{p}} = \left(\frac{1}{n}, \ldots, \frac{1}{n}\right)$.

| $n$ | Individual Payoff | System Utilization Ratio |
|---|---|---|
| 3 | 0.14815 | 0.44444 |
| 10 | 0.03874 | 0.38742 |
| 100 | 0.00370 | 0.36973 |

Table 1. Individual payoffs and system utilization ratios with homogeneous users

We can see that packets are successfully transmitted in approximately 37% of the slots with a large number of users even if there is no explicit coordination among the users. The system utilization of our model converges to $1/e \approx 36.8\%$ as $n$ goes to infinity, which coincides with the maximal throughput of a slotted Aloha system with Poisson arrivals and an infinite number of users [21].
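The entries of Table 1 follow directly from the payoff function and (27) evaluated at $p_i = 1/n$. The short sketch below recomputes them; the function name is an assumption made for the example, and the calculation relies on the fact that there is no intervention at the target.

```python
from math import exp

def symmetric_outcome(n):
    """Individual payoff and system utilization at p_i = 1/n with k_i = 1."""
    p = 1.0 / n
    individual = p * (1.0 - p) ** (n - 1)  # success probability of one user
    utilization = n * individual           # tau(p) from eq. (27)
    return individual, utilization

for n in (3, 10, 100):
    individual, utilization = symmetric_outcome(n)
    print(f"n={n:>3}  individual={individual:.5f}  utilization={utilization:.5f}")

# The utilization approaches 1/e as the number of users grows.
limit_gap = abs(symmetric_outcome(10**6)[1] - exp(-1.0))
```

The utilization equals $(1 - 1/n)^{n-1}$, which decreases toward $1/e$ as $n$ grows.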
In our model, however, users maintain their selfish behavior, and we do not use any feedback information on the channel state.

6.2 Heterogeneous Users

We now consider users with different valuations. Specifically, we assume that $k_i = i$ for $i = 1, \ldots, n$. We will consider three targets: $\tilde{\mathbf{p}}^1 = (1, \ldots, n) / \sum_{i=1}^{n} i$, $\tilde{\mathbf{p}}^2 = \left(\frac{1}{n}, \ldots, \frac{1}{n}\right)$, and $\tilde{\mathbf{p}}^3$ with which $\tilde{u}_i(\tilde{\mathbf{p}}^3; g^*) = \tilde{u}_j(\tilde{\mathbf{p}}^3; g^*)$ for all $i, j$. $\tilde{\mathbf{p}}^1$ assigns a higher transmission probability to a user with a higher valuation. $\tilde{\mathbf{p}}^2$ treats all the users equally regardless of their valuations. $\tilde{\mathbf{p}}^3$ is egalitarian in that it yields the same individual payoff to every user, which implies that a user with a low valuation is assigned a higher transmission probability.

| Target | $n$ | Average Individual Payoff | Aggregate Payoff | Standard Deviation of Payoffs | System Utilization Ratio | Nash Product | Generalized Nash Product |
|---|---|---|---|---|---|---|---|
| $\tilde{\mathbf{p}}^1$ | 3 | 0.38889 | 1.16667 | 0.32710 | 0.47222 | 1.28601e-2 | 2.48073e-3 |
| $\tilde{\mathbf{p}}^1$ | 10 | 0.28048 | 2.80481 | 0.24643 | 0.39384 | 3.40193e-9 | 4.57497e-30 |
| $\tilde{\mathbf{p}}^1$ | 100 | 0.24855 | 24.85466 | 0.22189 | 0.37034 | 2.12632e-98 | ≈ 0 |
| $\tilde{\mathbf{p}}^2$ | 3 | 0.29630 | 0.88889 | 0.12096 | 0.44444 | 1.95092e-2 | 1.14183e-3 |
| $\tilde{\mathbf{p}}^2$ | 10 | 0.21308 | 2.13081 | 0.11127 | 0.38742 | 2.76432e-8 | 4.83117e-34 |
| $\tilde{\mathbf{p}}^2$ | 100 | 0.18671 | 18.67135 | 0.10673 | 0.36973 | 5.73364e-86 | ≈ 0 |
| $\tilde{\mathbf{p}}^3$ | 3 | 0.25133 | 0.75400 | 0 | 0.46078 | 1.58765e-2 | 2.52064e-4 |
| $\tilde{\mathbf{p}}^3$ | 10 | 0.13753 | 1.37533 | 0 | 0.40283 | 2.42148e-9 | 4.09682e-48 |
| $\tilde{\mathbf{p}}^3$ | 100 | 0.07303 | 7.30337 | 0 | 0.37885 | 2.25070e-114 | ≈ 0 |

Table 2. Average individual payoffs, aggregate payoffs, standard deviations of individual payoffs, system utilization ratios, Nash products, and generalized Nash products with heterogeneous users

Table 2 shows that a tradeoff between efficiency (measured by the sum of payoffs) and equity exists when users are heterogeneous. A higher aggregate payoff is achieved when users with high valuations are given priority.
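The $n = 3$ rows of Table 2 for the first two targets can be recomputed from the definitions. The sketch below is an illustration, not the authors' code: it assumes that the standard deviation is the population standard deviation and that the generalized Nash product uses weights equal to the valuations, both of which reproduce the tabulated numbers in this case.

```python
from math import prod
from statistics import pstdev

def table_row(p, k):
    """Summary statistics of the payoff profile at target p (no intervention at target)."""
    n = len(p)
    success = [p[i] * prod(1.0 - p[j] for j in range(n) if j != i) for i in range(n)]
    payoffs = [k[i] * success[i] for i in range(n)]
    return {
        "average": sum(payoffs) / n,
        "aggregate": sum(payoffs),
        "std": pstdev(payoffs),                # population standard deviation (assumed)
        "utilization": sum(success),           # tau(p) from eq. (27)
        "nash_product": prod(payoffs),
        "generalized_nash_product": prod(u ** w for u, w in zip(payoffs, k)),
    }

n = 3
k = [float(i) for i in range(1, n + 1)]        # valuations k_i = i
target_1 = [ki / sum(k) for ki in k]           # proportional to valuations
target_2 = [1.0 / n] * n                       # symmetric target

row_1 = table_row(target_1, k)
row_2 = table_row(target_2, k)
```

In this small case the valuation-weighted target yields the larger aggregate payoff and generalized Nash product, while the symmetric target yields the larger Nash product and the smaller dispersion of payoffs.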
At the same time, it limits access by users with low valuations, which increases the variation in individual payoffs. Also, the results in Table 2 are consistent with the facts that $\tilde{\mathbf{p}}^2$ is a Nash bargaining solution and that $\tilde{\mathbf{p}}^1$ is a nonsymmetric Nash bargaining solution with weights equal to valuations.

7 Conclusion

We have analyzed the problem of multiple users who share a common communication channel. Using the game theory framework, we have shown that selfish behavior is likely to lead to a network collapse. However, full system utilization requires coordination among users using explicit message exchanges, which may be impractical given the distributed nature of wireless networks. To achieve better performance without coordination schemes, users need to sustain cooperation. We provide incentives for selfish users to limit their access to the channel by introducing an intervention function of the network manager. With TRD-based intervention functions, the manager can implement any outcome of the contention game as a Stackelberg equilibrium. We have discussed the amount of information required for implementation, and how the various kinds of relaxations of the requirements affect the outcome of the Stackelberg contention game.

Our approach of using an intervention function to improve network performance can be applied to other situations in wireless communications. Potential applications of the idea include sustaining cooperation in multi-hop networks and limiting the attacks of adversarial users. An intervention function may be designed to serve as a coordination device in addition to providing selfish users with incentives to cooperate.
Finally, designing a protocol that enables users to play the role of the manager in a distributed manner will be critical to ensure that our approach can be adopted in completely decentralized communication scenarios, where no manager is present.

A Proof of Proposition 3

Recall $h(\mathbf{p}) = \frac{p_1}{\tilde{p}_1} + \cdots + \frac{p_n}{\tilde{p}_n} - n$ used to define $g^*(\mathbf{p})$. We examine whether a strategy profile $\hat{\mathbf{p}}$ with $\hat{p}_i < 1$ for all $i \in N$ constitutes a Nash equilibrium of $\Gamma_{g^*}$ by considering four cases on the value of $h(\hat{\mathbf{p}})$.

Case 1. $h(\hat{\mathbf{p}}) < 0$.

Let $\epsilon = -h(\hat{\mathbf{p}}) > 0$. If user $i$ changes its transmission probability from $\hat{p}_i$ to $\min\{\hat{p}_i + \epsilon \tilde{p}_i, 1\}$, then its payoff increases because $h$ remains nonpositive and hence $p_0$ is still zero. Hence $\hat{\mathbf{p}}$ cannot be a Nash equilibrium if $h(\hat{\mathbf{p}}) < 0$.

Case 2. $h(\hat{\mathbf{p}}) = 0$.

Consider an arbitrary user $i$. If it deviates to $p_i < \hat{p}_i$, $p_0$ is still zero and $\tilde{u}_i$ decreases. $\tilde{u}_i(p_i, \hat{\mathbf{p}}_{-i})$ is differentiable and strictly concave on $p_i > \hat{p}_i$. Since

$$\frac{d \tilde{u}_i}{d p_i} = \left( k_i \prod_{j \neq i} (1 - \hat{p}_j) \right) \left( 1 + n - \sum_{j \neq i} \frac{\hat{p}_j}{\tilde{p}_j} - \frac{2 p_i}{\tilde{p}_i} \right),$$

$k_i > 0$ and $\hat{p}_i < 1$ for all $i$,

$$\operatorname{sign}\left( \left. \frac{d \tilde{u}_i}{d p_i} \right|_{p_i = \hat{p}_i} \right) = \operatorname{sign}\left( 1 + n - \sum_{j \neq i} \frac{\hat{p}_j}{\tilde{p}_j} - \frac{2 \hat{p}_i}{\tilde{p}_i} \right) \tag{28}$$

$$= \operatorname{sign}\left( 1 + n - \sum_{j=1}^{n} \frac{\hat{p}_j}{\tilde{p}_j} - \frac{\hat{p}_i}{\tilde{p}_i} \right) \tag{29}$$

$$= \operatorname{sign}\left( 1 - \frac{\hat{p}_i}{\tilde{p}_i} \right). \tag{30}$$

There is no gain for user $i$ from deviating to any $p_i > \hat{p}_i$ if and only if $\left. \frac{d \tilde{u}_i}{d p_i} \right|_{p_i = \hat{p}_i} \leq 0$, which is equivalent to $\hat{p}_i \geq \tilde{p}_i$. For $\hat{\mathbf{p}}$ to be a Nash equilibrium, we need $\hat{p}_i \geq \tilde{p}_i$ for all $i = 1, \ldots, n$. To satisfy $h(\hat{\mathbf{p}}) = 0$, all inequalities should be equalities. Hence, only $\hat{\mathbf{p}} = \tilde{\mathbf{p}}$ is a Nash equilibrium among $\hat{\mathbf{p}}$ such that $h(\hat{\mathbf{p}}) = 0$.

Case 3. $0 < h(\hat{\mathbf{p}}) < 1$.

Since $\tilde{u}_i \geq 0$, there is no gain for user $i$ from deviating to $p_i$ such that $h(p_i, \hat{\mathbf{p}}_{-i}) \geq 1$.
If there is a gain from deviation to $p_i$ such that $h(p_i, \hat{p}_{-i}) < 0$, then there is another profitable deviation $p'_i$ such that $h(p'_i, \hat{p}_{-i}) = 0$ by using the argument of Case 1. Therefore, we can restrict our attention to deviations $p_i$ that lead to $0 \leq h(p_i, \hat{p}_{-i}) < 1$. At such a deviation by user $i$,

$$\tilde{u}_i(p_i, \hat{p}_{-i}) = k_i \prod_{j \neq i} (1 - \hat{p}_j)\, p_i \Big(1 + n - \sum_{j \neq i} \frac{\hat{p}_j}{\tilde{p}_j} - \frac{p_i}{\tilde{p}_i}\Big). \tag{31}$$

$\hat{p}_i$ is a best response to $\hat{p}_{-i}$ if and only if $\frac{d\tilde{u}_i}{dp_i}\big|_{p_i = \hat{p}_i} = 0$. Using the first derivative given in Case 2, we obtain

$$\frac{\hat{p}_i}{\tilde{p}_i} = 1 + n - \sum_{j=1}^{n} \frac{\hat{p}_j}{\tilde{p}_j} = 1 - h(\hat{p}) < 1. \tag{32}$$

For $\hat{p}$ to be a Nash equilibrium, the above inequality should be satisfied for every $i$, which in turn implies

$$\sum_{i=1}^{n} \frac{\hat{p}_i}{\tilde{p}_i} < n, \tag{33}$$

and this contradicts the initial assumption $h(\hat{p}) > 0$. Therefore, there is no $\hat{p}$ with $0 < h(\hat{p}) < 1$ that constitutes a Nash equilibrium.

Case 4. $h(\hat{p}) \geq 1$. Since $\tilde{u}_i(\hat{p}) = 0$ for every $i$, there is a profitable deviation of user $i$ only if there exists $p_i \in (0, \hat{p}_i)$ such that $h(p_i, \hat{p}_{-i}) < 1$. Equivalently, if setting $p_i = 0$ yields $h(p_i, \hat{p}_{-i}) \geq 1$, then there is no profitable deviation of user $i$ from $\hat{p}_i$. Since

$$h(0, \hat{p}_{-i}) = \sum_{j \neq i} \frac{\hat{p}_j}{\tilde{p}_j} - n, \tag{34}$$

$\hat{p}$ with $h(\hat{p}) \geq 1$ is a Nash equilibrium if and only if

$$\sum_{j \neq i} \frac{\hat{p}_j}{\tilde{p}_j} - n \geq 1 \quad \text{for all } i = 1, \ldots, n. \tag{35}$$

B Proof of Proposition 4

Consider $t = 1$. User $i$ chooses $\hat{p}_i^1$ to maximize

\begin{align}
\tilde{u}_i^1(p_i, \hat{p}_{-i}^0) &= k_i\, p_i \big(1 - g^1(p_i, \hat{p}_{-i}^0)\big) \prod_{j \neq i} (1 - \hat{p}_j^0) \tag{36} \\
&= \begin{cases}
k_i \prod_{j \neq i} (1 - \hat{p}_j^0)\, p_i & \text{if } p_i < \tilde{p}_i \Big(1 - \frac{\hat{p}_n^0}{\tilde{p}_n} + \frac{\hat{p}_i^0}{\tilde{p}_i}\Big), \\
k_i \prod_{j \neq i} (1 - \hat{p}_j^0)\, p_i \Big(2 - \frac{\hat{p}_n^0}{\tilde{p}_n} + \frac{\hat{p}_i^0}{\tilde{p}_i} - \frac{p_i}{\tilde{p}_i}\Big) & \text{if } \tilde{p}_i \Big(1 - \frac{\hat{p}_n^0}{\tilde{p}_n} + \frac{\hat{p}_i^0}{\tilde{p}_i}\Big) \leq p_i \leq \tilde{p}_i \Big(2 - \frac{\hat{p}_n^0}{\tilde{p}_n} + \frac{\hat{p}_i^0}{\tilde{p}_i}\Big), \\
0 & \text{if } p_i > \tilde{p}_i \Big(2 - \frac{\hat{p}_n^0}{\tilde{p}_n} + \frac{\hat{p}_i^0}{\tilde{p}_i}\Big).
\end{cases} \tag{37}
\end{align}

If $0 \leq \frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} < 2$, the maximum is attained at $\hat{p}_i^1$ that satisfies

$$\frac{\hat{p}_i^1}{\tilde{p}_i} = 1 - \frac{1}{2}\Big(\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big). \tag{38}$$

Notice that $\hat{p}_n^1 = \tilde{p}_n$. If $\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} \geq 2$, then $\tilde{u}_i^1(p_i, \hat{p}_{-i}^0) = 0$ for all $p_i \geq 0$. Since any $p_i$ is a best response in this case, we assume that $\hat{p}_i^1 = \hat{p}_i^0$.$^4$

$^4$ If we assume that $\hat{p}_i^1$ is chosen according to (38), we do not need the assumption that for each $i$ either $\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} < 2$ or $\frac{\hat{p}_i^0}{\tilde{p}_i} \leq 1$ in the proposition.

Consider $t = 2$. First, consider user $i$ such that $\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} < 2$. Since $\frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^1}{\tilde{p}_i} = \frac{1}{2}\Big(\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big)$, we have $0 \leq \frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^1}{\tilde{p}_i} < 2$. Using an analogous argument, we get

$$\frac{\hat{p}_i^2}{\tilde{p}_i} = 1 - \frac{1}{2}\Big(\frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^1}{\tilde{p}_i}\Big) = 1 - \frac{1}{2^2}\Big(\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big). \tag{39}$$

Next consider user $i$ such that $\frac{\hat{p}_i^0}{\tilde{p}_i} \leq 1$. Since $\frac{\hat{p}_n^1}{\tilde{p}_n} = 1$, we again have $0 \leq \frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^1}{\tilde{p}_i} < 2$ and the best response is given by

$$\frac{\hat{p}_i^2}{\tilde{p}_i} = 1 - \frac{1}{2}\Big(\frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^1}{\tilde{p}_i}\Big) = 1 - \frac{1}{2}\Big(\frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big). \tag{40}$$

Considering a general $t \geq 2$, we get

$$\frac{\hat{p}_i^t}{\tilde{p}_i} = 1 - \frac{1}{2^t}\Big(\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big) \tag{41}$$

for user $i$ such that $\frac{\hat{p}_n^0}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i} < 2$ and

$$\frac{\hat{p}_i^t}{\tilde{p}_i} = 1 - \frac{1}{2^{t-1}}\Big(\frac{\hat{p}_n^1}{\tilde{p}_n} - \frac{\hat{p}_i^0}{\tilde{p}_i}\Big) \tag{42}$$

for user $i$ such that $\frac{\hat{p}_i^0}{\tilde{p}_i} \leq 1$. Taking limits as $t \to \infty$, we obtain the conclusions of the proposition.

C Proof of Proposition 6

Suppose that the users in the coalition $S = \{i, j\}$ choose $(p_i, p_j)$ instead of $(\tilde{p}_i, \tilde{p}_j)$. Then

$$h(p_i, p_j, \tilde{p}_{-S}) = \frac{p_i}{\tilde{p}_i} + \frac{p_j}{\tilde{p}_j} + (n - 2) - n = \frac{p_i}{\tilde{p}_i} + \frac{p_j}{\tilde{p}_j} - 2, \tag{43}$$

and

\begin{align}
\tilde{u}_i(p_i, p_j, \tilde{p}_{-S}) &= k_i \prod_{k \notin S} (1 - \tilde{p}_k)\, p_i (1 - p_j)\big(1 - g^*(p_i, p_j, \tilde{p}_{-S})\big), \tag{44} \\
\tilde{u}_j(p_i, p_j, \tilde{p}_{-S}) &= k_j \prod_{k \notin S} (1 - \tilde{p}_k)\, p_j (1 - p_i)\big(1 - g^*(p_i, p_j, \tilde{p}_{-S})\big). \tag{45}
\end{align}

Hence, $\tilde{p}$ is coalition-proof with respect to $S$ if and only if there does not exist $(p_i, p_j) \in [0, 1]^2$ such that

\begin{align}
p_i (1 - p_j)\big(1 - g^*(p_i, p_j, \tilde{p}_{-S})\big) &\geq \tilde{p}_i (1 - \tilde{p}_j), \tag{46} \\
p_j (1 - p_i)\big(1 - g^*(p_i, p_j, \tilde{p}_{-S})\big) &\geq \tilde{p}_j (1 - \tilde{p}_i) \tag{47}
\end{align}

with at least one inequality strict. First, notice that setting $p_i = \tilde{p}_i$ and $p_j \neq \tilde{p}_j$ will violate one of the two inequalities. The inequality for user $i$ will not hold if $p_j > \tilde{p}_j$, and the one for user $j$ will not hold if $p_j < \tilde{p}_j$. Hence, both $p_i \neq \tilde{p}_i$ and $p_j \neq \tilde{p}_j$ are necessary to have both inequalities satisfied at the same time. We consider four possible cases.

Case 1. $p_i < \tilde{p}_i$ and $p_j > \tilde{p}_j$. Since $g^*(\cdot) \geq 0$, (46) is violated.

Case 2. $p_i > \tilde{p}_i$ and $p_j < \tilde{p}_j$. Inequality (47) is violated.

Case 3. $p_i < \tilde{p}_i$ and $p_j < \tilde{p}_j$. Since $h(p_i, p_j, \tilde{p}_{-S}) < 0$, $g^*(p_i, p_j, \tilde{p}_{-S}) = 0$. Hence, (46) and (47) become

\begin{align}
p_i (1 - p_j) &\geq \tilde{p}_i (1 - \tilde{p}_j), \tag{48} \\
p_j (1 - p_i) &\geq \tilde{p}_j (1 - \tilde{p}_i). \tag{49}
\end{align}

We consider the contour curves of $p_i (1 - p_j)$ and $p_j (1 - p_i)$ going through $(\tilde{p}_i, \tilde{p}_j)$ in the $(p_i, p_j)$-plane. The slope of the contour curve of $p_i (1 - p_j)$ at $(\tilde{p}_i, \tilde{p}_j)$ is $\frac{1 - \tilde{p}_j}{\tilde{p}_i}$ and that of $p_j (1 - p_i)$ is $\frac{\tilde{p}_j}{1 - \tilde{p}_i}$. There is no area of mutual improvement if and only if

$$\frac{1 - \tilde{p}_j}{\tilde{p}_i} \geq \frac{\tilde{p}_j}{1 - \tilde{p}_i}, \tag{50}$$

which is equivalent to $\tilde{p}_i + \tilde{p}_j \leq 1$.

Case 4. $p_i > \tilde{p}_i$ and $p_j > \tilde{p}_j$. Since $h(p_i, p_j, \tilde{p}_{-S}) > 0$, $g^*(p_i, p_j, \tilde{p}_{-S}) = h(p_i, p_j, \tilde{p}_{-S})$ as long as $\frac{p_i}{\tilde{p}_i} + \frac{p_j}{\tilde{p}_j} \leq 3$. Hence, (46) and (47) become

\begin{align}
p_i (1 - p_j)\Big(3 - \frac{p_i}{\tilde{p}_i} - \frac{p_j}{\tilde{p}_j}\Big) &\geq \tilde{p}_i (1 - \tilde{p}_j), \tag{51} \\
p_j (1 - p_i)\Big(3 - \frac{p_i}{\tilde{p}_i} - \frac{p_j}{\tilde{p}_j}\Big) &\geq \tilde{p}_j (1 - \tilde{p}_i). \tag{52}
\end{align}

The slope of the contour curve of $p_i (1 - p_j)\Big(3 - \frac{p_i}{\tilde{p}_i} - \frac{p_j}{\tilde{p}_j}\Big)$ at $(\tilde{p}_i, \tilde{p}_j)$ is

$$\frac{(1 - \tilde{p}_j)\Big(3 - \frac{2\tilde{p}_i}{\tilde{p}_i} - \frac{\tilde{p}_j}{\tilde{p}_j}\Big)}{\tilde{p}_i \Big(3 + \frac{1}{\tilde{p}_j} - \frac{\tilde{p}_i}{\tilde{p}_i} - \frac{2\tilde{p}_j}{\tilde{p}_j}\Big)} = 0, \tag{53}$$

and that of $p_j (1 - p_i)\Big(3 - \frac{p_i}{\tilde{p}_i} - \frac{p_j}{\tilde{p}_j}\Big)$ is

$$\frac{\tilde{p}_j \Big(3 + \frac{1}{\tilde{p}_i} - \frac{2\tilde{p}_i}{\tilde{p}_i} - \frac{\tilde{p}_j}{\tilde{p}_j}\Big)}{(1 - \tilde{p}_i)\Big(3 - \frac{\tilde{p}_i}{\tilde{p}_i} - \frac{2\tilde{p}_j}{\tilde{p}_j}\Big)} = +\infty. \tag{54}$$

Therefore, there is no $(p_i, p_j) > (\tilde{p}_i, \tilde{p}_j)$ that satisfies (46) and (47) at the same time.

D Proof of Proposition 7

The "if" part is trivial because a strategy profile that is coalition-proof with respect to the grand coalition is Pareto efficient. To establish the "only if" part, we will prove that if, for a given strategy profile, there exists a coalition that can improve the payoffs of its members, then its deviation will not hurt the users outside the coalition, which shows that the original strategy profile is not Pareto efficient.

Consider a strategy profile $\tilde{p}$ and a coalition $S \subset N$ that can improve upon $\tilde{p}$ by deviating from $\tilde{p}_S$ to $p_S$. Let $p_0 = g^*(p_S, \tilde{p}_{-S})$ be the transmission probability of the manager after the deviation by coalition $S$. Since choosing $p_S$ instead of $\tilde{p}_S$ yields higher payoffs to the members of $S$, we have

$$p_i (1 - p_0) \prod_{j \in S \setminus \{i\}} (1 - p_j) \geq \tilde{p}_i \prod_{j \in S \setminus \{i\}} (1 - \tilde{p}_j) \tag{55}$$

for all $i \in S$ with at least one inequality strict. We want to show that the users not in the coalition $S$ do not get lower payoffs as a result of the deviation by $S$, that is,

$$(1 - p_0) \prod_{j \in S} (1 - p_j) \geq \prod_{j \in S} (1 - \tilde{p}_j). \tag{56}$$

Suppose $(1 - p_0) \prod_{j \in S} (1 - p_j) < \prod_{j \in S} (1 - \tilde{p}_j)$. We can see that $p_0 < 1$ and $0 < p_i < 1$ for all $i \in S$ because the right-hand side of (55) is strictly positive. Combining this inequality with (55) yields $p_i > \tilde{p}_i$ for all $i \in S$, which implies $p_0 > 0$. We can write $p_i = \tilde{p}_i + \epsilon_i$ for some $\epsilon_i > 0$ for $i \in S$.
Then $p_0 = g^*(p_S, \tilde{p}_{-S}) = \sum_{i \in S} \frac{\epsilon_i}{\tilde{p}_i}$. Inequality (55) can be rewritten as

\begin{align}
\tilde{p}_i \prod_{j \in S \setminus \{i\}} (1 - \tilde{p}_j) &\leq (\tilde{p}_i + \epsilon_i)(1 - p_0) \prod_{j \in S \setminus \{i\}} (1 - \tilde{p}_j - \epsilon_j) \tag{57} \\
&< (\tilde{p}_i + \epsilon_i)(1 - p_0) \prod_{j \in S \setminus \{i\}} (1 - \tilde{p}_j) \tag{58}
\end{align}

for all $i \in S$. Simplifying this gives

$$\frac{\epsilon_i}{\tilde{p}_i} > \frac{p_0}{1 - p_0} \tag{59}$$

for all $i \in S$. Summing these inequalities up over $i \in S$, we get

$$p_0 = \sum_{i \in S} \frac{\epsilon_i}{\tilde{p}_i} > |S| \frac{p_0}{1 - p_0} \tag{60}$$

where $|S|$ is the number of members in $S$. This inequality simplifies to $p_0 < 1 - |S| \leq 0$, which is a contradiction.

References

[1] J.-W. Lee, A. Tang, J. Huang, M. Chiang, and A. R. Calderbank, "Reverse-engineering MAC: a non-cooperative game model," IEEE Journal on Selected Areas in Communications, vol. 25, no. 6, pp. 1135–1147, 2007.
[2] G. Tan and J. Guttag, "The 802.11 MAC protocol leads to inefficient equilibria," in Proceedings of the 24th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2005), vol. 1, pp. 1–11, Miami, FL, USA, March 2005.
[3] R. T. Ma, V. Misra, and D. Rubenstein, "Modeling and analysis of generalized slotted-Aloha MAC protocols in cooperative, competitive and adversarial environments," in Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS '06), Lisboa, Portugal, July 2006.
[4] R. Myerson, Game Theory: Analysis of Conflict, Harvard University Press, Cambridge, MA, USA, 1991.
[5] E. Altman, N. Bonneau, and M. Debbah, "Correlated equilibrium in access control for wireless communications," in Proceedings of NETWORKING 2006, pp. 173–183, Coimbra, Portugal, May 2006.
[6] J. K. MacKie-Mason and H. R. Varian, "Pricing congestible network resources," IEEE Journal on Selected Areas in Communications, vol. 13, no. 7, pp. 1141–1149, 1995.
[7] Y. Jin and G. Kesidis, "A pricing strategy for an Aloha network of heterogeneous users with inelastic bandwidth requirements," in Proceedings of the 39th Annual Conference on Information Sciences and Systems, Princeton, NJ, USA, March 2002.
[8] D. Wang, C. Comaniciu, and U. Tureli, "A fair and efficient pricing strategy for slotted Aloha in MPR models," in Proceedings of the 64th IEEE Vehicular Technology Conference, pp. 2474–2478, Montréal, Canada, September 2006.
[9] R. Johari and J. N. Tsitsiklis, "Efficiency loss in a network resource allocation game," Mathematics of Operations Research, vol. 29, no. 3, pp. 407–435, 2004.
[10] Y. A. Korilis, A. A. Lazar, and A. Orda, "Achieving network optima using Stackelberg routing strategies," IEEE/ACM Transactions on Networking, vol. 5, no. 1, pp. 161–173, 1997.
[11] M. Bloem, T. Alpcan, and T. Başar, "A Stackelberg game for power control and channel allocation in cognitive radio networks," in Proceedings of the 1st International Workshop on Game Theory in Communication Networks (GameComm2007), Nantes, France, October 2007.
[12] L. Chen, T. Cui, S. H. Low, and J. C. Doyle, "A game-theoretic model for medium access control," in Proceedings of the 3rd International Wireless Internet Conference, Austin, TX, USA, October 2007.
[13] L. Chen, S. H. Low, and J. C. Doyle, "Contention control: a game-theoretic approach," in Proceedings of the 46th IEEE Conference on Decision and Control, pp. 3428–3434, New Orleans, LA, USA, December 2007.
[14] A. H. Mohsenian-Rad, J. Huang, M. Chiang, and V. W. S. Wong, "Utility-optimal random access without message passing," IEEE Transactions on Wireless Communications, vol. 8, no. 3, pp. 1073–1079, 2009.
[15] M. Čagalj, S. Ganeriwal, I. Aad, and J.-P. Hubaux, "On selfish behavior in CSMA/CA networks," in Proceedings of the 24th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2005), vol. 4, pp. 2513–2524, Miami, FL, USA, March 2005.
[16] E. Kalai and M. Smorodinsky, "Other solutions to Nash's bargaining problem," Econometrica, vol. 45, no. 3, pp. 513–518, 1975.
[17] M. P. Wellman and J. Hu, "Conjectural equilibrium in multiagent learning," Machine Learning, vol. 33, pp. 179–200, 1998.
[18] J. Hu and M. P. Wellman, "Online learning about other agents in a dynamic multiagent system," in Proceedings of the 2nd International Conference on Autonomous Agents, pp. 239–246, Minneapolis, MN, USA, May 1998.
[19] Y. Vorobeychik, M. P. Wellman, and S. Singh, "Learning payoff functions in infinite games," Machine Learning, vol. 67, pp. 145–168, 2007.
[20] F. H. Hahn, "Exercises in conjectural equilibria," Scandinavian Journal of Economics, vol. 79, no. 2, pp. 210–226, 1977.
[21] D. Bertsekas and R. Gallager, Data Networks, Prentice Hall, Englewood Cliffs, NJ, USA, 1987.
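Supplementary numerical check. The appendix arguments can be spot-checked numerically. The sketch below is illustrative only and is not part of the original analysis. It assumes that $g^*(p)$ equals $h(p)$ clipped to $[0, 1]$, which is how the case analysis in Appendix A and Case 4 of Appendix C appear to use it, sets every $k_i = 1$, and uses a hypothetical target profile $\tilde{p} = (0.2, 0.3, 0.4)$. The function names are made up for this example. It performs a grid search over unilateral deviations (Appendix A, Cases 1 and 2) and iterates the ratio update (38) from Appendix B for comparison with the closed form (41).

```python
from math import prod


def h(p, target):
    """h(p) = sum_i p_i / target_i - n, as recalled in Appendix A."""
    return sum(p_i / t_i for p_i, t_i in zip(p, target)) - len(p)


def g_star(p, target):
    """ASSUMPTION: g*(p) is h(p) clipped to [0, 1], inferred from the case analysis."""
    return min(max(h(p, target), 0.0), 1.0)


def payoff(i, p, target, k_i=1.0):
    """u_i = k_i * p_i * (1 - p_0) * prod_{j != i} (1 - p_j), with p_0 = g*(p)."""
    others_idle = prod(1.0 - p_j for j, p_j in enumerate(p) if j != i)
    return k_i * p[i] * (1.0 - g_star(p, target)) * others_idle


def best_unilateral_gain(p, target, grid_size=1001):
    """Largest payoff gain a single user gets by moving to a grid point in [0, 1]."""
    best = 0.0
    for i in range(len(p)):
        base = payoff(i, p, target)
        for step in range(grid_size):
            q = list(p)
            q[i] = step / (grid_size - 1)
            best = max(best, payoff(i, q, target) - base)
    return best


def ratio_path(r0_i, r0_n, steps):
    """Iterate update (38) on the ratio p_i / target_i, assuming 0 <= r0_n - r0_i < 2.

    User n's own ratio equals 1 after the first step and stays there.
    """
    r_i, r_n = r0_i, r0_n
    path = []
    for _ in range(steps):
        r_i, r_n = 1.0 - 0.5 * (r_n - r_i), 1.0
        path.append(r_i)
    return path


target = [0.2, 0.3, 0.4]  # hypothetical target profile
print("gain at target:", best_unilateral_gain(target, target))
print("gain when h < 0:", best_unilateral_gain([0.1, 0.1, 0.1], target))
print("ratio path:", ratio_path(0.5, 1.5, 5))
```

Under these assumptions, no grid deviation from $\tilde{p}$ should be profitable, a profile with $h < 0$ should admit a profitable deviation, and the iterated ratios should approach 1 at the geometric rate given by (41). A grid search is only a sanity check and does not replace the proofs.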