An Experimental Analysis of Lemke-Howson Algorithm

Bruno Codenotti ∗    Stefano De Rossi †    Marino Pagan ‡

Abstract

We present an experimental investigation of the performance of the Lemke-Howson algorithm, which is the most widely used algorithm for the computation of a Nash equilibrium for bimatrix games. The Lemke-Howson algorithm is based upon a simple pivoting strategy, which corresponds to following a path whose endpoint is a Nash equilibrium. We analyze both the basic Lemke-Howson algorithm and a heuristic modification of it, which we designed to cope with the effects of a 'bad' initial choice of the pivot. Our experimental findings show that, on uniformly random games, the heuristic achieves a linear running time, while the basic Lemke-Howson algorithm runs in time roughly proportional to a polynomial of degree seven. To conduct the experiments, we have developed our own implementation of the Lemke-Howson algorithm, which turns out to be significantly faster than state-of-the-art software. This allowed us to run the algorithm on a much larger set of data, and on instances of much larger size, compared with previous work.

1 Introduction

The computation of a Nash equilibrium for bimatrix games has attracted a lot of attention in recent years. The problem is of central importance in several theoretical and applied areas, and has many applications in different fields, such as the social sciences, biology, and economics. The computational complexity of this problem was unknown for many years, to the point that in 2001 Papadimitriou [9] mentioned it as one of the most important open problems in computational complexity. Recently the problem has been proved complete for the complexity class PPAD, which contains problems for which efficient algorithms are not believed to exist [2].
In spite of the interest in the problem, little work has been done to carry out an accurate evaluation of the performance of the algorithms actually used to solve it. In this paper we experimentally analyze the Lemke-Howson algorithm, which is the best known algorithm for the computation of a Nash equilibrium for bimatrix games. We provide a new implementation of this algorithm, which turns out to be significantly faster than state-of-the-art software, and give an account of its performance. This new implementation allowed us to experimentally analyze the Lemke-Howson algorithm on a much larger set of sample data, and on instances of much larger size, compared with previous work.

∗ IIT-CNR, Via Moruzzi 1, 56124 Pisa, Italy. Email: bruno.codenotti@iit.cnr.it.
† SSSA, P.zza Martiri della Libertà 33, 56127 Pisa, Italy. Email: s.derossi@sssup.it.
‡ SSSA, P.zza Martiri della Libertà 33, 56127 Pisa, Italy. Email: m.pagan@sssup.it.

We also develop a heuristic modification of the Lemke-Howson algorithm, which reduces the computational inefficiency that may result from a 'bad' initial choice of the pivot. On uniformly random games, this heuristic significantly outperforms the basic Lemke-Howson algorithm. While Lemke-Howson takes a number of steps which is roughly proportional to a polynomial of degree 7 with respect to the game size (see Figure 1), the heuristic takes a linear number of steps (see Figure 2).

Figure 1: Average number of pivoting steps performed by LH as a function of the game size.

Figure 2: Average number of pivoting steps performed by our heuristic as a function of the game size.

This improvement makes it possible to compute equilibria in a reasonable time in games with size well beyond 1000 × 1000. The details can be found in the Appendix. Note that with the standard Lemke-Howson algorithm, one can compute equilibria only on games of size up to a few hundred (see Figure 3).
1.1 Other algorithms and previous experimental results

The Lemke-Howson algorithm (LH from now on) is the most widely used algorithm for the computation of Nash equilibria in bimatrix games [4]. Nice descriptions of LH can be found in [16, 17]. More recently, new algorithms have been developed. Porter, Nudelman and Shoham [10] introduced a simple search method (PNS) based on the enumeration of all strategy supports. Sandholm, Gilpin and Conitzer introduced a different algorithm (MIP), based on Mixed Integer Programming [12]. A taxonomy of games can be found in [8]. The results obtained in [8, 10] show that two of the most challenging classes of games are the "uniformly random games" and the "covariant games", which are the main objects of investigation in our paper.

Previous experimental works [10, 12] have shown that on games with small and balanced support, such as random games, PNS outperforms both MIP and LH, while on games with medium-size support LH outperforms both MIP and PNS. These experimental findings have been obtained by using the implementation of LH released in Gambit. A major difference with our work is that we have performed the experiments on a much larger set of sample data, and on instances of much larger size (see Section 3).

1.2 Organization of this paper

In Section 2 we give some background, and introduce the notation used in the paper. In Section 3 we briefly describe our implementation, giving an account of its performance, and present our experimental results. In Section 4 we describe our heuristic, and give ample evidence of its efficiency on uniformly random games. In the Appendix we give a detailed description of our implementation of LH, and present some further experimental results on some classes of games, which, for lack of space, we had to omit from this extended abstract. The source code of our implementation can be found at http://allievi.sssup.it/game.
2 Background and Notation

We consider bimatrix games in normal form. These games are described in terms of two matrices, containing the payoffs of the two players. The rows (resp. columns) of both matrices are indexed by the row (resp. column) player's pure strategies. A mixed strategy consists of a set of pure strategies and a probability distribution (a collection of nonnegative weights adding up to one) which indicates how likely it is that each pure strategy is played. In other words, each player associates to her i-th pure strategy a number p_i between 0 and 1, such that ∑_i p_i = 1. The pure strategies played with positive probability form the support of a mixed strategy.

Let us consider a two-player game, where the row (resp., column) player has m (resp., n) pure strategies, and let x be a mixed strategy of the row player, and y a mixed strategy of the column player. Strategy x is the m-tuple x = (x_1, x_2, ..., x_m), where x_i ≥ 0 and ∑_{i=1}^m x_i = 1. Similarly, y = (y_1, y_2, ..., y_n), where y_j ≥ 0 and ∑_{j=1}^n y_j = 1.

Let now A = (a_ij) be the payoff matrix of the row player. The entry a_ij is the payoff to the row player when she plays her i-th pure strategy and the opponent plays the pure strategy j. According to the mixed strategies x and y, the entry a_ij contributes to the expected payoff of the row player with weight x_i y_j. The expected payoff of the row player can be evaluated by adding up all the entries of A weighted by the corresponding entries of x and y, i.e., the payoff is ∑_{ij} x_i y_j a_ij. This can be rewritten as ∑_i x_i ∑_j a_ij y_j, which can be expressed in matrix terms as x^T A y.¹ Similarly, the expected payoff of the column player is x^T B y.

A pair (x, y) is a Nash equilibrium if x^T A y ≥ x'^T A y and x^T B y ≥ x^T B y', for all stochastic vectors x' and y'. If the pair (x, y) is a Nash equilibrium, we say that x (resp.
y) is a Nash equilibrium strategy for the row (resp. column) player. It is well known that a Nash equilibrium in mixed strategies always exists [7].

An equivalent definition is the following. A Nash equilibrium for a bimatrix game (A, B) is a pair of mixed strategies (x, y) such that each pure strategy is played with positive probability only if it is a best response to the other player's mixed strategy (linear complementarity constraints):

∀ i ∈ {1, ..., m}: either x_i = 0 or (A y)_i ≥ max_k (A y)_k;
∀ j ∈ {1, ..., n}: either y_j = 0 or (x^T B)_j ≥ max_k (x^T B)_k.

¹ We use the notation x^T to denote the transpose of vector x.

To the set of equilibria, we add the artificial equilibrium (0, 0), where no strategy is played. It satisfies the linear complementarity constraints, but it is not a valid equilibrium. We also say that a pair of mixed strategies forms a quasi-equilibrium if all but one of the pure strategies played with positive probability satisfy the linear complementarity constraint.

Given a bimatrix game, and starting from any equilibrium (including the artificial equilibrium), LH follows a path in a graph whose vertices are the equilibria and the quasi-equilibria. Thus, starting from the artificial equilibrium, it is possible to reach a Nash equilibrium of any given game (see [17] for a detailed description of this process).

At each step, the state of the algorithm can be represented by the following system of inequalities, which defines the space of the feasible mixed strategy profiles:

∀ i ∈ {1, ..., m}: x_i ≥ 0;
∀ j ∈ {1, ..., n}: y_j ≥ 0;
∀ j ∈ {1, ..., n}: (x^T B)_j ≤ 1;
∀ i ∈ {1, ..., m}: (A y)_i ≤ 1.

Note that the formulation above has been obtained by applying a suitable scaling procedure where the maximum payoff is set to 1, and the mixed strategies are scaled accordingly. Following [17], these inequalities can be rewritten in block-matrix form as:

[ A ; −I ] y ≤ [ 1 ; 0 ]      (i.e., A y ≤ 1 and y ≥ 0),
x^T [ −I | B ] ≤ [ 0 | 1 ]    (i.e., x ≥ 0 and x^T B ≤ 1),

where [ · ; · ] stacks blocks vertically and [ · | · ] places them side by side, I denotes an identity matrix of the appropriate size, and 0 and 1 denote all-zeros and all-ones vectors.

In order to describe the state of the algorithm at each step, we will use a structure called tableau, which consists of a representation of the inequalities above as equalities, obtained by introducing slack variables. The transformation of inequalities into equalities of the tableau occurs as follows. Consider for example the inequality

a_11 y_1 + a_12 y_2 + ... + a_1n y_n ≤ 1.

After introducing the slack variables, this inequality becomes:

a_11 y_1 + a_12 y_2 + ... + a_1n y_n + r_1 = 1.

Finally, the equation is represented with all the nonbasic variables on the right-hand side. Therefore, at the first step, we will have:

r_1 = 1 − a_11 y_1 − a_12 y_2 − ... − a_1n y_n.

In this equation, the variable on the left-hand side is the basic variable, and the constant on the right-hand side represents its current value. Starting from this representation, the algorithm proceeds via complementary pivoting steps, which allow the computation to move from one quasi-equilibrium to another by changing one of the basic variables each time. This translates into a corresponding change in the tableau. The choice of the new basic variable is determined by the minimum ratio test (lines 12-15 of the pseudocode, which is in the Appendix). The complementary pivoting procedure terminates when all the strategies satisfy the complementarity constraint, i.e., when an equilibrium is reached.

3 Experimental Results

3.1 Our implementation

The current state-of-the-art software for the computation of equilibria for games is Gambit [6]. It can be used to deal with extensive and normal form games. In particular, it is endowed with suitable command-line tools which allow the user to compute Nash equilibria for games given in normal form.
Being intended as a tool for high-level game-theoretic analysis, Gambit does not provide accurate information on the efficiency of LH, e.g., on the number of complementary pivoting steps, and on properties of each execution. Therefore Gambit does not seem to be suitable for large scale experiments. The need for low-level control on the computation and for a faster software tool led us to develop our own implementation of LH. A detailed description of our implementation is given in the Appendix. Figure 3 shows the average running time of LH, as a function of the game size. The games were generated using GAMUT², a suite of game generators [8].

3.2 Uniformly Random Games

In this section we present the data collected for uniformly random games, i.e., games in which each payoff is chosen at random from the uniform distribution. We focus on the number of complementary pivoting steps, rather than on the execution time, the latter being too implementation dependent.

When collecting the data, for each equilibrium computation we kept track both of the equilibrium itself (the strategies played and the probability of playing each of them) and of the number of pivoting steps needed to reach it.

Figure 1 shows the average number of pivoting steps performed by LH: one can see that the number of steps grows polynomially (approximately according to a polynomial of degree 7) with the size of the game. The data has been obtained by running LH on 100,000 instances.

² RandomGame game class.

Figure 3: Average running time of LH. Our implementation (solid line) and Gambit (dotted line). On game sizes larger than 100 × 100, a relevant fraction of games were not solved by Gambit within the capping time of 120 secs.

Figure 4 shows the distribution of the support size of equilibria found by LH. We see that there is a small number of equilibria with large support size.
Thus the behavior of LH agrees with the fact that the probability of finding an equilibrium with a large support size is very low for (sufficiently large) uniformly random games. Although these findings only concern those equilibria which are found by LH, they show a close agreement with the theoretical results on the support of equilibria of random games [1].

To gather some further insight on the behavior of the algorithm, we analyzed the distribution of the number of pivoting steps performed by LH on games of three different sizes: 20, 40, and 100 strategies per player. For each size, we analyzed a large number of runs: more precisely, about 7.5 million runs for games with 20 and 40 strategies per player, and 700,000 runs for games with 100 strategies per player.

Figure 5 shows the distribution of the number of steps for runs on 7.5 million games with 20 strategies per player. Figure 5 shows that the mode of the distribution of the number of complementary pivoting steps is 2, which is the minimum number of steps needed to reach a pure equilibrium. The data in Figure 5 illustrates that the distribution is very sparse: the third quartile is at 34 steps. Thus it is hard to make accurate predictions on the number of steps based on the game size. Indeed, in order to gather 99.5% of our statistical data, we had to go up to more than 200 steps for games with 20 strategies per player.

A closer look at the distribution gives us additional information on the behavior of the algorithm: two different distributions are interleaved, one for an even number of steps, and another one for an odd number of steps. Except for two, three, and four steps, the two interleaved distributions are almost geometric. Similar observations hold for larger games. Figure 6 shows the statistical data for games of size 40 and 100. Zooming in on the distribution, we see that the two interleaved distributions still exist.
As the game size increases, these distributions do not change qualitatively. Indeed, while the mean and the sparseness of the distribution increase with the size, the shape stays the same.

Figure 4: Support size distribution for games with 20 strategies per player (on the left), and 100 strategies per player (on the right). We take as support size the sum of the sizes of the supports of the strategies of both players.

    Mode   Mean    1st Quartile   3rd Quartile   95% Quantile   99.5% Quantile
    2      27.39   7              34             91             203

Figure 5: Number of steps performed by LH on uniformly random games of size 20.

3.3 Other classes of games

Further experimental results have been obtained on covariant games, as defined in [11]. This class contains families of instances which are harder to solve by LH than uniformly random games. For lack of space, the details are in the Appendix.

4 A heuristic improvement on Lemke-Howson algorithm

The number of steps taken by LH depends on the pivot it initially chooses. Since this choice is arbitrary, the algorithm might take a large number of steps, even on instances where there exist pivots on which it would terminate quickly. The performance of LH could be significantly improved if one knew in advance which pivot leads to the minimum number of steps.

In this section, we first analyze the performance of a clairvoyant algorithm (ND-Lemke-Howson) that chooses the best pivot and then executes LH starting from it. We then review a heuristic proposed by Porter, Nudelman and Shoham [10], which can be viewed as a simulation of ND-Lemke-Howson, and finally introduce a novel heuristic, which achieves a more efficient simulation of ND-Lemke-Howson.

Figure 6: Distribution of the number of steps performed by LH on games with 40 (left) and 100 (right) strategies per player. The y-axis is intentionally limited to ease the comparison of the two distributions.
4.1 ND-Lemke-Howson

We simulated ND-Lemke-Howson by executing the standard Lemke-Howson algorithm on all the possible pivots, and choosing the path with the minimum number of steps. The following is the pseudo-code description of this algorithm.

i = Best_pivot_to_start_from();
Lemke-Howson(pivot=i);
return equilibrium;

The first predictable result is that the distribution of the number of steps for ND-Lemke-Howson is much less sparse than the one for LH: for a large fraction of games it terminates in a very small number of steps. Figure 7 shows the distribution of the number of steps after 100,000 executions of ND-Lemke-Howson on 40 × 40 random games. In particular, notice that the maximum number of steps is 12.

A significant difference between LH and ND-Lemke-Howson is that the average number of steps taken by ND-Lemke-Howson does not increase with the size of the game. The mean is almost stable at 2.8, and it slightly decreases as the game size increases (compare Figure 8 with Figure 1). This fact can be explained by observing that there is little correlation between the executions of LH for the same game on different pivots. And so, since the number of pivots increases linearly with the size, the probability of finding a short path increases accordingly.

Figure 7: Number of steps performed by ND-Lemke-Howson algorithm for 40 × 40 games.

Figure 8: Mean of the number of steps performed by ND-Lemke-Howson as a function of the game size.

4.2 Porter, Nudelman and Shoham heuristic approach

The Porter, Nudelman and Shoham heuristic [10] simulates ND-Lemke-Howson by keeping track of all possible different executions of LH (two times the game size), and then performing a single pivoting step on each execution, until one of the paths reaches an equilibrium; the overhead with respect to ND-Lemke-Howson is approximately given by the number of steps performed by ND-Lemke-Howson multiplied by the number of possible pivots.
The following is the pseudo-code implementation of this heuristic.

create 2 * dim different tableaux
while( true ) {
    for i = 1 to 2 * dim do {
        pivoting_step( tableaux[i] );
        if an equilibrium is found then
            return equilibrium;
        else
            continue;
    }
}

A significant implementation issue with this heuristic is that it requires a large amount of memory. Indeed, we have to store a different tableau for each pivoting path. Therefore the memory consumption is worse than LH by a factor of 2 · dim. For a dim × dim game, one has to store 2 · dim tableaux, each of size quadratic in the game size, so that the overall memory consumption is cubic in the game size. This heuristic is thus unsuitable to compute equilibria for games with more than a few hundred strategies per player.

4.3 A novel heuristic

We developed a slightly different heuristic, which avoids the problem of a large memory consumption, takes (on average) fewer pivoting steps, and is much easier to implement. This heuristic takes a parameter, which we call capping, that tells how many steps will be performed on each possible pivot before truncating the execution and starting it again on the following pivot, until some path reaches an equilibrium. If the length of the paths on every pivot is larger than the capping value, then we just execute LH on the last pivot. The following is the pseudo-code implementation of our heuristic.

for i = 1 to 2 * dim - 1 do {
    Lemke-Howson(pivot = i, max_steps = capping);
    if an equilibrium is found then
        return equilibrium;
    else
        continue;
}
if all paths have been truncated then
    Lemke-Howson(pivot = 2 * dim, max_steps = INFTY);
return equilibrium;

The memory consumption is clearly the same as in standard LH, since at each time step we use just one tableau. This allows our implementation to be executed even on games with thousands of strategies for each player.
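The control flow of the capping heuristic can be sketched in a few lines of Python. Here `run_lh` is a hypothetical stand-in for a real Lemke-Howson execution: instead of pivoting, it simply looks up how many steps each starting pivot would need, which is enough to exercise the truncate-and-fall-back logic of the pseudocode above.

```python
INFTY = float("inf")

def run_lh(path_lengths, pivot, max_steps):
    """Stand-in for one Lemke-Howson run from `pivot`: returns the number of
    pivoting steps if the path fits within `max_steps`, else None
    (the execution was truncated)."""
    steps = path_lengths[pivot]
    return steps if steps <= max_steps else None

def capped_lh(path_lengths, capping):
    """Mirror the pseudocode above: try each pivot for at most `capping`
    steps; if every path is truncated, rerun LH on the last pivot
    without a step limit."""
    n_pivots = len(path_lengths)          # corresponds to 2 * dim
    for pivot in range(n_pivots - 1):
        steps = run_lh(path_lengths, pivot, capping)
        if steps is not None:
            return pivot, steps           # this path reached an equilibrium
    # all capped paths were truncated: uncapped run on the last pivot
    return n_pivots - 1, run_lh(path_lengths, n_pivots - 1, INFTY)

# hypothetical per-pivot path lengths for one game
lengths = [57, 214, 9, 88]
print(capped_lh(lengths, 10))   # pivot 2 terminates within the cap of 10
print(capped_lh(lengths, 5))    # every path truncated: fall back to pivot 3
```

The sketch makes the memory argument visible: at any moment only one (simulated) run is active, so a single tableau suffices.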
What makes this heuristic efficient is the fact that only a small fraction of games does not admit any short path. Figure 9 shows the fraction of games (of size 100 × 100) for which the algorithm does not terminate before reaching the last pivot, as a function of the capping value. The lower the capping value, the higher the probability of pivoting on the last strategy. We can see that by choosing a capping value greater than 20 this probability is very small.

We performed an extended analysis to determine the best capping value for each game size on uniformly random games, and we have seen that the best value is approximately 10 steps, and does not significantly change as the game size increases. Figure 10 shows how the average number of steps varies with the capping value for 100 × 100 games. The performance improves significantly as the capping value approaches 10, and then gets slowly worse for larger values. The best performance (on average) is obtained by choosing a capping value of 10.

Figure 9: Fraction of 100 × 100 games for which the heuristic is forced to execute LH on the last pivot, with respect to the capping parameter.

Figure 10: Average number of steps performed by our heuristic as a function of the capping parameter on 100 × 100 games.

Figure 11 shows the distribution of the number of steps for our heuristic. It is similar to the distribution obtained for LH, although it is less sparse and not monotonic. Inside of it one can recognize many geometric distributions, which represent the distribution of the feasible short paths that LH can generate from each pivot.

Figure 11: Number of steps performed by our heuristic, on 100 × 100 games.

The data in Figure 2 shows the average number of steps of our algorithm. The average number of steps turns out to be linear, which is a drastic reduction compared with what we observed for LH (Figure 1).

References

[1] I. Barany, S. Vempala, A.
Vetta. Nash Equilibria in Random Games. FOCS 2005, pp. 123-131.

[2] X. Chen, X. Deng. Settling the Complexity of 2-Player Nash-Equilibrium. Electronic Colloquium on Computational Complexity (ECCC) 140 (2005).

[3] S. Govindan, R. Wilson. A Global Newton Method to Compute Nash Equilibria. Journal of Economic Theory, 110:65-86 (2003).

[4] C. E. Lemke, J. T. Howson. Equilibrium Points in Bimatrix Games. Journal of the Society for Industrial and Applied Mathematics, 12, pp. 413-423 (1964).

[5] O. L. Mangasarian, H. Stone. Two-Person Nonzero-Sum Games and Quadratic Programming. Journal of Mathematical Analysis and Applications, 9:348-355 (1964).

[6] R. D. McKelvey, A. M. McLennan, T. L. Turocy. Gambit: Software Tools for Game Theory, Version 0.2007.01.30. http://econweb.tamu.edu/gambit (2007).

[7] J. F. Nash. Non-cooperative Games. Annals of Mathematics, 51, pp. 286-295 (1951).

[8] E. Nudelman, J. Wortman, K. Leyton-Brown, Y. Shoham. Run the GAMUT: A Comprehensive Approach to Evaluating Game-Theoretic Algorithms. AAMAS 2004, pp. 880-887.

[9] C. H. Papadimitriou. Algorithms, Games, and the Internet. STOC 2001, pp. 749-753.

[10] R. Porter, E. Nudelman, Y. Shoham. Simple Search Methods for Finding a Nash Equilibrium. Proceedings of the National Conference on Artificial Intelligence, 2004, pp. 664-669.

[11] Y. Rinott, M. Scarsini. On the Number of Pure Strategy Nash Equilibria in Random Games. Games and Economic Behavior, 33, 2000.

[12] T. Sandholm, A. Gilpin, V. Conitzer. Mixed-Integer Programming Methods for Finding Nash Equilibria. Proceedings of the National Conference on Artificial Intelligence, 2005, pp. 495-501.

[13] R. Savani. Challenge Instances for NASH. CDAM Research Report LSE-CDAM-2004-14 (2004).

[14] R. Savani, B. von Stengel. Exponentially Many Steps for Finding a Nash Equilibrium in a Bimatrix Game. Proceedings of the 45th Annual Symposium on Foundations of Computer Science, 2004, pp. 258-267.

[15] R. Savani, B. von Stengel.
Hard-to-Solve Bimatrix Games. Econometrica, 74, pp. 397-429 (2006).

[16] L. S. Shapley. A Note on the Lemke-Howson Algorithm. Mathematical Programming Study 1: Pivoting and Extensions, 1974, pp. 175-189.

[17] B. von Stengel. Computing Equilibria for Two-Person Games. In Handbook of Game Theory, vol. 3, eds. R. J. Aumann and S. Hart, North-Holland, Amsterdam, 2002, ch. 45, pp. 1723-1759.

A Our implementation

Our implementation of LH can either search for a Nash equilibrium in a normal form bimatrix game, executing LH on a given pivot strategy, or it can enumerate all equilibria reachable by LH starting from the artificial equilibrium. We will only describe the first feature, the latter being of less importance to the experimental analysis we carry out in this paper.

A.1 The algorithm

A.1.1 Data Structures

The most relevant data structures are those used to store tableaux, equilibria, and lists of equilibria. Since our main concern is execution time rather than memory consumption, we do not use a sparse matrix implementation of the tableau, but just a naive matrix representation. This allows us to efficiently access and update the tableau. To keep things simple, instead of using an array to keep track of the strategies which are in the basis, we store this information directly in the first column of the tableau. The second column represents the actual value of the variable in the basis for that row, while all the other entries represent the coefficients of all nonbasic variables.

For the sake of the reader, we now show an example of how a tableau looks after being initialized. Consider the following game:

    A = | 1  2 |        B = |  7   8 |
        | 3  4 |            |  9  10 |
        | 5  6 |            | 11  12 |

The linear complementarity formulation is the following:

s1 = 1 − x4 − 2 x5
s2 = 1 − 3 x4 − 4 x5
s3 = 1 − 5 x4 − 6 x5
s4 = 1 − 7 x1 − 9 x2 − 11 x3
s5 = 1 − 8 x1 − 10 x2 − 12 x3

where s1, ..., s5 are the slack variables, and x1, ..., x5 are the actual variables.
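The five slack equations above can be generated mechanically from the payoff matrices: one equation per row of A (over the column player's variables, numbered x4 and x5 here) and one per column of B (over x1, x2, x3). The following Python sketch is purely illustrative and separate from the implementation described in this appendix; it reproduces the coefficients of the example.

```python
A = [[1, 2], [3, 4], [5, 6]]     # row player's payoffs (3 x 2)
B = [[7, 8], [9, 10], [11, 12]]  # column player's payoffs (3 x 2)

def slack_equations(A, B):
    """Return one (constant, coefficients) pair per slack variable:
    s_i     = 1 - sum_j A[i][j] * y_j   for each row i of A,
    s_{m+j} = 1 - sum_i B[i][j] * x_i   for each column j of B."""
    m, n = len(A), len(A[0])
    row_eqs = [(1, [-A[i][j] for j in range(n)]) for i in range(m)]
    col_eqs = [(1, [-B[i][j] for i in range(m)]) for j in range(n)]
    return row_eqs + col_eqs

# print each equation over generic variable names v1, v2, ...
for k, (const, coefs) in enumerate(slack_equations(A, B), start=1):
    terms = " ".join(f"- {-c}*v{i + 1}" for i, c in enumerate(coefs))
    print(f"s{k} = {const} {terms}")
```

Running this lists s1 through s5 with exactly the coefficients shown above (the first three equations range over the column player's variables, the last two over the row player's).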
This is the initial setup, where the equations represent the artificial equilibrium, with all the slack variables in the basis. The tableaux will be:

    Tableau 1 = | -1  1  0  0  0  -1  -2 |
                | -2  1  0  0  0  -3  -4 |
                | -3  1  0  0  0  -5  -6 |

    Tableau 2 = | -4  1  0  0  -7   -9  -11 |
                | -5  1  0  0  -8  -10  -12 |

We broke the tableau into two smaller independent tableaux, to simplify its update during the complementary pivoting steps. The first column represents the index of the basic variable for the given row. Positive indices are used for actual variables and negative ones for slack variables. The second column stores the value of that variable. All the other entries in the tableau are the coefficients of all the other variables, which are out of the basis. First come the coefficients of the slack variables, then those of the actual variables. For example, in the first tableau (representing the first three linear complementarity equations) columns 3 to 5 represent the coefficients of the slack variables of indices 1 to 3 (these are all zeros, because all slacks are in the basis in the artificial equilibrium), while columns 6 and 7 are the coefficients of variables 4 and 5.

All the other data structures of interest are used in the enumeration of all Nash equilibria reachable by LH. We use lexicographically sorted linked lists to store both equilibria and lists of equilibria, thus minimizing the time needed to check if an equilibrium has already been inserted in the list. Since we wish to keep the time to compare equilibria as low as possible, we used a null pointer implementation of the artificial equilibrium.

A.1.2 Pseudo-Code

In the following, we describe our implementation of LH.
1  lemke_howson( bimatrix, tableaux, startpivot ) {
2      pivot = startpivot
3
4      while( true ) {
5          cur_tab = get_cur_tableau( pivot )
6          col_i = get_col_i( pivot )
7          minimum_ratio = INFTY
8          for i = 1 to cur_tab.n_rows {
9              if( cur_tab[i][col_i] >= 0 )
10                 continue;
11
12             ratio = -cur_tab[i][2] / cur_tab[i][col_i]
13             if( ratio < minimum_ratio ) {
14                 minimum_ratio = ratio
15                 row_i = i
16             }
17         }
18
19         var_out = get_variable( cur_tab[row_i] )
20         col_i_out = get_col_i( var_out )
21         coef = cur_tab[row_i][col_i]
22         cur_tab[row_i][col_i] = 0
23         cur_tab[row_i][col_i_out] = -1
24         for j = 2 to cur_tab[row_i].length
25             cur_tab[row_i][j] /= -coef
26         cur_tab[row_i][1] = pivot
27         for i = 1 to cur_tab.n_rows {
28             if( cur_tab[i][col_i] != 0 ) {
29                 for j = 2 to cur_tab[i].length
30                     cur_tab[i][j] += cur_tab[i][col_i] * cur_tab[row_i][j]
31
32                 cur_tab[i][col_i] = 0
33             }
34         }
35
36         pivot = -var_out
37         if( pivot == startpivot or pivot == -startpivot )
38             break
39     }
40
41     equilibrium = get_equilibrium( tableaux )
42     return equilibrium
43 }

The algorithm acts on the two matrices describing the game, the strategy to pivot on at the first step, and the two tableaux. In the first execution the two tableaux will be built from the artificial equilibrium, while in any subsequent execution, the algorithm can start from an arbitrarily chosen equilibrium.

The body of the algorithm consists of an infinite loop (lines 4 to 39 above) where the complementary pivoting steps are performed: the algorithm exits from the cycle when the variable which is going to leave the basis is either the starting pivot or its corresponding slack variable (line 37). Indeed, this means we are at an actual Nash equilibrium. At the end of the pivoting steps, we only have to extract the equilibrium from the tableau: this is done by looking at which strategies are in the basis, and what the value of the corresponding variable in the tableau is. We can then recover the actual Nash equilibrium, after normalizing these values so that the sum of the strategies played by each player is 1.
This task is performed by the get_equilibrium function, called after the pivoting steps at line 41.

We now analyze the implementation of the complementary pivoting steps. We first select the tableau which contains our pivot, and then determine the column which corresponds to our pivoting strategy (lines 5 and 6).

The complementary pivoting step consists of two phases: (1) determining the variable going out of the basis (and the row in the tableau associated with it), and (2) updating the tableau according to the new basis. In phase (2), the update involves both the row in the tableau determined in phase (1), and the rest of the coefficients, on the grounds of the expression of our new basic variable as a linear combination of nonbasic variables.

The minimum ratio test (lines 8 to 17) is done by evaluating the ratio between the value of each variable in the basis and the coefficient of the variable entering the basis (our pivot). This ratio is calculated only in those rows where the coefficient of the pivot is negative (lines 9-10). The row which minimizes this ratio is chosen, along with the variable leaving the basis (lines 13-16).

The update of the tableau is done as follows. First of all we determine the index of the column of the variable leaving the basis (lines 19-20). Then, we update the row involved in the change of the basic variable: we set the coefficient of the pivot variable to zero (because now it is a basic variable), and the coefficient of the variable going out of the basis to -1 (we are, in fact, moving that variable from the left-hand side of the equation represented by this row to the right-hand side). After dividing all the other coefficients by the coefficient of the pivot, we get the final values (lines 22 to 25). Finally, we update all the other rows in the tableau (lines 27 to 34).
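The two phases described above can be exercised on a toy tableau. The Python sketch below is a simplified, self-contained model of one pivoting step, using the row layout of Section A.1.1 ([basic variable, value, coefficients...]); it is an illustration of the update rule, not the implementation itself, and the `var_to_col` mapping is a hypothetical helper.

```python
def pivot_step(tab, var_in, col_in, var_to_col):
    """One complementary pivoting step on a toy tableau. Each row
    [basic_var, value, c, c, ...] encodes the equation
    basic_var = value + sum of c * (corresponding nonbasic variable).
    `col_in` is the entering variable's column; `var_to_col` maps variable
    indices to columns. Returns the variable leaving the basis."""
    # Phase 1: minimum ratio test, restricted to rows where the entering
    # variable has a negative coefficient.
    rows = [i for i in range(len(tab)) if tab[i][col_in] < 0]
    row = min(rows, key=lambda i: -tab[i][1] / tab[i][col_in])
    var_out = tab[row][0]
    col_out = var_to_col[var_out]

    # Phase 2a: rewrite the chosen row so it expresses the entering variable.
    coef = tab[row][col_in]
    tab[row][0] = var_in
    tab[row][col_in] = 0.0       # the entering variable is basic here now
    tab[row][col_out] = -1.0     # the leaving variable moves to the right-hand side
    for j in range(1, len(tab[row])):
        tab[row][j] /= -coef

    # Phase 2b: substitute that expression into every other row.
    for i in range(len(tab)):
        c = tab[i][col_in]
        if i != row and c != 0:
            for j in range(1, len(tab[i])):
                tab[i][j] += c * tab[row][j]
            tab[i][col_in] = 0.0
    return var_out

# toy system: r1 = 1 - 2*y1 and r2 = 1 - y1; columns: value, r1, r2, y1;
# variable indices: -1, -2 for the slacks, 1 for y1 (positive = actual).
tab = [[-1, 1.0, 0.0, 0.0, -2.0],
       [-2, 1.0, 0.0, 0.0, -1.0]]
leaving = pivot_step(tab, var_in=1, col_in=4, var_to_col={-1: 2, -2: 3, 1: 4})
print(leaving)   # r1 (index -1) wins the ratio test and leaves the basis
print(tab[0])    # now encodes y1 = 0.5 - 0.5*r1
```

After the step, the second row reads r2 = 0.5 + 0.5*r1, which is what substituting y1 = 0.5 - 0.5*r1 into r2 = 1 - y1 gives by hand.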
The complementary pivoting rule forces the choice of the next pivoting variable: it will be the complement of the variable which just exited the basis (line 36). Therefore the algorithm follows a path of quasi-equilibria with a duplicate label, i.e., a label for which both the variable and its slack are in the basis, and a missing label. The process stops when the variable going out of the basis is either our initial pivot or its slack (lines 37-38).

A.2 Performance

A.2.1 The computational environment

We executed our program on a SUN X4200 workstation, with two AMD Opteron 2.6 GHz dual-core processors, 4 GB of DDR PC3200 RAM, and 1 MB of L2 cache per core. This workstation runs Debian GNU/Linux Etch amd64 on a Xen 3.0.2 virtual machine. The program was compiled with gcc 4.0.3 with '-O2 -mtune=x86-64' as compilation options.

A.2.2 Some issues with the Gambit implementation

For the sake of comparing our implementation of LH with the one provided within Gambit, we had to make some minor modifications to the Gambit code. In our experiments, we sometimes execute a single LH run pivoting on a given variable. On the other hand, Gambit's tool gambit-lcp only allows the user to enumerate all equilibria reachable by LH starting from the artificial equilibrium. Therefore, we modified the Gambit code and added the option above. Moreover, we made other minor changes in order to gather more detailed information on the execution of the algorithm, i.e., the variables entering and leaving the basis and the number of pivoting steps performed by the algorithm itself.

A.2.3 Performance of our heuristics

Figure 12 shows the performance of our heuristics on uniformly random games. One can see that, on a standard PC, one can compute equilibria of games with size up to 1200 × 1200 in under one minute.
B Other classes of games

B.1 Covariance Games

Following [8], we analyzed the performance of LH on the random game model of Rinott and Scarsini [11], in which the two players' payoffs are drawn from a multivariate normal distribution with a covariance parameter ρ varying in the range [-1, 1]. A covariance of 1 means that the two players share the same payoffs, while a covariance of -1 means that the game is zero-sum and the payoffs have minimal correlation. For a game of size 20, the behavior of LH changes as shown in Figure 13.

Game Size    Running Time
200          0.1699s
300          0.6664s
400          1.6184s
500          3.1029s
600          5.1076s
700          8.7794s
800          10.4485s
900          16.7848s
1000         27.7215s
1100         32.3306s
1200         57.6926s

Figure 12: Average running time of our heuristics with respect to the game size.

Figure 13: Average number of steps performed by LH on a 20 × 20 covariant game with respect to the covariance.

When ρ > 0, the number of steps of LH decreases as ρ increases. This might be explained by the theoretical analysis in [11], which shows that for a positive value of ρ, the probability of finding a pure strategy equilibrium increases as a monotonic function of ρ. When ρ < 0, the mean, mode and median of the distribution of the number of steps increase dramatically as ρ decreases, reaching a maximum at ρ = -0.7. Similar results hold for games of different sizes.

It is interesting to look at the distribution of the support size of the equilibria reached by LH on 20 × 20 and 100 × 100 games for ρ = -0.7 (Figure 14). The graph resembles a normal distribution centered at a value which is slightly less than half the size of the game. Comparing Figure 14 with Figure 4, we see that the equilibria found by LH on covariant games tend to have a greater average support size. The resulting distribution of the number of steps turns out to be shifted to the right with respect to what happens for uniformly random games (see Figure 15).
Therefore, it is unlikely that any heuristics of the kind discussed in this paper could do better than LH.

Figure 14: Support size distribution for covariant games with 20 strategies per player (on the left), and 100 strategies per player (on the right), for ρ = -0.7.

Figure 15: Distribution of the number of steps for covariant games with 20 strategies per player (on the left), and 100 strategies per player (on the right), for ρ = -0.7.
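The paper does not spell out how the covariance games were generated; one plausible construction, sketched here in Python using only the standard library, draws the pair of payoffs in each cell as standard normals with correlation ρ, so that ρ = 1 makes the players share payoffs and ρ = -1 makes the game zero-sum, matching the description above. This is an illustrative sketch, not the generator used in the paper's experiments.

```python
import math
import random

def covariance_game(n, rho, rng=random):
    """Sample an n x n bimatrix game in the spirit of the
    Rinott-Scarsini model: the two payoffs in each cell are standard
    normal variables with correlation rho in [-1, 1]."""
    a = [[0.0] * n for _ in range(n)]
    b = [[0.0] * n for _ in range(n)]
    c = math.sqrt(1.0 - rho * rho)
    for i in range(n):
        for j in range(n):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            a[i][j] = z1
            # A sum of independent normals, correlated with z1 at level rho.
            b[i][j] = rho * z1 + c * z2
    return a, b
```

At ρ = -1 the construction returns b = -a exactly, i.e. a zero-sum game, while at ρ = -0.7 the sample correlation of the two payoff matrices concentrates around -0.7 as the game grows.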