📝 Original Info
- Title: Learning and innovative elements of strategy adoption rules expand cooperative network topologies
- ArXiv ID: 0708.2707
- Date: 2007-08-20
- Authors: Shijun Wang, Mate S. Szalay, Changshui Zhang, Peter Csermely
📝 Abstract
Cooperation plays a key role in the evolution of complex systems. However, the level of cooperation varies extensively with the topology of agent networks in the widely used models of repeated games. Here we show that cooperation remains rather stable when the reinforcement-learning strategy adoption rule, Q-learning, is applied on a variety of random, regular, small-world, scale-free and modular network models in repeated, multi-agent Prisoner's Dilemma and Hawk-Dove games. Furthermore, we found that in the above model systems other long-term learning strategy adoption rules also promote cooperation, while introducing a low level of noise (as a model of innovation) to the strategy adoption rules makes the level of cooperation less dependent on the actual network topology. Our results demonstrate that long-term learning and random elements in the strategy adoption rules, when acting together, extend the range of network topologies enabling the development of cooperation at a wider range of costs and temptations. These results suggest that a balanced duo of learning and innovation may help to preserve cooperation during the re-organization of real-world networks, and may play a prominent role in the evolution of self-organizing, complex systems.
📄 Full Content
Cooperation is necessary for the emergence of complex, hierarchical systems [1-5]. Why is cooperation maintained when there is a conflict between self-interest and the common good? One set of answers emphasized agent similarity, in terms of kin- or group-selection and compact network communities, a similarity that is aided by the learning of successful strategies [2,3]. On the other hand, agent diversity, in terms of noise, variation of behavior and innovation, as well as the changing environment of the agent community, all promoted cooperation in different games and settings [3,6-8].
Small-world, scale-free or modular network models, which all give a chance to develop the complexity of similar, yet diverse agent neighborhoods, provide a good starting point for modeling the complexity of cooperative behavior in real-world networks [9-13]. However, the actual level of cooperation in various games, such as the Prisoner's Dilemma or Hawk-Dove games, is very sensitive to the topology of the agent network model [14-16; Electronic Supplementary Material 1 (ESM1), Table S1.1]. In our work we applied a set of widely used network models and examined the stability of cooperation after repeated games using the reinforcement-learning strategy adoption rule, Q-learning. To examine the surprising stability of cooperation observed when using Q-learning, we approximated the complex rules of Q-learning by designing long-term versions of the best-takes-over and other strategy adoption rules, as well as by introducing a low level of randomness to these rules. We found that neither of these features alone results in a similar stability of cooperation in various network models. However, when applied together, long-term ('learning') and random ('innovative') elements of strategy adoption rules can make cooperation relatively stable under various conditions in a large number of network models. Our results have wide applications in complex systems of biology, from the cellular level to social networks and ecosystems.
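The Q-learning adoption rule described above can be sketched as a stateless Q-learning agent choosing between cooperation and defection. This is a minimal illustration, not the paper's implementation: the parameter values (`alpha`, `gamma`, `epsilon`) and the canonical payoff values are assumptions chosen for clarity.

```python
import random

# Minimal stateless Q-learning agent for a two-action game
# (C = cooperate, D = defect). Parameter values are illustrative
# assumptions, not taken from the paper.
ACTIONS = ("C", "D")

class QAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.05, seed=None):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor (long-term experience)
        self.epsilon = epsilon  # exploration ('innovation') rate
        self.q = {a: 0.0 for a in ACTIONS}
        self.rng = random.Random(seed)

    def choose(self):
        # epsilon-greedy: mostly exploit learned values, sometimes explore
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # one-step Q-update; with a single state the bootstrap term
        # is the value of the current best action
        best_next = max(self.q.values())
        self.q[action] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[action])

# Canonical PD payoffs for the row player (T > R > P > S, assumed values)
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play_rounds(a, b, rounds=200):
    # repeated two-player PD between two learning agents
    for _ in range(rounds):
        xa, xb = a.choose(), b.choose()
        a.update(xa, PAYOFF[(xa, xb)])
        b.update(xb, PAYOFF[(xb, xa)])
```

The key property for the paper's argument is that the Q-values accumulate payoff experience across all previous rounds, so a single bad round does not immediately flip an agent's strategy.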
As an illustrative example of the sensitivity of cooperation to network topology, we show the cooperating agents after the last round of a 'repeated canonical Prisoner's Dilemma game' (PD-game) on two almost identical versions of a modified Watts-Strogatz-type small-world model network [13,17]. Comparison of the top panels of Figure 1 shows that a minor change of network topology (replacement of 37 links out of 900 links total) completely changed both the level and the topology of cooperating agents playing with the short-term best-takes-over strategy adoption rule. We observed a similar topological sensitivity of cooperation in all combinations of (a) other short-term strategy adoption rules; (b) a large number of other network topologies; and (c) other games, such as the extended Prisoner's Dilemma or Hawk-Dove games (ESM1, Figures S1.1 and S1.6).
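The setting above (best-takes-over on a small-world ring) can be sketched as follows. This is a schematic reconstruction under assumptions: network size, rewiring probability, and the 'weak' PD payoffs (R=1, P=S=0, temptation T) common in spatial games may all differ from the paper's exact setup.

```python
import random

def small_world(n=60, k=4, p=0.1, rng=None):
    # Watts-Strogatz-style ring: each node linked to k nearest
    # neighbours, then each edge rewired with probability p
    rng = rng or random.Random(0)
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            nbrs[i].add((i + j) % n)
            nbrs[(i + j) % n].add(i)
    for i in range(n):
        for j in list(nbrs[i]):
            if j > i and rng.random() < p:
                new = rng.randrange(n)
                if new != i and new not in nbrs[i]:
                    nbrs[i].discard(j)
                    nbrs[j].discard(i)
                    nbrs[i].add(new)
                    nbrs[new].add(i)
    return nbrs

def step(nbrs, strat, T=1.5):
    # one round: each agent plays a weak PD (R=1, P=S=0, assumed)
    # against every neighbour and collects the total payoff
    pay = {i: sum((1.0 if strat[i] == "C" else T) if strat[j] == "C"
                  else 0.0 for j in nbrs[i]) for i in nbrs}
    # best-takes-over: adopt the strategy of the highest-scoring
    # agent in the neighbourhood, including oneself (a short-term
    # rule -- only the current round's payoff matters)
    return {i: strat[max(list(nbrs[i]) + [i], key=lambda x: pay[x])]
            for i in nbrs}
```

Because each update looks only at the current round's payoffs, rerunning this on two networks that differ in a handful of rewired links can settle into very different cooperation patterns, which is the sensitivity Figure 1 illustrates.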
In contrast to the general sensitivity of cooperation to the topology of agent networks seen above with the short-term strategy adoption rule in PD-games, when the long-term, reinforcement-learning strategy adoption rule, Q-learning, was applied, the level and configuration of cooperating agents showed a surprising stability (cf. the bottom panels of Figure 1). Unlike the short-term strategy adoption rule shown in the top panels of Figure 1, the Q-learning strategy adoption rule (a) is based on the long-term experiences of the agents from all previous rounds, allowing some agents to choose a cooperative strategy despite its current adverse effects, and (b) is an 'innovative' strategy adoption rule [3], re-introducing cooperation even under conditions in which it has already been wiped out from the network community completely [18,19].
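The 'innovative' element (b) can be illustrated as a thin noise layer on top of any imitation rule: with a small probability an agent ignores imitation and adopts a random strategy, so cooperation can be re-seeded even after it has died out. The mutation rate `mu` below is an illustrative assumption, not a value from the paper.

```python
import random

def noisy_adopt(adopted, mu=0.01, rng=None):
    # With probability mu, replace the imitated strategy by a random
    # one ('innovation'); otherwise keep the adopted strategy.
    rng = rng or random.Random()
    return {i: (rng.choice(("C", "D")) if rng.random() < mu else s)
            for i, s in adopted.items()}

# Even from an all-defector population, noise re-introduces cooperators:
pop = {i: "D" for i in range(1000)}
pop = noisy_adopt(pop, mu=0.05, rng=random.Random(42))
```

A deterministic, non-innovative rule can never escape the all-defector state; this single random element restores that escape route, which is why noise and long-term learning together stabilize cooperation across topologies.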
Extending the observations shown in Figure 1, we compared in detail the level of cooperation in PD-games on small-world and scale-free networks at various levels of temptation (T, the defector's payoff when it meets a cooperator). The top panel of Figure 2 shows that the cooperation level of agents using the best-takes-over strategy adoption rule decreased rapidly with a gradual increase of the temptation to defect. This was generally true for both small-world and scale-free networks, leaving a negligible amount of cooperation at T-values higher than 4.5. At smaller temptation levels, however, the level of cooperation differed greatly between the two network topologies: initially the small-world network was preferred, while at temptation values higher than 3.7 agents on the scale-free network developed more cooperation. The behavior of agents using the Q-learning strategy adoption rule was remarkably different (top panel of Figure 2). Their cooperation level remained relatively stable even at extremely large temptation values. Moreover, the cooperation levels of agents using Q-learning did not differ significantly between small-world and scale-free networks. This behavior continued at temptation
…(Full text truncated)…
Reference
This content is AI-processed based on ArXiv data.