Abstract
We present a model that explains how players learn strategies in repeated normal-form games. The model is built on a directed weighted graph that we define and call the game’s dynamic graph. A learning algorithm uses this graph as a framework to predict which actions the players will choose during the game and how they act given their accumulated experience and behavioral characteristics. We evaluate the model by applying it to several human-subject datasets and measuring the rate of correctly predicted actions. The results show that our model achieves a higher average hit rate than the baseline models it is compared with. We also measure the model’s descriptive power (its ability to describe human behavior in self-play mode) and show that, in contrast to the other behavioral models, it can describe the alternation strategy in the Battle of the Sexes game and the cooperating strategy in the Prisoner’s Dilemma game.
Table 6 reports the estimated value of each parameter of the baseline models for every game in the datasets of Section 4.1. Following the suggestions of the models’ designers, the search interval was [0,1] for the φ and δ parameters in EWA, Weighted Fictitious Play and Reinforcement Learning, and [0,100] for the λ parameter in EWA and QRE. Table 7 reports the in-sample hit rates (in percent) of the proposed model and the other models, each using its estimated parameters, on each experimental game. The best fit for each game among all models is printed in bold. The last column shows each model’s average accuracy across all games. Among all the studied models, the DGB model achieves the best average accuracy in the training phase as well as in the test phase.
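The estimation step described above can be illustrated with a minimal sketch: scan a parameter over its search interval and keep the value with the highest in-sample hit rate. The text does not state the grid resolution or the exact fitting procedure, so the step of 0.1, the toy reinforcement learner `rl_predictions`, the helper names and the play history below are all assumptions made for illustration; none of them is the paper's model or data.

```python
from typing import Callable, List, Sequence, Tuple


def hit_rate(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Percentage of rounds in which the predicted action matches the observed one."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * hits / len(observed)


def grid(lo: float, hi: float, steps: int) -> List[float]:
    """Evenly spaced candidate values on [lo, hi], endpoints included."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]


def rl_predictions(phi: float, actions: Sequence[int],
                   payoffs: Sequence[float], n_actions: int) -> List[int]:
    """Toy stand-in learner (assumption, not the paper's model).

    Before each round, predict the action with the highest propensity.
    Afterwards, decay all propensities by phi and reinforce the action
    that was actually played with the payoff it earned.
    """
    propensities = [0.0] * n_actions
    predictions = []
    for action, payoff in zip(actions, payoffs):
        predictions.append(max(range(n_actions), key=lambda k: propensities[k]))
        propensities = [phi * q for q in propensities]
        propensities[action] += payoff
    return predictions


def fit_parameter(predict: Callable[[float], Sequence[int]],
                  observed: Sequence[int],
                  candidates: Sequence[float]) -> Tuple[float, float]:
    """Return the first candidate with the highest in-sample hit rate, and that rate."""
    best_value, best_score = candidates[0], -1.0
    for value in candidates:
        score = hit_rate(predict(value), observed)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


# Made-up play history for one player (not data from the paper).
actions = [0, 0, 1, 1, 1, 0, 1, 1]
payoffs = [2.0, 2.0, 3.0, 3.0, 3.0, 1.0, 3.0, 3.0]

phi_candidates = grid(0.0, 1.0, 10)  # assumed step of 0.1 on [0, 1]
best_phi, best_hit_rate = fit_parameter(
    lambda phi: rl_predictions(phi, actions, payoffs, n_actions=2),
    actions,
    phi_candidates,
)
print(f"best phi = {best_phi:.1f}, in-sample hit rate = {best_hit_rate:.1f}%")
```

The same loop would apply to a parameter searched over [0,100], such as λ, by changing the arguments of `grid`. Models with several parameters would need a nested or joint search, which this sketch leaves out.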
Table 6: Fitted values of the parameters of the other learning models for each dataset used in the experiments.
Parameter | GRP1 | GRP2 | GRP3 | GRP4 | Patent | DSG3 | nDSG3 | DSG4 | nDSG4 | CPR1 | CPR2 | PD | Cont |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
φ (WFP) | 1 | 0.9 | 1 | 1 | 0 | 0 | 0 | 0.9 | 0 | 0 | 0.4 | 0 | 0 |
φ (RL) | 0.8 | 0.9 | 1 | 1 | 1 | 0 | 0.5 | 0.4 | 0.3 | 0.5 | 1 | 0.3 | 0 |
φ (EWA) | 0.9 | 1 | 1 | 1 | 1 | 0.9 | 0.5 | 0.6 | 0.7 | 0 | 0 | 0 | 0.5 |
λ (EWA) | 4 | 35 | 25 | 15 | 15 | 4 | 45 | 2 | 35 | 2 | 2 | 15 | 2 |
δ (EWA) | 0.2 | 0.5 | 1 | 0.2 | 0 | 0 | 0 | 0.2 | 0 | 1 | 0.5 | 0.2 | 0.8 |
λ (QRE) | 2 | 65 | 2 | 55 | 40 | 2 | 1 | 10 | 80 | 65 | 4 | 50 | 75 |
μ (RM) | 510 | 750 | 200 | 800 | 240 | 9010 | 500 | 1110 | 1450 | 100 | 1120 | 1000 | 2040 |
Table 7: In-sample accuracy (hit rate, in percent) of the proposed and other learning models on the experimental games.
Model | GRP1 | GRP2 | GRP3 | GRP4 | Patent | DSG3 | nDSG3 | DSG4 | nDSG4 | CPR1 | CPR2 | PD | Cont | Average |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Weighted fictitious play | 37 | 25 | 35 | 24 | 46 | 14 | 62 | 19 | 23 | 19 | 13 | 61 | 25 | 31.0 |
Reinforcement learning | 37 | 36 | 34 | 29 | 55 | 63 | 62 | 62 | 63 | 77 | 85 | 84 | 50 | 56.7 |
EWA | 38 | 33 | 35 | 29 | 55 | 74 | 72 | 63 | 64 | 82 | 88 | 86 | 51 | 59.2 |
QRE | 25 | 21 | 27 | 22 | 52 | 68 | 22 | 40 | 35 | 70 | 62 | 70 | 9 | 40.2 |
Regret matching | 36 | 31 | 35 | 25 | 54 | 65 | 57 | 69 | 63 | 66 | 77 | 80 | 53 | 54.7 |
Cognitive hierarchy | 25 | 15 | 24 | 16 | 14 | 31 | 39 | 33 | 32 | 78 | 87 | 57 | 12 | 35.6 |
DCNN | 43 | 36 | 36 | 26 | 49 | 72 | 65 | 63 | 64 | 95 | 95 | 85 | - | 58.8 |
DGB (proposed) | 46 | 41 | 42 | 39 | 58 | 78 | 76 | 69 | 68 | 95 | 96 | 80 | 56 | 64.9 |
Average of above models | 35.9 | 29.8 | 33.5 | 26.3 | 47.9 | 58.1 | 56.9 | 52.3 | 51.5 | 72.8 | 75.4 | 75.4 | 36.6 | 50.1 |
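The per-game best fit mentioned in the text can be recomputed from the hit rates in the table. The sketch below copies the per-game values from the table; the names `GAMES`, `HIT_RATES` and `best_models_per_game` are hypothetical, and the missing DCNN entry for Cont is represented as `None` and skipped.

```python
from typing import Dict, List, Optional, Sequence

GAMES = ["GRP1", "GRP2", "GRP3", "GRP4", "Patent", "DSG3", "nDSG3",
         "DSG4", "nDSG4", "CPR1", "CPR2", "PD", "Cont"]

# In-sample hit rates (percent) per game, in the order of GAMES.
HIT_RATES: Dict[str, List[Optional[int]]] = {
    "Weighted fictitious play": [37, 25, 35, 24, 46, 14, 62, 19, 23, 19, 13, 61, 25],
    "Reinforcement learning":   [37, 36, 34, 29, 55, 63, 62, 62, 63, 77, 85, 84, 50],
    "EWA":                      [38, 33, 35, 29, 55, 74, 72, 63, 64, 82, 88, 86, 51],
    "QRE":                      [25, 21, 27, 22, 52, 68, 22, 40, 35, 70, 62, 70, 9],
    "Regret matching":          [36, 31, 35, 25, 54, 65, 57, 69, 63, 66, 77, 80, 53],
    "Cognitive hierarchy":      [25, 15, 24, 16, 14, 31, 39, 33, 32, 78, 87, 57, 12],
    "DCNN":                     [43, 36, 36, 26, 49, 72, 65, 63, 64, 95, 95, 85, None],
    "DGB (proposed)":           [46, 41, 42, 39, 58, 78, 76, 69, 68, 95, 96, 80, 56],
}


def best_models_per_game(
    games: Sequence[str],
    hit_rates: Dict[str, List[Optional[int]]],
) -> Dict[str, List[str]]:
    """Map each game to the model(s) with the highest hit rate, keeping ties."""
    best: Dict[str, List[str]] = {}
    for column, game in enumerate(games):
        # Ignore models that have no reported value for this game.
        scores = {model: row[column]
                  for model, row in hit_rates.items()
                  if row[column] is not None}
        top = max(scores.values())
        best[game] = sorted(model for model, score in scores.items() if score == top)
    return best


best = best_models_per_game(GAMES, HIT_RATES)
for game in GAMES:
    print(f"{game}: {', '.join(best[game])}")
```

With these values, DGB is the best or tied-best model in every game except PD, where EWA has the highest hit rate; the ties are with Regret matching on DSG4 and with DCNN on CPR1.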
© 2022 Walter de Gruyter GmbH, Berlin/Boston