
A Dynamic Graph Model of Strategy Learning for Predicting Human Behavior in Repeated Games

  • Afrooz Vazifedan and Mohammad Izadi
Published/Copyright: June 6, 2022

Abstract

We present a model that explains how players learn strategies in repeated normal-form games. The model is built on a directed weighted graph, which we define and call the game’s dynamic graph. A learning algorithm uses this graph as a framework to predict which actions the players will choose during the game and how they act given their accumulated experience and behavioral characteristics. We evaluate the model by applying it to several human-subject datasets and measuring the rate of correctly predicted actions. The results show that our model achieves a better average hit rate than the other models considered. We also measure the model’s descriptive power (its ability to describe human behavior in self-play mode) and show that our model, in contrast to the other behavioral models, can describe the alternation strategy in the Battle of the Sexes game and the cooperation strategy in the Prisoner’s Dilemma game.


Corresponding author: Afrooz Vazifedan, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran, E-mail:

Appendix

Table 6 reports the estimated value of each parameter of the baseline models for every game in the datasets of Section 4.1. The search interval for the φ and δ parameters in EWA, Weighted Fictitious Play and Reinforcement Learning was [0,1], and the search interval for the λ parameter in EWA and QRE was [0,100], as suggested by the models’ designers. Table 7 reports the in-sample hit rate percentages of the proposed model and the other models, using their estimated parameters, on each experimental game. The best fit for each game among all models is printed in bold. The last column shows the average accuracy of each model across all games. On average, the DGB model achieved the highest accuracy among all the studied models in the training phase as well as in the test phase.

Table 6: The fitted values for each of the parameters of the other learning models in every dataset used in the experiments.

           Games
Parameter  GRP1   GRP2   GRP3   GRP4   Patent DSG3   nDSG3  DSG4   nDSG4  CPR1   CPR2   PD     Cont
φ (WFP)    1      0.9    1      1      0      0      0      0.9    0      0      0.4    0      0
φ (RL)     0.8    0.9    1      1      1      0      0.5    0.4    0.3    0.5    1      0.3    0
φ (EWA)    0.9    1      1      1      1      0.9    0.5    0.6    0.7    0      0      0      0.5
λ (EWA)    4      35     25     15     15     4      45     2      35     2      2      15     2
δ (EWA)    0.2    0.5    1      0.2    0      0      0      0.2    0      1      0.5    0.2    0.8
λ (QRE)    2      65     2      55     40     2      1      10     80     65     4      50     75
μ (RM)     510    750    200    800    240    9010   500    1110   1450   100    1120   1000   2040

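
The appendix states only the search intervals, not the search procedure. As an illustration, the following is a minimal sketch of fitting a single decay parameter φ over [0,1] by maximizing the in-sample hit rate. It assumes a grid step of 0.1 (the fitted φ values in Table 6 are all multiples of 0.1, but the step is not stated) and uses a simplified cumulative-reinforcement rule as a stand-in for the baseline models. The names `predict_actions`, `hit_rate` and `fit_phi`, and the toy history, are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def predict_actions(history: List[Tuple[int, float]], n_actions: int, phi: float) -> List[int]:
    """Predict each round's action as the one with the highest attraction so far.

    history holds (chosen action, received payoff) pairs for one player.
    This update rule is an assumed simplification, not the paper's model.
    """
    attractions = [0.0] * n_actions
    predictions = []
    for action, payoff in history:
        # Predict before observing the round; ties go to the lowest index.
        predictions.append(max(range(n_actions), key=lambda a: attractions[a]))
        # Decay every attraction by phi, then reinforce the chosen action.
        attractions = [phi * value for value in attractions]
        attractions[action] += payoff
    return predictions


def hit_rate(predicted: List[int], actual: List[int]) -> float:
    """Percentage of rounds in which the predicted action equals the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


def fit_phi(history: List[Tuple[int, float]], n_actions: int, step: float = 0.1) -> Tuple[float, float]:
    """Grid-search phi over [0, 1]; return (best phi, its in-sample hit rate)."""
    grid = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    actual = [action for action, _ in history]
    scored = [(hit_rate(predict_actions(history, n_actions, phi), actual), phi) for phi in grid]
    # max() keeps the first maximum, so ties resolve to the smallest phi.
    best_rate, best_phi = max(scored, key=lambda item: item[0])
    return best_phi, best_rate


if __name__ == "__main__":
    # Toy data for illustration only: (action, payoff) per round.
    toy_history = [(0, 1.0), (0, 1.0), (1, 3.0), (1, 3.0), (0, 1.0), (1, 3.0)]
    phi, rate = fit_phi(toy_history, n_actions=2)
    print(f"best phi = {phi}, in-sample hit rate = {rate:.1f}%")
```

The same loop would extend to the [0,100] interval for λ by swapping the grid; which step the authors used there is likewise not stated.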
Table 7: In-sample accuracy (hit rate percentage) of the proposed and other learning models on the experimental games.

                          Games
Model                     GRP1   GRP2   GRP3   GRP4   Patent DSG3   nDSG3  DSG4   nDSG4  CPR1   CPR2   PD     Cont   Average
Weighted fictitious play  37     25     35     24     46     14     62     19     23     19     13     61     25     31.0
Reinforcement learning    37     36     34     29     55     63     62     62     63     77     85     84     50     56.7
EWA                       38     33     35     29     55     74     72     63     64     82     88     86     51     59.2
QRE                       25     21     27     22     52     68     22     40     35     70     62     70     9      40.2
Regret matching           36     31     35     25     54     65     57     69     63     66     77     80     53     47.5
Cognitive hierarchy       25     15     24     16     14     31     39     33     32     78     87     57     12     35.6
DCNN                      43     36     36     26     49     72     65     63     64     95     95     85     -      58.8
DGB (proposed)            46     41     42     39     58     78     76     69     68     95     96     80     56     64.9
Average of above models   35.9   29.8   33.5   26.3   47.9   58.1   56.9   52.3   51.5   72.8   75.4   75.4   36.6   50.1
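
Because bold formatting does not survive in plain text, the best fit per game can be recovered from the values in Table 7. The following is a minimal sketch that transcribes the hit rates as listed above and reports, for each game, the top score and every model that reaches it. The names `GAMES`, `HIT_RATES` and `best_models_per_game` are illustrative, and the missing DCNN entry for Cont is represented as `None`.

```python
from typing import Dict, List, Optional, Tuple

GAMES = ["GRP1", "GRP2", "GRP3", "GRP4", "Patent", "DSG3", "nDSG3",
         "DSG4", "nDSG4", "CPR1", "CPR2", "PD", "Cont"]

# In-sample hit rates (%) transcribed from Table 7; None marks a missing entry.
HIT_RATES: Dict[str, List[Optional[int]]] = {
    "Weighted fictitious play": [37, 25, 35, 24, 46, 14, 62, 19, 23, 19, 13, 61, 25],
    "Reinforcement learning":   [37, 36, 34, 29, 55, 63, 62, 62, 63, 77, 85, 84, 50],
    "EWA":                      [38, 33, 35, 29, 55, 74, 72, 63, 64, 82, 88, 86, 51],
    "QRE":                      [25, 21, 27, 22, 52, 68, 22, 40, 35, 70, 62, 70, 9],
    "Regret matching":          [36, 31, 35, 25, 54, 65, 57, 69, 63, 66, 77, 80, 53],
    "Cognitive hierarchy":      [25, 15, 24, 16, 14, 31, 39, 33, 32, 78, 87, 57, 12],
    "DCNN":                     [43, 36, 36, 26, 49, 72, 65, 63, 64, 95, 95, 85, None],
    "DGB (proposed)":           [46, 41, 42, 39, 58, 78, 76, 69, 68, 95, 96, 80, 56],
}


def best_models_per_game(
    games: List[str], hit_rates: Dict[str, List[Optional[int]]]
) -> Dict[str, Tuple[int, List[str]]]:
    """Map each game to (top hit rate, alphabetically sorted models achieving it)."""
    best = {}
    for column, game in enumerate(games):
        # Skip models that have no reported value for this game.
        scores = {model: rates[column]
                  for model, rates in hit_rates.items()
                  if rates[column] is not None}
        top = max(scores.values())
        best[game] = (top, sorted(m for m, s in scores.items() if s == top))
    return best


if __name__ == "__main__":
    for game, (score, models) in best_models_per_game(GAMES, HIT_RATES).items():
        print(f"{game}: {score} ({', '.join(models)})")
```

Ties are reported rather than broken, since the table alone does not indicate how the authors treated equal hit rates.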


Received: 2021-01-23
Revised: 2021-12-10
Accepted: 2022-03-26
Published Online: 2022-06-06

© 2022 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 11.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/bejte-2021-0015/html