A Dynamic Graph Model of Strategy Learning for Predicting Human Behavior in Repeated Games

Afrooz Vazifedan and Mohammad Izadi
Published/Copyright: June 6, 2022

Abstract

We present a model that explains how players learn strategies in repeated normal-form games. The model is built on a directed weighted graph, which we define and call the game’s dynamic graph. A learning algorithm uses this graph as a framework to predict which actions the players will choose during the game and how they act on the basis of their accumulated experience and behavioral characteristics. We evaluate the model by applying it to several human-subject datasets and measuring the rate of correctly predicted actions. The results show that our model obtains a better average hit rate than the other models considered. We also measure the model’s descriptive power (its ability to describe human behavior in self-play mode) and show that, in contrast to the other behavioral models, it can describe the alternation strategy in the Battle of the Sexes game and the cooperating strategy in the Prisoner’s Dilemma game.


Corresponding author: Afrooz Vazifedan, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran

Appendix

Table 6 lists the estimated values of the parameters of the baseline models for every game in the datasets of Section 4.1. The search interval was [0,1] for the parameters φ and δ in EWA, Weighted Fictitious Play and Reinforcement Learning, and [0,100] for the parameter λ in EWA and QRE, as suggested by the models’ designers. Table 7 reports the in-sample hit rate percentages of the proposed model and the other models, each using its estimated parameters, on each experimental game. The best fit for each game among all models is printed in bold. The last column shows the average accuracy of each model across all games. The DGB model achieves the best average accuracy among all the studied models in the training phase as well as in the test phase.
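As an illustration of how a parameter such as φ can be fitted over its search interval, the following Python sketch scans a grid on [0,1] and keeps the value with the highest in-sample hit rate. It is a minimal sketch and not the authors' estimation procedure: the grid step of 0.1, the particular weighted fictitious play update, the tie-breaking rule, the payoff matrix and the toy action sequences are all assumptions made for the example, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def hit_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Percentage of rounds in which the predicted action matches the observed one."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("expected two non-empty sequences of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


def wfp_predictions(payoff: Sequence[Sequence[float]],
                    opponent_actions: Sequence[int],
                    phi: float) -> List[int]:
    """Predict one action per round with weighted fictitious play (assumed form).

    payoff[i][j] is the player's payoff for own action i against opponent
    action j. Past opponent actions are counted with weights decayed by phi,
    and the predicted action is a best response to those weights.
    """
    n_own, n_opp = len(payoff), len(payoff[0])
    weights = [0.0] * n_opp
    predictions = []
    for observed in opponent_actions:
        expected = [sum(w * payoff[i][j] for j, w in enumerate(weights))
                    for i in range(n_own)]
        # Assumption: ties are broken in favour of the lowest action index.
        predictions.append(max(range(n_own), key=lambda i: expected[i]))
        # Decay old observations, then add the action just observed.
        weights = [phi * w for w in weights]
        weights[observed] += 1.0
    return predictions


def fit_phi(payoff: Sequence[Sequence[float]],
            own_actions: Sequence[int],
            opponent_actions: Sequence[int],
            grid: Sequence[float]) -> Tuple[float, float]:
    """Return the grid value of phi with the highest in-sample hit rate."""
    scored = [(hit_rate(wfp_predictions(payoff, opponent_actions, phi), own_actions), phi)
              for phi in grid]
    best_rate, best_phi = max(scored, key=lambda pair: pair[0])
    return best_phi, best_rate


# Toy data (hypothetical): a 2x2 coordination game and six observed rounds.
PAYOFF = [[3, 0], [0, 2]]
OWN = [0, 0, 1, 1, 0, 0]
OPP = [0, 1, 1, 0, 0, 0]
GRID = [k / 10 for k in range(11)]  # 0.0, 0.1, ..., 1.0

best_phi, best_rate = fit_phi(PAYOFF, OWN, OPP, GRID)
print(f"best phi = {best_phi}, in-sample hit rate = {best_rate:.1f}%")
```

When several grid values reach the same hit rate, this sketch keeps the first one it encounters. A real fit would need its own rule for such ties and would score many subjects and rounds rather than one short sequence.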

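Table 6: The fitted values of the parameters of the other learning models for every dataset used in the experiments.

| Parameter | GRP1 | GRP2 | GRP3 | GRP4 | Patent | DSG3 | nDSG3 | DSG4 | nDSG4 | CPR1 | CPR2 | PD | Cont |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| φ (WFP) | 1 | 0.9 | 1 | 1 | 0 | 0 | 0 | 0.9 | 0 | 0 | 0.4 | 0 | 0 |
| φ (RL) | 0.8 | 0.9 | 1 | 1 | 1 | 0 | 0.5 | 0.4 | 0.3 | 0.5 | 1 | 0.3 | 0 |
| φ (EWA) | 0.9 | 1 | 1 | 1 | 1 | 0.9 | 0.5 | 0.6 | 0.7 | 0 | 0 | 0 | 0.5 |
| λ (EWA) | 4 | 35 | 25 | 15 | 15 | 4 | 45 | 2 | 35 | 2 | 2 | 15 | 2 |
| δ (EWA) | 0.2 | 0.5 | 1 | 0.2 | 0 | 0 | 0 | 0.2 | 0 | 1 | 0.5 | 0.2 | 0.8 |
| λ (QRE) | 2 | 65 | 2 | 55 | 40 | 2 | 1 | 10 | 80 | 65 | 4 | 50 | 75 |
| μ (RM) | 510 | 750 | 200 | 800 | 240 | 9010 | 500 | 1110 | 1450 | 100 | 1120 | 1000 | 2040 |
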
Table 7: In-sample accuracy (hit rate percentage) of the proposed and other learning models on the experimental games.

| Model | GRP1 | GRP2 | GRP3 | GRP4 | Patent | DSG3 | nDSG3 | DSG4 | nDSG4 | CPR1 | CPR2 | PD | Cont | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weighted fictitious play | 37 | 25 | 35 | 24 | 46 | 14 | 62 | 19 | 23 | 19 | 13 | 61 | 25 | 31.0 |
| Reinforcement learning | 37 | 36 | 34 | 29 | 55 | 63 | 62 | 62 | 63 | 77 | 85 | 84 | 50 | 56.7 |
| EWA | 38 | 33 | 35 | 29 | 55 | 74 | 72 | 63 | 64 | 82 | 88 | 86 | 51 | 59.2 |
| QRE | 25 | 21 | 27 | 22 | 52 | 68 | 22 | 40 | 35 | 70 | 62 | 70 | 9 | 40.2 |
| Regret matching | 36 | 31 | 35 | 25 | 54 | 65 | 57 | 69 | 63 | 66 | 77 | 80 | 53 | 47.5 |
| Cognitive hierarchy | 25 | 15 | 24 | 16 | 14 | 31 | 39 | 33 | 32 | 78 | 87 | 57 | 12 | 35.6 |
| DCNN | 43 | 36 | 36 | 26 | 49 | 72 | 65 | 63 | 64 | 95 | 95 | 85 | - | 58.8 |
| DGB (proposed) | 46 | 41 | 42 | 39 | 58 | 78 | 76 | 69 | 68 | 95 | 96 | 80 | 56 | 64.9 |
| Average of above models | 35.9 | 29.8 | 33.5 | 26.3 | 47.9 | 58.1 | 56.9 | 52.3 | 51.5 | 72.8 | 75.4 | 75.4 | 36.6 | 50.1 |
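
The "Average of above models" row and the best fit for each game can be recomputed from the entries of Table 7. The Python sketch below does this under one assumption that the table does not state explicitly: a missing entry (the "-" for DCNN on Cont) is left out of the column average rather than counted as zero, which reproduces the reported 36.6 for Cont. The helper names are hypothetical.

```python
from typing import Dict, List, Optional

GAMES = ["GRP1", "GRP2", "GRP3", "GRP4", "Patent", "DSG3", "nDSG3",
         "DSG4", "nDSG4", "CPR1", "CPR2", "PD", "Cont"]

# In-sample hit rates (%) copied from Table 7; None marks the missing DCNN entry.
HIT_RATES: Dict[str, List[Optional[int]]] = {
    "Weighted fictitious play": [37, 25, 35, 24, 46, 14, 62, 19, 23, 19, 13, 61, 25],
    "Reinforcement learning":   [37, 36, 34, 29, 55, 63, 62, 62, 63, 77, 85, 84, 50],
    "EWA":                      [38, 33, 35, 29, 55, 74, 72, 63, 64, 82, 88, 86, 51],
    "QRE":                      [25, 21, 27, 22, 52, 68, 22, 40, 35, 70, 62, 70, 9],
    "Regret matching":          [36, 31, 35, 25, 54, 65, 57, 69, 63, 66, 77, 80, 53],
    "Cognitive hierarchy":      [25, 15, 24, 16, 14, 31, 39, 33, 32, 78, 87, 57, 12],
    "DCNN":                     [43, 36, 36, 26, 49, 72, 65, 63, 64, 95, 95, 85, None],
    "DGB (proposed)":           [46, 41, 42, 39, 58, 78, 76, 69, 68, 95, 96, 80, 56],
}


def column_average(game: str) -> float:
    """Mean hit rate on one game over the models that report a value for it."""
    idx = GAMES.index(game)
    values = [row[idx] for row in HIT_RATES.values() if row[idx] is not None]
    return sum(values) / len(values)


def best_models(game: str) -> List[str]:
    """Models with the highest hit rate on one game (several names if tied)."""
    idx = GAMES.index(game)
    scored = {model: row[idx] for model, row in HIT_RATES.items()
              if row[idx] is not None}
    top = max(scored.values())
    return [model for model, value in scored.items() if value == top]


for game in GAMES:
    print(game, round(column_average(game), 1), best_models(game))
```

Python's built-in round resolves exact ties to the even digit, so a mean such as 26.25 prints as 26.2 where the table shows 26.3. The difference is only in the rounding convention, not in the underlying mean.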

Received: 2021-01-23
Revised: 2021-12-10
Accepted: 2022-03-26
Published Online: 2022-06-06

© 2022 Walter de Gruyter GmbH, Berlin/Boston
