Stochastic game theory: for playing games, not just for doing theory

Jacob K. Goeree and Charles A. Holt*

Department of Economics, Rouss Hall, University of Virginia, Charlottesville, VA 22903

* Reprint requests should be sent by email to Goeree at jg2n@virginia.edu. This research was funded in part by the National Science Foundation (SBR-9617784 and SBR-9818683). We wish to thank Melayne McInnes for a helpful suggestion.

Recent theoretical advances have dramatically increased the relevance of game theory for predicting human behavior in interactive situations. By relaxing the classical assumptions of perfect rationality and perfect foresight, we obtain much improved explanations of (i) initial decisions, (ii) dynamic patterns of learning and adjustment, and (iii) equilibrium steady-state distributions.

Introduction

About fifty years ago, John Nash walked into the office of the Chair of the Princeton
Mathematics Department with a solution concept for N-person games, along with an existence proof that was soon published in these Proceedings [1]. John von Neumann was dismissive and remarked “that's trivial, you know. That's just a fixed point theorem” [2]. But word of Nash's theorem spread quickly at RAND on the West Coast, where researchers working on defense strategy were dissatisfied with the received theory of zero-sum games, since the assumption that one player's gain is another's loss is of limited relevance beyond simple card games. Two mathematicians, Dresher and Flood, designed a laboratory experiment to test Nash's equilibrium concept the same day they heard about his proof. Their experiment implemented a game in which two players have unilateral incentives to “defect” even though both are better off when both “cooperate.” Nash's thesis advisor, Tucker, later saw the payoffs for this experiment
on the blackboard in someone's office and devised the famous story of the “prisoner's dilemma,” which he used in a seminar for the Psychology Department at Stanford University [3].

The applications of game theory have expanded greatly since then, and with the Nash equilibrium as its centerpiece, game theory has finally gained the central role first envisioned by von Neumann and Morgenstern [4]. If anything, game theory is the leading contender for becoming a general theory of social science, with extensive applications in economics, political science, psychology, law, and biology. Indeed, in some areas of economics virtually all recent theoretical developments are applications of game theory.

There is, however, widespread criticism of theories based on the classical “rational choice” assumptions of perfect decision making (no errors) and perfect foresight (no surprises), especially when they are applied to describe behavior in complex interactive situations. This skepticism is reinforced by evidence from laboratory experiments with financially motivated subjects, which often produce behavior patterns that are systematically biased away from rational choice predictions. Nash himself participated in experiments as a subject and later designed experiments of his own, but he and his coauthors lost whatever confidence they had in game theory when they saw how poorly it predicted actual behavior [2]. And Reinhard Selten, who shared the 1994 Economics Nobel Prize (with Nash and Harsanyi), remarked that “game theory is for doing theory, not for playing games.” Like many others, he has argued that decisions are stochastic or “noisy,” where the noise in subjects' behavior may be due to errors in perception, calculation, or recording decisions [5]. Alternatively, apparent noise may represent fully rational responses to benevolence, envy, or other idiosyncratic factors that are not measured by the experimenter [6]. Regardless of the source and interpretation of the noise, the effect will be that different players encounter different histories of others' play, and learning in such environments may lead to variations in individuals' beliefs and decisions.

This paper describes three new developments in game theory that relax the classical assumptions of perfect rationality and perfect foresight. These approaches to noisy introspection (prior to play), learning (from previous plays), and equilibrium (
after a large number of plays) provide complementary perspectives for explaining actual behavior in a wide variety of games.

Coordination and Social Dilemma Games

The models summarized here have been strongly influenced by data from experiments that show disturbing differences between game-theoretic predictions and the behavior of human subjects who are earning money in controlled strategic situations. For example, Goeree and Holt [7] show that all the standard types of games can be implemented in a manner that yields predictions consistent with the Nash equilibrium for some parameter values, and yet in each case the observed data will shift dramatically in response to a payoff change that does not alter the Nash prediction. Similar anomalous results have been reported in many other experiments, e.g., matching pennies games, centipede games, two-stage games, market pricing games, and bargaining games [8-12]. We will present the main argument using two examples: a social dilemma game for which Nash's theory predicts a unique equilibrium that is “bad” for all concerned, and a coordination game in which any common effort level is an equilibrium, i.e., the Nash equilibrium makes no prediction at all.

The social dilemma is
based on a story in which two travelers lose luggage with identical contents, and the airline official promises to pay any claim in an acceptable range as long as the claims are equal. If not, the person making the higher claim is assumed to have lied, and both will be reimbursed at the lower claim, with a reward, R > 1, being deducted from the reimbursement to the high claimant and given to the low claimant. A Nash equilibrium in this context is a pair of claims that survives an “announcement test”: if each person writes in their claim and then announces it as they turn in their claim sheet, neither should want to reconsider.

Since the travelers file their claims separately, each will have a temptation to “undercut” any agreed-upon common claim. For example, suppose the range of acceptable claims is from 80 to 200, with a reward parameter, R, equal to 10. A common claim of 200 yields 200 for both, but a deviation by one person to 199 would profitably raise that person's payoff to 199 + 10 = 209, while the other's payoff falls to 199 - 10 = 189. The incentive to undercut the other's decision by 1 implies that the maximum claim of 200 is never an optimal choice, irrespective of the beliefs one has about the other's claim choice. Consequently, a rational person must assign zero probability to a choice of 200. But once 200 is ruled out as a possibility, 199 can be ruled out on the same grounds, and this logic can be repeated until the only beliefs rational players can have are that claims will be 80. In fact, 80 is the unique Nash equilibrium, despite the
fact that both would be better off by claiming a high amount.

The paradoxical outcome of this “traveler's dilemma” game was first derived by Basu [13]. He did not expect behavior to converge to the Nash prediction for low values of R, but as he noted, none of the standard modifications of game theory can predict this anticipated deviation. Capra, Goeree, Gomez, and Holt [14] conducted an experiment based on this game form, using randomly matched student subjects who made claim decisions independently in a sequence of ten periods. Earnings ranged from $24 to $44 and were paid in private, immediately after the experiment. With R = 50, the average claim was quite close to the Nash prediction of 80 in the final five rounds, but with R = 10, the average claim started high (at about 180) and moved away from the Nash prediction, ending up at 186 for the last five rounds. The frequency of actual decisions for the final five rounds is indicated in Figure 1 by the blue bars for R = 50 and by the red bars for R = 10. The yellow bars show the frequency of decisions for an intermediate treatment with R = 25. The task for theory is to explain these treatment differences, which contrast sharply with the Nash prediction of 80, independent of R.

Figure 1. Claims in a Traveler's Dilemma with R = 50 (blue), R = 25 (yellow), and R = 10 (red).

The second game has a similar structure, with payoffs again being determined by the minimum of the two players' decisions. In this game, the decisions are “effort levels,” and the joint production process is such that it requires both players to perform a costly task in order to raise the level of production. The payoff for each player is the minimum of the two efforts, minus the cost of the player's own effort: \(\pi_i = \min\{x_1, x_2\} - c x_i\), where \(x_i\) is player \(i\)'s effort level and \(c\) [...]

[...] 1 times the error parameter associated with the lower level. For instance, \(p = \phi_{\mu}(\phi_{t\mu}(q))\) represents a player's noisy (\(\mu\)) response to the other player's noisy (\(t\mu\)) response to beliefs \(q\). The “telescope” parameter \(t\) determines how fast the error rate blows
up with further iterations; the error rate for the nth iteration is given by \(t^{n-1}\mu\). We are interested in the choice probabilities in the limit as the number of iterations goes to infinity:

\[ p = \lim_{n \to \infty} \phi_{\mu}(\phi_{t\mu}(\cdots \phi_{t^{n-1}\mu}(q))). \qquad (3) \]

In [41] we use continuity arguments to show that this limit is well defined when \(t > 1\). Since \(\phi_{\infty}\) maps the whole probability simplex to a single point, the process is independent of the initial belief vector \(q\). Goeree and Holt [41] show that (3) provides a good explanation of (non-equilibrium) play in many types of one-shot games; see also [42-43] for alternative approaches.

The logit equilibrium arises as a limit case of this two-parameter introspective model. Recall that a logit equilibrium is a fixed point of \(\phi_{\mu}\), i.e., a vector \(p^*\) that satisfies \(p^* = \phi_{\mu}(p^*)\), and note that for \(t = 1\), a fixed point of \(\phi_{\mu}\) is also a fixed point of (3). So, if the introspective model converges for \(t = 1\), the result is a logit equilibrium (although, in general, convergence is only ensured for \(t > 1\)). To summarize, the logit equilibrium generalizes Nash by relaxing the assumption of perfect decision making, and the introspective model generalizes logit by relaxing the assumption of perfect consistency between actions and beliefs.

Conclusion

Game theory is the closest thing to a unifying theory in social science, and it evokes some of the strongest antagonism as well. Critics argue that people are not perfectly rational, and that the experimental support for game theory is mixed. Daniel Kahneman, a noted Princeton psychologist, remarked in a plenary address: “When an economist says the evidence is mixed, that means the theory says one thing and the data say something else.” For most economic theorists, the subtext would be that there must be something wrong with the experiments because the theory is logically correct. The problem with this normative defense is that what is optimal in a game like the traveler's dilemma depends on what the other players actually do, not on what some theory says they should do.

This paper describes three complementary modifications of classical game theory. The models of introspection, learning/evolution, and equilibrium contain common stochastic elements that represent errors or unobserved preference shocks. These three approaches are like the “three friends” of classical Chinese gardening (pine, prunus, and bamboo): they fit together nicely, each with a different purpose. Models
of iterated noisy introspection are used to explain beliefs and choices in games played only once, where surprises are to be expected, and beliefs are not likely to be consistent with choices. With repetition, beliefs and decisions can be revised via learning or evolution. Choice distributions will tend to stabilize when there are no more surprises in the aggregate, and the resulting steady state constitutes a noisy (quantal response) equilibrium.

These theoretical perspectives have allowed us to predict initial play, adjustment patterns, and final tendencies in a series of laboratory experiments. Data patterns that our colleagues would previously characterize as “behavioral” (i.e., consistent with intuition but not with theory) are being picked up by these new stochastic game-theoretic models. There are discrepancies and surprises, but the overall pattern of results is surprisingly coherent, especially considering that we are using human subjects in interactive situations. In fact, the coauthor with a second degree in physics (Goeree) sometimes remarks that he is getting “that old physics feeling” when something unexpected happens in an economics experiment.

Laboratory experiments have been
intimately connected with the development of game theory, starting with the reaction to Nash's seminal theorem that appeared in this journal. Two of the three recipients of the first Nobel Prize in Economics given to game theorists (Nash and Selten) conducted experiments. Patterns of actual human data provide the landmarks that are needed to avoid becoming lost in the jungle of possibilities once theorists move away from assumptions of perfect rationality. The resulting models have the empirical content that makes them relevant for playing games, not just for doing theory.

References

1. Nash, J. (1950) Proceedings of the National Academy of Sciences, U.S.A., 36, 48-49.
2. Nasar, S. (1998) A Beautiful Mind, New York: Simon and Schuster.
3. Roth, A. E. (1995) in The Handbook of Experimental Economics, eds. Kagel, J. H. & Roth, A. E. (Princeton University Press, Princeton, New Jersey), pp. 3-109.
4. von Neumann, J. & Morgenstern, O. (1944) Theory of Games and Economic Behavior (Princeton University Press, Princeton, New Jersey).
5. Selten, R. (1975) International Journal of Game Theory, 4, 25-55.
6. Harsanyi, J. C. (1967-1968) Management Science, 14, 159-182, 320-334, 486-502.
7. Goeree, J. K. & Holt, C. A. (1999) “Ten Little Treasures of Game Theory and Ten Intuitive Contradictions,” working paper, University of Virginia.
8. Ochs, J. (1994) Games and Economic Behavior, 10, 202-217.
9. McKelvey, R. D. & Palfrey, T. R. (1992) Econometrica, 60, 803-836.
10. Beard, T. R. & Beil, R. O. (1994) Management Science, 40(2), 252-262.
11. Capra, C. M., Goeree, J. K., Gomez, R. & Holt, C. A. (1999) “Learning and Noisy Equilibrium Behavior in an Experimental Study of Imperfect Price Competition,” working paper, University of Virginia.
12. Roth, A. E. & Erev, I. (1995) Games and Economic Behavior, 8, 164-212.
13. Basu, K. (1994) American Economic Review, 84(2), 391-395.
14. Capra, C. M., Goeree, J. K., Gomez, R. & Holt, C. A. (1999) “Anomalous Behavior in a Traveler's Dilemma?” forthcoming in the American Economic Review.
15. Goeree, J. K. & Holt, C. A. (1998) “An Experimental Study of Costly Coordination,” working paper, University of Virginia.
16. Van Huyck, J. B., Battalio, R. C. & Beil, R. O. (1990) American Economic Review, 80, 234-248.
17. Cooper, R., DeJong, D. V., Forsythe, R. & Ross, T. W. (1992) Quarterly Journal of E