| AlphaGo's Decision Making
Woosuk Park, Sungyong Kim, Keunhyoung Luke Kim and Jeounghoon Kim
In this paper, we study the similarities and differences between the process
of decision making in humans and AlphaGo in playing Baduk (Go, Weiqi).
Previous discussions of AlphaGo’s unique or unconventional moves ignored how
AlphaGo tends to play in different situations: (1) when she is leading the
game, (2) when she is falling behind, and (3) when the game is close.
Nor did they pay due attention to the problem of AlphaGo’s strategic choice
of moves. We argue (1) that AlphaGo tends to play very thick
and safe enclosing moves when she is leading the game, (2) that she tends to
play do-or-die (all-or-nothing or gambling) moves that are backed up by a very
carefully calculated scheming strategy when there is no hope of winning the game,
and (3) that she tends to find creative moves in order to take the initiative
when the game is close. After sharpening the concept of strategy itself,
we also argue that there is sufficient ground to ascribe strategic reasoning to
AlphaGo. Based on the DeepMind AlphaGo team’s monumental paper in Nature
[24], we check to what extent our results are compatible with AlphaGo’s
structure and its operating principles. What is most striking in our examination
of AlphaGo’s decision making is that her features can be better explained by
prospect theory [14] than by expected utility theory. To test this
hypothesis, we analyze many examples from AlphaGo’s games. We conclude
with a brief discussion of the possible implications of the present study and the
urgent problems remaining for future study.
|