================================================================================ Mastering the game of Go with deep neural networks and tree search ================================================================================ Monte Carlo Tree Search Monte Carlo Tree Search: plays the game up to end point from a specific state Suppose you are in the state where you need to put the stone in Go match. If you find all possible cases, you can find optimal decision at that situation. ================================================================================ Complete information game (like Go game) has optimal value function ================================================================================ Tree search? Why tree search? You search optimal decision. But you search optimal decision from the current state. Current state becomes root node (top of the tree). ================================================================================ But considering infinite number of possible cases is impossible. So, you will consider possible cases in smart way ================================================================================ Smart way is MCTS MCTS has 2 methods. 1. You reduce width. You have (19,19) grid in Go match. So, you can choose 361 options. You won't test all 361 options. There are good option and bad options (like end of corner suddenly) Then, you can reduce "width" 2. You reduce depth. Originally you had to play game up to end of game (very deep) But if you can know situation without playing game up to end of game, you don't need to do it. ================================================================================ Monte Carlo is used to approximate probability distribution by using iterative simulation when you can't know probability distribution ================================================================================ Monte Carlo Tree Search Monte Carlo Search Monte Carlo Roll out If you use method which performs iterative trials when you deal with complicated probability distribution that method is called Monte Carlo as prefix