79440275

Date: 2025-02-14 18:34:32
Score: 0.5

I am not an expert; I am myself modelling another game using MCTS and a modified version of MuZero.

Regarding your question: from your position (the original state) in the game, you expand the tree once (expanding means generating your possible moves (actions)) to create the neighbor nodes. From each node you roll out ("build a sufficiently deep/large tree") with random-but-legal moves until you reach the end of the game, collect the results (wins, losses), and aggregate them by node. You keep going down the tree in that process of expansion (node creation) for as long as the time or iteration budget you impose on the program allows. Once finished, you look at the wins and losses and the number of visits to each node, and select the best action (move) from your original position. That is going to be your turn's play. Recall, you started from a position and expanded a tree to help you decide what to play.
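The loop described above (expand, roll out with random-but-legal moves, aggregate wins/losses and visits per node, then pick the best move from the root) can be sketched roughly like this. This is a minimal illustration on a hypothetical toy game of Nim (players alternately remove 1-3 stones; whoever takes the last stone wins), not the asker's game; the UCB1 formula is the standard choice for descending the tree, and all names here are made up for the example.

```python
import math
import random

class Node:
    def __init__(self, state, player, parent=None, move=None):
        self.state = state      # stones remaining in the toy Nim game
        self.player = player    # player to move at this node (1 or 2)
        self.parent = parent
        self.move = move        # move that led from parent to this node
        self.children = []
        self.visits = 0
        self.wins = 0           # wins for the player who moved INTO this node

def legal_moves(state):
    return [m for m in (1, 2, 3) if m <= state]

def ucb1(node, c=1.4):
    # Balance exploitation (win rate) and exploration (rarely visited nodes).
    return (node.wins / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def rollout(state, player):
    # Play random-but-legal moves until the end; return the winner.
    while True:
        state -= random.choice(legal_moves(state))
        if state == 0:
            return player       # this player took the last stone and wins
        player = 3 - player

def mcts(root_state, root_player, iterations=500):
    root = Node(root_state, root_player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1.
        while node.children and len(node.children) == len(legal_moves(node.state)):
            node = max(node.children, key=ucb1)
        # 2. Expansion: add one previously untried child move.
        if node.state > 0:
            tried = {c.move for c in node.children}
            m = random.choice([x for x in legal_moves(node.state) if x not in tried])
            child = Node(node.state - m, 3 - node.player, node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random rollout from the new node (terminal = mover won).
        if node.state == 0:
            winner = node.parent.player
        else:
            winner = rollout(node.state, node.player)
        # 4. Backpropagation: update visits and wins up to the root.
        while node:
            node.visits += 1
            if node.parent and node.parent.player == winner:
                node.wins += 1
            node = node.parent
    # The answer for this turn: the most-visited move from the root.
    return max(root.children, key=lambda c: c.visits).move
```

For example, `mcts(2, 1)` should learn to take both stones and win immediately. The same four-phase skeleton carries over to any game once you swap in its own state, `legal_moves`, and terminal test.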

Then wait for the opponent's play, and with that new position (state) repeat the above process.

For each state (position in the game), you move down into the tree below your state, collect as much data as possible, and decide your next move as a function of the good or bad outcomes in the (small) subset of the tree you have explored.

If you were playing chess and had a super mind, at your turn you would not look at all possible moves. You would look at only 2 or 3 of them, imagine playing each one of them until the end of the game, and then from those 2 or 3 select the one to play. That's MCTS.

Greetings

Reasons:
  • Long answer (-1):
  • No code block (0.5):
  • Low reputation (1):
Posted by: William Colmenares