Author: cheezad
Project walkthrough video
SuNsHiNe-75
For the fixed-point implementation, consider replacing double with the data types defined in stdint.h.
cheezad
Understood, thanks!
aa860630
The explanation of how MCTS works could be accompanied by illustrative figures.
cheezad
Understood! This has now been updated in Lab0-c.
hugo0406
What exactly is the evaluation criterion behind "the highest Upper Confidence Bound (UCB)"?
cheezad
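I describe the UCB formula in more detail in Lab0-c; it is simply a comparison of numeric values.
Redo the third assignment and consolidate other students' results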
search tree
Monte Carlo Tree Search (MCTS) uses a search tree to identify the next best move. Starting from the root, the selection step repeatedly moves to the child node with the highest Upper Confidence Bound (UCB) value. Upon reaching a leaf node, it expands the next layer of nodes. During the rollout phase, it plays random moves until the game ends in a win or a draw. The result is then backpropagated through the visited nodes to update their values.
For a worked example of how the algorithm runs, see the end of the corresponding section in Lab0-c.
If the Pseudo Random Number Generator (PRNG) is of poor quality, meaning the numbers it generates are not uniformly distributed, the rollout stage of MCTS suffers directly. Some positions may be consistently overlooked during simulation, leaving blind spots on the board. The statistics backpropagated from those rollouts then skew the search tree, so the search may fail to find the optimal move.
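One way to check for such blind spots is to count how often each board cell is drawn and compare the counts with a uniform expectation using a Pearson chi-square statistic. The sketch below assumes a 9-cell board and uses xorshift32 as a stand-in generator; poor_prng is a deliberately broken generator invented for this illustration (it only returns multiples of 3, so six of the nine cells are never chosen). None of these names come from Lab0-c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_CELLS 9 /* assumed 3x3 board */

/* Pearson chi-square statistic of the counts against a uniform expectation. */
static double chi_square_uniform(const unsigned long *counts,
                                 size_t k,
                                 unsigned long total)
{
    assert(k > 0 && total > 0);
    double expected = (double) total / k;
    double chi2 = 0.0;
    for (size_t i = 0; i < k; i++) {
        double d = (double) counts[i] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}

/* Marsaglia's xorshift32; the state must be seeded with a non-zero value. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Deliberately poor generator: rounds every output down to a multiple
 * of 3, so (value % 9) can only be 0, 3, or 6. */
static uint32_t poor_prng(uint32_t *state)
{
    uint32_t x = xorshift32(state);
    return x - x % 3;
}

/* Draw cells with the given generator and return the chi-square statistic. */
static double cell_chi_square(uint32_t (*gen)(uint32_t *),
                              uint32_t seed,
                              unsigned long draws)
{
    unsigned long counts[N_CELLS] = {0};
    uint32_t state = seed;
    for (unsigned long i = 0; i < draws; i++)
        counts[gen(&state) % N_CELLS]++;
    return chi_square_uniform(counts, N_CELLS, draws);
}
```

Calling cell_chi_square(poor_prng, 1, 90000) and cell_chi_square(xorshift32, 1, 90000) side by side shows the difference: with six cells never drawn, the statistic for the poor generator is at least twice the number of draws, while a reasonably uniform generator should stay far below that.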
Quantitative analysis, backed by the corresponding theory.
The score is stored as a double.
The UCT score is calculated from the score and EXPLORATION_FACTOR, which is defined as sqrt(2); in C, both sqrt and log take and return double.
Selection of the next move in the selection phase.
Simulation of the game involves
?
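Since the items above all depend on double, the following is a rough sketch of what a fixed-point replacement built on stdint.h types could look like. The signed Q16.16 format, the fixed_t type, and every function name here are assumptions made for illustration, not the layout used in Lab0-c. The sketch covers multiplication, division, and square root only; a fixed-point log would still be needed for the full UCT formula.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed format: signed Q16.16 (16 integer bits, 16 fractional bits). */
typedef int32_t fixed_t;
#define FIXED_SHIFT 16
#define FIXED_ONE ((fixed_t) 1 << FIXED_SHIFT)

/* Convert an integer to Q16.16; the value must fit in 16 signed bits. */
static inline fixed_t fixed_from_int(int32_t x)
{
    assert(x > -32768 && x < 32768);
    return x * FIXED_ONE;
}

/* Multiply in 64 bits, then scale back (truncates toward zero). */
static inline fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t) (((int64_t) a * b) / FIXED_ONE);
}

/* Pre-scale the dividend so the quotient keeps 16 fractional bits. */
static inline fixed_t fixed_div(fixed_t a, fixed_t b)
{
    assert(b != 0);
    return (fixed_t) (((int64_t) a * FIXED_ONE) / b);
}

/* Square root via the digit-by-digit integer method.
 * sqrt(a / 2^16) * 2^16 equals the integer square root of a * 2^16. */
static fixed_t fixed_sqrt(fixed_t a)
{
    assert(a >= 0);
    uint64_t n = (uint64_t) a << FIXED_SHIFT;
    uint64_t root = 0;
    uint64_t bit = (uint64_t) 1 << 46; /* largest power of four below 2^47 */
    while (bit > n)
        bit >>= 2;
    while (bit) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return (fixed_t) root;
}
```

For example, fixed_sqrt(fixed_from_int(2)) yields 92681, which is sqrt(2) truncated to 16 fractional bits and could stand in for EXPLORATION_FACTOR. Whether Q16.16 offers enough range and precision for the scores is something to measure rather than assume.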
Make good use of perf and valgrind