
# Moving Model

To train a moving model, the first task is to find out how the picking prediction can be used as useful information. The previous part discussed this task. The second task focuses on how the model can simulate the MCTS (Monte Carlo tree search) method using only a one-shot AI model. MCTS is a well-known method not only in chess but also in decision making in general. This section introduces the design of the method and the thinking behind it.

## Model

Our model combines a Conv3D layer and a Transformer self-attention layer into a 3D block. N such blocks can be stacked, connected by skip connections. The last layer applies a Conv3D to output the prediction.

![](https://i.imgur.com/5DWIoWy.png)

### 3D Convolution

As discussed in the Picking Model section, this structure extracts the local information of the board by considering both position and piece type.

### Self-Attention

Self-attention is a mechanism in which the layer output depends on the relationships between nodes. Using this mechanism, the model performs self-attention over the local features to output a decision value. The output of the Conv3D is first flattened and then fed into this layer. In the model, we use the Transformer encoder for the implementation, so the output is a probability distribution over decisions.

## Performance

Our best performance:

![](https://i.imgur.com/Dq3d42r.png)

```
Best Top 1 Accuracy : 44%
Best Top 3 Accuracy : 66%
Best Top 5 Accuracy : 68%
```

The overall performance is not quite satisfactory. As shown in the figure, the model learns only part of the knowledge of how to choose the moving position. From the top-3 and top-5 accuracy curves, we can see that the model stops moving the correct answer into the top-k candidate pool and starts to overfit the training examples.
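
The top-1, top-3, and top-5 accuracies reported above count a prediction as correct when the true move is among the k highest-scoring candidates. The following is a minimal sketch of that metric in plain Python. The function name `top_k_accuracy` and the toy scores are hypothetical illustrations, not taken from the original training code.

```python
def top_k_accuracy(scores, targets, k):
    """Fraction of samples whose target index is among the k highest scores.

    scores  : list of per-sample score lists (one score per candidate move)
    targets : list of correct candidate indices, one per sample
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        return 0.0
    hits = 0
    for row, target in zip(scores, targets):
        # Candidate indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Hypothetical toy data: 4 board positions, 5 candidate target squares each.
scores = [
    [0.10, 0.50, 0.20, 0.15, 0.05],  # correct move (index 1) is ranked 1st
    [0.30, 0.10, 0.40, 0.15, 0.05],  # correct move (index 0) is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # correct move (index 2) is ranked 3rd
    [0.40, 0.30, 0.15, 0.10, 0.05],  # correct move (index 4) is ranked 5th
]
targets = [1, 0, 2, 4]

for k in (1, 3, 5):
    print(f"top-{k} accuracy: {top_k_accuracy(scores, targets, k):.2f}")
```

Computing this metric each epoch on both the training set and a held-out set is one way to observe the behavior described above: when the held-out top-3 and top-5 curves flatten while the training curves keep rising, the model is likely overfitting.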