# 2022.5.25 A small CFR experiment

## Settings

### Game

2-player Dudo (each player has 1 die)

### CFR algorithm

Outcome Sampling Monte Carlo CFR (Lanctot et al., 2009)

- Strategies on the sampled path are updated in each iteration

## AI agents

The following $8$ agents are prepared.

### Naive Dudo AI

|#iterations|exploitability|
|:-:|:-:|
|$10^3$|$1.6147998$|
|$10^4$|$1.2615279$|
|$10^5$|$0.7056980$|
|$10^6$|$0.2901173$|

### Abstract Dudo AI

- Abstraction: players forget all players' decisions except the last one
- Exploitability is computed in the abstract Dudo game.

|#iterations|exploitability|
|:-:|:-:|
|$10^3$|$1.4553667$|
|$10^4$|$0.9087963$|
|$10^5$|$0.3376476$|
|$10^6$|$0.1703836$|

## Result

### Win rate table

- Each entry is the win rate of the column AI against the row AI
- $10^5$ trials for each pair
- Na$k$ and Ab$k$ denote the Naive and Abstract Dudo AI trained for $10^k$ iterations

||Na3|Na4|Na5|Na6|Ab3|Ab4|Ab5|Ab6|
|-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|**Na3**|-|0.64355|0.70853|0.72941|0.54092|0.60223|0.60282|0.60350|
|**Na4**|-|-|0.59068|0.62911|0.44060|0.50413|0.53785|0.53930|
|**Na5**|-|-|-|0.54494|0.39779|0.46675|0.50873|0.51763|
|**Na6**|-|-|-|-|0.37375|0.45145|0.49207|0.49998|
|**Ab3**|-|-|-|-|-|0.67113|0.69791|0.71040|
|**Ab4**|-|-|-|-|-|-|0.56866|0.58357|
|**Ab5**|-|-|-|-|-|-|-|0.51355|
|**Ab6**|-|-|-|-|-|-|-|-|
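
Some entries in the win rate table are very close to $0.5$, so it helps to know how much sampling noise to expect from $10^5$ trials. The sketch below computes a normal-approximation 95% interval for a few entries copied from the table. It assumes each trial is an independent win/loss outcome with a fixed win probability, which the note does not state explicitly. The helper name `win_rate_interval` is made up for this illustration.

```python
import math


def win_rate_interval(p_hat, n_trials, z=1.96):
    """Normal-approximation (Wald) interval for a binomial win rate.

    Assumes independent trials with a fixed win probability
    (an assumption of this sketch, not something stated in the note).
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_trials)
    return p_hat - z * se, p_hat + z * se


N_TRIALS = 10**5  # trials per pair, as stated above

# (row AI, column AI) -> win rate of the column AI, copied from the table
samples = {
    ("Na6", "Ab6"): 0.49998,
    ("Na5", "Ab5"): 0.50873,
    ("Na3", "Na6"): 0.72941,
}

for (row, col), p in samples.items():
    lo, hi = win_rate_interval(p, N_TRIALS)
    verdict = "includes 0.5" if lo <= 0.5 <= hi else "excludes 0.5"
    print(f"{col} vs {row}: {p:.5f} in [{lo:.5f}, {hi:.5f}] ({verdict})")
```

Under these assumptions the half-width of the interval is at most about $0.0031$, so the Na6 vs Ab6 entry ($0.49998$) cannot be distinguished from an even match, while differences of a percentage point or more are well outside the noise.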