Background and Motivation
As one of the team leaders, you know that nothing keeps your fellow elves more productive and motivated than a steady supply of candy canes! But all seven levels of the Candy Cane Forest are closed for revegetation, so the only ones available are stuck in the break room vending machines. And even though you receive free snacks on the job, the vending machines are always broken and don’t always give you what you want.
Problem Definition
Your objective is to find a strategy that beats your opponent as often as possible.
Both participants work with the same set of $N$ vending machines (bandits). Each bandit provides a random reward based on a probability distribution specific to that machine. Every round, each player selects ("pulls") a bandit, and the likelihood of a reward from a pulled bandit then decreases by 3%.
Each agent can see the other agent's move, but cannot see whether the other agent's pull earned a reward.
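The round mechanics described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the competition's actual environment. The names `DECAY`, `make_bandits`, and `play_round` are hypothetical. It also assumes that "decreases by 3%" means multiplying the pulled bandit's reward probability by 0.97 once per pull, and that each bandit pays a reward of 0 or 1.

```python
import random

# Assumption: a 3% decrease is multiplicative and applied once per pull.
DECAY = 0.97


def make_bandits(n, rng):
    """Create n bandits, each with a hidden reward probability in [0, 1)."""
    return [rng.random() for _ in range(n)]


def play_round(probs, choices, rng):
    """Play one round.

    probs   -- list of current reward probabilities (mutated in place)
    choices -- bandit index chosen by each player this round
    Returns one 0/1 reward per player, in the same order as choices.
    """
    # Draw every player's reward against the probabilities at the start of the round.
    rewards = [1 if rng.random() < probs[i] else 0 for i in choices]
    # Then decay each pulled bandit once per pull it received.
    for i in choices:
        probs[i] *= DECAY
    return rewards


if __name__ == "__main__":
    rng = random.Random(42)
    probs = make_bandits(5, rng)
    # Two players who pick bandits uniformly at random for 10 rounds.
    totals = [0, 0]
    for _ in range(10):
        choices = [rng.randrange(len(probs)) for _ in totals]
        for player, reward in enumerate(play_round(probs, choices, rng)):
            totals[player] += reward
    print("total rewards per player:", totals)
```

In this sketch a player would be told `choices` for both players but only their own entry of `rewards`, which mirrors the limited feedback described in the problem definition.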
contributed by < fourcolor >
Experimental environment
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ lscpu