# Main Run 2024-3-1 (Discarded)
* 2024-3-1
* Start the training.
* The version is v0.7.0.
* learning rate = 0.005
* batch size = 256
* The network size is 6bx96c.
<br/>
* 2024-3-5
* Played 320k games.
* Accumulated around $5.412 \times 10^{7}$ 20bx256c eval queries.
* The strength is better than Leela Zero with [071](https://zero.sjeng.org/networks/257aeeb863dc51bfc598838361225459257377a4b2c9abd3e1ac6cdba1fcc88f.gz) weights; the Elo difference is +106 (see the Elo sketch after the result table).
* Sayuri: ```-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5 --use-optimistic-policy --random-moves-factor 0.1 --random-moves-temp 0.8```
* Leela Zero: ```-w 071.gz --noponder -v 400 -g -t 1 -b 1 --timemanage off --randomcnt 30 --randomtemp 0.8```
* Game results (400 games played against Leela Zero):
```
Name             black won  white won  total (win-rate)
Sayuri v0.7.0          141        118      259 (64.75%)
Leela Zero 0.17         82         59      141 (35.25%)
```
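For reference, the Elo differences quoted in this log are roughly what the standard logistic model implies for the raw win rates; a minimal sketch (the helper name is mine):

```python
import math

def elo_diff(win_rate: float) -> float:
    """Elo difference implied by a win rate under the logistic model."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

# 259 wins out of 400 games -> 64.75% win rate -> about +106 Elo
print(round(elo_diff(259 / 400)))  # 106
```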
<br/>
* 2024-3-7
* Played 465k games.
* Accumulated around $1.124 \times 10^{8}$ 20bx256c eval queries (see the note on this metric after this entry).
* The strength is better than Leela Zero with [091](https://zero.sjeng.org/networks/b3b00c6d75b4e74946a97b88949307c9eae2355a88f518ebf770c7758f90e357.gz) weights; the Elo difference is +51.
* Sayuri: ```-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5 --use-optimistic-policy --random-moves-factor 0.1 --random-moves-temp 0.8```
* Leela Zero: ```-w 091.gz --noponder -v 400 -g -t 1 -b 1 --timemanage off --randomcnt 30 --randomtemp 0.8```
* Game results (400 games played against Leela Zero):
```
Name             black won  white won  total (win-rate)
Sayuri v0.7.0          113        116      229 (57.25%)
Leela Zero 0.17         84         87      171 (42.75%)
```
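An aside on the "20bx256c eval queries" metric: a sketch of the conversion, assuming the forward-pass cost scales roughly with blocks x channels^2 (the scaling model and helper below are illustrative, not the exact accounting):

```python
def to_20bx256c_queries(raw_queries: float, blocks: int, channels: int) -> float:
    """Convert raw eval queries of a (blocks x channels) network into
    20bx256c-equivalent queries, assuming cost ~ blocks * channels^2."""
    return raw_queries * (blocks * channels ** 2) / (20 * 256 ** 2)

# Under this model, one 6bx96c query costs about 4.2% of a 20bx256c query.
print(to_20bx256c_queries(1.0, 6, 96))  # ~0.0422
```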
<br/>
* 2024-3-7
* Played 495k games.
* The loss looks unstable; drop the learning rate to 0.0025 (from 0.005). See the sketch below.
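Roughly, the change amounts to the following, assuming a PyTorch-style optimizer (the setup here is a stand-in, not the actual training code):

```python
import torch

net = torch.nn.Linear(8, 8)  # stand-in for the real 6bx96c network
opt = torch.optim.SGD(net.parameters(), lr=0.005, momentum=0.9)

# Halve the learning rate in place once the loss curve looks unstable.
for group in opt.param_groups:
    group["lr"] = 0.0025
```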
<br/>
* 2024-3-9
* Played 630k games.
* Accumulated around $1.781 \times 10^{8}$ 20bx256c eval queries.
* The strength is about the same as Leela Zero with [092](https://zero.sjeng.org/networks/ae205d8b957e560c19cc2bc935a8ea76d08dd9f88110ea783d50829bdca45329.gz) weights; the Elo difference is -18.
* Sayuri: ```-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5 --use-optimistic-policy --random-moves-factor 0.1 --random-moves-temp 0.8```
* Leela Zero: ```-w 092.gz --noponder -v 400 -g -t 1 -b 1 --timemanage off --randomcnt 30 --randomtemp 0.8```
* Game results (400 games played against Leela Zero):
```
Name             black won  white won  total (win-rate)
Sayuri v0.7.0           91         99       190 (47.5%)
Leela Zero 0.17        101        109       210 (52.5%)
```
<br/>
* 2024-3-9
* Played 640k games.
* Accumulated around $1.820 \times 10^{8}$ 20bx256c eval queries.
* Halt the 6bx96c training.
<br/>
* 2024-3-10
* Start the 10bx128c training.
* learning rate = 0.005
* batch size = 256
* The current replay buffer holds 175,000 games (see the sketch below).
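A minimal sketch of the sliding-window buffer this refers to (the class is illustrative, not the actual implementation):

```python
from collections import deque

class ReplayBuffer:
    """Keep only the most recent self-play games for training."""

    def __init__(self, max_games: int = 175_000):
        self._games = deque(maxlen=max_games)  # oldest games fall off

    def add(self, game) -> None:
        self._games.append(game)

    def __len__(self) -> int:
        return len(self._games)
```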
<br/>
* 2024-3-13
* Played 800k games (10bx128c played 160k games).
* Accumulated around $3.846 \times 10^{8}$ 20bx256c eval queries.
* The strength is better than Leela Zero with [116](https://zero.sjeng.org/networks/39d465076ed1bdeaf4f85b35c2b569f604daa60076cbee9bbaab359f92a7c1c4.gz) weights; the Elo difference is +82.
* Sayuri: ```-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5 --use-optimistic-policy --random-moves-factor 0.1 --random-moves-temp 0.8```
* Leela Zero: ```-w 116.gz --noponder -v 400 -g -t 1 -b 1 --timemanage off --randomcnt 30 --randomtemp 0.8```
* Game results (400 games played against Leela Zero):
```
Name             black won  white won  total (win-rate)
Sayuri v0.7.0          132        113      245 (61.25%)
Leela Zero 0.17         87         68      155 (38.75%)
```
<br/>
* 2024-3-15
* Played 915k games (10bx128c played 275k games).
* The loss looks unstable again; drop the learning rate to 0.0025 (from 0.005).
<br/>
* 2024-3-16
* The performance of the last 10bx128c network looks bad. The network cannot predict the win rate well under the scoring-area rule. For example, the latest weights think both sides are winning the same game under scoring area; the MCTS win rate is 50%.
* I should check what happened. Halt this run.
<br/>
* 2024-3-21
* Fixed some bugs; the result looks better than before on 7x7. But I do not think I can get a significant benefit from scoring territory, so I will disable the scoring-territory rule in the next run. However, that does not mean I am giving up on scoring territory.