2023-8-14
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 071.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 104 125 229 (57.25%)
Leela Zero 0.17 75 96 171 (42.75%)
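The totals in these tables can be turned into a rough error bar and an Elo gap. The sketch below is only an illustration, not part of the original measurement scripts: `summarize` is a hypothetical helper, the interval uses the normal approximation to a binomial proportion, and the Elo figure assumes the standard logistic Elo model.

```python
import math


def summarize(wins: int, games: int) -> dict:
    """Win rate, approximate 95% interval, and Elo gap for one engine.

    Assumes 0 < wins < games, otherwise the Elo formula is undefined.
    """
    p = wins / games
    # Normal-approximation standard error of a binomial proportion.
    se = math.sqrt(p * (1.0 - p) / games)
    # Logistic Elo model: p = 1 / (1 + 10 ** (-diff / 400)).
    elo = 400.0 * math.log10(p / (1.0 - p))
    return {
        "win_rate": p,
        "ci95": (p - 1.96 * se, p + 1.96 * se),
        "elo_diff": elo,
    }


# First match above: 229 wins out of 400 games.
result = summarize(wins=229, games=400)
print(
    f"win rate {result['win_rate']:.2%}, "
    f"95% CI {result['ci95'][0]:.2%} to {result['ci95'][1]:.2%}, "
    f"Elo diff {result['elo_diff']:+.0f}"
)
```

With 400 games the interval is roughly plus or minus 5 percentage points, so results close to 50% in the later tables are hard to tell apart from an even match.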
2023-8-15
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 081.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 93 112 205 (51.25%)
Leela Zero 0.17 88 107 195 (48.75%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 091.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 103 94 197 (49.25%)
Leela Zero 0.17 106 97 203 (50.75%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 092.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 109 124 233 (58.25%)
Leela Zero 0.17 76 91 167 (41.75%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 095.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 109 124 233 (58.25%)
Leela Zero 0.17 76 91 167 (41.75%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 102.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 108 128 236 (59.00%)
Leela Zero 0.17 72 92 164 (41.00%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 105.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 108 112 220 (55.00%)
Leela Zero 0.17 88 92 180 (45.00%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 111.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 100 103 203 (50.75%)
Leela Zero 0.17 97 100 197 (49.25%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0 --score-utility-factor 0.1 --cpuct-init 0.5
-w 116.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 91 111 202 (50.50%)
Leela Zero 0.17 89 109 198 (49.50%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 117.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 85 117 202 (50.50%)
Leela Zero 0.17 83 115 198 (49.50%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 122.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 83 126 209 (52.25%)
Leela Zero 0.17 74 117 191 (47.75%)
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 135.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 78 130 208 (52.00%)
Leela Zero 0.17 70 122 192 (48.00%)
Learning rate: 0.00125 (from 0.0025). We will not drop the learning rate again for the 15bx192c network. A low learning rate helps value prediction and also improves strength. However, a high learning rate keeps the network plastic. I think the current learning rate is low enough.
-w current_weights -t 1 -b 1 -p 400 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 143.gz --noponder -v 400 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.0 88 107 195 (48.75%)
Leela Zero 0.17 93 112 205 (51.25%)
-w current_weights -t 4 -b 2 -p 1600 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 151.gz --noponder -v 1600 -g -t 4 -b 2
Name black won white won total (win-rate)
Sayrui v0.6.0 93 100 193 (48.25%)
Leela Zero 0.17 100 107 207 (51.75%)
Learning rate: 0.0005 (from 0.00125). However, we do not use these networks for self-play. We call them the special 15b weights.
-w special_weights -t 4 -b 2 -p 1600 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 157.gz --noponder -v 1600 -g -t 4 -b 2
Name black won white won total (win-rate)
Sayrui v0.6.0 102 99 201 (50.25%)
Leela Zero 0.17 101 98 199 (49.75%)
Hi Hung-Tse Lin
I think there are several possible reasons why Sayuri doesn't have
enough growth of her strength.
1. Strength measurements
Once the go engine reaches a certain strength, it's hard to
compare exact strength because games tend to have the same
progression. The solution for this problem is simple: provide
an opening book to be used for measurement and start games from
specified positions of an opening book.
2. The number of visits for self-play is small.
The more times the go engine is searched, the stronger it becomes,
and this strength for self-play games is a factor that determines
the limit of the accuracy of value network predictions. The
solution is to increase the number of visits for self-play games.
But I recommend you to try other solutions, because this solution
slows down the RL progress considerably.
3. Learning rate is too small.
From your RL notes, this is probably not the cause.
4. Limitation of the neural network.
The fewer the number of parameters in the neural network, the
faster it can reach a plateau. FYI, I changed the neural network
structure from 15 blocks with 192 channels to 20 blocks with 256
channels when I generated 2,000,000 self-play games.
In self-play games the difference in Elo rating is usually twice
as large as when playing against other Go engines. I'd like you to
change the way of measuring strength and see if there is a
difference. The next thing to try is to change the neural network
structure.
I think this may not be a bug, and the RL process should continue.
Best regards,
Yuki Kobayashi
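The opening-book suggestion in point 1 of the letter could be sketched roughly as follows. This is only an illustration under stated assumptions: `build_schedule` is a hypothetical helper, the openings are placeholder move lists rather than a real book, and each opening is played twice with colors swapped so that neither engine benefits from a favorable starting position.

```python
import random
from typing import List, Tuple

Opening = List[str]  # GTP-style vertices, alternating black and white moves
Game = Tuple[Opening, str, str]  # (opening, black engine, white engine)


def build_schedule(
    openings: List[Opening],
    engine_a: str,
    engine_b: str,
    games: int,
    seed: int = 0,
) -> List[Game]:
    """Pick random openings and play each one twice with colors swapped."""
    rng = random.Random(seed)
    schedule: List[Game] = []
    while len(schedule) < games:
        opening = rng.choice(openings)
        schedule.append((opening, engine_a, engine_b))
        schedule.append((opening, engine_b, engine_a))
    return schedule[:games]


# Placeholder openings for illustration only, not a real opening book.
openings = [
    ["Q16", "D4", "Q4", "D16"],
    ["Q16", "D4", "R4"],
    ["D16", "Q4", "C4"],
]
schedule = build_schedule(openings, "Sayuri", "Leela Zero", games=400)
print(len(schedule), "games scheduled")
```

Each scheduled game would then start from the listed moves before the engines take over, which spreads the games across different positions instead of repeating the same progression.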
128 * 0.000625. The 15b's is 256 * 0.0005. The 15b's is lower by a factor of 2.5, since (0.000625 / 128) / (0.0005 / 256) = 2.5, but their strengths are equal. Leela Zero and KataGo both increased their network size when they reached around LZ150. I think updating the network size now is a reasonable choice.
2023-11-18
-w current_weights -t 4 -b 2 -p 1600 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 173.gz --noponder -v 1600 -g -t 4 -b 2
Name black won white won total (win-rate)
Sayrui v0.6.1 106 110 216 (54.00%)
Leela Zero 0.17 90 94 184 (46.00%)
2023-12-13
-w current_weights -t 1 -b 1 -p 800 --lcb-reduction 0.02 --score-utility-factor 0.1 --cpuct-init 0.5
-w 174.gz --noponder -v 800 -g -t 1 -b 1
Name black won white won total (win-rate)
Sayrui v0.6.1 92 114 206 (51.50%)
Leela Zero 0.17 92 102 194 (48.50%)
floodgate and pairing to design a multi-player match system. Hmm… It is written in Ruby, which I have never used.
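As a starting point for a multi-player match system, pairing can be sketched in Python. This is not floodgate's pairing logic. It is a plain round-robin schedule using the circle method, swapped in as a simple stand-in, and `round_robin` and the player list are hypothetical.

```python
from typing import List, Optional, Tuple


def round_robin(players: List[str]) -> List[List[Tuple[str, str]]]:
    """Circle-method round robin; each pair is (black, white)."""
    entries: List[Optional[str]] = list(players)
    if len(entries) % 2 == 1:
        entries.append(None)  # odd player count: one player sits out per round
    n = len(entries)
    rounds: List[List[Tuple[str, str]]] = []
    for r in range(n - 1):
        pairs: List[Tuple[str, str]] = []
        for i in range(n // 2):
            a, b = entries[i], entries[n - 1 - i]
            if a is None or b is None:
                continue  # this player has a bye in this round
            # Swap colors on odd rounds so colors are not fixed by seat.
            pairs.append((a, b) if r % 2 == 0 else (b, a))
        rounds.append(pairs)
        # Keep the first entry fixed and rotate the rest by one position.
        entries = [entries[0], entries[-1]] + entries[1:-1]
    return rounds


rounds = round_robin(["Sayuri", "Leela Zero", "KataGo"])
for number, pairs in enumerate(rounds, start=1):
    print(f"round {number}: {pairs}")
```

Every pair of players meets exactly once per cycle. Running a second cycle with colors reversed would give each pair one game with each color.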