UPDATE
First of all, I was wondering whether up to 600 games
(per player, under these bullet conditions) is enough data.
Here are the details, with facts of course (not papers):
Well, from my experience, I strongly believe the following:
A small number of games is sometimes OK, but usually it is not enough!
It is much better to play at least 1000-1500 games (per player).
Here, again and again, I mean with strong opening suites; otherwise,
many more thousands of games (per player) are required, such as 5000+.
Sure, running many more games would be much better. Actually, my
fingers are tired of explaining, but what is changing? Not much.
Yes, it will be the same story as in the past: we'll see some engines
that forfeit on time, and we'll also see testers who run a small
number of games. Anyhow, maybe this time I will be successful.
I know it's hard, but nothing is impossible, right? :)
Moreover, you know my goal is very simple: I'm only trying to help!
This is another question, of course, but
how many of us (TDs / testers) prefer thousands of games (per player)?
Not many, unfortunately, but that's all right. Plus, no one is
forcing us to produce more accurate metrics, but I think we should
do our best to get more valid data, for example by increasing the number of games!
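As a rough illustration of why the number of games matters, here is a minimal Python sketch that estimates how many games are needed before the 95% error margin shrinks to a given Elo value. It is only a sketch under stated assumptions that are not part of the tests in this post: games are treated as independent, wins and losses are assumed equally likely, a normal approximation is used, and the function names `elo_to_score` and `games_needed` are made up for this example.

```python
import math


def elo_to_score(elo_diff: float) -> float:
    """Expected score for a given Elo difference (standard logistic Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_margin: float, draw_ratio: float, z: float = 1.96) -> int:
    """Games needed so that the error margin is about +/- elo_margin.

    Assumptions (illustrative only): independent games, equal win and
    loss rates, normal approximation, z = 1.96 for roughly 95% confidence.
    """
    if elo_margin <= 0:
        raise ValueError("elo_margin must be positive")
    if not 0.0 <= draw_ratio < 1.0:
        raise ValueError("draw_ratio must be in [0, 1)")
    # With equal wins and losses the mean score is 0.5, so only decisive
    # games add variance: each one deviates by 0.5 from the mean.
    per_game_variance = (1.0 - draw_ratio) / 4.0
    # Convert the Elo margin into a margin on the score fraction.
    score_margin = elo_to_score(elo_margin) - 0.5
    return math.ceil(z * z * per_game_variance / score_margin ** 2)


if __name__ == "__main__":
    for draw_ratio in (0.96, 0.80, 0.50):
        n = games_needed(elo_margin=5.0, draw_ratio=draw_ratio)
        print(f"draw ratio {draw_ratio:.2f}: about {n} games for +/- 5 Elo")
```

The draw ratio is an input because the spread of results depends strongly on how many games are decisive, and that depends on the engines, the time control and the opening suite.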
And here are the latest tests; see what is going on:
1st test: here it seems all right: 1 Elo difference,
I mean very accurate, since both are copies of the same engine.
1 SF-POLY2 (Copy) +11/-10/=579 50.08% 300.5/600
2 SF-POLY2 10924a +10/-11/=579 49.92% 299.5/600
------------------------------------------------------
2nd test: it's all right again: 2 Elo difference,
I mean not bad at all, since both are copies of the same engine!
1 SF-CTG (Copy) +10/-8/=319 50.30% 169.5/337
2 SF-CTG 150724 +8/-10/=319 49.70% 167.5/337
------------------------------------------------------
3rd test: it's NOT all right!! 7 Elo difference,
I mean quite bad, since both are copies of the same engine!
1 Incognito(Copy) +23/-11/=566 51.00% 306.0/600
2 Incognito5 pro +11/-23/=566 49.00% 294.0/600
------------------------------------------------------
4th test: not much of an idea here, but 0 Elo difference.
As we see, the results are identical in strength.
And note that here the two are NOT copies of the same engine!
1 Rems MPV Sep24 +20/-20/=560 50.00% 300.0/600
2 Rems EXP 160824 +20/-20/=560 50.00% 300.0/600
------------------------------------------------------
And here is one of the latest tests, SF-POLY2 vs Rems MPV:
Test via Balsa suite: SF-POLY2 10924a performed better!
1 SF-POLY2 10924a +15/-13/=518 50.18% 274.0/546
2 Rems MPV Sep24 +13/-15/=518 49.82% 272.0/546
Test via Unique suite: Rems MPV performed better!
1 Rems MPV Sep24 +29/-25/=478 50.38% 268.0/532
2 SF-POLY2 10924a +25/-29/=478 49.62% 264.0/532
Plus, more tests are included in the database. Anyhow,
here are the overall results: identical in performance!
1 SF-POLY2 10924a +59/-58/=1615 50.03% 866.5/1732
2 Rems MPV Sep24 +58/-59/=1615 49.97% 865.5/1732
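For anyone who wants to check the Elo differences quoted above, here is a minimal Python sketch that turns a +W/-L/=D line into an Elo difference with an approximate 95% margin. It uses the standard logistic Elo formula and a normal approximation that treats the games as independent; those are assumptions of this sketch, not something measured here, and the function names `score_to_elo` and `elo_with_margin` are made up for the example.

```python
import math


def score_to_elo(p: float) -> float:
    """Elo difference implied by a score fraction p (logistic Elo formula)."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logarithm finite at 0% or 100%
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, lower, upper) for one +W/-L/=D result.

    Assumptions (illustrative only): independent games and a normal
    approximation; z = 1.96 gives roughly a 95% interval.
    """
    n = wins + losses + draws
    if n == 0:
        raise ValueError("no games played")
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - mean) ** 2
        + losses * (0.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
    ) / n
    std_error = math.sqrt(variance / n)
    return (
        score_to_elo(mean),
        score_to_elo(mean - z * std_error),
        score_to_elo(mean + z * std_error),
    )


if __name__ == "__main__":
    # W/L/D copied from the first line of each test above.
    results = {
        "1st test": (11, 10, 579),
        "2nd test": (10, 8, 319),
        "3rd test": (23, 11, 566),
        "4th test": (20, 20, 560),
        "overall": (59, 58, 1615),
    }
    for name, (w, l, d) in results.items():
        elo, lo, hi = elo_with_margin(w, l, d)
        print(f"{name}: {elo:+.1f} Elo (about {lo:+.1f} to {hi:+.1f})")
```

Under these assumptions, the width of the interval at a few hundred games is on the order of several Elo in each direction, which is the practical reason to look at the margin and not only at the headline difference.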
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, 64 MB Hash, 4-MEN
Note: the Balsa and Unique suites are used for the openings.
GAMES:
https://mega.nz/file/XxhEURCK#buTFaVhRP ... Z-FDoKO7U4
As final words,
this is what I have been trying to say to all of you over the past years:
In short, if I were new/amateur/a beginner in computer chess
and checked only a small number of games, then I'd say something like
"Oh yes, SF POLY is stronger" or, just the opposite, "Rems is stronger" ))
And I hope, again and again,
that all my data will be useful, purely for the progress of computer chess!
Greetings