Sedat Canbaz
- Sedat Canbaz
- Posts: 594
- Joined: Mon Jun 19, 2023 8:49 am
- Has thanked: 682 times
- Been thanked: 2067 times
Re: Sedat Canbaz
One more thing, once more:
Thanks to all engine authors as well; without them I could not run these tests.
Special thanks to the authors of Cfish, Berserk, Spectral and ShashChess!
One more note:
Unfortunately, every engine of the Eman series I tested crashed or was
buggy on my tournament machine when using an Eman exp file (1.5+ GB).
It seems Eman needs serious optimization for modern machines,
at least when using HUGE exp files, and certainly in Gauntlet mode etc.
These crashes appear at the beginning of the test, and CuteChess
terminates directly. Sad, really: we are in 2024, but you see...
But the good news is that
Spectral is Spectral: this great engine is stable with the Eman exp file as well!
It does not matter whether it is Gauntlet, Round-Robin etc. Well done to Mr. Anton,
and keep up the great work!
Greetings )
Re: Sedat Canbaz
Hello there,
SCCT - Unofficial Test with OrgZ's latest top engines, 2024!
The target is simply to check which of the OrgZ engines are better.
First of all, let's start with the good news:
A great performance by one of my favorite engines: SF-POLY!
Well done to Mr. Tanick Ramz!
The bad news is that
NONE of them can use CTG books, so I hope the next SF-POLY version will be able to.
At least one of them should; if not, it will be a BIG miss. Thanks in advance!
Code: Select all
                      1             2             3             4             Total
1 SF-POLY2 10924a     **            102.0 - 98.0  102.5 - 97.5  103.5 - 96.5  308.0/600
2 JigSaw 6.0          98.0 - 102.0  **            102.0 - 98.0  98.5 - 101.5  298.5/600
3 Private 19Sf VICE   97.5 - 102.5  98.0 - 102.0  **            101.5 - 98.5  297.0/600
4 Sailfish 3          96.5 - 103.5  101.5 - 98.5  98.5 - 101.5  **            296.5/600
GAMES:
https://mega.nz/file/TwxmgI5Q#NTYlZRhwp ... DiRS0bdpG0
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, Balsa/Unique, 30s+0.6s, 64 MB Hash, 4-MEN
More Details:
Sailfish seems to be less stable: it lost 1 game on time,
while all the rest have been stable so far, with no games lost on time!
Greetings
Re: Sedat Canbaz
UPDATE
First of all, I was wondering: are up to 600 games
enough data (per player, under these bullet conditions)?
Here are the details, with facts of course (not papers):
From my experience, I strongly believe that
a small number of games is sometimes OK, but usually not enough!
It is much better to play at least 1000-1500 games (per player).
Here, again and again, I am referring to strong opening suites; otherwise
many more thousands of games (per player) are required, such as 5000+.
Of course, running even more games is better still. Actually my
fingers are tired of explaining, but what is changing? Not much.
Yes, it will be the same story as in the past: we will see some engines
forfeit on time, and we will also see testers who run a small
number of games. Anyhow, maybe this time I will be successful.
I know it's hard, but nothing is impossible, right? )
Moreover, you know my target is very simple: I am only trying to help!
This is another question of course, but
how many of us (TDs / testers) prefer thousands of games (per player)?
Not many, unfortunately, but that's all right; besides, no one is
forcing us to produce more accurate metrics. Still, I think we should
do our best to get more valid data, for example by increasing the number of games!
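As a rough illustration of why the number of games matters at high draw ratios, here is a minimal sketch. It assumes independent games, a score close to 50% and a normal approximation (the 1.96 factor); the function names `elo_margin_95` and `games_needed` are hypothetical and not taken from any tool used in these tests.

```python
import math

def elo_margin_95(games: int, draw_ratio: float) -> float:
    """Approximate 95% error margin (in Elo) for a match near a 50% score.

    Assumes independent games and equal win/loss shares, so the per-game
    score variance is (1 - draw_ratio) / 4.
    """
    std_err = math.sqrt((1.0 - draw_ratio) / 4.0 / games)
    # Slope of the logistic Elo curve at a 50% score: 1600 / ln(10) Elo per unit of score.
    return 1.96 * std_err * 1600.0 / math.log(10)

def games_needed(target_elo: float, draw_ratio: float) -> int:
    """Smallest number of games whose 95% margin is at or below target_elo."""
    per_game = elo_margin_95(1, draw_ratio)
    return math.ceil((per_game / target_elo) ** 2)

# Margin for a few game counts at an assumed 95% draw ratio.
for n in (600, 1500, 5000):
    print(n, round(elo_margin_95(n, 0.95), 1))
```

The margin shrinks only with the square root of the number of games, so halving it takes about four times as many games.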
And here are the latest tests; see what is going on:
Code: Select all
1st test: here all seems right: 1 Elo difference.
Very accurate, since both are copies of the same engine.
1 SF-POLY2 (Copy)    +11/-10/=579   50.08%  300.5/600
2 SF-POLY2 10924a    +10/-11/=579   49.92%  299.5/600
------------------------------------------------------
2nd test: all right again: 2 Elo difference.
Not bad at all, since both are copies of the same engine!
1 SF-CTG (Copy)      +10/-8/=319    50.30%  169.5/337
2 SF-CTG 150724      +8/-10/=319    49.70%  167.5/337
------------------------------------------------------
3rd test: NOT all right!! 7 Elo difference.
That is bad, since both are copies of the same engine!
1 Incognito (Copy)   +23/-11/=566   51.00%  306.0/600
2 Incognito5 pro     +11/-23/=566   49.00%  294.0/600
------------------------------------------------------
4th test: not much to say here, but 0 Elo difference.
As we see, the results are identical in strength.
Note that here the two are NOT copies of the same engine!
1 Rems MPV Sep24     +20/-20/=560   50.00%  300.0/600
2 Rems EXP 160824    +20/-20/=560   50.00%  300.0/600
------------------------------------------------------
And here is one of the last tests: SF-POLY2 vs Rems MPV
Test via Balsa suite: SF-POLY2 10924a performed better!
1 SF-POLY2 10924a    +15/-13/=518   50.18%  274.0/546
2 Rems MPV Sep24     +13/-15/=518   49.82%  272.0/546
Test via Unique suite: Rems MPV performed better!
1 Rems MPV Sep24     +29/-25/=478   50.38%  268.0/532
2 SF-POLY2 10924a    +25/-29/=478   49.62%  264.0/532
More tests are included in the database. Anyhow,
here are the overall results: identical in performance!
1 SF-POLY2 10924a    +59/-58/=1615  50.03%  866.5/1732
2 Rems MPV Sep24     +58/-59/=1615  49.97%  865.5/1732
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, 64 MB Hash, 4-MEN
Note: the Balsa and Unique suites are used as openings.
GAMES:
https://mega.nz/file/XxhEURCK#buTFaVhRP ... Z-FDoKO7U4
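The Elo differences quoted in the tests above can be reproduced from the +W/-L/=D figures. The sketch below is one common way to do it, assuming the logistic Elo model and independent games; `elo_from_score` and `match_elo` are hypothetical helper names, and the 95% margin is a normal approximation rather than the method of any particular rating tool.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_elo(wins: int, losses: int, draws: int) -> tuple[float, float]:
    """Return (elo_diff, margin_95) for one side of a match.

    The margin uses the per-game score variance and a normal
    approximation, which assumes independent games.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / games
    half_width = 1.96 * math.sqrt(variance / games)
    elo = elo_from_score(score)
    margin = (elo_from_score(score + half_width)
              - elo_from_score(score - half_width)) / 2.0
    return elo, margin

# 3rd test: Incognito (Copy) scored +23/-11/=566 against the same engine.
elo, margin = match_elo(23, 11, 566)
print(f"{elo:+.1f} Elo, +/- {margin:.1f}")
```

Under these assumptions the margin for 600 games comes out at several Elo, which is the same order of size as the gap seen between two copies of the same engine.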
As final words,
here is what I have been trying to say to all of you over the past years:
In short, if I were a beginner in computer chess
and looked only at a small number of games, I would say something like
"Oh yes, SF POLY is stronger", or just the opposite, "Rems is stronger" ))
And I hope, again and again,
that all my data is useful, simply for the progress of computer chess!
Greetings
Re: Sedat Canbaz
UPDATE 2
Really sad... for example:
After checking the latest error-margin test more closely,
I found several games lost on time by SF-POLY 210924a.
Note also, regarding the former champion SF-Poly 220723:
I never noticed any game lost on time.
What does it mean? Why have some of the latest engines
become worse (less stable)?
Dual nets cannot be the reason, because some
SF-based engines (with dual nets) never lose on time.
So I wonder: any opinions on these issues?
By the way, the good news is that
so far Mr. Eduard's engines seem to be very stable. Great!
E.g. so far not a single game has been recorded as lost on time!
OK, that's all for now, except to note that all engines played with Move Overhead: 400.
Greetings
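For anyone who wants to check time losses in a game database themselves, here is a minimal sketch that counts them per engine. It assumes the PGN marks such games with a `[Termination "time forfeit"]` tag next to the usual `[White]`, `[Black]` and `[Result]` tags; that tag value is an assumption about the PGN output and should be checked against the actual files. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches one PGN tag pair such as [White "Engine A"].
TAG = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

def record_time_loss(tags: dict, losses: Counter) -> None:
    """Add one time loss for the losing side, if the game ended on time."""
    if tags.get("Termination") != "time forfeit":  # assumed tag value
        return
    if tags.get("Result") == "1-0":
        losses[tags.get("Black", "?")] += 1
    elif tags.get("Result") == "0-1":
        losses[tags.get("White", "?")] += 1

def time_forfeits(pgn_text: str) -> Counter:
    """Count games lost on time per engine in a PGN string."""
    losses = Counter()
    tags = {}
    for line in pgn_text.splitlines():
        match = TAG.match(line.strip())
        if match is None:
            continue  # move text or blank line
        name, value = match.groups()
        # An [Event] tag starts a new game: close the previous one first.
        if name == "Event" and tags:
            record_time_loss(tags, losses)
            tags = {}
        tags[name] = value
    record_time_loss(tags, losses)
    return losses

sample = '''[Event "test"]
[White "Engine A"]
[Black "Engine B"]
[Result "0-1"]
[Termination "time forfeit"]

1. e4 e5 0-1

[Event "test"]
[White "Engine B"]
[Black "Engine A"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2
'''
print(time_forfeits(sample))
```

Reading a real file would be `time_forfeits(open(path, encoding="utf-8").read())`, with `path` pointing at the downloaded PGN.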
Re: Sedat Canbaz
Meanwhile,
I decided to quote one of my old postings / rankings:
https://open-chess.org/viewtopic.php?f=4&p=34207#p34207
Who knows? Older stats may help here, since in those times
0 (zero) games were recorded as time losses (based on 27750 games).
Btw, if you need more data (as proof that all were stable), just let me know please.
Sedat Canbaz wrote: ↑Thu Mar 07, 2024 7:27 am
Code: Select all
Rank Name               Elo    +    -  games  score  oppo.  draws
   1 Brainlearn 27     3780    2    2   2800    51%   3776    95%
   2 CoolIris 11.80    3779    2    3   2700    50%   3776    96%
   3 RapTora 2.3       3779    3    3   2400    50%   3776    96%
   4 SF-PB 080124      3778    2    2   2800    50%   3776    96%
   5 Raid v3.4         3778    2    2   2800    50%   3777    95%
   6 Brainlearn 26.5   3778    3    3   2200    50%   3776    96%
   7 Polyfish 140124   3778    2    3   2700    50%   3776    97%
   8 CoolIris 11.90    3778    3    3   2100    50%   3777    96%
   9 SF-PB 051123      3778    3    3   2200    50%   3776    96%
  10 Patzer AI X256    3778    3    3   2100    50%   3776    96%
  11 Eman 9.90         3778    2    2   2800    50%   3777    96%
  12 DarkSisTer 8.50   3777    3    3   2000    50%   3776    97%
  13 SF POLY 261123    3777    2    2   2700    50%   3776    96%
  14 Killfish 231123   3776    3    3   2000    50%   3776    97%
  15 Incognito 5 Pro   3776    2    2   2700    50%   3776    95%
  16 Hazard 3.78       3775    3    3   2000    50%   3776    96%
  17 Tactical 281023   3775    3    3   2000    50%   3776    97%
  18 SF POLY 220723    3775    3    2   2700    50%   3777    95%
  19 SunLight 3        3773    3    3   2100    50%   3776    95%
  20 XTD 010723        3773    3    3   2100    50%   3776    96%
  21 SpecTral 5.50     3773    3    3   2100    49%   3776    95%
  22 AWOL Z11          3773    3    3   2100    49%   3776    95%
  23 ShashChess 34.6   3771    3    3   1900    49%   3778    94%
  24 Sawfish 2TC       3768    3    3   1500    49%   3776    94%
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, Balsa, 64 Hash, 4-MEN
Note: The test started at 30s+0.5s but later switched to 30s+0.6s.
In other words, most of the current games were played at TC: 30sec + 0.6sec
GAMES:
https://mega.nz/file/D5ginASb#r0qRGjiBY ... mnuNINhpmk
For these reasons, I wish to say again and again:
newer is not always better, and not everything is as it seems!
Best,
Sedat
Re: Sedat Canbaz
Hello Chess Friends,
As usual, I am very pleased to announce that
I managed to organize another new championship!
And what's new: each book gets 5600 games!
I think they deserve more, but
that is the best I can do, at least for now.
Some notes about the top book participants currently playing:
They are the winners of the SIZE tours: Small / Medium / Large / Giant.
I know it is not entirely fair, but anyhow, I think
it is not a bad idea to let them fight each other, right? ) If nothing
else, mainly for fun. I could add a lot more, but there is no
free time for everything, except to say that Messi's old one (by Mr. Angel) proves
once again to all of us to be the strongest under these conditions!
Of course I am very impressed by the other top books too, for example:
A super strong performance by SENTINEL 2409 despite its very
small size; plus, the DrawRatio it produced is the lowest, just 89%.
Geralt is the only public one, and it is small and old, so it is nothing
strange that it ranked in last place. But in the 7th tour (via Cfish):
Geralt is Geralt, and it managed to take 3rd place. Really good!
Another very important point:
I decided to run many separate tours, played by various engines!
With this testing method the influences are now much clearer, e.g. the
error margin, and that is not all: we can compare engine/book draw
records as well. For more notes I suggest reading 'More Details'.
XXXVI's GRAND Champion: Chucaro - Congrats to Angel Morano!!
My congratulations to the authors of all the other former champions as well!
For more details, full standings etc.:
https://sites.google.com/site/computers ... k-nn-cs-36
GAMES:
https://mega.nz/file/OppjxDTI#ZusW5Fi7K ... 8T3g-ho9io
That's all for now...thanks for your interest...
Best Regards,
Sedat Canbaz
Re: Sedat Canbaz
UPDATE
A new STAR is born, and it belongs among the brightest ones!
The name of this great star is SF-PB 220324 SC.
A super strong engine, and so far the least drawish
of all tested engines that are close to 3800+.
That really means a lot, especially for book tours!
One thing is missing in SF-PB: it cannot use CTG.
But no work is perfect, and we have to be satisfied.
Meanwhile, just to be clearer:
SF-PB 220324 SC = SF-PB via nn-b1a57edbea57.nnue
And here are the latest NN strength results:
Code: Select all
SF-PB 220324 SC vs SF-CTG 150724: 9 Elo difference
Here we need more games for accurate metrics.
1 SF-PB 220324 SC    +31/-16/=553  51.25%  307.5/600
2 SF-CTG 150724      +16/-31/=553  48.75%  292.5/600
DrawRatio is normal: 92%, since played via strong lines
-------------------------------------------------------
Default (nn-1ceb1ade0001.nnue) vs SC (nn-b1a57edbea57.nnue)
SF-PB 220324 SC vs SF-PB 220324 Def: 0 Elo difference
In short: just great, as we see they are identical in strength
1 SF-PB 220324 Def   +22/-21/=879  50.05%  461.5/922
2 SF-PB 220324 SC    +21/-22/=879  49.95%  460.5/922
DrawRatio is high: 95%, but here it seems nn-1ceb1ade0001
played a serious role in producing more draws, because
in SC's test against itself the draws were 92%.
And here is the mentioned SF-PB 220324 SC draw test:
Note: played against each other, of course as 2 separate SF-PB engines
1 SF-PB 220324 SC    +23/-20/=557  50.25%  301.5/600
2 SF-PB SC (Copy)    +20/-23/=557  49.75%  298.5/600
----------------------------------------------------
Last test: vs Brainlearn 28.1, which has the CTG feature!
Their Elo difference is almost the same. Not bad!
That means that, if needed, CTG books can be played under
fairer conditions, since strength matters a lot!
1 SF-PB 220324 SC    +19/-17/=756  50.13%  397.0/792
2 Brainlearn 28.1    +17/-19/=756  49.87%  395.0/792
Btw, here the draw ratio is high: 95%, but sometimes
not everything is in my hands. I will see what I can do
to get lower draw percentages. But
running SF-PB 220324 SC (for all books) was
already proved in the recent XXXVI CS to be less drawish than
all top engines that are close to 3800 Elo points!!
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, Balsa/Unique, 64 Hash, 4-MEN
GAMES:
https://mega.nz/file/OlB0HDxS#_PWsDKR9Z ... uXCyorOcdw
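The draw ratios quoted above follow directly from the +W/-L/=D field of each result line. A minimal sketch of that arithmetic, with a hypothetical helper name `draw_ratio` and assuming the line format used in these posts:

```python
import re

# Matches the +W/-L/=D field, e.g. "+31/-16/=553".
RESULT = re.compile(r"\+(\d+)/-(\d+)/=(\d+)")

def draw_ratio(line: str) -> float:
    """Return draws / games for a result line containing '+W/-L/=D'."""
    match = RESULT.search(line)
    if match is None:
        raise ValueError(f"no +W/-L/=D field in: {line!r}")
    wins, losses, draws = map(int, match.groups())
    return draws / (wins + losses + draws)

rows = [
    "1 SF-PB 220324 SC +31/-16/=553 51.25% 307.5/600",
    "1 SF-PB 220324 Def +22/-21/=879 50.05% 461.5/922",
    "1 SF-PB 220324 SC +19/-17/=756 50.13% 397.0/792",
]
for row in rows:
    print(f"{draw_ratio(row):.1%}  {row}")
```

Only one line per pairing is needed, because both sides of a match share the same number of draws.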
Meantime, I would be happy if the programmers also did their
best to produce fewer draws. Reminder:
I am just a simple tester/TD here, no more, no less! )
And as a last note,
I tested many more engines, but they were left out.
Reason: they are more drawish, and that is not all.
Some are not so stable, e.g. time forfeits (rarely, but still)
or crashes in Gauntlet mode. It seems they need some
optimization for fast, modern hardware, if nothing
else on 2x EPYC 7B12 (with 256 threads / 128 cores).
But the good news is that the currently tested top engines
are stable, at least so far! You know, it is not easy, e.g.
playing at bullet (30s+0.6s) with high concurrency (64)!
Thanks for reading and have a nice weekend )
Greetings
Re: Sedat Canbaz
UPDATE 2
Just one more test:
SF-PB 220324 SC vs Rems EXP 160824: +6 Elo (in favor of SC)
Code: Select all
1 SF-PB 220324 SC    +50/-34/=916  50.80%  508.0/1000
2 Rems EXP 160824    +34/-50/=916  49.20%  492.0/1000
On the other hand, I am slightly surprised here: normally the newer one
should be better (I mean, newer versions ought to be stronger).
Btw, as you may also see, this time
both top engines produced the lowest draw values: 91%. Great!!
Note also that
Rems EXP played without engine learning (like all other engines),
and the same conditions were used for all (such as 30s+0.6s etc.).
Be aware that all played games are included in the previous post.
Best,
Sedat
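To judge how much weight a result such as +50/-34 carries, a commonly used quick check is the likelihood of superiority (LOS), which looks only at decisive games. The sketch below uses the usual normal approximation; the function name is hypothetical and this is not a calculation taken from the posts above.

```python
import math

def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Estimate how likely it is that the first engine is the stronger one.

    Uses the common normal approximation; draws do not enter the formula.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    z = (wins - losses) / math.sqrt(2.0 * decisive)
    return 0.5 * (1.0 + math.erf(z))

# SF-PB 220324 SC vs Rems EXP 160824: +50/-34/=916
print(f"{likelihood_of_superiority(50, 34):.3f}")
```

A value near 0.5 means the match says little about which engine is stronger, while values close to 1.0 favor the first engine.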