If you follow online chess drama on X, you might have noticed that Vladimir Kramnik has been on a crusade against online cheating, especially on Chess.com. He has often based his analysis on the Chess.com accuracy scores.
The only problem is that we do not know precisely how the accuracy score is calculated. It is a Chess.com trade secret based on an algorithm called CAPS2. This article will try to uncover more information about it.
What is CAPS2?
The CAPS algorithm was first introduced in 2017 and stands for Computer Aggregated Precision Score. The announcement post explained that the score was calculated based on:
“how many top moves (moves that matched the engine's top choice or were equal in score to that choice)
how many inaccuracies (a move that changes the position's evaluation slightly in the negative direction)
how many blunders (a move that changes the position's evaluation greatly in the negative direction)
patterns of strength (our own algorithm that determines the sequencing of these scores per game timeline)”1
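The categories in the announcement can be illustrated with a small sketch. The thresholds below (in centipawns) are my own placeholders, not values published by Chess.com, and the names `classify_move` and `count_categories` are hypothetical. The "patterns of strength" part is not modeled, since nothing is known about it.

```python
from collections import Counter

# Hypothetical thresholds in centipawns; Chess.com has not published its values.
INACCURACY_CP = 50
BLUNDER_CP = 300


def classify_move(eval_before_cp: int, eval_after_cp: int) -> str:
    """Label a move by how much it worsened the evaluation for the mover.

    Both evaluations are taken from the point of view of the player who moved.
    """
    loss = max(0, eval_before_cp - eval_after_cp)
    if loss == 0:
        return "top move"
    if loss >= BLUNDER_CP:
        return "blunder"
    if loss >= INACCURACY_CP:
        return "inaccuracy"
    return "other"


def count_categories(evals: list[tuple[int, int]]) -> Counter:
    """Count how many moves fall into each category for one player."""
    return Counter(classify_move(before, after) for before, after in evals)
```

A first-version score would then be some function of these counts, which helps explain why a handful of blunders could drag a game to a very low number.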
The issue with the first version of the CAPS algorithm was that some players, especially those in the lower rating tiers, sometimes recorded very low scores. This could ultimately discourage them from playing chess, or, if they were lucky, get their game featured in a YouTube game review video like the one below!
* Video link
Introduction of CAPS2 and Its Impact
In 2021 Chess.com introduced CAPS2. In the launch video, Daniel Rensch explained that the new algorithm aimed to limit the swings of the pendulum, meaning that there should be fewer extreme scores.
Chess.com writes the following about CAPS2:
“Our Accuracy is a measurement of how closely you played to what the computer has determined to be the best possible play against your opponent's specific moves. The closer you are to 100, the closer you are to 'perfect' play, as determined by the engine.
Chess.com’s Accuracy score is now powered by “CAPS2”, an improved version of the original Chess.com “CAPS” (Computer Accuracy Precision Score) algorithm.”2
- Chess.com
Chess.com has provided a plot to show how the CAPS2 algorithm concentrates the game scoring compared to the first version. In effect, the algorithm was tweaked to produce scores that resonated better with the users.
It is still a bit strange that users don’t know how this number is calculated, and that Chess.com has fitted an algorithm where 20% of all games played now score around 80 (based on the plot), which feels like a decent and non-discouraging score for most. Are we just being nice to ourselves, or do we get a useful output?
Reply from Chess.com
To get more information about the algorithm, I reached out to Chess.com with some questions about CAPS2.
What was the main motivation for Chess.com to introduce CAPS and CAPS2 as a tool?
The idea behind Accuracy Score was to give chess players a clear sense of how well they played, no matter if the game ended as a win, loss, or draw. We think of it like getting an exam graded in school—each move you make in a game is checked and scored on its own. Then, we largely average these scores to give you an overall “grade” out of 100. This score helps players, especially newer ones, understand their game quality in a straightforward, quantifiable way. Accuracy Score doesn’t take into account how difficult the moves are to find, if you had a harder or easier opponent, or how much time you spend per move, etc. Some games are “easy” to score well on, and some are hard. It's completely separate from our Fair Play analysis, which focuses on detecting cheating and uses much more sophisticated methodologies for understanding players, games, and moves.
What variables does the CAPS2 algorithm use to determine the accuracy of a move and game?
Accuracy Score looks at every move you make and puts it into one of five categories: best, excellent, inaccuracy, mistake, or blunder, where each has its own score. Book moves are always scored as "best." We then mostly average everything to see how accurately you played overall. To make sure this score better reflects your game, we smooth out the edges by adjusting for things like mate-distance and reducing the penalty for multiple blunders. The idea is to assess each move equally and help players see where there’s room to improve.
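To make this description concrete, here is a minimal sketch of what "mostly averaging" category scores with a reduced penalty for multiple blunders could look like. The per-category scores and the relief factor are invented for illustration. Chess.com has not published them, and the mate-distance adjustment is left out entirely.

```python
# Hypothetical per-category scores; the real values are not public.
CATEGORY_SCORES = {
    "book": 100.0,  # the reply says book moves are scored as "best"
    "best": 100.0,
    "excellent": 90.0,
    "inaccuracy": 60.0,
    "mistake": 30.0,
    "blunder": 0.0,
}

# Assumed damping: every blunder after the first gets back this share of
# the gap to a perfect score. Purely illustrative.
REPEAT_BLUNDER_RELIEF = 0.5


def accuracy_score(categories: list[str]) -> float:
    """Average per-move scores, softening the penalty for repeated blunders."""
    if not categories:
        raise ValueError("need at least one move")
    total = 0.0
    blunders_seen = 0
    for category in categories:
        score = CATEGORY_SCORES[category]
        if category == "blunder":
            if blunders_seen > 0:
                score += (100.0 - score) * REPEAT_BLUNDER_RELIEF
            blunders_seen += 1
        total += score
    return total / len(categories)
```

With these made-up numbers, a game of one best move and one blunder scores 50, while two blunders in a row score 25 instead of 0. Note that nothing in the calculation depends on how hard the moves were to find.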
How is engine depth used in the CAPS2 algorithm?
We use a moderate engine depth to balance precision and efficiency. This allows us to provide reliable responses for the millions of games played every day. The depth we use varies based on player strength and settings.
In the article “How is Accuracy in Analysis Determined?” it is stated that “The new Accuracy scores, based on CAPS2, replicate the feeling of being graded on a test in school.” In more technical terms, how was this achieved?
As mentioned above, every move you make in a game is graded on a scale from 'best' to 'blunder.' To fine-tune these evaluations, we use mate-distance scoring and adjustment for multiple blunders to ensure that a few poor moves in a row doesn’t skew your overall score too heavily. The final Accuracy Score is calculated as an average of these grades, which means every move roughly influences your score equally. This approach is designed less for absolute precision and more for providing actionable insights that can help players at all levels improve their game and enjoy their chess journey.
Why is CAPS2 a good tool for measuring accuracy compared to centipawn loss?
Centipawn loss isn't very intuitive for most players, especially if you're just beginning your chess journey. That's one reason we developed the Accuracy Score, to transform the complex concept of game quality into a simple, clear score from 0 to 100. We feel that’s a lot easier to understand than maybe two average centipawn loss games where one is 71 and the other is 87. It's straightforward and offers relatable feedback on how well you played. For us this tool is all about making learning chess more accessible and enjoyable.
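For comparison, average centipawn loss is simple to compute once you have engine evaluations before and after each move. This is a minimal sketch, assuming the evaluations are given in centipawns from the mover's point of view. The function name and input format are my own.

```python
def average_centipawn_loss(evals: list[tuple[int, int]]) -> float:
    """Mean evaluation drop per move for one player.

    Each pair is (evaluation before the move, evaluation after the move),
    both in centipawns from the mover's point of view. Moves that do not
    lose evaluation count as a loss of zero.
    """
    if not evals:
        raise ValueError("need at least one move")
    losses = [max(0, before - after) for before, after in evals]
    return sum(losses) / len(losses)
```

Unlike an accuracy score, the result has no upper bound and lower is better, which is part of why it is less intuitive for newer players.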
In recent months we have seen a lot of debate in the chess world fueled by a faulty understanding of what the CAPS2 accuracy is. Is Chess.com considering opening up about how the accuracy is calculated to avoid the accuracy score being used for cheating accusations?
We’ve previously discussed this topic in places like State of Chess and are always open to providing more information for the chess community. Accuracy Score is all about helping players understand their performance in a single game. This is different from our Fair Play process, which examines patterns over many games to detect outside assistance. Accuracy Score is designed to be a helpful and straightforward tool for players to see how they played in a game. If there are questions that would further clarify the differences between Fair Play and Accuracy Score, we’re happy to provide answers.
Final Reflections
Even though I appreciated the willingness to answer the questions, not a lot of new information was provided about how the algorithm works in practice. Still, the statement that “we smooth out the edges by adjusting for things like mate-distance and reducing the penalty for multiple blunders” was a good explanation of how the algorithm is designed to work.
The algorithm's "black box" nature presents some downsides. Users are left in the dark about the precise mechanics behind their scores, which can lead to misconceptions even though the goal is to make it simple.
The lack of transparency may also fuel inaccurate interpretations and debates, particularly around cheating accusations such as those brought forward by Vladimir Kramnik. With accuracy scores visible, everyone can form theories about their opponent’s suspiciously good moves. As Chess.com mentions, this is not a cheat detection tool, since the algorithm does not measure the difficulty of the moves when scoring.
The smoothing out of the extreme scoring might also affect our perception of what is probable and what is not.
In conclusion, CAPS2 is an interesting tool that many chess players use, but few know how it works. In my opinion, Chess.com should consider increasing transparency to build greater trust and understanding among its users.
My guess is that they keep the details of the algorithm secret because the game review feature drives a lot of account upgrades: users want to know how their play scored in accuracy.
Even so, providing more insight into how scores are calculated could make it more obvious that accuracy is a poor cheat detection tool.
/Martin
The fact that they use different depths based on the player ratings makes it impossible to compare accuracy numbers. It's clear that it is a marketing tool (and a good one too, because players like it).
Makes me wonder how you would test for the difficulty of the move.