Behind the Numbers: Understanding Chess Engine Evaluations

2023-03-13

Chess engines have revolutionized the way we play and analyze chess games, so it is important for modern chess players to have a general understanding of how they work. In this blog post, we will discuss one of the key components of a chess engine: the evaluation function. We will cover the following points:

How to understand the engine's numerical score
How the evaluation function works
How Stockfish and Lc0 evaluate chess positions

Understanding Chess Engine Evaluation Numbers

The concept of evaluation numbers may seem daunting to chess beginners, but it's actually quite simple.

The evaluation function of a chess engine assigns a numerical score to each position, indicating which side has the advantage. The score is typically positive when White is better and negative when Black is better. The further the score is from zero, the greater the advantage.

For example, a score of +0.56 means that White has a slight advantage, while -4 means that Black is winning. A score of 0 indicates that the position is equal. Some GUI chess programs and websites also show evaluation symbols such as +− or +/= to make the score easier to read. These are the symbols you may encounter in chess engine analysis:

= Equal position
+/= Slight advantage for White
=/+ Slight advantage for Black
+/− Clear advantage for White
−/+ Clear advantage for Black
+− White has a winning advantage
−+ Black has a winning advantage

However, even if you don't have these symbols in your engine analysis, you can still easily understand the score by referring to our table below, which shows the ranges of numerical scores and their corresponding symbols. Keep in mind that even chess engines are not perfect, and their evaluation may sometimes be off by 0.1-0.2.

Range	Evaluation	Symbol
-0.26 to 0.26	Equality	=
0.27 to 0.70	Small advantage for White	+/=
0.71 to 1.50	Clear advantage for White	+/-
over 1.50	Decisive advantage for White	+-
-0.27 to -0.70	Small advantage for Black	=/+
-0.71 to -1.50	Clear advantage for Black	-/+
below -1.50	Decisive advantage for Black	-+
# or M plus a number	Forced checkmate in that many moves	M or #
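
For readers who like to see this as code, here is a minimal Python sketch of the table. The function name classify_score is our own invention for illustration, and treating a score of exactly 0.7 as a small advantage is an assumption; different GUIs may draw the boundaries slightly differently.

```python
def classify_score(score: float) -> str:
    """Map an evaluation in pawns (positive = White is better) to a symbol.

    Thresholds follow the table above; the exact boundaries are an assumption.
    """
    magnitude = abs(score)
    if magnitude <= 0.26:
        return "="
    if magnitude <= 0.70:
        return "+/=" if score > 0 else "=/+"
    if magnitude <= 1.50:
        return "+/-" if score > 0 else "-/+"
    return "+-" if score > 0 else "-+"


for value in (0.56, -4.0, 0.0, 0.75):
    print(f"{value:+.2f} -> {classify_score(value)}")
```

With these thresholds, +0.56 maps to +/= and -4 maps to -+, matching the examples given earlier.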

Important to know: In some theoretically drawn positions, the engines may still show an advantage for one side. This is especially common in fortresses, where one side has a significant material advantage but cannot convert it if the other side defends correctly. For this reason, some players integrate endgame tablebases (Syzygy, Nalimov) into their engines to get an accurate evaluation of endgame positions.

More About Endgame Tablebases

Endgame tablebases (EGTBs or EGTs) are basically databases of precalculated endgame positions. They enable chess engines like Stockfish and Leela Chess Zero to instantly determine:

Whether the position is winning, losing, or drawn with perfect play from both sides
How many moves it will take to checkmate the losing side if the position is not drawn
The best move in the position for both sides

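Chessify uses the 6-piece Syzygy tablebase, which covers endgame positions with six pieces or fewer. Strictly speaking, Syzygy files store the win/draw/loss result and the distance to the next capture or pawn move rather than the distance to checkmate, which is what Nalimov tablebases store. If you are analyzing endgame positions with few pieces left, we recommend that you check the Syzygy box as shown in the screenshot below.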

How Chess Engines Evaluate

The position evaluation of chess engines is not merely based on the material currently on the board. The engines also take into account the positional aspects and tactical possibilities. They consider a variety of factors, including:

Material: the number and value of pieces on the board
Mobility: the ability of pieces to move around the board
King safety: how vulnerable the kings are
Pawn structure: the configuration of pawns on the board
Control of space: which side controls more of the board
Piece coordination: how well the pieces work together

Material and Centipawns

Each piece is assigned a numerical value based on its relative strength. For example:

Pawn: 1 point
Knight or Bishop: 3 points
Rook: 5 points
Queen: 9 points

Hence, in basic terms, if a chess engine evaluates a position as +1 in favor of White, it means that White is considered to have an advantage of a full pawn. Conversely, if the evaluation is -0.5, it means that Black has a slight advantage of half a pawn.

However, chess engines need to evaluate positions with a high degree of precision, so they use centipawns to express very small differences in value accurately.

Centipawns are a unit of measurement in chess engine analysis that quantify the advantages or disadvantages of a position. A centipawn is equal to 1/100th of a pawn. So an evaluation like 0.35 means an advantage of 35 centipawns.
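
To make the arithmetic concrete, here is a small Python sketch that counts material in centipawns from the piece-placement field of a FEN string, using the point values listed above multiplied by 100. The names PIECE_VALUES_CP and material_balance_cp are hypothetical, and this counts material only; a real engine's score includes much more than that.

```python
# Point values from the list above, expressed in centipawns.
# The king is given 0 because it is never traded.
PIECE_VALUES_CP = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 900, "k": 0}


def material_balance_cp(fen: str) -> int:
    """Return White's material minus Black's material, in centipawns."""
    placement = fen.split()[0]  # first FEN field: piece placement
    balance = 0
    for ch in placement:
        value = PIECE_VALUES_CP.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' (rank separators)
        # Uppercase letters are White pieces, lowercase are Black pieces.
        balance += value if ch.isupper() else -value
    return balance


# Starting position with Black's d-pawn removed.
fen = "rnbqkbnr/ppp1pppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
cp = material_balance_cp(fen)
print(f"{cp} centipawns = {cp / 100:+.2f} pawns")
```

Here White is exactly one pawn ahead, so the balance is 100 centipawns, which an engine would display as +1.00 if material were the only factor.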

Mobility

The mobility of a piece is determined by how many squares it can move to from its current position. The more squares a piece can move to, the more mobile and usually powerful it is.
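
As a simple illustration, the Python sketch below counts how many squares a knight can jump to on an otherwise empty board. The function name knight_mobility is hypothetical, and the sketch ignores other pieces entirely, which a real engine would not do.

```python
# The eight (file, rank) offsets a knight can jump.
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_mobility(square: str) -> int:
    """Count the knight's target squares from e.g. 'e4' on an empty board."""
    file = ord(square[0]) - ord("a")  # 0..7 for files a..h
    rank = int(square[1]) - 1         # 0..7 for ranks 1..8
    return sum(
        0 <= file + df < 8 and 0 <= rank + dr < 8
        for df, dr in KNIGHT_JUMPS
    )


for sq in ("a1", "b1", "e4"):
    print(sq, knight_mobility(sq))
```

A knight in the corner on a1 has only 2 squares, while a knight on e4 has all 8, which is one way to see why centralized knights tend to be valued more highly.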

King Safety

The safety of the king is crucial in chess. The engine evaluates the safety of the king by looking at factors such as pawn cover, open files, and the presence of enemy pieces.

Pawn Structure

The structure of the pawns on the board is another important factor. A pawn chain can provide support for your pieces, while isolated pawns can be vulnerable.
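
One pawn-structure feature that is easy to compute is whether a pawn is isolated, meaning there is no friendly pawn on either neighboring file. The Python sketch below assumes the pawns of one side are given as a string of file letters; the function name isolated_files and this input format are our own choices for illustration.

```python
from typing import List


def isolated_files(pawn_files: str) -> List[str]:
    """Return the files holding isolated pawns.

    pawn_files lists the file letter of each pawn of one side, e.g. 'abdgh'.
    A pawn is isolated if no friendly pawn stands on an adjacent file.
    """
    occupied = set(pawn_files)
    isolated = []
    for f in sorted(occupied):
        left = chr(ord(f) - 1)
        right = chr(ord(f) + 1)
        if left not in occupied and right not in occupied:
            isolated.append(f)
    return isolated


print(isolated_files("abdgh"))
```

With pawns on the a, b, d, g, and h files, only the d-pawn has no neighbor on the c- or e-file, so it is the only isolated pawn.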

Control of Space

Controlling more of the board can provide an advantage in chess. The engine evaluates the control of space by looking at how many squares each side's pieces are controlling.

Piece Coordination

A well-coordinated set of pieces can provide an advantage in chess. The engine evaluates coordination by looking at how well the pieces are working together.

Different Evaluations For Different Chess Engines

Different chess engines can use different evaluation functions to assess a position. As we've explained in the previous section, these evaluation functions take into account various factors such as material balance, pawn structure, piece activity, king safety, and potential threats.

One way that chess engine evaluations can differ is in the way that they assign weights to these different factors. For example, one engine might place more emphasis on pawn structure, while another might prioritize piece activity.
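
The effect of different weights can be sketched in a few lines of Python. Everything here is hypothetical: the feature scores, the two sets of weights, and the idea of a simple weighted sum are a simplified illustration, not the actual evaluation of any particular engine.

```python
# Hypothetical feature scores for one position, in pawns, from White's view.
features = {
    "material": 0.0,
    "pawn_structure": 0.4,
    "piece_activity": -0.2,
    "king_safety": 0.1,
}

# Two made-up engines that weigh the same features differently.
weights_a = {"material": 1.0, "pawn_structure": 0.8,
             "piece_activity": 0.3, "king_safety": 0.5}
weights_b = {"material": 1.0, "pawn_structure": 0.3,
             "piece_activity": 0.9, "king_safety": 0.5}


def evaluate(features: dict, weights: dict) -> float:
    """Weighted sum of the feature scores."""
    return sum(weights[name] * value for name, value in features.items())


print("Engine A:", round(evaluate(features, weights_a), 2))
print("Engine B:", round(evaluate(features, weights_b), 2))
```

The engine that emphasizes pawn structure likes White's position, while the one that emphasizes piece activity sees it as roughly equal, even though both look at the same position.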

Evaluations can also differ because of the depth of analysis the engine has reached. Some engines might only consider a few moves ahead, while others will analyze dozens of moves deep. The depth of analysis can have a significant impact on the quality of the evaluation, as deeper analysis can uncover tactical possibilities that might otherwise be missed.
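
Here is a toy Python sketch of why depth matters. It runs a plain minimax search over a tiny hand-made tree of positions. The tree, its scores, and the function name minimax are all invented for illustration, and real engines use far more sophisticated search methods than this.

```python
def minimax(node: dict, depth: int, white_to_move: bool) -> float:
    """Depth-limited minimax; scores are in pawns from White's view."""
    children = node.get("children", [])
    if depth == 0 or not children:
        return node["static"]  # stop searching and trust the static score
    scores = [minimax(c, depth - 1, not white_to_move) for c in children]
    return max(scores) if white_to_move else min(scores)


# Move A grabs a pawn (+1.0 at first sight) but allows a strong reply.
move_a = {"static": 1.0, "children": [{"static": -2.0}]}
# Move B is quiet and stays roughly level whatever Black answers.
move_b = {"static": 0.2, "children": [{"static": 0.1}, {"static": 0.3}]}
root = {"static": 0.0, "children": [move_a, move_b]}

print("depth 1:", minimax(root, 1, True))
print("depth 2:", minimax(root, 2, True))
```

At depth 1 the search sees only the pawn grab and reports +1.0. One move deeper it finds Black's refutation, switches to the quiet move, and reports +0.1 instead.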

In addition to differences in evaluation functions and depth of analysis, chess engines can also differ in the way that they handle certain types of positions. For example, some engines might struggle with closed positions where there are few open lines for pieces to move along, while others might excel in these positions by using strategic maneuvers to gain control of key squares. So it is always a good idea to learn about the characteristics of the top chess engines such as Stockfish, Lc0, Komodo, etc.

Stockfish and Lc0 Example

Here's one example of the difference in evaluation between Stockfish and Lc0. The two engines work very differently. After analyzing for the same amount of time, they have reached different depths and node counts: Stockfish has reached a depth of 41 and analyzed around 8.7 billion nodes, while Lc0 has reached a depth of 19 and analyzed only 2.4 million nodes.

Despite the different depth and node values, the two engines suggest the same lines. However, the evaluation is where you see the difference.

Stockfish gives White a slight advantage (+/= or about 0.4) after both Bb5 and dxc5, while Lc0 sees a clearer advantage for White (+/- or 0.75) after Bb5.

So which one is more accurate?

Although Stockfish leads the score in its matches against Lc0, the latter is believed to have a more human touch in its position evaluation and creativity. Thus, in a real game between human players, the practical evaluation of the position could be closer to +0.7.

In conclusion, understanding how chess engines work and how they evaluate positions can greatly enhance your chess training and the effectiveness of analyzing your chess games. By learning about the evaluation function, the numerical scores, and the differences between chess engines, you can become a more informed and strategic player and elevate your game to the next level.