ru24.pro
News in English
June 2025

FIDE Chess Ratings Revisited – what improvements can still be made


Vlad Ghita is a Romanian chess player and journalist who likes to take deep dives into important chess topics. You may know him from his Chess Olympiad 2024 stories – Welcome to Budapest! Many reunions and Chess Olympiad 2024 revisited – 300 million stories. Today he takes a close look at the rating system we use in chess today.

Ratings: FIDE Top 100 / FIDE Top 100 women / FIDE Top 100 juniors / FIDE Top 100 girls

FIDE Ratings Revisited, by Vlad Ghita

In this article, I revisit FIDE’s recent rating changes and examine whether the Elo system still serves chess effectively. With data spanning March 2024 to June 2025, I show how global deflation, junior-driven volatility, and cross-federation mismatches expose systemic flaws. The system’s rigidity in a rapidly evolving chess landscape demands statistical modernization. Let’s unpack what the numbers reveal.

The intended audience comprises chess hobbyists, tournament players, and FIDE stakeholders. Some basic math and statistics will help, but there will be no deep dives into formulas, I promise. If you want to skip ahead to a particular section, here’s the outline:

  1. A System Under Scrutiny
  2. Background: Floors, Adjustments, and Band-Aids
  3. The Numbers Tell the Story: Deflation is Real
  4. Global Activity: More Games, More Data, But Not Better Ratings
  5. Why Elo Breaks Down
  6. Conclusion: A Future-Proof Rating System?

Follow the official substack of Vlad Ghita / Follow Vlad Ghita on Twitter

1. A System Under Scrutiny

Last year, while the changes were still nascent, I explored the effects of FIDE’s new rating policies and the broader implications of the adjustments within the standard Elo framework. That article, ‘FIDE Rating Changes: Are They Working So Far?’, raised eyebrows and garnered 5 full pages of comments to go along with nearly 20,000 views on Lichess!

Since then, the chess calendar has accelerated. New players are flooding in where federations invest in chess. Norm tournaments are increasing in places once considered off the radar, while European opens are becoming attractive for those who do not have the opportunity to face such a diverse array of high-rated players at home. All the while, FIDE’s rating system keeps churning out numbers rigidly and indifferently.

The FIDE rating system, built on the Elo formula, was revolutionary for its time. But in a chess world shaped by hyperactivity, global mobility, and asymmetrical tournament access, it’s starting to buckle. This article re-examines how well the Elo framework serves the modern chess ecosystem and where it fails.

Imagine two players, both rated 1800. One’s from Denmark, the other from Sri Lanka. On paper, they’re equals. In practice? One crushes the other 9 times out of 10. That’s not a bug in the system; it’s the system itself failing to adapt to today’s context.


2. Background: Floors, Adjustments, and Band-Aids

The Elo system was built on the assumption that only skilled players enter rated competition. That made sense in 1970 when FIDE adopted Elo. The first published list in 1971 had just ~600 players, led by Fischer. The rating floor? 2200.

Over time, that floor dropped:

  • 1993: reduced to 2000
  • Eventually: down to 1000 by the 2010s
  • 2024: reversed — raised again to 1400

Lowering the floor absorbed more players into the system, but also diluted the rating pool. Today’s ecosystem includes casuals, ambitious juniors, and professional aspirants in the same pool.

Timeline Snapshot

Rating Floors
2200 → 2000 → 1800 → 1600 → 1400 → 1200 → 1000 → 1400

K-Factor Evolution
Pre-2014:

  • K = 30 (or 25) for newcomers
  • K = 15 for <2400
  • K = 10 for ≥2400

Post-2014:

  • K = 40 for new players (or U18 <2300)
  • K = 20 for <2400
  • K = 10 for ≥2400
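As a quick sanity check, the post-2014 rules can be sketched as a small function. This is a simplification of the list above (the real FIDE regulations have extra conditions, e.g. K stays at 10 once a player’s published rating has ever reached 2400, which I model here with an explicit flag):

```python
def k_factor(rating: int, rated_games: int, age: int,
             ever_reached_2400: bool = False) -> int:
    """Post-2014 K-factor, simplified from the rules listed above."""
    if rated_games < 30 or (age < 18 and rating < 2300):
        return 40  # new players, or juniors still under 2300
    if ever_reached_2400 or rating >= 2400:
        return 10  # established elite players
    return 20      # everyone else below 2400
```

Note the practical consequence: a 15-year-old rated 2250 still updates at K = 40, twice as fast as the K = 20 adults they are paired against.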

Publication Frequency
Annual → Semiannual (1981) → Quarterly (2000) → Bi-monthly (2009) → Monthly (since 2012)

Rating Range Capping
The “400-point rule”, capping rating differences in game calculations, has toggled on and off:

  • “350-point rule” (pre-2011)
  • Abolished (2022), reinstated (2024)
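Mechanically, the rule caps the rating difference that is fed into the standard Elo expected-score formula. A minimal sketch (the function name is mine; the cap value follows the rule above):

```python
def expected_score(own: float, opp: float, cap: float = 400) -> float:
    """Standard Elo expected score, with the rating gap capped."""
    diff = max(-cap, min(cap, opp - own))  # the "400-point rule"
    return 1 / (1 + 10 ** (diff / 400))

# with the cap, a 600-point gap is scored the same as a 400-point gap
assert expected_score(2400, 1800) == expected_score(2400, 2000)
```

In other words, a much higher-rated player is never expected to score more than about 92% against anyone, no matter how large the true gap.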

Major 2024 Reforms

  • One-time inflation boost:
        Players <2000 received: 0.4 × (2000 − rating)
  • Floor raised from 1000 to 1400
  • New initial rating method:
        Performance-based + 2 fictitious draws vs 1800
  • 400-point rule restored for all games
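Two of these reforms are easy to express in code. Below is a sketch of the one-time boost and of the new initial-rating idea; for the latter I use the common linear performance approximation (800 × (score fraction − 0.5)) rather than FIDE’s exact dp lookup table, so treat it as an illustration only:

```python
def compression_boost(rating: float) -> float:
    """One-time 2024 adjustment: players under 2000 gained 0.4 * (2000 - rating)."""
    return rating + 0.4 * (2000 - rating) if rating < 2000 else rating

def initial_rating(opponent_ratings: list[float], points: float) -> float:
    """Initial rating sketch: real results plus two fictitious draws vs 1800."""
    opps = list(opponent_ratings) + [1800, 1800]  # two fictitious draws
    pts = points + 1.0                            # each fictitious draw adds 0.5
    avg = sum(opps) / len(opps)
    # linear performance approximation instead of FIDE's dp table
    return avg + 800 * (pts / len(opps) - 0.5)
```

For example, a player rated 1500 was boosted to 1500 + 0.4 × 500 = 1700, and the two fictitious draws pull every newcomer’s first rating toward 1800.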

The Rating Floor = Artificial Ceiling
Raising or lowering the rating floor doesn’t just affect beginners. It compresses the entire rating spectrum, squeezing out distinction and limiting upward mobility for ambitious players. While the rating floor changes managed to absorb more players into the system, they had a negative impact on those with higher aspirations for titles.

I found a gloomy reminder of that, and implicitly of my own title aspirations, last week while browsing the social media platform X. The user Gutsy Gambit posted the following screenshot, comparing active player distributions in June 2015 and June 2025 side-by-side. The comparison prompted me to explore deeper and write this follow-up.

Even as the number of active players has surged, ratings above 2000 Elo have steadily deflated – not from declining skill, but from systemic flaws. So, maybe instead of patching things up every few years, the time has come to rip off the band-aid?


3. The Numbers Tell the Story: Deflation is Real

Here’s what the rating distribution has done since March 2024. Each graph follows immediately below each bullet point.

  • A time-series of how the distribution of players has shifted since March 2024
    • Key trend: More and more players are clustering around the 1500 mark — a clear sign of rating pool compression
  • A detailed histogram snapshot taken at the time of this article (June 2025)
    • Notice the pile-up at 1400: this is an artifact of the floor rule, not actual player skill. We’ll revisit this later.
  • A plot of the number of Standard FIDE-rated players
    • ~3,500 new players are added per month, yet the average rating keeps declining.
  • A line plot of the average rating in the entire dataset across the interval from March 2024 to June 2025
    • Average rating is falling at ~1 Elo/month, despite growing participation.
  • A six-panel summary plot that explores the evolution of the distribution
    • Deflation in the fixed percentiles across the board (including at the elite, top 1% level)
    • Increased skewness and kurtosis (or more asymmetry and sharper peaks)
    • Rapid rise of sub-1600 players
    • A strange anomaly with player counts in July 2024
    • Closing of the gaps between fixed percentiles (narrower distribution)

The 1800–1999 rating band is slowly being replaced by the 1400–1599 band. This isn’t a performance decline, it’s structural compression. The deflation persists across all percentiles.

Over the past 20+ years, many of the measures FIDE implemented appear to have been stopgap solutions. In particular, among the top brass of FIDE there’s a widespread belief that Elo is the only acceptable system, because it lets players calculate their own requirements for scoring title norms. Although I don’t expect immediate action, I hope my points here are compelling enough to lead to some reflection and internal discussion. If I can be of any help in such discussions, I would gladly participate.

The Elo system flattens complexity into simplicity, but in doing so, it also flattens its own validity. It reacts too slowly for fast-improving players, and too weakly to structural asymmetries such as geographic and economic disparities. And while it still works well for elite stability, the bulk of the chess ecosystem suffers from its rigidity. Aspiring players face bigger roadblocks on their way to the top, and casual players suffer from a system that is biased against them.



4. Global Activity: More Games, More Data, But Not Better Ratings

The renewed popularity of chess can be attributed to three main factors:

  • Pandemic shutdowns and an increase in work-from-home
  • The Netflix show The Queen’s Gambit
  • More chess content on livestreaming platforms such as Twitch and YouTube

If we assume that 2021 was the first year OTB chess activity resumed in earnest, we can look at the number of games played since then, compare it to pre-pandemic levels, and forecast some future growth. I have chosen to discard 2020 entirely from the visualization, as it is a clear outlier.

Although the post-pandemic growth trend is starting to flatten out a bit, we are still on track to eclipse 3.5 million games by the end of 2025. The recovery has been nothing short of impressive, with 2024, the year of the FIDE centenary celebration, being the most active year in history.

In his Supplemental Report, statistician Jeff Sonas introduced a three-way segmentation based on age ranges. I will recast that visualization here, taken from page 8 of his report, showing the April 2023 rating distributions.

Source: Sonas, J. (2023). Supplemental Report, p. 8

I have independently verified that this segmentation is valid for game data post-compression, at least for the interval March-December 2024. Let’s show how:

The color coding is different here, but I hope the relationship is clear:

  • improvers gain rating consistently by playing more games, as we would expect
  • stable players see a smaller improvement with playing more games
  • decliners lose rating slightly, as expected

Maybe you don’t agree that this age segmentation is correct, and prefer to look at more granular age groups, separated into rating bins. Here’s that analysis, spanning March 2024 to June 2025:

For now, let’s say that I managed to convince even the most skeptical readers that the young players are the most dangerous to face, since they improve the fastest. Their K-factor is of course a big helper during their ascent, but even when accounting for the K-factor asymmetry, we see a big discrepancy that favors the U16 players. For those who already play rated OTB tournaments consistently, the pain of losing to an underrated junior is all too real. Beyond that, there’s an even scarier thought out there!


The chart above illustrates what I deem to be FIDE’s biggest challenge over the upcoming years. This is a striking implication!

Same Rating, Different Reality
A player rated 1800 in Sri Lanka and one rated 1800 in Denmark might share a number, but not a skill level. This is reflected in their URS ratings, but Elo is completely naive to it. This isn’t an isolated mismatch. It’s the norm when federations with deflated pools meet those with inflated ones and Elo has no way of knowing.

The static K-factor and single-rating assumption struggle in this dynamic environment, with players from various federations mixing in open events. Yet, the example above was merely a thought experiment. Out in the real world, this happens more frequently than before in large Swiss tournaments, where the mixing of federations offers juniors from underrated countries a huge incentive to participate and “farm” rating from their unsuspecting opponents. A typical example is the Sunway Sitges tournament in Spain, which often attracts a lot of youth participants from India. Here’s a screenshot of Sunway Sitges 2024:

Of the ~20 Indian players rated below 2000 listed above, only one – Adarsh D – underperformed relative to their starting rank and finished lower. This is consistent with the belief that amateur Indian players are often more underrated than their professional counterparts, and it has given rise to a new phenomenon: established European players often avoid such tournaments for fear of being paired against these extremely dangerous opponents. However, the discussion shouldn’t be limited to India alone. Federations from Central Asia, such as Kazakhstan and Uzbekistan, also have a wide array of extremely talented juniors, with Kazakhstan scoring particularly well in recent World Youth Championships.

David Smerdon, a well-known Grandmaster and Assistant Professor of Economics at the University of Queensland, reiterates this geographical disparity and puts a different highlight on it: “It’s not an age thing, it’s a not-enough-FIDE-tournaments thing. Poorer federations are more likely to have deflated ratings because submitting FIDE tournaments is costly. So, there’s a correlation between country GDP and ratings inflation, which some might find problematic.”

While I agree that there’s a correlation between country GDP and ratings inflation, I think a bigger factor at play is the number of active youth players in that respective federation. Now, I will illustrate that chess is a young person’s game:

Look at the last diagram, which superimposes the top two diagrams. Compared to their representation in the set of all players, teenagers have been more active in 2024, and they show no sign of stopping in 2025. This is quite relevant because it introduces a “K-factor asymmetry” into the pool. If asymmetric K-factors mix often (say, someone with K=40 facing someone with K=20), we expect the lower-rated youth players to add inflationary pressure to the system by extracting double the rating points from their more established opponents. And here’s the kicker: even with all the tweaks, deflation still hasn’t gone away!
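The point-flow effect of mixed K-factors can be illustrated with a toy calculation. The specific ratings (1600 vs 1800) and the bare two-player update below are my own assumptions, with no 400-point cap applied:

```python
def expected(own: float, opp: float) -> float:
    """Standard Elo expected score."""
    return 1 / (1 + 10 ** ((opp - own) / 400))

def game_deltas(r_a: float, k_a: int, r_b: float, k_b: int, score_a: float):
    """Rating changes for both players, plus the net points created in the pool."""
    e_a = expected(r_a, r_b)
    d_a = k_a * (score_a - e_a)
    d_b = k_b * ((1 - score_a) - (1 - e_a))
    return d_a, d_b, d_a + d_b

# a K=40 junior (1600) upsets a K=20 adult (1800): points are created
d_junior, d_adult, net_win = game_deltas(1600, 40, 1800, 20, 1.0)
# if the adult wins as expected, only a smaller amount is destroyed
_, _, net_loss = game_deltas(1600, 40, 1800, 20, 0.0)
```

With equal K-factors the two deltas cancel exactly. With K=40 vs K=20 in this example, each junior upset injects roughly 15 points into the pool, while an expected adult win removes only about 5, so the flow of points between the two groups is far from zero-sum.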


5. Why Elo Breaks Down

Elo’s Blind Spots
A system designed decades ago for top players competing in elite round-robin events assumes all players compete under equal conditions. It doesn’t factor in regional differences, economic disparities, or uneven tournament access. It hardly accounts for uneven matchups where players are separated by more than 400 points. The consequences? Widespread rating distortions and unfairness.

This article has shown both the symptoms (global deflation, geographical rating disparities, youth-driven volatility) and the underlying causes. At the heart of these problems lie three fundamental limitations of the Elo framework:

  • Elo assumes a logistic distribution
    → Only motivated players participate in rated tournaments, which would be more consistent with a log-normal distribution
    → Skill increases multiplicatively (each new concept learned builds on previous ones), not linearly
    → There’s a long “tail” of elite players
    • The graph below showcases key differences between the two distributions. It is based not on real data, but on simulated data sets that illustrate the shape difference.
  • Static K-factors
    → They’re blunt instruments. Fast-rising players get stuck. Declining ones linger too long. A better measure would be a context-dependent volatility parameter. Inactive for too long? There’s no certainty your rating is meaningful.
  • Geographic and economic blind spots
    → Elo doesn’t adjust for regional inflation or deflation, nor does it consider tournament access or federation disparities. A 1900 in Denmark and a 1900 in Sri Lanka? Night and day. Elo sees no difference.
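The shape difference between the two candidate distributions can also be reproduced numerically. A self-contained sketch follows; all parameters (means, scales, sample size) are arbitrary choices of mine to make the asymmetry visible, not values fitted to any rating data:

```python
import math
import random
import statistics

random.seed(7)
N = 50_000

def logistic_sample(mu: float = 1800.0, s: float = 170.0) -> float:
    """Inverse-CDF sampling from a logistic distribution."""
    u = random.random()
    return mu + s * math.log(u / (1 - u))

# simulated "rating" pools: one logistic, one shifted log-normal
logistic_pool = [logistic_sample() for _ in range(N)]
lognormal_pool = [1000 + 200 * random.lognormvariate(0, 0.5) for _ in range(N)]

def skewness(xs: list[float]) -> float:
    """Sample skewness: third standardized moment."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / sd) ** 3 for x in xs)

# the logistic pool is symmetric; the log-normal pool has a long right tail
```

The logistic sample has skewness near zero, while the log-normal one is strongly right-skewed, matching the “long tail of elite players” intuition above.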

6. Conclusion: A Future-Proof Rating System?

Chess has evolved since 1970. The rating system hasn’t.

We now compete in a world of open tournaments, global mobility, asymmetric federation structures, and thousands of improving juniors who can accumulate hundreds of classical games per year. Yet FIDE still relies on a system built for closed, round-robin events between national elites. That system was revolutionary in its time. Today, it’s showing its cracks.

FIDE’s recent changes (the one-time sub-2000 adjustment, the reintroduction of the 400-point rule, the rating floor increase to 1400) are all sincere attempts at relief. But they remain reactive and fundamentally tied to an aging core assumption: that Elo is good enough.

We don’t need to burn the whole system down. But it’s time we build something that fits today’s chess world, with some key ingredients:

  • Flexibility.
  • Responsiveness.
  • Contextual strength estimation.
  • Models that can keep up with how players actually improve.

It won’t be easy to replace Elo. It’s embedded in our title systems and our historical lists. But if we value the accuracy, fairness, and objectivity of the rating system, we owe it to ourselves to analyze things more deeply. How long can a modern game run on a vintage algorithm?

The chess world has changed. Today, our clocks are digital and our games are online. Our analysis runs deeper than ever before with Stockfish and Leela, leveraging powerful neural nets and machine learning algorithms. Yet, our ratings still lag behind. If we want fairness to keep pace with progress, the time has come to modernize instead of relying on the same formula as in 1970.

This report is brought to you by Vlad Ghita. Vlad is a chess player, coach, content creator, and chess promoter from Romania. Since 2020 he has been prominently involved in the chess world.