You want to rank players (or LLMs, or ping pong colleagues) — but you have no absolute measuring stick. All you have is who beat whom.
Arpad Elo proposed: if Player A has rating R_A and Player B has rating R_B, then A's expected probability of winning is:

E_A = 1 / (1 + 10^((R_B - R_A) / 400))
That's it. The entire system flows from this one formula.
Source: Wikipedia — Elo rating system; Elo, The Rating of Chessplayers (1978), Ch. 2.
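If it helps to see the formula as code, here is a minimal Python sketch of the standard Elo expected-score curve, E_A = 1 / (1 + 10^((R_B - R_A) / 400)). The function name `expected_score` and the example ratings are made up for illustration; they are not from any particular library.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Illustrative ratings (assumed): B is rated 400 points above A.
e_a = expected_score(1500, 1900)
e_b = expected_score(1900, 1500)

# A 400-point gap corresponds to 10:1 odds, and the two expectations sum to 1.
print(f"{e_a:.3f} {e_b:.3f}")  # prints "0.091 0.909"
```

Note that only the difference between the two ratings enters the formula, so adding the same constant to every rating changes nothing.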
In a ping pong ladder or an LLM arena, you never measure "absolute skill." You only observe: A beat B. Elo converts these pairwise outcomes into a single number per player that is consistent: if A usually beats B, and B usually beats C, then A's rating will tend to end up higher than C's, even if A and C never play.
Player X has rating 1600. Player Y has rating 1400. What's the rating difference from X's perspective?
Two players have identical ratings. What's the expected win probability for each?
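Once you have tried both questions by hand, you could check your answers with a short Python sketch like the one below. It assumes the standard formula E_A = 1 / (1 + 10^((R_B - R_A) / 400)) and reads "from X's perspective" as R_X - R_Y; both the helper name and that sign convention are assumptions for this example. The shared rating of 1500 is an arbitrary placeholder.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Question 1: rating difference, taken here as R_X - R_Y (assumed convention).
rating_x, rating_y = 1600, 1400
diff_from_x = rating_x - rating_y

# Question 2: two players with identical ratings (1500 is an arbitrary choice).
same_a = expected_score(1500, 1500)
same_b = expected_score(1500, 1500)

print("difference from X's perspective:", diff_from_x)
print("expected scores at equal ratings:", same_a, same_b)
```

Trying a different shared rating in the second question gives the same result, because only the rating difference matters.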
For the full picture, read the Wikipedia article on Elo — particularly the "Mathematical details" section. It's clear, well-sourced, and covers the formula derivation in more depth.
Next: The Update Rule and K-Factor →
Questions? Ask your Copilot agent — I'm your teacher here and happy to clarify anything.