A modified Glicko-2 rating system designed for recreational badminton with self-reported matches.
Every player has a rating that represents their skill level:
- Ratings range from 2.0 (beginner) to 9.0 (professional)
- A typical recreational player might be around 4.0 - 5.0
- Higher rating = better player
When you play a match, both players' ratings update based on who won and the score.
Our rating system is based on Glicko-2, a well-known algorithm used in chess and many online games. Here's how it works in simple terms:
Before a match, the system calculates who is expected to win based on ratings:
- If you're rated 5.0 and your opponent is rated 4.0, you're expected to win
- If you beat a weaker opponent, your rating goes up a little
- If you beat a stronger opponent (upset!), your rating goes up a lot
Unlike systems that only count wins and losses, this one also looks at the score:
- Winning 21-5 is worth more than winning 21-19
- The score margin shows how dominant the win was
Each player also has an RD (Rating Deviation) that tracks how confident we are:
- New players have high RD → ratings change quickly
- Active players have low RD → ratings are more stable
- If you stop playing, RD gradually increases again
To keep ratings accurate:
- Playing many different opponents helps your rating converge faster
- Playing the same person repeatedly has less impact
- This prevents gaming the system
| Factor | Effect |
|---|---|
| Beat a stronger player | Rating goes up a lot |
| Beat a weaker player | Rating goes up a little |
| Lose to a stronger player | Rating goes down a little |
| Lose to a weaker player | Rating goes down a lot |
| Win by large margin | Extra rating boost |
| Play new opponents | Rating becomes more accurate |
When you join, you start with high uncertainty (RD). This means:
- Your rating changes quickly at first
- After about 40 matches, your rating becomes stable and accurate
The more confident we are about your starting skill estimate, the lower your initial RD.
The algorithm is implemented in TypeScript. Copy the src/ folder into your project or import it directly:
```typescript
import { updateRatingsForMatch, RatingConfig, MatchInput } from './rating';

const config: RatingConfig = {
  tau: 0.5,                // Volatility change rate
  scaleFactor: 400,        // Rating sensitivity
  repetitionAlpha: 0.2,    // Repeat opponent penalty
  diversityThreshold: 0.2, // Diversity requirement
  matchWinBonus: 0.4,      // Win/loss bonus
  marginWeight: 0.25,      // Score margin weight
  ratingCenter: 4500,      // Center of rating scale
};

const match: MatchInput = {
  playerA: { rating: 5000, rd: 150, volatility: 0.06 },
  playerB: { rating: 4500, rd: 150, volatility: 0.06 },
  games: [{ scoreA: 21, scoreB: 15 }],
  recentMatchesBetween: 0,    // How many times they played recently
  playerAUniqueOpponents: 10, // Player A's unique opponents
  playerATotalMatches: 20,    // Player A's total matches
  playerBUniqueOpponents: 8,  // Player B's unique opponents
  playerBTotalMatches: 15,    // Player B's total matches
};

const result = updateRatingsForMatch(match, config);
console.log(result.playerA.rating); // Winner's new rating (higher)
console.log(result.playerB.rating); // Loser's new rating (lower)
```

Input - Player state:
- `rating`: Current rating (2000-9000 internally, displayed as 2.0-9.0)
- `rd`: Rating deviation (uncertainty, starts around 150-250)
- `volatility`: Performance consistency (around 0.06)
Input - Match data:
- `games`: Array of `{ scoreA, scoreB }` for each game
- `recentMatchesBetween`: Recent matches between these two players
- `playerXUniqueOpponents`: Unique opponents in lookback period
- `playerXTotalMatches`: Total matches in lookback period
Output:
- New `rating`, `rd`, `volatility` for both players
| Parameter | What It Does | Range |
|---|---|---|
| `tau` | How fast ratings react to inconsistent performance | 0.3-0.8 |
| `scaleFactor` | Rating difference impact on expected win. Lower = upsets matter more | 100-800 |
| `repetitionAlpha` | Penalty for playing same opponent repeatedly | 0-1 |
| `diversityThreshold` | Minimum opponent variety needed for RD to decrease | 0-0.5 |
| `matchWinBonus` | Base bonus for winning a match | 0.3-0.5 |
| `marginWeight` | How much score margin affects rating change | 0.15-0.45 |
| `ratingCenter` | Center of rating scale for Glicko-2 conversion | 4500 |
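The ranges in the parameter table can be checked before a config is used. The sketch below is illustrative only: `findOutOfRange` and `RECOMMENDED_RANGES` are hypothetical names, not part of the library, and the local `RatingConfig` interface is assumed to match the fields shown in the usage example. `ratingCenter` is left out of the check because the table lists a single value for it rather than a range.

```typescript
// Local copy of the config shape from the usage example (assumed).
interface RatingConfig {
  tau: number;
  scaleFactor: number;
  repetitionAlpha: number;
  diversityThreshold: number;
  matchWinBonus: number;
  marginWeight: number;
  ratingCenter: number;
}

type RangedKey = Exclude<keyof RatingConfig, 'ratingCenter'>;

// Ranges copied from the parameter table; hypothetical helper data.
const RECOMMENDED_RANGES: Record<RangedKey, [number, number]> = {
  tau: [0.3, 0.8],
  scaleFactor: [100, 800],
  repetitionAlpha: [0, 1],
  diversityThreshold: [0, 0.5],
  matchWinBonus: [0.3, 0.5],
  marginWeight: [0.15, 0.45],
};

// Returns one message per parameter that falls outside its listed range.
function findOutOfRange(config: RatingConfig): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(RECOMMENDED_RANGES) as RangedKey[]) {
    const [min, max] = RECOMMENDED_RANGES[key];
    const value = config[key];
    if (value < min || value > max) {
      problems.push(`${key}=${value} is outside ${min}-${max}`);
    }
  }
  return problems;
}
```

An empty array means every checked parameter sits inside its listed range.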
Glicko-2 was developed by Mark Glickman as an improvement to the Elo rating system. Key concepts:
- Rating (μ): Your estimated skill level
- Rating Deviation (RD/φ): Uncertainty in your rating
- Volatility (σ): How consistent your performance is
The algorithm updates all three values after each match, making it self-correcting over time.
Our modifications for badminton:
- Added score margin to reward dominant wins
- Added opponent diversity requirements to prevent gaming
- Added repeat opponent penalty to encourage varied matchups
Internal ratings (2000-9000) are converted to Glicko-2 scale for calculations:
μ = (rating - ratingCenter) / 173.7178
φ = RD / 173.7178
- ratingCenter: Center of rating scale (default 4500, displays as 4.5). A player at ratingCenter has μ=0.
- 173.7178: Standard Glicko-2 scaling constant.
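As a minimal sketch, the two conversions above could be written as follows. The names `toMu`, `toPhi`, and `GLICKO2_SCALE` are illustrative and may not match the actual source.

```typescript
// Standard Glicko-2 scaling constant, as given above.
const GLICKO2_SCALE = 173.7178;

// Convert an internal rating (2000-9000) to the Glicko-2 μ scale.
function toMu(rating: number, ratingCenter: number = 4500): number {
  return (rating - ratingCenter) / GLICKO2_SCALE;
}

// Convert a rating deviation (RD) to the Glicko-2 φ scale.
function toPhi(rd: number): number {
  return rd / GLICKO2_SCALE;
}

const mu = toMu(5000);  // about 2.88: 500 points above the center
const phi = toPhi(150); // about 0.86
```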
The probability that player A beats player B:
E(μ, μ_opp, φ_opp) = 1 / (1 + 10^(-g(φ_opp) × (μ - μ_opp) × 173.7178 / scaleFactor))
where g(φ) = 1 / √(1 + 3φ²/π²)
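The expected score can be sketched directly from the formula above. Function names here are illustrative. Note that with `scaleFactor = 400` the exponent reduces to the standard Glicko-2 form, because 173.7178 equals 400 / ln(10).

```typescript
const GLICKO2_SCALE = 173.7178;

// g(φ): discounts the rating gap when the opponent's rating is uncertain.
function g(phi: number): number {
  return 1 / Math.sqrt(1 + (3 * phi * phi) / (Math.PI * Math.PI));
}

// Probability that the player (μ) beats the opponent (μ_opp, φ_opp).
function expectedScore(
  mu: number,
  muOpp: number,
  phiOpp: number,
  scaleFactor: number = 400,
): number {
  const exponent = (-g(phiOpp) * (mu - muOpp) * GLICKO2_SCALE) / scaleFactor;
  return 1 / (1 + Math.pow(10, exponent));
}

// A 5000-rated player against a 4500-rated opponent with RD 150.
const muA = (5000 - 4500) / GLICKO2_SCALE;
const muB = 0;
const phiB = 150 / GLICKO2_SCALE;
console.log(expectedScore(muA, muB, phiB)); // roughly 0.93
```

A lower `scaleFactor` makes the same rating gap produce a more lopsided expectation, so an upset then moves ratings further.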
Standard Glicko-2 uses 1 for win, 0 for loss. We use score margin:
marginRatio = clamp(avgMargin / 21, -0.25, 0.25)
If winner:
outcome = 0.5 + matchWinBonus + marginWeight × marginRatio
If loser:
outcome = 1 - winner's outcome
Example (default config): a 21-15 win gives 6 / 21 ≈ 0.29, clamped to 0.25, so outcome = 0.5 + 0.4 + 0.25 × 0.25 ≈ 0.96
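The outcome rule could be sketched as below. Two points are assumptions, since the text does not spell them out: the match winner is taken to be the player who won more games, and `avgMargin` is the winner's average point difference per game (which can be negative in a three-game match, hence the lower clamp). `matchOutcomes` is a hypothetical name.

```typescript
interface GameScore {
  scoreA: number;
  scoreB: number;
}

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(Math.max(x, lo), hi);
}

// Returns the outcome value (replacing the usual 1/0) for each player.
function matchOutcomes(
  games: GameScore[],
  matchWinBonus: number = 0.4,
  marginWeight: number = 0.25,
): { outcomeA: number; outcomeB: number } {
  if (games.length === 0) {
    throw new Error("at least one game is required");
  }
  // Assumption: the winner is whoever won more than half of the games.
  const gamesWonByA = games.filter((gm) => gm.scoreA > gm.scoreB).length;
  const aWon = gamesWonByA * 2 > games.length;
  const sign = aWon ? 1 : -1;
  // Average margin per game, measured from the winner's side.
  const totalMargin = games.reduce(
    (sum, gm) => sum + sign * (gm.scoreA - gm.scoreB),
    0,
  );
  const marginRatio = clamp(totalMargin / games.length / 21, -0.25, 0.25);
  const winnerOutcome = 0.5 + matchWinBonus + marginWeight * marginRatio;
  return aWon
    ? { outcomeA: winnerOutcome, outcomeB: 1 - winnerOutcome }
    : { outcomeA: 1 - winnerOutcome, outcomeB: winnerOutcome };
}

console.log(matchOutcomes([{ scoreA: 21, scoreB: 15 }]).outcomeA); // about 0.96
```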
The repetition weight reduces a match's impact when the same two players meet repeatedly:
weight = 1 / (1 + repetitionAlpha × recentMatchesBetween)
Example with α=0.2: 1st match = 1.0, 2nd = 0.83, 3rd = 0.71, 5th = 0.56
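A short sketch of the weight formula, reproducing the example values above. `repetitionWeight` is an illustrative name.

```typescript
// Weight applied to a match, given how many recent matches the same
// two players have already played against each other.
function repetitionWeight(
  recentMatchesBetween: number,
  repetitionAlpha: number = 0.2,
): number {
  return 1 / (1 + repetitionAlpha * recentMatchesBetween);
}

// The Nth match between two players has N - 1 recent matches before it.
for (const nth of [1, 2, 3, 5]) {
  console.log(nth, repetitionWeight(nth - 1).toFixed(2));
}
```

Setting `repetitionAlpha` to 0 disables the penalty, since every match then gets weight 1.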
The diversity ratio measures opponent variety:
diversityRatio = uniqueOpponents / totalMatches
The estimated variance v measures the information gained from the match, and Δ is the estimated rating improvement:
v = 1 / Σ(weight × g(φ_opp)² × E × (1 - E))
Δ = v × Σ(weight × g(φ_opp) × (outcome - E))
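The two sums above could be computed as in this sketch. The `WeightedResult` shape and the function name are assumptions made for illustration. Each entry carries the repetition weight, g(φ_opp), the expected score E, and the margin-based outcome.

```typescript
interface WeightedResult {
  weight: number;   // repetition weight for this match
  gOpp: number;     // g(φ_opp)
  expected: number; // E
  outcome: number;  // margin-based outcome in place of 1/0
}

// Computes v (estimated variance) and Δ (estimated improvement).
function varianceAndDelta(results: WeightedResult[]): { v: number; delta: number } {
  let information = 0; // Σ(weight × g² × E × (1 - E))
  let surprise = 0;    // Σ(weight × g × (outcome - E))
  for (const r of results) {
    information += r.weight * r.gOpp * r.gOpp * r.expected * (1 - r.expected);
    surprise += r.weight * r.gOpp * (r.outcome - r.expected);
  }
  const v = 1 / information;
  return { v, delta: v * surprise };
}
```

A lower weight shrinks the information term, so v grows and the match moves RD less.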
The new volatility σ' is found by solving f(x) = 0 with the Illinois algorithm:
f(x) = (e^x × (Δ² - φ² - v - e^x)) / (2(φ² + v + e^x)²) - (x - ln(σ²)) / τ²
σ' = e^(A/2) where f(A) ≈ 0
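The root search can be sketched by following the iteration in Glickman's published Glicko-2 description. The bracketing strategy and the `epsilon` tolerance below come from that description and are an assumption about this implementation. `updateVolatility` is an illustrative name.

```typescript
// Solves f(x) = 0 with the Illinois variant of regula falsi and
// returns the new volatility σ' = e^(A/2).
function updateVolatility(
  phi: number,
  sigma: number,
  v: number,
  delta: number,
  tau: number = 0.5,
  epsilon: number = 1e-6,
): number {
  const a = Math.log(sigma * sigma);
  const f = (x: number): number => {
    const ex = Math.exp(x);
    const d = phi * phi + v + ex;
    return (
      (ex * (delta * delta - phi * phi - v - ex)) / (2 * d * d) -
      (x - a) / (tau * tau)
    );
  };

  // Bracket the root between A and B.
  let A = a;
  let B: number;
  if (delta * delta > phi * phi + v) {
    B = Math.log(delta * delta - phi * phi - v);
  } else {
    let k = 1;
    while (f(a - k * tau) < 0) {
      k++;
    }
    B = a - k * tau;
  }

  let fA = f(A);
  let fB = f(B);
  while (Math.abs(B - A) > epsilon) {
    const C = A + ((A - B) * fA) / (fB - fA);
    const fC = f(C);
    if (fC * fB <= 0) {
      A = B;
      fA = fB;
    } else {
      fA = fA / 2; // Illinois step: halve the retained endpoint's value
    }
    B = C;
    fB = fC;
  }
  return Math.exp(A / 2);
}
```

A smaller `tau` keeps σ' closer to the previous volatility, which matches the "how fast ratings react to inconsistent performance" description in the parameter table.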
The RD update is gated on opponent diversity:
φ* = √(φ² + σ'²)
If diversityRatio ≥ diversityThreshold:
φ' = 1 / √(1/φ*² + 1/v) // RD decreases (more confident)
Else:
φ' = max(φ, φ*) // RD stays same or increases
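A sketch of the gated RD update, combining the diversity ratio with the two branches above. Treating a player with zero recorded matches as having a diversity ratio of 0 is an assumption to avoid dividing by zero. `updatePhi` is an illustrative name.

```typescript
// Returns φ', the new rating deviation on the Glicko-2 scale.
function updatePhi(
  phi: number,
  sigmaPrime: number,
  v: number,
  uniqueOpponents: number,
  totalMatches: number,
  diversityThreshold: number = 0.2,
): number {
  // Assumption: no match history counts as no diversity.
  const diversityRatio = totalMatches > 0 ? uniqueOpponents / totalMatches : 0;
  const phiStar = Math.sqrt(phi * phi + sigmaPrime * sigmaPrime);

  if (diversityRatio >= diversityThreshold) {
    // Enough variety: RD decreases (more confident).
    return 1 / Math.sqrt(1 / (phiStar * phiStar) + 1 / v);
  }
  // Too little variety: RD stays the same or increases.
  return Math.max(phi, phiStar);
}
```

With the example match input from the usage section, player A has 10 unique opponents in 20 matches, a ratio of 0.5, so the first branch applies.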
The rating is then updated using the new deviation φ':
μ' = μ + φ'² × Σ(weight × g(φ_opp) × (outcome - E))
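The rating step is a direct translation of the formula. The `WeightedResult` shape and the function name in this sketch are illustrative.

```typescript
interface WeightedResult {
  weight: number;   // repetition weight for this match
  gOpp: number;     // g(φ_opp)
  expected: number; // E
  outcome: number;  // margin-based outcome
}

// Returns μ', the new rating on the Glicko-2 scale.
function updateMu(mu: number, phiPrime: number, results: WeightedResult[]): number {
  const surprise = results.reduce(
    (sum, r) => sum + r.weight * r.gOpp * (r.outcome - r.expected),
    0,
  );
  return mu + phiPrime * phiPrime * surprise;
}
```

Because φ'² multiplies the sum, a player with a high RD moves further for the same result than a player whose rating has already settled.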
Convert back to the internal rating scale and apply bounds:
rating' = clamp(μ' × 173.7178 + ratingCenter, 2000, 9000)
RD' = min(φ' × 173.7178, 500)
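A sketch of the final conversion. It assumes the offset added back is `ratingCenter`, mirroring the forward conversion, so that a player with μ = 0 lands exactly on the center. `toInternalScale` is an illustrative name.

```typescript
const GLICKO2_SCALE = 173.7178;

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(Math.max(x, lo), hi);
}

// Converts μ' and φ' back to the internal rating and RD, with bounds.
function toInternalScale(
  muPrime: number,
  phiPrime: number,
  ratingCenter: number = 4500,
): { rating: number; rd: number } {
  return {
    rating: clamp(muPrime * GLICKO2_SCALE + ratingCenter, 2000, 9000),
    rd: Math.min(phiPrime * GLICKO2_SCALE, 500),
  };
}

console.log(toInternalScale(0, 1)); // rating 4500, rd 173.7178
```

Dividing the internal rating by 1000 gives the displayed value, so 4500 is shown as 4.5.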