
Design a Gaming Leaderboard Service

Role: Software Engineer


Imagine you are building a backend service for game studios. Games run on players' computers, and after a game session ends the client wants to show two things right away:

  • the player's personal best for that game

  • the game's top 100 leaderboard

We only need to persist a submitted score if it sets a new personal best for that player or breaks into the game's top 100. That small requirement changes the whole traffic pattern: a newly launched game can generate a burst of writes, but mature games become much quieter because the leaderboard cutoff rises and most players no longer qualify.

This walkthrough follows the Interview Framework and focuses on what you would actually present in a 45-60 minute interview.

Prioritize availability over consistency. If a cache replica is slightly stale or one region is degraded, it is better to show a slightly old leaderboard and let clients retry score submission than to take the entire feature offline.

The key interview insight is that this is not a "store every score forever" system. Because only personal-best improvements and top-100 candidates matter, both the client and the server can filter most writes before they hit durable storage.

If the interviewer says top 10 instead of top 100, treat K as configurable. The architecture is the same; only the cutoff cache and leaderboard size change.

Assume anti-cheat and score verification are handled upstream by trusted game servers or a separate validation service. The focus here is leaderboard storage, latency, and scaling.

Phase 1: Requirements (~5 minutes)

Start by clarifying scope. The interviewer hint here is usually "traffic is small at first" or "this is a startup." That is a signal to begin with the simplest working system, then evolve it.

Functional Requirements

  • Players should be able to fetch the current top 100 leaderboard for a game

  • Players should be able to fetch their personal best for a game immediately after a session ends

  • Clients should be able to submit a score candidate and learn whether it updated personal best, leaderboard, both, or neither

  • The system should be able to safely handle retries and offline buffering so players can resubmit when the service is back

  • The system should be able to support many games, including bursts when a new title launches

Do not over-scope the first pass. Matchmaking, achievements, friends, anti-cheat, and full score history are all separate systems unless the interviewer explicitly asks for them.

Non-Functional Requirements

| Requirement | Target | Rationale |
| --- | --- | --- |
| Leaderboard read latency | P95 under 50 ms from cache | Game UI should feel instant |
| Personal-best read latency | P95 under 50 ms from cache | Usually shown right after the match ends |
| Score submission ack | P95 under 200 ms | Client should know quickly whether submission was accepted or queued |
| Availability | 99.95% | Prefer degraded service over hard failure |
| Consistency | Eventual consistency is acceptable | A few hundred milliseconds of staleness is fine |
| Durability for accepted scores | No lost accepted submissions | Once accepted, a retryable candidate must survive failures |

Ask clarifying questions early:

  • "Do we need to store every raw score attempt, or only personal bests and top-K candidates?" Assume only the latter in the hot path.

  • "Does the client need strict read-after-write consistency after submission?" Assume near-realtime is enough, but call out how to tighten this if needed.

  • "Should the client submit every score?" No. The intended optimization is local filtering: only submit if the score beats the local personal best or current leaderboard cutoff.

  • "How do we break ties on the leaderboard?" Use a deterministic rule such as (score DESC, achieved_at ASC, submission_id ASC).

  • "What matters more: consistency or availability?" Availability wins.

Capacity Estimation

Use the traffic shape from the prompt, not generic social-media scale:

text
Assumptions:
- 10M daily active players across all games
- 3 completed game sessions per player per day
- 30M session ends/day = ~350 session ends/sec average
- Peak factor during evenings / launches: 15x => ~5,000 session ends/sec

Steady-state mature games:
- Only ~2% of scores beat local PB or top-100 cutoff
- Candidate write peak ~= 100 submissions/sec

Launch-day hot game:
- The leaderboard is not full yet, so nearly every finished session may submit
- Worst-case write peak for one new game can approach the full 5,000 submissions/sec

Reads:
- Leaderboard and PB are read after many sessions and from game menus
- Design for ~50,000 cache reads/sec peak across games

Hot data size is small enough to justify aggressive caching:

text
Leaderboard cache:
- 100K active games
- 100 entries/game
- ~64 bytes/entry raw payload
- ~640 MB raw before overhead

Personal-best cache:
- 200M active (user_id, game_id) pairs
- ~32 bytes per record raw
- ~6.4 GB raw before overhead

The most important capacity observation is not the absolute QPS. It is that write traffic decays over time for mature games, while launches create bursty "all scores matter" behavior. That is why a queue plus cache-first reads is a strong answer.
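The arithmetic above can be sanity-checked in a few lines. All constants are the assumed inputs from the estimate, not measured values:

```python
# Assumed inputs from the capacity estimate above.
DAILY_PLAYERS = 10_000_000
SESSIONS_PER_PLAYER = 3
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 15
QUALIFY_RATE = 0.02        # mature-game share of scores beating PB or cutoff

session_ends_per_day = DAILY_PLAYERS * SESSIONS_PER_PLAYER            # 30M
avg_per_sec = session_ends_per_day / SECONDS_PER_DAY                  # ~347
peak_per_sec = avg_per_sec * PEAK_FACTOR                              # ~5,200
mature_writes_per_sec = peak_per_sec * QUALIFY_RATE                   # ~104

# Hot-data sizes before cache overhead.
leaderboard_bytes = 100_000 * 100 * 64   # 100K games x 100 entries x 64 B
pb_bytes = 200_000_000 * 32              # 200M (user, game) pairs x 32 B

print(round(avg_per_sec), round(peak_per_sec), round(mature_writes_per_sec))
print(leaderboard_bytes / 1e6, pb_bytes / 1e9)   # MB, GB
```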

Phase 2: Data Model (~5 minutes)

The two core queries are different:

  • Leaderboard query: "show top 100 for game_id"

  • Personal-best query: "show best score for (game_id, user_id)"

Do not force one table to answer both efficiently. Treat them as separate read models.

Core Entities

text
Game
├── game_id: UUID
├── title: string
├── status: enum (prelaunch, live, retired)
├── launch_at: timestamp
└── metadata: jsonb

PersonalBest
├── game_id: UUID
├── user_id: UUID
├── best_score: bigint
├── best_submission_id: UUID
├── updated_at: timestamp
└── version: bigint

LeaderboardEntry
├── game_id: UUID
├── rank: int
├── user_id: UUID
├── score: bigint
├── submission_id: UUID
├── achieved_at: timestamp
└── tie_break_key: string

ScoreSubmission
├── submission_id: UUID
├── game_id: UUID
├── user_id: UUID
├── session_id: string
├── score: bigint
├── idempotency_key: string
├── client_seen_pb: bigint
├── client_seen_cutoff: bigint
├── status: enum (queued, applied, ignored, failed)
├── applied_reason: enum (personal_best, leaderboard, both, none)
└── created_at: timestamp

ScoreEvent
├── event_id: UUID
├── submission_id: UUID
├── game_id: UUID
├── user_id: UUID
├── event_type: enum (received, applied, ignored, retried)
└── created_at: timestamp

Relationships

text
Game 1:N LeaderboardEntry
Game 1:N PersonalBest
Game 1:N ScoreSubmission
User 1:N PersonalBest
User 1:N ScoreSubmission
ScoreSubmission 1:N ScoreEvent

Hot Read Models

For the interview, call out the actual cache keys:

text
top100:{game_id}               -> sorted leaderboard payload
cutoff:{game_id}               -> current 100th-place score
pb:{game_id}:{user_id}         -> player's best score
leaderboard_version:{game_id}  -> monotonic version for cache refresh / sync

The leaderboard is tiny. For each game, the exact top 100 can fit comfortably in memory. That means the hard problem is not storage size; it is coordinating updates during bursts and hot games.

Be explicit that rank is derived from a deterministic sort order, not stored as an independently edited field. For example: sort by score DESC, then achieved_at ASC, then submission_id ASC.
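A minimal sketch of that derived ordering, assuming entries are plain dicts whose field names mirror the LeaderboardEntry entity (the helper itself is illustrative, not part of the design):

```python
# Deterministic leaderboard ordering: score DESC, achieved_at ASC,
# submission_id ASC. Rank is derived from position, never stored directly.
def sort_key(entry):
    return (-entry["score"], entry["achieved_at"], entry["submission_id"])

def rank_top_k(entries, k=100):
    ordered = sorted(entries, key=sort_key)[:k]
    return [{"rank": i, **e} for i, e in enumerate(ordered, start=1)]
```

Ties on score resolve to the earlier `achieved_at`, then the smaller `submission_id`, so two replicas sorting the same entries always produce the same ranking.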

Phase 3: API Design (~5 minutes)

Protocol Choices

  • Public client API: REST over HTTPS

  • Internal score processing: durable queue / log partitioned by game_id

  • Optional live update path: WebSocket or server push if the game wants realtime leaderboard movement

REST is the right default here. The key is idempotency, not fancy transport.

Client-Facing APIs

http
GET /v1/games/{game_id}/leaderboard?limit=100
Response:
{
  "game_id": "game_123",
  "version": 8812,
  "cutoff_score": 984500,
  "entries": [
    { "rank": 1, "user_id": "u1", "score": 1432000 },
    { "rank": 2, "user_id": "u2", "score": 1419000 }
  ]
}
http
GET /v1/games/{game_id}/players/me/best
Response:
{
  "game_id": "game_123",
  "user_id": "me",
  "best_score": 991000,
  "updated_at": "2026-03-16T18:23:10Z"
}
http
GET /v1/games/{game_id}/score-context
Response:
{
  "game_id": "game_123",
  "personal_best": 991000,
  "cutoff_score": 984500,
  "leaderboard_version": 8812,
  "entries": [ ... top 100 ... ]
}

This combined endpoint is practical because the client typically wants both values together right after a match.

http
POST /v1/games/{game_id}/scores
{
  "session_id": "sess_abc",
  "score": 1002000,
  "idempotency_key": "game_123:sess_abc:attempt_1",
  "client_seen_pb": 991000,
  "client_seen_cutoff": 984500
}
Response:
{
  "submission_id": "sub_789",
  "status": "queued",
  "likely_updates": {
    "personal_best": true,
    "leaderboard": true
  },
  "current_context": {
    "personal_best": 991000,
    "cutoff_score": 984500,
    "leaderboard_version": 8812
  }
}

Include idempotency_key. The client may keep pending qualifying scores locally and retry after reconnect. Without idempotency, duplicate submissions are guaranteed.
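The dedupe behavior can be sketched with an in-memory stand-in for a shared store such as Redis (class and field names here are illustrative, not a real API):

```python
# Idempotent submission acceptance: a retry with the same key returns
# the original result instead of creating a second submission.
class SubmissionStore:
    def __init__(self):
        self._by_key = {}   # idempotency_key -> previously returned record

    def accept(self, idempotency_key, payload):
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            # Duplicate retry: replay the prior response, do not re-enqueue.
            return existing, False
        record = {"submission_id": f"sub_{len(self._by_key) + 1}", **payload}
        self._by_key[idempotency_key] = record
        return record, True
```

In production the key-to-record mapping would live in a shared store with a TTL, but the contract is the same: one key, one submission, no matter how many retries arrive.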

If the interviewer insists on exact read-after-write semantics in the POST /scores response, upgrade the async processor into a per-game synchronous owner service that appends to a durable log and updates the cache before ACK. The rest of the design remains the same.

Phase 4: High-Level Design (~15-25 minutes)

Start Simple First

Because the prompt explicitly hints low initial traffic, start with the simplest design that works:

  • one API service

  • one relational database

  • two indexed tables or materialized views:

      • personal_best(game_id, user_id)

      • leaderboard_entries(game_id, rank)

That is enough for a startup or a private beta.

Then say how you would evolve it once reads and launches grow:

Scaled Architecture

Component Responsibilities

Game client

  • Caches the last seen personal best and leaderboard cutoff locally

  • Submits only if the new score beats one of those thresholds

  • Buffers pending qualifying submissions locally if the service is down

Read API

  • Serves leaderboard and personal-best reads from cache

  • Falls back to DB snapshots only on cache miss or rebuild

Score Submit API

  • Validates idempotency key and request shape

  • Re-checks cached PB and cutoff because the client may be stale or malicious

  • Rejects obvious non-qualifying scores without touching the database

  • Appends candidate submissions to the durable log for replayable processing

Durable Log / Queue

  • Acts as the source of truth for accepted candidate submissions

  • Partitions by game_id so updates for the same game stay ordered

  • Absorbs launch spikes and supports replay if cache or DB projections fall behind

Leaderboard Processor

  • Consumes submissions from the durable log partitioned by game_id

  • Maintains one logical ordered updater per game partition

  • Updates top100, cutoff, and personal_best caches

  • Materializes snapshots/submissions into the primary database
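One way to picture the per-game owner is a small in-memory class. This is a sketch under the tie-break rule from the data model; exact ties at the cutoff are simplified away, and real state would be backed by Redis or the durable log:

```python
class GameLeaderboard:
    """Single logical owner of one game's exact top-K (illustrative sketch)."""
    def __init__(self, k=100):
        self.k = k
        self.entries = []   # kept sorted by (-score, achieved_at, submission_id)
        self.version = 0    # bumped on every change for cache refresh / sync

    def apply(self, entry):
        # Once the board is full, ignore scores at or below the cutoff
        # (exact ties at the cutoff are simplified away here).
        if len(self.entries) >= self.k and entry["score"] <= self.entries[-1]["score"]:
            return False
        self.entries.append(entry)
        self.entries.sort(key=lambda e: (-e["score"], e["achieved_at"], e["submission_id"]))
        del self.entries[self.k:]   # evict anything pushed past rank K
        self.version += 1
        return True

    def cutoff(self):
        # Defined only once the board is full; new games accept almost everything.
        return self.entries[-1]["score"] if len(self.entries) >= self.k else None
```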

Primary DB

  • Stores materialized snapshots and submission audit records

  • Rebuilds cache if Redis is lost and recent log segments are replayed

Read Path

Right after a match ends, the game client usually needs both the personal best and the leaderboard:

  • Client calls GET /score-context

  • Read API fetches pb:{game_id}:{user_id} and top100:{game_id} from cache

  • Response returns in tens of milliseconds without hitting the database

This is why "read from cache, not DB" is the right default answer.

Submission Path

  • Client compares the new score against locally cached personal best and cutoff

  • If the score cannot matter, the client does not submit

  • If it might matter, client calls POST /scores with idempotency key

  • Submit API re-checks against authoritative cache

  • If the score still cannot affect PB or top 100, return ignored

  • Otherwise append the submission to the durable log

  • The leaderboard processor updates:

      • player's personal best if the score is higher

      • game's top 100 if the score clears the cutoff

      • cutoff score and leaderboard version

  • Downstream materializers persist the applied change to DB snapshots and audit tables; because the log is durable, the DB can lag briefly without losing accepted submissions

Separate "can this score matter?" from "store this score forever." Most scores should die in cache-level filtering and never become durable writes.

Stale local state is acceptable here. If the client has an old cutoff or personal best, it may submit a few extra candidates, but the server-side cache check keeps correctness intact.
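The filtering decision above can be sketched as a pure function. The return values mirror the applied_reason enum from the data model, and `leaderboard_full` is a stand-in for "the board already has K entries" (a not-yet-full board accepts any score):

```python
def applied_reason(score, personal_best, cutoff, leaderboard_full=True):
    """Classify a candidate before any durable write (sketch)."""
    beats_pb = personal_best is None or score > personal_best
    # A leaderboard that is not full yet accepts any score, so cutoff is moot.
    beats_board = (not leaderboard_full) or score > cutoff
    if beats_pb and beats_board:
        return "both"
    if beats_pb:
        return "personal_best"
    if beats_board:
        return "leaderboard"
    return "none"
```

The same function runs twice: once on the client against locally cached values (cheap, possibly stale) and once on the server against the authoritative cache (correct).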

Why A Queue Helps

The queue is not mainly about average traffic. It is about launch-day burst absorption.

When a new game launches:

  • the leaderboard may have fewer than 100 entries

  • nearly every early score qualifies

  • writes spike sharply for a short time

The queue gives you:

  • burst absorption

  • natural partitioning by game_id

  • retry and backoff during DB or cache hiccups

  • room to scale processors independently from APIs

Keeping The UX "Immediate"

There are two acceptable answers depending on how strict the interviewer is:

Availability-first version

  • POST /scores durably queues the candidate

  • cache updates appear within sub-second lag

  • client immediately reads updated context from cache

Stronger consistency version

  • a per-game owner service appends to a durable log and updates cache before ACK

  • higher complexity, lower staleness

Given the prompt's emphasis on availability over consistency, start with the first answer and then mention the second as an upgrade path.

Phase 5: Scaling & Trade-offs (~15-20 minutes)

1. Why Write Traffic Drops Over Time

This is one of the best insights in the prompt:

  • once a game has a meaningful top 100, the cutoff rises

  • most players stop qualifying

  • clients stop submitting losing scores

  • server-side filtering removes stale or incorrect candidates

So the system naturally becomes:

  • read-heavy for mature games

  • write-bursty for new games

That is why cache-first reads and queue-buffered writes are a strong fit.

2. Scaling Personal Best And Leaderboard Separately

These are different workloads:

  • PersonalBest is keyed by (game_id, user_id) and can be horizontally partitioned naturally

  • LeaderboardTop100 is a tiny per-game ranked set and should be cached as a single object

Do not try to answer both queries with one giant table and one query plan.

Practical scaling strategy:

  • shard PersonalBest by hash of (game_id, user_id)

  • keep LeaderboardTop100 in Redis sorted sets or processor memory with DB snapshots behind it

  • serve both from cache through separate read models

This directly addresses the interviewer follow-up: these two queries want different storage layouts.

3. Hot Game And Hot Partition Strategy

This is usually the hardest follow-up.

The bad answer is:

  • split one hot game's data into game-1, game-2, game-3

  • then try to join those shards on every leaderboard read

That makes exact realtime ranking awkward because the read path now has to merge multiple partial leaderboards.

A better answer is:

  • Keep one logical owner for the exact top 100 of a game

  • Scale around that owner, instead of splitting the exact ranking state blindly

Concretely:

  • assign hot games dedicated partitions or machines

  • let clients pre-filter using local PB and cutoff

  • let submit APIs pre-filter again using cache

  • if traffic is still too hot, add hierarchical top-K filtering

With hierarchical top-K:

  • each ingress shard keeps a local candidate heap for the hot game

  • only scores above a moving threshold are forwarded to the game's global owner

  • the global owner merges shard top-K streams and maintains the exact top 100

That avoids suffix-based sharding on the read path while still reducing pressure on the exact owner.
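The per-shard filter can be sketched with a bounded min-heap (an illustrative class, not a prescribed component name). Any score below a shard's own local top K cannot be in the global top K, because that shard alone has already seen K better scores:

```python
import heapq

class ShardTopKFilter:
    """Per-ingress-shard pre-filter for one hot game (illustrative sketch).
    Keeps a min-heap of the best K scores this shard has seen and forwards
    only candidates that could still enter the global top K."""
    def __init__(self, k=100):
        self.k = k
        self._heap = []   # min-heap; root is the local K-th best score

    def offer(self, score):
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, score)
            return True           # board not locally full: forward to owner
        if score > self._heap[0]:
            heapq.heapreplace(self._heap, score)
            return True           # beats the moving local threshold
        return False              # cannot enter the global top K; drop
```

The threshold only rises over time, which is exactly the "mature games get quieter" effect applied per shard.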

If the interviewer asks "what if one game becomes insanely hot?", say: "I would not shard the exact global leaderboard and merge on every read. I would keep one exact owner for top 100 and use pre-filtering or hierarchical top-K before it."

4. Availability Over Consistency

Be explicit about the trade-off:

  • Reads come from cache, so they may be slightly stale

  • Score submission is idempotent and retryable

  • If Redis has a partial outage, clients can keep pending scores locally and retry later

  • If the snapshot database is slow, the durable log can absorb accepted submissions while projection workers catch up later

This is acceptable because:

  • users care more that the feature works than that rank 98 and rank 99 swap instantly

  • the leaderboard is not a bank account

5. Failure Handling

Client offline or service unavailable

  • client stores pending candidate scores locally

  • retry later with the same idempotency key

Queue backlog

  • processors autoscale by partition count

  • hot games get dedicated workers

Redis loss

  • rebuild from DB snapshots and recent log replay

  • serve degraded responses while warming cache

Duplicate submissions

  • dedupe by (game_id, user_id, session_id, idempotency_key)

Out-of-order retries

  • keep submission versions or timestamps

  • only raise personal best if the new score is strictly larger than the stored best
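The strictly-larger rule makes the personal-best update safe under any retry ordering. A sketch, with a plain dict standing in for the PB table or cache:

```python
def apply_personal_best(store, game_id, user_id, score, submission_id):
    """Apply a candidate to the personal-best record; safe under
    out-of-order retries because only strictly larger scores win."""
    key = (game_id, user_id)
    current = store.get(key)
    if current is None or score > current["best_score"]:
        store[key] = {"best_score": score, "best_submission_id": submission_id}
        return True
    return False   # late, duplicate, or lower retry: ignored
```

Because the update is a pure max operation, replaying the same log segment twice converges to the same state.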

6. Why Not Read From The Database Directly?

Because the main queries are tiny but very frequent:

  • top 100 for a game

  • one user's personal best

If every game-over screen hits the DB directly, read traffic grows with overall session volume. Cache avoids that and keeps p95 latency low.

The database should be the durable snapshot and rebuild layer, not the main serving layer for hot reads. In the scaled design above, accepted submissions land in the durable log first and are then materialized into the database.

Common Pitfalls

Storing every score in the hot path: this ignores the prompt's key optimization. Only personal-best and top-K candidates should enter the durable processing path; most other scores should be filtered out immediately.

Using one schema for both leaderboard and personal-best reads: these are different access patterns and want different read models.

Sharding a hot game's exact leaderboard into gameId-1, gameId-2, ... and merging on every read: this makes exact realtime top-K much harder than necessary.

Serving leaderboard reads from the primary database: this creates unnecessary latency and load for a cache-friendly workload.

Ignoring retries: if clients buffer scores locally during outages, idempotency is mandatory.

Interview Checklist

Requirements Phase

  • Clarified that the hot features are personal best and top 100 leaderboard

  • Confirmed availability is more important than strict consistency

  • Called out the low-traffic startup starting point before scaling

Design Phase

  • Separated personal-best and leaderboard read models

  • Put reads on cache instead of DB

  • Used a queue or durable log to absorb launch-day spikes

  • Added client-side and server-side filtering for non-qualifying scores

Scaling Phase

  • Explained why mature games become write-light

  • Covered hot-game handling without breaking exact ranking reads

  • Mentioned idempotent retries and local client buffering

  • Discussed dedicated partitions or hierarchical top-K for extreme hotspots

Summary

| Aspect | Decision | Why |
| --- | --- | --- |
| Public API | REST | Simple and sufficient for score fetch + submit |
| Read path | Cache-first | Personal best and top 100 are tiny, hot reads |
| Write path | Queue-buffered candidate submissions | Smooths launch spikes and supports retries |
| Core optimization | Submit only PB/top-100 candidates | Eliminates most writes |
| Personal best scaling | Partition by (game_id, user_id) | Natural horizontal sharding |
| Leaderboard scaling | One exact owner per game + dedicated hot partitions | Preserves exact top 100 without read-path merges |
| Failure handling | Idempotent retries + local client buffer | Supports availability-first behavior |

Interview Experience

Candidates reported that the interviewer often started with a deliberately small-scope version of the problem: "Assume the user count is low. How would you build it first?" A strong answer is to begin with one machine and one database, then evolve the design only when the interviewer pushes on scale. The follow-up usually centers on two points:

  • personal-best and leaderboard queries should not share the same storage strategy

  • naive hot-game sharding creates a painful merge problem on the read path

The strongest way to close is to say that you would keep one exact owner for a game's top 100, aggressively pre-filter candidates before they reach that owner, and reserve hierarchical top-K aggregation for truly extreme hotspots.