Design a Gaming Leaderboard Service
Role: Software Engineer
Imagine you are building a backend service for game studios. Games run on players' computers, and after a game session ends the client wants to show two things right away:
- the player's personal best for that game
- the game's top 100 leaderboard
We only need to persist a submitted score if it sets a new personal best for that player or breaks into the game's top 100. That small requirement changes the whole traffic pattern: a newly launched game can generate a burst of writes, but mature games become much quieter because the leaderboard cutoff rises and most players no longer qualify.
This walkthrough follows the Interview Framework and focuses on what you would actually present in a 45-60 minute interview.
Prioritize availability over consistency. If a cache replica is slightly stale or one region is degraded, it is better to show a slightly old leaderboard and let clients retry score submission than to take the entire feature offline.
The key interview insight is that this is not a "store every score forever" system. Because only personal-best improvements and top-100 candidates matter, both the client and the server can filter most writes before they hit durable storage.
If the interviewer says top 10 instead of top 100, treat K as configurable. The architecture is the same; only the cutoff cache and leaderboard size change.
Assume anti-cheat and score verification are handled upstream by trusted game servers or a separate validation service. The focus here is leaderboard storage, latency, and scaling.
Phase 1: Requirements (~5 minutes)
Start by clarifying scope. The interviewer hint here is usually "traffic is small at first" or "this is a startup." That is a signal to begin with the simplest working system, then evolve it.
Functional Requirements
- Players should be able to fetch the current top 100 leaderboard for a game
- Players should be able to fetch their personal best for a game immediately after a session ends
- Clients should be able to submit a score candidate and learn whether it updated personal best, leaderboard, both, or neither
- The system should be able to safely handle retries and offline buffering so players can resubmit when service is back
- The system should be able to support many games, including bursts when a new title launches
Do not over-scope the first pass. Matchmaking, achievements, friends, anti-cheat, and full score history are all separate systems unless the interviewer explicitly asks for them.
Non-Functional Requirements
| Requirement | Target | Rationale |
|---|---|---|
| Leaderboard read latency | P95 under 50 ms from cache | Game UI should feel instant |
| Personal-best read latency | P95 under 50 ms from cache | Usually shown right after the match ends |
| Score submission ack | P95 under 200 ms | Client should know quickly whether submission was accepted or queued |
| Availability | 99.95% | Prefer degraded service over hard failure |
| Consistency | Eventual consistency is acceptable | A few hundred milliseconds of staleness is fine |
| Durability for accepted scores | No lost accepted submissions | Once accepted, a retryable candidate must survive failures |
Ask clarifying questions early:
- "Do we need to store every raw score attempt, or only personal bests and top-K candidates?" Assume only the latter in the hot path.
- "Does the client need strict read-after-write consistency after submission?" Assume near-realtime is enough, but call out how to tighten this if needed.
- "Should the client submit every score?" No. The intended optimization is local filtering: only submit if the score beats the local personal best or current leaderboard cutoff.
- "How do we break ties on the leaderboard?" Use a deterministic rule such as (score DESC, achieved_at ASC, submission_id ASC).
- "What matters more: consistency or availability?" Availability wins.
Capacity Estimation
Use the traffic shape from the prompt, not generic social-media scale:
Assumptions:
- 10M daily active players across all games
- 3 completed game sessions per player per day
- 30M session ends/day = ~350 session ends/sec average
- Peak factor during evenings / launches: 15x => ~5,000 session ends/sec
Steady-state mature games:
- Only ~2% of scores beat local PB or top-100 cutoff
- Candidate write peak ~= 100 submissions/sec
Launch-day hot game:
- The leaderboard is not full yet, so nearly every finished session may submit
- Worst-case write peak for one new game can approach the full 5,000 submissions/sec
Reads:
- Leaderboard and PB are read after many sessions and from game menus
- Design for ~50,000 cache reads/sec peak across games

Hot data size is small enough to justify aggressive caching:
Leaderboard cache:
- 100K active games
- 100 entries/game
- ~64 bytes/entry raw payload
- ~640 MB raw before overhead
Personal-best cache:
- 200M active (user_id, game_id) pairs
- ~32 bytes per record raw
- ~6.4 GB raw before overhead

The most important capacity observation is not the absolute QPS. It is that write traffic decays over time for mature games, while launches create bursty "all scores matter" behavior. That is why a queue plus cache-first reads is a strong answer.
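The arithmetic is simple enough to sanity-check live. A minimal back-of-envelope script (Python; the constants are the assumptions above, not measurements):

```python
# Back-of-envelope check of the capacity numbers above.
DAU = 10_000_000
SESSIONS_PER_PLAYER = 3
PEAK_FACTOR = 15
QUALIFY_RATE_MATURE = 0.02        # ~2% of scores beat PB or cutoff in mature games

session_ends_per_day = DAU * SESSIONS_PER_PLAYER                # 30M/day
avg_per_sec = session_ends_per_day / 86_400                     # ~350/sec
peak_per_sec = avg_per_sec * PEAK_FACTOR                        # ~5,000/sec
mature_candidate_writes = peak_per_sec * QUALIFY_RATE_MATURE    # ~100/sec

leaderboard_cache_mb = 100_000 * 100 * 64 / 1e6                 # ~640 MB raw
personal_best_cache_gb = 200_000_000 * 32 / 1e9                 # ~6.4 GB raw

print(f"avg session ends/sec:        {avg_per_sec:,.0f}")
print(f"peak session ends/sec:       {peak_per_sec:,.0f}")
print(f"mature candidate writes/sec: {mature_candidate_writes:,.0f}")
print(f"leaderboard cache (MB):      {leaderboard_cache_mb:,.0f}")
print(f"personal-best cache (GB):    {personal_best_cache_gb:,.1f}")
```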
Phase 2: Data Model (~5 minutes)
The two core queries are different:
- Leaderboard query: "show top 100 for game_id"
- Personal-best query: "show best score for (game_id, user_id)"
Do not force one table to answer both efficiently. Treat them as separate read models.
Core Entities
Game
├── game_id: UUID
├── title: string
├── status: enum (prelaunch, live, retired)
├── launch_at: timestamp
└── metadata: jsonb
PersonalBest
├── game_id: UUID
├── user_id: UUID
├── best_score: bigint
├── best_submission_id: UUID
├── updated_at: timestamp
└── version: bigint
LeaderboardEntry
├── game_id: UUID
├── rank: int
├── user_id: UUID
├── score: bigint
├── submission_id: UUID
├── achieved_at: timestamp
└── tie_break_key: string
ScoreSubmission
├── submission_id: UUID
├── game_id: UUID
├── user_id: UUID
├── session_id: string
├── score: bigint
├── idempotency_key: string
├── client_seen_pb: bigint
├── client_seen_cutoff: bigint
├── status: enum (queued, applied, ignored, failed)
├── applied_reason: enum (personal_best, leaderboard, both, none)
└── created_at: timestamp
ScoreEvent
├── event_id: UUID
├── submission_id: UUID
├── game_id: UUID
├── user_id: UUID
├── event_type: enum (received, applied, ignored, retried)
└── created_at: timestamp

Relationships
Game 1:N LeaderboardEntry
Game 1:N PersonalBest
Game 1:N ScoreSubmission
User 1:N PersonalBest
User 1:N ScoreSubmission
ScoreSubmission 1:N ScoreEvent

Hot Read Models
For the interview, call out the actual cache keys:
top100:{game_id} -> sorted leaderboard payload
cutoff:{game_id} -> current 100th-place score
pb:{game_id}:{user_id} -> player's best score
leaderboard_version:{game_id} -> monotonic version for cache refresh / sync

The leaderboard is tiny. For each game, the exact top 100 can fit comfortably in memory. That means the hard problem is not storage size; it is coordinating updates during bursts and hot games.
Be explicit that rank is derived from a deterministic sort order, not stored as an independently edited field. For example: sort by score DESC, then achieved_at ASC, then submission_id ASC.
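To show that rank is purely derived, here is a minimal sketch of the deterministic ordering (Python; the Entry type and function names are illustrative, not part of the data model above):

```python
from dataclasses import dataclass

@dataclass
class Entry:
    # Mirrors the LeaderboardEntry fields that matter for ordering.
    user_id: str
    score: int
    achieved_at: float      # epoch seconds
    submission_id: str

def sort_key(e: Entry):
    # score DESC, achieved_at ASC, submission_id ASC
    return (-e.score, e.achieved_at, e.submission_id)

def rank_top_k(entries: list[Entry], k: int = 100) -> list[tuple[int, Entry]]:
    """Rank is computed from the sort order on read/update, never edited directly."""
    ordered = sorted(entries, key=sort_key)[:k]
    return [(i + 1, e) for i, e in enumerate(ordered)]
```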
Phase 3: API Design (~5 minutes)
Protocol Choices
- Public client API: REST over HTTPS
- Internal score processing: durable queue / log partitioned by game_id
- Optional live update path: WebSocket or server push if the game wants realtime leaderboard movement
REST is the right default here. The key is idempotency, not fancy transport.
Client-Facing APIs
GET /v1/games/{game_id}/leaderboard?limit=100
Response:
{
"game_id": "game_123",
"version": 8812,
"cutoff_score": 984500,
"entries": [
{ "rank": 1, "user_id": "u1", "score": 1432000 },
{ "rank": 2, "user_id": "u2", "score": 1419000 }
]
}

GET /v1/games/{game_id}/players/me/best
Response:
{
"game_id": "game_123",
"user_id": "me",
"best_score": 991000,
"updated_at": "2026-03-16T18:23:10Z"
}

GET /v1/games/{game_id}/score-context
Response:
{
"game_id": "game_123",
"personal_best": 991000,
"cutoff_score": 984500,
"leaderboard_version": 8812,
"entries": [ ... top 100 ... ]
}

This combined endpoint is practical because the client typically wants both values together right after a match.
POST /v1/games/{game_id}/scores
{
"session_id": "sess_abc",
"score": 1002000,
"idempotency_key": "game_123:sess_abc:attempt_1",
"client_seen_pb": 991000,
"client_seen_cutoff": 984500
}
Response:
{
"submission_id": "sub_789",
"status": "queued",
"likely_updates": {
"personal_best": true,
"leaderboard": true
},
"current_context": {
"personal_best": 991000,
"cutoff_score": 984500,
"leaderboard_version": 8812
}
}

Include idempotency_key. The client may keep pending qualifying scores locally and retry after reconnect. Without idempotency, duplicate submissions are guaranteed.
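A minimal sketch of how a client might build the idempotency key and retry a buffered submission (Python with the third-party requests library; the base URL and helper name are illustrative, and the request shape follows the POST above):

```python
import time
import requests

API_BASE = "https://leaderboard.example.com/v1"   # placeholder

def submit_score(game_id: str, session_id: str, score: int,
                 seen_pb: int, seen_cutoff: int, attempt: int = 1) -> dict:
    """Retry-safe submit: the idempotency key stays constant across retries of one attempt."""
    body = {
        "session_id": session_id,
        "score": score,
        "idempotency_key": f"{game_id}:{session_id}:attempt_{attempt}",
        "client_seen_pb": seen_pb,
        "client_seen_cutoff": seen_cutoff,
    }
    for backoff in (0.5, 1, 2, 4):
        try:
            resp = requests.post(f"{API_BASE}/games/{game_id}/scores", json=body, timeout=2)
            if resp.status_code < 500:
                return resp.json()        # queued, applied, or ignored
        except requests.RequestException:
            pass                          # network error: fall through and retry
        time.sleep(backoff)
    raise RuntimeError("still pending; keep the candidate in the local buffer")
```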
If the interviewer insists on exact read-after-write semantics in the POST /scores response, upgrade the async processor into a per-game synchronous owner service that appends to a durable log and updates the cache before ACK. The rest of the design remains the same.
Phase 4: High-Level Design (~15-25 minutes)
Start Simple First
Because the prompt explicitly hints low initial traffic, start with the simplest design that works:
- one API service
- one relational database
- two indexed tables or materialized views:
  - personal_best(game_id, user_id)
  - leaderboard_entries(game_id, rank)
That is enough for a startup or a private beta.
Then say how you would evolve it once reads and launches grow:
Scaled Architecture
Component Responsibilities
Game client
- Caches the last seen personal best and leaderboard cutoff locally
- Submits only if the new score beats one of those thresholds (sketched below)
- Buffers pending qualifying submissions locally if the service is down
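A minimal sketch of that client-side filter, assuming the client holds the values from its last GET /score-context response (the ClientCache and should_submit names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class ClientCache:
    personal_best: int    # last PB this client saw
    cutoff_score: int     # last 100th-place score this client saw

def should_submit(score: int, cache: ClientCache) -> bool:
    # Submit only if the score could change the player's PB or the game's top 100.
    # Stale local values just cause a few extra submissions; the server re-checks.
    return score > cache.personal_best or score > cache.cutoff_score
```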
Read API
- Serves leaderboard and personal-best reads from cache
- Falls back to DB snapshots only on cache miss or rebuild
Score Submit API
- Validates idempotency key and request shape
- Re-checks cached PB and cutoff because the client may be stale or malicious (see the sketch below)
- Rejects obvious non-qualifying scores without touching the database
- Appends candidate submissions to the durable log for replayable processing
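A minimal sketch of that server-side re-check, assuming the cache keys from Phase 2; the cache and log objects are illustrative stand-ins rather than a specific library:

```python
def handle_submission(cache, log, game_id: str, user_id: str,
                      score: int, idempotency_key: str) -> str:
    """Filter, then durably enqueue; only qualifying candidates reach the log."""
    if cache.get(f"dedupe:{idempotency_key}"):
        return "duplicate"                                    # already accepted earlier

    pb = int(cache.get(f"pb:{game_id}:{user_id}") or 0)
    cutoff = int(cache.get(f"cutoff:{game_id}") or 0)         # 0 while the board is not full

    # Cheap rejection: the score cannot change the PB or the top 100.
    if score <= pb and score <= cutoff:
        return "ignored"

    # Qualifying candidate: append to the durable log before acknowledging.
    log.append(partition_key=game_id, payload={
        "game_id": game_id, "user_id": user_id,
        "score": score, "idempotency_key": idempotency_key,
    })
    cache.set(f"dedupe:{idempotency_key}", 1, ex=86_400)      # best-effort dedupe window
    return "queued"
```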
Durable Log / Queue
- Acts as the source of truth for accepted candidate submissions
- Partitions by game_id so updates for the same game stay ordered
- Absorbs launch spikes and supports replay if cache or DB projections fall behind
Leaderboard Processor
- Consumes submissions from the durable log partitioned by game_id
- Maintains one logical ordered updater per game partition
- Updates the top100, cutoff, and personal_best caches (sketched below)
- Materializes snapshots/submissions into the primary database
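A minimal sketch of the per-game update step, assuming Redis sorted sets for top100:{game_id} (redis-py, and Redis 6.2+ for the ZADD GT flag); a real processor would batch these calls per partition rather than issue them one by one:

```python
import redis  # assumed dependency

r = redis.Redis()
TOP_K = 100

def apply_submission(game_id: str, user_id: str, score: int) -> None:
    """Apply one accepted candidate: PB, top-100 set, cutoff, and version."""
    pb_key = f"pb:{game_id}:{user_id}"
    lb_key = f"top100:{game_id}"

    # Raise the personal best only if strictly larger (safe under retries).
    if score > int(r.get(pb_key) or 0):
        r.set(pb_key, score)

    # Keep the player's best score on the board, then trim to the top K.
    r.zadd(lb_key, {user_id: score}, gt=True)       # never lower an existing member
    r.zremrangebyrank(lb_key, 0, -(TOP_K + 1))      # keep only the K highest scores

    # Cutoff = lowest score still on the board, once the board is full.
    lowest = r.zrange(lb_key, 0, 0, withscores=True)
    if lowest and r.zcard(lb_key) >= TOP_K:
        r.set(f"cutoff:{game_id}", int(lowest[0][1]))

    r.incr(f"leaderboard_version:{game_id}")        # signal clients/caches to refresh
```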
Primary DB
- Stores materialized snapshots and submission audit records
- Serves as the rebuild source if Redis is lost: restore snapshots, then replay recent log segments
Read Path
Right after a match ends, the game client usually needs both the personal best and the leaderboard:
- Client calls GET /score-context
- Read API fetches pb:{game_id}:{user_id} and top100:{game_id} from cache
- Response returns in tens of milliseconds without hitting the database
This is why "read from cache, not DB" is the right default answer.
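A minimal sketch of that handler, assuming the cached leaderboard payload is stored pre-serialized; the function and parameter names are illustrative:

```python
import json

def get_score_context(cache, game_id: str, user_id: str) -> dict:
    """Serve both hot reads from cache; the DB is only a rebuild fallback."""
    pb = cache.get(f"pb:{game_id}:{user_id}")
    cutoff = cache.get(f"cutoff:{game_id}")
    version = cache.get(f"leaderboard_version:{game_id}")
    board = cache.get(f"top100:{game_id}")   # pre-serialized top-100 JSON payload
    if board is None:
        board = "[]"                         # real service: rebuild from DB snapshot, then repopulate
    return {
        "game_id": game_id,
        "personal_best": int(pb or 0),
        "cutoff_score": int(cutoff or 0),
        "leaderboard_version": int(version or 0),
        "entries": json.loads(board),
    }
```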
Submission Path
- Client compares the new score against the locally cached personal best and cutoff
- If the score cannot matter, the client does not submit
- If it might matter, the client calls POST /scores with an idempotency key
- Submit API re-checks against the authoritative cache
- If the score still cannot affect PB or top 100, return ignored
- Otherwise append the submission to the durable log
- The leaderboard processor updates:
  - the player's personal best if the score is higher
  - the game's top 100 if the score clears the cutoff
  - the cutoff score and leaderboard version
- Downstream materializers persist the applied change to DB snapshots and audit tables; because the log is durable, the DB can lag briefly without losing accepted submissions
Separate "can this score matter?" from "store this score forever." Most scores should die in cache-level filtering and never become durable writes.
Stale local state is acceptable here. If the client has an old cutoff or personal best, it may submit a few extra candidates, but the server-side cache check keeps correctness intact.
Why A Queue Helps
The queue is not mainly about average traffic. It is about launch-day burst absorption.
When a new game launches:
- the leaderboard may have fewer than 100 entries
- nearly every early score qualifies
- writes spike sharply for a short time
The queue gives you:
- burst absorption
- natural partitioning by game_id
- retry and backoff during DB or cache hiccups
- room to scale processors independently from APIs
Keeping The UX "Immediate"
There are two acceptable answers depending on how strict the interviewer is:
Availability-first version
- POST /scores durably queues the candidate
- cache updates appear with sub-second lag
- the client immediately reads updated context from cache
Stronger consistency version
- a per-game owner service appends to a durable log and updates cache before ACK
- higher complexity, lower staleness
Given the prompt's emphasis on availability over consistency, start with the first answer and then mention the second as an upgrade path.
Phase 5: Scaling & Trade-offs (~15-20 minutes)
1. Why Write Traffic Drops Over Time
This is one of the best insights in the prompt:
- once a game has a meaningful top 100, the cutoff rises
- most players stop qualifying
- clients stop submitting losing scores
- server-side filtering removes stale or incorrect candidates
So the system naturally becomes:
- read-heavy for mature games
- write-bursty for new games
That is why cache-first reads and queue-buffered writes are a strong fit.
2. Scaling Personal Best And Leaderboard Separately
These are different workloads:
- PersonalBest is keyed by (game_id, user_id) and can be horizontally partitioned naturally
- LeaderboardTop100 is a tiny per-game ranked set and should be cached as a single object
Do not try to answer both queries with one giant table and one query plan.
Practical scaling strategy:
- shard PersonalBest by hash of (game_id, user_id)
- keep LeaderboardTop100 in Redis sorted sets or processor memory, with DB snapshots behind it
- serve both from cache through separate read models
This directly addresses the interviewer follow-up: these two queries want different storage layouts.
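A minimal sketch of that sharding rule (the shard count and helper name are illustrative; a production system would typically use consistent hashing so the shard count can change):

```python
import hashlib

NUM_PB_SHARDS = 64   # assumed shard count

def pb_shard(game_id: str, user_id: str, num_shards: int = NUM_PB_SHARDS) -> int:
    """Stable shard assignment: a player's PB for a game always lands on one shard."""
    digest = hashlib.sha1(f"{game_id}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The leaderboard is deliberately not sharded this way: each game keeps one exact owner for its top 100.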
3. Hot Game And Hot Partition Strategy
This is usually the hardest follow-up.
The bad answer is:
- split one hot game's data into game-1, game-2, game-3
- then try to join those shards on every leaderboard read
That makes exact realtime ranking awkward because the read path now has to merge multiple partial leaderboards.
A better answer is:
- Keep one logical owner for the exact top 100 of a game
- Scale around that owner instead of splitting the exact ranking state blindly
Concretely:
- assign hot games dedicated partitions or machines
- let clients pre-filter using local PB and cutoff
- let submit APIs pre-filter again using cache
- if traffic is still too hot, add hierarchical top-K filtering
With hierarchical top-K:
- each ingress shard keeps a local candidate heap for the hot game
- only scores above a moving threshold are forwarded to the game's global owner
- the global owner merges shard top-K streams and maintains the exact top 100
That avoids suffix-based sharding on the read path while still reducing pressure on the exact owner.
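A minimal sketch of the ingress-shard side of that scheme (names are illustrative; the exact owner behind it still applies the full tie-break ordering):

```python
import heapq

class ShardFilter:
    """Per-shard pre-filter in front of a game's single exact top-100 owner."""

    def __init__(self, k: int = 100):
        self.k = k
        self.heap: list[int] = []   # min-heap of the best k scores this shard has seen

    def offer(self, score: int) -> bool:
        """Return True if the score should be forwarded to the global owner."""
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, score)
            return True                      # board not yet full from this shard's view
        if score > self.heap[0]:             # beats the shard-local moving threshold
            heapq.heapreplace(self.heap, score)
            return True
        return False                         # at least k shard-local scores already beat it
```

The guarantee is simple: if a score is not in its own shard's top k, at least k other scores from that shard alone beat it, so it cannot belong to the global top k.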
If the interviewer asks "what if one game becomes insanely hot?", say: "I would not shard the exact global leaderboard and merge on every read. I would keep one exact owner for top 100 and use pre-filtering or hierarchical top-K before it."
4. Availability Over Consistency
Be explicit about the trade-off:
- Reads come from cache, so they may be slightly stale
- Score submission is idempotent and retryable
- If Redis has a partial outage, clients can keep pending scores locally and retry later
- If the snapshot database is slow, the durable log can absorb accepted submissions while projection workers catch up later
This is acceptable because:
- users care more that the feature works than that rank 98 and rank 99 swap instantly
- the leaderboard is not a bank account
5. Failure Handling
Client offline or service unavailable
- client stores pending candidate scores locally
- retry later with the same idempotency key
Queue backlog
- processors autoscale by partition count
- hot games get dedicated workers
Redis loss
- rebuild from DB snapshots and recent log replay
- serve degraded responses while warming cache
Duplicate submissions
- dedupe by (game_id, user_id, session_id, idempotency_key)
Out-of-order retries
- keep submission versions or timestamps
- only raise the personal best if the new score is strictly larger than the stored best (see the sketch below)
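A minimal sketch of a retry-safe personal-best upsert, assuming a Postgres-style personal_best table with a unique constraint on (game_id, user_id) as in the Phase 2 data model; the cursor wiring is illustrative:

```python
# The "strictly larger" condition makes duplicates and out-of-order retries harmless.
UPSERT_PB = """
INSERT INTO personal_best (game_id, user_id, best_score, best_submission_id, updated_at, version)
VALUES (%(game_id)s, %(user_id)s, %(score)s, %(submission_id)s, now(), 1)
ON CONFLICT (game_id, user_id) DO UPDATE
SET best_score = EXCLUDED.best_score,
    best_submission_id = EXCLUDED.best_submission_id,
    updated_at = now(),
    version = personal_best.version + 1
WHERE EXCLUDED.best_score > personal_best.best_score
"""

def apply_personal_best(cursor, game_id, user_id, score, submission_id) -> bool:
    """Return True if the stored best actually changed."""
    cursor.execute(UPSERT_PB, {
        "game_id": game_id, "user_id": user_id,
        "score": score, "submission_id": submission_id,
    })
    return cursor.rowcount == 1
```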
6. Why Not Read From The Database Directly?
Because the main queries are tiny but very frequent:
- top 100 for a game
- one user's personal best
If every game-over screen hits the DB directly, read traffic grows with overall session volume. Cache avoids that and keeps p95 latency low.
The database should be the durable snapshot and rebuild layer, not the main serving layer for hot reads. In the scaled design above, accepted submissions land in the durable log first and are then materialized into the database.
Common Pitfalls
Storing every score in the hot path: this ignores the prompt's key optimization. Only personal-best and top-K candidates should enter the durable processing path; most other scores should be filtered out immediately.
Using one schema for both leaderboard and personal-best reads: these are different access patterns and want different read models.
Sharding a hot game's exact leaderboard into gameId-1, gameId-2, ... and merging on every read: this makes exact realtime top-K much harder than necessary.
Serving leaderboard reads from the primary database: this creates unnecessary latency and load for a cache-friendly workload.
Ignoring retries: if clients buffer scores locally during outages, idempotency is mandatory.
Interview Checklist
Requirements Phase
- Clarified that the hot features are personal best and top 100 leaderboard
- Confirmed availability is more important than strict consistency
- Called out the low-traffic startup starting point before scaling
Design Phase
- Separated personal-best and leaderboard read models
- Put reads on cache instead of DB
- Used a queue or durable log to absorb launch-day spikes
- Added client-side and server-side filtering for non-qualifying scores
Scaling Phase
- Explained why mature games become write-light
- Covered hot-game handling without breaking exact ranking reads
- Mentioned idempotent retries and local client buffering
- Discussed dedicated partitions or hierarchical top-K for extreme hotspots
Summary
| Aspect | Decision | Why |
|---|---|---|
| Public API | REST | Simple and sufficient for score fetch + submit |
| Read path | Cache-first | Personal best and top 100 are tiny, hot reads |
| Write path | Queue-buffered candidate submissions | Smooths launch spikes and supports retries |
| Core optimization | Submit only PB/top-100 candidates | Eliminates most writes |
| Personal best scaling | Partition by (game_id, user_id) | Natural horizontal sharding |
| Leaderboard scaling | One exact owner per game + dedicated hot partitions | Preserves exact top 100 without read-path merges |
| Failure handling | Idempotent retries + local client buffer | Supports availability-first behavior |
Interview Experience
Candidates reported that the interviewer often started with a deliberately small-scope version of the problem: "Assume the user count is low. How would you build it first?" A strong answer is to begin with one machine and one database, then evolve the design only when the interviewer pushes on scale. The follow-up usually centers on two points:
- personal-best and leaderboard queries should not share the same storage strategy
- naive hot-game sharding creates a painful merge problem on the read path
The strongest way to close is to say that you would keep one exact owner for a game's top 100, aggressively pre-filter candidates before they reach that owner, and reserve hierarchical top-K aggregation for truly extreme hotspots.