Chapter 8: System Design Interview Mastery

8.2 Design a URL Shortener

1. Restate the Problem and Pick the Scope

We are designing a URL shortening service (similar to Bitly or TinyURL) that takes a long URL and produces a compact, unique short link.

When a user visits the short link, the service redirects them to the original long URL.

Main user groups and actions:

  • Link creators — paste a long URL and get back a short one they can share anywhere.
  • Link visitors — click or type a short URL and get redirected to the original destination quickly.
  • Link owners — (nice to have) view analytics such as click count and referrer data.

Scope decisions:

  • We will focus on the core shorten-and-redirect loop, including link creation, redirection, and basic analytics (click count).
  • We will NOT cover: a full analytics dashboard with geographic breakdowns, custom branded domains, user accounts with login, team workspaces, or a paid tier system. These can be added later.

2. Clarify Functional Requirements

Must-Have Features

  • Given a long URL, the system generates a unique short URL (e.g. short.ly/abc123).
  • When a user visits the short URL, they are redirected (HTTP 301 or 302) to the original long URL.
  • Short URLs must be unique — no two long URLs map to the same short code.
  • Short codes should be as compact as possible (7 characters is a good target).
  • The system tracks the total number of times each short URL has been clicked.
  • Short URLs have a configurable expiration (default: 5 years, but can be permanent).
  • If the same long URL is shortened twice, the system may return a new short code each time (simpler design).

Nice-to-Have Features

  • Users can supply a custom alias (e.g. short.ly/my-brand).
  • Basic analytics: click count over time, top referrers, device type breakdown.
  • API key authentication so creators can manage their links programmatically.


3. Clarify Non-Functional Requirements

| Metric | Assumption / Target |
| --- | --- |
| Monthly active users (MAU) | 100 million visitors; ~1 million creators |
| Daily active users (DAU) | ~10 million visitors; ~100K creators |
| Read:Write ratio | 100:1 — overwhelmingly read-heavy (redirects >> creates) |
| Redirect latency | < 50 ms p99 — users expect instant redirects |
| Create latency | < 200 ms p99 — acceptable for a write operation |
| Availability | 99.99% (four nines) — redirect path is critical |
| Consistency | Eventual consistency is fine for analytics; the short-code-to-URL mapping needs strong consistency on write (no duplicate codes) |
| Data retention | 5 years default; permanent if requested |

4. Back-of-the-Envelope Estimates

Write QPS (URL creation)

100K creators/day. Assume each creates ~2 URLs per day on average.

Writes/day = 100,000 × 2 = 200,000
Write QPS = 200,000 / 86,400 ≈ 2.3 QPS (average)
Peak QPS = 2.3 × 5 ≈ 12 QPS

Read QPS (redirects)

Read:Write ratio is 100:1.

Read QPS (avg) = 2.3 × 100 = 230 QPS
Peak read QPS = 230 × 5 ≈ 1,150 QPS

Even at peak, ~1,200 QPS is very manageable for a well-cached service. A single modern server can handle this, but we design for growth and redundancy.

Storage

Each URL record stores: short code (7 bytes), long URL (average ~200 bytes), created timestamp (8 bytes), expiry (8 bytes), click count (8 bytes), plus overhead — roughly 300 bytes per record in total.

Records/year = 200,000/day × 365 = 73 million
Storage/year = 73M × 300 bytes ≈ 22 GB/year
Over 5 years ≈ 110 GB

This is small. The entire dataset fits comfortably in memory of a few machines, which is great news for caching.

Short Code Space

Using base62 (a-z, A-Z, 0-9) with 7 characters:

62^7 ≈ 3.5 trillion possible codes

Even at 73 million new URLs/year for decades, we will never exhaust this space.
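The ID-to-short-code conversion itself is a straightforward change of base. A minimal Python sketch (the alphabet ordering is an arbitrary choice; any fixed ordering of the 62 characters works, as long as it never changes):

```python
# Base62 alphabet: digits, lowercase, uppercase. The ordering is arbitrary
# but must stay fixed, since it defines the code for every ID.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a base62 short code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first

# 7 characters give 62**7 distinct codes — about 3.5 trillion.
assert 62**7 == 3_521_614_606_208
```

Every ID below 62^7 produces a code of at most 7 characters, so the 7-character target from the requirements is never exceeded.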


5. API Design

We expose a simple REST API. Three core endpoints cover all must-have features.

5.1 Create Short URL

| Field | Value |
| --- | --- |
| Method & Path | POST /api/v1/urls |
| Request body | { "long_url": "https://example.com/...", "custom_alias": "my-brand" (optional), "expires_at": "2031-01-01" (optional) } |
| Success response (201 Created) | { "short_url": "https://short.ly/abc1234", "short_code": "abc1234", "long_url": "https://example.com/...", "expires_at": "2031-04-06T00:00:00Z" } |
| Error codes | 400 — invalid URL format; 409 — custom alias already taken; 429 — rate limit exceeded |

5.2 Redirect (the hot path)

| Field | Value |
| --- | --- |
| Method & Path | GET /{short_code} |
| Success response | HTTP 302 Found with a Location header pointing to the original long URL |
| Error codes | 404 — short code not found; 410 — short URL has expired |

We use 302 (temporary redirect) instead of 301 (permanent) so the browser always hits our server, which lets us count clicks and enforce expiry. If analytics are not needed, 301 reduces load because browsers cache the redirect.

5.3 Get Link Stats

| Field | Value |
| --- | --- |
| Method & Path | GET /api/v1/urls/{short_code}/stats |
| Success response | { "short_code": "abc1234", "long_url": "https://example.com/...", "total_clicks": 4837, "created_at": "2026-04-06T...", "expires_at": "2031-04-06T..." } |
| Error codes | 404 — short code not found |

6. High-Level Architecture

Below is the system laid out layer by layer, from the client all the way to storage.

[Diagram: High-Level Architecture]

Component Responsibilities

  • Load Balancer — distributes incoming requests across service instances; terminates TLS; performs health checks.
  • Redirect Service — the hot read path. Looks up the short code in cache (then DB if cache miss), returns an HTTP 302. Must be extremely fast.
  • URL Creation Service — the write path. Validates the long URL, generates a unique short code, stores the mapping, and returns the short URL.
  • ID Generator — produces globally unique IDs that are converted to base62 short codes. Options: a counter-based approach, Snowflake-style IDs, or pre-generated ID ranges.
  • Redis Cache — holds hot short-code-to-long-URL mappings in memory for sub-millisecond lookups. The majority of redirects should be served from cache.
  • URL Database — the source of truth for all URL mappings. A relational DB like PostgreSQL or a key-value store like DynamoDB works well here.
  • Message Queue — decouples the redirect path from analytics. Every redirect emits a lightweight click event to the queue; analytics workers consume and aggregate.
  • Analytics Store — stores aggregated click counts. Can be a simple counter in the URL table, or a dedicated time-series store for richer analytics.

7. Data Model

Database Choice

A relational database (PostgreSQL) is a solid default here. The dataset is small (~22 GB/year), relationships are simple, and we benefit from ACID guarantees when inserting new URL records to prevent duplicate short codes. For very high scale, a key-value store like DynamoDB could also work since lookups are almost always by short_code (a single key).

Main Table: urls

| Column | Type | Notes |
| --- | --- | --- |
| id | BIGINT PK | Auto-increment or Snowflake ID |
| short_code | VARCHAR(10) | UNIQUE index — the lookup key |
| long_url | TEXT | The original URL (hash-indexed for optional de-dup) |
| created_at | TIMESTAMP | When the link was created |
| expires_at | TIMESTAMP | NULL means never expires |
| click_count | BIGINT | Denormalized counter, updated async |
| creator_ip | VARCHAR(45) | For abuse tracking (optional) |

Indexes

  • Primary lookup: Unique index on short_code — supports the redirect query with a single B-tree lookup (O(log n), effectively constant time at this data size).
  • Expiry cleanup: Index on expires_at — a background job periodically deletes expired rows.
  • Optional de-dup: Index on long_url hash — only if we want to return the same short code for the same long URL.

How the Model Supports Core Queries

  • Redirect: SELECT long_url, expires_at FROM urls WHERE short_code = ? — single-row lookup on unique index.
  • Create: INSERT INTO urls (short_code, long_url, ...) VALUES (...) — the unique index ensures no collisions.
  • Stats: SELECT click_count, created_at, expires_at FROM urls WHERE short_code = ? — same index.
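These queries can be exercised against the schema using an in-memory SQLite database as a stand-in for PostgreSQL (types are simplified accordingly; this is a sketch, not the production DDL):

```python
import sqlite3

# In-memory stand-in for the production `urls` table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        id          INTEGER PRIMARY KEY,
        short_code  TEXT NOT NULL UNIQUE,   -- the redirect lookup key
        long_url    TEXT NOT NULL,
        created_at  TEXT,
        expires_at  TEXT,                   -- NULL means never expires
        click_count INTEGER DEFAULT 0
    )
""")

# Create: the UNIQUE constraint on short_code rejects collisions at insert time.
conn.execute(
    "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
    ("abc1234", "https://example.com/very/long/original/url"),
)

# Redirect: single-row lookup via the unique index.
row = conn.execute(
    "SELECT long_url, expires_at FROM urls WHERE short_code = ?",
    ("abc1234",),
).fetchone()
print(row[0])  # prints the original long URL
```

Attempting to insert the same short_code twice raises an integrity error, which is exactly the "safety net" behavior the create flow relies on.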

8. Core Flows — End to End

This section walks through the three most critical operations in detail.

Flow 1: Create a Short URL

This is what happens when a creator submits a long URL to be shortened.

  • Step 1 — Client sends request. The creator's browser or API client sends a POST to /api/v1/urls with the long URL in the JSON body. The request hits the load balancer.

  • Step 2 — Load balancer routes to a URL Creation Service instance. The LB picks a healthy instance using round-robin. TLS is terminated at the LB, so the app server receives plain HTTP internally.

  • Step 3 — Validate the input. The service checks that the long URL is well-formed (has a scheme, a valid domain, is not on a blocklist of malicious sites). If invalid, it returns 400 immediately.

  • Step 4 — Generate a unique short code. This is the most important step. The service requests a unique ID from the ID Generator. There are several strategies:

    • Counter-based: A centralized counter (backed by a database sequence or a service like ZooKeeper) hands out auto-incrementing IDs. The service converts the numeric ID to base62 to produce the short code. This guarantees uniqueness without collisions.
    • Pre-generated range: Each app server grabs a range of IDs (e.g. 1–10,000) from a coordination service at startup. It consumes IDs from its local range without network calls. When the range runs out, it fetches a new one. This is fast and avoids a central bottleneck on every write.
    • Hash-based: Take the MD5 or SHA-256 of the long URL, then take the first 7 characters in base62. This can cause collisions, so you must check the database and retry with a different offset if there is a collision. Simpler to implement but slower under high collision rates.

For this design we use the pre-generated range approach because it is fast (no network round-trip per write), collision-free, and scales horizontally.
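A sketch of the range-based allocator, with a plain in-process counter standing in for the coordination service (ZooKeeper, a database sequence, or similar); range size and the coordinator API are illustrative:

```python
import itertools

class FakeCoordinator:
    """Stand-in for a coordination service that hands out disjoint ID ranges."""
    def __init__(self, range_size: int = 10_000):
        self._next_start = itertools.count(start=1, step=range_size)
        self.range_size = range_size

    def claim_range(self):
        start = next(self._next_start)
        return start, start + self.range_size  # half-open range [start, end)

class IdAllocator:
    """Each app server consumes IDs from a locally held range, so no network
    call is needed until the range is exhausted."""
    def __init__(self, coordinator: FakeCoordinator):
        self.coordinator = coordinator
        self.current, self.end = coordinator.claim_range()

    def next_id(self) -> int:
        if self.current >= self.end:                 # range exhausted
            self.current, self.end = self.coordinator.claim_range()
        self.current += 1
        return self.current - 1
```

Because every claimed range is disjoint, two app servers can never mint the same ID, and therefore never the same short code.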

  • Step 5 — Write to the database. The service inserts a new row into the urls table with the short code, long URL, timestamps, and a click_count of 0. The unique index on short_code acts as a safety net — if the insert fails due to a duplicate (extremely unlikely with range-based IDs), the service retries with a new ID.

  • Step 6 — Populate the cache. Right after the DB write succeeds, the service writes the mapping (short_code → long_url) into Redis. This way, the very first redirect for this new link will be a cache hit.

  • Step 7 — Return the response. The service responds with HTTP 201 and a JSON body containing the full short URL (e.g. https://short.ly/abc1234), the short code, the original long URL, and the expiry date. The user sees the short link in their browser almost instantly (typically < 100 ms).

[Diagram: Flow 1 — Create a Short URL]

Flow 2: Redirect a Short URL (the Hot Path)

This is the most performance-critical flow. It happens 100x more often than creation.

  • Step 1 — User clicks or enters the short URL. Their browser sends a GET request to https://short.ly/abc1234. DNS resolves to our load balancer.

  • Step 2 — Load balancer routes to a Redirect Service instance. This is a separate service (or at least a separate handler) optimized purely for speed. It does almost no business logic — just look up and redirect.

  • Step 3 — Check the Redis cache. The service does a GET abc1234 in Redis. Since the cache is sized to hold all active URLs (our entire dataset is only ~110 GB over 5 years, and hot URLs are a tiny fraction), the cache hit rate should be 95%+ for popular links.

    • Cache hit: Redis returns the long URL in under 1 ms. Skip to Step 5.
    • Cache miss: Proceed to Step 4.
  • Step 4 — Fall back to the database. The service queries SELECT long_url, expires_at FROM urls WHERE short_code = 'abc1234'. Because of the unique index, this is a single B-tree lookup — very fast even at scale. If the row is not found, return 404. If expires_at is in the past, return 410 Gone. Otherwise, write the result back to Redis (with a TTL matching the expiry) so future requests are cache hits.

  • Step 5 — Return the HTTP 302 redirect. The service returns:

    HTTP/1.1 302 Found
    Location: https://example.com/very/long/original/url

    The user's browser follows the Location header and lands on the destination page. The entire redirect takes 10–50 ms depending on network.

  • Step 6 — Fire an analytics event (async). Before returning the response (or in parallel), the service publishes a lightweight event to a message queue (e.g. Kafka or SQS): {short_code: 'abc1234', timestamp: ..., referrer: ..., user_agent: ...}. This is a fire-and-forget write — it must NOT slow down the redirect. If the queue is temporarily unavailable, the event is dropped (we accept a small amount of analytics loss for speed).

The key design decision here: analytics is decoupled from the redirect. We never make the user wait for a database counter increment. The redirect returns first; analytics are processed in the background.
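The cache-first lookup logic of Steps 3–5 can be sketched with plain dicts standing in for Redis and the database (expiry handling and the analytics publish are simplified; names are illustrative):

```python
import time

# Stand-ins for Redis and the urls table.
cache = {}
db = {"abc1234": {"long_url": "https://example.com/page", "expires_at": None}}

def resolve(short_code):
    """Return (http_status, location) for a redirect request."""
    # Step 3: check the cache first — sub-millisecond on a hit.
    if short_code in cache:
        return 302, cache[short_code]
    # Step 4: fall back to the database on a miss.
    row = db.get(short_code)
    if row is None:
        return 404, None
    if row["expires_at"] is not None and row["expires_at"] < time.time():
        return 410, None                     # 410 Gone: link has expired
    cache[short_code] = row["long_url"]      # populate cache for next time
    return 302, row["long_url"]
```

Note that the expired and not-found branches never touch the cache, so dead codes cannot crowd out live ones.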

[Diagram: Flow 2 — Redirect a Short URL]

Flow 3: Analytics Aggregation (Background)

This flow runs continuously in the background, processing click events from the message queue.

  • Step 1 — Analytics workers consume events. A pool of workers reads batches of click events from the message queue. Each event contains the short code, timestamp, referrer, and user agent.

  • Step 2 — Batch-increment click counts. Rather than incrementing the database counter for every single click, workers accumulate counts in memory for a short window (e.g. 5 seconds), then issue a single UPDATE urls SET click_count = click_count + N WHERE short_code = ? per short code. This batching dramatically reduces database write load.

  • Step 3 — (Optional) Write to a time-series store. For richer analytics (clicks per hour, top referrers), workers can also write aggregated data to a time-series database or an OLAP store. This is a nice-to-have and is completely separate from the redirect hot path.

  • What the user sees: The click count on the stats endpoint may be a few seconds behind real time. This is perfectly acceptable — analytics data does not need to be instantly consistent.
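The batching step can be sketched as: collapse a window of click events into one increment per short code, then issue a single UPDATE for each (the events list stands in for a queue consumer batch; this is a sketch, not a full worker):

```python
from collections import Counter

def aggregate_batch(events):
    """Collapse a window of click events into one count per short code."""
    return dict(Counter(e["short_code"] for e in events))

def flush(counts, execute):
    """Issue one UPDATE per short code instead of one per click."""
    for code, n in counts.items():
        execute(
            "UPDATE urls SET click_count = click_count + ? WHERE short_code = ?",
            (n, code),
        )
```

A batch of one million clicks on a viral link becomes a single UPDATE with N = 1,000,000 — the database write load is proportional to the number of distinct codes, not the number of clicks.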

[Diagram: Flow 3 — Analytics Aggregation]

9. Caching and Read Performance

What We Cache

  • Primary cache: short_code → long_url mapping. This is the data used on every single redirect.
  • We do NOT cache analytics data in Redis — it changes constantly and is not latency-sensitive.

Where the Cache Sits

Redis sits between the Redirect Service and the database. The service always checks Redis first. On a miss, it reads from the database and populates Redis.

Cache Key and Value Shape

Key: url:{short_code} e.g. url:abc1234
Value: {long_url, expires_at} (small JSON or a simple string)
TTL: Min(time_until_expiry, 24 hours)

Cache Update and Invalidation

  • Write-through on create: When a new URL is created, we write to both the DB and Redis in the same flow.
  • Lazy loading on miss: If a redirect hits a cache miss (e.g. after eviction or a cold start), we read from the DB and populate the cache.
  • Expiry-based invalidation: We set a Redis TTL that matches the URL's expiry. When it expires, Redis automatically removes it.
  • No complex invalidation needed: URL mappings are immutable — once created, a short code always points to the same long URL. We never need to update a cache entry (only delete expired ones).

Eviction Policy

We use LRU (Least Recently Used) eviction in Redis. This works perfectly for a URL shortener because popular links (which are accessed frequently) stay in cache, while old, rarely-clicked links get evicted. If they are ever clicked again, we simply reload from the database.
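The LRU behavior is easy to demonstrate with an OrderedDict (Redis actually uses an approximated LRU internally; this in-process analogue is a sketch of the eviction semantics, not of Redis itself):

```python
from collections import OrderedDict

class LruCache:
    """Minimal LRU cache: recently used keys survive, cold keys are evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

A popular short code that is read on every request keeps moving to the "recent" end and is never evicted, which is exactly the property we want for hot links.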

10. Storage, Indexing, and Media

Primary Data Storage

PostgreSQL is the primary store. At ~22 GB/year, a single database can hold many years of data. We run a primary for writes and one or more read replicas for the redirect read path (though most reads are served from Redis).

Indexes

  • Unique index on short_code — the most critical index. Powers the redirect lookup and enforces uniqueness.
  • Index on expires_at — supports the background cleanup job that deletes expired URLs in batch.
  • Optional hash index on long_url — only needed if we decide to de-duplicate (return the same short code for the same long URL).

Media Storage

A URL shortener has no user-uploaded media. However, if we add link preview thumbnails (Open Graph images), we would store them in object storage (e.g. S3) and serve them through a CDN. This is a future optimization, not part of the core design.

CDN Consideration

For the redirect itself, a CDN is useful only if we use 301 (permanent) redirects, because CDN edge nodes can cache the redirect response. With 302 (temporary) redirects, every request must reach our servers (which is what we want for analytics). If we ever switch to 301 for links that do not need analytics, a pull-based CDN can absorb massive traffic at the edge.

Trade-offs

  • Cost: Storage costs are negligible (< $5/month for the first year on managed Postgres). Redis is the bigger cost item, but the dataset is small enough to fit in a few GB of memory.
  • Write load: Extremely low (~12 QPS peak). Not a concern.
  • Read latency: Sub-millisecond from Redis, ~2–5 ms from Postgres index lookup. Excellent for our 50 ms target.

11. Scaling Strategies

Version 1: Simple Setup

For the first deployment serving thousands of users:

  • A single Postgres instance with the urls table.
  • A single Redis instance for caching.
  • Two app server instances behind a load balancer for redundancy.
  • A simple auto-increment ID for short code generation.

This handles the entire workload for a small-to-medium service. It is simple to operate and debug.

Growing the System

Database replication: Add one or two read replicas to Postgres. The redirect service can read from replicas (the mapping is immutable once written, so replication lag is a non-issue). Writes still go to the primary.

Sharding (if needed at very large scale): Shard the urls table by short_code. Since redirects always look up by short code, the routing is simple: hash the short code to pick the shard. A small metadata service or consistent hashing ring maps codes to shards.
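Routing by short code can be as simple as a stable hash modulo the shard count; consistent hashing is the upgrade path if shards are added and removed often. A sketch (crc32 is chosen here because it is deterministic across processes, unlike Python's salted built-in hash()):

```python
import zlib

def shard_for(short_code, num_shards):
    """Deterministically pick a shard for a short code."""
    return zlib.crc32(short_code.encode("utf-8")) % num_shards
```

Because the same code always hashes to the same shard, both the create path and the redirect path route to the same place with no metadata lookup.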

ID generation at scale: Switch from auto-increment to the pre-generated range approach. Each app server grabs a range of 10,000 IDs from a coordination service. This eliminates a central bottleneck.

Separating read and write paths: The redirect service and URL creation service are already separate. We can scale them independently — many redirect instances, fewer creation instances (since writes are 100x less frequent).

Handling Bursts

  • A message queue between the redirect service and analytics workers absorbs spikes in click traffic. If a viral link generates millions of clicks per minute, the queue buffers events and workers process them at a steady pace.
  • If the Redis cache becomes a bottleneck, we can use Redis Cluster to shard the cache across multiple nodes. But given our data size, a single Redis instance with 16 GB of memory can hold hundreds of millions of URL mappings.

12. Reliability, Failure Handling, and Backpressure

Removing Single Points of Failure

  • App servers: Run at least 2 instances of each service behind the load balancer. The LB health-checks and removes unhealthy instances.
  • Database: PostgreSQL with synchronous replication to a standby in a different availability zone. Automatic failover via tools like Patroni.
  • Redis: Use Redis Sentinel or Redis Cluster for automatic failover. If Redis is entirely down, the redirect service falls back to the database (slower but still functional).
  • Load balancer: Use a managed cloud LB (e.g. AWS ALB) which is inherently redundant across AZs.

Timeouts, Retries, and Idempotency

  • Timeouts: The redirect service sets a 50 ms timeout on Redis reads and a 200 ms timeout on database reads. If Redis times out, fall back to DB immediately.
  • Retries with backoff: The URL creation service retries failed DB inserts up to 3 times with exponential backoff (100 ms, 200 ms, 400 ms).
  • Idempotency: URL creation can be made idempotent by using the client-provided custom alias or by hashing the long URL as a deduplication key. This prevents duplicate URLs if the client retries a timed-out request.
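A sketch of the hash-based deduplication key, with an in-memory dict standing in for the dedup table (the generate_code callable and store are illustrative):

```python
import hashlib

# Stand-in for a dedup table: hash of long URL -> previously issued short code.
dedup_index = {}

def create_idempotent(long_url, generate_code):
    """Return the existing short code for this long URL if one exists, so a
    client retry after a timed-out request does not mint a duplicate."""
    key = hashlib.sha256(long_url.encode("utf-8")).hexdigest()
    if key in dedup_index:
        return dedup_index[key]
    code = generate_code()
    dedup_index[key] = code
    return code
```

Note this changes the must-have behavior slightly: with a dedup key, shortening the same long URL twice returns the same code, trading the "simpler design" of Section 2 for retry safety.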

Circuit Breakers

If the database starts responding slowly, a circuit breaker pattern prevents the app from piling up connections and making things worse. After a threshold of failures, the breaker opens and the service returns errors immediately for a short cooldown period, then retries.
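A minimal circuit-breaker sketch (the threshold, cooldown, and injectable clock are illustrative choices, not a production implementation):

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; fail fast until `cooldown`
    seconds have passed, then allow a trial call (half-open)."""
    def __init__(self, threshold=5, cooldown=10.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True                          # closed: normal operation
        if self.clock() - self.opened_at >= self.cooldown:
            return True                          # half-open: one trial call
        return False                             # open: fail fast

    def record_success(self):
        self.failures = 0
        self.opened_at = None                    # close the breaker

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()        # trip the breaker
```

Failing fast while open means slow database calls stop consuming connections and threads, which is what prevents the pile-up described above.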

Behavior Under Overload

  • Rate limiting: The API gateway enforces per-IP rate limits on URL creation (e.g. 100 URLs/hour). This prevents abuse and protects the write path.
  • Shedding analytics: Under extreme load, the redirect service can stop publishing analytics events. Redirects continue to work perfectly — only click counting degrades.
  • Priority: The redirect path (the user-facing read) is always the highest priority. If we must shed work, we shed writes (URL creation) and analytics first.
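The per-IP creation limit can be sketched as a fixed-window counter (production gateways more often use a token bucket or sliding window backed by Redis; the limit, window, and injectable clock here are illustrative):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` requests per key within each `window` seconds."""
    def __init__(self, limit, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._counts = defaultdict(int)   # (key, window_index) -> count

    def allow(self, key):
        bucket = (key, int(self.clock() // self.window))
        if self._counts[bucket] >= self.limit:
            return False                  # over the limit: reject (HTTP 429)
        self._counts[bucket] += 1
        return True
```

A fixed window is the simplest variant; its known weakness is a burst straddling the window boundary, which a sliding window or token bucket smooths out.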

13. Security, Privacy, and Abuse

Authentication and Authorization

  • The redirect endpoint (GET /{short_code}) requires no authentication — it must be open and fast.
  • The creation endpoint (POST /api/v1/urls) can optionally require an API key for higher rate limits and link management features.
  • The stats endpoint should require the creator's API key so only the link owner can view analytics.

Encryption

  • In transit: All traffic uses HTTPS (TLS 1.3). TLS is terminated at the load balancer.
  • At rest: The database disk is encrypted using the cloud provider's managed encryption (e.g. AWS RDS encryption).

Handling Sensitive Data

  • We store creator IP addresses for abuse detection but should hash or anonymize them after 30 days.
  • Long URLs may contain sensitive tokens or session IDs. We should warn creators about this in documentation, but we do not inspect or modify the URL content.

Abuse Protection

  • Rate limiting: Per-IP limits on creation to prevent spamming millions of links.
  • Malicious URL detection: Before creating a short link, check the long URL against a blocklist of known phishing/malware domains (e.g. Google Safe Browsing API). Block or flag suspicious URLs.
  • Spam links: Monitor for patterns like many links created from the same IP to different suspicious domains.
  • Short code enumeration: An attacker could try to visit every possible short code to discover private links. Mitigation: use random base62 codes (not sequential), which makes enumeration infeasible across 3.5 trillion codes.
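Random, non-sequential codes can be generated from a cryptographically secure source, which is what makes enumeration infeasible (the database's unique index still guards against the rare collision on insert):

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters   # 62 characters (base62)

def random_short_code(length=7):
    """Generate an unpredictable base62 short code."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

With 3.5 trillion possible 7-character codes and only tens of millions in use, a random guess hits a real link far less than once in ten thousand tries, and rate limiting makes even that impractical.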

14. Bottlenecks and Next Steps

Main Bottlenecks and Risks

  • Hot links: A single viral URL (e.g. shared by a celebrity) could generate millions of redirects per minute. Mitigation: Redis handles this well since it is an in-memory lookup, but we can add a local in-process cache (e.g. a small LRU map on each app server) for the top 1,000 hottest codes to avoid even the Redis network round-trip.

  • ID generation coordination: The range-based ID generator requires a coordination service. If that service goes down, new URL creation stops. Mitigation: Each app server pre-fetches large ranges (e.g. 100,000 IDs) and can operate for hours without the coordinator.

  • Analytics under viral load: A viral link could flood the message queue with millions of click events. Mitigation: Workers can sample events (e.g. count 1 in every 10 events and multiply by 10) or use probabilistic counters.

  • Database as eventual bottleneck: While the database is far from a bottleneck today, in a 10x growth scenario it could become one. Mitigation: Shard by short_code hash; or move to a key-value store (DynamoDB) which scales horizontally with no manual sharding.

Design Summary

| Aspect | Decision | Key Trade-off |
| --- | --- | --- |
| Short code generation | Pre-generated ranges converted to base62 | Fast and collision-free, but needs a coordination service |
| Redirect speed | Redis cache in front of PostgreSQL | Sub-ms reads; graceful fallback if cache is down |
| Analytics | Async via message queue and batch workers | Decoupled from hot path; count may lag a few seconds |
| Consistency | Strong for URL mapping; eventual for analytics | Right consistency level for each use case |
| Scaling path | Start simple (1 DB, 1 Redis); add replicas, then shard | Avoids premature complexity; clear growth roadmap |

This design prioritizes the redirect path above everything else — it is the operation that happens 100x more often and that users experience directly. By keeping the redirect service stateless and cache-first, using async analytics, and starting with a simple architecture that can grow, we build a system that is fast, reliable, and straightforward to operate.