Chapter 8: System Design Interview Mastery

8.3 Design a paste service (Pastebin)

1. Restate the Problem and Pick the Scope

We are designing a web-based paste service — similar to Pastebin — where users can upload plain text or code snippets, receive a unique short URL, and share it with others. Anyone with the link can view the paste content in a browser.

Main user groups and actions:

  • Paste creators — write or paste text into a form, hit submit, and get a shareable link back.
  • Paste readers — click a link and read the paste content, optionally with syntax highlighting.
  • Paste owners — (nice to have) manage their pastes, set expiry, or delete them.

Scope decisions:

  • We will focus on the core create-read loop: creating a paste, reading a paste, and supporting paste expiry.
  • We will NOT cover: user accounts/authentication, rich text editing, real-time collaborative editing, paid tiers, or a full admin dashboard. These can be layered on later.

2. Clarify Functional Requirements

Must-Have Features

  • A user can create a paste by submitting plain text content (up to a reasonable size limit, e.g. 10 MB).
  • The system returns a unique, short URL for each paste (e.g. paste.io/p/aBc12Xy).
  • Anyone with the short URL can view the paste content in a browser.
  • Pastes have a configurable expiration time (e.g. 10 minutes, 1 hour, 1 day, 1 week, never). Default: 30 days.
  • Expired pastes are automatically deleted; requests for them return 410 Gone.
  • Users can optionally specify a syntax-highlighting language (e.g. Python, JSON, SQL) when creating a paste.
  • The system provides a "raw" endpoint that returns the paste as plain text (no HTML wrapping), useful for curl and scripts.

Nice-to-Have Features

  • Users can set a paste as "unlisted" (accessible only via the link, not indexed on any public page).
  • Users can supply a custom alias for the paste URL (e.g. paste.io/my-config).
  • Basic analytics: view count per paste.

3. Clarify Non-Functional Requirements

  • Monthly active users (MAU) — ~5 million readers; ~500K creators
  • Daily active users (DAU) — ~500K readers; ~50K creators
  • Read:Write ratio — ~5:1; read-heavy, but the gap is smaller than for a URL shortener because pastes are often created for and shared with a small audience
  • Average paste size — ~10 KB (most pastes are short code snippets or config files)
  • Max paste size — 10 MB
  • Read latency — < 100 ms p99; users expect the content to appear quickly
  • Write latency — < 300 ms p99; acceptable for a submission flow
  • Availability — 99.9% (three nines); pastes are useful but not life-critical
  • Consistency — strong consistency for paste content (once created, the link must work immediately); eventual consistency is fine for view counts
  • Data retention — configurable per paste; default 30 days; "never expire" pastes are kept indefinitely

4. Back-of-the-Envelope Estimates

Write QPS (paste creation)

50K creators/day. Assume each creates ~2 pastes per day on average.

Writes/day = 50,000 × 2 = 100,000
Write QPS = 100,000 / 86,400 ≈ 1.2 QPS (average)
Peak write QPS = 1.2 × 5 ≈ 6 QPS

Read QPS (paste views)

Read:Write ratio is 5:1.

Read QPS (avg) = 1.2 × 5 = 6 QPS
Peak read QPS = 6 × 5 ≈ 30 QPS

These numbers are modest. Even a single well-configured server can handle this. But we design for redundancy and growth.

Storage

Average paste is 10 KB. Using the write estimate above:

New data/day = 100,000 pastes × 10 KB = ~1 GB/day
New data/year = 1 GB × 365 = ~365 GB/year

With a 30-day default expiry, much of this data gets cleaned up. Active storage at any point might be much smaller — perhaps 30–50 GB of live paste content. But if many pastes are set to "never expire," storage grows to hundreds of GB over years.

For the paste metadata (short code, expiry, language tag, timestamps): ~200 bytes each.

Metadata/year = 100,000/day × 365 × 200 bytes ≈ 7.3 GB/year

Metadata is tiny. The paste content itself is the bulk of storage.

Large pastes (up to 10 MB)

If 1% of pastes are large (say 1 MB average among that 1%):

Large paste data/day = 1,000 × 1 MB = 1 GB/day extra

This doubles our daily storage but is still manageable. Paste content lives in object storage (like S3) rather than the database, so even these large pastes are cheap to store and serve.
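The arithmetic above is easy to sanity-check with a short script. A minimal sketch in Python; all inputs mirror the assumptions stated in this section:

```python
# Back-of-the-envelope estimates for the paste service.
# Inputs mirror the assumptions in the text above.

SECONDS_PER_DAY = 86_400

creators_per_day = 50_000
pastes_per_creator = 2
read_write_ratio = 5
peak_multiplier = 5
avg_paste_kb = 10
metadata_bytes = 200

writes_per_day = creators_per_day * pastes_per_creator           # 100,000
write_qps = writes_per_day / SECONDS_PER_DAY                     # ~1.2
peak_write_qps = write_qps * peak_multiplier                     # ~6
read_qps = write_qps * read_write_ratio                          # ~6
peak_read_qps = read_qps * peak_multiplier                       # ~30

new_data_gb_per_day = writes_per_day * avg_paste_kb / 1_000_000  # ~1 GB
metadata_gb_per_year = writes_per_day * 365 * metadata_bytes / 1e9  # ~7.3 GB

print(f"write QPS ~{write_qps:.1f}, peak ~{peak_write_qps:.0f}")
print(f"read QPS ~{read_qps:.1f}, peak ~{peak_read_qps:.0f}")
print(f"content ~{new_data_gb_per_day:.1f} GB/day, "
      f"metadata ~{metadata_gb_per_year:.1f} GB/year")
```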

5. API Design

We expose a simple REST API. Four endpoints cover all must-have features.

5.1 Create Paste

  • Method & path — POST /api/v1/pastes
  • Request body — { "content": "print('hello')", "language": "python", "expires_in": "1d", "custom_alias": "my-config" } (all fields except content are optional)
  • Success response (201 Created) — { "paste_url": "https://paste.io/p/aBc12Xy", "paste_key": "aBc12Xy", "expires_at": "2026-05-06T00:00:00Z", "raw_url": "https://paste.io/raw/aBc12Xy", "delete_token": "..." }
  • Error codes — 400: content empty or exceeds 10 MB; 409: custom alias taken; 429: rate limit exceeded

5.2 Read Paste (HTML view)

  • Method & path — GET /p/{paste_key}
  • Success response (200 OK) — HTML page rendering the paste content with syntax highlighting
  • Error codes — 404: paste not found; 410: paste expired

5.3 Read Paste (raw text)

  • Method & path — GET /raw/{paste_key}
  • Success response (200 OK) — plain text body with Content-Type: text/plain
  • Error codes — 404: paste not found; 410: paste expired

5.4 Delete Paste

  • Method & path — DELETE /api/v1/pastes/{paste_key}
  • Auth — requires the delete_token returned at creation time
  • Success response — 204 No Content
  • Error codes — 403: invalid token; 404: not found

At creation time, we return a delete_token in the response so the creator can delete their paste later without needing a user account.
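The create endpoint's validation rules can be sketched as a small function. This is illustrative rather than the service's actual code; the set of allowed expiry values and the alias character set are assumptions:

```python
import re

MAX_CONTENT_BYTES = 10 * 1024 * 1024              # 10 MB hard cap
ALLOWED_EXPIRY = {"10m", "1h", "1d", "1w", "30d", "never"}  # assumed options
ALIAS_RE = re.compile(r"^[A-Za-z0-9_-]{3,32}$")   # assumed alias charset

def validate_create_request(body: dict) -> tuple[int, str]:
    """Return (http_status, message); a 0 status means the request is valid."""
    content = body.get("content", "")
    if not content:
        return 400, "content is empty"
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        return 400, "content exceeds 10 MB"
    expiry = body.get("expires_in")
    if expiry is not None and expiry not in ALLOWED_EXPIRY:
        return 400, "invalid expires_in value"
    alias = body.get("custom_alias")
    if alias is not None and not ALIAS_RE.match(alias):
        return 400, "invalid custom alias"
    return 0, "ok"
```

A valid body such as `{"content": "x", "expires_in": "1d"}` passes; an empty content or an unknown expiry value is rejected with 400 before anything is written.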

6. High-Level Architecture

Component Responsibilities

  • Load Balancer — distributes requests across service instances; terminates TLS; routes /api/v1/pastes (writes) and /p/, /raw/ (reads) to the appropriate services.
  • Paste Write Service — validates input, generates a unique paste key, writes metadata to PostgreSQL, and uploads the paste content to object storage.
  • Paste Read Service — the hot read path. Checks the Redis cache first; on a miss, reads metadata from the database and fetches content from object storage. Returns HTML or raw text.
  • Metadata Database (PostgreSQL) — stores paste metadata: key, expiry, language, timestamps, delete token. Small rows, fast lookups.
  • Content Store (Object Storage) — stores the actual paste content as objects keyed by paste_key. Handles content from a few bytes to 10 MB. Much cheaper and more scalable than storing blobs in the database.
  • Redis Cache — caches hot paste content and metadata for popular or recently created pastes. Reduces load on object storage.
  • Cleanup Worker — a background cron-like job that periodically scans for expired pastes, deletes their metadata from PostgreSQL, and removes their content from object storage.

7. Data Model

Database Choice

We use PostgreSQL for metadata and object storage (S3) for paste content. This separation is important:

  • Metadata is small, structured, and benefits from SQL indexes (lookups by paste_key, scans by expiry time).
  • Paste content is variable-size (bytes to 10 MB), immutable once written, and a perfect fit for object storage, which is cheap and scales essentially without limit.

Storing large blobs in PostgreSQL would bloat the database, slow down backups, and waste expensive database storage.

Main Table: pastes

  • id — BIGINT, primary key; auto-increment internal ID
  • paste_key — VARCHAR(10), UNIQUE index; the short code used in URLs
  • content_path — VARCHAR(255); object storage key (e.g. pastes/aBc12Xy)
  • language — VARCHAR(30), nullable; syntax-highlighting hint
  • size_bytes — INTEGER; content size, for display and limit enforcement
  • created_at — TIMESTAMP; when the paste was created
  • expires_at — TIMESTAMP; NULL means never expires
  • delete_token — VARCHAR(64); hashed token for creator deletion
  • view_count — BIGINT, default 0; denormalized counter, updated asynchronously

Indexes

  • Unique index on paste_key — powers the read lookup. Single B-tree scan per request.
  • Index on expires_at — the cleanup worker queries WHERE expires_at < NOW() to find and delete expired pastes in batch.
  • No index on content — content lives in object storage, not in the database.

How the Model Supports Core Queries

  • Read paste: SELECT paste_key, content_path, language, expires_at FROM pastes WHERE paste_key = ? — fast unique index lookup. Then fetch content from content_path in object storage (or cache).
  • Create paste: INSERT INTO pastes (paste_key, content_path, ...) VALUES (...) — unique index prevents collisions.
  • Cleanup expired: DELETE FROM pastes WHERE expires_at < NOW() LIMIT 1000 — uses the expires_at index, batched to avoid long locks.
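The schema and its core queries can be exercised end to end with an in-memory SQLite stand-in for PostgreSQL (SQLite types differ slightly from the Postgres ones above, but the indexes and query shapes are the same):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE pastes (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    paste_key    TEXT NOT NULL UNIQUE,   -- short code used in URLs
    content_path TEXT NOT NULL,          -- object storage key
    language     TEXT,                   -- syntax-highlighting hint
    size_bytes   INTEGER NOT NULL,
    created_at   TEXT NOT NULL,
    expires_at   TEXT,                   -- NULL means never expires
    delete_token TEXT NOT NULL,
    view_count   INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_pastes_expires_at ON pastes (expires_at);
""")

# Create paste: single-row insert; the UNIQUE constraint rejects collisions.
db.execute(
    "INSERT INTO pastes (paste_key, content_path, language, size_bytes,"
    " created_at, expires_at, delete_token)"
    " VALUES (?, ?, ?, ?, datetime('now'), datetime('now', '+30 days'), ?)",
    ("aBc12Xy", "pastes/aBc12Xy", "python", 42, "hashed-token"),
)

# Read paste: the hot unique-index lookup.
row = db.execute(
    "SELECT content_path, language, expires_at FROM pastes WHERE paste_key = ?",
    ("aBc12Xy",),
).fetchone()

# Cleanup expired, batched (subquery stands in for Postgres's DELETE ... LIMIT).
db.execute(
    "DELETE FROM pastes WHERE paste_key IN ("
    " SELECT paste_key FROM pastes WHERE expires_at < datetime('now') LIMIT 1000)"
)
```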

Content in Object Storage

Paste content is stored as plain-text objects:

Bucket: paste-content
Key: pastes/{paste_key} e.g. pastes/aBc12Xy
Body: The raw paste text

This is simple and scales to billions of pastes without any database pressure.


8. Core Flows — End to End

This section walks through the three most critical operations in detail.

Flow 1: Create a Paste

This is what happens when a user writes some text and clicks "Create Paste."

  • Step 1 — Client sends the request. The user's browser (or curl command) sends a POST to /api/v1/pastes with the paste content, optional language tag, and optional expiry in the JSON body. This request arrives at the load balancer.

  • Step 2 — Load balancer routes to a Paste Write Service instance. The LB picks a healthy instance using round-robin. TLS is terminated at the LB.

  • Step 3 — Validate the input. The service checks that: the content is not empty, the content does not exceed 10 MB, the expiry value (if provided) is one of the allowed options, and the optional custom alias is available. If any check fails, the service returns 400 (or 409 for a taken alias) immediately.

  • Step 4 — Generate a unique paste key. The service produces a random 7-character base62 string (a-z, A-Z, 0-9). With 62^7 = 3.5 trillion possible keys, collisions are astronomically rare. As a safety net, the unique index on paste_key catches any collision, and the service retries once with a new random key. (If the user provided a custom alias, we use that instead and rely on the unique index to enforce uniqueness.)

  • Step 5 — Upload content to object storage. The service writes the raw paste text to the content store at key pastes/{paste_key}. Object storage is designed for this: it accepts the bytes, stores them durably across multiple replicas, and returns success. For a typical 10 KB paste, this takes 10–30 ms. For a 10 MB paste, it may take 100–200 ms.

  • Step 6 — Write metadata to PostgreSQL. After the content is safely stored, the service inserts a row into the pastes table with the paste key, content path, language, timestamps, expiry, and a hashed delete token. This is a single-row insert on a small table — very fast (~5 ms).

    Why content first, then metadata? If the metadata insert fails after the content upload, we have an orphan object in storage (harmless — it gets cleaned up). But if we wrote metadata first and the content upload failed, users would get a link that points to nothing (broken experience). Content-first is the safer order.

  • Step 7 — Optionally warm the cache. The service writes the paste content and metadata into Redis with a TTL. This means the first person who clicks the link (often the creator themselves, immediately) gets a cache hit.

  • Step 8 — Return the response. The service responds with HTTP 201 and a JSON body containing the paste URL (https://paste.io/p/aBc12Xy), the raw URL, the expiry timestamp, and the delete token. The user sees their shareable link in the browser within 100–300 ms.
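Step 4 is only a few lines of code. A sketch using Python's secrets module; `create_with_retry` and its `insert_fn` callback are illustrative stand-ins for the database insert and its unique-constraint error:

```python
import secrets
import string

BASE62 = string.ascii_lowercase + string.ascii_uppercase + string.digits
KEY_LENGTH = 7  # 62^7 ≈ 3.5 trillion possible keys

def generate_paste_key() -> str:
    """Random 7-character base62 key."""
    return "".join(secrets.choice(BASE62) for _ in range(KEY_LENGTH))

def create_with_retry(insert_fn, max_attempts: int = 2) -> str:
    """insert_fn(key) raises KeyError on a collision (a stand-in for the
    database's unique-constraint violation). Retry with a fresh key."""
    for _ in range(max_attempts):
        key = generate_paste_key()
        try:
            insert_fn(key)
            return key
        except KeyError:
            continue  # astronomically rare; try a new random key
    raise RuntimeError("could not generate a unique paste key")
```

Using cryptographically random keys (rather than `random`) also makes unlisted paste URLs hard to guess.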

Flow 2: Read a Paste (the Hot Path)

This is the most frequent operation. Someone clicks a paste link and wants to see the content.

  • Step 1 — User clicks or enters the paste URL. Their browser sends a GET request to https://paste.io/p/aBc12Xy. DNS resolves to our load balancer.

  • Step 2 — Load balancer routes to a Paste Read Service instance. This service is optimized for speed. We run multiple instances and can scale them independently from the write service.

  • Step 3 — Check the Redis cache. The service looks up paste:{paste_key} in Redis. The cached value contains both the metadata (language, expiry) and the content itself (for pastes under a certain size, e.g. 512 KB). For most pastes (average 10 KB), the entire paste fits in the cache.

    • Cache hit: Redis returns the data in under 1 ms. Skip to Step 6.
    • Cache miss: Proceed to Step 4.
  • Step 4 — Read metadata from PostgreSQL. The service queries SELECT content_path, language, expires_at FROM pastes WHERE paste_key = ?. The unique index makes this a single B-tree lookup (~2–5 ms). If no row is found, return 404. If expires_at is in the past, return 410 Gone.

  • Step 5 — Fetch content from object storage. Using the content_path from the metadata, the service downloads the paste content from S3/object storage. For a 10 KB paste, this takes 10–30 ms. For a 10 MB paste, it takes longer, but these are rare. After fetching, the service writes the result to Redis (with a TTL of 1 hour or until expiry, whichever is shorter) so future reads are cache hits.

  • Step 6 — Render and return the response.

    • For the HTML endpoint (/p/{key}): the service wraps the content in an HTML template with syntax highlighting (applied server-side using the language tag, or client-side via JavaScript).
    • For the raw endpoint (/raw/{key}): the service returns the plain text with Content-Type: text/plain.
  • The user sees the paste content in their browser. Total time: 5–50 ms for cache hits, 30–100 ms for cache misses.

  • Step 7 — Fire a view-count event (async). In the background, the service increments a counter. This can be done by publishing a lightweight event to a small in-memory buffer or a message queue, which a worker batches into UPDATE pastes SET view_count = view_count + N WHERE paste_key = ?. This never blocks the read response.

Flow 3: Cleanup Expired Pastes (Background)

This flow runs continuously in the background to keep the system clean and storage costs under control.

  • Step 1 — Cleanup worker wakes up on a schedule. A cron job or a background worker runs every 5 minutes (or similar interval).

  • Step 2 — Query for expired pastes in batch. The worker runs: SELECT paste_key, content_path FROM pastes WHERE expires_at < NOW() LIMIT 1000. The index on expires_at makes this efficient. It processes in batches of 1,000 to avoid long-running queries.

  • Step 3 — Delete content from object storage. For each expired paste, the worker sends a delete request to the content store. Object storage supports bulk deletes, so the worker can issue a single "delete objects" call for up to 1,000 keys at once.

  • Step 4 — Delete metadata from PostgreSQL. After the content is deleted, the worker removes the rows from the pastes table: DELETE FROM pastes WHERE paste_key IN (...).

    Why content first, then metadata? Same principle as creation: if the metadata delete fails, we retry on the next run. The paste might be briefly visible after expiry, but the content is already gone. If we deleted metadata first, orphan objects would pile up in storage with no record to track them.

  • Step 5 — Invalidate cache entries. The worker also issues Redis DEL commands for the deleted paste keys. This ensures that no stale cache entry serves an expired paste.

  • What the user sees: If someone visits an expired paste link, they get a 410 Gone response. The cleanup may lag by a few minutes, so the read service also checks expires_at on every request and returns 410 even before the cleanup worker catches up.
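One cleanup pass can be sketched the same way, with dicts standing in for the three stores; note the content-first delete order described above:

```python
import time

def cleanup_expired(metadata: dict, content_store: dict, cache: dict,
                    batch_size: int = 1000) -> int:
    """One cleanup pass: find expired pastes, delete content first,
    then metadata, then invalidate cache entries. Returns count deleted."""
    now = time.time()
    expired = [k for k, m in metadata.items()
               if m["expires_at"] is not None and m["expires_at"] < now]
    batch = expired[:batch_size]                # Step 2: batched query
    for key in batch:                           # Step 3: content first
        content_store.pop(metadata[key]["content_path"], None)
    for key in batch:
        del metadata[key]                       # Step 4: metadata rows
        cache.pop(f"paste:{key}", None)         # Step 5: Redis DEL
    return len(batch)
```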

9. Caching and Read Performance

What We Cache

  • Primary cache: paste_key → {content, language, expires_at}. For pastes under 512 KB (the vast majority), we cache the full content. For larger pastes, we cache only the metadata and let the read service fetch content from object storage.
  • We do NOT cache view counts — they change constantly and are not latency-sensitive.

Where the Cache Sits

Redis sits between the Paste Read Service and the backend stores (PostgreSQL + object storage). The read service always checks Redis first.

Cache Key and Value Shape

Key: paste:{paste_key} e.g. paste:aBc12Xy
Value: { content: "...", language: "python", expires_at: 1746489600 }
TTL: Min(time_until_expiry, 1 hour)

The 1-hour cap prevents "never expires" pastes from living in the cache forever — even popular pastes get refreshed periodically.

Cache Update and Invalidation

  • Write-through on create: When a new paste is created, we write to Redis immediately after storing the content. The creator's first click is a cache hit.
  • Lazy loading on miss: If a read hits a cache miss, we fetch from the DB + object storage and populate the cache.
  • Expiry-based invalidation: Redis TTL removes entries naturally. The cleanup worker also explicitly deletes expired entries.
  • No update needed for content: Paste content is immutable — once created, it never changes. This is the ideal scenario for caching because we never worry about stale data.

Eviction Policy

We use LRU (Least Recently Used) eviction in Redis. Most pastes are short-lived (30-day default) and accessed only a few times. LRU keeps frequently-accessed pastes warm and silently evicts old, rarely-accessed ones. On a miss, we simply reload from storage.

10. Storage, Indexing, and Media

Primary Data Storage

  • Metadata: PostgreSQL. Small rows (~200 bytes each), fast indexed lookups. At ~7 GB/year of metadata, a single Postgres instance handles this for years.
  • Content: Object storage (e.g. AWS S3, MinIO). Paste content ranges from a few bytes to 10 MB. Object storage is cheap ($0.023/GB/month on S3), durable (11 nines), and scales infinitely without any management.

Why Not Store Content in the Database?

Storing variable-length text blobs (up to 10 MB) in PostgreSQL would bloat the tables, slow down pg_dump backups, increase replication lag, and waste expensive database I/O on bulk data reads. Object storage is purpose-built for this.

Indexes

  • Unique index on paste_key — the most critical index. Every read starts here.
  • Index on expires_at — powers the cleanup worker's batch queries.
  • No full-text index on content is needed (we do not support content search).

CDN Consideration

A CDN (e.g. CloudFront) can be placed in front of the read path. Since paste content is immutable, CDN caching works perfectly:

  • Pull-based CDN: When a paste is first requested through the CDN, it pulls from our origin, caches the response, and serves subsequent requests from the edge. TTL matches the paste's remaining expiry.
  • This is most beneficial for viral pastes that get thousands of views. For typical pastes viewed by a handful of people, the CDN adds little value but no harm.

Trade-offs

  • Cost: Object storage is extremely cheap. The entire year's paste content (~365 GB) costs about $8/month on S3. Redis (a few GB for hot cache) is the most expensive per-GB component but we only cache hot data.
  • Write load: Negligible (~6 QPS peak). Not a concern.
  • Read latency: Sub-millisecond from Redis, ~10–30 ms from S3 for a small object. Well within our 100 ms target.

11. Scaling Strategies

Version 1: Simple Setup

For the first deployment serving a few thousand users:

  • A single PostgreSQL instance for metadata.
  • A single S3 bucket (or MinIO instance) for paste content.
  • A single Redis instance for caching.
  • Two app server instances (combined read/write) behind a load balancer.
  • A single cleanup worker on a cron schedule.

This handles the entire workload comfortably. It is simple to deploy, monitor, and debug.

Growing the System

Database replication: Add one or two PostgreSQL read replicas. The read service queries replicas for paste lookups. Since paste metadata is immutable once written, replication lag is a non-issue. Writes go to the primary.

Separating read and write services: At moderate scale, split the combined app server into a Paste Read Service and a Paste Write Service. The read service gets more instances (since reads are 5x more frequent), and the write service stays lean.

Sharding (only at very large scale): If we reach billions of pastes, we can shard the metadata database by paste_key hash. Each shard holds a range of paste keys. Routing is simple: hash the key, pick the shard. A consistent hashing ring makes this straightforward.
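The routing rule is a stable hash modulo the shard count. A sketch; MD5 is chosen here only because it is deterministic across processes, unlike Python's built-in hash():

```python
import hashlib

def pick_shard(paste_key: str, num_shards: int) -> int:
    """Stable shard index for a paste key: every service instance
    computes the same answer for the same key."""
    digest = hashlib.md5(paste_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

A plain modulo reshuffles most keys when `num_shards` changes, which is why the text suggests a consistent hashing ring once resharding becomes a real concern.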

Object storage scales automatically: S3 (and similar services) scale horizontally without any effort on our part. We never need to shard or partition the content store.

Handling Bursts

  • If a paste goes viral (e.g. linked from Hacker News), Redis and the CDN absorb the spike. Even millions of reads per hour for a single paste are fine — it is one cache entry being read repeatedly.
  • The write path is already low-traffic. If a burst of paste creation occurs (e.g. a CI pipeline creating thousands of pastes), the service handles it because each write is independent and fast (~50 ms to S3 + ~5 ms to Postgres).

12. Reliability, Failure Handling, and Backpressure

Removing Single Points of Failure

  • App servers: At least 2 instances of each service behind the load balancer. The LB health-checks and removes unhealthy instances.
  • Database: PostgreSQL with streaming replication to a standby in a different availability zone. Automatic failover using Patroni or managed service (e.g. AWS RDS Multi-AZ).
  • Object storage: S3 provides 99.999999999% durability by default. It replicates across multiple data centers. This is the most durable component in our stack.
  • Redis: Use Redis Sentinel for automatic failover. If Redis goes down entirely, the read service falls back to PostgreSQL + object storage (slower, but functional).
  • Load balancer: Use a managed cloud LB (e.g. AWS ALB) which is redundant across availability zones by default.

Timeouts, Retries, and Idempotency

  • Timeouts: The read service sets a 100 ms timeout on Redis, 500 ms on Postgres reads, and 2 seconds on S3 fetches. If any step times out, it moves to the next fallback or returns an error.
  • Retries with backoff: The write service retries S3 uploads up to 3 times with exponential backoff (200 ms, 400 ms, 800 ms). Paste creation is not strictly idempotent — a client retry with the same content creates a second paste with a different key — but a duplicate paste is harmless, so retries are safe in practice.
  • Idempotency for deletes: The DELETE endpoint uses the delete_token as an idempotency key. Deleting an already-deleted paste returns 404, which is safe to retry.
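The retry policy can be sketched as a small helper. The 200/400/800 ms schedule matches the text; the injectable `sleep` is a testing convenience, not part of the design:

```python
import time

def retry_with_backoff(op, attempts: int = 3, base_delay: float = 0.2,
                       sleep=time.sleep):
    """Run op() up to `attempts` times with exponential backoff
    (0.2 s, 0.4 s, 0.8 s). Re-raises the last error if all attempts fail."""
    for i in range(attempts):
        try:
            return op()
        except Exception:
            if i == attempts - 1:
                raise
            sleep(base_delay * (2 ** i))  # 0.2, 0.4, 0.8, ...
```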

Circuit Breakers

If object storage starts timing out (rare, but possible during S3 outages), a circuit breaker prevents the read service from piling up connections. The breaker opens after 10 failures in 30 seconds, and the service returns 503 for reads until S3 recovers. Cached pastes continue to be served from Redis.

Behavior Under Overload

  • Rate limiting: The API gateway enforces per-IP rate limits on paste creation (e.g. 60 pastes/hour). This prevents abuse and protects the write path.
  • Shedding view counts: Under extreme load, the read service can stop incrementing view counts. Reads continue to work perfectly — only analytics degrades.
  • Priority: The read path is highest priority. If we must shed work, we shed paste creation and analytics first. Reads are the user-facing experience.

13. Security, Privacy, and Abuse

Authentication and Authorization

  • The read and create endpoints require no authentication — this is a public paste service.
  • The delete endpoint requires the delete_token that was issued at creation time. This is a bearer-token model without user accounts.
  • If we add user accounts later, we use standard session tokens or JWTs and associate pastes with user IDs.

Encryption

  • In transit: All traffic over HTTPS (TLS 1.3). Terminated at the load balancer.
  • At rest: S3 server-side encryption (SSE-S3 or SSE-KMS) for paste content. PostgreSQL disk encryption via the managed service.

Handling Sensitive Data

  • Pastes may contain secrets (API keys, passwords, private code). We should warn users in the UI not to paste sensitive data.
  • We do NOT inspect paste content — we treat it as opaque text.
  • If compliance requires it, we can add a content scanning pipeline (e.g. for PII detection), but this is out of scope for the initial design.

Abuse Protection

  • Rate limiting: Per-IP limits on creation to prevent spam flooding (thousands of garbage pastes).
  • Content size limit: 10 MB hard cap prevents abuse of storage.
  • Malware/spam detection: Optionally scan paste content for known phishing patterns, malicious URLs, or illegal content. This can run asynchronously after creation — flag and hide suspicious pastes rather than blocking the create flow.
  • CAPTCHA on the web form: Prevents automated bots from mass-creating pastes through the browser UI. API consumers are rate-limited by IP.
  • Abuse reporting: A simple "Report this paste" button on the read page lets users flag problematic content for manual review.

14. Bottlenecks and Next Steps

Main Bottlenecks and Risks

  • Object storage latency for large pastes: A 10 MB paste takes 100–200 ms to fetch from S3. Mitigation: Cache popular large pastes in Redis (at the cost of memory), or use a CDN to cache them at the edge. For most pastes (10 KB), this is not an issue.

  • Cleanup worker falling behind: If millions of pastes expire around the same time (e.g. after a traffic spike 30 days ago), the cleanup worker may lag. Mitigation: Run multiple cleanup workers in parallel, each handling a partition of the expires_at range. Or use database-level TTL features if available.

  • Hot pastes (viral links): A single paste linked from a popular site could receive millions of reads. Mitigation: Redis and CDN handle this well. For extreme cases, add a local in-process cache on each app server (top 100 hottest keys) to avoid even the Redis network hop.

  • Paste key collisions: With random 7-character base62 keys, collisions are negligible at our scale. But at billions of pastes, collision probability increases. Mitigation: Increase key length to 8 or 9 characters. Or use a counter-based approach (like the URL shortener) for guaranteed uniqueness.

Design Summary

  • Content storage — object storage (S3), separate from metadata. Trade-off: cheap and scalable, but adds a network hop on reads (mitigated by the cache).
  • Metadata storage — PostgreSQL with a unique index on paste_key. Trade-off: ACID guarantees and simple operations; a single instance is enough for years.
  • Caching — Redis for hot pastes (content + metadata). Trade-off: sub-ms reads for popular pastes; graceful fallback on a miss.
  • Paste key generation — random base62, 7 characters. Trade-off: simple and stateless; collision-free at our scale.
  • Expiry cleanup — background worker scanning the expires_at index. Trade-off: decoupled from the read/write path; a paste may survive a few minutes past expiry.
  • Scaling path — start simple (1 DB, 1 Redis, 2 app servers); add replicas, a CDN, then shard. Trade-off: avoids premature complexity; clear growth roadmap.

This design prioritizes the read path — it is the most frequent operation and the one users experience directly. By separating paste content into object storage, caching aggressively in Redis, and keeping metadata lean in PostgreSQL, we build a system that is fast, cheap to operate, and straightforward to scale.

The immutability of paste content is our biggest advantage: it makes caching, CDN usage, and replication trivially simple since we never need to invalidate or update stored content.