## Forward Proxy vs. Reverse Proxy
A proxy is any server that sits between two parties and handles communication on behalf of one of them.
The word "proxy" literally means "acting on behalf of." In networking, there are two kinds, and they sit on opposite sides of the connection.
**Forward Proxy**
A forward proxy acts on behalf of the client.
The client sends its request to the proxy, and the proxy forwards it to the destination server.
The destination server sees the request as coming from the proxy, not from the original client.
The client's identity stays hidden.
Forward proxies are common in corporate networks. Your company might route all employee internet traffic through a forward proxy that blocks disallowed websites, logs browsing activity for compliance, and caches frequently accessed external resources to save bandwidth.
VPN services work on a similar principle, routing your traffic through an intermediary so the destination sees the VPN's IP address rather than yours.
As a system designer, you will rarely build forward proxies. But you should know they exist because they affect how your system receives traffic.
Some of your users will be behind corporate proxies that modify headers, strip cookies, or block certain request types.
Understanding that a forward proxy might sit between your user and your system helps you debug unexpected client behavior.
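To make the client side concrete, here is a minimal sketch of how a client routes its traffic through a forward proxy, using only Python's standard library. The proxy address `proxy.corp.example:3128` is a hypothetical placeholder for whatever your network administrator configures.

```python
import urllib.request

# Route all HTTP(S) requests through a (hypothetical) corporate forward
# proxy. The destination server will see the proxy's IP address, not
# this client's.
proxy = urllib.request.ProxyHandler({
    "http": "http://proxy.corp.example:3128",
    "https": "http://proxy.corp.example:3128",
})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)
# From here on, urllib.request.urlopen(...) goes via the proxy.
```

The key point is that the *client* opts in: a forward proxy is explicitly configured on the client side, which is exactly the opposite of a reverse proxy.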
**Reverse Proxy**
A reverse proxy acts on behalf of the server. The client sends a request to the reverse proxy, and the reverse proxy decides which backend server should handle it, forwards the request, receives the response, and sends it back to the client.
The client never communicates directly with your backend servers. It only knows about the reverse proxy.
This is the type of proxy you will work with constantly in system design. Every major web application puts a reverse proxy in front of its backend servers.
When you type a URL into your browser and get a response, you are almost certainly talking to a reverse proxy, not the application server itself.
The key difference is perspective.
A forward proxy hides the client from the server.
A reverse proxy hides the server from the client.
| Aspect | Forward Proxy | Reverse Proxy |
|---|---|---|
| Acts on behalf of | Client | Server |
| Hides | Client identity from the server | Server identity from the client |
| Deployed by | Client side (corporate network, VPN) | Server side (your infrastructure) |
| Client awareness | Client knows about the proxy | Client does not know about the proxy |
| Common uses | Privacy, content filtering, access control | Security, load balancing, SSL termination, caching |
_Forward vs. Reverse Proxy_
## Reverse Proxy Use Cases: Security, SSL Termination, Compression, Caching
A reverse proxy earns its place in your architecture by handling several critical responsibilities that your application servers should not be burdened with.
**Security**
Your backend servers contain your application logic, your database connections, and your business data. Exposing them directly to the public internet is risky.
A reverse proxy acts as a shield. Clients connect to the proxy, and the proxy connects to your backend over a private internal network. Your backend servers have no public IP addresses and are unreachable from the outside world.
This architecture lets you implement security controls at the proxy layer.
The reverse proxy can block requests from known malicious IP addresses, reject malformed requests before they reach your application, enforce rate limits to prevent abuse, and add security headers (like `Strict-Transport-Security`, `X-Content-Type-Options`, and `X-Frame-Options`) to every response without modifying your application code.
If an attacker tries to exploit a vulnerability, they hit the reverse proxy first. The proxy can absorb or filter many types of attacks, including basic DDoS floods, slowloris attacks (which try to hold connections open indefinitely), and oversized request payloads.
Your application servers sit safely behind the proxy, protected from direct exposure.
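As an illustration, here is a hedged sketch of what these controls can look like in an Nginx configuration. The zone name, limits, and backend address are hypothetical choices, and TLS certificate directives are omitted for brevity.

```nginx
# Rate-limit zone: track clients by IP, allow 10 requests per second.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    listen 443 ssl;

    # Reject oversized request bodies before they reach the backend.
    client_max_body_size 1m;

    # Security headers added to every response, no application changes needed.
    add_header Strict-Transport-Security "max-age=31536000" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;

    location / {
        limit_req zone=per_ip burst=20;      # absorb short spikes
        proxy_pass http://10.0.0.10:8080;    # private backend address
    }
}
```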
**SSL Termination**
SSL/TLS termination at the reverse proxy works identically to termination at a load balancer (covered in Part II, Lesson 4). The proxy handles the computationally expensive work of encrypting and decrypting HTTPS traffic.
Backend servers receive plain HTTP requests over the trusted internal network, freeing their CPU cycles for application logic.
Certificate management is centralized at the proxy.
When your SSL certificate expires, you renew it in one place rather than on every backend server.
When you need to add a new domain or switch to a wildcard certificate, the change happens at the proxy level.
For organizations that require end-to-end encryption (where traffic must be encrypted even on the internal network), the reverse proxy can re-encrypt traffic before forwarding it to the backend. This is called SSL bridging. It adds latency but satisfies strict compliance requirements.
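In Nginx terms, SSL termination looks roughly like the sketch below (certificate paths and the backend address are placeholders): the proxy holds the certificate and speaks plain HTTP to the backend.

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # The proxy owns the certificate; renewals happen here, in one place.
    ssl_certificate     /etc/nginx/certs/example.com.crt;
    ssl_certificate_key /etc/nginx/certs/example.com.key;

    location / {
        # The backend receives plain HTTP over the trusted internal network.
        # (For SSL bridging, proxy_pass https://... re-encrypts instead.)
        proxy_pass http://10.0.0.10:8080;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}
```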
**Compression**
The reverse proxy can compress responses before sending them to the client, reducing bandwidth usage and improving load times.
A JSON API response that is 50 KB uncompressed might shrink to 8 KB after gzip compression.
For users on slow mobile connections, that difference is significant.
Common compression algorithms include gzip (widely supported, good compression), Brotli (better compression ratios than gzip, supported by all modern browsers), and zstd (excellent compression speed, growing adoption).
The proxy negotiates with the client via the `Accept-Encoding` header to determine which algorithm to use.
Compression at the proxy layer means your application servers send uncompressed responses and the proxy handles compression transparently. This keeps application code simple and lets you change compression settings (algorithm, compression level, minimum response size for compression) without touching your application.
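The effect is easy to verify with Python's standard library. The payload below is synthetic, but its repetitive field names are typical of real JSON API responses, which is why they compress so well.

```python
import gzip
import json

# Build a synthetic API response: 500 records with repetitive field names.
records = [{"id": i, "status": "active", "plan": "enterprise",
            "region": "us-east-1"} for i in range(500)]
payload = json.dumps(records).encode("utf-8")

# Level 6 is a common default balance of speed and ratio.
compressed = gzip.compress(payload, compresslevel=6)
ratio = len(compressed) / len(payload)
print(f"uncompressed: {len(payload)} bytes")
print(f"gzip:         {len(compressed)} bytes ({ratio:.0%} of original)")
```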
**Caching**
A reverse proxy can cache responses and serve them directly without forwarding requests to your backend. This is the same concept as CDN caching (Part II, Lesson 5) but at the origin level rather than the edge level.
If 1,000 users request your homepage within a minute, the reverse proxy fetches it from your application server once, caches the response, and serves the remaining 999 requests from its local cache. Your application server processes one request instead of 1,000.
Reverse proxy caching works best for content that is identical across users: public pages, API responses that do not vary by user, and assets that are not yet on a CDN.
For personalized content, the proxy can cache the base response and use techniques like edge-side includes (ESI) to inject user-specific fragments, though this adds complexity.
Nginx and Varnish are particularly strong as caching reverse proxies.
Varnish was built specifically for HTTP caching and can handle extremely high cache hit rates with minimal hardware.
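In Nginx, that caching behavior is a few directives. The sketch below is illustrative, not prescriptive; the zone name, sizes, and 60-second TTL are arbitrary choices.

```nginx
# Cache storage: 10 MB of keys in shared memory, responses on disk.
proxy_cache_path /var/cache/nginx keys_zone=pages:10m max_size=1g;

server {
    listen 80;

    location / {
        proxy_cache pages;
        proxy_cache_valid 200 60s;      # cache successful responses for 60s
        # Serve stale content if the backend is down or being refreshed.
        proxy_cache_use_stale error timeout updating;
        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass http://10.0.0.10:8080;
    }
}
```

Even the short 60-second TTL shown here turns 1,000 requests per minute for the same page into a single backend request.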
**Interview-Style Question**
> Q: Your application servers are handling 10,000 requests per second, and CPU utilization is at 90%. You cannot add more application servers immediately. What can a reverse proxy do to reduce the load?
> A: Three immediate wins. First, enable caching at the reverse proxy for any responses that are identical across users (public pages, non-personalized API endpoints). Even caching responses for 5 seconds dramatically reduces the number of requests reaching your application servers during a traffic spike. Second, enable response compression so your servers spend less time on I/O transmitting large payloads. Third, move SSL termination to the reverse proxy if your application servers are currently handling TLS. Encryption is CPU-intensive, and offloading it frees significant processing capacity. Combined, these changes can cut application server load by 40% to 70% without adding a single server.
## Load Balancer vs. Reverse Proxy: When You Need Which
This is one of the most common points of confusion for engineers learning system design, and for good reason: load balancers and reverse proxies overlap significantly in functionality.
Nginx, HAProxy, and Envoy can all function as both.
So when do you need one versus the other?
**What They Share**
Both sit between clients and backend servers. Both hide backend servers from direct client access.
Both can perform health checks and route traffic away from unhealthy servers. Both can terminate SSL.
In many deployments, a single piece of software (like Nginx) serves as both the reverse proxy and the load balancer simultaneously.
**Where They Differ**
A load balancer's primary job is distributing traffic across multiple backend servers. Its value comes from its routing algorithms (round robin, least connections, consistent hashing) and its ability to scale a service horizontally. If you have only one backend server, a load balancer has nothing to balance.
A reverse proxy's primary job is sitting in front of your backend and handling concerns that the backend should not deal with: SSL termination, compression, caching, security headers, request filtering, and protocol translation.
A reverse proxy adds value even if you have only one backend server.
**When You Need Which**
If you have a single backend server, you still benefit from a reverse proxy. It handles SSL, caching, compression, and security. No load balancing is involved because there is only one destination.
If you have multiple backend servers, you need both load balancing and reverse proxy functionality. In practice, you deploy a single tool (like Nginx or an AWS ALB) that does both.
The load balancing distributes traffic, and the reverse proxy features handle SSL, caching, and the rest.
If you need content-aware routing (directing requests to different backend services based on URL path, headers, or cookies), you need a Layer 7 reverse proxy that understands HTTP.
A pure Layer 4 load balancer cannot make these decisions because it does not inspect the request content.
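A Layer 7 routing rule is easiest to see in Nginx terms (the service names below are hypothetical): the proxy inspects the HTTP path and chooses a backend accordingly.

```nginx
server {
    listen 80;

    # Content-aware routing: requests under /api/ go to one service,
    # everything else to another. A Layer 4 balancer cannot do this
    # because it never parses the HTTP request.
    location /api/ {
        proxy_pass http://api-service:8080;
    }
    location / {
        proxy_pass http://web-service:8080;
    }
}
```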
| Scenario | What You Need | Why |
|---|---|---|
| Single backend server | Reverse proxy only | SSL, caching, security, compression |
| Multiple identical servers | Load balancer + reverse proxy | Traffic distribution + SSL, caching, security |
| Multiple different services | Layer 7 reverse proxy with routing rules | Route /api to service A, /web to service B |
| Non-HTTP traffic (database, gaming) | Layer 4 load balancer | No need for HTTP inspection |
| Microservices architecture | API gateway (which is a specialized reverse proxy) | Routing, auth, rate limiting across many services |
The practical reality is that you rarely think of them as separate components. You deploy Nginx or an ALB, configure routing rules, enable SSL termination, set up caching where appropriate, and the single component handles all of these responsibilities.
The distinction matters conceptually for interviews and for understanding what each feature does, but operationally they collapse into one deployment.
## Popular Reverse Proxies: Nginx, HAProxy, Envoy, Traefik
Four tools dominate the reverse proxy landscape. Each has a different personality, and knowing their strengths helps you make the right choice for your architecture.
**Nginx**
Nginx is the most widely deployed reverse proxy and web server in the world. It powers roughly a third of all websites on the internet. Its event-driven, asynchronous architecture handles tens of thousands of concurrent connections with minimal memory consumption.
Nginx excels at static file serving, reverse proxying, SSL termination, caching, and basic load balancing. Its configuration is file-based and declarative, which makes it easy to version control and reproduce.
A basic reverse proxy configuration is a few lines long.
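For example, a minimal Nginx reverse proxy that also load-balances across two backends looks roughly like this (hostnames and addresses are placeholders):

```nginx
upstream app_servers {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}
```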
Nginx Plus (the commercial version) adds dynamic configuration, active health checks, session persistence, and a monitoring dashboard.
The open-source version requires a reload to pick up configuration changes. Reloads are graceful for ordinary requests, but old worker processes must drain, which can delay the changeover or cut short long-lived connections.
Nginx is the default choice for most web applications.
If you do not have a specific reason to choose something else, Nginx is almost always a safe and performant option.
**HAProxy**
HAProxy (High Availability Proxy) is focused specifically on load balancing and proxying.
Where Nginx is also a web server, HAProxy is purely a proxy. This focus makes it exceptionally good at its job.
HAProxy has a richer set of load balancing algorithms than Nginx out of the box, more sophisticated health checking, and better connection handling for high-throughput scenarios. It supports seamless reloads without dropping connections, which is a meaningful advantage for environments where configuration changes are frequent.
HAProxy's stats dashboard gives detailed real-time visibility into backend server health, connection rates, error rates, and queue depths. Its configuration language is more complex than Nginx's but offers finer control over routing logic.
HAProxy is the preferred choice for teams that need advanced load balancing features, high connection throughput, and detailed operational visibility. It is commonly used in front of database clusters and internal services where its load balancing sophistication matters more than Nginx's web server capabilities.
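For comparison, here is a hedged sketch of an equivalent HAProxy configuration; the server addresses, health-check path, and timeouts are placeholders.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend fe_web
    bind *:80
    default_backend be_app

backend be_app
    balance leastconn                # one of many algorithms: roundrobin, source, uri...
    option httpchk GET /health      # active health checks against each server
    server app1 10.0.0.11:8080 check
    server app2 10.0.0.12:8080 check
```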
**Envoy**
Envoy was built by Lyft specifically for microservice architectures.
Where Nginx and HAProxy were designed for the monolithic era and adapted to microservices, Envoy was born in it.
Envoy's standout features are its observability and dynamic configuration. It produces detailed metrics, distributed traces, and access logs out of the box, which makes debugging request flows across dozens of microservices dramatically easier.
Its configuration is API-driven (via the xDS protocol), meaning a control plane can update routing rules, add new services, and modify traffic policies without restarting Envoy.
Envoy is the default data plane proxy in service mesh platforms like Istio and AWS App Mesh. When you see "service mesh" in a system design discussion, Envoy is almost certainly the proxy running as a sidecar next to each service (more on this below).
The trade-off is complexity. Envoy's configuration is more verbose than Nginx's, and its feature set is oriented toward operators who need fine-grained traffic control in complex microservice topologies.
For a simple web application with a few backend servers, Envoy is overkill.
**Traefik**
Traefik is a modern reverse proxy designed specifically for containerized and cloud-native environments. It automatically discovers services from container orchestrators (Docker, Kubernetes, ECS) and configures routing rules without manual configuration.
When you deploy a new service in Kubernetes with the right annotations, Traefik automatically detects it, creates a route to it, and can even provision a TLS certificate via Let's Encrypt. No configuration files to edit.
No reloads.
This automatic service discovery is Traefik's defining feature.
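For instance, in Kubernetes a route can be declared as a Traefik `IngressRoute` resource; Traefik watches the cluster and applies it automatically, with no proxy config files and no reloads. The names below are hypothetical, and `certResolver: letsencrypt` assumes a Let's Encrypt resolver has been configured.

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: orders-api
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`example.com`) && PathPrefix(`/orders`)
      kind: Rule
      services:
        - name: orders-service     # routes to this Kubernetes Service
          port: 8080
  tls:
    certResolver: letsencrypt      # automatic certificate via Let's Encrypt
```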
Traefik includes a built-in dashboard for monitoring routes, services, and middleware. It supports middleware chains for common tasks like rate limiting, authentication, and header manipulation.
The trade-off is raw performance. Traefik is slower than Nginx and HAProxy for extremely high-throughput scenarios.
For most containerized applications where convenience and automatic configuration matter more than squeezing out maximum requests per second, Traefik is an excellent choice.
| Proxy | Architecture | Configuration | Strengths | Best For |
|---|---|---|---|---|
| Nginx | Event-driven web server \+ proxy | File-based, declarative | Performance, simplicity, ecosystem | Most web applications, static serving |
| HAProxy | Dedicated proxy/load balancer | File-based, detailed | Load balancing depth, health checks, stats | High-throughput proxy, database fronting |
| Envoy | Microservice-native proxy | API-driven (xDS) | Observability, dynamic config, service mesh | Microservice architectures, Istio/service mesh |
| Traefik | Cloud-native proxy | Auto-discovery from orchestrators | Container integration, automatic TLS | Kubernetes, Docker, cloud-native deployments |
## Sidecar Proxy Pattern and Service Mesh
As applications grew from a handful of services to dozens or hundreds of microservices, managing the communication between them became a serious engineering challenge.
The sidecar proxy pattern and service meshes emerged to solve that problem.
**The Problem with Service-to-Service Communication**
In a microservices architecture, services need to find each other (service discovery), communicate securely (mutual TLS), handle failures gracefully (retries, circuit breaking, timeouts), and produce observability data (metrics, traces, logs).
If every service implements these features in its own application code, you end up with duplicated logic across every service, inconsistent implementations, and a maintenance nightmare whenever a policy changes.
If your company has 80 microservices written in five different programming languages, implementing retry logic with circuit breaking in all of them means building and maintaining that logic in five language-specific libraries.
When you want to change the retry policy, you update five libraries and redeploy 80 services.
**The Sidecar Proxy Pattern**
The sidecar pattern extracts all networking concerns out of the application and into a separate proxy process that runs alongside each service instance. Every service gets its own dedicated proxy (the sidecar) that handles all inbound and outbound network traffic on that service's behalf.
Your application code makes a plain HTTP call to `localhost:8080`.
The sidecar proxy intercepts that call and handles service discovery (finding the destination), load balancing (choosing which instance of the destination to call), mutual TLS (encrypting the connection), retries with exponential backoff (handling transient failures), circuit breaking (stopping calls to a failing service), and observability (emitting metrics and trace spans).
The application knows nothing about any of this. It thinks it is making a simple local call. All the networking complexity lives in the sidecar, which is the same binary (usually Envoy) running beside every service in the system.
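To appreciate how much the sidecar removes from application code, here is a minimal, hypothetical Python sketch of just two of those concerns, retries with exponential backoff and circuit breaking, that every service would otherwise have to reimplement:

```python
import time

class CircuitBreaker:
    """Open the circuit after max_failures consecutive failures; while
    open, reject calls immediately instead of hammering a failing service."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial call once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.reset_after

    def record(self, success):
        if success:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()

def call_with_retries(fn, breaker, attempts=3, base_delay=0.01):
    """Retry a flaky call with exponential backoff, guarded by the breaker."""
    for attempt in range(attempts):
        if not breaker.allow():
            raise RuntimeError("circuit open")
        try:
            result = fn()
            breaker.record(True)
            return result
        except ConnectionError:
            breaker.record(False)
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all retries failed")

# Simulate a backend that fails twice, then recovers.
state = {"calls": 0}
def flaky_backend():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise ConnectionError("backend unavailable")
    return "ok"

breaker = CircuitBreaker()
print(call_with_retries(flaky_backend, breaker))  # succeeds on the third attempt
```

With a sidecar, all of this logic (and service discovery, mutual TLS, and metrics on top of it) lives in the proxy process, once, in one implementation, regardless of what language each service is written in.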
**Service Mesh**
A service mesh is the infrastructure layer formed by deploying sidecar proxies alongside every service and connecting them through a central control plane.
The sidecars form the data plane (they carry the actual traffic).
The control plane manages configuration, certificates, and policies.
The control plane (like Istio, Linkerd, or Consul Connect) pushes routing rules, security policies, and observability configuration to every sidecar proxy in the mesh. When you want to shift 10% of traffic to a canary deployment, you tell the control plane, and it updates every relevant sidecar.
When you want to enforce mutual TLS between all services, the control plane distributes certificates and enables encryption across the entire mesh.
Istio is the most widely adopted service mesh. It uses Envoy as its sidecar proxy and provides traffic management, security (mutual TLS, authorization policies), and observability (integrated with Prometheus, Grafana, Jaeger). Istio is powerful but adds significant operational complexity and resource overhead (each sidecar consumes CPU and memory).
Linkerd takes a lighter approach. It is simpler to install and operate than Istio, uses fewer resources, and focuses on the most common service mesh use cases: mutual TLS, traffic metrics, and retries. It sacrifices some of Istio's advanced features (like complex traffic routing rules) for simplicity.
Consul Connect by HashiCorp integrates service mesh capabilities with Consul's existing service discovery and configuration management. It supports both sidecar proxies and native application integration.
| Mesh | Proxy | Complexity | Resource Usage | Strengths |
|---|---|---|---|---|
| Istio | Envoy | High | Higher | Feature-rich, advanced traffic control |
| Linkerd | linkerd2-proxy (Rust) | Low | Lower | Simplicity, performance, ease of operation |
| Consul Connect | Envoy or built-in | Medium | Medium | Integration with Consul ecosystem |
**When You Need a Service Mesh (and When You Do Not)**
A service mesh adds value when you have a large number of microservices (typically 20 or more), when you need consistent security policies (mutual TLS) across all service communication, when debugging cross-service request flows is a regular pain point, and when you want traffic management features like canary deployments, traffic mirroring, or fault injection.
A service mesh is overkill when you have fewer than 10 services, when your services communicate through a message queue rather than direct calls, or when your team does not have the operational capacity to manage another layer of infrastructure. Every sidecar proxy consumes memory and CPU.
Across hundreds of service instances, that overhead adds up.
Many successful companies run large microservice architectures without a service mesh. They use client-side libraries for service discovery and retries, manage TLS certificates through a centralized tool, and get observability through application-level instrumentation.
A service mesh is a powerful tool, but it is not a prerequisite for microservices.
**Beginner Mistake to Avoid**
New engineers sometimes hear "service mesh" and assume every microservice architecture needs one.
Adopting Istio for a system with five services and a team of four engineers will slow you down, not speed you up.
The configuration overhead, debugging complexity, and resource cost of a mesh are justified only at a scale where the problems it solves are actually causing pain.
Start without a mesh.
Add one when you have concrete evidence that cross-service networking is a real bottleneck for your team.
**Interview-Style Question**
> Q: Your company runs 60 microservices across three Kubernetes clusters. Different teams use different programming languages. You need to enforce mutual TLS between all services and get consistent metrics. How would you approach this?
> A: Deploy a service mesh. With 60 services across multiple languages, implementing mutual TLS and consistent metrics through application-level libraries would mean maintaining libraries in every language and ensuring every team adopts them correctly. A service mesh like Linkerd or Istio injects a sidecar proxy alongside each service that handles TLS and metrics transparently. The application code does not change. Linkerd would be the first choice here for its lower operational complexity and resource usage. It provides mutual TLS out of the box, integrates with Prometheus for metrics, and installs with minimal configuration. If the team later needs advanced traffic management (canary deployments, traffic mirroring, fault injection), Istio becomes worth the additional complexity.
**KEY TAKEAWAYS**
* A forward proxy acts on behalf of the client, hiding the client's identity. A reverse proxy acts on behalf of the server, hiding backend infrastructure from clients.
* Reverse proxies handle security, SSL termination, compression, and caching. They add value even with a single backend server.
* Load balancers and reverse proxies overlap heavily. In practice, a single tool (Nginx, ALB) often serves both roles. The distinction matters conceptually but collapses operationally.
* Nginx is the safe default for most applications. HAProxy excels at advanced load balancing. Envoy is built for microservices and service meshes. Traefik shines in containerized environments with automatic service discovery.
* The sidecar proxy pattern extracts networking concerns from application code into a dedicated proxy running alongside each service.
* A service mesh (Istio, Linkerd, Consul Connect) provides mutual TLS, traffic management, and observability across all services through sidecar proxies managed by a central control plane.
* Do not adopt a service mesh prematurely. It solves real problems at scale but adds overhead that small teams and small service counts do not need.
> Up Next: Your system now has the networking, storage, caching, load balancing, CDN, and proxy layers covered. The final building block in Part II is asynchronous communication. Part II, Lesson 7 covers message queues, event streaming with Kafka, task queues, and the patterns that let your services communicate without waiting for each other. If synchronous calls are phone conversations, asynchronous messaging is email, and knowing when to use each one will make your systems dramatically more resilient.