CDNs and Edge Caching
The Speed of Light Problem
Your server is in Virginia. A user in Tokyo makes a request. Light travels through fiber optic cable at roughly 200,000 km/s. Tokyo to Virginia is about 11,000 km. That's 55ms just for the signal to travel one way. Round trip: 110ms. Add DNS, TCP, TLS, and server processing, and you're looking at 300-500ms before the first byte of HTML arrives.
You can optimize your server response time to zero. You can shrink your HTML to nothing. You still can't beat physics. The distance between your server and the user is a hard floor on latency.
CDNs solve this by putting copies of your content on servers around the world, so the nearest copy is always close to the user. A user in Tokyo hits a Tokyo edge server. A user in London hits a London edge server. Distance drops from 11,000km to 50km. Physics stops being the bottleneck.
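The arithmetic above is easy to reproduce. Here is a minimal sketch that uses the same rough figures from the text (200,000 km/s in fiber, 11,000 km to origin, 50 km to an edge); the function names are made up for illustration, and real paths are longer than the straight-line distance.

```python
FIBER_SPEED_KM_PER_S = 200_000  # rough speed of light in fiber, as quoted above


def one_way_ms(distance_km):
    """Propagation delay in milliseconds for a one-way trip over fiber."""
    return distance_km * 1000 / FIBER_SPEED_KM_PER_S


def round_trip_ms(distance_km):
    """A request plus its response covers the distance twice."""
    return 2 * one_way_ms(distance_km)


print(round_trip_ms(11_000))  # Tokyo <-> Virginia: 110.0
print(round_trip_ms(50))      # Tokyo <-> nearby edge: 0.5
```

This counts only propagation delay. DNS, TCP, and TLS each add more round trips, which is why the distance matters several times over.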
The Mental Model
A CDN is like a chain of local libraries. The publisher (your origin server) has the master copy of every book. Instead of making everyone travel to the publisher's warehouse, the publisher distributes copies to local libraries (edge nodes) around the world. When you want a book, you go to the nearest library. If they have it (cache hit), you get it instantly. If they don't (cache miss), they order it from the publisher, give you a copy, and keep one on the shelf for the next person.
How CDNs Work
A CDN (Content Delivery Network) is a globally distributed network of servers called PoPs (Points of Presence). Each PoP contains edge servers that cache and serve content close to users.
Cache Hit vs Cache Miss
- Cache HIT: edge server has a fresh copy. Response time: 5-50ms (just the distance to the nearest PoP).
- Cache MISS: edge server doesn't have it or it's expired. Edge fetches from origin, serves it, and caches it. Response time: similar to no CDN for the first request, but subsequent requests are fast.
- Stale: cached copy has expired but edge serves it while fetching a fresh copy in the background (if stale-while-revalidate is configured).
The goal of CDN optimization is maximizing your cache hit ratio — the percentage of requests served from cache without touching your origin server.
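To make hit, miss, and hit ratio concrete, here is a toy model of a single edge server. It is only a sketch: EdgeCache, its fixed ttl, and the fetch_origin callback are hypothetical names, and a real edge also handles eviction, revalidation, and per-response headers.

```python
class EdgeCache:
    """Toy model of one edge server: URL -> (body, expiry time)."""

    def __init__(self, ttl):
        self.ttl = ttl      # seconds a cached copy stays fresh
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url, now, fetch_origin):
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            self.hits += 1
            return "HIT", entry[0]
        # Missing or expired: fetch from origin, serve it, and cache it.
        body = fetch_origin(url)
        self.store[url] = (body, now + self.ttl)
        self.misses += 1
        return "MISS", body

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def origin(url):
    return "body of " + url


edge = EdgeCache(ttl=60)
for t in (0, 10, 20, 70):           # request times in seconds
    status, _ = edge.get("/logo.png", t, origin)
    print(t, status)                # MISS, HIT, HIT, MISS
print(edge.hit_ratio())             # 0.5
```

The fourth request misses because the copy cached at t=0 expired at t=60. A longer TTL, or more traffic per URL, pushes the ratio up.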
Cache-Control: The Headers That Control Everything
You control CDN caching behavior through the Cache-Control HTTP header. Getting these right is the difference between a 95% cache hit ratio and a 20% cache hit ratio.
Essential Cache-Control Directives
Cache-Control: public, max-age=31536000, immutable
| Directive | What it does |
|---|---|
| public | Any cache (CDN, browser, proxy) can store this response |
| private | Only the browser can cache this (not CDN). Use for user-specific data |
| max-age=N | Fresh for N seconds. Browser uses cached copy without revalidation |
| s-maxage=N | Fresh for N seconds on shared caches (CDN/proxy). Overrides max-age for CDNs |
| no-cache | Must revalidate with server before using cached copy (still caches!) |
| no-store | Do not cache at all. Anywhere. Not on disk, not in memory |
| immutable | Content will NEVER change. Don't even ask for revalidation |
| stale-while-revalidate=N | Serve stale cache for N seconds while fetching fresh copy in background |
no-cache does NOT mean "don't cache." It means "cache it, but revalidate with the server before using the cached copy." If you want absolutely no caching, you need no-store. This naming mistake has confused developers since 1999.
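One way to internalize the table is to write down how a shared cache might read the header. The sketch below is a simplification, not how any particular CDN implements it: parse_cache_control and shared_cache_ttl are hypothetical helpers, and real caches also consider Expires, heuristics, and vendor-specific rules.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into {directive: int, str, or True}."""
    directives = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        name = name.strip().lower()
        if not name:
            continue
        value = value.strip().strip('"')
        if not sep:
            directives[name] = True        # flag directive, e.g. "public"
        elif value.isdigit():
            directives[name] = int(value)  # e.g. "max-age=60"
        else:
            directives[name] = value
    return directives


def shared_cache_ttl(header):
    """Seconds a CDN may reuse the response, or None if it must not store it."""
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return None   # a shared cache must not keep a copy
    if "no-cache" in d:
        return 0      # may store, but must revalidate before every reuse
    return d.get("s-maxage", d.get("max-age", 0))  # s-maxage wins on shared caches


print(shared_cache_ttl("public, max-age=31536000, immutable"))  # 31536000
print(shared_cache_ttl("public, max-age=60, s-maxage=600"))     # 600
print(shared_cache_ttl("no-cache"))                             # 0
print(shared_cache_ttl("no-store"))                             # None
```

Note how no-cache and no-store come out differently: one returns a TTL of zero (stored, always revalidated), the other returns None (never stored).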
The Caching Strategy Cheat Sheet
Static assets with content hashes (e.g., app.a1b2c3.js):
Cache-Control: public, max-age=31536000, immutable
Cache forever. The filename changes when the content changes, so cache invalidation is automatic.
HTML pages (the entry point that references hashed assets):
Cache-Control: public, max-age=0, must-revalidate
Always check with the server. HTML needs to be fresh so it references the latest hashed asset URLs. A 304 response (not modified) is fast and saves bandwidth.
API responses (dynamic, user-specific data):
Cache-Control: private, no-cache
Only browser caches, and always revalidate. Never cache user-specific data on a shared CDN.
Never cache (sensitive data, auth tokens):
Cache-Control: no-store
No caching anywhere. Period.
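The cheat sheet can be collapsed into one small decision function. This is a sketch under assumptions: the HASHED_ASSET pattern guesses at a name.hash.ext naming convention, the user_specific and sensitive flags are hypothetical inputs your application would have to supply, and anything unrecognized falls back to the HTML policy.

```python
import re

# Assumed convention: a hex content hash before the extension, e.g. app.a1b2c3.js
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(?:js|css|woff2|png|svg)$")


def cache_control_for(path, user_specific=False, sensitive=False):
    """Pick a Cache-Control value following the cheat sheet above."""
    if sensitive:
        return "no-store"                             # never cache
    if user_specific:
        return "private, no-cache"                    # browser only, always revalidate
    if HASHED_ASSET.search(path):
        return "public, max-age=31536000, immutable"  # safe to cache for a year
    return "public, max-age=0, must-revalidate"       # HTML and everything else


print(cache_control_for("/assets/app.a1b2c3.js"))
print(cache_control_for("/index.html"))
print(cache_control_for("/api/me", user_specific=True))
print(cache_control_for("/api/token", sensitive=True))
```

The order of the checks matters: the most restrictive rule is tested first, so a sensitive response can never accidentally pick up a public policy.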
stale-while-revalidate: The Best of Both Worlds
This directive lets CDNs serve a stale response instantly while fetching a fresh one in the background:
Cache-Control: public, max-age=60, stale-while-revalidate=3600
This means:
- For the first 60 seconds: serve from cache (guaranteed fresh)
- From 60s to 3660s: serve the stale cached version immediately, but fetch a fresh copy from origin in the background
- After 3660s: cache is too stale, wait for origin response
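Inside that 3660-second window the user always gets an instant response, and a stale copy is replaced as soon as the background fetch completes. On a URL with steady traffic, content is rarely more than about 60 seconds out of date; on a rarely requested URL, a visitor can receive a copy up to 3660 seconds old. This is the go-to strategy for content that should be fresh but where brief staleness is acceptable (blog posts, product listings, docs).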
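The three phases in the list above reduce to two comparisons on the age of the cached copy. A minimal sketch (swr_state and its return labels are made-up names):

```python
def swr_state(age_s, max_age, swr):
    """Classify a cached response by its age in seconds."""
    if age_s < max_age:
        return "fresh"             # serve from cache, no origin contact
    if age_s < max_age + swr:
        return "stale-revalidate"  # serve now, refresh in the background
    return "expired"               # too stale: wait for the origin


# Cache-Control: public, max-age=60, stale-while-revalidate=3600
for age in (30, 60, 3659, 3660):
    print(age, swr_state(age, max_age=60, swr=3600))
# 30 fresh
# 60 stale-revalidate
# 3659 stale-revalidate
# 3660 expired
```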
Current versions of the major browsers also support stale-while-revalidate, and the behavior is the same: serve stale instantly, revalidate in the background. For CDNs, the s-maxage directive controls the CDN's freshness window while max-age controls the browser's. You can set different values for each.
Cache Invalidation: The Hard Problem
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
When content changes, how do you get CDN edge servers to stop serving the old version?
Strategy 1: Content-Hashed Filenames (Best for Assets)
style.a8f3e2.css → style.b7c4d1.css
The URL changes, so the CDN naturally fetches the new file. Old versions remain cached but are never requested. This is why webpack, Vite, and Next.js put hashes in filenames.
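Bundlers do this for you, but the idea fits in a few lines. The sketch below is an illustration only: hashed_name is a hypothetical helper, and SHA-256 truncated to six hex characters is an assumption, not the algorithm any specific bundler uses.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename, content, length=6):
    """Insert a short content hash before the extension: style.css -> style.<hash>.css"""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(filename)
    return f"{p.stem}.{digest}{p.suffix}"


old = hashed_name("style.css", b"body { color: red }")
new = hashed_name("style.css", b"body { color: blue }")
print(old, new)    # two different names for two different contents
print(old == hashed_name("style.css", b"body { color: red }"))  # True
```

Because the name is a pure function of the bytes, an unchanged file keeps its URL across deploys and stays cached, while any edit produces a URL no cache has seen before.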
Strategy 2: Purge API (Best for Dynamic Content)
Every CDN provides an API to purge cached content:
# Cloudflare purge
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"files": ["https://example.com/page"]}'
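On CDNs built for fast purging (Cloudflare and Fastly, for example), purges typically reach every PoP within seconds; other providers can take longer, so check your CDN's documentation. Use this for urgent content changes (fixing a typo, removing sensitive data).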
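If you trigger purges from a deploy script rather than by hand, the same request can be built with the standard library. This sketch mirrors the curl call above; build_purge_request, zone_id, and token are placeholder names, and the function only constructs the request without sending it.

```python
import json
import urllib.request


def build_purge_request(zone_id, token, urls):
    """Build (but do not send) a purge-by-URL request mirroring the curl call."""
    endpoint = f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache"
    body = json.dumps({"files": urls}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_purge_request("ZONE_ID", "API_TOKEN", ["https://example.com/page"])
print(req.get_method(), req.full_url)
```

To actually send it, pass the request to urllib.request.urlopen(req) and check the response status. Keep the token in an environment variable or secret store rather than in the script.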
Strategy 3: Low TTLs + stale-while-revalidate
Set s-maxage=60, stale-while-revalidate=300. Content refreshes every 60 seconds at the edge, but users always get an instant response.
Strategy 4: Cache Tags (Surrogate Keys)
Some CDNs let you tag cached responses and purge by tag. Fastly reads tags from the Surrogate-Key response header (Cloudflare's equivalent is Cache-Tag):
Surrogate-Key: product-123 homepage featured-products
When product 123 changes, purge all content tagged with product-123 — this invalidates the product page, the homepage (which shows featured products), and any other page that included product 123. Much more precise than URL-based purging.
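Under the hood, tag purging is an index from tag to cached URLs. The toy model below shows the idea; TaggedCache is a hypothetical class, not any CDN's API, and it ignores TTLs entirely.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache that remembers which URLs carry which surrogate keys."""

    def __init__(self):
        self.entries = {}                    # url -> body
        self.urls_by_tag = defaultdict(set)  # tag -> set of urls

    def put(self, url, body, surrogate_key):
        self.entries[url] = body
        for tag in surrogate_key.split():    # header value is space-separated
            self.urls_by_tag[tag].add(url)

    def purge_tag(self, tag):
        """Drop every cached URL carrying this tag; return the purged URLs."""
        urls = self.urls_by_tag.pop(tag, set())
        for url in urls:
            self.entries.pop(url, None)      # may already be gone via another tag
        return urls


cache = TaggedCache()
cache.put("/products/123", "<product page>", "product-123")
cache.put("/", "<homepage>", "product-123 homepage featured-products")
cache.put("/about", "<about page>", "about")

print(sorted(cache.purge_tag("product-123")))  # ['/', '/products/123']
print(sorted(cache.entries))                   # ['/about']
```

One purge call removed two pages without the caller having to know which URLs embed product 123.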
Edge Computing: Running Code at the Edge
Modern CDNs aren't just caches — they can run code at the edge. This means you can execute logic (authentication checks, A/B tests, geolocation, personalization) at the PoP closest to the user, before the request even reaches your origin server.
Major edge computing platforms:
| Platform | Runtime | Cold Start |
|---|---|---|
| Cloudflare Workers | V8 isolates | None (0ms) |
| Vercel Edge Functions | V8 isolates | None (0ms) |
| AWS Lambda@Edge | Node.js, Python | 50-200ms |
| AWS CloudFront Functions | JavaScript (limited) | None |
| Deno Deploy | V8 isolates | None |
| Fastly Compute | Wasm | ~5ms |
When to Use Edge Computing
Use edge for:
- Authentication/authorization checks
- A/B test routing
- Geolocation-based redirects or content
- Request/response header manipulation
- Bot detection and rate limiting
- Personalization (user-specific data from edge KV stores)
Don't use edge for:
- Database queries (the database is still in one region)
- Heavy computation (edge has CPU time limits)
- Anything that needs low-latency access to a centralized data store
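A/B test routing is a good example of logic that fits the edge: it needs no database, only the request itself. The sketch below models the decision in Python for consistency with the other examples; real edge platforms expose their own JavaScript or Wasm APIs, and ab_variant, route, the experiment name, and the /homepage-b path are all hypothetical.

```python
import hashlib


def ab_variant(user_id, experiment, rollout_percent=50):
    """Deterministically bucket a user: same inputs always give the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return "B" if bucket < rollout_percent else "A"


def route(path, user_id):
    """Return the origin path an edge function might rewrite the request to."""
    if path == "/" and ab_variant(user_id, "new-homepage") == "B":
        return "/homepage-b"
    return path


print(route("/", "user-42"))
print(route("/about", "user-42"))  # /about (only the homepage is in the experiment)
```

Hashing the user ID instead of picking at random means every PoP makes the same decision for the same user without sharing any state.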
V8 isolates vs containers
Traditional serverless (Lambda) runs your code in a container or micro-VM, an isolated OS-level environment. These take roughly 50-200ms to cold start. Edge platforms like Cloudflare Workers use V8 isolates instead: lightweight JavaScript execution contexts within the same V8 engine process. V8 isolates start in under 5ms (often effectively 0ms) because they share the engine runtime. The tradeoff: V8 isolates can only run JavaScript/WebAssembly, while containers can run any language and access the full OS. For edge use cases (header manipulation, routing, auth checks), V8 isolates are usually the better fit.
CDN Architecture Details
Anycast Routing
Many CDNs use anycast: multiple servers worldwide share the same IP address. When a user sends a packet to that IP, the internet's routing infrastructure (BGP) delivers it to the nearest server in network terms, which is usually, though not always, the geographically closest one.
This keeps CDN DNS resolution simple: the CDN doesn't need complex GeoDNS logic. It advertises the same IP from every PoP, and BGP routing handles the rest.
Tiered Caching
A simple CDN has two levels: edge and origin. Tiered caching adds a middle layer:
User → Edge PoP (city) → Regional Cache (country) → Origin Server
If the edge PoP doesn't have the content, it checks the regional cache before going to origin. This reduces origin load and improves cache hit ratios because the regional cache aggregates requests from many edge PoPs.
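The effect on origin load shows up even in a toy model. In the sketch below, two edge PoPs share one regional cache; CacheTier and the PoP names are hypothetical, and TTLs and eviction are left out to keep the chain visible.

```python
class CacheTier:
    """One cache layer that falls back to its parent on a miss."""

    def __init__(self, name, parent_fetch):
        self.name = name
        self.parent_fetch = parent_fetch  # callable(url) -> body
        self.store = {}

    def fetch(self, url):
        if url not in self.store:         # miss: ask the next layer up
            self.store[url] = self.parent_fetch(url)
        return self.store[url]


origin_requests = []


def origin(url):
    origin_requests.append(url)           # record every trip to origin
    return f"content of {url}"


regional = CacheTier("regional-jp", origin)
tokyo = CacheTier("edge-tokyo", regional.fetch)
osaka = CacheTier("edge-osaka", regional.fetch)

tokyo.fetch("/logo.png")   # edge miss -> regional miss -> origin
osaka.fetch("/logo.png")   # edge miss -> regional hit
print(len(origin_requests))  # 1
```

Without the regional layer, each edge would have gone to origin on its own, and origin traffic would grow with the number of PoPs.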
Common Mistakes
| What developers do | Why it's a problem | What they should do |
|---|---|---|
| Setting Cache-Control: no-cache to prevent all caching | no-cache still stores the response; it just requires revalidation on every use. This is useful (conditional requests with ETags are cheap), but if you have sensitive data that must never be stored on disk, you need no-store. | Use no-store if you truly want zero caching. no-cache means 'cache but revalidate before use.' |
| Putting user-specific data behind Cache-Control: public | Cache-Control: public allows CDN edge servers to cache and serve the response to other users. If it contains a user's profile, payment info, or session data, other users could receive it. private restricts caching to the user's browser only. | Always use private for user-specific responses. Use public only for content identical for all users. |
| Using CDN for all API requests including writes | CDN caching only makes sense for content that can be shared across users and doesn't change with every request. Write operations (POST, PUT, DELETE) should go to your origin directly. Even for GET APIs, only cache responses that are the same for all users. | CDN caching works best for read-heavy, identical-for-all-users content. POST/PUT/DELETE should bypass the CDN. |
Key Takeaways
1. CDNs solve the speed of light problem by caching content at edge servers close to users. Cache hit = 5-50ms. Cache miss = origin round trip.
2. Use content-hashed filenames for static assets (max-age=31536000, immutable). Use must-revalidate for HTML. Use no-store for sensitive data.
3. stale-while-revalidate serves stale content instantly while refreshing in the background. Best of both worlds for content freshness and speed.
4. Cache invalidation via purge APIs typically propagates globally within seconds on CDNs built for it. Content-hashed filenames provide automatic invalidation for assets.
5. Edge computing runs code at the CDN PoP. Use for auth, A/B testing, geolocation, and personalization. Don't use for database-heavy operations.