CDN & Edge Caching

Reducing global latency by distributing static and dynamic content across Anycast-routed edge locations with granular cache keys and serverless code execution.

Intermediate | Caching | Chapter: Caching Infrastructure | 10 min read

Geography-Based Latency Reduction

Even with high-speed internet, the speed of light limits network latency. Light in optical fiber travels at roughly 200,000 km/s, so a round trip over the roughly 8,600 km between San Francisco and London takes at least about 85 milliseconds, and real routes are usually well above 100 milliseconds. When a webpage requires dozens of assets, these round trips quickly accumulate, resulting in noticeable page load delays.
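The physical floor on latency can be estimated from distance alone. The sketch below is a back-of-the-envelope calculation, not a measurement: the 8,600 km distance and the 200,000 km/s fiber speed are assumed round figures, and real routes are longer than the great-circle path and add queuing and processing delay.

```python
# Rough lower bound on round-trip time (RTT) from distance alone.
# Assumed figures: light at ~299,792 km/s in vacuum and ~200,000 km/s in fiber.
SPEED_VACUUM_KM_S = 299_792.0
SPEED_FIBER_KM_S = 200_000.0


def min_rtt_ms(distance_km: float, speed_km_s: float) -> float:
    """Return the minimum round-trip time in milliseconds over one path."""
    return 2 * distance_km / speed_km_s * 1000


def sequential_delay_ms(rtt_ms: float, round_trips: int) -> float:
    """Total wait when round trips happen one after another."""
    return rtt_ms * round_trips


sf_london_km = 8_600.0  # assumed approximate great-circle distance
fiber_rtt = min_rtt_ms(sf_london_km, SPEED_FIBER_KM_S)
print(f"vacuum bound: {min_rtt_ms(sf_london_km, SPEED_VACUUM_KM_S):.0f} ms")
print(f"fiber bound:  {fiber_rtt:.0f} ms")
print(f"10 sequential round trips: {sequential_delay_ms(fiber_rtt, 10):.0f} ms")
```

Even at the theoretical bound, ten dependent round trips cost close to a second, which is why shortening the path matters more than raising bandwidth.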

A Content Delivery Network (CDN) solves this problem by moving content closer to users. CDNs deploy small datacenters, known as Points of Presence (PoPs), in hundreds of cities worldwide.

To route users to the nearest PoP, CDNs use Anycast routing. In an Anycast network, multiple edge servers in different locations announce the exact same IP address. Internet routers forward a user's packets along the best path in their BGP routing table, which usually leads to the topologically closest PoP (not always the geographically nearest one). Because the TCP and TLS handshakes each need at least one round trip, a shorter path can cut connection setup latency from hundreds of milliseconds to a few tens of milliseconds or less.
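As a rough illustration of how one router ends up at the "closest" PoP, the sketch below models a single step of BGP best-path selection: preferring the shortest AS path. The prefix, AS numbers, and PoP labels are hypothetical documentation values, and real BGP evaluates local preference, MED, and other attributes as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str               # the shared Anycast prefix
    pop: str                  # hypothetical PoP label, for illustration only
    as_path: tuple[int, ...]  # AS numbers the announcement traversed


def best_route(routes: list[Route], prefix: str) -> Route:
    """Pick the announcement with the shortest AS path for a prefix.

    This models only the AS-path-length step of BGP best-path selection.
    """
    candidates = [r for r in routes if r.prefix == prefix]
    if not candidates:
        raise LookupError(f"no route to {prefix}")
    return min(candidates, key=lambda r: len(r.as_path))


# The same prefix is announced from three PoPs by AS 64500 (assumed CDN AS).
# This router sees a different AS-path length toward each PoP.
table = [
    Route("203.0.113.0/24", "fra", (64501, 64500)),
    Route("203.0.113.0/24", "lhr", (64500,)),
    Route("203.0.113.0/24", "sfo", (64501, 64502, 64500)),
]
print(best_route(table, "203.0.113.0/24").pop)  # prints "lhr"
```

A router in another network would hold a different table for the same prefix and could pick a different PoP, which is the whole point of Anycast.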


CDN Caching Hierarchy

Rather than forwarding every cache miss straight back to the origin, CDNs organize their caches into a tiered hierarchy:

  • Edge Servers: The servers closest to the user. They terminate client TLS connections and serve cached content directly.
  • Regional Caches: On an edge cache miss, the edge server queries a larger regional cache that pools requests from multiple nearby edge PoPs.
  • Origin Shield: A dedicated caching layer directly in front of the origin infrastructure. The origin shield consolidates misses from regional caches, shielding the application and its database from traffic spikes.
  • Origin Server: The authoritative application server where the dynamic code runs and the primary database resides.

This tiered hierarchy ensures that even when cached objects expire at the edge, most requests are absorbed by the intermediate tiers, minimizing expensive work at the origin.
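The tiered lookup can be sketched as a chain of caches, each falling back to its parent on a miss. This is a minimal in-memory model under simplifying assumptions (no expiry, no eviction, no concurrency); the CacheTier class and the tier names are hypothetical.

```python
class CacheTier:
    """One cache layer that falls back to its parent on a miss."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # next tier toward the origin, or None
        self.store = {}
        self.misses = 0

    def get(self, key, origin_fetch):
        if key in self.store:
            return self.store[key]
        self.misses += 1
        if self.parent is not None:
            value = self.parent.get(key, origin_fetch)
        else:
            value = origin_fetch(key)  # only the last tier reaches the origin
        self.store[key] = value        # fill this tier on the way back
        return value


origin_calls = []


def fetch_from_origin(key):
    origin_calls.append(key)
    return f"body of {key}"


shield = CacheTier("origin-shield")
regional = CacheTier("regional", parent=shield)
edges = [CacheTier(f"edge-{i}", parent=regional) for i in range(3)]

# Three different edge PoPs request the same object for the first time.
for edge in edges:
    edge.get("/logo.png", fetch_from_origin)

print(len(origin_calls))  # prints 1: the regional tier absorbed two misses
```

Each edge records its own miss, but only the first request travels all the way to the origin.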

[Figure: CDN Caching Hierarchy. Global users (Anycast routed) reach an Edge PoP, which serves cache hits directly. Misses flow to the Regional Cache, then the Origin Shield, then the Origin Infrastructure (app and database), over reused persistent connections (HTTP/2 multiplexing).]

Cache-Control Headers on the Edge

To control how edge servers store content, origins send caching instructions in HTTP response headers:

  • s-maxage: A Cache-Control directive that tells shared caches (such as CDNs and proxies) how many seconds to treat a response as fresh. For shared caches it overrides max-age, which still governs browser caches.
  • stale-while-revalidate: A Cache-Control directive that lets the CDN serve a stale response immediately for a specified number of seconds after it expires, while fetching a fresh copy from the origin in the background. This keeps edge responses fast.
  • Surrogate-Key (or Cache-Tag): A CDN-specific response header, sent by the origin, that attaches one or more tags to a cached response. For example, a response can carry the header Surrogate-Key: product-1002 author-42. If product 1002 is updated, the application sends a purge request for the tag product-1002, invalidating every cached object with that tag across the network.
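The following sketch shows how an origin might assemble these headers and how an edge could classify a cached object by age. The function names are hypothetical, the tag header name and separator vary by CDN, and the freshness logic is a simplified reading of s-maxage plus stale-while-revalidate.

```python
def cache_headers(s_maxage, stale_while_revalidate, tags):
    """Build response headers an origin might send to a CDN."""
    return {
        "Cache-Control": (
            f"public, s-maxage={s_maxage}, "
            f"stale-while-revalidate={stale_while_revalidate}"
        ),
        # Assumption: space-separated Surrogate-Key; some CDNs use a
        # differently named header or a comma-separated list instead.
        "Surrogate-Key": " ".join(tags),
    }


def edge_decision(age, s_maxage, stale_while_revalidate):
    """Classify a cached object by its age in seconds."""
    if age < s_maxage:
        return "serve-fresh"
    if age < s_maxage + stale_while_revalidate:
        return "serve-stale-and-revalidate"
    return "fetch-from-origin"


headers = cache_headers(300, 60, ["product-1002", "author-42"])
print(headers["Cache-Control"])
print(headers["Surrogate-Key"])
for age in (120, 330, 400):
    print(age, edge_decision(age, 300, 60))
```

With a 300-second lifetime and a 60-second revalidation window, an object aged 330 seconds is still served immediately, and only one aged 360 seconds or more forces the client to wait for the origin.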

Cache Invalidation at Scale

Evicting assets from a globally distributed network is a major operational challenge. CDNs provide several purge models:

  • Instant Purging: The CDN control plane broadcasts an invalidation request to all active edge PoPs. The fastest CDNs advertise global invalidation in roughly 150 milliseconds, while others take seconds or longer.
  • Path and Wildcard Purging: Invalidates content by URL path or pattern (such as /images/products/*), clearing large collections of static files at once.
  • Soft Purging: Rather than deleting the asset outright, the CDN marks the item as stale. The next request receives the stale asset while the edge server triggers a background fetch to revalidate the content with the origin.
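The difference between a hard purge and a soft purge is easiest to see on a single cache node. The sketch below is an in-memory model with a hypothetical TaggedCache class; a real CDN would broadcast the same operation to every PoP.

```python
from collections import defaultdict


class TaggedCache:
    """Single-node cache that can invalidate objects by tag."""

    def __init__(self):
        self.objects = {}               # url -> {"body": ..., "stale": bool}
        self.by_tag = defaultdict(set)  # tag -> urls carrying that tag

    def put(self, url, body, tags):
        self.objects[url] = {"body": body, "stale": False}
        for tag in tags:
            self.by_tag[tag].add(url)

    def purge(self, tag, soft=False):
        """Invalidate every cached object with the tag; return affected URLs."""
        affected = [u for u in sorted(self.by_tag.get(tag, ())) if u in self.objects]
        for url in affected:
            if soft:
                self.objects[url]["stale"] = True  # keep the body, mark it stale
            else:
                del self.objects[url]              # hard purge: drop the object
        return affected


cache = TaggedCache()
cache.put("/p/1002", "product page", ["product-1002", "author-42"])
cache.put("/p/1002.json", "product json", ["product-1002"])
cache.put("/a/42", "author page", ["author-42"])

print(cache.purge("product-1002", soft=True))  # both product objects go stale
print(cache.purge("author-42"))                # author-tagged objects are removed
print(sorted(cache.objects))                   # only the stale JSON object remains
```

A soft purge keeps the body available for stale serving, which protects the origin from a burst of simultaneous refetches after a large invalidation.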

Edge Computing

Modern CDNs go beyond serving static files. They let developers run serverless functions directly on edge nodes. On several platforms these functions run in V8 isolates rather than containers. An isolate starts in a few milliseconds or less and consumes far less memory than a full container sandbox.

With edge computing, you can:

  • Inspect and modify headers: Enforce security rules or inspect user locations to perform redirect logic.
  • Perform A/B testing: Serve different variants of a page based on user cookies directly at the edge, avoiding database reads.
  • Assemble pages dynamically: Combine static HTML fragments with user-specific data from edge key-value stores, serving personalized pages with minimal latency.
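Cookie-based A/B assignment at the edge can be sketched as a deterministic hash of a user identifier. Edge platforms usually expose JavaScript or WebAssembly runtimes, so treat this Python version as a language-neutral illustration; the cookie names (uid, ab_variant), the experiment name, and both functions are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, b_percent: int = 50) -> str:
    """Deterministically map a user to variant "A" or "B"."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "B" if bucket < b_percent else "A"


def handle_request(cookies: dict, experiment: str = "homepage-test"):
    """Return (variant, Set-Cookie value or None) without contacting the origin."""
    existing = cookies.get("ab_variant")
    if existing in ("A", "B"):
        return existing, None  # sticky: honour the earlier assignment
    variant = assign_variant(cookies.get("uid", "anonymous"), experiment)
    return variant, f"ab_variant={variant}; Path=/; Max-Age=2592000"


variant, set_cookie = handle_request({"uid": "user-123"})
print(variant, set_cookie)
print(handle_request({"uid": "user-123", "ab_variant": variant}))
```

Because the bucket depends only on the experiment name and the user identifier, every PoP reaches the same decision without shared state or a database read.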

Further Reading

Core Literature References

Hypertext Transfer Protocol (HTTP/1.1): Caching

by Roy T. Fielding, Mark Nottingham, and Julian Reschke. RFC 7234, Section 4
