Cache Invalidation

Decoupling data updates from stale cache reads using change data capture, lease models, and single-flight reads in distributed environments.

AdvancedCachingChapter: Caching Infrastructure15 min read

The Dual-Write Problem

Caching is a highly effective tool for speeding up reads, but it introduces a major architectural challenge: keeping the cache in sync with the primary database. The most intuitive approach to updating data is the dual-write pattern, where the application updates the database and the cache in sequence.

For example, when a user edits their profile, the application writes to the database first, and then updates the corresponding cache key. Alternatively, the application might update the database and delete the cache key, forcing the next read to fetch the fresh data (cache-aside).

While simple, dual-writes are highly fragile. In a distributed system, network glitches, application crashes, or concurrent requests can disrupt this sequence. If the database write succeeds but the cache update fails due to a brief network blip, the cache remains out of sync until the key's TTL expires. This creates a state of inconsistent data, where clients receive stale information despite a successful write operation.

Cache-Aside Concurrency Bugs

The most subtle errors in cache invalidation happen when multiple application instances read and write concurrently. In the classic cache-aside pattern, readers handle cache misses by querying the database and writing the result back to the cache. This creates a race condition between a reader processing a cache miss and a writer updating the database.

Consider the following timeline:

Reader A queries the cache for a key and gets a miss.
Reader A queries the database and retrieves the old value (let us say, val=A).
Writer B updates the database to val=B.
Writer B deletes the cache key to invalidate it.
Reader A finally writes the old value val=A back to the cache.

At the end of this sequence, the database has val=B, but the cache contains the stale val=A. Because the cache invalidation occurred before the stale write reached the cache, the stale value remains there indefinitely until evicted or expired by its TTL.

Lease-Based Caching

To solve this race condition without adding heavy locks to the database, you can use lease-based caching. When an application queries the cache and encounters a miss, the cache issues a unique lease token along with the miss notification.

This token acts as a ticket to write back to the cache. The application performs its database query and tries to write the result back to the cache, presenting the lease token. If a write operation occurred in the database during this window, the cache invalidates any outstanding lease tokens for that key.

When the reader tries to write back the stale data using its invalidated lease token, the cache rejects the write. This simple check ensures that stale, slow reads cannot overwrite fresher data written by subsequent transactions.

CDC-Based Invalidation

To completely avoid the dual-write problem, you can remove cache updates from the synchronous application code path entirely. Instead, use Change Data Capture (CDC) to drive cache invalidation asynchronously.

Under this model, the application only writes to the database. The database appends the change to its transaction log (such as PostgreSQL's WAL). A dedicated CDC daemon tails this transaction log and publishes updates to a messaging system or directly calls the cache invalidation API.

This approach offers two major benefits:

Atomicity: The cache is only invalidated if the database transaction successfully commits.
Order Preservation: Because the transaction log represents a serialized timeline of database changes, the cache is invalidated in the exact order the updates occurred, eliminating concurrency races.

Single-Flight Read Optimization

In high-traffic systems, a cache miss for a popular key can lead to a cache stampede, where thousands of concurrent readers query the database simultaneously. To prevent this database overload, you can use a single-flight read optimization.

A single-flight wrapper tracks active requests. When a request for a key comes in, the wrapper checks if a database query for that key is already in progress. If so, new callers wait for the existing query to finish instead of spawning their own database connections. When the single active database query returns, the wrapper distributes the result to all waiting callers and populates the cache.

Prerequisites

Caching Strategies Consistent Hashing

Code Examples

Core Literature References

Scaling Memcached at Facebook

by Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Lescarbeau, Stephen Frachtenberg, Albert Wong, and Kenji Kaneko — Sections 3 and 4, pp. 3-8

View source

Continue learning

ACID & Isolation Levels

Deep dive into database transaction guarantees, isolation levels, concurrency anomalies like write skew, and control mechanisms such as MVCC, 2PL, and SSI.

API Gateways

Understand the API Gateway pattern as the central ingress point for microservices, handling routing, auth, rate limiting, and protocol translation.

API Security & OAuth 2.0

Understand API authentication and authorization mechanisms, JWT security, and the OAuth 2.0 framework including Authorization Code Flow with PKCE.