Consistent Hashing

Consistent hashing is a partitioning strategy that maps keys and nodes to a circular hash ring, ensuring that adding or removing servers only reassigns a small fraction of keys.

IntermediateReliability & ScaleChapter: Reliability & Scalability12 min read

The Modulo Hashing Limitation

In a partitioned system (like a sharded database or a distributed cache cluster), we need a predictable way to map a data key to a specific server node.

A simple, intuitive way to do this is modulo hashing:

server_index = hash(key) % N

Where N is the number of active servers in the cluster. While this works well in static environments, it fails catastrophically when the cluster scales:

If we add a server (changing the pool size to N + 1), the formula becomes hash(key) % (N + 1).
If a server crashes (reducing the pool size to N - 1), the formula becomes hash(key) % (N - 1).

Because the divisor changes, almost every key in the system hashes to a completely different server index. In a caching cluster, this causes a mass cache invalidation event (near 100% cache miss rate), flooding the database origin server with traffic. In a database sharding context, it requires migrating nearly all data across the network to different physical nodes.

The Consistent Hashing Ring Concept

Consistent hashing solves this scaling problem. Instead of mapping keys directly to a fixed index, consistent hashing maps both servers and data keys to a shared circular space called the hash ring.

Typically, this ring is modeled as a 360-degree circle corresponding to a 32-bit unsigned integer space (from 0 to 2^32 - 1):

Hash the Servers: Each physical server's identifier (like its hostname or IP) is passed through a hash function, producing a 32-bit value. This value maps the server to a specific position on the ring.
Hash the Keys: When a request arrives, the target key is passed through the same hash function, mapping the key to a position on the same ring.
Route the Keys: To find the owner server for a key, the system starts at the key's position on the ring and moves clockwise until it encounters the first server node. That server is designated the owner of the key.

Diagram: Consistent Hashing Ring

The following diagram demonstrates a 3-node consistent hashing ring and clockwise routing of data keys:

Scaling Mechanics: Server Adds and Removes

Consistent hashing dramatically improves scaling efficiency:

Adding a Node: If we add Node D, it is mapped to a position on the ring. The only keys affected are those located immediately counter-clockwise of Node D's position (which previously routed to Node D's clockwise neighbor). All other keys remain assigned to their existing nodes.
Removing a Node: If Node B fails, only the keys that previously routed to Node B must be reassigned to the next clockwise node (Node C). The rest of the key distribution is unaffected.

When scaling a cluster of size N, adding or removing a server only triggers re-mapping for roughly 1 / N of the total keys, preventing massive system invalidations.

Mitigating Hotspots with Virtual Nodes

A basic consistent hashing ring has a major limitation, data skew. Because physical servers are hashed randomly, they are rarely spaced uniformly on the ring. One node might end up owning a massive segment of the ring while another node owns a tiny sliver, resulting in hotspotting where one node receives the bulk of traffic.

To solve this, systems introduce virtual nodes (vnodes):

Instead of hashing a physical node once, the system hashes the node multiple times with different suffixes (e.g. node-a#1, node-a#2, node-a#3).
This maps hundreds of virtual positions per physical server across the ring.
The ring is populated by these interleaved vnodes. When a key routes to a vnode, the request is forwarded to the corresponding physical node.

Using vnodes blends the ownership boundaries, guaranteeing a uniform distribution of keys and load balancing across all physical servers.

Real-World Distributed Implementations

Consistent hashing is the foundation of many distributed architectures:

Apache Cassandra: Uses consistent hashing (via the Murmur3Partitioner) to distribute rows across database nodes in its cluster ring.
Amazon DynamoDB: Coordinates partition keys across storage nodes using a consistent hashing ring.
Memcached: Many client libraries use consistent hashing (specifically the Ketama algorithm) to ensure client-side key routing remains stable when adding cache servers.

Implementation details: Binary Search Lookup

In software, the hash ring is not represented as a circular data structure. Instead, it is implemented using a sorted array of virtual node hash tokens and a hash map:

When nodes are added, their token hashes are computed and appended to an array.
The array is sorted in ascending order.
A hash map links each token hash back to its parent physical node name.
To route a key, the system computes keyHash and runs a binary search (bisect) on the sorted array to find the first token hash index that is greater than or equal to keyHash.
If the binary search index reaches the end of the array, the search wraps around to index 0 (circular boundary).
The key maps to the physical node corresponding to that token.

Prerequisites

Hash Functions

Code Examples

Core Literature References

Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web

by David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin — Section 3: Consistent Hashing Protocols, pp. 55-66

View source

Continue learning

ACID & Isolation Levels

Deep dive into database transaction guarantees, isolation levels, concurrency anomalies like write skew, and control mechanisms such as MVCC, 2PL, and SSI.

API Gateways

Understand the API Gateway pattern as the central ingress point for microservices, handling routing, auth, rate limiting, and protocol translation.

API Security & OAuth 2.0

Understand API authentication and authorization mechanisms, JWT security, and the OAuth 2.0 framework including Authorization Code Flow with PKCE.