Consistent Hashing Algorithms: Jump Hash vs Ring Hash vs Rendezvous Hash
Whether you’re building a database, cache, load balancer, or content delivery network, you face the same fundamental challenge: distributing workload across multiple machines. Poor distribution leads to unreliable and slow systems. Optimal distribution enables scalability. The question is: how?
Your e-commerce platform started with one database but now handles 100 million users across 50 database shards. Every user profile, order history, and product review must be stored in exactly the right shard for fast queries. When user “john_doe_12345” places an order, which of your 50 shards should store it? Route it wrong and you’ll query all shards (slow). Route it right and you get instant lookups. You add new shards monthly during planned maintenance, and replace failed ones immediately with identical replicas.
What is JumpConsistentHash?
JumpConsistentHash is a simple algorithm published by Google engineers John Lamping and Eric Veach in 2014. It evenly distributes data across multiple servers in roughly five lines of code.
The key benefits:
- Zero memory overhead: Uses no lookup tables (a ring with virtual nodes can consume ~46MB for 1000 servers)
- Near-perfect distribution: Bucket loads are statistically indistinguishable from a uniform random assignment
- Minimal data movement: Growing from n to n+1 buckets relocates only ~1/(n+1) of the keys, the theoretical minimum
- Fast execution: O(ln n) expected iterations per lookup
The Complete Algorithm
Here’s the entire algorithm in Java:
public class JumpConsistentHash {
    // Maps a 64-bit key to a bucket in [0, buckets) using Lamping and Veach's
    // jump consistent hash
    public static int hash(long key, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        long b = -1, j = 0;
        while (j < buckets) {
            b = j;
            // Step the key through a 64-bit linear congruential generator
            key = key * 2862933555777941757L + 1;
            // Jump forward to the next candidate bucket; (key >>> 33) + 1 is a
            // pseudo-random value in [1, 2^31], so jumps get larger as j grows
            j = (long) ((b + 1) * (double) (1L << 31) / (double) ((key >>> 33) + 1));
        }
        return (int) b;
    }
}
The constant 2862933555777941757L is the 64-bit linear congruential generator multiplier used in the original paper; it makes successive key values behave like a uniform pseudo-random sequence, which is what produces the even distribution.
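One way to see the minimal-data-movement property in practice is to count how many keys change buckets when a bucket is added. A quick sketch using the JumpConsistentHash class above (the 100,000 sequential keys stand in for real hashed IDs):
// Count keys that land in a different bucket when growing from 10 to 11 buckets
int moved = 0, total = 100_000;
for (long k = 0; k < total; k++) {
    if (JumpConsistentHash.hash(k, 10) != JumpConsistentHash.hash(k, 11)) {
        moved++;
    }
}
// Prints roughly 9.1%, i.e. ~1/11 of keys move, the theoretical minimum
System.out.printf("moved: %.1f%%%n", 100.0 * moved / total);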
Real-World Implementation
Here’s how you can implement this for database sharding:
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class DatabaseSharding {
    private final List<String> shards;

    public DatabaseSharding(List<String> shards) {
        this.shards = shards;
    }

    public String getShard(String userId) {
        // Hash the user ID for consistent distribution
        long hash = Hashing.murmur3_128()
                .hashString(userId, StandardCharsets.UTF_8)
                .asLong();
        // Map the hash to one of the shards
        int shardIndex = JumpConsistentHash.hash(hash, shards.size());
        return shards.get(shardIndex);
    }
}
// Distribute user data across database shards
DatabaseSharding sharding = new DatabaseSharding(
Arrays.asList("db-shard-1", "db-shard-2", "db-shard-3", "db-shard-4", "db-shard-5")
);
System.out.println(sharding.getShard("john_doe_12345")); // → db-shard-2
System.out.println(sharding.getShard("jane_smith_67890")); // → db-shard-4
System.out.println(sharding.getShard("mike_wilson_11111")); // → db-shard-1
When to Choose This Solution
Ideal for any system that needs:
- Near-uniform load distribution across nodes
- Minimal memory overhead (no lookup tables)
- Fast routing decisions (O(ln n) complexity)
- Predictable behavior during scaling operations
- Efficient resource utilization
Consider alternatives when you have:
- Frequently failing nodes requiring immediate removal
- Complex routing logic beyond simple distribution
- Need for weighted or priority-based routing
- Systems requiring arbitrary node naming schemes
Traditional Consistent Hashing: The Alternative
While Jump Hash excels in controlled environments, traditional consistent hashing (ring hash) handles dynamic node membership better. Here’s how it works:
import java.nio.charset.StandardCharsets;
import java.util.*;
import net.jpountz.xxhash.XXHash64;
import net.jpountz.xxhash.XXHashFactory;

public class TraditionalConsistentHash {
    // Shared hasher; creating the factory on every call would be wasteful
    private static final XXHash64 HASHER = XXHashFactory.fastestInstance().hash64();

    // protected so subclasses (see ReplicatedConsistentHash below) can walk the ring
    protected final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes = 150; // Virtual nodes per physical node

    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            long hash = hash(node + ":" + i);
            ring.put(hash, node);
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            long hash = hash(node + ":" + i);
            ring.remove(hash);
        }
    }

    public String getNode(String key) {
        if (ring.isEmpty()) return null;
        long hash = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        // Wrap around to the first node if we've gone past the end
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    // Handle node failures by walking clockwise until an available node is found
    public String getAvailableNode(String key, Set<String> failedNodes) {
        if (ring.isEmpty()) return null;
        long hash = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        for (int i = 0; i < ring.size(); i++) {
            if (entry == null) {
                entry = ring.firstEntry(); // wrap around
            }
            String node = entry.getValue();
            if (!failedNodes.contains(node)) {
                return node;
            }
            entry = ring.higherEntry(entry.getKey());
        }
        return null; // All nodes failed
    }

    protected long hash(String input) {
        // xxHash64 is fast and well-distributed; a fixed seed keeps it deterministic
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        return HASHER.hash(bytes, 0, bytes.length, 0L);
    }
}
Traditional Hash Benefits with Examples:
1. Dynamic membership - Add/remove nodes anytime:
TraditionalConsistentHash hash = new TraditionalConsistentHash();
// Initial setup
hash.addNode("redis-1");
hash.addNode("redis-2");
hash.addNode("redis-3");
String node = hash.getNode("user:123"); // → "redis-2"
// Add new node during runtime - no restart needed
hash.addNode("redis-4");
// Only ~25% of data needs to move (from 3 to 4 nodes)
// Remove failed node gracefully
hash.removeNode("redis-2");
String newNode = hash.getNode("user:123"); // → "redis-3" (automatic rerouting)
2. Automatic failover with replica routing:
public class ReplicatedConsistentHash extends TraditionalConsistentHash {
    private final int replicationFactor = 3;

    public List<String> getReplicaNodes(String key) {
        // Walk clockwise from the key's position, collecting distinct physical
        // nodes until we have enough replicas (or run out of ring entries)
        Set<String> nodes = new LinkedHashSet<>();
        long hash = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        int visited = 0;
        while (nodes.size() < replicationFactor && visited < ring.size()) {
            if (entry == null) entry = ring.firstEntry(); // wrap around
            nodes.add(entry.getValue()); // the set dedupes virtual nodes of one server
            entry = ring.higherEntry(entry.getKey());
            visited++;
        }
        return new ArrayList<>(nodes);
    }
}
// Usage with automatic failover (assumes redis-1 through redis-5 were added via addNode)
ReplicatedConsistentHash hash = new ReplicatedConsistentHash();
Set<String> failedNodes = Set.of("redis-2", "redis-5");
// Suppose "user:123" is replicated on: redis-1, redis-2, redis-5
List<String> replicas = hash.getReplicaNodes("user:123"); // → [redis-1, redis-2, redis-5]
String availableNode = hash.getAvailableNode("user:123", failedNodes); // → redis-1
// Automatic failover to a working replica!
3. Arbitrary node names - Use real infrastructure:
// Jump Hash: buckets are plain indexes, so you keep an ordered list of names
// and can only grow or shrink it at the end
Arrays.asList("node-0", "node-1", "node-2", "node-3");
// Traditional Hash: Use actual server info
hash.addNode("redis-us-east-1a.cache.amazonaws.com:6379");
hash.addNode("redis-eu-west-1b.cache.amazonaws.com:6379");
hash.addNode("10.0.1.15:6379");
hash.addNode("prod-cache-server-07.company.com:6379");
4. Gradual rebalancing - Minimal data movement:
// Before: 1000 keys distributed across 3 nodes (~333 keys each)
// Add a 4th node: only ~250 keys move (25% of total data)
// Jump Hash moves just as little when appending a node; the ring's real
// advantage is that removing an arbitrary node is equally cheap
TraditionalConsistentHash hash = new TraditionalConsistentHash();
hash.addNode("cache-1");
hash.addNode("cache-2");
hash.addNode("cache-3");

// Simulate 1000 keys
Map<String, String> keyDistribution = new HashMap<>();
for (int i = 0; i < 1000; i++) {
    String key = "key:" + i;
    keyDistribution.put(key, hash.getNode(key));
}

// Add the 4th node
hash.addNode("cache-4");

// Count how many keys moved
int movedKeys = 0;
for (Map.Entry<String, String> entry : keyDistribution.entrySet()) {
    String oldNode = entry.getValue();
    String newNode = hash.getNode(entry.getKey());
    if (!oldNode.equals(newNode)) {
        movedKeys++;
    }
}
System.out.println("Moved keys: " + movedKeys);
// Result: ~250 keys moved (25%), the minimum needed to populate the new node
Traditional Hash Drawbacks:
- Memory overhead: 150 virtual nodes × 1000 physical nodes = 150K entries
- Complex implementation: Ring management, virtual nodes, wraparound logic
- Slower lookups: O(log n) TreeMap operations
- Cache pollution: Large ring structures affect CPU cache
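To make the memory point concrete, here's a rough back-of-envelope sketch. The ~300 bytes per entry is an assumption covering the boxed Long key, the red-black tree node, and string references; real numbers vary by JVM:
// Back-of-envelope ring footprint, under the stated assumptions
long physicalNodes = 1000, virtualNodesPerNode = 150;
long entries = physicalNodes * virtualNodesPerNode; // 150,000 TreeMap entries
long assumedBytesPerEntry = 300;                    // assumed: boxed key + tree node + refs
System.out.printf("%,d entries ≈ %.0f MB%n",
        entries, entries * assumedBytesPerEntry / 1e6);
// → 150,000 entries ≈ 45 MB, the same order as the ~46MB in the table below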
Rendezvous Hashing: The Simple Alternative
Rendezvous Hashing (Highest Random Weight) takes a completely different approach: for each key, compute a score with every node and pick the highest scorer.
import java.nio.charset.StandardCharsets;
import java.util.*;
import net.jpountz.xxhash.XXHash64;
import net.jpountz.xxhash.XXHashFactory;

public class RendezvousHashing {
    // Shared hasher; creating the factory on every call would be wasteful
    private static final XXHash64 HASHER = XXHashFactory.fastestInstance().hash64();

    private final List<String> nodes;

    public RendezvousHashing(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    public String getNode(String key) {
        if (nodes.isEmpty()) return null;
        String bestNode = null;
        long maxScore = Long.MIN_VALUE;
        // Score every node for this key and keep the highest scorer
        for (String node : nodes) {
            long score = hash(key + ":" + node);
            if (score > maxScore) {
                maxScore = score;
                bestNode = node;
            }
        }
        return bestNode;
    }

    // Handle node failures gracefully: the next-highest scorer wins
    public String getAvailableNode(String key, Set<String> failedNodes) {
        String bestNode = null;
        long maxScore = Long.MIN_VALUE;
        for (String node : nodes) {
            if (!failedNodes.contains(node)) {
                long score = hash(key + ":" + node);
                if (score > maxScore) {
                    maxScore = score;
                    bestNode = node;
                }
            }
        }
        return bestNode;
    }

    // Add/remove nodes dynamically
    public void addNode(String node) {
        if (!nodes.contains(node)) {
            nodes.add(node);
        }
    }

    public void removeNode(String node) {
        nodes.remove(node);
    }

    private long hash(String input) {
        // xxHash64 for speed; a fixed seed keeps results deterministic
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        return HASHER.hash(bytes, 0, bytes.length, 0L);
    }
}
Rendezvous Hash in Action:
// Small microservice cluster
RendezvousHashing router = new RendezvousHashing(
Arrays.asList("auth-service", "user-service", "order-service", "payment-service")
);
// Route requests deterministically
System.out.println(router.getNode("process-payment-user123")); // → payment-service
System.out.println(router.getNode("authenticate-user123")); // → auth-service
System.out.println(router.getNode("get-profile-user123")); // → user-service
// auth-service crashes - automatic failover
Set<String> failed = Set.of("auth-service");
String fallback = router.getAvailableNode("authenticate-user123", failed);
System.out.println(fallback); // → user-service (next highest score)
// Add new service - minimal disruption
router.addNode("notification-service");
// Only ~20% of keys will be rerouted (1/5 of total traffic)
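The "~20% rerouted" figure is easy to check empirically by comparing assignments before and after the fifth node joins. A sketch using the RendezvousHashing class above (the 100,000 synthetic keys stand in for real request IDs):
List<String> four = Arrays.asList("auth-service", "user-service",
        "order-service", "payment-service");
List<String> five = new ArrayList<>(four);
five.add("notification-service");

RendezvousHashing before = new RendezvousHashing(four);
RendezvousHashing after = new RendezvousHashing(five);

int moved = 0, total = 100_000;
for (int i = 0; i < total; i++) {
    String key = "request:" + i;
    if (!before.getNode(key).equals(after.getNode(key))) moved++;
}
// Prints roughly 20%: exactly the keys whose new highest scorer is the new node
System.out.printf("rerouted: %.1f%%%n", 100.0 * moved / total);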
Rendezvous Hash Benefits:
- Zero memory overhead: No lookup tables or ring structures
- Perfect failover: Automatic fallback to next-best node
- Simple implementation: No virtual nodes or complex ring management
- Dynamic membership: Add/remove nodes anytime with minimal code
- Deterministic: Same key+nodes always produces same result
- Minimal data movement: Adding nth node moves only 1/n of data
Rendezvous Hash Drawbacks:
- O(n) lookup time: Must check every node for each key
- Scales poorly: 1000 nodes = 1000 hash computations per lookup
- CPU intensive: High computational overhead for large clusters
Three-Way Performance Comparison
Aspect | Jump Hash | Traditional Ring Hash | Rendezvous Hash
---|---|---|---
Memory Usage | O(1), zero overhead | O(n×v), ~46MB for 1000 nodes | O(1), zero overhead
Lookup Speed | O(ln n), a handful of arithmetic ops | O(log n), one TreeMap search | O(n), 1000 hashes for 1000 nodes
Add Node | ~1/n of keys move (append at end only) | ~1/n of keys move | ~1/n of keys move
Remove Node | ⚠️ Only the last bucket | ✅ Any node, graceful | ✅ Any node
Node Failure | ❌ No built-in failover | ✅ Automatic failover | ✅ Next-best selection
Implementation | ~5 lines of code | ~100 lines of code | ~20 lines of code
Best For | Planned scaling | Large dynamic clusters | Small dynamic clusters
When to Use Each Algorithm
Choose Jump Hash when:
- Planned scaling: You control when nodes are added
- Stable infrastructure: Nodes rarely fail unexpectedly
- Memory constraints: Every MB of RAM matters
- Performance critical: Maximum lookup speed needed
- Simple deployment: Minimal code complexity preferred
Examples: Database sharding, analytics workloads, batch processing systems
Choose Traditional Consistent Hash when:
- Dynamic environments: Nodes join/leave frequently
- High availability: Automatic failover required
- Unpredictable failures: Nodes can crash anytime
- Flexible infrastructure: Need arbitrary node naming
- Distributed caches: CDNs, cache clusters, microservices
Examples: Redis clusters, CDN routing, microservice discovery, distributed caches
Choose Rendezvous Hash when:
- Small clusters: Up to 50-100 nodes maximum
- Simple implementation: Want minimal code complexity
- Dynamic membership: Frequent node additions/removals
- Perfect failover: Need guaranteed fallback behavior
- Development/testing: Prototyping distributed systems
Examples: Microservice routing, small cache clusters, development environments, edge computing
The Universal Impact
These three algorithms solve the same fundamental problem but represent different points on the performance-flexibility spectrum:
- Jump Hash: Maximum performance through constraint acceptance
- Ring Hash: Maximum flexibility through complexity investment
- Rendezvous Hash: Balanced simplicity for smaller scale
Real-world adoption:
- Jump Hash: ClickHouse, Google internal systems, high-performance analytics platforms
- Ring Hash: Amazon DynamoDB, Apache Cassandra, CDN routing systems (Redis Cluster uses hash slots instead; see below)
- Rendezvous Hash: Small microservice clusters, development tools, edge computing platforms
The choice between algorithms reflects a fundamental principle in distributed systems: every architectural decision involves trade-offs. Performance, flexibility, operational complexity, and infrastructure constraints all influence the optimal choice.
Understanding all three approaches helps you make informed decisions rather than defaulting to “industry standard” solutions. Sometimes the simplest algorithm (Rendezvous) works best for your use case, sometimes you need maximum performance (Jump Hash), and sometimes flexibility trumps all (Ring Hash).
Redis Hash Slots: A Hybrid Approach
Redis Cluster uses a unique hybrid approach that doesn’t fit neatly into our three categories. Instead of using consistent hashing directly, Redis pre-divides the keyspace into exactly 16384 hash slots.
Why 16384 Slots?
Redis chose 16384 as the magic number for several practical reasons:
1. Network efficiency: Each Redis node maintains a bitmap of which slots it owns. With 16384 slots, this bitmap is exactly 2KB (16384 bits ÷ 8 = 2048 bytes). This fits comfortably in network packets and is small enough to gossip efficiently between nodes.
2. Memory optimization: 16384 = 2^14, making bit operations extremely fast. The slot calculation becomes a simple bitwise AND operation with a 14-bit mask.
3. Practical cluster limits: Redis clusters typically run 3-1000 nodes. With 16384 slots, you get good distribution even with small clusters while avoiding the complexity of millions of virtual nodes.
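Because 16384 is a power of two, the modulo in the snippet below can equivalently be written as a bit mask; a quick check of the equivalence:
// For any non-negative value, v % 16384 == v & 0x3FFF, since 16384 == 1 << 14
int crc = 0xBEEF; // example CRC16 output
assert (crc % 16384) == (crc & 0x3FFF);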
How Redis Distributes Slots
// Redis slot calculation (simplified; assumes a CRC16 helper implementing
// the XMODEM variant Redis uses)
public class RedisSlotDistribution {
    private static final int HASH_SLOTS = 16384;

    public static int calculateSlot(String key) {
        // Redis uses CRC16 for slot calculation; % 16384 equals & 0x3FFF here
        return CRC16.crc16(key.getBytes()) % HASH_SLOTS;
    }

    // Example: 20 nodes in cluster
    public static void distributeSlots(int nodeCount) {
        int slotsPerNode = HASH_SLOTS / nodeCount; // 819
        int remainingSlots = HASH_SLOTS % nodeCount; // 4
        System.out.println("Slots per node: " + slotsPerNode);
        System.out.println("Extra slots for first " + remainingSlots + " nodes");
        // First 4 nodes get 820 slots, remaining 16 nodes get 819 slots
        // Node 0: slots 0-819 (820 slots)
        // Node 1: slots 820-1639 (820 slots)
        // Node 2: slots 1640-2459 (820 slots)
        // Node 3: slots 2460-3279 (820 slots)
        // Node 4: slots 3280-4098 (819 slots)
        // Node 5: slots 4099-4917 (819 slots)
        // ...
        // Node 19: slots 15565-16383 (819 slots)
    }
}
Benefits of the Hash Slots Approach
1. Predictable resharding: When adding/removing nodes, you know exactly which slot ranges to move. No complex ring calculations.
2. Simple replication: Each slot can be replicated to multiple nodes independently. If slot 1234 is on node A, its replica can be on any other node.
3. Efficient cluster management: The cluster state (which node owns which slots) is compact and can be gossiped efficiently.
4. Deterministic routing: Any client can calculate which node handles a key without asking the cluster (see the sketch below).
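A minimal sketch of point 4: the RedisClientRouter class and slotOwners table are hypothetical, and the CRC16 helper is the one assumed in the previous snippet; a real client would populate the table from CLUSTER SLOTS or CLUSTER SHARDS responses.
public class RedisClientRouter {
    // Hypothetical routing table: slotOwners[slot] holds the owning node's address,
    // refreshed from cluster topology responses
    private final String[] slotOwners = new String[16384];

    public String nodeFor(String key) {
        // Same CRC16-based calculation the server uses; no network round-trip needed
        int slot = CRC16.crc16(key.getBytes()) % 16384;
        return slotOwners[slot];
    }
}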
This approach combines the benefits of pre-sharding (like database sharding) with the flexibility of consistent hashing, making it ideal for Redis’s use case of medium-sized clusters with frequent data access.
References
- John Lamping and Eric Veach, “A Fast, Minimal Memory, Consistent Hash Algorithm” (2014), arXiv:1406.2294