I built this distributed in-memory cache in Go to mess around with consistent hashing and synchronous replication. It handles automatic sharding, failover, and has pluggable eviction policies (LRU/LFU) along with a token-bucket rate limiter.
- Consistent Hashing: The cluster uses a hash ring with 150 virtual nodes per server. Clients calculate the hash and route requests directly to the right node, which minimizes data shuffling when a server crashes or joins.
- Sync Replication (RF=2): Every key gets 1 primary and 1 replica. Writes block until the replica successfully acknowledges the data. It costs write latency, but guarantees strong durability.
- Auto-Failover: Nodes ping each other every second. Miss 3 pings? You're kicked off the local hash ring. When a primary dies, the replica naturally takes over because it sits next to it on the ring.
- Pluggable Eviction: Built a clean interface so you can easily swap between LRU (Least Recently Used) and LFU (Least Frequently Used).
Just use Docker Compose:
docker-compose up --build
This boots up a 5-node cluster exposed on localhost ports 8081-8085.
I wrote an automated chaos test to prove the failover actually works. It spins up 5 local nodes, sets a key, completely kills the primary node responsible for that key, and verifies that the replica promotes itself and serves the data.
cd test
go test -v .
To see how it performs under pressure, I wrote a benchmark tool that mimics real-world "hot keys" using a skewed Zipfian distribution.
# Start your servers in the background with rate-limiting disabled, then:
./bench_bin -peers localhost:8081,localhost:8082,localhost:8083,localhost:8084,localhost:8085 -ops 50000 -keys 5000
Results (50,000 ops, 50 parallel workers):
| Policy | Throughput | Hit Ratio |
|---|---|---|
| LRU | ~85,000 ops/sec | ~83% |
| LFU | ~84,500 ops/sec | ~84% |
Note: LFU edges out LRU on hit ratio here because, under heavy Zipfian skew, it strictly protects the absolute hottest keys, whereas LRU can get flushed by a random burst of cold reads. Throughput is basically identical because I implemented LFU in O(1) time using a map of doubly-linked lists.
Because I went with Synchronous Replication, a write looks like this:
- Primary gets write.
- Primary forwards it to the Replica.
- Replica ACKs.
- Primary applies it locally.
- Primary ACKs the client.
The Race Conditions:
- If the primary dies before step 2: The replica never gets it. The client gets a connection error. The write fails cleanly.
- If the primary dies after step 3, but before step 5: The replica has the data, but the client connection drops. The client gets an error and thinks the write failed, even though it survived. (A false negative).
TL;DR: If the client receives a "Success" response, that data is definitively on two nodes. We trade a bit of write speed and accept potential false negatives to guarantee we never lose an acknowledged write.