Skip to content

h30s/ShardCache

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ShardCache

I built this distributed in-memory cache in Go to mess around with consistent hashing and synchronous replication. It handles automatic sharding, failover, and has pluggable eviction policies (LRU/LFU) along with a token-bucket rate limiter.

How it Works

  • Consistent Hashing: The cluster uses a hash ring with 150 virtual nodes per server. Clients calculate the hash and route requests directly to the right node, which minimizes data shuffling when a server crashes or joins.
  • Sync Replication (RF=2): Every key gets 1 primary and 1 replica. Writes block until the replica successfully acknowledges the data. It costs write latency, but guarantees strong durability.
  • Auto-Failover: Nodes ping each other every second. Miss 3 pings? You're kicked off the local hash ring. When a primary dies, the replica naturally takes over because it sits next to it on the ring.
  • Pluggable Eviction: Built a clean interface so you can easily swap between LRU (Least Recently Used) and LFU (Least Frequently Used).

Spinning it Up

Just use Docker Compose:

docker-compose up --build

This boots up a 5-node cluster exposed on localhost ports 8081-8085.

Breaking Things (Failover Test)

I wrote an automated chaos test to prove the failover actually works. It spins up 5 local nodes, sets a key, completely kills the primary node responsible for that key, and verifies that the replica promotes itself and serves the data.

cd test
go test -v .

Benchmarks

To see how it performs under pressure, I wrote a benchmark tool that mimics real-world "hot keys" using a skewed Zipfian distribution.

# Start your servers in the background with rate-limiting disabled, then:
./bench_bin -peers localhost:8081,localhost:8082,localhost:8083,localhost:8084,localhost:8085 -ops 50000 -keys 5000

Results (50,000 ops, 50 parallel workers):

Policy Throughput Hit Ratio
LRU ~85,000 ops/sec ~83%
LFU ~84,500 ops/sec ~84%

Note: LFU edges out LRU on hit ratio here because, under heavy Zipfian skew, it strictly protects the absolute hottest keys, whereas LRU can get flushed by a random burst of cold reads. Throughput is basically identical because I implemented LFU in O(1) time using a map of doubly-linked lists.

The Edge Case: What if the primary dies mid-write?

Because I went with Synchronous Replication, a write looks like this:

  1. Primary gets write.
  2. Primary forwards it to the Replica.
  3. Replica ACKs.
  4. Primary applies it locally.
  5. Primary ACKs the client.

The Race Conditions:

  • If the primary dies before step 2: The replica never gets it. The client gets a connection error. The write fails cleanly.
  • If the primary dies after step 3, but before step 5: The replica has the data, but the client connection drops. The client gets an error and thinks the write failed, even though it survived. (A false negative).

TL;DR: If the client receives a "Success" response, that data is definitively on two nodes. We trade a bit of write speed and accept potential false negatives to guarantee we never lose an acknowledged write.

About

ShardCache is a distributed in-memory cache written in Go. It supports automatic sharding via consistent hashing, primary-replica synchronous replication, automatic failover, pluggable eviction policies (LRU, LFU), and token-bucket rate limiting.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors