PetraCache: Building a Memcached-Compatible Cache with RocksDB
The Problem
Memcached is fast. Really fast. But when it restarts, your cache is gone. Cold cache means every request hits your database until the cache warms up again. At scale, this can take down your entire system.
I wanted to explore a different approach: what if we could add persistence to memcached without changing the protocol?
Why memcached Protocol?
mcrouter is Meta’s memcached router—5 billion requests per second in production. Consistent hashing, failover, replication, connection pooling. All battle-tested at massive scale.
But mcrouter only speaks memcached protocol. To leverage it, your backend needs to be memcached-compatible.
That’s the gap PetraCache fills: memcached protocol + persistent storage. Drop it behind mcrouter and you get distributed caching with durability—without reinventing the routing layer.
Enter PetraCache
PetraCache is a memcached-compatible server backed by RocksDB. That’s it.
┌──────────────┐     ┌───────────┐     ┌─────────────────────────┐
│   Your App   │────▶│ mcrouter  │────▶│ PetraCache              │
│  (memcache   │     │ (routing, │     │  ├─ memcached protocol  │
│   client)    │     │  failover)│     │  ├─ RocksDB storage     │
└──────────────┘     └───────────┘     │  └─ Data survives       │
                                       │     restarts            │
                                       └─────────────────────────┘
Your app thinks it’s talking to memcached. mcrouter handles routing. PetraCache handles storage. Everyone does one job well.
Technical Decisions
Why RocksDB?
RocksDB is an LSM-tree storage engine, optimized for write-heavy workloads. It’s battle-tested at Meta, Netflix, and countless other companies.
For a cache workload:
- Block cache keeps hot data in memory (as fast as memcached)
- SST files persist everything to disk (survives restarts)
- Compaction cleans up deleted/expired keys in background
- Compression (LZ4) reduces disk usage with minimal CPU overhead
This gives in-memory speed for hot data and persistence for everything else.
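Here's roughly what that tuning looks like with the rust-rocksdb crate. The cache size and path are illustrative placeholders, not PetraCache's actual settings:
use rocksdb::{BlockBasedOptions, Cache, DBCompressionType, Options, DB};

let mut opts = Options::default();
opts.create_if_missing(true);
opts.set_compression_type(DBCompressionType::Lz4); // cheap on-disk compression

// Keep hot blocks in RAM; misses fall through to the SST files on disk.
let mut block_opts = BlockBasedOptions::default();
let cache = Cache::new_lru_cache(512 * 1024 * 1024); // 512 MiB block cache (illustrative)
block_opts.set_block_cache(&cache);
opts.set_block_based_table_factory(&block_opts);

let db = DB::open(&opts, "/var/lib/petracache")?; // path is a placeholder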
The WAL Trade-off
RocksDB’s Write-Ahead Log (WAL) ensures durability: writes go to the WAL first, then to the memtable. If the process crashes, the WAL replays writes that hadn’t been flushed to disk yet.
I disabled it.
let mut write_opts = WriteOptions::default();
write_opts.disable_wal(true);
// Every put then skips the WAL:
db.put_opt(key, value, &write_opts)?;
Why? This is a cache. If we lose the last second of writes during a crash, the app will re-populate from the source of truth. The durability guarantee isn’t worth the write latency cost.
Result: writes go directly to memtable (RAM), flushed to disk asynchronously. Much faster.
TTL Expiration: Two Strategies
memcached supports TTL (time-to-live) on keys. PetraCache implements expiration two ways:
1. Lazy expiration (on read)
pub fn get(&self, key: &[u8]) -> Result<Option<StoredValue>> {
    match self.db.get(key)? {
        Some(bytes) => {
            let value = StoredValue::decode(&bytes)?;
            if value.is_expired() {
                self.db.delete(key)?;
                Ok(None) // Pretend it doesn't exist
            } else {
                Ok(Some(value))
            }
        }
        None => Ok(None),
    }
}
When you GET an expired key, we delete it and return nothing. Simple.
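For completeness, is_expired can be a one-line check over the value header (see Value Format below). A sketch with assumed field names, mirroring the logic the compaction filter uses:
impl StoredValue {
    fn is_expired(&self) -> bool {
        // expire_at == 0 means the key never expires
        self.expire_at != 0 && current_timestamp() >= self.expire_at
    }
}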
2. Compaction filter (background cleanup)
fn ttl_compaction_filter(_level: u32, _key: &[u8], value: &[u8]) -> CompactionDecision {
    if value.len() >= 8 {
        let expire_at = u64::from_le_bytes(value[0..8].try_into().unwrap());
        if expire_at != 0 && current_timestamp() >= expire_at {
            return CompactionDecision::Remove;
        }
    }
    CompactionDecision::Keep
}
During RocksDB compaction, we check each key’s expiration. Expired keys are dropped, reclaiming disk space without explicit deletes.
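The filter gets registered once, when the database is opened. A minimal sketch assuming the rust-rocksdb crate, whose Options::set_compaction_filter takes a name plus a plain function (the CompactionDecision type above corresponds to its Decision type):
let mut opts = Options::default();
opts.create_if_missing(true);
opts.set_compaction_filter("ttl", ttl_compaction_filter);
let db = DB::open(&opts, data_dir)?; // data_dir: wherever the SST files live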
Value Format
Each value stored in RocksDB:
[8 bytes: expire_at][4 bytes: flags][N bytes: data]
- expire_at first: compaction filter can check TTL without decoding data
- Little-endian: matches the byte order of x86-64 and ARM (Apple Silicon), so no byte swapping on read
- Fixed header: O(1) access to metadata
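A minimal encoder for this layout (function and field names are mine, not necessarily PetraCache's):
fn encode(expire_at: u64, flags: u32, data: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(12 + data.len());
    buf.extend_from_slice(&expire_at.to_le_bytes()); // [8 bytes: expire_at]
    buf.extend_from_slice(&flags.to_le_bytes());     // [4 bytes: flags]
    buf.extend_from_slice(data);                     // [N bytes: data]
    buf
}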
Zero-Copy Parsing
The memcached protocol is text-based:
set mykey 0 3600 5\r\n
hello\r\n
That’s key, flags, TTL in seconds, and byte count, followed by the payload. Parsing this without allocations:
pub enum Command<'a> {
    Get { keys: Vec<Cow<'a, [u8]>> },
    Set { key: Cow<'a, [u8]>, data: Cow<'a, [u8]>, ... },
    // ...
}
Cow<'a, [u8]> (clone-on-write) borrows from the input buffer when possible, only allocating when necessary. For most requests: zero heap allocations during parsing.
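As an illustration (not PetraCache's actual parser), a borrowed GET parse might look like this, reusing the Command enum above and skipping error handling:
use std::borrow::Cow;

fn parse_get(line: &[u8]) -> Option<Command<'_>> {
    let rest = line.strip_prefix(b"get ")?;
    let keys = rest
        .split(|&b| b == b' ')
        .map(Cow::Borrowed) // borrows straight out of the input buffer
        .collect();
    Some(Command::Get { keys })
}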
Why Not spawn_blocking for RocksDB?
RocksDB operations can block (disk I/O). Tokio’s golden rule: don’t block the async runtime.
I broke the rule intentionally.
// We do this (direct call)
let value = storage.get(key)?;

// Instead of this (note the extra ? to unwrap the JoinError)
let value = tokio::task::spawn_blocking(move || storage.get(key)).await??;
Why? Block cache hits are ~100 nanoseconds. spawn_blocking overhead is ~5-10 microseconds. For a cache with 95%+ hit ratio, the overhead exceeds the benefit.
If your working set exceeds block cache (lots of disk reads), reconsider this.
Performance
Single instance, Apple Silicon, 1KB values, 50% GET / 50% SET:
$ memtier_benchmark -s 127.0.0.1 -p 11211 --protocol=memcache_text \
--clients=10 --threads=2 --test-time=30 --ratio=1:1 --data-size=1000
Type          Ops/sec   p50 Latency   p99 Latency   p99.9 Latency
-----------------------------------------------------------------
Sets         68504.04        0.14ms        0.37ms          0.49ms
Gets         68503.77        0.14ms        0.33ms          0.44ms
Totals      137007.81        0.14ms        0.35ms          0.47ms
137K ops/sec with sub-millisecond latency. Good enough for most use cases.
Scale horizontally with mcrouter: add more PetraCache instances, mcrouter distributes keys via consistent hashing.
mcrouter Configuration Example
Here’s an example setup: Istanbul as primary, Ankara as async replica. All reads go to Istanbul. Writes go to Istanbul first (sync), then replicate to Ankara (async).
{
  "pools": {
    "istanbul": {
      "servers": [
        "istanbul-petracache-1:11211",
        "istanbul-petracache-2:11211"
      ]
    },
    "ankara": {
      "servers": [
        "ankara-petracache-1:11211",
        "ankara-petracache-2:11211"
      ]
    }
  },
  "route": {
    "type": "OperationSelectorRoute",
    "default_policy": {
      "type": "FailoverRoute",
      "children": [
        { "type": "PoolRoute", "pool": "istanbul" },
        { "type": "PoolRoute", "pool": "ankara" }
      ]
    },
    "operation_policies": {
      "set": {
        "type": "AllInitialRoute",
        "children": [
          { "type": "PoolRoute", "pool": "istanbul" },
          { "type": "PoolRoute", "pool": "ankara" }
        ]
      },
      "delete": {
        "type": "AllInitialRoute",
        "children": [
          { "type": "PoolRoute", "pool": "istanbul" },
          { "type": "PoolRoute", "pool": "ankara" }
        ]
      }
    }
  }
}
What this does:
- GET: Routes to Istanbul, fails over to Ankara if Istanbul is down
- SET: Writes to Istanbul (sync), then Ankara (async)
- DELETE: Same as SET—Istanbul first, Ankara async
Two key route types here:
- FailoverRoute: Tries Istanbul first. If it fails (timeout, connection refused), automatically retries on Ankara. No manual intervention needed.
- AllInitialRoute: Waits for the first child (Istanbul) to respond, then fires off the rest (Ankara) without waiting. Your app sees Istanbul latency; Ankara gets eventual consistency.
Istanbul down? GETs automatically fail over to Ankara. When Istanbul recovers, it starts serving again. Zero config changes needed.
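To try this topology, point mcrouter at the config file. Something like the following, assuming a standard mcrouter install (check its docs for your version's exact flags):
mcrouter --config file:/etc/mcrouter/petracache.json -p 5000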
What’s Missing
PetraCache is alpha software. Not implemented yet:
- add, replace (conditional writes)
- incr, decr (atomic counters)
- cas (compare-and-swap)
- stats (server statistics)
- flush_all (clear all keys)
For a cache that just needs GET/SET/DELETE, it works today.
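For example, a raw SET/GET round trip over the text protocol looks like this (client lines flush left, server replies indented for readability):
set greet 0 60 5\r\n
hello\r\n
  STORED\r\n
get greet\r\n
  VALUE greet 0 5\r\n
  hello\r\n
  END\r\n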
What I Learned
Building PetraCache was an exercise in learning by integration. Instead of building everything from scratch, I focused on understanding how proven components work and how to connect them.
- mcrouter: Learned how Meta handles distributed caching at 5B req/sec—consistent hashing, failover strategies, connection pooling
- RocksDB: Dove deep into LSM-trees, compaction filters, write amplification, block cache tuning
- memcached protocol: Implemented the text protocol from scratch, learned zero-copy parsing in Rust
The result is ~2,000 lines of Rust that glue these components together. Not production-hardened yet, but a working prototype that taught me more than any tutorial could.
Sometimes the best way to learn a technology is to build something that depends on it.
Lessons Learned
1. Solve one problem
I needed persistence. That’s it. mcrouter already solved distribution. I didn’t build a distributed cache—I built a storage backend.
2. Measure before optimizing
I assumed RocksDB blocking calls would be a problem. Benchmarks showed they weren’t. spawn_blocking would have added latency for no benefit.
3. Simple beats clever
- No custom storage format—RocksDB handles it
- No custom network protocol—memcached ASCII works
- No custom distribution—mcrouter handles it
The best code is code you don’t write.
4. Trade-offs are features
Disabling WAL isn’t a bug. It’s a deliberate choice: cache semantics don’t require durability. Document the trade-off and move on.
5. Existing solutions are underrated
Before writing code, ask: “Has someone already solved this?” Usually, yes. Your job is to find it and integrate it well.
Try It
git clone https://github.com/umit/petracache
cd petracache
cargo build --release
./target/release/petracache config.toml
Or with Docker (coming soon).
PetraCache is open source under MIT license. Contributions welcome.
Petra (πέτρα) means “rock” in Greek—a nod to RocksDB.