https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
- Summary: https://medium.com/@shagun/scaling-memcache-at-facebook-1ba77d71c082
- Problems
- Low latency access to shared storage pool at low cost
- K/V Interface, Scalable, eventually consistent.
- Demand filled look-aside cache.
- Users consume orders of magnitude more content than they create.
- Summary
-
Consistent hashing to distribute keys across servers.
- Web servers have to routinely communicate with multiple cache servers to satisfy user req. This may cause incast congestion or a single server becoming the bottleneck.[Long tail]
-
Clients
- serialization, compression, request-routing, error handling, req batching
- Maintain a map of all available servers, updated via auxiliary config system.
- Batch requests to individual partitions to fetch multiple items concurrently
- UDP for GET requests
- Bypass mcrouter, no TCP overhead; custom detection of dropped out of order packets via seq numbers.
- TCP for set and delete via mcrouter
-
MCRouter
- A proxy deployed to web servers
- Request Coalescing via the open TCP connections to Memcached servers.
-
Favor Deletes for invalidations over updates. Deletes are Idempotent.
-
Cache invalidations via Daemons reading MySQL update log.