Introducing mcrouter: A memcached protocol router for scaling memcached deployments

most web-based services begin as a collection of front-end application servers paired with databases used to manage data storage. as they grow, the databases are augmented with caches to store frequently-read pieces of data and improve site performance. often, the ability to quickly access data moves from being an optimization to a requirement for a site. this evolution of cache from neat optimization to necessity is a common path that has been followed by many large web scale companies, including facebook, twitter[1], instagram, reddit, and many others.

since any client that wants to talk to memcached can already speak the standard ascii memcached protocol, we use that as the common api and enter the picture silently. to a client, mcrouter looks like a memcached server. to a server, mcrouter looks like a normal memcached client. but mcrouter's feature-rich configurability makes it more than a simple proxy.

some features of mcrouter are listed below. in the following, a “destination” is a memcached host (or some other cache service that understands the memcached protocol) and “pool” is a set of destinations configured for some workload — e.g., a sharded pool with a specified hashing function, or a replicated pool with multiple copies of the data on separate hosts. finally, pools can be organized into multiple clusters.

standard open source memcached ascii protocol support: any client that can speak the memcached protocol can already talk to mcrouter; no changes are needed. mcrouter can simply be dropped in between clients and memcached boxes to take advantage of its functionality.

connection pooling: multiple clients can connect to a single mcrouter instance and share the outgoing connections, reducing the number of open connections to memcached instances.

multiple hashing schemes: mcrouter provides a proven consistent hashing algorithm (furc_hash) that allows distribution of keys across many memcached instances. hostname hashing is useful for selecting a unique replica per client. there are a number of other hashes useful in specialized applications.
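
As a rough illustration of why consistent hashing matters when a pool grows, here is a minimal Python sketch of a generic hash ring with virtual nodes. It is not furc_hash and not mcrouter's code; the `HashRing` class, the host names, and the choice of md5 are assumptions made only for this example.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    # Stable 64-bit position on the ring; md5 is used only for its even spread.
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, hosts, vnodes=100):
        self._ring = sorted(
            (_point(f"{host}#{i}"), host) for host in hosts for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key: str) -> str:
        # First virtual node clockwise from the key's position, wrapping around.
        idx = bisect.bisect(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]


hosts = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]  # hypothetical hosts
before = HashRing(hosts)
after = HashRing(hosts + ["cache-d:11211"])
keys = [f"user:{n}" for n in range(1000)]
moved = [k for k in keys if before.pick(k) != after.pick(k)]
```

With this scheme, adding a fourth host only moves the keys that now belong to the new host; every other key keeps its original destination, so most of the cache stays warm.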

prefix routing: mcrouter can route keys according to common key prefixes. for example, you can send all keys starting with “foo” to one pool, “bar” prefix to another pool, and everything else to a “wildcard” pool. this is a simple way to separate different workloads.
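
The foo/bar/wildcard example can be sketched in a few lines of Python. This is only an illustration of the idea, not mcrouter's configuration syntax; the function name, the pool names, and the longest-prefix tie-break are assumptions.

```python
def route_by_prefix(key, prefix_pools, wildcard_pool):
    """Return the pool for the longest configured prefix that matches the key."""
    matches = [prefix for prefix in prefix_pools if key.startswith(prefix)]
    if not matches:
        return wildcard_pool
    # Assumption: when several prefixes match, the most specific one wins.
    return prefix_pools[max(matches, key=len)]


routes = {"foo": "pool_foo", "bar": "pool_bar"}  # hypothetical pool names
foo_pool = route_by_prefix("foo:session:1", routes, "pool_wildcard")
other_pool = route_by_prefix("user:42", routes, "pool_wildcard")
```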

replicated pools: a replicated pool has the same data on multiple hosts. writes are replicated to all hosts in the pool, while reads are routed to a single replica chosen separately for each client. this is useful either when per-host packet limits mean a sharded pool could not handle the read rate, or when the data needs higher availability (one replica going down doesn't affect availability, thanks to automatic failover).
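
A minimal Python sketch of the replicated-pool behavior follows, assuming plain dicts stand in for memcached hosts and a CRC32 of the client hostname picks the read replica. The `ReplicatedPool` class and the hostname are hypothetical; mcrouter's actual hostname hash may differ.

```python
import zlib


class ReplicatedPool:
    """Writes go to every replica; reads go to one replica chosen per client."""

    def __init__(self, replicas, client_hostname):
        self.replicas = replicas  # list of dicts standing in for cache hosts
        # Hash the client's hostname so each client sticks to a single replica.
        self.read_index = zlib.crc32(client_hostname.encode()) % len(replicas)

    def set(self, key, value):
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        return self.replicas[self.read_index].get(key)


replicas = [{}, {}, {}]
pool = ReplicatedPool(replicas, client_hostname="web-017")  # hypothetical client
pool.set("config:flags", "v2")
value = pool.get("config:flags")
```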

production traffic shadowing: when testing new cache hardware, we found it extremely useful to be able to route a complete copy of production traffic from clients. mcrouter supports flexible shadowing configuration. it's possible to shadow test a different pool size (re-hashing the key space), shadow only a fraction of the key space, or vary shadowing settings dynamically at runtime.
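
Shadowing a fraction of the key space could look roughly like the Python sketch below. The bucket scheme, function names, and dict-backed pools are assumptions for illustration and do not reflect mcrouter's shadowing configuration.

```python
import zlib


def in_shadow_range(key, fraction):
    # Map each key to one of 1000 buckets; shadow the lowest `fraction` of them.
    bucket = zlib.crc32(key.encode()) % 1000
    return bucket < int(fraction * 1000)


def set_with_shadow(key, value, production, shadow, fraction):
    production[key] = value  # the client only ever sees this result
    if in_shadow_range(key, fraction):
        shadow[key] = value  # copy of the request; its reply is ignored
    return "STORED"


production, shadow = {}, {}
for n in range(200):
    set_with_shadow(f"item:{n}", n, production, shadow, fraction=0.25)
```

Because the decision depends only on the key, the same keys are shadowed on every request, which keeps the shadow pool's hit rate meaningful.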

online reconfiguration: mcrouter monitors its configuration files and automatically reloads them on any file change; this loading and parsing is done on a background thread and new requests are routed according to the new configuration as soon as it's ready. there's no extra latency from the client's point of view.

flexible routing: configuration is specified as a graph of small routing modules called “route handles,” which share a common interface (route a request and return a reply) and which can be composed freely. route handles are easy to understand, create, and test individually, allowing for arbitrarily complex logic when used together. for example: an “all-sync” route handle will be set up with multiple child route handles (which themselves could be arbitrary route handles). it will pass a request to all of its children and wait for all of the replies to come back before returning one of these replies. other examples include, among many others, “all-async” (send to all but don't wait for replies), “all-majority” (for consensus polling), and “failover” (send to every child in order until a non-error reply is returned). expanding a pool can be done quickly by using a “cold cache warmup” route handle on the pool (with the old set of servers as the warm pool). moving this route handle up the stack will allow an entire cluster to be warmed up from a warm cluster.
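
To make the composition idea concrete, here is a small Python sketch of route handles sharing one `route(request)` interface. The class names mirror the handles described above, but the code, the request tuples, the reply dicts, and the rule for which reply "all-sync" returns are all assumptions for illustration, not mcrouter's implementation.

```python
class Destination:
    """Leaf route handle backed by a dict; up=False simulates an unreachable host."""

    def __init__(self, name, up=True):
        self.name, self.up, self.data = name, up, {}

    def route(self, request):
        if not self.up:
            return {"error": f"{self.name} unavailable"}
        op, key, *value = request
        if op == "set":
            self.data[key] = value[0]
            return {"result": "stored"}
        return {"result": self.data.get(key)}


class AllSync:
    """Send to every child, wait for all replies, then return one of them."""

    def __init__(self, children):
        self.children = children

    def route(self, request):
        replies = [child.route(request) for child in self.children]
        errors = [reply for reply in replies if "error" in reply]
        # Assumption: surface an error if any child failed, else the first reply.
        return errors[0] if errors else replies[0]


class Failover:
    """Try children in order until one returns a non-error reply."""

    def __init__(self, children):
        self.children = children

    def route(self, request):
        reply = {"error": "no children configured"}
        for child in self.children:
            reply = child.route(request)
            if "error" not in reply:
                break
        return reply


primary = Destination("primary", up=False)
backup_a, backup_b = Destination("backup-a"), Destination("backup-b")
tree = Failover([primary, AllSync([backup_a, backup_b])])
set_reply = tree.route(("set", "foo", "bar"))
get_reply = tree.route(("get", "foo"))
```

Because every handle exposes the same interface, `Failover` does not need to know that its second child is itself a composite; this is what allows handles to be nested to any depth.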

destination health monitoring and automatic failover: mcrouter keeps track of the health status of each destination. if mcrouter marks a destination as unresponsive, it will fail over incoming requests to an alternate destination automatically (fast failover) without attempting to send them to the original destination. at the same time health check requests will be sent in the background, and as soon as a health check is successful, mcrouter will revert to using the original destination. we distinguish between “soft errors” (e.g., data timeouts) that are allowed to happen a few times in a row and “hard errors” (e.g., connection refused) that cause a host to be marked unresponsive immediately. needless to say, all of this is completely transparent to the client.
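
The soft/hard error distinction can be sketched as a small state tracker in Python. The threshold of three consecutive soft errors, the outcome labels, and all names below are hypothetical; the text above only says soft errors may happen "a few times in a row."

```python
class DestinationHealth:
    """Tracks whether a destination should be treated as unresponsive."""

    def __init__(self, soft_error_limit=3):  # hypothetical threshold
        self.soft_error_limit = soft_error_limit
        self.consecutive_soft = 0
        self.unresponsive = False

    def record(self, outcome):
        if outcome == "ok":  # normal reply or successful health check
            self.consecutive_soft = 0
            self.unresponsive = False
        elif outcome == "hard":  # e.g. connection refused: mark immediately
            self.unresponsive = True
        elif outcome == "soft":  # e.g. data timeout: tolerate a few in a row
            self.consecutive_soft += 1
            if self.consecutive_soft >= self.soft_error_limit:
                self.unresponsive = True
        else:
            raise ValueError(f"unknown outcome: {outcome}")


def choose_destination(health, primary, alternate):
    # Fast failover: skip the primary entirely while it is marked unresponsive.
    return alternate if health.unresponsive else primary


health = DestinationHealth()
health.record("soft")
health.record("soft")
still_primary = choose_destination(health, "cache-a", "cache-b")
health.record("soft")
failed_over = choose_destination(health, "cache-a", "cache-b")
health.record("ok")  # background health check succeeded
recovered = choose_destination(health, "cache-a", "cache-b")
```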

cold cache warm up: mcrouter can smooth the performance impact of starting a brand new empty cache host or set of hosts (as large as an entire cluster) by automatically refilling it from a designated “warm” cache.
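
A rough Python sketch of the warm-up read path, assuming dicts stand in for the cold and warm caches and that a miss is represented by `None`; the function name is hypothetical.

```python
def warmup_get(key, cold, warm):
    """Read from the cold cache; on a miss, refill it from the warm cache."""
    value = cold.get(key)
    if value is not None:
        return value
    value = warm.get(key)
    if value is not None:
        cold[key] = value  # refill so the next read is a cold-cache hit
    return value


warm = {"user:1": "alice"}
cold = {}
first = warmup_get("user:1", cold, warm)
second = warmup_get("user:1", cold, {})  # warm cache no longer needed
missing = warmup_get("user:2", cold, warm)
```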

broadcast operations: by adding a special prefix to a key in a request, it's easy to replicate the same request into multiple pools and/or clusters.

reliable delete stream: in a demand-filled look-aside cache, it's important to ensure all deletes are eventually delivered to guarantee consistency. mcrouter supports logging delete commands to disk in cases when the destination is not accessible (due to a network outage or other failure). a separate process then replays those deletes asynchronously. this is done transparently to the client — the original delete command is always reported as successful.
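
The log-and-replay idea might be sketched in Python as follows. The JSON-lines log format, the `Unreachable` exception, and the function names are assumptions for this example and are not mcrouter's on-disk format or API.

```python
import json
import os
import tempfile


class Unreachable(Exception):
    """Raised by the transport when a destination cannot be contacted."""


def delete_with_log(key, dest, send_delete, log_path):
    try:
        send_delete(dest, key)
    except Unreachable:
        # Persist the failed delete so a separate process can replay it later.
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"dest": dest, "key": key}) + "\n")
    return "DELETED"  # the client always sees success


def replay(log_path, send_delete):
    """Retry logged deletes; keep the ones that still fail. Returns the count sent."""
    if not os.path.exists(log_path):
        return 0
    with open(log_path, encoding="utf-8") as log:
        entries = [json.loads(line) for line in log if line.strip()]
    remaining = []
    for entry in entries:
        try:
            send_delete(entry["dest"], entry["key"])
        except Unreachable:
            remaining.append(entry)
    with open(log_path, "w", encoding="utf-8") as log:
        for entry in remaining:
            log.write(json.dumps(entry) + "\n")
    return len(entries) - len(remaining)


store = {"user:1": "stale"}
network = {"up": False}


def send_delete(dest, key):
    if not network["up"]:
        raise Unreachable(dest)
    store.pop(key, None)


log_path = os.path.join(tempfile.mkdtemp(), "deletes.log")
reply = delete_with_log("user:1", "cache-a", send_delete, log_path)
stale_during_outage = "user:1" in store
network["up"] = True
replayed = replay(log_path, send_delete)
```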

multi-cluster support: configuration management for large multi-cluster setups is easy. a single config can be distributed to all clusters and, depending on command line options, mcrouter will interpret the config based on its location.

rich stats and debug commands: mcrouter exports many internal counters (via a “stats” command; also to a json file on disk). introspection debug commands are also available, which can answer questions like “which host would a particular request go to?” at runtime.

quality of service: mcrouter allows throttling the rate of any type of request (e.g., get/set/delete) at any level (per-host, per-pool, per-cluster), rejecting requests over a specified limit. we also support rate limiting requests to slow delivery.
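
One common way to reject requests over a specified rate is a token bucket, sketched below in Python. The source does not say which algorithm mcrouter uses, so the token bucket, the class name, and the limits are assumptions. Time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: reject the request


# One bucket per (request type, level); the key below is a hypothetical example.
limits = {("set", "pool_a"): TokenBucket(rate=2, burst=2)}
bucket = limits[("set", "pool_a")]
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.5)]
```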

large values: mcrouter can automatically split/re-stitch large values that would not normally fit in a memcached slab.
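
Splitting and re-stitching a large value could work along the lines of this Python sketch. The chunk key naming, the header item holding the chunk count, and the function names are hypothetical and not mcrouter's actual scheme.

```python
def split_value(key, value, chunk_size):
    """Return the items to store: one header under `key` plus numbered chunks."""
    chunks = [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]
    chunks = chunks or [b""]
    items = {f"{key}:chunk:{i}": chunk for i, chunk in enumerate(chunks)}
    items[key] = str(len(chunks)).encode()  # header records the chunk count
    return items


def stitch_value(key, cache):
    """Reassemble a split value; any missing piece is treated as a cache miss."""
    header = cache.get(key)
    if header is None:
        return None
    parts = []
    for i in range(int(header)):
        part = cache.get(f"{key}:chunk:{i}")
        if part is None:
            return None  # a chunk was evicted, so the whole value is gone
        parts.append(part)
    return b"".join(parts)


cache = {}
payload = bytes(range(256)) * 10  # 2560 bytes
cache.update(split_value("blob", payload, chunk_size=1000))
restored = stitch_value("blob", cache)
```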

multi-level caches: mcrouter supports a local/remote cache setup, where values are looked up locally first and, after being fetched from the remote cache, are automatically set in the local cache.

ipv6 support: we have strong support internally for ipv6 at facebook, so mcrouter is ipv6 compatible out of the box.

ssl support: mcrouter supports ssl connections (incoming or outgoing), as long as the client or the destination hosts support it as well. it is also possible to set up multiple mcrouters in series, in which case the middle connection between mcrouters can be over ssl out of the box.

multi-threaded architecture: mcrouter can take full advantage of multicore systems by starting one thread per core.

we invite software engineers using memcached everywhere to evaluate mcrouter and see if it helps simplify site administration while providing the new capabilities listed above (shadow testing, cold cache warmup, and so on). instagram used mcrouter for the last year before transitioning to facebook's infrastructure, so mcrouter is proven in an amazon web services setup. prior to open sourcing, we partnered with reddit for a limited beta test, and they are currently running mcrouter in production for some of their caches.

we would also love to see patches come back that will make mcrouter more helpful to you and to others in the memcached community.

footnotes:
