
Parse the cache design of a distributed system

Author: Java coders take you to learn programming

1. Introduction to caching

1.1 What is caching

A cache is a buffer for data exchange; in essence it is an in-memory hash table. Caching is a design that trades space for time, and its goal is to be faster and closer:

  • Write/read data on faster storage (devices);
  • Cache data at the location closest to the application;
  • Cache data at the location closest to the user.

A cache is a hardware or software component that stores data so that subsequent accesses to that data are faster. Cached data may be pre-computed results, copies of data stored elsewhere, and so on. Typical examples include the CPU cache and the disk cache. The caching discussed in this article mainly refers to the caching components used in Internet applications.

Cache hit ratio is an important metric for a cache: the higher the hit ratio, the better.

Cache hit ratio = number of reads from cache / total number of reads

1.2 When caching is required

Introducing a cache increases system complexity. Before doing so, weigh whether it is worth it. The main points to consider are:

  • CPU overhead: if the application performs computation that consumes a lot of CPU, consider caching the results. Typical scenarios: complex, frequently invoked regular-expression evaluation; intermediate state of distributed computation; and so on.
  • I/O overhead: if the database connection pool is busy, consider caching the query results.

Introducing caching in the data layer has several benefits:

  • Increase data read speed.
  • Improve system scalability: increase carrying capacity by scaling out the cache.
  • Reduce storage cost: Cache + DB can handle request volume that would otherwise require multiple database instances, saving machine cost.

1.3 Fundamentals of Caching

Depending on the business scenario, caching is generally used in the following ways:

  • Lazy loading (triggered on read): on a cache miss, query the DB first, then write the result into the cache (see the sketch after this list).
  • Eager loading (triggered on write): after writing to the DB, also write the relevant data into the cache.
  • Periodic refresh: suitable for data produced by periodic jobs, or list-type data that does not require absolute real-time freshness.
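A minimal Java sketch of the lazy (read-triggered) pattern, also known as cache-aside. Order, OrderDao, and the plain map used as the cache are hypothetical stand-ins for the real data layer and cache component.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Lazy loading: query the cache first; on a miss, query the DB and write the result back.
public class LazyCacheExample {
    private final Map<Long, Order> cache = new ConcurrentHashMap<>();
    private final OrderDao orderDao = new OrderDao();

    public Order getOrder(long id) {
        Order cached = cache.get(id);
        if (cached != null) {
            return cached;                        // cache hit
        }
        Order fromDb = orderDao.findById(id);     // cache miss: query the DB
        if (fromDb != null) {
            cache.put(id, fromDb);                // write the result back to the cache
        }
        return fromDb;
    }
}

class Order { long id; }
class OrderDao { Order findById(long id) { return new Order(); } }  // placeholder DAO
```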

1.4 Cache eviction strategies

Cache eviction can be triggered by:

1) Space: a limit on the cache's memory footprint.

2) Capacity: a limit on the number of cached entries.

3) Time

  • TTL (Time To Live): how long an entry lives from creation to expiration.
  • TTI (Time To Idle): how long an entry may go unaccessed before it expires.

Cache eviction algorithms:

1) FIFO (first in, first out): the entry that entered the cache earliest is evicted first. This is the simplest algorithm, but its hit rate is low: an entry that is accessed very frequently but happened to be cached early gets squeezed out by later entries that are accessed far less often.

2) LRU (least recently used): avoids the problem above. Each time an entry is accessed it is moved to the tail of the queue, and eviction removes the entry at the head. It still has a weakness: if an entry is accessed 10,000 times in the first 59 minutes of an hour (clearly hot data) but not at all in the following minute while other entries are accessed, the hot entry gets evicted.

3) LFU (least frequently used): optimizes the above by using extra space to record how often each entry is used and evicting the entry with the lowest frequency. This avoids the time-window problem that LRU cannot handle.

Across these three eviction algorithms, implementation complexity increases from one to the next, and so does the hit rate. In practice the usual choice is the middle ground, LRU: the implementation cost is not too high and the hit rate is acceptable.

2. Classification of caches

From a deployment perspective, caching can be divided into client-side caching and server-side caching.

Client-side caching

  • HTTP caching
  • Browser cache
  • App caching (Android, iOS)

Server-side caching

  • CDN cache: Holds static resources such as HTML, CSS, and JS.
  • Reverse proxy caching: separates static from dynamic content and caches only the static resources requested by users.
  • Database caching: databases (such as MySQL) generally have their own caches, but relying on them is not recommended because of their hit rates and update frequencies.
  • In-process caching: Cache commonly used data such as app dictionaries.
  • Distributed cache: Caches hotspot data in a database.

CDN caches, reverse proxy caches, and database caches are generally maintained by dedicated staff (operations, DBAs). Back-end developers mainly focus on in-process caches and distributed caches.

2.1 HTTP caching

2.2 CDN caching

A CDN caches data on the server physically closest to the user, so the user can fetch the requested content nearby. CDNs generally cache static resources (pages, scripts, images, videos, files, etc.).

Domestic networks are extremely complex, and cross-carrier access can be slow. To address cross-carrier and regional access problems, CDN nodes can be deployed in major cities so that users get the content they need nearby, which reduces network congestion and improves response speed and hit rate.

[Figure: Why use a CDN]

2.2.1 Principles of CDN

The basic principle of a CDN is to deploy many cache servers distributed across the regions or networks where user access is concentrated. When users visit a website, global load balancing directs their requests to the nearest healthy cache server, which responds to the requests directly.

1) Network path before the CDN application is deployed:

  • Request: local network (LAN) => carrier network => application server room
  • Response: application server room => carrier network => local network (LAN)

Setting aside the complexity inside each network, a complete user access takes 3 nodes and 6 hops from request to response.

2) Network path after deploying the CDN application:

  • Request: local network (LAN) => carrier network
  • Response: carrier network => local network (LAN)

Setting aside the complexity inside each network, a complete user access takes 2 nodes and 2 hops from request to response: 1 node and 4 hops fewer than without a CDN, which greatly improves response speed.

2.2.2 CDN features

Advantages

  • Local cache acceleration: improves access speed, especially for sites with many images and static pages;
  • Cross-carrier network acceleration: eliminates the bottleneck of interconnection between carriers, so users on different networks all get good access quality;
  • Remote acceleration: remote users are directed by DNS load balancing to the fastest cache server, speeding up remote access;
  • Bandwidth optimization: remote mirror cache servers are generated automatically for the origin server; remote users read from the cache server, which reduces remote-access bandwidth, shares network traffic, and lowers the load on the origin web server;
  • Cluster resistance to attack: widely distributed CDN nodes plus intelligent redundancy between nodes can effectively prevent hacking and reduce the impact of DDoS attacks on the website, while maintaining good quality of service.

Disadvantages

  • Not suitable for caching dynamic resources.
Solution: cache mainly static resources, and use multi-level caching or near-real-time synchronization for dynamic resources.
  • Data consistency issues.

Solutions (essentially a balance between performance and data consistency):

1. Set a cache expiration time (e.g. 1 hour; data is re-synchronized after it expires).

2. Attach a version number to each resource.

2.3 Reverse proxy caching

A reverse proxy is a proxy server that accepts connection requests from the Internet, forwards them to servers on the internal network, and returns the results obtained from those servers to the requesting client; to the outside, the proxy server behaves as the server itself.


2.3.1 Reverse proxy caching principles

The reverse proxy sits on the same network as the application server and handles all requests to the WEB server. How reverse proxy caching works:

  • If the page requested by the user is cached on the proxy server, the proxy server sends the cached content directly to the user.
  • If there is no cache, a request is made to the WEB server to retrieve the data, which is cached locally and then sent to the user.

This reduces the number of requests reaching the web server and therefore its load.

Reverse proxy caching is typically used for static resources, while dynamic requests are forwarded to the application server. Common caching servers are Varnish, Nginx, and Squid.

2.3.2 Reverse proxy cache comparison

Commonly used proxy caches are Varnish, Squid, and Nginx. A simple comparison:

  • Varnish and Squid are dedicated caching services, while Nginx requires third-party module support;
  • Varnish caches in memory, which avoids frequently swapping files between memory and disk, and performs better than Squid;
  • Because Varnish is a memory cache, it is excellent for small files such as CSS, JS, and small images; the persistent back-end cache can be Squid or ATS;
  • Squid is full-featured and heavyweight, suitable for caching all kinds of static files; it is usually fronted by HAProxy or Nginx for load balancing across multiple instances;
  • Nginx uses the third-party module ncache for caching; its performance roughly matches Varnish, and it is generally used as a reverse proxy that can also provide simple caching.

3. In-process caching

An in-process cache is a cache inside the application itself. A typical distributed system is generally composed of multiple cache levels, and the local cache is the level closest to the application; data can be cached on disk or in memory.

  • Disk caching: data is cached on the local disk and read from it on access. Reading a local file avoids network transfer and is faster than querying a database over the network. It suits scenarios where speed requirements are moderate but a large cache is needed.
  • In-memory caching: data is stored directly in local memory and the cached objects are maintained by the program itself; this is the fastest form of access.

Common local cache implementations: HashMap, Guava Cache, Caffeine, Ehcache.

3.1 ConcurrentHashMap

The simplest in-process cache can be implemented with the JDK's own HashMap or ConcurrentHashMap.

  • Applicable scenario: cached data that never needs to be evicted.
  • Disadvantage: eviction is not possible, so memory can grow without bound (a minimal sketch follows this list).
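A minimal sketch of using ConcurrentHashMap as a never-evicting in-process cache; the reflection use case mirrors the one mentioned in section 3.6, and the key format is an arbitrary choice.

```java
import java.lang.reflect.Method;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// A fixed set of elements (reflection lookups) cached for the lifetime of the process.
public class ReflectionCache {
    private static final ConcurrentMap<String, Method> METHOD_CACHE = new ConcurrentHashMap<>();

    public static Method getMethod(Class<?> clazz, String name) {
        // computeIfAbsent resolves the method once and reuses the cached instance afterwards
        return METHOD_CACHE.computeIfAbsent(clazz.getName() + "#" + name, key -> {
            try {
                return clazz.getMethod(name);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(e);
            }
        });
    }
}
```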

3.2 LRUHashMap

A simple LRUHashMap can be implemented by extending LinkedHashMap and overriding the removeEldestEntry method, giving a basic least-recently-used cache.
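A minimal sketch of that approach; the capacity settings are arbitrary illustrative choices.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache built on LinkedHashMap: accessOrder = true reorders entries on every get().
public class LRUHashMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LRUHashMap(int maxEntries) {
        super(16, 0.75f, true);   // true = access order instead of insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the capacity is exceeded
        return size() > maxEntries;
    }
}
```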

Disadvantages:

  • Lock contention is heavy, so performance is relatively low.
  • Expiration times are not supported.
  • Automatic refresh is not supported.

3.3 Guava Cache

Guava Cache addresses several of LRUHashMap's drawbacks. It adopts an idea similar to ConcurrentHashMap: segmented locking to reduce lock contention.

Guava Cache does not expire entries the moment they become stale (there is no background scanning thread); instead, expiration is handled during read and write operations, which avoids the global locking a background scan would require. A query simply checks whether the entry meets the refresh conditions and refreshes it if needed.
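A usage sketch: a LoadingCache keyed by user ID, with illustrative sizes and TTLs, and a hypothetical loadUserFromDb standing in for the real DAO call.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

public class GuavaCacheExample {
    private final LoadingCache<Long, String> userCache = CacheBuilder.newBuilder()
            .maximumSize(10_000)                       // capacity-based eviction
            .expireAfterWrite(10, TimeUnit.MINUTES)    // TTL
            .refreshAfterWrite(1, TimeUnit.MINUTES)    // refreshed lazily on access after 1 minute
            .build(new CacheLoader<Long, String>() {
                @Override
                public String load(Long userId) {
                    return loadUserFromDb(userId);     // called on a miss or refresh
                }
            });

    public String getUser(long userId) {
        return userCache.getUnchecked(userId);
    }

    private String loadUserFromDb(long userId) {
        return "user-" + userId;                       // placeholder for the real query
    }
}
```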

3.4 Caffeine

Caffeine implements W-TinyLFU (a combination of LFU and LRU ideas), with a hit ratio and read/write throughput significantly better than Guava Cache. Its implementation is more complex; see the article "The cache evolution history you should know" for details.
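A usage sketch; the API largely mirrors Guava Cache, and loadFromDb is again a hypothetical loader. Sizes and TTLs are illustrative only.

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.time.Duration;

public class CaffeineCacheExample {
    private final LoadingCache<Long, String> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(Duration.ofMinutes(10))
            .build(this::loadFromDb);      // CacheLoader supplied as a method reference

    public String get(long id) {
        return cache.get(id);              // loads and caches on a miss
    }

    private String loadFromDb(Long id) {
        return "value-" + id;              // placeholder for the real query
    }
}
```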

3.5 Ehcache

Ehcache is a pure-Java in-process caching framework, fast and lean, and is the default CacheProvider in Hibernate.

Advantages

  • Fast and simple;
  • Supports multiple eviction strategies: LRU, LFU, and FIFO;
  • Two levels of cached data, memory and disk, so capacity is not a concern;
  • Cached data can be written to disk when the JVM restarts;
  • Distributed caching is possible via RMI, pluggable APIs, etc.;
  • Provides listener interfaces for caches and cache managers;
  • Supports multiple cache manager instances, and multiple cache regions per instance;
  • Provides a Hibernate cache implementation.

Disadvantages

  • The disk cache can occupy a very large amount of disk space;
  • Data security is not guaranteed;
  • Distributed caching is supported but inefficient (data is synchronized between nodes via multicast).

3.6 In-process cache comparison

Comparison of common in-process caching techniques:

  • ConcurrentHashMap: best suited to caching a relatively small, fairly fixed set of elements. Although it looks inferior in the comparison above, it is still used heavily in frameworks because it ships with the JDK, for example to cache reflection Methods and Fields, or to cache connections so they are not re-established. Caffeine itself uses ConcurrentHashMap to store its elements.
  • LRUMap: useful if you do not want to introduce third-party packages but still need an eviction algorithm.
  • Ehcache: its jar is large, so it is fairly heavyweight. Choose Ehcache when you need persistence and clustering. Note that although Ehcache supports distributed caching, it is generally not recommended for that role: its inter-node communication is RMI-based and it does not perform as well as Redis.
  • Guava Cache: the Guava jar is already present in many Java applications, so it is often used directly; it is lightweight and feature-rich. If you are not familiar with Caffeine, Guava Cache is a reasonable choice.
  • Caffeine: far better than Guava Cache in hit rate and read/write performance, with an API that is basically the same as Guava Cache's (even slightly richer). It has produced good results in production environments.

To summarize: if you do not need eviction, choose ConcurrentHashMap; if you need an eviction algorithm and a richer API, Caffeine is the recommended choice.

4. Distributed Cache

Distributed caching solves the biggest problem of in-process caching: in a distributed system, nodes cannot share each other's in-process caches. Typical scenarios for distributed caching:

  • Caches data that has undergone complex calculations.
  • Cache frequently accessed hotspot data in the system to reduce database pressure.

The implementation principles of different distributed caches tend to vary greatly. This article focuses on Memcached and Redis.

4.1 Memcached

Memcached is a high-performance, distributed in-memory object caching system that can be used to store data in a variety of formats, including images, videos, files, and database retrieval results, by maintaining a single, large hash table in memory.

Simply put: the data is cached into memory and then read from memory, which greatly improves the read speed.

4.1.1 Memcached features

  • Uses physical memory as the cache and can run independently on a server. To cache more data, you can start additional Memcached processes (on different ports) or deploy Memcached in a distributed fashion so data is cached on different physical or virtual machines.
  • Stores data as key-value pairs. This single-index, structured organization gives O(1) lookup of data items.
  • The protocol is a simple line-based text protocol. Data can be manipulated directly by connecting to the Memcached server with telnet, which is simple and makes the protocol easy for other caches to adopt.
  • High-performance communication based on libevent. Libevent is a C library that wraps event mechanisms such as BSD's kqueue and Linux's epoll behind a single interface, giving better performance than traditional select.
  • Distribution depends on the Memcached client; the servers do not communicate with each other, serve data independently, and share no information. The server itself is not distributed; distributed deployment relies entirely on the client.
  • Uses an LRU eviction strategy. When storing an item you can specify its expiration time, which defaults to permanent. When the server runs out of allocated memory, expired data is replaced first, then the least recently used data. With LRU, Memcached applies a lazy expiration policy: it does not monitor whether stored key/value pairs have expired, but checks the recorded timestamp when a key is fetched, which reduces the load on the server.
  • Ships with an efficient built-in memory management scheme. It does not cause memory fragmentation, but its biggest drawback is wasted space. When memory is full, unused cache entries are automatically removed by the LRU algorithm.
  • Persistence is not supported. Memcached was not designed for disaster recovery: when the service restarts, all data is lost.

4.1.2 How Memcached Works

1) Memory management

Memcached uses a slab allocation mechanism to allocate and manage memory: the allocated memory is divided into blocks of predetermined sizes, and blocks of the same size are grouped together. When data is stored, it is placed in the nearest slab size that fits the key/value, which is why some space is wasted.

This memory management scheme is very efficient and does not cause memory fragmentation, but its biggest drawback is wasted space.

2) Cache retirement policy

Memcached's eviction strategy is LRU plus expiration.

When storing an item in Memcached you may specify its expiration time, which defaults to permanent. When the Memcached server runs out of allocated memory, expired data is replaced first, then the least recently used data.

With LRU, Memcached applies a lazy expiration policy: it does not monitor whether stored key/value pairs have expired, but checks the recorded timestamp when a key is fetched, which reduces the load on the server.

3) Partitioning

Memcached servers do not communicate with each other, and their distributed capabilities are client-dependent. Specifically, an algorithm is implemented on the client side to calculate which server node the data should be read/written to based on the key.

There are three common algorithms for selecting cluster nodes:

  • Hash remainder (modulo) algorithm: use Hash(key) % N to decide which of the N nodes a key maps to. When a node is added or removed, most keys are remapped, which is the stability problem.
  • Consistent hashing algorithm: solves the stability problem well. All storage nodes are arranged on a hash ring joined end to end; after hashing, each key is stored on the first node found clockwise from its position. When a node joins or leaves, only the keys on the ring segment clockwise-adjacent to that node are affected (see the sketch after this list).
  • Virtual hash slot algorithm: uses a well-dispersed hash function to map all keys into a fixed range of integers called slots, where the number of slots is generally much larger than the number of nodes. A slot is the basic unit of data management and migration in the cluster; the large slot range makes data splitting and cluster scaling easier. Each node is responsible for a certain number of slots.
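A minimal consistent-hashing sketch in Java, as referenced above. The virtual-node count and the MD5-based hash are arbitrary illustrative choices, not anything mandated by Memcached clients.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

// Nodes are placed on a hash ring (a sorted map); a key is routed to the first node clockwise.
public class ConsistentHashRouter {
    private static final int VIRTUAL_NODES = 160;   // virtual nodes per server, smooths distribution
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ConsistentHashRouter(String... servers) {
        for (String server : servers) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    public String route(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        Long nodeHash = tail.isEmpty() ? ring.firstKey() : tail.firstKey(); // wrap around the ring
        return ring.get(nodeHash);
    }

    private long hash(String input) {
        try {   // MD5 is used only for good dispersion here, not for security
            byte[] d = MessageDigest.getInstance("MD5").digest(input.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16) | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```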

4.2 Redis

Redis is an open-source (BSD-licensed), in-memory, multi-data-structure store. It can be used as a database, cache, or message broker.

Redis can also use client-side sharding to extend write performance. Replication, Lua scripting, LRU eviction, transactions, and different levels of persistence are built in, and high availability is provided via Redis Sentinel and Redis Cluster.

4.2.1 Redis Features

  • Supports multiple data types: string, hash, list, set, sorted set.
  • Supports multiple eviction policies:

volatile-lru: evict the least recently used keys among those with an expiration set;

volatile-ttl: evict the keys closest to expiring among those with an expiration set;

volatile-random: evict random keys among those with an expiration set;

allkeys-lru: evict the least recently used keys across all keys;

allkeys-random: evict random keys across all keys;

noeviction: never evict data.

  • Two persistence methods are available: RDB and AOF.
  • Cluster mode is available via Redis Cluster.

4.2.2 Principles of Redis

1) Cache retirement

Redis expires and evicts data in two ways:

  • Passive (lazy): when a key is accessed and found to be expired, it is deleted.
  • Active: periodically, a portion of the expired keys among those with an expiration set is selected and deleted according to the eviction policy.

2) Partitioning

  • Redis Cluster contains 16,384 virtual hash slots and uses an efficient algorithm (CRC16 of the key modulo 16384) to calculate which slot a key belongs to.
  • Redis Cluster uses client redirection: when a node receives a command, it first checks whether the slot of the key being processed is its own responsibility; if not, it returns a MOVED error to the client, and the information carried by the MOVED error directs the client to resend the request to the node responsible for that slot.

3) Master-slave replication

  • Since Redis 2.8, replication is asynchronous and has two modes:

Full resynchronization: used for initial replication; the steps are basically the same as the SYNC command.

Partial resynchronization: used to re-replicate after a disconnection. Where possible, the master sends the slave only the write commands executed while the link was down, and the slave simply receives and executes them to keep its state consistent with the master.

  • Each node in the cluster periodically sends PING messages to the other nodes to detect whether they are online.
  • If a master node is considered offline, one of its slaves is elected according to the Raft algorithm and promoted to master.

4) Data consistency

  • Redis does not guarantee strong consistency because it can significantly degrade cluster performance.
  • Redis achieves eventual consistency through asynchronous replication.

4.3 Distributed Cache Comparison

Different distributed caches vary greatly in features and implementation principles, so the scenarios they suit also differ.


Here are three well-known distributed caches (MemCache, Redis, Tair) for comparison:

  • Memcached: a purely memory-based caching framework; data persistence and disaster recovery are not supported.
  • Redis: supports rich data structures and high read/write performance, but data lives entirely in memory, so resource cost must be considered; persistence is supported.
  • Tair: supports rich data structures and high read/write performance (some storage types are slower); in theory capacity can be expanded without limit.

Summary: if the service is latency-sensitive and has a lot of map/set-style data, Redis is a better fit. If the volume of data to cache is large and latency is less critical, Memcached can be chosen.

5. Multi-level caching

5.1 Overall Caching Framework

Typically, the caching of a large software system uses a multi-level caching scheme:

[Figure: overall multi-level caching architecture]

Request process:

  • The browser issues a request; if the CDN has a cached copy, it is returned directly;
  • If the CDN has no cache, the reverse proxy server is accessed;
  • If the reverse proxy server has a cache, it returns it directly;
  • If the reverse proxy server has no cache, or the request is dynamic, the application server is accessed;
  • The application server checks its in-process cache; on a hit, the result is returned to the proxy server, which caches it (dynamic responses are not cached);
  • If the in-process cache has no data, the distributed cache is read; the result is returned to the application server, which caches (part of) the data in its local cache;
  • If the distributed cache has no data either, the application reads from the database and writes the result into the distributed cache.

5.2 Using in-process caching

If the application service is a single node, in-process caching is naturally the preferred option. An in-process cache is limited by memory size, and other nodes cannot see updates to it, so it is generally suitable for:

  • Data that is not very large and is updated less frequently.
  • If you update frequently and want to use in-process caching, you can set its expiration time to a shorter time, or set a shorter automatic refresh time.

This scenario has the following problems:

  • If App Service is a distributed system, the cache cannot be shared between application nodes, and there is a data inconsistency problem.
  • Because the in-process cache is limited by the size of the memory, the cache cannot be expanded indefinitely.

5.3 Using distributed caching

If the application service is a distributed system, the simplest caching solution is to use a distributed cache directly. The application scenario is shown in the figure:

[Figure: distributed caching scenario]

Redis is used to store hotspot data, and if the cache misses, it queries the database and updates the cache. This scenario has the following problems:

  • If the caching service goes down, the application can only hit the database directly, which can easily cause a cache avalanche.
  • Accessing the distributed cache involves network I/O and serialization/deserialization overhead; performance is high, but not as fast as querying local memory.

5.4 Using multi-level caching

In-process caching alone and distributed caching alone each have shortcomings. If we need higher performance and better availability, we can design the cache as a multi-level structure, keeping the hottest data in the in-process cache to further speed up access.

The same idea exists in computer hardware: CPUs use L1/L2/L3 caches to reduce direct memory access and speed things up. In practice, a two-level cache architecture already meets most business needs; adding more levels increases system complexity and maintenance cost. More levels is not automatically better, and the trade-off should be weighed against the actual situation.

A typical two-level cache architecture uses an in-process cache (e.g., Caffeine, Guava Cache, Ehcache, HashMap) as the L1 cache and a distributed cache (e.g., Redis, Memcached) as the L2 cache.

5.4.1 Multi-level cache queries


The multi-level cache query flow is as follows:

  • First, query the L1 cache; on a hit, return the result directly, otherwise proceed to the next step.
  • Next, query the L2 cache; on a hit, return the result and backfill the L1 cache, otherwise proceed to the next step.
  • Finally, query the database, return the result, and backfill the L2 cache and then the L1 cache (see the sketch below).
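A sketch of this two-level lookup, assuming Caffeine as L1 and Redis (accessed through the Jedis client) as L2; loadFromDb and all TTLs are illustrative placeholders.

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;
import redis.clients.jedis.Jedis;

public class TwoLevelCache {
    private final Cache<String, String> l1 = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(Duration.ofSeconds(60))   // L1 TTL shorter than L2
            .build();
    private final Jedis redis = new Jedis("localhost", 6379);

    public String get(String key) {
        String value = l1.getIfPresent(key);
        if (value != null) {
            return value;                               // L1 hit
        }
        value = redis.get(key);
        if (value != null) {
            l1.put(key, value);                         // backfill L1 from L2
            return value;
        }
        value = loadFromDb(key);                        // both caches missed
        if (value != null) {
            redis.setex(key, 600, value);               // backfill L2, then L1
            l1.put(key, value);
        }
        return value;
    }

    private String loadFromDb(String key) {
        return "value-of-" + key;                       // placeholder for the real query
    }
}
```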

5.4.2 Multi-level cache updates

For the L1 cache, a data update can only delete or update the cache on the machine where the update happens; other machines can only refresh their caches through a timeout mechanism. There are two timeout strategies:

  • Expire a fixed time after write;
  • Refresh a fixed time after write.

For the L2 cache, a data update is immediately visible to other machines. However, a timeout must still be set, and it should be longer than the L1 cache's validity period. To solve the in-process cache inconsistency problem, the design can be further optimized:


A message queue's publish/subscribe mechanism can be used to notify other application nodes to update (or invalidate) their in-process caches. With this scheme, even if the message queue is down or unreliable, the database update has already been performed and the in-process cache will still expire, so the data becomes consistent once the cache is refreshed.
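A sketch of this idea. Redis pub/sub is used here purely as an illustrative message channel (the article only requires a message queue in general), via the Jedis client; the channel name is an arbitrary choice.

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

// Invalidate the in-process (L1) cache on every node when one node updates the data.
public class L1InvalidationListener {
    private static final String CHANNEL = "cache-invalidate";
    private final Cache<String, String> l1 = Caffeine.newBuilder().maximumSize(10_000).build();

    // Run on every application node, typically in a dedicated thread (subscribe blocks).
    public void listen() {
        new Jedis("localhost", 6379).subscribe(new JedisPubSub() {
            @Override
            public void onMessage(String channel, String key) {
                l1.invalidate(key);   // drop the stale local entry; the next read reloads it
            }
        }, CHANNEL);
    }

    // Called by the node that performed the database + L2 update.
    public void publishInvalidation(String key) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.publish(CHANNEL, key);
        }
    }
}
```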

6. Common caching problems

6.1 Cache Avalanche

A cache avalanche occurs when the cache becomes unavailable, or a large number of cached entries expire in the same period because they share the same timeout, so a flood of requests goes directly to the database and the excessive pressure brings the system down.

For example, suppose system A receives 5,000 requests per second at peak and the cache could have absorbed 4,000 of them, but the cache machine unexpectedly goes down. All 5,000 requests per second now hit the database, which cannot cope, raises alarms, and goes down. If there is no plan for this failure, the DBA hurriedly restarts the database, only for it to be killed again immediately by the incoming traffic.

The main means of resolving cache avalanches are as follows:

  • Increase cache availability (beforehand). For example, deploy Redis Cluster (master-slave + Sentinel) to make Redis highly available and avoid a total crash.
  • Adopt a multi-level caching scheme (during the incident). For example: local cache (Ehcache/Caffeine/Guava Cache) + distributed cache (Redis/Memcached).
  • Rate limiting, degradation, and circuit breaking (during the incident) to avoid being overwhelmed by traffic. For example, use Hystrix for circuit breaking and degradation.
  • If the cache supports persistence, data can be recovered once it comes back up (afterwards). For example, Redis persistence automatically reloads data from disk after a restart and quickly restores the cache.

In simple terms, the core of the above is a multi-level caching scheme: when a query arrives, check the local cache first, then the distributed cache, and finally the database, returning as soon as any level hits.

Auxiliary measures against cache avalanche:

  • Monitor the cache and elastically expand the capacity.
  • Add a random component to cache expiration times to avoid many keys expiring simultaneously and causing a database I/O spike. For example, instead of a fixed 10-minute TTL, let each key expire after a random 8-13 minutes so that different keys expire at different times (see the sketch below).
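A sketch of adding jitter to the TTL, matching the 8-13 minute example above; Jedis is assumed as the Redis client.

```java
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class JitteredCacheWriter {
    private static final int BASE_TTL_SECONDS = 10 * 60;   // nominal 10 minutes
    private final Jedis redis = new Jedis("localhost", 6379);

    public void put(String key, String value) {
        // effective TTL is spread across roughly 8-13 minutes
        int jitter = ThreadLocalRandom.current().nextInt(-2 * 60, 3 * 60 + 1);
        redis.setex(key, BASE_TTL_SECONDS + jitter, value);
    }
}
```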

6.2 Cache penetration

Cache penetration means the queried data does not exist in the database, so it naturally is not in the cache either. Every such request misses the cache and goes to the database, and as these requests grow, so does the pressure on the database.

There are generally two ways to solve cache penetration:

1) Cache null values

Results that are NULL are still cached; calls that throw exceptions are not cached.


This approach increases cache maintenance cost: when the real data is later inserted, the cached empty value must be deleted. Alternatively, setting a short timeout on the empty value solves this.
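A sketch of caching null results with a short TTL; the sentinel string and the TTL values are arbitrary choices, and queryDb stands in for the real database query. Jedis is assumed as the Redis client.

```java
import redis.clients.jedis.Jedis;

public class NullValueCache {
    private static final String NULL_PLACEHOLDER = "__NULL__"; // hypothetical sentinel for "not found"
    private final Jedis redis = new Jedis("localhost", 6379);

    public String get(String key) {
        String cached = redis.get(key);
        if (cached != null) {
            return NULL_PLACEHOLDER.equals(cached) ? null : cached;
        }
        String fromDb = queryDb(key);
        if (fromDb == null) {
            redis.setex(key, 60, NULL_PLACEHOLDER);   // short TTL keeps stale "misses" bounded
        } else {
            redis.setex(key, 600, fromDb);
        }
        return fromDb;
    }

    private String queryDb(String key) {
        return null;   // placeholder for the real query; null means the row is absent
    }
}
```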

2) Filter out data that cannot exist


Define rules to filter out data that cannot exist. A Bloom filter works well here (it operates on bits, so performance is high). For example, if order IDs obviously fall in the range 1-1000, any request outside 1-1000 can be filtered out immediately.

Some malicious attacks send a large number of non-existent keys; caching null values for all of them (the first scheme) would be wasteful, so it is better to filter these keys out first with the second scheme. In general: when the non-existent keys are extremely numerous and rarely repeated, do not cache them, filter them out directly with the second scheme; when the non-existent keys are limited in number and repeated fairly often, the first scheme (caching null values) works well.
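A sketch of the second scheme using Guava's BloomFilter; the expected insertion count and false-positive rate are arbitrary illustrative values.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// Pre-filter obviously non-existent order IDs before touching the cache or database.
public class OrderIdFilter {
    private final BloomFilter<Long> knownOrderIds =
            BloomFilter.create(Funnels.longFunnel(), 1_000_000, 0.01);  // 1M ids, 1% false positives

    public void register(long orderId) {
        knownOrderIds.put(orderId);        // call when an order is created
    }

    public boolean mightExist(long orderId) {
        // false means "definitely not present", so the request can be rejected immediately;
        // true may still be a false positive and must be verified against cache/DB.
        return knownOrderIds.mightContain(orderId);
    }
}
```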

6.3 Cache breakdown

Cache breakdown refers to hotspot data expiring at an instant, so a burst of requests goes straight to the database. For example, some keys are hot and accessed very frequently; the moment such a key expires, the flood of requests misses the cache and hits the database, and database load spikes sharply.

To avoid this problem, we can take the following two measures:

  • Distributed lock: lock the hotspot key so that only one request rebuilds the cache while the others wait, preventing a large number of threads from hitting the database for the same key at once (see the sketch after this list).
  • Scheduled asynchronous refresh: for some data, refresh automatically before expiration instead of simply evicting on expiry. Eviction exists to keep data timely, so automatic refresh achieves the same goal.
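A sketch of the distributed-lock approach using Redis SET NX EX through the Jedis client; key names, TTLs, and the retry delay are arbitrary choices, and loadFromDb is a placeholder for the real query.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Only the request that wins a short-lived Redis lock rebuilds the hot key; the rest retry.
public class HotKeyLoader {
    private final Jedis redis = new Jedis("localhost", 6379);

    public String get(String key) throws InterruptedException {
        String value = redis.get(key);
        if (value != null) {
            return value;
        }
        String lockKey = "lock:" + key;
        // SET NX EX: acquire the lock only if it does not exist, auto-expire after 10s
        String ok = redis.set(lockKey, "1", SetParams.setParams().nx().ex(10));
        if ("OK".equals(ok)) {
            try {
                value = loadFromDb(key);                 // rebuild the cache once
                redis.setex(key, 600, value);
            } finally {
                redis.del(lockKey);
            }
            return value;
        }
        Thread.sleep(50);                                // lost the race: brief wait, then retry
        return get(key);
    }

    private String loadFromDb(String key) {
        return "value-of-" + key;                        // placeholder for the real query
    }
}
```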

6.4 Summary

The common problems in cache usage are described above. Summarizing the overall solutions by phase:

  • Beforehand: a Redis high-availability setup (Redis Cluster + master-slave + Sentinel) to avoid a full cache crash.
  • During the incident: (1) a multi-level caching scheme: local cache (Ehcache/Caffeine/Guava Cache) + distributed cache (Redis/Memcached); (2) rate limiting + circuit breaking + degradation (Hystrix) to avoid the database being killed in extreme cases.
  • Afterwards: Redis persistence (RDB + AOF) automatically reloads data from disk after a restart and quickly restores cached data.

Memcached's data types are not as rich as Redis's, and it supports neither persistence nor disaster recovery, so Redis is generally chosen as the distributed cache.

7. Caching strategies

7.1 Cache warm-up

Cache warm-up means loading hotspot data into the cache right after the system starts, instead of waiting for user requests to query the database and populate the cache.

Solutions:

  • Manually refresh the cache: build a cache-refresh page and trigger it manually at release time.
  • Refresh the cache at application startup: when the data volume is not large, it can be loaded automatically when the project starts (see the sketch after this list).
  • Refresh the cache asynchronously on a schedule.
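A sketch of warming the cache at startup; HotItemDao, HotItem, the key format, and the TTL are all hypothetical placeholders, and Jedis is assumed as the Redis client.

```java
import java.util.List;
import redis.clients.jedis.Jedis;

// Load the hot data set from the database and push it into Redis before user traffic arrives.
public class CacheWarmer {
    private final Jedis redis = new Jedis("localhost", 6379);
    private final HotItemDao hotItemDao = new HotItemDao();

    // Invoke from the application's startup hook (main method, startup listener, etc.).
    public void warmUp() {
        List<HotItem> hotItems = hotItemDao.findTopN(1000);     // e.g. the 1000 hottest records
        for (HotItem item : hotItems) {
            redis.setex("item:" + item.id, 3600, item.payload); // 1h TTL, illustrative
        }
    }
}

class HotItem { long id; String payload; }
class HotItemDao { List<HotItem> findTopN(int n) { return java.util.Collections.emptyList(); } }
```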

7.2 How to Cache

7.2.1 Non-expiring cache

Cache update flow:

  • Open the transaction;
  • Execute the SQL;
  • Commit the transaction;
  • Write the cache.

Do not put the cache write inside the transaction, especially for a distributed cache: network jitter can slow the cache write and block the database transaction. If consistency requirements are not strict and the data volume is small, you can also consider periodically synchronizing the full cache.

In this pattern the transaction may succeed while the cache write fails, but that has less impact than the problem described above.

7.2.2 Expiring cache

Uses lazy loading. For hotspot data, set a short cache TTL and reload it asynchronously at regular intervals.

7.3 Cache Updates

In general, if the system does not strictly require cache and database consistency, try not to serialize read and write requests. Serialization guarantees that there will be no data inconsistencies, but it can lead to a significant decrease in the throughput of the system.

In general, there are two cases of cached updates:

  • Delete the cache before updating the database;
  • Update the database before deleting the cache;

Why delete the cache instead of updating it?

Consider many concurrent update requests: there is no guarantee that the order of database updates matches the order of cache updates, so the database and the cache would end up inconsistent. Deleting the cache instead avoids this, so deletion is generally preferred.

1) Delete the cache before updating the database

For an update, the simple approach is to delete the entry from every cache level first and then update the database. The big problem: after the cache is deleted but before the database is updated, a read request arrives, misses the cache, reads the old value from the database, and loads it back into the cache, so subsequent reads all see stale data.


Whether it succeeds or fails, the cache operation must not block the database operation, so cache deletion is often done asynchronously, though delete-first does not suit that well. One advantage of deleting the cache first is that if the database operation then fails, the worst outcome is just an extra cache miss.

2) Update the database before deleting the cache (note: this is the more recommended strategy)

Updating the database first and then deleting the cache avoids the problem above.

But it introduces a new problem: while the update is in progress, a query request will still be served the old value from the cache. Worse, if the database update operation fails, the cache may be left with dirty data indefinitely.
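To make the recommended write path concrete, here is a sketch of "update the database, then delete the cache", with Jedis assumed as the Redis client and OrderDao as a hypothetical data-access object.

```java
import redis.clients.jedis.Jedis;

// Commit the database update first, then delete (not update) the cached entry
// so that the next read reloads fresh data from the database.
public class CacheAsideWriter {
    private final Jedis redis = new Jedis("localhost", 6379);
    private final OrderDao orderDao = new OrderDao();

    public void updateOrder(long orderId, String newStatus) {
        orderDao.updateStatus(orderId, newStatus);   // 1. update the database (in its own transaction)
        redis.del("order:" + orderId);               // 2. delete the cache; readers will reload from DB
    }
}

class OrderDao { void updateStatus(long id, String status) { /* UPDATE ... WHERE id = ? */ } }
```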

3) Which update strategy to choose

From the above, we know that both update strategies have concurrency issues.

However, updating the database before deleting the cache is recommended, because the probability of a concurrency problem is very low: it requires a read to arrive on a cache miss exactly while a concurrent write is in flight, the read to reach the database before the write, and yet its cache write-back to happen after the write completes. Database writes are much slower than reads and lock the table, so the combination of these conditions is very unlikely.

If you need strong consistency between the database and the cache, protocols such as 2PC or Paxos can provide it, but 2PC is too slow and Paxos is too complex, so a strong-consistency scheme is not recommended unless the data is truly critical. For a more detailed analysis, see "Distributed database and cache double-write consistency scheme analysis".

8. Summary

Finally, the mind map below summarizes the knowledge points covered in this article, to help you form a systematic understanding of caching.

[Figure: mind map of the caching topics covered in this article]

Receive more learning materials