4 common cache problems and solutions

Foreword
Caching can relieve the pressure of heavy traffic and significantly improve a program's performance. But when using a cache system, especially under high concurrency, we often run into some difficulties. This article summarizes common problems and solutions when using a cache, which can serve as a reference when you encounter such problems. These situations should also be considered when designing a cache system.

For convenience of presentation, this article takes a database query cache as an example: using a cache can reduce the pressure on the database.


Cache penetration
When we use a cache, we usually try to get the value from the cache first; if it is not there, we query the database; and if the database has no value either, we return null or throw an exception, depending on business requirements.
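The read path above can be sketched as follows. This is a minimal cache-aside sketch: the dicts stand in for a real cache (e.g. Redis) and a database, and the `get_user` name and sample rows are purely illustrative.

```python
# Minimal cache-aside read path: try the cache, fall back to the
# database, and populate the cache on a miss.
cache = {}
database = {1: "alice", 2: "bob"}   # illustrative stand-in for a real database

def get_user(user_id):
    if user_id in cache:             # 1. cache hit: return immediately
        return cache[user_id]
    value = database.get(user_id)    # 2. miss: query the database
    if value is not None:
        cache[user_id] = value       # 3. populate the cache for next time
    return value                     # None when the row does not exist
```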

If a user keeps requesting data that does not exist in the database, such as data with an id of -1, every request checks the cache and then the database, causing serious performance problems. This situation is called cache penetration.

Solution

There are several solutions:

  • Validate request parameters, such as user authentication and basic checks on the id; for example, intercept id <= 0 directly.
  • If the database has no value, store the corresponding key in the cache anyway, with a null value, so the next query returns directly from the cache. The cache time for this key should be relatively short, such as 30s, in case the data is later inserted into the database and users would otherwise be unable to get it.
  • Use a Bloom filter to record which keys exist; if the filter says a key does not exist, skip the database query entirely.
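The first two bullets can be combined into one lookup function. This is a sketch, not production code: the dict-of-tuples stands in for a cache with per-key TTLs, and the `HIT_TTL` value is an assumed constant, while the 30s null TTL comes from the text above.

```python
import time

cache = {}               # key -> (value, expires_at); stand-in for Redis
database = {1: "alice"}  # the only row that actually exists

NULL_TTL = 30            # cache a "no such row" result briefly (30s, as above)
HIT_TTL = 300            # TTL for real values; illustrative assumption

def get_user(user_id):
    # Parameter validation: intercept obviously invalid ids (id <= 0) up front.
    if user_id <= 0:
        return None
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.time():
        return entry[0]            # may legitimately be None (a cached miss)
    value = database.get(user_id)  # cache miss or expired: query the database
    ttl = HIT_TTL if value is not None else NULL_TTL
    cache[user_id] = (value, time.time() + ttl)
    return value
```

Because misses are cached too, a flood of requests for a nonexistent id hits the database only once per `NULL_TTL` window.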

Cache breakdown
Cache breakdown refers to a single key receiving a very large number of requests, such as during a flash sale with 10,000 requests per second. If this key expires at some moment, that flood of requests reaches the database in an instant, and the database may crash outright.

Solution

There are also several solutions for cache breakdown, which can be used together:

  • For hot data, set the expiration time carefully so the key does not expire during the hotspot period; some keys can even be set to never expire.
  • Use a mutual exclusion lock (such as Java's thread lock mechanism): the first thread to access the key acquires the lock, queries the database, inserts the value into the cache, and then releases the lock, so subsequent requests can fetch the data directly from the cache.
Cache avalanche
Cache avalanche refers to many keys expiring at the same moment, so a large number of requests miss the cache and all go to the database. Another situation is the cache server going down, which is also regarded as a cache avalanche.

Solution

For the above two situations, there are two solutions to cache avalanche:

  • Set the expiration time of each key to a random value, rather than the same value for all keys.
  • Use a highly available distributed cache cluster, such as redis-cluster, to ensure the cache stays available.
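The randomized-expiry idea amounts to adding jitter when computing each key's TTL. The base and jitter values below are illustrative assumptions, not recommendations:

```python
import random

BASE_TTL = 600      # 10-minute base expiry; illustrative value
JITTER = 120        # up to 2 extra minutes of random spread

def jittered_ttl(base=BASE_TTL, jitter=JITTER):
    # Keys written at the same moment get different expiry times,
    # so they do not all expire (and miss) at once.
    return base + random.randint(0, jitter)
```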

Double-write inconsistency
When using a database cache, the read and write flow often looks like this:

  • When reading, read the cache first; if there is no cached value, read the database directly, then put the fetched data into the cache.
  • When updating, delete the cache first, then update the database.
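The two flows above can be sketched together. As before, the dicts are stand-ins for a real cache and database, and the function names are illustrative:

```python
cache = {}
database = {1: "old"}   # illustrative existing row

def read_user(user_id):
    if user_id in cache:
        return cache[user_id]          # cache hit
    value = database.get(user_id)      # miss: read the database directly
    if value is not None:
        cache[user_id] = value         # then put the data into the cache
    return value

def update_user(user_id, value):
    cache.pop(user_id, None)           # delete the cache first...
    database[user_id] = value          # ...then update the database
```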
So-called double-write inconsistency means that during or after a write operation (an update), the value in the database may differ from the value in the cache.

Why delete the cache first and then update the database? Because if you update the database first and the cache deletion then fails, the value in the cache will be inconsistent with the value in the database.

However, this does not completely avoid double-write inconsistency. Suppose that under high concurrency, one thread deletes the cache and then goes to update the database. Meanwhile, another thread reads the cache, finds no value, reads the database, and sets the old database value back into the cache. After the first thread finishes its update, the database holds the new value while the cache holds the old one, so the data is inconsistent.

A simple mitigation is to set the expiration time relatively low, so the data inconsistency only lasts until the cache expires, which is acceptable in some business scenarios.

Another solution is to use a queue to assist. Update the database first, then delete the cache; if the deletion fails, put the key into a queue, and have another task take messages from the queue and keep retrying the delete of the corresponding key.
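A sketch of this queue-assisted approach, assuming an in-process `deque` as a stand-in for a real message queue; the `fail` flag exists only to simulate a failed cache delete:

```python
from collections import deque

cache = {1: "old"}
database = {1: "old"}
retry_queue = deque()      # stand-in for a real message queue

def delete_cache(key, fail=False):
    if fail:               # `fail=True` simulates a failed cache delete
        return False
    cache.pop(key, None)
    return True

def update_user(user_id, value, delete_fails=False):
    database[user_id] = value                # 1. update the database first
    if not delete_cache(user_id, delete_fails):
        retry_queue.append(user_id)          # 2. on failure, enqueue the key

def retry_worker():
    # A separate task drains the queue, retrying each delete until it succeeds.
    while retry_queue:
        key = retry_queue.popleft()
        if not delete_cache(key):
            retry_queue.append(key)
```

The window of inconsistency lasts only until the retry worker manages to delete the stale key.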

Yet another solution is to use a per-record queue to serialize reads and writes. For example, create a queue for the data with id n. A write to this record deletes the cache and is placed into the queue; when another thread then comes along and finds no cached value, its read operation is placed into the same queue.

However, this increases the program's complexity, and serialization reduces throughput, which may not be worth it. The mainstream solution is generally to delete the cache first and then update the database, which satisfies the vast majority of needs.

Finally
Welcome to follow my public account [Programmer Chasing Wind]; articles are updated there, and collected materials are posted there as well.
