[Distributed] cache penetration, cache avalanche, cache breakdown solution

1. What kind of data is suitable for caching

Share a picture

Second, cache penetration

Cache penetration refers to querying a data that must not exist, because the cache is missed, it needs to be queried from the database, if the data is not found Do not write to the cache, which will cause this non-existent data to be queried in the database every time it is requested, causing cache penetration. When the traffic is large, the DB may be down. If someone uses a non-existent key to frequently attack our application, this is a loophole.

Solution:

1) There are many ways to effectively solve the problem of cache penetration. The most common one is to use Bloom filters to remove all possible data. It is hoped that in a sufficiently large bitmap, a data that must not exist will be intercepted by this bitmap, thereby avoiding the query pressure on the underlying database.

2) There is also a simpler and more rude method. If the data returned by a query is empty (whether the data does not exist or the system is faulty), the empty result is still cached, but its The expiration time will be very short, no more than five minutes at the longest.

3. Cache avalanche:

Cache avalanche means that the same expiration time is used when setting the cache, which causes the cache to become invalid at a certain moment. As a result, all queries fell on the database, causing a cache avalanche.

Solution:

1) After the cache is invalid, the number of threads that read the database and write the cache is controlled by locking or queueing. For example, only one thread is allowed to query data and write cache for a certain key, and other threads wait.

2) The cache reload mechanism can be used to update the cache in advance, and manually trigger the loading of the cache before a large concurrent access is about to occur.

3) For different keys, set different expiration times to make the time points of cache invalidation as even as possible.

4) Do secondary cache, or double cache strategy. A1 is the original cache, and A2 is the copy cache. When A1 fails, you can access A2, the cache expiration time of A1 is set to short-term, and A2 is set to long-term.

Four. Cache breakdown

For some keys with expiration time set, if these keys may be accessed super concurrently at certain points in time, it is very ” Hotspot” data. At this time, there is a problem that needs to be considered: the problem of cache being “breakdown”. The difference between this and the cache avalanche is that it is cached for a certain key, and the former has many keys.
When the cache expires at a certain point in time, there are a large number of concurrent requests for this Key at this point in time. When these requests find that the cache expires, data will generally be loaded from the back-end DB and reset to the cache. This time is large. Concurrent requests may overwhelm the back-end DB instantly.

Solution:

1) Background refresh

The background defines a job (timed task) to actively update the cache data. For example, the data in a cache expires The time is 30 minutes, then the job refreshes the data regularly every 29 minutes (updates the data found in the database to the cache).

Note: This scheme is easier to understand, but it will increase the complexity of the system Spend. It is more suitable for businesses with relatively fixed keys and large cache granularity, while those with more scattered keys are not suitable and are more complicated to implement.

2) Check for updates

Save the expiration time (absolute time) of the cache key to the cache together (can be spliced, new fields can be added, and a separate key can be used to save.. No matter what method is used, as long as the relationship between the two is established).After each execution of the get operation, the cache expiration time of the get is compared with the current system time, if the cache expiration time-current system time <=1 Minutes (a custom value), the cache is actively updated. This ensures that the data in the cache is always up-to-date (same as solution one, so that the data does not expire.)

Note: This The program may also have problems in special circumstances. Suppose that the cache expiration time is 12:00, and there is no get request from 11:59 to 12:00, and the requests happen to be concurrently sent at 11:30, that would be a tragedy. This situation is extreme, but it is not impossible. Because “high concurrency” may also be a phased outburst at a certain point in time.

3) Hierarchical caching

L1 (first level cache) and L2 (second level cache) caching methods are adopted. L1 cache invalidation time is short, and L2 cache invalidation time is long. Requests to obtain data from the L1 cache first, and lock if the L1 cache misses. Only one thread obtains the lock. This thread reads the data from the database and updates the data to the L1 cache and the L2 cache. Others The thread still gets data from the L2 cache and returns.

Note: This method is mainly achieved by avoiding simultaneous cache invalidation and combining with the lock mechanism. Therefore, when the data is updated, only the L1 cache can be eliminated, and the caches in L1 and L2 cannot be eliminated at the same time. There may be dirty data in the L2 cache, and the business needs to be able to tolerate this short-term inconsistency. Moreover, this solution may result in a waste of additional cache space.

4) Locking

Method 1:

1
2
3
4
5
6
7
8
9
10
11
12
13
// Method 1:
/code> public synchronized List getData01() {
List result = new ArrayList();
// Read data from cache
result = getDataFromCache();
if (result.isEmpty()) {
// Query data from the database div>

result = getDataFromDB();

< div class="line number9 index8 alt2"> // Write the queried data to the cache

setDataToCache(result);
< /code> }
return result;

code> }  

Note: This method can indeed prevent high concurrency to the database when the cache fails, but when the cache is not invalid, you need to queue up for locks when fetching data from the cache, which will inevitably be greatly reduced System throughput.

Method 2:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
< div class="container">

// Method 2:
static Object lock = new Object();
public List getData02() {
List result = new ArrayList();
// Read data from cache
result = getDataFromCache();
if (result.isEmpty()) { div>

synchronized ( lock ) {

< div class="line number10 index9 alt1"> // Query data from the database

result = getDataFromDB();
// Write the queried data to the cache
setDataToCache(result);
}
}
return code> result;
}  

Note: This method will not affect the throughput of the system when the cache hits, but when the cache fails, the request will still hit the database, but it is not high concurrency but blocking. However, this will cause user experience It is not good, and it also brings extra pressure to the database.

Method 3:

1
2
3
4
5
6
7
8
9
10
11
12
13 < /div>

14
15

16
17
18
19
20
//Method 3
public List getData03() {
List result = new ArrayList();
// Read data from cache
result = getDataFromCache( );
if (result.isEmpty()) {
synchronized ( lock ) {
//Double judgment, the second and subsequent requests do not have to go to the database, and directly hit the cache
// Query cache
result = getDataFromCache();
if (result.isEmpty()) {
// Query data from the database
result = getDataFromDB();
// Write the queried data into the cache
setDataToCache(result);
} code>
}
}
return result;
}

< /div>

Note: Although double judgment can prevent high-concurrency requests from hitting the database, the second and subsequent requests are still queued when they hit the cache. For example. , When 30 requests are sent concurrently, in the double judgment, the first request goes to the database to query and update the cached data, and the remaining 29 requests are queued to fetch the data from the cache in turn. The experience of the user who requests the latter Will be upset.

Method 4:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17 < /div>

18
19
20
21
22

23
24
25
26
27
28
29
static Lock reenLock = new ReentrantLock();
public List getData04() throws InterruptedException {
List result = new< /code> ArrayList();
// Read data from cache
< code class="csharp plain">result = getDataFromCache();
if (result.isEmpty()) {
if (reenLock.tryLock()) {
try {
< code class="csharp spaces"> System. out .println( "I got the lock, and write it to the cache after getting the database from the DB " );
// Query data from the database
result = getDataFromDB();
// Write the queried data into the cache
setDataToCache(result);
}< /code> finally {
{
reenLock.unlock(); // Release the lock
}
} else {
result = getDataFromCache(); // Check the cache first
if (result.isEmpty()) {
System. out .println ( "I didn’t get the lock, and there is no data in the cache. Take a nap first" ); < /div>

Thread.sleep(100); // Take a nap
return getData04(); // Try again
} div>

}
}
return result;
}

  

Note: Finally, the use of mutual exclusion locks can effectively avoid the previous problems.

Of course, in actual distributed scenarios, we can also use Distributed locks provided by redis, tair, zookeeper, etc. are implemented. However, if our concurrency is only a few thousand, why use a sledgehammer?

1
2
3
4
5
6
7
8
9
10
11
12
13
// 方法1:
     public  synchronized List getData01() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             // 从数据库查询数据
             result = getDataFromDB();
             // 将查询到的数据写入缓存
             setDataToCache(result);
         }
         return  result;
     }  

1
2
3
4
5
6
7
8
9
10
11
12
13
// 方法1:
     public  synchronized List getData01() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             // 从数据库查询数据
             result = getDataFromDB();
             // 将查询到的数据写入缓存
             setDataToCache(result);
         }
         return  result;
     }  

1

2

3

4

5

6

7

8

9

10

11

12

13

// 方法1:
     public  synchronized List getData01() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             // 从数据库查询数据
             result = getDataFromDB();
             // 将查询到的数据写入缓存
             setDataToCache(result);
         }
         return  result;
     }  

// 方法1:

     public  synchronized List getData01() {

         List result =  new  ArrayList();

         // 从缓存读取数据

         result = getDataFromCache();

         if  (result.isEmpty()) {

             // 从数据库查询数据

             result = getDataFromDB();

             // 将查询到的数据写入缓存

             setDataToCache(result);

         }

         return  result;

     }  

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

< div class="line number16 index15 alt1"> 16

17
// 方法2:
     static  Object  lock  new  Object();
 
     public  List getData02() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock ) {
                 // 从数据库查询数据
                 result = getDataFromDB();
                 // 将查询到的数据写入缓存
                 setDataToCache(result);
             }
         }
         return  result;
     }  

1
2
3
4
5
6

7
8
9
10
11
12
13
14
15
16
17
// 方法2:
     static  Object  lock  new  Object();
 
     public  List getData02() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock ) {
                 // 从数据库查询数据
                 result = getDataFromDB();
                 // 将查询到的数据写入缓存
                 setDataToCache(result);
             }
         }
         return  result;
     }  

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

// 方法2:
     static  Object  lock  new  Object();
 
     public  List getData02() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock ) {
                 // 从数据库查询数据
                 result = getDataFromDB();
                 // 将查询到的数据写入缓存
                 setDataToCache(result);
             }
         }
         return  result;
     }  

// 方法2:

     static  Object  lock  new  Object();

 

     public  List getData02() {

         List result =  new  ArrayList();

         // 从缓存读取数据

         result = getDataFromCache();

         if  (result.isEmpty()) {

             synchronized ( lock ) {

                 // 从数据库查询数据

                 result = getDataFromDB();

                 // 将查询到的数据写入缓存

                 setDataToCache(result);

             }

         }

         return  res ult;

     }  

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
//方法3
     public  List getData03() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock< /code> ) {
             //双重判断,第二个以及之后的请求不必去找数据库,直接命中缓存
                 // 查询缓存
                 result = getDataFromCache();
                 if  (result.isEmpty()) {
                     // 从数据库查询数据
                     result = getDataFromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 }
             }
         }
         return  result;
     }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
//方法3
     public  List getData03() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock ) {
             //双重判断,第二个以及之后的请求不必去找数据库,直接命中缓存
                 // 查询缓存
                 result = getDataFromCache();
                 if  (result .isEmpty()) {
                     // 从数据库查询数据
                     result = getDataFromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 }
             }
         }
         return  result;
     }

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

//方法3
     public  List getData03() {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             synchronized ( lock ) {
             //双重判断,第二个以及之后的请求不必去找数据库,直接命中缓存
                 // 查询缓存
                 result = getDataFromCache();
                 if  (result.isEmpty()) {
                     // 从数据库查询数据
                     result = getData FromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 }
             }
         }
         return  result;
     }

//方法3

     public  List getData03() {

         List result =  new  ArrayList();

         // 从缓存读取数据

         result = getDataFromCache();

         if  (result.isEmpty()) {

             synchronized ( lock ) {

< p>             //双重判断,第二个以及之后的请求不必去找数据库,直接命中缓存

                 // 查询缓存

                 result = getDataFromCache();

                 if  (result.isEmpty()) {

                     // 从数据库查询数据

                     result = getDataFromDB();

                     // 将查询到的数据写入缓存

                     setDataToCache(result);

                 }

             }

         }

         return  result;

     }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

< div class="line number22 index21 alt1"> 22

23
24
25
26
27
28
29
static  Lock reenLock =  new  ReentrantLock();
 
     public  List getData04() throws InterruptedException {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             if  (reenLock.tryLock()) {
                 try  {
                     System. out .println( "我拿到锁了,从DB获取数据库后写入缓存" );
                     // 从数据库查询数据
                     result = getDataFromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 finally  {
                     reenLock.unlock(); // 释放锁
                 }
 
             else  {
                 result = getDataFromCache(); // 先查一下缓存
                 if  (result.isEmpty()) {
                     System. out .println( "我没拿到锁,缓存也没数据,先小憩一下" );
                     Thread.sleep(100); // 小憩一会儿
                     return  getData04(); // 重试
                 }
             }
         }
         return  result;
     }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
static  Lock reenLock =  new  ReentrantLock();
 
     public  List getData04() throws InterruptedException {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             if  (reenLock.tryLock()) {
                 try  {
                     System. out .println( "我拿到锁了,从DB获取数据库后写入缓存" );
                     // 从数据库查询数据
                     result = getDataFromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 finally  {
                     reenLock.unlock(); // 释放锁
                 }
 
             else  {
                 result = getDataFromCache(); // 先查一下缓存
                 if  (result.isEmpty()) {
                     System. out .println( "我没拿到锁,缓存也没数据,先小憩一下" );
                     Thread.sleep(100); // 小憩一会儿
                     return  getData04(); // 重试
                 }
             }
         }
         return  result;
     }

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

< p>20

21

22

23

24

25

26

27

28

29

static  Lock reenLock =  new  ReentrantLock();
 
     public  List getData04() throws InterruptedException {
         List result =  new  ArrayList();
         // 从缓存读取数据
         result = getDataFromCache();
         if  (result.isEmpty()) {
             if  (reenLock.tryLock()) {
                 try  {
                     System. out .println( "我拿到锁了,从DB获取数据库后写入缓存" );
                     // 从数据库查询数据
                     result = getDataFromDB();
                     // 将查询到的数据写入缓存
                     setDataToCache(result);
                 finally  {
                     reenLock.unlock(); // 释放锁
                 }
 
             else  {
                 result = getDataFromCache(); // 先查一下缓存
                 if  (result.isEmpty()) {
                     System. out .println( "我没拿到锁,缓存也没数据,先小憩一下" );
                     Thread.sleep(100); // 小憩一会儿
                     return  getData04(); // 重试
                 }
             }
         }
         return  result;
     }

static  Lock reenLock =  new  ReentrantLock();

 

     public  List getData04() throws InterruptedException {

         List result =  new  ArrayList();

         // 从缓存读取数据

         result = getDataFromCache();

         if  (result.isEmpty()) {

             if  (reenLock.tryLock()) {

                 try  {

                     System. out .println( "我拿到锁了,从DB获取数据库后写入缓存" );

                     // 从数据库查询数据

                     result = getDataFromDB();

                     // 将查询到的数据写入缓存

                     setDataToCache(result);

                 finally  {

                     reenLock.unlock(); // 释放锁

                 }

 

             else  {

                 result = getDataFromCache(); // 先查一下缓存

                 if  (result.isEmpty()) {

                     System. out .println( "我没拿到锁,缓存也没数据,先小憩一下" );

                     Thread.sleep(100); // 小憩一会儿

                     return  getData04(); // 重试

                 }

             }

         }

         return  result;

     }

Leave a Comment

Your email address will not be published.