The cache consistency problem under high concurrency

When data is read:

Check the cache first; on a cache miss, query the database and put the result back into the cache. This part is largely uncontroversial.

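As a minimal sketch of this read path (the classic cache-aside pattern), the snippet below uses an in-process dictionary as a stand-in for a real cache such as Redis; the key scheme, the 300-second TTL, and the query_db helper are illustrative assumptions, not the actual code of the system described here.

```python
import json
import time

CACHE_TTL_SECONDS = 300   # assumed TTL; see point 1 below
_cache = {}               # {key: (expires_at, serialized value)}, stand-in for Redis


def cache_get(key):
    entry = _cache.get(key)
    if entry is None:
        return None
    expires_at, value = entry
    if time.time() > expires_at:   # an expired entry counts as a miss
        _cache.pop(key, None)
        return None
    return value


def cache_set(key, value, ttl):
    _cache[key] = (time.time() + ttl, value)


def query_db(product_id):
    # Placeholder for the real database query.
    return {"id": product_id, "name": "demo", "price": 100}


def read_product(product_id):
    key = f"product:{product_id}"
    cached = cache_get(key)
    if cached is not None:                    # 1. check the cache first
        return json.loads(cached)
    row = query_db(product_id)                # 2. on a miss, query the database
    cache_set(key, json.dumps(row), CACHE_TTL_SECONDS)  # 3. write the result back with a TTL
    return row
```

Setting a TTL when the entry is populated is what point 1 below relies on: even if a later invalidation fails, the stale entry eventually expires.
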
But when data is updated:

should we update the database first and then update (or delete) the cache, or update (or delete) the cache first and then update the database?

This has long been debated, and each of the common implementations can run into data consistency problems.

Here is how our system currently handles it:

0. First, consider the cache hit rate. Not everything should go through the cache, because for some data the hit rate is too low to be worth it. For example, account-related data such as assets and orders is only ever read by the user it belongs to, so even if it were cached the hit rate would be extremely low.

In general, we cache data that is not tied to a single account and has a large query volume.

1. Set an expiration time on cached entries to guarantee eventual consistency.

2. Update the database first, then delete the cache.

It can happen that the database update succeeds but the cache deletion fails, leaving old data in the cache while the database holds the new data, which is an inconsistency.

But the probability of this is very low.

And when the cache deletion does fail, we can still compensate, for example by retrying the deletion (see the sketch at the end of this section).

Why delete the cache instead of updating it?

Because the cached value is not necessarily a direct copy of the database content; it may be computed from multiple fields. Rewriting the cache on every update would waste that work.

Why not delete the cache first and then update the database?

Because if the cache is deleted first and another thread reads the data before the update is committed, it misses the cache, reads the old value from the database, and puts it back into the cache. The update then commits, leaving old data in the cache and new data in the database. Unlike the delete-failure case above, nothing has actually failed, so there is no error to detect and compensate for.

If the business requires strong consistency, avoid caching as much as possible.

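To make points 1 and 2 concrete, here is a minimal sketch of the write path: update the database first, then delete the cache. The helpers (update_db, cache_delete, CacheError) and the three-attempt retry are illustrative assumptions; the retry is just one example of the compensation mentioned above, not a prescription.

```python
import logging


class CacheError(Exception):
    """Stand-in for whatever error the real cache client raises."""


def update_db(product_id, fields):
    # Placeholder for the real database update (e.g. an UPDATE inside a transaction).
    pass


def cache_delete(key):
    # Placeholder for the real cache deletion (e.g. Redis DEL); may raise CacheError.
    pass


def update_product(product_id, fields):
    key = f"product:{product_id}"
    update_db(product_id, fields)       # 1. update the database first
    try:
        # 2. then delete (not update) the cache: the cached value may be
        #    computed from several fields, so recomputing it on every write
        #    would be wasted work; the next read repopulates it lazily.
        cache_delete(key)
    except CacheError:
        # 3. the rare failure case: the cache briefly holds stale data.
        #    One possible compensation is a short retry; after that, the
        #    TTL from point 1 bounds how long the staleness can last.
        for _ in range(3):
            try:
                cache_delete(key)
                break
            except CacheError:
                continue
        else:
            logging.warning("cache delete failed for %s; relying on TTL", key)
```

If deletion keeps failing, the entry simply lives until its TTL expires, which is the eventual-consistency guarantee from point 1; businesses that need strong consistency should avoid the cache altogether, as noted above.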