MySQL query optimization for tables with tens of millions of rows

1. Optimize queries and avoid full table scans wherever possible. Start by considering indexes on the columns used in where and order by.
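As a rough illustration of this tip, the sketch below uses MySQL syntax. The table t, its columns, and the index name idx_num_created are hypothetical. EXPLAIN is used to check the plan, because whether the optimizer actually picks the index depends on the data.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE t (
  id         INT PRIMARY KEY,
  num        INT NOT NULL,
  created_at DATETIME NOT NULL
);

-- Index the column filtered in WHERE first, then the column used in ORDER BY.
CREATE INDEX idx_num_created ON t (num, created_at);

-- Inspect the plan; whether the index is used depends on the optimizer and the data.
EXPLAIN SELECT id FROM t WHERE num = 10 ORDER BY created_at;
```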

2. Try to avoid testing a field for null in the where clause, as this may cause the engine to give up using the index and perform a full table scan. For example: select id from t where num is null. You can set a default value of 0 on num so that the column never contains null, and then query like this: select id from t where num=0
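One possible way to apply the default value described above, sketched in MySQL syntax. It assumes a hypothetical table t whose num column is a nullable INT, and that 0 is not otherwise a meaningful value for num.

```sql
-- Assumes a hypothetical table t with a nullable INT column num.
-- Step 1: replace existing NULLs so the new constraint can be applied.
UPDATE t SET num = 0 WHERE num IS NULL;

-- Step 2: forbid NULLs from now on and default missing values to 0.
ALTER TABLE t MODIFY num INT NOT NULL DEFAULT 0;

-- The NULL check can then be written as an equality test.
SELECT id FROM t WHERE num = 0;
```

If 0 is a real value in the data, this rewrite would merge "unknown" with "zero", so a different sentinel or keeping null may be the better choice.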

3. Try to avoid using the != or <> operator in the where clause, otherwise the engine may give up using the index and perform a full table scan.

4. Try to avoid using or to join conditions in the where clause, otherwise the engine may give up using the index and perform a full table scan. For example: select id from t where num=10 or num=20. You can query like this instead: select id from t where num=10 union all select id from t where num=20

5. in and not in should also be used with caution, as they may cause a full table scan. For example: select id from t where num in(1,2,3). For continuous values, use between instead of in where you can: select id from t where num between 1 and 3

6. The following query will also lead to a full table scan: select id from t where name like '%李%'. To improve efficiency, consider full-text search.
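A minimal sketch of the full-text alternative in MySQL. The table definition and the index name ft_name are hypothetical, and the search term is an arbitrary example.

```sql
-- Hypothetical table; FULLTEXT indexes on InnoDB need MySQL 5.6 or later.
CREATE TABLE t (
  id   INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  FULLTEXT INDEX ft_name (name)
);

-- Matches rows whose name contains the word 'smith'.
SELECT id FROM t WHERE MATCH(name) AGAINST('smith' IN NATURAL LANGUAGE MODE);
```

Full-text search matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for like '%...%'. For Chinese, Japanese, or Korean text, MySQL 5.7.6 and later also provide an ngram parser that may suit the '%李%' case better.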

7. Using parameters in the where clause can also cause a full table scan. SQL resolves local variables only at runtime, but the optimizer cannot defer the choice of access plan until runtime; it must choose at compile time. When the access plan is built at compile time, the value of the variable is still unknown and so cannot be used as input for index selection. For example, the following statement will perform a full table scan: select id from t where num=@num. It can be changed to force the query to use an index: select id from t with(index(index name)) where num=@num. (The with(index(...)) hint is SQL Server syntax; the MySQL counterpart is FORCE INDEX.)

8. Try to avoid performing expression operations on fields in the where clause, which will cause the engine to abandon the index and perform a full table scan. For example: select id from t where num/2=100 should be changed to: select id from t where num=100*2

9. Try to avoid applying functions to fields in the where clause, which will cause the engine to give up using the index and perform a full table scan. For example, to find the ids whose name starts with abc, select id from t where substring(name,1,3)='abc' should be changed to:

select id from t where name like 'abc%'
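The two rewrites in tips 8 and 9 can be compared side by side with EXPLAIN. This is a sketch in MySQL syntax that assumes a hypothetical table t with single-column indexes on num and name; the actual plans depend on the server version and the data.

```sql
-- Assumes hypothetical indexes on t(num) and t(name).

-- Expression on the column: the index on num cannot be used for a lookup.
EXPLAIN SELECT id FROM t WHERE num / 2 = 100;
-- Constant moved to the right-hand side: a plain index lookup is possible.
EXPLAIN SELECT id FROM t WHERE num = 100 * 2;

-- Function on the column versus an equivalent prefix match.
EXPLAIN SELECT id FROM t WHERE SUBSTRING(name, 1, 3) = 'abc';
EXPLAIN SELECT id FROM t WHERE name LIKE 'abc%';
```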

10. Do not perform functions, arithmetic operations, or other expression operations on the left side of “=” in the where clause, otherwise the system may not be able to use the index correctly.

11. When an indexed field is used as a condition and the index is a compound index, the first field of the index must appear in the condition for the system to use the index; otherwise the index will not be used. The order of the fields in the condition should also match the index order as far as possible.
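The leading-column rule can be checked with a small experiment. The sketch below is MySQL syntax; the orders table and the index name idx_cust_status are hypothetical.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT NOT NULL,
  status      TINYINT NOT NULL,
  created_at  DATETIME NOT NULL,
  INDEX idx_cust_status (customer_id, status)
);

-- Uses the leading column: the composite index is a candidate for a lookup.
EXPLAIN SELECT id FROM orders WHERE customer_id = 42 AND status = 1;

-- Skips the leading column: the index typically cannot be used for a direct lookup.
EXPLAIN SELECT id FROM orders WHERE status = 1;
```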

12. Do not write meaningless queries. For example, to generate an empty table structure: select col1, col2 into #t from t where 1=0

This kind of code returns no result set but still consumes system resources. It should be changed to:
create table #t(…)

13. In many cases, using exists instead of in is a good choice: select num from a where num in(select num from b)

Replace it with the following statement:
select num from a where exists(select 1 from b where num=a.num)

14. Not all indexes are effective for queries. SQL optimizes queries based on the data in the table, and when an indexed column contains many repeated values, the query may not use the index. For example, if a table has a field sex whose values are roughly half male and half female, then even an index on sex does nothing for query efficiency.
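One rough way to judge whether a column is worth indexing is to estimate its selectivity. The query below is a sketch that assumes a hypothetical table t with a sex column, as in the example above.

```sql
-- Assumes a hypothetical table t with a column sex.
-- A ratio near 1 means almost every value is distinct (a selective column);
-- a ratio near 0 means few distinct values, so an index on it filters poorly.
SELECT COUNT(DISTINCT sex) / COUNT(*) AS selectivity FROM t;
```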

15. More indexes are not always better. While an index can improve the efficiency of the corresponding select, it also reduces the efficiency of insert and update, because the index may have to be rebuilt during insert or update. How to build indexes therefore needs careful consideration, case by case. A table should preferably have no more than 6 indexes; if there are more, consider whether the indexes on rarely used columns are really necessary.

16. Avoid updating clustered index data columns as much as possible, because the order of the clustered index data columns is the physical storage order of the table records. Once a column value changes, the order of the records in the entire table has to be adjusted, which consumes considerable resources. If the application needs to update the clustered index data columns frequently, consider whether the index should be clustered at all.

17. Use numeric fields as much as possible. If a field contains only numeric information, try not to design it as a character type, which reduces the performance of queries and joins and increases storage overhead. This is because the engine compares a string character by character when processing queries and joins, whereas a numeric type needs only one comparison.

18. Use varchar/nvarchar instead of char/nchar as much as possible. First, variable-length fields take less storage space; second, searching within a smaller field is clearly more efficient for queries.

19. Do not use select * from t anywhere. Replace “*” with a specific field list, and do not return any fields that are not used.

20. Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, note that its indexes are very limited (only the primary key index).

21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

22. Temporary tables are not off limits. Used appropriately, they can make certain routines more efficient, for example when you need to refer repeatedly to a data set from a large table or a commonly used table. For one-time events, however, it is better to use a derived table.

23. When creating a temporary table, if a large amount of data is inserted at one time, use select into instead of create table to avoid generating a large amount of log and to increase speed. If the amount of data is small, create table first and then insert, to ease the load on system table resources.

24. If temporary tables are used, all of them must be explicitly deleted at the end of the stored procedure: first truncate table, then drop table. This avoids holding locks on system tables for a long time.

25. Try to avoid cursors, because they are inefficient. If a cursor operates on more than 10,000 rows, consider rewriting the code.

26. Before using a cursor-based method or a temporary table method, first look for a set-based solution to the problem. The set-based method is usually more efficient.
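As an illustration of what a set-based solution can look like, the sketch below copies a value from one table to another in a single statement instead of looping over rows. It uses MySQL's multi-table UPDATE syntax; the orders and customers tables and their columns are hypothetical.

```sql
-- Hypothetical tables: orders(customer_id, region) and customers(id, region).
-- One statement updates every matching row, replacing a loop that would
-- fetch each order and update it individually.
UPDATE orders AS o
JOIN customers AS c ON c.id = o.customer_id
SET o.region = c.region;
```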

27. Like temporary tables, cursors are not off limits. Using FAST_FORWARD cursors on small data sets is often better than other row-by-row processing methods, especially when you must reference several tables to get the data you need. Routines that include totals in the result set usually execute faster than cursor-based ones. If development time permits, try both the cursor-based method and the set-based method to see which works better.

28. Put SET NOCOUNT ON at the beginning of every stored procedure and trigger, and SET NOCOUNT OFF at the end. This removes the need to send a DONE_IN_PROC message to the client after each statement of the stored procedure or trigger executes.

29. Try to avoid large transactions in order to improve system concurrency.
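One common way to keep transactions small is to split a bulk change into batches. The sketch below uses MySQL's DELETE ... LIMIT; the logs table, the cutoff date, and the batch size of 1000 are assumptions chosen for illustration.

```sql
-- Hypothetical table logs with a created_at column.
-- Delete in small batches; each statement commits on its own under autocommit,
-- so locks are held only briefly.
DELETE FROM logs WHERE created_at < '2020-01-01' LIMIT 1000;
-- Repeat the statement from application code until it affects 0 rows.
```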

30. Try to avoid returning a large amount of data to the client. If the volume is too large, consider whether the underlying requirement is reasonable.
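When a large result really is needed, returning it page by page is one option. The sketch below shows keyset pagination in MySQL syntax, assuming a hypothetical table t with primary key id and a name column. The page size of 100 is arbitrary.

```sql
-- Hypothetical table t with primary key id.
-- First page: the 100 rows with the smallest ids.
SELECT id, name FROM t ORDER BY id LIMIT 100;

-- Next page: continue after the last id the client received (assumed here to be 100).
SELECT id, name FROM t WHERE id > 100 ORDER BY id LIMIT 100;
```

Filtering on id rather than using a growing OFFSET lets each page start from an index position instead of skipping over the rows already returned.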