MySQL – Counting – Optimizing Each Join

Results:
I have used three methods:

> Three subqueries, each with one join (mine)
> Three subqueries, no join, filtering with where (SlimsGhost)
> One query with three joins (Solarflare)

I used "explain" and "analyze" to gather some statistics, which show the work that has to be done for each query. The results below are not surprising:

Relative results:

> Three subqueries with a simple join: 100%
> Three subqueries with a where clause: 79%
> One query with a triple join: 1715%

Original Post

The idea is to join 4 tables, using the same PK each time, and then count separately the number of rows each join yields.

The obvious answer is to do each join… separately, with subqueries.

But is it possible to do it with one query? Will it be more efficient?

select "LES CIGARES DU PHARAON" as "Titre",
    (select count(payalb.idPays)
     from album alb
     left join pays_album payalb using (idAlb)
     where alb.titreAlb = "LES CIGARES DU PHARAON") as "Pays",
    (select count(peralb.idPers)
     from album alb
     left join pers_album peralb using (idAlb)
     where alb.titreAlb = "LES CIGARES DU PHARAON") as "Personnages",
    (select count(juralb.idJur)
     from album alb
     left join juron_album juralb using (idAlb)
     where alb.titreAlb = "LES CIGARES DU PHARAON") as "Jurons"
;
+------------------------+------+-------------+--------+
| Titre                  | Pays | Personnages | Jurons |
+------------------------+------+-------------+--------+
| LES CIGARES DU PHARAON |    3 |          13 |     50 |
+------------------------+------+-------------+--------+

Table album: 22 rows

Table pays_album: 45 rows

Table personnage_album: 100 rows

Table juron_album: 1704 rows

This is what I tried:

select alb.titreAlb as "Titre",
    sum(case when alb.idAlb = payalb.idAlb then 1 else 0 end) "Pays",
    sum(case when alb.idAlb = peralb.idAlb then 1 else 0 end) "Personnages",
    sum(case when alb.idAlb = juralb.idAlb then 1 else 0 end) "Jurons"
from album alb
left join pays_album payalb using (idAlb)
left join pers_album peralb using (idAlb)
left join juron_album juralb using (idAlb)
where alb.titreAlb = "LES CIGARES DU PHARAON"
group by alb.titreAlb
;
+------------------------+------+-------------+--------+
| Titre                  | Pays | Personnages | Jurons |
+------------------------+------+-------------+--------+
| LES CIGARES DU PHARAON | 1950 |        1950 |   1950 |
+------------------------+------+-------------+--------+

But it counts the total number of rows in the full joined table… (1950 = 3 * 13 * 50)
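The multiplication can be reproduced on toy data. What follows is only a minimal sketch, not the original database: it uses SQLite through Python's sqlite3 module instead of MySQL, purely so that it is self-contained, and the rows (3, 13 and 50 link rows for a single album) are made up to match the counts above. The column names idPays, idPers and idJur come from the queries in the post; the rest of the table layout is an assumption.

```python
import sqlite3

# Toy stand-in for the schema (assumption: simplified columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table album (idAlb integer primary key, titreAlb text);
    create table pays_album (idAlb integer, idPays integer);
    create table pers_album (idAlb integer, idPers integer);
    create table juron_album (idAlb integer, idJur integer);
    insert into album values (1, 'LES CIGARES DU PHARAON');
""")
# 3 countries, 13 characters and 50 swear words linked to album 1.
for table, n in (("pays_album", 3), ("pers_album", 13), ("juron_album", 50)):
    conn.executemany(f"insert into {table} values (1, ?)", [(i,) for i in range(n)])

# Every row of one link table is paired with every row of the others.
joined_rows = conn.execute("""
    select count(*)
    from album alb
    left join pays_album payalb using (idAlb)
    left join pers_album peralb using (idAlb)
    left join juron_album juralb using (idAlb)
    where alb.titreAlb = 'LES CIGARES DU PHARAON'
""").fetchone()[0]

print(joined_rows)  # 3 * 13 * 50 = 1950
```

Since every joined row has a matching idAlb in all three link tables, each sum(case …) ends up counting all of these rows, which is why the three columns show the same number.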

Schema: https://github.com/LittleNooby/gbd2015-2016/blob/master/tintin_schema.png

Table content: https://github.com/LittleNooby/gbd2015-2016/blob/master/tintin_description

If you want to play with it:

db_init: https://github.com/LittleNooby/gbd2015-2016/blob/master/tintin_ok.mysql

For optimization purposes, a good rule of thumb is to join less, not more. In fact, you should try to join as few rows as possible. With every additional join you multiply the cost instead of adding to it, because MySQL basically just generates one big multiplied matrix. A lot of that gets optimized away by indexes and other things.

But to answer your question: assuming the tables have unique keys and idAlb is a unique key of album, it is actually possible to count with only one big join. Then, and only then, can you do it with something similar to your code:

select alb.titreAlb as "Titre",
    count(distinct payalb.idAlb, payalb.PrimaryKeyFields) "Pays",
    count(distinct peralb.idAlb, peralb.PrimaryKeyFields) "Personnages",
    count(distinct juralb.idAlb, juralb.PrimaryKeyFields) "Jurons"
from album alb
left join pays_album payalb using (idAlb)
left join pers_album peralb using (idAlb)
left join juron_album juralb using (idAlb)
where alb.titreAlb = "LES CIGARES DU PHARAON"
group by alb.titreAlb

where PrimaryKeyFields stands for the primary key fields of the joined tables (you have to look them up).
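To check the idea on toy data, here is a hedged sketch. It again uses SQLite through Python's sqlite3 module rather than MySQL, with made-up rows. SQLite accepts only one expression inside count(distinct …), so the sketch counts idPays, idPers and idJur alone, assuming that each of them together with idAlb is the primary key of its link table. With a single album selected, idAlb is constant, so the result is the same.

```python
import sqlite3

# Toy stand-in for the schema (assumption: simplified columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table album (idAlb integer primary key, titreAlb text);
    create table pays_album (idAlb integer, idPays integer, primary key (idAlb, idPays));
    create table pers_album (idAlb integer, idPers integer, primary key (idAlb, idPers));
    create table juron_album (idAlb integer, idJur integer, primary key (idAlb, idJur));
    insert into album values (1, 'LES CIGARES DU PHARAON');
""")
for table, n in (("pays_album", 3), ("pers_album", 13), ("juron_album", 50)):
    conn.executemany(f"insert into {table} values (1, ?)", [(i,) for i in range(n)])

# One big join; distinct collapses the duplicates the other joins introduce.
row = conn.execute("""
    select alb.titreAlb,
           count(distinct payalb.idPays),
           count(distinct peralb.idPers),
           count(distinct juralb.idJur)
    from album alb
    left join pays_album payalb using (idAlb)
    left join pers_album peralb using (idAlb)
    left join juron_album juralb using (idAlb)
    where alb.titreAlb = 'LES CIGARES DU PHARAON'
    group by alb.titreAlb
""").fetchone()

print(row)  # ('LES CIGARES DU PHARAON', 3, 13, 50)
```

The counts are right again, but the join still produces all 1950 intermediate rows before distinct reduces them.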

Distinct removes the effect the other joins have on the count. Unfortunately, in general, distinct does not remove the effect the joins have on the cost.

However, if you have indexes covering all (idAlb + PrimaryKeyFields) fields of your tables, it might even be as fast as the original solution (because it can optimize the distinct so that no sort is needed) and come close to what you had in mind (just walking through each table/index once). But in the normal or worst case, it should perform worse than a reasonable solution (such as SlimsGhost's), because it is doubtful that it will find the optimal strategy. But play around with it and check the explains (and post your findings); maybe MySQL will do something crazy.
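SlimsGhost's query is not reproduced on this page, so the following is only a guess at its general shape ("three subqueries, no join, filter with where"): each link table is counted on its own, filtered by the album's idAlb. As before, this is a sketch on made-up rows, run against SQLite through Python's sqlite3 module instead of MySQL, and the table layout is an assumption.

```python
import sqlite3

# Toy stand-in for the schema (assumption: simplified columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table album (idAlb integer primary key, titreAlb text);
    create table pays_album (idAlb integer, idPays integer);
    create table pers_album (idAlb integer, idPers integer);
    create table juron_album (idAlb integer, idJur integer);
    insert into album values (1, 'LES CIGARES DU PHARAON');
""")
for table, n in (("pays_album", 3), ("pers_album", 13), ("juron_album", 50)):
    conn.executemany(f"insert into {table} values (1, ?)", [(i,) for i in range(n)])

# Hypothetical reconstruction: no joins at all, each subquery only
# touches the rows of one link table that belong to the album.
counts = conn.execute("""
    select alb.titreAlb,
           (select count(*) from pays_album  where idAlb = alb.idAlb),
           (select count(*) from pers_album  where idAlb = alb.idAlb),
           (select count(*) from juron_album where idAlb = alb.idAlb)
    from album alb
    where alb.titreAlb = 'LES CIGARES DU PHARAON'
""").fetchone()

print(counts)  # ('LES CIGARES DU PHARAON', 3, 13, 50)
```

Here each link table is read once for the matching idAlb (3 + 13 + 50 rows in this toy data) instead of 3 * 13 * 50 joined rows.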

