Performance – The query is very slow when loading on PostgreSQL

We are using PostgreSQL version 9.4 database on Amazon EC2. All of our queries ran very slow on the first attempt, until after it was cached they were very fast, but it was not Mediation because it slows down the page loading speed.

In one of the queries we used:

SELECT HE.fs_perm_sec_id,
HE.TICKER_EXCHANGE,
HE.proper_name,
OP.shares_outstanding,

(SELECT factset_industry_desc
FROM factset_industry_map AS fim
WHERE fim.factset_industry_code = HES.industry_code) AS industry,

(SELECT SUM(POSITION) AS ST_HOLDINGS
FROM OWN_STAKES_HOLDINGS S
WHERE S.POSITION> 0
AND S.fs_perm_sec_id = HE .fs_perm_sec_id
GROUP BY FS_PERM_SEC_ID) AS stake_holdings,

(SELECT SUM(CURRENT_HOLDINGS)
FROM
(SELECT CURRENT_HOLDINGS
FROM OWN_INST_HOLDINGS IHT
WHERE FS_PERM_SEC_ID=HE.FS_PERM_SEC_ID
ORDER BY CURRENT_HOLDINGS DESC LIMIT 10)A) AS top_10_inst_hodings,

(SELECT SUM(OIH.current_holdings)
FROM own_inst_holdings /> WHERE OIH.fs_perm_sec_id = HE.fs_ perm_sec_id) AS inst_holdings

FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector HES ON HES.factset_entity_id = HES HE.ticker_exchange ='PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1

Run EXPLAIN ANALYZE and receive the following results:

QUERY PLAN
Limit (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
-> Sort (cost=223.39.. 223.39 rows=1 width=100) (actual time=2420.643..2420.643 rows=1 loops=1)
Sort Key: op.price_date
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
-> Nested Loop (cost=0.17..8.87 rows =1 width=104) (actual time=3.958..5.084 rows=36 loops=1)
-> Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he (cost=0.09..4 .09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
Index Cond: ((ticker_exchange)::text ='PG-NYS'::text)
-> Index Scan using alex_prices on own_prices op (cost=0.09..4.68 rows=33 width=23) (actual time=2.496..3.592 rows=36 loops=1)
Index Cond: ((fs_perm_sec_id )::text = (he.fs_perm_sec_id)::text)
-> Index Scan using alex_factset_entity_idx on h_entity_sector hes (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows =1 loops=36)
Index Cond: (factset_entity_id = he.factset_entity_id)
SubPlan 1
-> Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36)
Index Cond: (factset_industry_code = hes.industry_code)
Heap Fetches: 0
SubPlan 2
-> GroupAg gregate (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
Group Key: s.fs_perm_sec_id
-> Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s (cost=0.08..2.15 rows=14 width=17) (actual time=0.080..0.713 rows=39 loops=36)
Index Cond: ((fs_perm_sec_id = (he.fs_perm_sec_id): :text) AND (\position\> 0::numeric))
Heap Fetches: 1155
SubPlan 3
-> Aggregate (cost=11.25..11.26 rows=1 width=6) (actual time=0.166..0.166 rows=1 loops=36)
-> Limit (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36 )
-> Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
Index Cond: (fs_perm_sec_ id = (he.fs_perm_sec_id)::text)
Heap Fetches: 288
SubPlan 4
-> Aggregate (cost=194.96..194.96 rows=1 width=6) (actual time= 66.102..66.102 rows=1 loops=36)
-> Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih (cost=0.09..194.87 rows=175 width=6) (actual time=0.060..65.209 rows=2505 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 33453
Planning time: 1.581 ms
Execution time: 2420.830 ms< /pre>

Once we disable SELECT SUM() for 3 aggregations, it will speed up greatly, but it will destroy the degree of having a relational database.

We use the PG plugin (https:/ /www.npmjs.com/package/pg) Run queries on NodeJS to connect and run queries on the database

How can we speed up the query? What additional steps can we take? We have indexed the database and all fields seem to be indexed correctly, but still not fast enough.

Any help, comments and/or suggestions are appreciated.

Nested loops with aggregation are usually a bad thing. This should be avoided below. (Untested; SQLFiddle will help. ) Give me a spin to let me know. I am curious how the engine uses window function filters.

WITH security
AS (
SELECT HE.fs_perm_sec_id
, HE.TICKER_EXCHANGE
, HE.proper_name
, OP.shares_outstanding
, OP.price_date
FROM own_prices AS OP
JOIN h_security_ticker_exchange AS HE
ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector AS HES
ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange ='PG-NYS'
)
SELECT SE.fs_perm_sec_id
, SE.TICKER_EXCHANGE
, SE.proper_name
, SE.shares_outstanding
, S.stake_holdings
, IHT.top_10_inst_holdings
, OIH.inst_holdings
FROM security SE
JOIN (
SELECT S.fs_perm_sec_id
, SUM(S.POSITION) AS stake_holdings
FROM OWN_STAKES_HOLDINGS AS S
WHERE S.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
AND S.POSITION> 0
GROUP BY S.fs_perm_sec_id
) AS S
ON SE.fs_perm_sec_id = S.fs_perm_sec_id
JOIN (
SELECT IHT.FS_PERM_SEC_ID
, SUM(IHT.CURRENT_HOLDINGS) AS top_10_inst_holdings
FROM OWN_INST_HOLDINGS AS IHT
WHERE IHT.FS_PERM_SEC_ID IN (
SELECT fs_sec_ID IN (
FROM SELECT fs_sec_ID IN (
FROM OWN_INST_HOLDINGS AS IHT)
> )
AND ROW_NUMBER() OVER (
PARTITION BY IHT.FS_PERM_SEC_ID
ORDER BY IHT.CURRENT_HOLDINGS DESC
) <= 10
GROUP BY IHT.FS_PERM_SEC_ID
) AS IHT
ON SE.fs_perm_sec_id = IHT.fs_perm_sec_id
JOIN (
SELECT S.fs_perm_sec_id
, SUM(OIH.current_holdings ) AS inst_holdings
FROM own_inst_holdings AS OIH
WHERE OIH.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
GROUP BY OIH.fs_perm_sec_id
) AS OIH
ON SE.fs_perm_sec_id = OIH.fs_perm_sec_id
ORDER BY SE.price_date
LIMIT 1

We are on Amazon EC2 PostgreSQL version 9.4 is used on the database. All of our queries ran very slow on the first attempt, and they were very fast until after it was cached, but it was not a mediation because it slowed down the page loading speed.

p>

In one of the queries we used:

SELECT HE.fs_perm_sec_id,
HE.TICKER_EXCHANGE,
HE.proper_name,< br /> OP.shares_outstanding,

(SELECT factset_industry_desc
FROM factset_industry_map AS fim
WHERE fim.factset_industry_code = HES. industry_code) AS industry,

(SELECT SUM(POSITION) AS ST_HOLDINGS
FROM OWN_STAKES_HOLDINGS S
WHERE S.POSITION> 0
AND S.fs_perm_sec_id = HE.fs_perm_sec_id
GROUP BY FS_PERM_SEC_ID) AS stake_holdings,

(SELECT SUM(CURRENT_HOLDINGS)
FROM
(SELECT CURRENT_HOLDINGS
FROM OWN_INST_HOLDINGS IHT
WHERE FS_PERM_SEC =HE.FS_PERM_SEC_ID
ORDER BY CURRENT_HOLDINGS DESC LIMIT 10)A) AS top_10_inst_hodings,

(SELECT SUM(OIH.current_holdings)
FROM own_inst_holdings OIH
WHERE OIH. fs_perm_sec_id = HE.fs_perm_sec_id) AS inst_holdings

FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
IN h_entity_sector_fact. br />WHERE HE.ticker_exchange ='PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1

Run EXPLAIN ANALYZE and receive the following results:

QUERY PLAN
Limit (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
-> Sort (cost=223.39..223.39 rows=1 width=100) (actual time=2420.643.. 2420.643 rows=1 loops=1)
Sort Key: op.price_date
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
-> Nested Loop (cost=0.17..8.87 rows=1 width=104) (actual time=3.958..5.084 rows =36 loops=1)
-> Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he (cost=0.09..4.09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
Index Cond: ((ticker_exchange)::text ='PG-NYS'::text)
-> Index Scan using alex_prices on own_prices op (cost=0.09..4.68 rows=33 width=23) ( actual time=2.496..3.592 rows=36 loops=1)
Index Cond: ((fs_perm_sec_id)::text = (he.fs_perm_sec_id)::text)
-> Index Scan using alex_factset_entity_idx on h_entity_sector hes (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows=1 loops=36)
Index Cond: (factset_entity_id = he. factset_entity_id)
SubPlan 1
-> Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36 )
Index Cond: (factset_industry_code = hes.industry_code)
Heap Fetches: 0
SubPlan 2
-> GroupAggregate (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
Group Key: s.fs_perm_sec_id
-> Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s (cost=0.08..2.15 rows=14 width =17) (actual time=0.080..0.713 rows=39 loops=36)
Index Cond: ((fs_perm_sec_id = (he.fs_perm_s ec_id)::text) AND (\position\> 0::numeric))
Heap Fetches: 1155
SubPlan 3
-> Aggregate (cost=11.25..11.26 rows=1 width =6) (actual time=0.166..0.166 rows=1 loops=36)
-> Limit (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36)
-> Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 288
SubPlan 4
-> Aggregate (cost=194.96..194.96 rows=1 width= 6) (actual time=66.102..66.102 rows=1 loops=36)
-> Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih (cost=0.09..194.87 rows=175 width=6) (actual time=0.060 ..65.209 rows=2505 loo ps=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 33453
Planning time: 1.581 ms
Execution time: 2420.830 ms< /pre>

Once we disable SELECT SUM() for 3 aggregations, it will speed up greatly, but it will destroy the degree of having a relational database.

We use the PG plugin (https:/ /www.npmjs.com/package/pg) Run queries on NodeJS to connect and run queries on the database

How can we speed up the query? What additional steps can we take? We have indexed the database and all fields seem to be indexed correctly, but still not fast enough.

Any help, comments and/or suggestions are appreciated.

Nested loops with aggregation are usually a bad thing. The following should avoid this. (Untested; SQLFiddle will help.) Give me a spin and let me know. I'm curious How the engine uses window function filters.

WITH security
AS (
SELECT HE.fs_perm_sec_id
, HE.TICKER_EXCHANGE< br />, HE.proper_name
, OP.shares_outstanding
, OP.price_date
FROM own_prices AS OP
JOIN h_security_ticker_exchange AS HE
ON OP.fs_perm_sec_id = HE. fs_perm_sec_id
JOIN h_entity_sector AS HES
ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange ='PG-NYS'
)
SELECT SE.fs_perm_sec_id
, SE.TICKER_EXCHANGE
, SE.proper_name
, SE.shares_outstanding
, S.stake_holdings
, IHT.top_10_inst_holdings
, OIH.inst_holdings
FROM security SE
JOIN (
SELECT S.fs_perm_sec_id
, S UM(S.POSITION) AS stake_holdings
FROM OWN_STAKES_HOLDINGS AS S
WHERE S.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
AND S. POSITION> 0
GROUP BY S.fs_perm_sec_id
) AS S
ON SE.fs_perm_sec_id = S.fs_perm_sec_id
JOIN (
SELECT IHT.FS_PERM_SEC_ID
, SUM(IHT.CURRENT_HOLDINGS) AS top_10_inst_holdings
FROM OWN_INST_HOLDINGS AS IHT
WHERE IHT.FS_PERM_SEC_ID IN (
SELECT fs_perm_sec_id
FROM security
)
) OVER (
PARTITION BY IHT.FS_PERM_SEC_ID
ORDER BY IHT.CURRENT_HOLDINGS DESC
) <= 10
GROUP BY IHT.FS_PERM_SEC_ID
) AS IHT
ON SE.fs_perm_sec_id = IHT.fs_perm_sec_id
JOIN (
SELECT S.fs_perm_sec_id
, SUM(OIH.current_holdings) AS inst_holdings
FROM own_inst_holdings AS OI
WHERE OIH.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
GROUP BY OIH.fs_perm_sec_id
) AS OIH
ON SE. fs_perm_sec_id = OIH.fs_perm_sec_id
ORDER BY SE.price_date
LIMIT 1

Leave a Comment

Your email address will not be published.