My SQL - trying to optimize Query returns more rows
up vote
-1
down vote
favorite
I have one query which I inherit from my previous collegue, but I need to optimize it.
This query returns 72 rows.
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.
I have done like this (only difference is in one sub query):
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!
mysql select subquery query-optimization distinct
|
show 1 more comment
up vote
-1
down vote
favorite
I have one query which I inherit from my previous collegue, but I need to optimize it.
This query returns 72 rows.
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.
I have done like this (only difference is in one sub query):
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!
mysql select subquery query-optimization distinct
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06
|
show 1 more comment
up vote
-1
down vote
favorite
up vote
-1
down vote
favorite
I have one query which I inherit from my previous collegue, but I need to optimize it.
This query returns 72 rows.
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.
I have done like this (only difference is in one sub query):
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!
mysql select subquery query-optimization distinct
I have one query which I inherit from my previous collegue, but I need to optimize it.
This query returns 72 rows.
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.
I have done like this (only difference is in one sub query):
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!
mysql select subquery query-optimization distinct
mysql select subquery query-optimization distinct
edited Nov 2 at 5:57
Rick James
65k55796
65k55796
asked Nov 1 at 14:29
Dejan
54982147
54982147
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06
|
show 1 more comment
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06
|
show 1 more comment
2 Answers
2
active
oldest
votes
up vote
0
down vote
- Don't use
SELECT *when you don't need all the columns. It looks likecontract_nois the only column needed fromdl, hence fromnew_installed_devices. - Is there some reason for testing
insert4datefor equality in that weird way? - Recommend
INDEX(insert4date, install_mark, dl)(in that order) - Try to avoid the construct
IN ( SELECT ... ). Usually it is better to useEXISTSorLEFT JOIN. - Don't say
DISTINCT(contract_no), ...--DISTINCTis not a function; it's effect applies to the entire set of expressions. Get rid ofDISTINCTsince theGROUP BYhas that effect. - Recommend
INDEX(contract_no, cm_max, tstamp)(in that order) - The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.
- Please provide
SHOW CREATE TABLE.
add a comment |
up vote
0
down vote
- You can replace the first subquery in the FROM clause with a direct reference to the table
new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them). - The range conditions for
mtmbl.time2can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery. - Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.
When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
new_installed_devices.id,
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac,
new_installed_devices.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
HAVING
devices_change.time2 >= '2018-10-28 06:59:59'
AND devices_change.time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac
ORDER BY
NULL
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
- Don't use
SELECT *when you don't need all the columns. It looks likecontract_nois the only column needed fromdl, hence fromnew_installed_devices. - Is there some reason for testing
insert4datefor equality in that weird way? - Recommend
INDEX(insert4date, install_mark, dl)(in that order) - Try to avoid the construct
IN ( SELECT ... ). Usually it is better to useEXISTSorLEFT JOIN. - Don't say
DISTINCT(contract_no), ...--DISTINCTis not a function; it's effect applies to the entire set of expressions. Get rid ofDISTINCTsince theGROUP BYhas that effect. - Recommend
INDEX(contract_no, cm_max, tstamp)(in that order) - The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.
- Please provide
SHOW CREATE TABLE.
add a comment |
up vote
0
down vote
- Don't use
SELECT *when you don't need all the columns. It looks likecontract_nois the only column needed fromdl, hence fromnew_installed_devices. - Is there some reason for testing
insert4datefor equality in that weird way? - Recommend
INDEX(insert4date, install_mark, dl)(in that order) - Try to avoid the construct
IN ( SELECT ... ). Usually it is better to useEXISTSorLEFT JOIN. - Don't say
DISTINCT(contract_no), ...--DISTINCTis not a function; it's effect applies to the entire set of expressions. Get rid ofDISTINCTsince theGROUP BYhas that effect. - Recommend
INDEX(contract_no, cm_max, tstamp)(in that order) - The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.
- Please provide
SHOW CREATE TABLE.
add a comment |
up vote
0
down vote
up vote
0
down vote
- Don't use
SELECT *when you don't need all the columns. It looks likecontract_nois the only column needed fromdl, hence fromnew_installed_devices. - Is there some reason for testing
insert4datefor equality in that weird way? - Recommend
INDEX(insert4date, install_mark, dl)(in that order) - Try to avoid the construct
IN ( SELECT ... ). Usually it is better to useEXISTSorLEFT JOIN. - Don't say
DISTINCT(contract_no), ...--DISTINCTis not a function; it's effect applies to the entire set of expressions. Get rid ofDISTINCTsince theGROUP BYhas that effect. - Recommend
INDEX(contract_no, cm_max, tstamp)(in that order) - The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.
- Please provide
SHOW CREATE TABLE.
- Don't use
SELECT *when you don't need all the columns. It looks likecontract_nois the only column needed fromdl, hence fromnew_installed_devices. - Is there some reason for testing
insert4datefor equality in that weird way? - Recommend
INDEX(insert4date, install_mark, dl)(in that order) - Try to avoid the construct
IN ( SELECT ... ). Usually it is better to useEXISTSorLEFT JOIN. - Don't say
DISTINCT(contract_no), ...--DISTINCTis not a function; it's effect applies to the entire set of expressions. Get rid ofDISTINCTsince theGROUP BYhas that effect. - Recommend
INDEX(contract_no, cm_max, tstamp)(in that order) - The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.
- Please provide
SHOW CREATE TABLE.
answered Nov 2 at 5:55
Rick James
65k55796
65k55796
add a comment |
add a comment |
up vote
0
down vote
- You can replace the first subquery in the FROM clause with a direct reference to the table
new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them). - The range conditions for
mtmbl.time2can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery. - Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.
When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
new_installed_devices.id,
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac,
new_installed_devices.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
HAVING
devices_change.time2 >= '2018-10-28 06:59:59'
AND devices_change.time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac
ORDER BY
NULL
add a comment |
up vote
0
down vote
- You can replace the first subquery in the FROM clause with a direct reference to the table
new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them). - The range conditions for
mtmbl.time2can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery. - Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.
When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
new_installed_devices.id,
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac,
new_installed_devices.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
HAVING
devices_change.time2 >= '2018-10-28 06:59:59'
AND devices_change.time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac
ORDER BY
NULL
add a comment |
up vote
0
down vote
up vote
0
down vote
- You can replace the first subquery in the FROM clause with a direct reference to the table
new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them). - The range conditions for
mtmbl.time2can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery. - Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.
When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
new_installed_devices.id,
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac,
new_installed_devices.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
HAVING
devices_change.time2 >= '2018-10-28 06:59:59'
AND devices_change.time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac
ORDER BY
NULL
- You can replace the first subquery in the FROM clause with a direct reference to the table
new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them). - The range conditions for
mtmbl.time2can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery. - Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.
When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
new_installed_devices.id,
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac,
new_installed_devices.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
HAVING
devices_change.time2 >= '2018-10-28 06:59:59'
AND devices_change.time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
new_installed_devices.contract_no,
new_installed_devices.customer,
new_installed_devices.address,
new_installed_devices.cm_mac
ORDER BY
NULL
answered Nov 11 at 14:05
Tomer Shay
537214
537214
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53103345%2fmy-sql-trying-to-optimize-query-returns-more-rows%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55
Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03
@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22
stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22
Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06