My SQL - trying to optimize Query returns more rows









up vote
-1
down vote

favorite












I have one query which I inherit from my previous collegue, but I need to optimize it.



This query returns 72 rows.



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.



I have done like this (only difference is in one sub query):



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!










share|improve this question























  • Have you proved that your query is incorrect or are are you assuming it is because it's different?
    – P.Salmon
    Nov 1 at 15:55










  • Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
    – Willem Renzema
    Nov 1 at 18:03










  • @WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
    – Dejan
    Nov 1 at 18:22










  • stackoverflow.com/questions/53101378/…
    – Dejan
    Nov 1 at 18:22










  • Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
    – Solarflare
    Nov 1 at 20:06














up vote
-1
down vote

favorite












I have one query which I inherit from my previous collegue, but I need to optimize it.



This query returns 72 rows.



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.



I have done like this (only difference is in one sub query):



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!










share|improve this question























  • Have you proved that your query is incorrect or are are you assuming it is because it's different?
    – P.Salmon
    Nov 1 at 15:55










  • Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
    – Willem Renzema
    Nov 1 at 18:03










  • @WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
    – Dejan
    Nov 1 at 18:22










  • stackoverflow.com/questions/53101378/…
    – Dejan
    Nov 1 at 18:22










  • Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
    – Solarflare
    Nov 1 at 20:06












up vote
-1
down vote

favorite









up vote
-1
down vote

favorite











I have one query which I inherit from my previous collegue, but I need to optimize it.



This query returns 72 rows.



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.



I have done like this (only difference is in one sub query):



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!










share|improve this question















I have one query which I inherit from my previous collegue, but I need to optimize it.



This query returns 72 rows.



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


It takes 20 seconds to execute.
I re-write it, trying to optimize it but in that case I have 75 rows (3 additional rows are returned), but result is presented in 2 seconds.



I have done like this (only difference is in one sub query):



SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;


Like you see I did not change a lot in my case but still I am getting more rows that it should be in result.
Can someone please tell me the cause why my second query does not return completely correct result. I tried many things to optimize but without a success.
Thanks a lot!!!







mysql select subquery query-optimization distinct






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 2 at 5:57









Rick James

65k55796




65k55796










asked Nov 1 at 14:29









Dejan

54982147




54982147











  • Have you proved that your query is incorrect or are are you assuming it is because it's different?
    – P.Salmon
    Nov 1 at 15:55










  • Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
    – Willem Renzema
    Nov 1 at 18:03










  • @WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
    – Dejan
    Nov 1 at 18:22










  • stackoverflow.com/questions/53101378/…
    – Dejan
    Nov 1 at 18:22










  • Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
    – Solarflare
    Nov 1 at 20:06
















  • Have you proved that your query is incorrect or are are you assuming it is because it's different?
    – P.Salmon
    Nov 1 at 15:55










  • Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
    – Willem Renzema
    Nov 1 at 18:03










  • @WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
    – Dejan
    Nov 1 at 18:22










  • stackoverflow.com/questions/53101378/…
    – Dejan
    Nov 1 at 18:22










  • Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
    – Solarflare
    Nov 1 at 20:06















Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55




Have you proved that your query is incorrect or are are you assuming it is because it's different?
– P.Salmon
Nov 1 at 15:55












Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03




Are the queries you posted exactly the same as what you are using? After reviewing them, I'm unsure why you would be getting extra rows. I did notice that you are using DISTINCT incorrectly, however. DISTINCT works on a per-row basis, not a per column basis. That said, I cannot immediately see why you are getting different results. I strongly recommended comparing the data and see if it is 3 new rows being returned, or more than 3 new ones, but also some old ones not being returned. Update your question with what you find.
– Willem Renzema
Nov 1 at 18:03












@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22




@WillemRenzema I checked once more now.. I copy pasted and run these queires and yes definteky I am getting 3 more rows by the second query. In second query I am getting some contract no, which does noe exists in first query but also and getting duplicated contract no (duplicated value). I am defintely getting more rows I really tried now exeucitng those two queries. Also please check for similar topic - the second last answer from Solarflare. He thinks it is ok that I am receiving different results but I am not sure also how. Please assist me if you can
– Dejan
Nov 1 at 18:22












stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22




stackoverflow.com/questions/53101378/…
– Dejan
Nov 1 at 18:22












Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06




Even if you "did not change a lot", it's not the same query, and it's just faster because it has to do less work. Could you clarify what you are looking for? An explanation why they are different (I tried to explain that, but if that's what's bothering you, I can elaborate). Are you trying to optimize query 1? Then add the required details (explain, ...). What you currently seem to be asking is how to magically make query 2 return the correct results while keeping it as fast, which just doesn't work as easy as that.
– Solarflare
Nov 1 at 20:06












2 Answers
2






active

oldest

votes

















up vote
0
down vote













  • Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from dl, hence from new_installed_devices.

  • Is there some reason for testing insert4date for equality in that weird way?

  • Recommend INDEX(insert4date, install_mark, dl) (in that order)

  • Try to avoid the construct IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.

  • Don't say DISTINCT(contract_no), ... -- DISTINCT is not a function; it's effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.

  • Recommend INDEX(contract_no, cm_max, tstamp) (in that order)

  • The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.

  • Please provide SHOW CREATE TABLE.





share|improve this answer



























    up vote
    0
    down vote













    • You can replace the first subquery in the FROM clause with a direct reference to the table new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).

    • The range conditions for mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.

    • Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.

    When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):



    ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
    ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
    ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);


    The query:



    SELECT
    new_installed_devices.id,
    new_installed_devices.contract_no,
    new_installed_devices.customer,
    new_installed_devices.address,
    new_installed_devices.cm_mac,
    new_installed_devices.aps
    FROM
    new_installed_devices AS d1
    LEFT JOIN
    (
    SELECT
    *
    FROM
    (SELECT
    devices_change.contract_no AS c_no,
    devices_change.cm_mac AS c_mc,
    MIN(devices_change.tstamp) AS time2,
    sum(1) AS aps
    FROM
    devices_change
    WHERE
    devices_change.contract_no IN (
    SELECT
    DISTINCT (devices_change.contract_no)
    FROM
    devices_change
    WHERE
    devices_change.tstamp >= '2018-10-28 06:59:59'
    AND devices_change.tstamp <= '2018-10-29 07:00:00'
    )
    GROUP BY
    devices_change.contract_no,
    devices_change.cm_mac
    HAVING
    devices_change.time2 >= '2018-10-28 06:59:59'
    AND devices_change.time2 <= '2018-10-29 07:00:00'
    ORDER BY
    NULL) AS mtmbl) AS tmp
    ON d1.contract_no = tmp.c_no
    WHERE
    aps > 0
    AND d1.insert4date >= '2018-10-28'
    AND d1.insert4date <= '2018-10-28'
    AND d1.install_mark < 2
    GROUP BY
    new_installed_devices.contract_no,
    new_installed_devices.customer,
    new_installed_devices.address,
    new_installed_devices.cm_mac
    ORDER BY
    NULL





    share|improve this answer




















      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53103345%2fmy-sql-trying-to-optimize-query-returns-more-rows%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      0
      down vote













      • Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from dl, hence from new_installed_devices.

      • Is there some reason for testing insert4date for equality in that weird way?

      • Recommend INDEX(insert4date, install_mark, dl) (in that order)

      • Try to avoid the construct IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.

      • Don't say DISTINCT(contract_no), ... -- DISTINCT is not a function; it's effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.

      • Recommend INDEX(contract_no, cm_max, tstamp) (in that order)

      • The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.

      • Please provide SHOW CREATE TABLE.





      share|improve this answer
























        up vote
        0
        down vote













        • Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from dl, hence from new_installed_devices.

        • Is there some reason for testing insert4date for equality in that weird way?

        • Recommend INDEX(insert4date, install_mark, dl) (in that order)

        • Try to avoid the construct IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.

        • Don't say DISTINCT(contract_no), ... -- DISTINCT is not a function; it's effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.

        • Recommend INDEX(contract_no, cm_max, tstamp) (in that order)

        • The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.

        • Please provide SHOW CREATE TABLE.





        share|improve this answer






















          up vote
          0
          down vote










          up vote
          0
          down vote









          • Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from dl, hence from new_installed_devices.

          • Is there some reason for testing insert4date for equality in that weird way?

          • Recommend INDEX(insert4date, install_mark, dl) (in that order)

          • Try to avoid the construct IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.

          • Don't say DISTINCT(contract_no), ... -- DISTINCT is not a function; it's effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.

          • Recommend INDEX(contract_no, cm_max, tstamp) (in that order)

          • The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.

          • Please provide SHOW CREATE TABLE.





          share|improve this answer












          • Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from dl, hence from new_installed_devices.

          • Is there some reason for testing insert4date for equality in that weird way?

          • Recommend INDEX(insert4date, install_mark, dl) (in that order)

          • Try to avoid the construct IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.

          • Don't say DISTINCT(contract_no), ... -- DISTINCT is not a function; it's effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.

          • Recommend INDEX(contract_no, cm_max, tstamp) (in that order)

          • The test on mtmbl.time2 is redundant since `MIN(tstamp) is already limited to that (1 day + 2 second) time range.

          • Please provide SHOW CREATE TABLE.






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 2 at 5:55









          Rick James

          65k55796




          65k55796






















              up vote
              0
              down vote













              • You can replace the first subquery in the FROM clause with a direct reference to the table new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).

              • The range conditions for mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.

              • Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.

              When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):



              ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
              ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
              ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);


              The query:



              SELECT
              new_installed_devices.id,
              new_installed_devices.contract_no,
              new_installed_devices.customer,
              new_installed_devices.address,
              new_installed_devices.cm_mac,
              new_installed_devices.aps
              FROM
              new_installed_devices AS d1
              LEFT JOIN
              (
              SELECT
              *
              FROM
              (SELECT
              devices_change.contract_no AS c_no,
              devices_change.cm_mac AS c_mc,
              MIN(devices_change.tstamp) AS time2,
              sum(1) AS aps
              FROM
              devices_change
              WHERE
              devices_change.contract_no IN (
              SELECT
              DISTINCT (devices_change.contract_no)
              FROM
              devices_change
              WHERE
              devices_change.tstamp >= '2018-10-28 06:59:59'
              AND devices_change.tstamp <= '2018-10-29 07:00:00'
              )
              GROUP BY
              devices_change.contract_no,
              devices_change.cm_mac
              HAVING
              devices_change.time2 >= '2018-10-28 06:59:59'
              AND devices_change.time2 <= '2018-10-29 07:00:00'
              ORDER BY
              NULL) AS mtmbl) AS tmp
              ON d1.contract_no = tmp.c_no
              WHERE
              aps > 0
              AND d1.insert4date >= '2018-10-28'
              AND d1.insert4date <= '2018-10-28'
              AND d1.install_mark < 2
              GROUP BY
              new_installed_devices.contract_no,
              new_installed_devices.customer,
              new_installed_devices.address,
              new_installed_devices.cm_mac
              ORDER BY
              NULL





              share|improve this answer
























                up vote
                0
                down vote













                • You can replace the first subquery in the FROM clause with a direct reference to the table new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).

                • The range conditions for mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.

                • Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.

                When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):



                ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
                ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
                ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);


                The query:



                SELECT
                new_installed_devices.id,
                new_installed_devices.contract_no,
                new_installed_devices.customer,
                new_installed_devices.address,
                new_installed_devices.cm_mac,
                new_installed_devices.aps
                FROM
                new_installed_devices AS d1
                LEFT JOIN
                (
                SELECT
                *
                FROM
                (SELECT
                devices_change.contract_no AS c_no,
                devices_change.cm_mac AS c_mc,
                MIN(devices_change.tstamp) AS time2,
                sum(1) AS aps
                FROM
                devices_change
                WHERE
                devices_change.contract_no IN (
                SELECT
                DISTINCT (devices_change.contract_no)
                FROM
                devices_change
                WHERE
                devices_change.tstamp >= '2018-10-28 06:59:59'
                AND devices_change.tstamp <= '2018-10-29 07:00:00'
                )
                GROUP BY
                devices_change.contract_no,
                devices_change.cm_mac
                HAVING
                devices_change.time2 >= '2018-10-28 06:59:59'
                AND devices_change.time2 <= '2018-10-29 07:00:00'
                ORDER BY
                NULL) AS mtmbl) AS tmp
                ON d1.contract_no = tmp.c_no
                WHERE
                aps > 0
                AND d1.insert4date >= '2018-10-28'
                AND d1.insert4date <= '2018-10-28'
                AND d1.install_mark < 2
                GROUP BY
                new_installed_devices.contract_no,
                new_installed_devices.customer,
                new_installed_devices.address,
                new_installed_devices.cm_mac
                ORDER BY
                NULL





                share|improve this answer






















                  up vote
                  0
                  down vote










                  up vote
                  0
                  down vote









                  • You can replace the first subquery in the FROM clause with a direct reference to the table new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).

                  • The range conditions for mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.

                  • Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.

                  When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):



                  ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
                  ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
                  ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);


                  The query:



                  SELECT
                  new_installed_devices.id,
                  new_installed_devices.contract_no,
                  new_installed_devices.customer,
                  new_installed_devices.address,
                  new_installed_devices.cm_mac,
                  new_installed_devices.aps
                  FROM
                  new_installed_devices AS d1
                  LEFT JOIN
                  (
                  SELECT
                  *
                  FROM
                  (SELECT
                  devices_change.contract_no AS c_no,
                  devices_change.cm_mac AS c_mc,
                  MIN(devices_change.tstamp) AS time2,
                  sum(1) AS aps
                  FROM
                  devices_change
                  WHERE
                  devices_change.contract_no IN (
                  SELECT
                  DISTINCT (devices_change.contract_no)
                  FROM
                  devices_change
                  WHERE
                  devices_change.tstamp >= '2018-10-28 06:59:59'
                  AND devices_change.tstamp <= '2018-10-29 07:00:00'
                  )
                  GROUP BY
                  devices_change.contract_no,
                  devices_change.cm_mac
                  HAVING
                  devices_change.time2 >= '2018-10-28 06:59:59'
                  AND devices_change.time2 <= '2018-10-29 07:00:00'
                  ORDER BY
                  NULL) AS mtmbl) AS tmp
                  ON d1.contract_no = tmp.c_no
                  WHERE
                  aps > 0
                  AND d1.insert4date >= '2018-10-28'
                  AND d1.insert4date <= '2018-10-28'
                  AND d1.install_mark < 2
                  GROUP BY
                  new_installed_devices.contract_no,
                  new_installed_devices.customer,
                  new_installed_devices.address,
                  new_installed_devices.cm_mac
                  ORDER BY
                  NULL





                  share|improve this answer












                  • You can replace the first subquery in the FROM clause with a direct reference to the table new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).

                  • The range conditions for mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.

                  • Can you provide the SHOW CREATE TABLE of these tables and the EXPLAIN for the query? It can be helpful.

                  When guessing the order MySQL will choose here, you can try to add these indexes and run the following query, to see if it works better. I applied the recommendations above to the query here below (hope my guesses about columns origins were correct, otherwise please fix everything accordingly):



                  ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
                  ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
                  ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);


                  The query:



                  SELECT
                  new_installed_devices.id,
                  new_installed_devices.contract_no,
                  new_installed_devices.customer,
                  new_installed_devices.address,
                  new_installed_devices.cm_mac,
                  new_installed_devices.aps
                  FROM
                  new_installed_devices AS d1
                  LEFT JOIN
                  (
                  SELECT
                  *
                  FROM
                  (SELECT
                  devices_change.contract_no AS c_no,
                  devices_change.cm_mac AS c_mc,
                  MIN(devices_change.tstamp) AS time2,
                  sum(1) AS aps
                  FROM
                  devices_change
                  WHERE
                  devices_change.contract_no IN (
                  SELECT
                  DISTINCT (devices_change.contract_no)
                  FROM
                  devices_change
                  WHERE
                  devices_change.tstamp >= '2018-10-28 06:59:59'
                  AND devices_change.tstamp <= '2018-10-29 07:00:00'
                  )
                  GROUP BY
                  devices_change.contract_no,
                  devices_change.cm_mac
                  HAVING
                  devices_change.time2 >= '2018-10-28 06:59:59'
                  AND devices_change.time2 <= '2018-10-29 07:00:00'
                  ORDER BY
                  NULL) AS mtmbl) AS tmp
                  ON d1.contract_no = tmp.c_no
                  WHERE
                  aps > 0
                  AND d1.insert4date >= '2018-10-28'
                  AND d1.insert4date <= '2018-10-28'
                  AND d1.install_mark < 2
                  GROUP BY
                  new_installed_devices.contract_no,
                  new_installed_devices.customer,
                  new_installed_devices.address,
                  new_installed_devices.cm_mac
                  ORDER BY
                  NULL






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 11 at 14:05









                  Tomer Shay

                  537214




                  537214



























                      draft saved

                      draft discarded
















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.





                      Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                      Please pay close attention to the following guidance:


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53103345%2fmy-sql-trying-to-optimize-query-returns-more-rows%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      這個網誌中的熱門文章

                      What does pagestruct do in Eviews?

                      Dutch intervention in Lombok and Karangasem

                      Channel Islands