Update large table (also wide) in SQL Server without indexes









up vote
1
down vote

favorite












I have a requirement; we have 2 SQL Server 2012 databases.



  1. MARKETS_DAILY: tables in this database are expected to contain only days worth of data


  2. MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)


All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.



  1. 12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.


  2. There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).


My requirement is to sync both these tables. How do I do this?



Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.



I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.










share|improve this question























  • What are you using to copy the table?
    – user1443098
    Nov 11 at 22:17










  • Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
    – Larnu
    Nov 11 at 22:18










  • Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
    – SQLschooler
    Nov 11 at 22:19







  • 1




    You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
    – Nick.McDermaid
    Nov 11 at 22:29






  • 1




    Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
    – Larnu
    Nov 11 at 22:34















up vote
1
down vote

favorite












I have a requirement; we have 2 SQL Server 2012 databases.



  1. MARKETS_DAILY: tables in this database are expected to contain only days worth of data


  2. MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)


All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.



  1. 12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.


  2. There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).


My requirement is to sync both these tables. How do I do this?



Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.



I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.










share|improve this question























  • What are you using to copy the table?
    – user1443098
    Nov 11 at 22:17










  • Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
    – Larnu
    Nov 11 at 22:18










  • Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
    – SQLschooler
    Nov 11 at 22:19







  • 1




    You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
    – Nick.McDermaid
    Nov 11 at 22:29






  • 1




    Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
    – Larnu
    Nov 11 at 22:34













up vote
1
down vote

favorite









up vote
1
down vote

favorite











I have a requirement; we have 2 SQL Server 2012 databases.



  1. MARKETS_DAILY: tables in this database are expected to contain only days worth of data


  2. MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)


All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.



  1. 12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.


  2. There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).


My requirement is to sync both these tables. How do I do this?



Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.



I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.










share|improve this question















I have a requirement; we have 2 SQL Server 2012 databases.



  1. MARKETS_DAILY: tables in this database are expected to contain only days worth of data


  2. MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)


All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.



  1. 12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.


  2. There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).


My requirement is to sync both these tables. How do I do this?



Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.



I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.







sql sql-server tsql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 12 at 5:11









marc_s

569k12811001250




569k12811001250










asked Nov 11 at 22:11









SQLschooler

83




83











  • What are you using to copy the table?
    – user1443098
    Nov 11 at 22:17










  • Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
    – Larnu
    Nov 11 at 22:18










  • Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
    – SQLschooler
    Nov 11 at 22:19







  • 1




    You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
    – Nick.McDermaid
    Nov 11 at 22:29






  • 1




    Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
    – Larnu
    Nov 11 at 22:34

















  • What are you using to copy the table?
    – user1443098
    Nov 11 at 22:17










  • Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
    – Larnu
    Nov 11 at 22:18










  • Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
    – SQLschooler
    Nov 11 at 22:19







  • 1




    You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
    – Nick.McDermaid
    Nov 11 at 22:29






  • 1




    Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
    – Larnu
    Nov 11 at 22:34
















What are you using to copy the table?
– user1443098
Nov 11 at 22:17




What are you using to copy the table?
– user1443098
Nov 11 at 22:17












Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18




Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18












Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19





Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19





1




1




You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29




You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29




1




1




Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34





Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34













1 Answer
1






active

oldest

votes

















up vote
0
down vote













First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.






share|improve this answer




















    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53253755%2fupdate-large-table-also-wide-in-sql-server-without-indexes%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    0
    down vote













    First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
    Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.






    share|improve this answer
























      up vote
      0
      down vote













      First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
      Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.






      share|improve this answer






















        up vote
        0
        down vote










        up vote
        0
        down vote









        First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
        Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.






        share|improve this answer












        First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
        Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 12 at 5:54









        Lev

        514




        514



























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53253755%2fupdate-large-table-also-wide-in-sql-server-without-indexes%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            這個網誌中的熱門文章

            How to read a connectionString WITH PROVIDER in .NET Core?

            Node.js Script on GitHub Pages or Amazon S3

            Museum of Modern and Contemporary Art of Trento and Rovereto