Update large table (also wide) in SQL Server without indexes

up vote
1
down vote

favorite

I have a requirement; we have 2 SQL Server 2012 databases.

MARKETS_DAILY: tables in this database are expected to contain only days worth of data

MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)

All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.

12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.

There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).

My requirement is to sync both these tables. How do I do this?

Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.

I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

What are you using to copy the table?
– user1443098
Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18

Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19

1

You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29

1

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34

|
show 2 more comments

up vote
1
down vote

favorite

I have a requirement; we have 2 SQL Server 2012 databases.

MARKETS_DAILY: tables in this database are expected to contain only days worth of data

MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)

All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.

12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.

There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).

My requirement is to sync both these tables. How do I do this?

Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.

I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

What are you using to copy the table?
– user1443098
Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18

Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19

1

You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29

1

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34

|
show 2 more comments

up vote
1
down vote

favorite

I have a requirement; we have 2 SQL Server 2012 databases.

MARKETS_DAILY: tables in this database are expected to contain only days worth of data

MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)

All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.

12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.

There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).

My requirement is to sync both these tables. How do I do this?

Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.

I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

I have a requirement; we have 2 SQL Server 2012 databases.

MARKETS_DAILY: tables in this database are expected to contain only days worth of data

MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (copied daily data from MARKETS_DAILY)

All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues.

12 column definitions are defined as INT in MARKETS_DAILY and VARCHAR(4) in MARKETS_HISTORY. Unfortunately when we get length of 5, data gets copied as '*' into MARKETS_HISTORY (when we copy daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data is different in MARKETS_DAILY and MARKETS_HISTORY for a particular day.

There are no indexes on either of the tables and strangely we maintain history in both the databases with different values for these 12 columns (correct version in DAILY).

My requirement is to sync both these tables. How do I do this?

Performance seems to take an impact as there are no indexes and these columns are not good for indexing anyway.

I need to change the datatype to INT in the HISTORY version and then update the values to replicate DAILY version. Searching for performance efficient way to do this. Your inputs will be helpful.

sql sql-server tsql

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

edited Nov 12 at 5:11

marc_s

569k12811001250

edited Nov 12 at 5:11

marc_s

569k12811001250

edited Nov 12 at 5:11

marc_s

569k12811001250

asked Nov 11 at 22:11

SQLschooler

asked Nov 11 at 22:11

SQLschooler

asked Nov 11 at 22:11

SQLschooler

What are you using to copy the table?
– user1443098
Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18

Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19

1

You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29

1

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34

|
show 2 more comments

What are you using to copy the table?
– user1443098
Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18

Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19

1

You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29

1

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34

What are you using to copy the table?
– user1443098
Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them then every query will be forced to run a table scan, which means they'll be as fast (slow) as one.
– Larnu
Nov 11 at 22:18

Daily the following statement is run, INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (select max(market_process_Dt) from MARKETS_DAILY.RE_FEED )
– SQLschooler
Nov 11 at 22:19

You should first add a column list to your insert and select statements, although I suspect at this stage you may not actually know the actual column mappings
– Nick.McDermaid
Nov 11 at 22:29

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of values (in a large dataset) are going to provide a much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), where as a seek for 1 row in 5M is going to be a huge benefit.
– Larnu
Nov 11 at 22:34

|
show 2 more comments

1 Answer
1

active

oldest

votes

up vote
0
down vote

First, find out what fallout will happen if you fix the data type in markets history. That really needs to be done, since you are losing data.
Second, the maket daily will have some transactional tables, so not all should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index to the market_process_dt field will slow down inserts a bit, but will speed up this query.

answered Nov 12 at 5:54

Lev

514

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53253755%2fupdate-large-table-also-wide-in-sql-server-without-indexes%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
0
down vote

answered Nov 12 at 5:54

Lev

514

add a comment |

up vote
0
down vote

answered Nov 12 at 5:54

Lev

514

add a comment |

up vote
0
down vote

answered Nov 12 at 5:54

Lev

514

answered Nov 12 at 5:54

Lev

514

answered Nov 12 at 5:54

Lev

514

answered Nov 12 at 5:54

Lev

514

answered Nov 12 at 5:54

Lev

514

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Odtnhj