SQL Database How far to go to prevent Duplication of Records?









up vote
0
down vote

favorite












I've been reading up on proper DB creation techniques, and I've got a large project that will have numerous items in it. I have already mapped out the tables and the various lookup tables, but then I realized that in the user table, I have first_name, middle_name, last_name columns that could be changed to first_name_id, middle_name_id and last_name_id that act as lookups to a first name table, middle name table and last name table that hold only unique names to prevent duplication of data.



The question I have is how far do I go with this process? Does every single item that could potentially be duplicate information need to be done like this? At some point this seems like it will get confusing to ME to keep track of all the relationships and cascading updates/deletes, etc on everything...



Just looking for some advice as I want to make sure this is done properly as setting a proper foundation is very important to build and scale on top of in the future, but at the same time don't want to take this to extremes if it is not needed.










share|improve this question





















  • You should avoid the temptation of over engineering your platform.
    – John Cappelletti
    Nov 10 at 14:34










  • I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
    – Martin Smith
    Nov 10 at 17:00










  • To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
    – Jason A. Long
    Nov 11 at 3:18














up vote
0
down vote

favorite












I've been reading up on proper DB creation techniques, and I've got a large project that will have numerous items in it. I have already mapped out the tables and the various lookup tables, but then I realized that in the user table, I have first_name, middle_name, last_name columns that could be changed to first_name_id, middle_name_id and last_name_id that act as lookups to a first name table, middle name table and last name table that hold only unique names to prevent duplication of data.



The question I have is how far do I go with this process? Does every single item that could potentially be duplicate information need to be done like this? At some point this seems like it will get confusing to ME to keep track of all the relationships and cascading updates/deletes, etc on everything...



Just looking for some advice as I want to make sure this is done properly as setting a proper foundation is very important to build and scale on top of in the future, but at the same time don't want to take this to extremes if it is not needed.










share|improve this question





















  • You should avoid the temptation of over engineering your platform.
    – John Cappelletti
    Nov 10 at 14:34










  • I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
    – Martin Smith
    Nov 10 at 17:00










  • To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
    – Jason A. Long
    Nov 11 at 3:18












up vote
0
down vote

favorite









up vote
0
down vote

favorite











I've been reading up on proper DB creation techniques, and I've got a large project that will have numerous items in it. I have already mapped out the tables and the various lookup tables, but then I realized that in the user table, I have first_name, middle_name, last_name columns that could be changed to first_name_id, middle_name_id and last_name_id that act as lookups to a first name table, middle name table and last name table that hold only unique names to prevent duplication of data.



The question I have is how far do I go with this process? Does every single item that could potentially be duplicate information need to be done like this? At some point this seems like it will get confusing to ME to keep track of all the relationships and cascading updates/deletes, etc on everything...



Just looking for some advice as I want to make sure this is done properly as setting a proper foundation is very important to build and scale on top of in the future, but at the same time don't want to take this to extremes if it is not needed.










share|improve this question













I've been reading up on proper DB creation techniques, and I've got a large project that will have numerous items in it. I have already mapped out the tables and the various lookup tables, but then I realized that in the user table, I have first_name, middle_name, last_name columns that could be changed to first_name_id, middle_name_id and last_name_id that act as lookups to a first name table, middle name table and last name table that hold only unique names to prevent duplication of data.



The question I have is how far do I go with this process? Does every single item that could potentially be duplicate information need to be done like this? At some point this seems like it will get confusing to ME to keep track of all the relationships and cascading updates/deletes, etc on everything...



Just looking for some advice as I want to make sure this is done properly as setting a proper foundation is very important to build and scale on top of in the future, but at the same time don't want to take this to extremes if it is not needed.







sql-server duplicates data-integrity






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 10 at 14:20









MattE

390216




390216











  • You should avoid the temptation of over engineering your platform.
    – John Cappelletti
    Nov 10 at 14:34










  • I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
    – Martin Smith
    Nov 10 at 17:00










  • To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
    – Jason A. Long
    Nov 11 at 3:18
















  • You should avoid the temptation of over engineering your platform.
    – John Cappelletti
    Nov 10 at 14:34










  • I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
    – Martin Smith
    Nov 10 at 17:00










  • To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
    – Jason A. Long
    Nov 11 at 3:18















You should avoid the temptation of over engineering your platform.
– John Cappelletti
Nov 10 at 14:34




You should avoid the temptation of over engineering your platform.
– John Cappelletti
Nov 10 at 14:34












I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
– Martin Smith
Nov 10 at 17:00




I would advise not to do this. you might save a miniscule amount of storage but it will add complexity to all your queries. See also Is normalizing a person's name going too far?
– Martin Smith
Nov 10 at 17:00












To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
– Jason A. Long
Nov 11 at 3:18




To answer the initial question... The answer is to use a unique constraint. That said, unless you have a compelling reason to do so, don't normalize names. The name "Matt" isn't a distinct entity in the way Order_ID: 854123 is a distinct entity. Think of it this way, if something about "Matt" were to change, would you want that change propagated to every single person with that Name_ID: Matt attribute?
– Jason A. Long
Nov 11 at 3:18

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53239864%2fsql-database-how-far-to-go-to-prevent-duplication-of-records%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes















 

draft saved


draft discarded















































 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53239864%2fsql-database-how-far-to-go-to-prevent-duplication-of-records%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







這個網誌中的熱門文章

Barbados

How to read a connectionString WITH PROVIDER in .NET Core?

Node.js Script on GitHub Pages or Amazon S3