Only keep levels that are incomplete, based on two columns










0















I could really use some help with the following:



I some sample data:



Group<-c("A","A","B","B","C","C","D", "D")
Value1<-c("7","1",8,7,"NA",9,10,12)
Value2<-c("NA","NA","NA",7,3,9,7,4)
df<-data.frame(Group, Value1, Value2)

Group Value1 Value2
A 7 NA
A 1 NA
B 8 NA
B 7 7
C NA 3
C 9 9
D 10 7
D 12 4


I want delete all groups that are either fully filled in for Value1 and Value2 (like Group D) or have no data for Value2 within a group (like Group A). So that I end up with the groups that are filled in for both Value1 and Value2 but are incomplete, like:



 Group Value1 Value2
B 8 NA
B 7 7
C NA 3
C 9 9


I Know how to delete NA's, and I have tried some things with



setDT()[, := if(any(Value2==)) "" else "" , by = .()]


but I don't really now how to use it in this case..



Does someone know how I could do this?










share|improve this question




























    0















    I could really use some help with the following:



    I some sample data:



    Group<-c("A","A","B","B","C","C","D", "D")
    Value1<-c("7","1",8,7,"NA",9,10,12)
    Value2<-c("NA","NA","NA",7,3,9,7,4)
    df<-data.frame(Group, Value1, Value2)

    Group Value1 Value2
    A 7 NA
    A 1 NA
    B 8 NA
    B 7 7
    C NA 3
    C 9 9
    D 10 7
    D 12 4


    I want delete all groups that are either fully filled in for Value1 and Value2 (like Group D) or have no data for Value2 within a group (like Group A). So that I end up with the groups that are filled in for both Value1 and Value2 but are incomplete, like:



     Group Value1 Value2
    B 8 NA
    B 7 7
    C NA 3
    C 9 9


    I Know how to delete NA's, and I have tried some things with



    setDT()[, := if(any(Value2==)) "" else "" , by = .()]


    but I don't really now how to use it in this case..



    Does someone know how I could do this?










    share|improve this question


























      0












      0








      0


      1






      I could really use some help with the following:



      I some sample data:



      Group<-c("A","A","B","B","C","C","D", "D")
      Value1<-c("7","1",8,7,"NA",9,10,12)
      Value2<-c("NA","NA","NA",7,3,9,7,4)
      df<-data.frame(Group, Value1, Value2)

      Group Value1 Value2
      A 7 NA
      A 1 NA
      B 8 NA
      B 7 7
      C NA 3
      C 9 9
      D 10 7
      D 12 4


      I want delete all groups that are either fully filled in for Value1 and Value2 (like Group D) or have no data for Value2 within a group (like Group A). So that I end up with the groups that are filled in for both Value1 and Value2 but are incomplete, like:



       Group Value1 Value2
      B 8 NA
      B 7 7
      C NA 3
      C 9 9


      I Know how to delete NA's, and I have tried some things with



      setDT()[, := if(any(Value2==)) "" else "" , by = .()]


      but I don't really now how to use it in this case..



      Does someone know how I could do this?










      share|improve this question
















      I could really use some help with the following:



      I some sample data:



      Group<-c("A","A","B","B","C","C","D", "D")
      Value1<-c("7","1",8,7,"NA",9,10,12)
      Value2<-c("NA","NA","NA",7,3,9,7,4)
      df<-data.frame(Group, Value1, Value2)

      Group Value1 Value2
      A 7 NA
      A 1 NA
      B 8 NA
      B 7 7
      C NA 3
      C 9 9
      D 10 7
      D 12 4


      I want delete all groups that are either fully filled in for Value1 and Value2 (like Group D) or have no data for Value2 within a group (like Group A). So that I end up with the groups that are filled in for both Value1 and Value2 but are incomplete, like:



       Group Value1 Value2
      B 8 NA
      B 7 7
      C NA 3
      C 9 9


      I Know how to delete NA's, and I have tried some things with



      setDT()[, := if(any(Value2==)) "" else "" , by = .()]


      but I don't really now how to use it in this case..



      Does someone know how I could do this?







      r na delete-row






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 15 '18 at 12:13









      Ronak Shah

      41k104164




      41k104164










      asked Nov 15 '18 at 11:28









      LotwLotw

      689




      689






















          2 Answers
          2






          active

          oldest

          votes


















          1














          Something like:



          setDT(df)[, .SD[!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2))], Group]


          Output:



           Group Value1 Value2
          1: B 8 NA
          2: B 7 7
          3: C <NA> 3
          4: C 9 9


          Note that in order for this to work, your values should indeed be NA and not just "NA" as strings, i.e. it works with a modified dataframe example:



          Group<-c("A","A","B","B","C","C","D", "D")
          Value1<-c("7","1",8,7,NA,9,10,12)
          Value2<-c(NA,NA,NA,7,3,9,7,4)
          df<-data.frame(Group, Value1, Value2)





          share|improve this answer


















          • 3





            also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

            – Jimbou
            Nov 15 '18 at 11:55






          • 1





            Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

            – wl1234
            Nov 15 '18 at 11:58







          • 1





            @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

            – Lotw
            Nov 15 '18 at 12:04


















          2














          Using data from @arg0naut (to keep NA as NA and not "NA") a base R solution using two ave would be



          df[!with(df, ave(complete.cases(df), Group, FUN = all) | 
          ave(is.na(Value2), Group, FUN = all)), ]

          # Group Value1 Value2
          #3 B 8 NA
          #4 B 7 7
          #5 C NA 3
          #6 C 9 9


          We keep two conditions separately and find the rows which we don't want to select and then take negation of it.






          share|improve this answer






















            Your Answer






            StackExchange.ifUsing("editor", function ()
            StackExchange.using("externalEditor", function ()
            StackExchange.using("snippets", function ()
            StackExchange.snippets.init();
            );
            );
            , "code-snippets");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "1"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader:
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            ,
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );













            draft saved

            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53318467%2fonly-keep-levels-that-are-incomplete-based-on-two-columns%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            1














            Something like:



            setDT(df)[, .SD[!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2))], Group]


            Output:



             Group Value1 Value2
            1: B 8 NA
            2: B 7 7
            3: C <NA> 3
            4: C 9 9


            Note that in order for this to work, your values should indeed be NA and not just "NA" as strings, i.e. it works with a modified dataframe example:



            Group<-c("A","A","B","B","C","C","D", "D")
            Value1<-c("7","1",8,7,NA,9,10,12)
            Value2<-c(NA,NA,NA,7,3,9,7,4)
            df<-data.frame(Group, Value1, Value2)





            share|improve this answer


















            • 3





              also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

              – Jimbou
              Nov 15 '18 at 11:55






            • 1





              Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

              – wl1234
              Nov 15 '18 at 11:58







            • 1





              @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

              – Lotw
              Nov 15 '18 at 12:04















            1














            Something like:



            setDT(df)[, .SD[!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2))], Group]


            Output:



             Group Value1 Value2
            1: B 8 NA
            2: B 7 7
            3: C <NA> 3
            4: C 9 9


            Note that in order for this to work, your values should indeed be NA and not just "NA" as strings, i.e. it works with a modified dataframe example:



            Group<-c("A","A","B","B","C","C","D", "D")
            Value1<-c("7","1",8,7,NA,9,10,12)
            Value2<-c(NA,NA,NA,7,3,9,7,4)
            df<-data.frame(Group, Value1, Value2)





            share|improve this answer


















            • 3





              also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

              – Jimbou
              Nov 15 '18 at 11:55






            • 1





              Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

              – wl1234
              Nov 15 '18 at 11:58







            • 1





              @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

              – Lotw
              Nov 15 '18 at 12:04













            1












            1








            1







            Something like:



            setDT(df)[, .SD[!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2))], Group]


            Output:



             Group Value1 Value2
            1: B 8 NA
            2: B 7 7
            3: C <NA> 3
            4: C 9 9


            Note that in order for this to work, your values should indeed be NA and not just "NA" as strings, i.e. it works with a modified dataframe example:



            Group<-c("A","A","B","B","C","C","D", "D")
            Value1<-c("7","1",8,7,NA,9,10,12)
            Value2<-c(NA,NA,NA,7,3,9,7,4)
            df<-data.frame(Group, Value1, Value2)





            share|improve this answer













            Something like:



            setDT(df)[, .SD[!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2))], Group]


            Output:



             Group Value1 Value2
            1: B 8 NA
            2: B 7 7
            3: C <NA> 3
            4: C 9 9


            Note that in order for this to work, your values should indeed be NA and not just "NA" as strings, i.e. it works with a modified dataframe example:



            Group<-c("A","A","B","B","C","C","D", "D")
            Value1<-c("7","1",8,7,NA,9,10,12)
            Value2<-c(NA,NA,NA,7,3,9,7,4)
            df<-data.frame(Group, Value1, Value2)






            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 15 '18 at 11:51









            arg0nautarg0naut

            5,7241420




            5,7241420







            • 3





              also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

              – Jimbou
              Nov 15 '18 at 11:55






            • 1





              Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

              – wl1234
              Nov 15 '18 at 11:58







            • 1





              @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

              – Lotw
              Nov 15 '18 at 12:04












            • 3





              also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

              – Jimbou
              Nov 15 '18 at 11:55






            • 1





              Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

              – wl1234
              Nov 15 '18 at 11:58







            • 1





              @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

              – Lotw
              Nov 15 '18 at 12:04







            3




            3





            also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

            – Jimbou
            Nov 15 '18 at 11:55





            also possible as tidyverse: library(tidyverse); df %>% mutate_at(-1,~ifelse(. == "NA", NA, .)) %>% group_by(Group) %>% filter(!all(is.na(Value2)) & (anyNA(Value1) | anyNA(Value2)))

            – Jimbou
            Nov 15 '18 at 11:55




            1




            1





            Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

            – wl1234
            Nov 15 '18 at 11:58






            Also, the first two 'Value1' numbers are string instead of integers, which makes the columns factors instead of integers.

            – wl1234
            Nov 15 '18 at 11:58





            1




            1





            @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

            – Lotw
            Nov 15 '18 at 12:04





            @arg0naut & Jimbou Thank you! Works both perfectly! wl1234 & arg0naut Thank you for the notice I wasn't sure how to create the integer with NA data so I did it like this. In the real data the value columns are all integer

            – Lotw
            Nov 15 '18 at 12:04













            2














            Using data from @arg0naut (to keep NA as NA and not "NA") a base R solution using two ave would be



            df[!with(df, ave(complete.cases(df), Group, FUN = all) | 
            ave(is.na(Value2), Group, FUN = all)), ]

            # Group Value1 Value2
            #3 B 8 NA
            #4 B 7 7
            #5 C NA 3
            #6 C 9 9


            We keep two conditions separately and find the rows which we don't want to select and then take negation of it.






            share|improve this answer



























              2














              Using data from @arg0naut (to keep NA as NA and not "NA") a base R solution using two ave would be



              df[!with(df, ave(complete.cases(df), Group, FUN = all) | 
              ave(is.na(Value2), Group, FUN = all)), ]

              # Group Value1 Value2
              #3 B 8 NA
              #4 B 7 7
              #5 C NA 3
              #6 C 9 9


              We keep two conditions separately and find the rows which we don't want to select and then take negation of it.






              share|improve this answer

























                2












                2








                2







                Using data from @arg0naut (to keep NA as NA and not "NA") a base R solution using two ave would be



                df[!with(df, ave(complete.cases(df), Group, FUN = all) | 
                ave(is.na(Value2), Group, FUN = all)), ]

                # Group Value1 Value2
                #3 B 8 NA
                #4 B 7 7
                #5 C NA 3
                #6 C 9 9


                We keep two conditions separately and find the rows which we don't want to select and then take negation of it.






                share|improve this answer













                Using data from @arg0naut (to keep NA as NA and not "NA") a base R solution using two ave would be



                df[!with(df, ave(complete.cases(df), Group, FUN = all) | 
                ave(is.na(Value2), Group, FUN = all)), ]

                # Group Value1 Value2
                #3 B 8 NA
                #4 B 7 7
                #5 C NA 3
                #6 C 9 9


                We keep two conditions separately and find the rows which we don't want to select and then take negation of it.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Nov 15 '18 at 12:10









                Ronak ShahRonak Shah

                41k104164




                41k104164



























                    draft saved

                    draft discarded
















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid


                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.

                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53318467%2fonly-keep-levels-that-are-incomplete-based-on-two-columns%23new-answer', 'question_page');

                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    這個網誌中的熱門文章

                    Barbados

                    How to read a connectionString WITH PROVIDER in .NET Core?

                    Node.js Script on GitHub Pages or Amazon S3