Is there a concise way of removing rows in every group of a GroupBy object?










Consider the following data, which closely resembles the example in the pandas Group By tutorial:



import pandas as pd
import numpy as np

# Sample data: two label columns and two numeric columns.
df = pd.DataFrame({'Week': [1, 2, 1, 2,
                            1, 2, 1, 1],
                   'BloodType': ['A+', 'AB', 'AB', 'B',
                                 'B', 'B+', 'AB', 'AB'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})


This produces a DataFrame that looks like this:



[Image: Sample Data]



I want to group by the "Week" column and then apply some operation to only columns C and D. So I tried:



week_group = df.groupby('Week')
week_group.apply(lambda x: x.drop(["BloodType", "Week"], axis=1))


I originally read this as: for every group's DataFrame, drop the "BloodType" and "Week" columns and give me the resulting group. However, it gives me:



[Image: Sample apply]



I expected a GroupBy object in which each group is a DataFrame with only columns C and D. I did not expect a single DataFrame.
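
The behaviour described above can be reproduced with a short sketch. It assumes the sample df from this question (seeded here so the run is repeatable), and the name combined is introduced only for illustration. groupby(...).apply runs the function on each group and then combines the per-group results, so the return value is an ordinary DataFrame rather than a GroupBy object.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# errors="ignore" guards against pandas versions that do not pass the
# grouping column ("Week") to the applied function.
combined = df.groupby("Week").apply(
    lambda g: g.drop(columns=["BloodType", "Week"], errors="ignore")
)

print(type(combined).__name__)  # DataFrame
print(list(combined.columns))   # ['C', 'D']
```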



I tried replacing apply with transform and agg, which gave:



ValueError: transform must return a scalar value for each group


and:



ValueError: cannot copy sequence with size 2 to array axis with dimension 5


respectively. Is there a relatively simple transformation that can remove columns by name from each DataFrame in a pandas GroupBy and return the resulting GroupBy object (or perform the operation in place)?
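
As background for the two errors, here is a hedged sketch of what agg and transform are generally meant to return, assuming the question's sample data. agg reduces each group to a single row, while transform returns a result aligned with the original rows, so neither is a natural fit for dropping columns. The names grouped, per_group and per_row are illustrative only.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

grouped = df.groupby("Week")[["C", "D"]]

per_group = grouped.agg("mean")      # one row per distinct Week
per_row = grouped.transform("mean")  # one row per original row

print(per_group.shape)  # (2, 2)
print(per_row.shape)    # (8, 2)
```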







python pandas dataframe






asked Nov 14 '18 at 23:45

Dair

  • df.groupby("Week")[("C", "D")] isn't what you want?

    – CJR
    Nov 15 '18 at 0:01

  • @CJ59: I have a bigger data set where I want to drop about 4 things and keep about 700 others. Is there a way to get the complement? But otherwise, yes.

    – Dair
    Nov 15 '18 at 0:05

  • @CJ59 I got it to work, thanks for the help!

    – Dair
    Nov 15 '18 at 0:11

  • NP, the answer below is how I'd have done it exactly

    – CJR
    Nov 15 '18 at 0:16
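
The selection suggested in the first comment can be sketched as follows, assuming the question's sample data. One version-related assumption: the tuple key ("C", "D") was accepted by pandas when the comment was written, but newer releases expect a list of column names, so the sketch uses a list. The names selected and totals are illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# Selecting columns on the GroupBy object keeps the grouping by Week
# but restricts later operations to C and D.
selected = df.groupby("Week")[["C", "D"]]
totals = selected.sum()

print(list(totals.columns))   # ['C', 'D']
print(totals.index.tolist())  # [1, 2]
```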


















2 Answers
Based on CJ59's comment, I came up with this concise solution:

week_group = week_group[df.columns.difference(["Week", "BloodType"])]
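
A fuller sketch of this approach, assuming the question's sample data: Index.difference returns the column labels that are not in the given list, which suits the case of dropping a few columns and keeping hundreds. By default it returns the remaining labels in sorted order, so the original column order may not be preserved. The names keep and means are illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# Every column except the ones to drop, as a plain list of labels.
keep = df.columns.difference(["Week", "BloodType"]).tolist()

# Still a GroupBy object, now limited to the kept columns.
week_group = df.groupby("Week")[keep]
means = week_group.mean()

print(keep)  # ['C', 'D']
```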






answered Nov 15 '18 at 0:10

Dair

Are you perhaps searching for:

for name, group in df.groupby('Week'):
    print(name, group.drop(columns=['Week', 'BloodType']))

1           C         D
0  0.496714 -0.469474
2  0.647689 -0.463418
4 -0.234153  0.241962
6  1.579213 -1.724918
7  0.767435 -0.562288
2           C        D
1 -0.138264  0.54256
3  1.523030 -0.46573
5 -0.234137 -1.91328






answered Nov 15 '18 at 0:08

SpghttCd

  • This is a method that works, although I was pretty convinced that there was a non-loopy way of doing it. See my answer. Nonetheless, thanks for the input.

    – Dair
    Nov 15 '18 at 0:11

  • I see, but note that the loop is not really part of the solution to get rid of some columns. The dropping is just done while using a loop, which I assume you'll need anyway for processing your groups.

    – SpghttCd
    Nov 15 '18 at 0:13
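
If the per-group frames need to be kept rather than printed, the loop can be folded into a dict comprehension. This is a sketch assuming the question's sample data. The name frames is illustrative, and the result is a plain dict of DataFrames, not a pandas GroupBy object.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# Map each Week value to that group's rows, without the unwanted columns.
frames = {
    week: group.drop(columns=["Week", "BloodType"])
    for week, group in df.groupby("Week")
}

print(len(frames))      # 2
print(frames[1].shape)  # (5, 2)
print(frames[2].shape)  # (3, 2)
```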
















