Is there a concise way of removing rows in every group of a GroupBy object?

Consider the following data that closely resembles the Pandas' Group By Tutorial:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Week': [1, 2, 1, 2, 1, 2, 1, 1],
                   'BloodType': ['A+', 'AB', 'AB', 'B', 'B', 'B+', 'AB', 'AB'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})

This produces a DataFrame that looks like this:

[image: Sample Data]

I want to group by the "Week" and then apply some operation to only the columns C and D. So I tried:

week_group = df.groupby('Week')
week_group.apply(lambda x: x.drop(["BloodType", "Week"], axis=1))

I originally interpreted this as: for every group's DataFrame, drop the "BloodType" and "Week" columns and give me the resulting group. However, it gives me:

[image: Sample apply]

I expected it to give me a GroupBy object in which each group was a DataFrame with only columns C and D. I did not expect a single DataFrame.

I tried replacing apply with transform and agg, which gave:

ValueError: transform must return a scalar value for each group

and:

ValueError: cannot copy sequence with size 2 to array axis with dimension 5

respectively. Is there a relatively simple transformation that can remove columns by name from each DataFrame in a pandas GroupBy and return the resulting GroupBy object (or perform the operation in place)?
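
For context on the result above: GroupBy.apply calls the function once per group and then combines the per-group results into a single object, which is why a DataFrame comes back rather than a GroupBy. The following is a minimal sketch, not the asker's code: the fixed seed and the centering function are assumptions chosen purely for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": rng.standard_normal(8),
    "D": rng.standard_normal(8),
})

# Select the numeric columns on the GroupBy first, then apply per group.
# apply() combines the per-group results back into one DataFrame.
centered = df.groupby("Week")[["C", "D"]].apply(lambda g: g - g.mean())

print(type(centered).__name__)  # DataFrame
print(list(centered.columns))   # ['C', 'D']
```

Selecting the columns before calling apply also keeps the grouping column out of the function's input, so nothing needs to be dropped inside the lambda.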

asked Nov 14 '18 at 23:45 by Dair

df.groupby("Week")[["C", "D"]] isn't what you want?

– CJR
Nov 15 '18 at 0:01

@CJ59: I have a bigger data set where I want to drop about 4 things and keep about 700 others. Is there a way to get the complement? But otherwise, yes.

– Dair
Nov 15 '18 at 0:05

@CJ59 I got it to work, thanks for the help!

– Dair
Nov 15 '18 at 0:11

NP, the answer below is how I'd have done it exactly

– CJR
Nov 15 '18 at 0:16

python pandas dataframe

2 Answers

Based on CJ59's comment, I came up with this concise solution:

week_group = week_group[df.columns.difference(["Week", "BloodType"])]
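
A minimal sketch of how this selection behaves, assuming the df from the question with a fixed seed added for reproducibility (the seed and the names excluded, keep, and selected are illustrative assumptions). Index.difference returns the remaining column labels, sorted by default, and indexing the GroupBy with them restricts later operations to those columns:

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

excluded = ["Week", "BloodType"]         # the few columns to leave out
keep = df.columns.difference(excluded)   # every other column label

week_group = df.groupby("Week")
selected = week_group[list(keep)]        # still a GroupBy, limited to C and D

means = selected.mean()                  # one row per Week, columns C and D
print(means)
```

This scales to the "drop 4, keep 700" case because only the excluded names have to be listed.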

answered Nov 15 '18 at 0:10 by Dair


Are you perhaps looking for something like this?

for name, group in df.groupby('Week'):
 print(name, group.drop(columns=['Week', 'BloodType']))

1           C         D
0  0.496714 -0.469474
2  0.647689 -0.463418
4 -0.234153  0.241962
6  1.579213 -1.724918
7  0.767435 -0.562288
2           C        D
1 -0.138264  0.54256
3  1.523030 -0.46573
5 -0.234137 -1.91328
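
If the per-group DataFrames are needed as separate objects rather than printed, the same loop can collect them into a dict keyed by group label. This is a sketch under assumptions: the seed and the name frames are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# Iterating a GroupBy yields (label, sub-DataFrame) pairs; drop the
# unwanted columns from each sub-DataFrame and keep it under its label.
frames = {
    week: group.drop(columns=["Week", "BloodType"])
    for week, group in df.groupby("Week")
}

print([int(k) for k in frames])  # [1, 2]
print(list(frames[1].columns))   # ['C', 'D']
```

Each value in frames is an independent DataFrame, so later per-group processing can proceed without the dropped columns.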

answered Nov 15 '18 at 0:08 by SpghttCd

This is a method that works, although I was pretty convinced that there was a non-loopy way of doing it. See my answer. Nonetheless, thanks for the input.

– Dair
Nov 15 '18 at 0:11

I see, but note that the loop is not really part of the solution to get rid of some columns. The dropping is just done while using a loop which I assume you'll need anyway for processing your groups.

– SpghttCd
Nov 15 '18 at 0:13

