Is there a concise way of removing rows in every group of a GroupBy object?
Consider the following data, which closely resembles the example in the pandas Group By tutorial:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Week': [1, 2, 1, 2,
                            1, 2, 1, 1],
                   'BloodType': ['A+', 'AB', 'AB', 'B',
                                 'B', 'B+', 'AB', 'AB'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})
This produces a DataFrame with the columns Week, BloodType, C and D. I want to group by "Week" and then apply some operation to only the columns C and D. So I tried:
week_group = df.groupby('Week')
week_group.apply(lambda x: x.drop(["BloodType", "Week"], axis=1))
I originally interpreted this as: for every DataFrame in the group, drop the "BloodType" and "Week" columns and give me the resulting group. However, it gives me a single DataFrame. I would have expected a group object in which each entry is a DataFrame with only columns C and D.
I tried replacing apply with transform and agg, which gave:
ValueError: transform must return a scalar value for each group
and:
ValueError: cannot copy sequence with size 2 to array axis with dimension 5
respectively. Is there a relatively simple transformation that can remove columns by name from each DataFrame in a pandas group and return the resulting group object (or perform the operation in place)?
python pandas dataframe
asked Nov 14 '18 at 23:45
Dair
df.groupby("Week")[["C", "D"]] isn't what you want?
– CJR
Nov 15 '18 at 0:01
@CJ59: I have a bigger data set where I want to drop about 4 columns and keep about 700 others. Is there a way to get the complement? But otherwise, yes.
– Dair
Nov 15 '18 at 0:05
@CJ59 I got it to work, thanks for the help!
– Dair
Nov 15 '18 at 0:11
NP, the answer below is how I'd have done it exactly
– CJR
Nov 15 '18 at 0:16
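A minimal sketch of the column selection suggested in the comments, using a list of column labels. The fixed values in C and D stand in for the random data from the question so the result can be checked by hand; they, and the names `selected` and `means`, are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Same layout as the question; C and D hold fixed values (an assumption
# for illustration) instead of np.random.randn(8).
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": np.arange(8, dtype=float),
    "D": np.arange(8, dtype=float) * 10,
})

# Selecting a list of columns on the GroupBy keeps the grouping by Week
# but restricts later operations to C and D.
selected = df.groupby("Week")[["C", "D"]]

# Any group-wise operation now only sees the selected columns.
means = selected.mean()
print(means)
```

The result of the selection is still a GroupBy object, so no per-group frames are copied until an operation such as `mean` is requested.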
2 Answers
Based on CJ59's comment, I came up with this concise solution:
week_group = week_group[df.columns.difference(["Week", "BloodType"])]
answered Nov 15 '18 at 0:10
Dair
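For the case raised in the comments (dropping a few columns and keeping many), a sketch of the complement selection might look like the following. The fixed values in C and D and the names `keep` and `totals` are assumptions for illustration. Note that `Index.difference` returns the remaining labels in sorted order, so the column order may differ from that of the original frame.

```python
import pandas as pd

# Same layout as the question; C and D hold fixed values (an assumption
# for illustration) so the sums below are easy to verify.
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "D": [0.5] * 8,
})

# Every column except the ones to drop (the complement of the drop list).
keep = df.columns.difference(["Week", "BloodType"])

# Select the kept columns on the GroupBy; list(...) turns the Index into
# a plain list of labels.
week_group = df.groupby("Week")[list(keep)]

totals = week_group.sum()
print(totals)
```

Building `keep` from `df.columns` means only the handful of unwanted names has to be listed, however many columns remain.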
Are you perhaps searching for
for name, group in df.groupby('Week'):
    print(name, group.drop(columns=['Week', 'BloodType']))
1           C         D
0  0.496714 -0.469474
2  0.647689 -0.463418
4 -0.234153  0.241962
6  1.579213 -1.724918
7  0.767435 -0.562288
2           C        D
1 -0.138264  0.54256
3  1.523030 -0.46573
5 -0.234137 -1.91328
answered Nov 15 '18 at 0:08
SpghttCd
This is a method that works, although I was pretty convinced that there was a non-loopy way of doing it. See my answer. Nonetheless, thanks for the input.
– Dair
Nov 15 '18 at 0:11
I see, but note that the loop is not really part of the solution to get rid of some columns. The dropping is just done while using a loop which I assume you'll need anyway for processing your groups.
– SpghttCd
Nov 15 '18 at 0:13
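If the per-group frames are needed as separate objects rather than printed, the loop above could be collected into a dictionary. This is only a sketch: the fixed values in C and D and the names `drop_cols` and `frames` are assumptions for illustration.

```python
import pandas as pd

# Same layout as the question; C and D hold fixed values (an assumption
# for illustration) instead of random numbers.
df = pd.DataFrame({
    "Week": [1, 2, 1, 2, 1, 2, 1, 1],
    "BloodType": ["A+", "AB", "AB", "B", "B", "B+", "AB", "AB"],
    "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "D": [0.5] * 8,
})

drop_cols = ["Week", "BloodType"]

# Iterating a GroupBy yields (group key, sub-DataFrame) pairs; each
# sub-frame is stored with the unwanted columns dropped.
frames = {
    name: group.drop(columns=drop_cols)
    for name, group in df.groupby("Week")
}

for name, frame in frames.items():
    print(name)
    print(frame)
```

Each value in `frames` is an ordinary DataFrame that keeps the original row index, so rows can still be traced back to `df`.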