pandas convert some columns into rows
So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like
location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25
What I would like is for it to look like
location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25
problem is I don't know how many dates are in the column (though I know they will always start after name)
python pandas
add a comment |
So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like
location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25
What I would like is for it to look like
location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25
problem is I don't know how many dates are in the column (though I know they will always start after name)
python pandas
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21
add a comment |
So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like
location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25
What I would like is for it to look like
location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25
problem is I don't know how many dates are in the column (though I know they will always start after name)
python pandas
So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like
location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25
What I would like is for it to look like
location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25
problem is I don't know how many dates are in the column (though I know they will always start after name)
python pandas
python pandas
asked Feb 22 '15 at 3:08
WizurielWizuriel
6712922
6712922
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21
add a comment |
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21
add a comment |
3 Answers
3
active
oldest
votes
You can use pd.melt
to get most of the way there, and then sort:
>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25
(Might want to throw in a .reset_index(drop=True)
, just to keep the output clean.)
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convertdf2
[back] todf
– 3kstc
Mar 8 '18 at 23:02
1
@3kstc Try here or here. You're wanting to look into pivots. Possiblypandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.
– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
add a comment |
I guess I found a simpler solution
temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')
Concat whole temp1
with temp2
's column name
temp1['new_column'] = temp2['name']
You now have what you asked for.
add a comment |
pd.wide_to_long
You can add a prefix to your year columns and then feed directly to pd.wide_to_long
. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt
, e.g. when your columns already have an appropriate prefix.
df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))
res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
.sort_values(['location', 'name'])
print(res)
name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f28654047%2fpandas-convert-some-columns-into-rows%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can use pd.melt
to get most of the way there, and then sort:
>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25
(Might want to throw in a .reset_index(drop=True)
, just to keep the output clean.)
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convertdf2
[back] todf
– 3kstc
Mar 8 '18 at 23:02
1
@3kstc Try here or here. You're wanting to look into pivots. Possiblypandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.
– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
add a comment |
You can use pd.melt
to get most of the way there, and then sort:
>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25
(Might want to throw in a .reset_index(drop=True)
, just to keep the output clean.)
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convertdf2
[back] todf
– 3kstc
Mar 8 '18 at 23:02
1
@3kstc Try here or here. You're wanting to look into pivots. Possiblypandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.
– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
add a comment |
You can use pd.melt
to get most of the way there, and then sort:
>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25
(Might want to throw in a .reset_index(drop=True)
, just to keep the output clean.)
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
You can use pd.melt
to get most of the way there, and then sort:
>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25
(Might want to throw in a .reset_index(drop=True)
, just to keep the output clean.)
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
edited Apr 8 '18 at 12:32
jpp
97.4k2159109
97.4k2159109
answered Feb 22 '15 at 3:21
DSMDSM
207k35397372
207k35397372
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convertdf2
[back] todf
– 3kstc
Mar 8 '18 at 23:02
1
@3kstc Try here or here. You're wanting to look into pivots. Possiblypandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.
– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
add a comment |
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convertdf2
[back] todf
– 3kstc
Mar 8 '18 at 23:02
1
@3kstc Try here or here. You're wanting to look into pivots. Possiblypandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.
– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
5
5
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!
– Moe Chughtai
Jun 30 '16 at 12:25
@DSM what would be the inverse of this function. i.e. how would one convert
df2
[back] to df
– 3kstc
Mar 8 '18 at 23:02
@DSM what would be the inverse of this function. i.e. how would one convert
df2
[back] to df
– 3kstc
Mar 8 '18 at 23:02
1
1
@3kstc Try here or here. You're wanting to look into pivots. Possibly
pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.– Teepeemm
Mar 9 '18 at 18:59
@3kstc Try here or here. You're wanting to look into pivots. Possibly
pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index()
.– Teepeemm
Mar 9 '18 at 18:59
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns
– Adrian
yesterday
add a comment |
I guess I found a simpler solution
temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')
Concat whole temp1
with temp2
's column name
temp1['new_column'] = temp2['name']
You now have what you asked for.
add a comment |
I guess I found a simpler solution
temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')
Concat whole temp1
with temp2
's column name
temp1['new_column'] = temp2['name']
You now have what you asked for.
add a comment |
I guess I found a simpler solution
temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')
Concat whole temp1
with temp2
's column name
temp1['new_column'] = temp2['name']
You now have what you asked for.
I guess I found a simpler solution
temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')
Concat whole temp1
with temp2
's column name
temp1['new_column'] = temp2['name']
You now have what you asked for.
answered Jun 25 '18 at 6:49
PrometheusPrometheus
16529
16529
add a comment |
add a comment |
pd.wide_to_long
You can add a prefix to your year columns and then feed directly to pd.wide_to_long
. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt
, e.g. when your columns already have an appropriate prefix.
df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))
res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
.sort_values(['location', 'name'])
print(res)
name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25
add a comment |
pd.wide_to_long
You can add a prefix to your year columns and then feed directly to pd.wide_to_long
. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt
, e.g. when your columns already have an appropriate prefix.
df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))
res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
.sort_values(['location', 'name'])
print(res)
name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25
add a comment |
pd.wide_to_long
You can add a prefix to your year columns and then feed directly to pd.wide_to_long
. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt
, e.g. when your columns already have an appropriate prefix.
df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))
res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
.sort_values(['location', 'name'])
print(res)
name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25
pd.wide_to_long
You can add a prefix to your year columns and then feed directly to pd.wide_to_long
. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt
, e.g. when your columns already have an appropriate prefix.
df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))
res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
.sort_values(['location', 'name'])
print(res)
name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25
answered Nov 13 '18 at 17:00
jppjpp
97.4k2159109
97.4k2159109
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f28654047%2fpandas-convert-some-columns-into-rows%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
stackoverflow.com/questions/19842066/…
– pyCthon
Feb 22 '15 at 3:21