pandas convert some columns into rows

So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like

location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25

What I would like is for it to look like

location name Date Value
A "test" Jan-2010 12 
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18 
B "foo" Feb-2010 20
B "foo" March-2010 25

problem is I don't know how many dates are in the column (though I know they will always start after name)

asked Feb 22 '15 at 3:08

Wizuriel

6712922

stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21

add a comment |

So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like

location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25

What I would like is for it to look like

location name Date Value
A "test" Jan-2010 12 
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18 
B "foo" Feb-2010 20
B "foo" March-2010 25

problem is I don't know how many dates are in the column (though I know they will always start after name)

asked Feb 22 '15 at 3:08

Wizuriel

6712922

stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21

add a comment |

So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like

location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25

What I would like is for it to look like

location name Date Value
A "test" Jan-2010 12 
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18 
B "foo" Feb-2010 20
B "foo" March-2010 25

problem is I don't know how many dates are in the column (though I know they will always start after name)

asked Feb 22 '15 at 3:08

Wizuriel

6712922

So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like

location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25

What I would like is for it to look like

location name Date Value
A "test" Jan-2010 12 
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18 
B "foo" Feb-2010 20
B "foo" March-2010 25

problem is I don't know how many dates are in the column (though I know they will always start after name)

python pandas

asked Feb 22 '15 at 3:08

Wizuriel

6712922

asked Feb 22 '15 at 3:08

Wizuriel

6712922

asked Feb 22 '15 at 3:08

Wizuriel

6712922

asked Feb 22 '15 at 3:08

Wizuriel

6712922

asked Feb 22 '15 at 3:08

Wizuriel

6712922

stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21

add a comment |

stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21

stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21

add a comment |

3 Answers
3

active

oldest

votes

You can use pd.melt to get most of the way there, and then sort:

>>> df
 location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"], 
 var_name="Date", value_name="Value")
>>> df2
 location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
 location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25

(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

5

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

1

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

add a comment |

I guess I found a simpler solution

temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')

Concat whole temp1 with temp2's column name

temp1['new_column'] = temp2['name']

You now have what you asked for.

answered Jun 25 '18 at 6:49

Prometheus

16529

add a comment |

`pd.wide_to_long`

You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.

df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
 .sort_values(['location', 'name'])

print(res)

 name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f28654047%2fpandas-convert-some-columns-into-rows%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

You can use pd.melt to get most of the way there, and then sort:

>>> df
 location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"], 
 var_name="Date", value_name="Value")
>>> df2
 location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
 location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25

(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

5

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

1

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

add a comment |

You can use pd.melt to get most of the way there, and then sort:

>>> df
 location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"], 
 var_name="Date", value_name="Value")
>>> df2
 location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
 location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25

(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

5

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

1

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

add a comment |

You can use pd.melt to get most of the way there, and then sort:

>>> df
 location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"], 
 var_name="Date", value_name="Value")
>>> df2
 location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
 location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25

(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

You can use pd.melt to get most of the way there, and then sort:

>>> df
 location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"], 
 var_name="Date", value_name="Value")
>>> df2
 location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
 location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25

(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

edited Apr 8 '18 at 12:32

jpp

97.4k2159109

answered Feb 22 '15 at 3:21

DSM

207k35397372

answered Feb 22 '15 at 3:21

DSM

207k35397372

answered Feb 22 '15 at 3:21

DSM

207k35397372

5

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

1

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

add a comment |

5

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

1

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

– Moe Chughtai
Jun 30 '16 at 12:25

@DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

– 3kstc
Mar 8 '18 at 23:02

@3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

– Teepeemm
Mar 9 '18 at 18:59

@DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

– Adrian
yesterday

add a comment |

I guess I found a simpler solution

temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')

Concat whole temp1 with temp2's column name

temp1['new_column'] = temp2['name']

You now have what you asked for.

answered Jun 25 '18 at 6:49

Prometheus

16529

add a comment |

I guess I found a simpler solution

temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')

Concat whole temp1 with temp2's column name

temp1['new_column'] = temp2['name']

You now have what you asked for.

answered Jun 25 '18 at 6:49

Prometheus

16529

add a comment |

I guess I found a simpler solution

temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')

Concat whole temp1 with temp2's column name

temp1['new_column'] = temp2['name']

You now have what you asked for.

answered Jun 25 '18 at 6:49

Prometheus

16529

I guess I found a simpler solution

temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')

Concat whole temp1 with temp2's column name

temp1['new_column'] = temp2['name']

You now have what you asked for.

answered Jun 25 '18 at 6:49

Prometheus

16529

answered Jun 25 '18 at 6:49

Prometheus

16529

answered Jun 25 '18 at 6:49

Prometheus

16529

answered Jun 25 '18 at 6:49

Prometheus

16529

add a comment |

`pd.wide_to_long`

df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
 .sort_values(['location', 'name'])

print(res)

 name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

add a comment |

`pd.wide_to_long`

df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
 .sort_values(['location', 'name'])

print(res)

 name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

add a comment |

`pd.wide_to_long`

df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
 .sort_values(['location', 'name'])

print(res)

 name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

`pd.wide_to_long`

df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
 .sort_values(['location', 'name'])

print(res)

 name Date location Value
0 test Jan-2010 A 12
2 test Feb-2010 A 20
4 test March-2010 A 30
1 foo Jan-2010 B 18
3 foo Feb-2010 B 20
5 foo March-2010 B 25

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

answered Nov 13 '18 at 17:00

jpp

97.4k2159109

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Odtnhj