Pandas: pivot and flatten columns by combining index and columns names

I would like to pivot a dataframe in Pandas. I'm following the doc here: https://pandas.pydata.org/pandas-docs/stable/reshaping.html

From this dataframe:

          date variable     value
0   2000-01-03        A  0.469112
1   2000-01-04        A -0.282863
2   2000-01-05        A -1.509059
3   2000-01-03        B -1.135632
4   2000-01-04        B  1.212112
5   2000-01-05        B -0.173215
6   2000-01-03        C  0.119209
7   2000-01-04        C -1.044236
8   2000-01-05        C -0.861849
9   2000-01-03        D -2.104569
10  2000-01-04        D -0.494929
11  2000-01-05        D  1.071804

Running df.pivot(index='date', columns='variable', values='value') gives me this:

variable           A         B         C         D
date
2000-01-03  0.469112 -1.135632  0.119209 -2.104569
2000-01-04 -0.282863  1.212112 -1.044236 -0.494929
2000-01-05 -1.509059 -0.173215 -0.861849  1.071804
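For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame and the pivot. It assumes the dates are parsed with pd.to_datetime (the table does not say whether they are strings or timestamps), and the name wide is only illustrative. In this sketch, with a single values column, the columns come back as a single-level Index named variable; two-level columns would be expected when several value columns are pivoted at once.

```python
import pandas as pd

# Rebuild the long-format frame; values are copied from the table above.
df = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05"] * 4),
    "variable": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "value": [0.469112, -0.282863, -1.509059,
              -1.135632, 1.212112, -0.173215,
              0.119209, -1.044236, -0.861849,
              -2.104569, -0.494929, 1.071804],
})

# One row per date, one column per distinct value of "variable".
wide = df.pivot(index="date", columns="variable", values="value")
print(wide)

# Inspect the column index: its name and its number of levels.
print(wide.columns.name, wide.columns.nlevels)
```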

I end up with a MultiIndex dataframe. An image might be better to describe what happens:

enter image description here

However, I would like to do this:

enter image description here

All the approaches I could find to flatten the MultiIndex end up giving me foo and bar on different rows. Could you give me a hand here?

asked Nov 11 at 10:24

Rififi

python pandas dataframe


2 Answers

Ok after a few hours of intensive search, here is the simple solution I found:

df.columns = [col[0] + f"_r{col[1]}" for col in df.columns]
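For context, here is a sketch of the situation this one-liner seems to address. It assumes the real data has two value columns (hypothetically named foo and bar here, after the names mentioned in the question), integer values in variable, and a pivot call without values; none of this is confirmed by the question, whose images are missing.

```python
import pandas as pd

# Hypothetical long-format data with two value columns, foo and bar.
df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-03", "2000-01-04", "2000-01-04"],
    "variable": [1, 2, 1, 2],
    "foo": [0.1, 0.2, 0.3, 0.4],
    "bar": [1.0, 2.0, 3.0, 4.0],
})

# Without values=, every remaining column is pivoted, so the columns
# become a two-level MultiIndex of (value column, variable) tuples.
wide = df.pivot(index="date", columns="variable")

# Join both levels into one flat label, e.g. ("foo", 1) -> "foo_r1".
wide.columns = [f"{col[0]}_r{col[1]}" for col in wide.columns]
print(wide.reset_index())
```

The f-string handles the integer level values, which plain string concatenation with + would reject.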

answered Nov 11 at 12:59

Rififi


Now I understand what you need. The simplest way is to use df.columns = [f"{a}_r{b}" for a, b in df.columns]; check the edited answer.
– jezrael
Nov 11 at 13:11


I believe you need add_prefix to change the column names, then rename_axis to remove the columns name (df1.columns.name), and reset_index to turn the index into a column:

df1 = df.pivot(index='date', columns='variable', values='value')

df1 = df1.add_prefix(df1.columns.name + '_').rename_axis(None, axis=1).reset_index()
print(df1)
         date  variable_A  variable_B  variable_C  variable_D
0  2000-01-03    0.469112   -1.135632    0.119209   -2.104569
1  2000-01-04   -0.282863    1.212112   -1.044236   -0.494929
2  2000-01-05   -1.509059   -0.173215   -0.861849    1.071804
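To make the chain easier to follow, here is the same idea split into separate steps on a smaller frame. This is only a sketch: the two-variable data and the names prefix, step1, step2 and step3 are illustrative and not part of the original answer.

```python
import pandas as pd

# A cut-down version of the question's data (two dates, two variables).
df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-04"] * 2,
    "variable": ["A", "A", "B", "B"],
    "value": [0.469112, -0.282863, -1.135632, 1.212112],
})

df1 = df.pivot(index="date", columns="variable", values="value")

prefix = df1.columns.name + "_"          # "variable_"
step1 = df1.add_prefix(prefix)           # A -> variable_A, B -> variable_B
step2 = step1.rename_axis(None, axis=1)  # clear the name of the columns axis
step3 = step2.reset_index()              # move date from the index to a column

print(list(step3.columns))  # ['date', 'variable_A', 'variable_B']
```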

EDIT:

If you need to flatten a MultiIndex in the columns, use a list comprehension:

import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_product([["A", "B", "C", "D"], ["X", "Y"]])
df = pd.DataFrame([np.arange(8)], columns=mux)
print(df)
   A     B     C     D
   X  Y  X  Y  X  Y  X  Y
0  0  1  2  3  4  5  6  7

df.columns = [f"{a}_r{b}" for a, b in df.columns]
print(df)
   A_rX  A_rY  B_rX  B_rY  C_rX  C_rY  D_rX  D_rY
0     0     1     2     3     4     5     6     7
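The comprehension above unpacks exactly two levels. If the columns might have more levels, or non-string labels, a join-based variant is one possible generalization. The helper flatten_columns below is hypothetical (it is not a pandas function) and is offered only as a sketch.

```python
import numpy as np
import pandas as pd


def flatten_columns(frame, sep="_"):
    """Return a copy whose MultiIndex columns are joined into single strings."""
    out = frame.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # str() each level value so non-string labels (ints, dates) also work.
        out.columns = [sep.join(str(level) for level in col) for col in out.columns]
    return out


# Three column levels, the last one holding integers.
mux = pd.MultiIndex.from_product([["A", "B"], ["X", "Y"], [1, 2]])
df = pd.DataFrame([np.arange(8)], columns=mux)

flat = flatten_columns(df, sep="_r")
print(list(flat.columns))
```

Returning a copy leaves the original frame and its MultiIndex untouched, which can be convenient if the hierarchical columns are still needed elsewhere.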

edited Nov 11 at 13:06

answered Nov 11 at 10:28

jezrael
