Pandas: pivot and flatten columns by combining index and columns names





















I would like to pivot a dataframe in Pandas. I'm following the doc here: https://pandas.pydata.org/pandas-docs/stable/reshaping.html



From this dataframe:



          date variable     value
0   2000-01-03        A  0.469112
1   2000-01-04        A -0.282863
2   2000-01-05        A -1.509059
3   2000-01-03        B -1.135632
4   2000-01-04        B  1.212112
5   2000-01-05        B -0.173215
6   2000-01-03        C  0.119209
7   2000-01-04        C -1.044236
8   2000-01-05        C -0.861849
9   2000-01-03        D -2.104569
10  2000-01-04        D -0.494929
11  2000-01-05        D  1.071804


Running df.pivot(index='date', columns='variable', values='value') gives me this:



variable           A         B         C         D
date
2000-01-03  0.469112 -1.135632  0.119209 -2.104569
2000-01-04 -0.282863  1.212112 -1.044236 -0.494929
2000-01-05 -1.509059 -0.173215 -0.861849  1.071804


I end up with a MultiIndex dataframe. An image might be better to describe what happens:



[image not preserved]



However, I would like to do this:



[image not preserved]



All the approaches I could find to flatten the multiindex end up giving me foo and bar on different rows. Could you give me a hand here?
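
For reference, a minimal sketch of how pivot can end up with MultiIndex columns. It assumes a hypothetical long frame with two value columns named foo and bar (the names are borrowed from the sentence above; the real data was only in the images), and it leaves out values= so that both columns are pivoted:

```python
import pandas as pd

# Hypothetical long-format frame; "foo" and "bar" are assumed value columns.
long_df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-04", "2000-01-03", "2000-01-04"],
    "variable": ["A", "A", "B", "B"],
    "foo": [1.0, 2.0, 3.0, 4.0],
    "bar": [5.0, 6.0, 7.0, 8.0],
})

# Without values=, every remaining column is pivoted, so the result has
# two column levels: (value column, variable).
wide = long_df.pivot(index="date", columns="variable")

print(wide.columns.nlevels)
print(list(wide.columns))
```

Each column label here is a tuple such as ("foo", "A"), which is why flattening means joining the two parts into a single string.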










Tags: python, pandas, dataframe

asked Nov 11 at 10:24 by Rififi






















          2 Answers














OK, after a few hours of intensive searching, here is the simple solution I found:

df.columns = [col[0] + f"_r{col[1]}" for col in df.columns]
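
A minimal sketch of that one-liner applied to two-level columns. The frame and the names foo and bar are hypothetical, and the first level is assumed to hold strings, since it is concatenated with +:

```python
import pandas as pd

# Hypothetical two-level columns, as produced by a pivot over two value columns.
columns = pd.MultiIndex.from_product([["foo", "bar"], ["A", "B"]])
df = pd.DataFrame([[1, 2, 3, 4]], columns=columns)

# Each column label is a tuple such as ("foo", "A"); join its parts into one string.
df.columns = [col[0] + f"_r{col[1]}" for col in df.columns]

print(list(df.columns))  # ['foo_rA', 'foo_rB', 'bar_rA', 'bar_rB']
```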
















answered Nov 11 at 12:59 by Rififi











• Now I understand what you need. The simplest way is to use df.columns = [f"{a}_r{b}" for a, b in df.columns]; check the edited answer.
  – jezrael
  Nov 11 at 13:11

















I believe you need add_prefix to change the column names, then rename_axis to remove the columns name, and reset_index to turn the index into a column:

df1 = df.pivot(index='date', columns='variable', values='value')

df1 = df1.add_prefix(df1.columns.name + '_').rename_axis(None, axis=1).reset_index()
print(df1)
         date  variable_A  variable_B  variable_C  variable_D
0  2000-01-03    0.469112   -1.135632    0.119209   -2.104569
1  2000-01-04   -0.282863    1.212112   -1.044236   -0.494929
2  2000-01-05   -1.509059   -0.173215   -0.861849    1.071804
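
The chain above can be tried end to end with a sketch like the following, which rebuilds a small part of the question's frame (two dates, two variables) so that it runs on its own. The names wide and flat are only illustrative:

```python
import pandas as pd

# The first two dates and variables from the question's frame.
df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-04", "2000-01-03", "2000-01-04"],
    "variable": ["A", "A", "B", "B"],
    "value": [0.469112, -0.282863, -1.135632, 1.212112],
})

wide = df.pivot(index="date", columns="variable", values="value")

flat = (
    wide.add_prefix(wide.columns.name + "_")  # A -> variable_A
        .rename_axis(None, axis=1)            # drop the "variable" label on the columns axis
        .reset_index()                        # move "date" from the index to a column
)
print(flat)
```

Splitting the chain across lines makes it easier to see what each step contributes: the prefix comes from the name of the columns axis, which is "variable" after the pivot.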


EDIT:

If you need to flatten a MultiIndex in the columns, use a list comprehension:

import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_product([["A", "B", "C", "D"], ["X", "Y"]])
df = pd.DataFrame([np.arange(8)], columns=mux)
print(df)
   A     B     C     D
   X  Y  X  Y  X  Y  X  Y
0  0  1  2  3  4  5  6  7

df.columns = [f"{a}_r{b}" for a, b in df.columns]
print(df)
   A_rX  A_rY  B_rX  B_rY  C_rX  C_rY  D_rX  D_rY
0     0     1     2     3     4     5     6     7
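
The comprehension above unpacks exactly two levels. A sketch of a variant that joins any number of levels, assuming a plain "_" separator is acceptable; the helper name flatten_columns is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd


def flatten_columns(frame: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return a copy of frame whose MultiIndex column labels are joined with sep."""
    out = frame.copy()
    # Each label is a tuple with one entry per level; str() covers non-string labels.
    out.columns = [sep.join(map(str, col)) for col in out.columns]
    return out


# Three column levels, the last one with integer labels.
mux = pd.MultiIndex.from_product([["A", "B"], ["X", "Y"], [1, 2]])
df = pd.DataFrame([np.arange(8)], columns=mux)

flat = flatten_columns(df)
print(list(flat.columns))
```

Returning a copy leaves the original frame's MultiIndex untouched, which may be convenient when the hierarchical columns are still needed elsewhere.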














edited Nov 11 at 13:06
answered Nov 11 at 10:28 by jezrael


























