pandas convert some columns into rows










29















So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like



location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25


What I would like is for it to look like



location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25


problem is I don't know how many dates are in the column (though I know they will always start after name)










share|improve this question






















  • stackoverflow.com/questions/19842066/…

    – pyCthon
    Feb 22 '15 at 3:21















29















So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like



location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25


What I would like is for it to look like



location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25


problem is I don't know how many dates are in the column (though I know they will always start after name)










share|improve this question






















  • stackoverflow.com/questions/19842066/…

    – pyCthon
    Feb 22 '15 at 3:21













29












29








29


20






So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like



location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25


What I would like is for it to look like



location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25


problem is I don't know how many dates are in the column (though I know they will always start after name)










share|improve this question














So my dataset has some information by location for n dates. The problem is each date is actually a different column header. For example the CSV looks like



location name Jan-2010 Feb-2010 March-2010
A "test" 12 20 30
B "foo" 18 20 25


What I would like is for it to look like



location name Date Value
A "test" Jan-2010 12
A "test" Feb-2010 20
A "test" March-2010 30
B "foo" Jan-2010 18
B "foo" Feb-2010 20
B "foo" March-2010 25


problem is I don't know how many dates are in the column (though I know they will always start after name)







python pandas






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Feb 22 '15 at 3:08









WizurielWizuriel

6712922




6712922












  • stackoverflow.com/questions/19842066/…

    – pyCthon
    Feb 22 '15 at 3:21

















  • stackoverflow.com/questions/19842066/…

    – pyCthon
    Feb 22 '15 at 3:21
















stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21





stackoverflow.com/questions/19842066/…

– pyCthon
Feb 22 '15 at 3:21












3 Answers
3






active

oldest

votes


















84














You can use pd.melt to get most of the way there, and then sort:



>>> df
location name Jan-2010 Feb-2010 March-2010
0 A test 12 20 30
1 B foo 18 20 25
>>> df2 = pd.melt(df, id_vars=["location", "name"],
var_name="Date", value_name="Value")
>>> df2
location name Date Value
0 A test Jan-2010 12
1 B foo Jan-2010 18
2 A test Feb-2010 20
3 B foo Feb-2010 20
4 A test March-2010 30
5 B foo March-2010 25
>>> df2 = df2.sort(["location", "name"])
>>> df2
location name Date Value
0 A test Jan-2010 12
2 A test Feb-2010 20
4 A test March-2010 30
1 B foo Jan-2010 18
3 B foo Feb-2010 20
5 B foo March-2010 25


(Might want to throw in a .reset_index(drop=True), just to keep the output clean.)



Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.






share|improve this answer




















  • 5





    A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

    – Moe Chughtai
    Jun 30 '16 at 12:25











  • @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

    – 3kstc
    Mar 8 '18 at 23:02







  • 1





    @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

    – Teepeemm
    Mar 9 '18 at 18:59











  • @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

    – Adrian
    yesterday


















3














I guess I found a simpler solution



temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')


Concat whole temp1 with temp2's column name



temp1['new_column'] = temp2['name']


You now have what you asked for.






share|improve this answer






























    2














    pd.wide_to_long



    You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.



    df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

    res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
    .sort_values(['location', 'name'])

    print(res)

    name Date location Value
    0 test Jan-2010 A 12
    2 test Feb-2010 A 20
    4 test March-2010 A 30
    1 foo Jan-2010 B 18
    3 foo Feb-2010 B 20
    5 foo March-2010 B 25





    share|improve this answer






















      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f28654047%2fpandas-convert-some-columns-into-rows%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      84














      You can use pd.melt to get most of the way there, and then sort:



      >>> df
      location name Jan-2010 Feb-2010 March-2010
      0 A test 12 20 30
      1 B foo 18 20 25
      >>> df2 = pd.melt(df, id_vars=["location", "name"],
      var_name="Date", value_name="Value")
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      1 B foo Jan-2010 18
      2 A test Feb-2010 20
      3 B foo Feb-2010 20
      4 A test March-2010 30
      5 B foo March-2010 25
      >>> df2 = df2.sort(["location", "name"])
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      2 A test Feb-2010 20
      4 A test March-2010 30
      1 B foo Jan-2010 18
      3 B foo Feb-2010 20
      5 B foo March-2010 25


      (Might want to throw in a .reset_index(drop=True), just to keep the output clean.)



      Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.






      share|improve this answer




















      • 5





        A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

        – Moe Chughtai
        Jun 30 '16 at 12:25











      • @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

        – 3kstc
        Mar 8 '18 at 23:02







      • 1





        @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

        – Teepeemm
        Mar 9 '18 at 18:59











      • @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

        – Adrian
        yesterday















      84














      You can use pd.melt to get most of the way there, and then sort:



      >>> df
      location name Jan-2010 Feb-2010 March-2010
      0 A test 12 20 30
      1 B foo 18 20 25
      >>> df2 = pd.melt(df, id_vars=["location", "name"],
      var_name="Date", value_name="Value")
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      1 B foo Jan-2010 18
      2 A test Feb-2010 20
      3 B foo Feb-2010 20
      4 A test March-2010 30
      5 B foo March-2010 25
      >>> df2 = df2.sort(["location", "name"])
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      2 A test Feb-2010 20
      4 A test March-2010 30
      1 B foo Jan-2010 18
      3 B foo Feb-2010 20
      5 B foo March-2010 25


      (Might want to throw in a .reset_index(drop=True), just to keep the output clean.)



      Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.






      share|improve this answer




















      • 5





        A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

        – Moe Chughtai
        Jun 30 '16 at 12:25











      • @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

        – 3kstc
        Mar 8 '18 at 23:02







      • 1





        @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

        – Teepeemm
        Mar 9 '18 at 18:59











      • @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

        – Adrian
        yesterday













      84












      84








      84







      You can use pd.melt to get most of the way there, and then sort:



      >>> df
      location name Jan-2010 Feb-2010 March-2010
      0 A test 12 20 30
      1 B foo 18 20 25
      >>> df2 = pd.melt(df, id_vars=["location", "name"],
      var_name="Date", value_name="Value")
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      1 B foo Jan-2010 18
      2 A test Feb-2010 20
      3 B foo Feb-2010 20
      4 A test March-2010 30
      5 B foo March-2010 25
      >>> df2 = df2.sort(["location", "name"])
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      2 A test Feb-2010 20
      4 A test March-2010 30
      1 B foo Jan-2010 18
      3 B foo Feb-2010 20
      5 B foo March-2010 25


      (Might want to throw in a .reset_index(drop=True), just to keep the output clean.)



      Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.






      share|improve this answer















      You can use pd.melt to get most of the way there, and then sort:



      >>> df
      location name Jan-2010 Feb-2010 March-2010
      0 A test 12 20 30
      1 B foo 18 20 25
      >>> df2 = pd.melt(df, id_vars=["location", "name"],
      var_name="Date", value_name="Value")
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      1 B foo Jan-2010 18
      2 A test Feb-2010 20
      3 B foo Feb-2010 20
      4 A test March-2010 30
      5 B foo March-2010 25
      >>> df2 = df2.sort(["location", "name"])
      >>> df2
      location name Date Value
      0 A test Jan-2010 12
      2 A test Feb-2010 20
      4 A test March-2010 30
      1 B foo Jan-2010 18
      3 B foo Feb-2010 20
      5 B foo March-2010 25


      (Might want to throw in a .reset_index(drop=True), just to keep the output clean.)



      Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.







      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Apr 8 '18 at 12:32









      jpp

      97.4k2159109




      97.4k2159109










      answered Feb 22 '15 at 3:21









      DSMDSM

      207k35397372




      207k35397372







      • 5





        A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

        – Moe Chughtai
        Jun 30 '16 at 12:25











      • @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

        – 3kstc
        Mar 8 '18 at 23:02







      • 1





        @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

        – Teepeemm
        Mar 9 '18 at 18:59











      • @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

        – Adrian
        yesterday












      • 5





        A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

        – Moe Chughtai
        Jun 30 '16 at 12:25











      • @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

        – 3kstc
        Mar 8 '18 at 23:02







      • 1





        @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

        – Teepeemm
        Mar 9 '18 at 18:59











      • @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

        – Adrian
        yesterday







      5




      5





      A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

      – Moe Chughtai
      Jun 30 '16 at 12:25





      A year later, this is still the most succinct answer I've seen to the columns to rows question. Thank you!

      – Moe Chughtai
      Jun 30 '16 at 12:25













      @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

      – 3kstc
      Mar 8 '18 at 23:02






      @DSM what would be the inverse of this function. i.e. how would one convert df2 [back] to df

      – 3kstc
      Mar 8 '18 at 23:02





      1




      1





      @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

      – Teepeemm
      Mar 9 '18 at 18:59





      @3kstc Try here or here. You're wanting to look into pivots. Possibly pandas.pivot_table(df2,values='Value',index=['location','name'],columns='Date').reset_index().

      – Teepeemm
      Mar 9 '18 at 18:59













      @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

      – Adrian
      yesterday





      @DSM is there any way to go backwards? Meaning that I have a lot of rows with the same name and I would want all the dates to be on different columns

      – Adrian
      yesterday













      3














      I guess I found a simpler solution



      temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
      temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')


      Concat whole temp1 with temp2's column name



      temp1['new_column'] = temp2['name']


      You now have what you asked for.






      share|improve this answer



























        3














        I guess I found a simpler solution



        temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
        temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')


        Concat whole temp1 with temp2's column name



        temp1['new_column'] = temp2['name']


        You now have what you asked for.






        share|improve this answer

























          3












          3








          3







          I guess I found a simpler solution



          temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
          temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')


          Concat whole temp1 with temp2's column name



          temp1['new_column'] = temp2['name']


          You now have what you asked for.






          share|improve this answer













          I guess I found a simpler solution



          temp1 = pd.melt(df1, id_vars=["location"], var_name='Date', value_name='Value')
          temp2 = pd.melt(df1, id_vars=["name"], var_name='Date', value_name='Value')


          Concat whole temp1 with temp2's column name



          temp1['new_column'] = temp2['name']


          You now have what you asked for.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jun 25 '18 at 6:49









          PrometheusPrometheus

          16529




          16529





















              2














              pd.wide_to_long



              You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.



              df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

              res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
              .sort_values(['location', 'name'])

              print(res)

              name Date location Value
              0 test Jan-2010 A 12
              2 test Feb-2010 A 20
              4 test March-2010 A 30
              1 foo Jan-2010 B 18
              3 foo Feb-2010 B 20
              5 foo March-2010 B 25





              share|improve this answer



























                2














                pd.wide_to_long



                You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.



                df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

                res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
                .sort_values(['location', 'name'])

                print(res)

                name Date location Value
                0 test Jan-2010 A 12
                2 test Feb-2010 A 20
                4 test March-2010 A 30
                1 foo Jan-2010 B 18
                3 foo Feb-2010 B 20
                5 foo March-2010 B 25





                share|improve this answer

























                  2












                  2








                  2







                  pd.wide_to_long



                  You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.



                  df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

                  res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
                  .sort_values(['location', 'name'])

                  print(res)

                  name Date location Value
                  0 test Jan-2010 A 12
                  2 test Feb-2010 A 20
                  4 test March-2010 A 30
                  1 foo Jan-2010 B 18
                  3 foo Feb-2010 B 20
                  5 foo March-2010 B 25





                  share|improve this answer













                  pd.wide_to_long



                  You can add a prefix to your year columns and then feed directly to pd.wide_to_long. I won't pretend this is efficient, but it may in certain situations be more convenient than pd.melt, e.g. when your columns already have an appropriate prefix.



                  df.columns = np.hstack((df.columns[:2], df.columns[2:].map(lambda x: f'Valuex')))

                  res = pd.wide_to_long(df, stubnames=['Value'], i='name', j='Date').reset_index()
                  .sort_values(['location', 'name'])

                  print(res)

                  name Date location Value
                  0 test Jan-2010 A 12
                  2 test Feb-2010 A 20
                  4 test March-2010 A 30
                  1 foo Jan-2010 B 18
                  3 foo Feb-2010 B 20
                  5 foo March-2010 B 25






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 13 '18 at 17:00









                  jppjpp

                  97.4k2159109




                  97.4k2159109



























                      draft saved

                      draft discarded
















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f28654047%2fpandas-convert-some-columns-into-rows%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      這個網誌中的熱門文章

                      How to read a connectionString WITH PROVIDER in .NET Core?

                      Guadeloupe

                      Node.js Script on GitHub Pages or Amazon S3