Transformation of a given pandas dataframe to another dataframe

























I have a pandas DataFrame, shown below, that holds the distances in degrees from individual points to three cities: Fargo, Orange and Jersey City. The frame has 12 rows because it stacks three blocks of 4 rows: rows 0 through 3 are the 4 points nearest to Fargo, rows 4 through 7 are the 4 points nearest to Orange, and rows 8 through 11 are the 4 points nearest to Jersey City. Every column is filled in for every row, so the Fargo column, for example, also carries distances in the 8 rows that were selected for the other two cities. The frame is built like this:



import pandas as pd

Points = ['Point1','Point4','Point5','Point2','Point2','Point5','Point1','Point4','Point3','Point6','Point4','Point1']
Fargo = [2.90300755828,3.91961324034,21.9825588597,24.3141420303,24.3141420303,21.9825588597,2.90300755828,3.91961324034,25.3599772676,25.8509998739,3.91961324034,2.90300755828]
Orange = [25.5464458592,27.1527975618,6.17298387907,4.80214941294,4.80214941294,6.17298387907,25.5464458592,27.1527975618,46.4066249652,45.8853687976,27.1527975618,25.5464458592]
Jersey_City = [21.1030418227,19.6763385681,39.3194029761,41.8121131045,41.8121131045,39.3194029761,21.1030418227,19.6763385681,2.09632277264,2.67885042284,19.6763385681,21.1030418227]
# Each list has 12 entries: three blocks of 4 rows, one block per city.
toy_data = pd.DataFrame({'Fargo': Fargo, 'Orange': Orange, 'Jersey_City': Jersey_City},
                        index=Points, columns=['Fargo', 'Orange', 'Jersey_City'])


In the Fargo column, rows 0 through 3 hold the points with the shortest distances to Fargo. In the Orange column, rows 4 through 7 hold the points with the shortest distances to Orange, but the Fargo column is also populated in rows 4 through 7, with the distances from Fargo to Orange's four nearest points. The same holds for Jersey City in rows 8 through 11. I want a single DataFrame that keeps, for each city, only the distances to its own 4 nearest points and blanks out everything else.
What I want is this:



nan = float('nan')  # real missing values instead of the string 'NAN'
Fargo = [2.9030075582789885,3.919613240342197,21.982558859743925,24.314142030334484,nan,nan,nan,nan,nan,nan,nan,nan]
Orange = [nan,nan,nan,nan,4.802149412942695,6.172983879065276,25.546445859236265,27.15279756182145,nan,nan,nan,nan]
Jersey_City = [nan,nan,nan,nan,nan,nan,nan,nan,2.096322772642856,2.67885042283533,19.676338568056806,21.10304182269932]
result_wanted_data = pd.DataFrame({'Fargo': Fargo, 'Orange': Orange, 'Jersey_City': Jersey_City},
                                  index=Points, columns=['Fargo', 'Orange', 'Jersey_City'])
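
One way to read the description above is that each block of 4 rows is the 4 smallest distances in one city's column, taken over the distinct points. The following is only a sketch under that assumption; the names `unique_points`, `pieces` and `nearest` are illustrative and not part of the question. It rebuilds the wanted layout from the toy data:

```python
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5',
          'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
cities = ['Fargo', 'Orange', 'Jersey_City']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points, columns=cities)

# Keep one row per distinct point (the toy frame repeats points across blocks).
unique_points = toy_data[~toy_data.index.duplicated()]

# For each city, take its 4 smallest distances as a single-column frame.
# Concatenating the pieces leaves NaN in the columns a piece does not have.
pieces = [unique_points[[city]].nsmallest(4, city) for city in cities]
nearest = pd.concat(pieces)[cities]

print(nearest)
```

If the real data is not already a stack of per-city blocks, this variant has the advantage of not depending on row positions at all.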




























  • Please can you better explain the problem and what you are trying to obtain. – yatu, Nov 15 '18 at 13:30

  • name 'data' is not defined! Please provide an MCVE. – user32185, Nov 15 '18 at 13:34

  • @AlexandreNixon I hope you understand the problem now. – Sounak Banerjee, Nov 15 '18 at 13:42

  • @user32185 I think the 'data' you were asking for is given now. Apologies for the hassle. – Sounak Banerjee, Nov 15 '18 at 13:43





















python pandas dataframe














asked Nov 15 '18 at 13:26 by Sounak Banerjee
edited Nov 15 '18 at 13:49 by Malik Asad



















3 Answers




















































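This is not exactly the layout you asked for, but I think it serves the purpose. It builds a long-format frame with one row per city, point and distance:

import math
import numpy as np
import pandas as pd

# Assumption: `data` is toy_data with the point names moved into column 0.
data = toy_data.reset_index()

# Row i belongs to city column ceil((i+1)/4): column 1 for rows 0-3, column 2 for rows 4-7, and so on.
newdf = np.empty([12])
for i in range(12):
    newdf[i] = data.iloc[i, math.ceil((i + 1) / 4)]

# City name for each row, taken from the column headers after the point column.
newdf1 = []
cities = list(data.columns.values[1:])
for i in range(12):
    newdf1.append(cities[math.ceil((i + 1) / 4) - 1])

# Point name for each row (column 0).
strs = ["" for x in range(12)]
for i in range(12):
    strs[i] = data.iloc[i, 0]

final_data = pd.DataFrame(columns=['city', 'point', 'distance'])
final_data['city'] = newdf1
final_data['distance'] = newdf
final_data['point'] = strs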
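
The three loops above can also be written without explicit iteration. The following is a sketch rather than the answerer's code, and it assumes blocks of exactly 4 rows in column order, as described in the question. It uses integer division to find each row's block and NumPy fancy indexing to pick one value per row:

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5',
          'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
cities = ['Fargo', 'Orange', 'Jersey_City']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points, columns=cities)

n_rows = len(toy_data)

# Block number of each row: 0 for rows 0-3, 1 for rows 4-7, 2 for rows 8-11.
block = np.arange(n_rows) // 4

final_data = pd.DataFrame({
    # City name of the block each row belongs to.
    'city': toy_data.columns.to_numpy()[block],
    'point': toy_data.index.to_numpy(),
    # One value per row: row r, column block[r].
    'distance': toy_data.to_numpy()[np.arange(n_rows), block],
})

print(final_data)
```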














answered Nov 15 '18 at 13:56 by Saradamani, edited Nov 15 '18 at 14:22




































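You can use the following, which collects the kept distances into a single column:

import numpy as np
import pandas as pd

# Split the row positions 0..11 into one chunk of consecutive rows per city column.
intervals = np.array_split(np.arange(toy_data.shape[0]), toy_data.shape[1])
df = pd.DataFrame(columns=['Distances'], index=toy_data.reset_index().index)
for i, j in zip(range(toy_data.shape[1]), intervals):
    # Rows in chunk j take their value from city column i.
    df.loc[j, 'Distances'] = toy_data.reset_index(drop=True).iloc[j, i]

print(df)

   Distances
0    2.90301
1    3.91961
2    21.9826
3    24.3141
4    4.80215
5    6.17298
6    25.5464
7    27.1528
8    2.09632
9    2.67885
10   19.6763
11    21.103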
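
To make the chunking step concrete, here is a small standalone illustration of `np.array_split` on row positions. The variable names `intervals` and `uneven` are only for this example:

```python
import numpy as np

# 12 row positions split into 3 chunks, one chunk per city column.
intervals = np.array_split(np.arange(12), 3)
for city_position, rows in enumerate(intervals):
    print(city_position, rows.tolist())
# 0 [0, 1, 2, 3]
# 1 [4, 5, 6, 7]
# 2 [8, 9, 10, 11]

# Unlike np.split, array_split also accepts a length that does not divide evenly;
# the leading chunks get the extra elements.
uneven = np.array_split(np.arange(10), 3)
print([chunk.tolist() for chunk in uneven])
# [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

With uneven chunks the blocks would no longer line up with "4 nearest points per city", so the approach relies on the row count being a multiple of the number of cities.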
















answered Nov 15 '18 at 14:19 by yatu



































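You can use np.split() and a for loop:

import numpy as np
import pandas as pd

# Split points after every 4 rows: [4, 8] for three city columns.
x = 0
split = []
for num in range(len(toy_data.columns) - 1):
    split.append(x + 4)
    x += 4

# Split the row positions rather than the frame itself, so only plain NumPy arrays are involved.
blocks = np.split(np.arange(len(toy_data)), split)

# From block i keep only column i; concat fills the other columns with NaN.
data = []
for i, rows in enumerate(blocks):
    data.append(toy_data.iloc[rows, [i]])
pd.concat(data, sort=False)

            Fargo     Orange  Jersey_City
Point1   2.903008        NaN          NaN
Point4   3.919613        NaN          NaN
Point5  21.982559        NaN          NaN
Point2  24.314142        NaN          NaN
Point2        NaN   4.802149          NaN
Point5        NaN   6.172984          NaN
Point1        NaN  25.546446          NaN
Point4        NaN  27.152798          NaN
Point3        NaN        NaN     2.096323
Point6        NaN        NaN     2.678850
Point4        NaN        NaN    19.676339
Point1        NaN        NaN    21.103042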
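
The same wide result can be reached without splitting and concatenating, by masking the frame in place. This is a sketch of an alternative technique (a boolean block mask applied with `np.where`), not the code from the answer. The block size `rows_per_city = 4` is taken from the question, and the other names are illustrative:

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5',
          'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
cities = ['Fargo', 'Orange', 'Jersey_City']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points, columns=cities)

rows_per_city = 4  # block size stated in the question
row_block = np.arange(len(toy_data)) // rows_per_city  # 0,0,0,0,1,1,1,1,2,2,2,2
col_number = np.arange(toy_data.shape[1])              # 0,1,2

# mask[r, c] is True only where row r lies in the block that belongs to column c.
mask = row_block[:, None] == col_number[None, :]

# Keep the masked cells and put NaN everywhere else.
result = pd.DataFrame(np.where(mask, toy_data.to_numpy(), np.nan),
                      index=toy_data.index, columns=toy_data.columns)

print(result)
```

Because the mask works on positions only, the repeated point names in the index cause no alignment problems.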






























  • TypeError: concat() got an unexpected keyword argument 'sort'. It shows this error. – Saradamani, Nov 15 '18 at 14:25

  • @Saradamani what version of pandas are you using? – Chris, Nov 15 '18 at 14:27

  • pd.__version__ Out[924]: '0.21.1' – Saradamani, Nov 15 '18 at 14:28

  • @Saradamani sort was added in 0.23, I believe. You should just be able to remove that param. – Chris, Nov 15 '18 at 14:29

  • Beautiful answer. Yes, I saw this works. My solution was of course different. – Saradamani, Nov 15 '18 at 14:33
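
If the same script has to run on pandas versions both with and without the `sort` argument discussed in the comments above, one option is to pass it only when the installed `pd.concat` accepts it. This is a hedged sketch. The two tiny frames `a` and `b` are made up for illustration, and the `concat_kwargs` name is not from the thread:

```python
import inspect

import pandas as pd

# Pass sort=False only when the installed pd.concat has a `sort` parameter.
concat_kwargs = {}
if 'sort' in inspect.signature(pd.concat).parameters:
    concat_kwargs['sort'] = False

# Two single-column frames with different columns, as in the answer's loop.
a = pd.DataFrame({'Fargo': [2.903008]}, index=['Point1'])
b = pd.DataFrame({'Orange': [4.802149]}, index=['Point2'])

combined = pd.concat([a, b], **concat_kwargs)
print(combined)
```

On an old version the columns may come back in sorted order, so select them explicitly afterwards if the order matters.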













                      1












                      1








                      1







                      You can use np.split() and a for loop:



                      x = 0
                      split =
                      for num in range(len(toy_data.columns)-1):
                      split.append(x+4)
                      x+=4

                      dfs = np.split(toy_data, split)

                      data =
                      for i in range(len(dfs)):
                      data.append(pd.DataFrame(dfs[i][dfs[i].columns[i]]))
                      pd.concat(data, sort=False)

                      Fargo Orange Jersey_City
                      Point1 2.903008 NaN NaN
                      Point4 3.919613 NaN NaN
                      Point5 21.982559 NaN NaN
                      Point2 24.314142 NaN NaN
                      Point2 NaN 4.802149 NaN
                      Point5 NaN 6.172984 NaN
                      Point1 NaN 25.546446 NaN
                      Point4 NaN 27.152798 NaN
                      Point3 NaN NaN 2.096323
                      Point6 NaN NaN 2.678850
                      Point4 NaN NaN 19.676339
                      Point1 NaN NaN 21.103042













                      edited Nov 15 '18 at 14:46

























                      answered Nov 15 '18 at 14:16









                      Chris




























