Pandas pivot table using custom conditions on the dataframe









I want to make a pivot table based on custom conditions in the dataframe:



The dataframe looks like this:



>>> df = pd.DataFrame({"Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
...                    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
...                    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
...                    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
...                    "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"]})
>>> df
  Area City Condition  Population Pincode
0    A    X      Good         100      X1
1    A    Y       Bad         150      Y1
2    B    Z      Good          50      Z1
3    A    P      Good         200      P1
4    C    Q      Good         170      Q1
5    A    R       Bad         390      R1
6    D    S      Good          80      S1
7    A    X      Good         100      X2


Now I want to pivot the dataframe df so that, for each area, I can see the number of unique cities, the number of unique "Good" cities, and the total population of the area.



I expect an output like this:



Area  city_count  good_city_count  Population
A              4                2         940
B              1                1          50
C              1                1         170
D              1                1          80
All            7                5        1240


I can pass a dictionary to the aggfunc parameter, but this doesn't give me a separate count of the good cities.



>>> city_count = df.pivot_table(index=["Area"],
...                             values=["City", "Population"],
...                             aggfunc={"City": lambda x: len(x.unique()),
...                                      "Population": "sum"},
...                             margins=True)

  Area  City  Population
0    A     4         940
1    B     1          50
2    C     1         170
3    D     1          80
4  All     7        1240


I could merge two different pivot tables, one with the count of cities and the other with the population, but this does not scale to a large dataset with a big aggfunc dictionary.










python pandas pivot-table

asked Nov 10 at 13:36 by Pratiek Malhotra, edited Nov 10 at 21:19

          2 Answers
Add the columns parameter together with fill_value; you can also use nunique as the aggregation function:



city_count = df.pivot_table(index="Area",
                            values="City",
                            columns='Condition',
                            aggfunc=lambda x: x.nunique(),
                            margins=True,
                            fill_value=0)
print(city_count)
Condition  Bad  Good  All
Area
A            2     2    4
B            0     1    1
C            0     1    1
D            0     1    1
All          2     5    7


Finally, if you need to convert the index to a column and change the column names:



city_count = city_count.add_suffix('_count').reset_index().rename_axis(None, axis=1)
print(city_count)
  Area  Bad_count  Good_count  All_count
0    A          2           2          4
1    B          0           1          1
2    C          0           1          1
3    D          0           1          1
4  All          2           5          7


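EDIT:

# Assumes: import numpy as np; import pandas as pd
d = {'City': 'nunique', 'Population': 'sum', 'good_city_count': 'nunique'}
d1 = {'City': 'city_count', 'Condition': 'good_city_count'}

# Keep the city name only where Condition is 'Good'; nunique ignores NaN
mask = df["Condition"] == 'Good'
df = (df.assign(good_city_count=lambda x: np.where(mask, x['City'], np.nan))
        .groupby('Area')
        .agg(d)
        .rename(columns=d1))

# DataFrame.append was removed in pandas 2.0; pd.concat adds the same 'All' row
total = df.sum().rename('All').to_frame().T
df = pd.concat([df, total]).rename_axis('Area').reset_index()

print(df)
  Area  city_count  Population  good_city_count
0    A           4         940                2
1    B           1          50                1
2    C           1         170                1
3    D           1          80                1
4  All           7        1240                5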
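
For readers on a newer pandas, the same idea can be sketched with named aggregation (keyword arguments to agg, available from pandas 0.25). This is only an illustrative variant, not part of the original answer: the helper column name good_city and the variable name summary are assumptions introduced here, and the "All" row is simply the sum of the per-area rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
})

# Hypothetical helper column: the city name where Condition is "Good",
# missing elsewhere. nunique() skips missing values, so it counts only
# the unique good cities.
summary = (
    df.assign(good_city=df["City"].where(df["Condition"] == "Good"))
      .groupby("Area")
      .agg(
          city_count=("City", "nunique"),
          good_city_count=("good_city", "nunique"),
          Population=("Population", "sum"),
      )
)

# Add the totals as an "All" row (sum of the per-area values).
summary.loc["All"] = summary.sum()
summary = summary.astype(int).reset_index()
print(summary)
```

Each output column is declared next to the source column and function that produce it, so further conditional counts only need one more helper column and one more keyword in agg.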






answered Nov 10 at 13:40 by jezrael, edited Nov 10 at 22:25

• This wouldn't give me the total_count of cities against the good_count. – Pratiek Malhotra, Nov 10 at 14:18
• @PratiekMalhotra - Sorry, you are right. Rolled back to the previous answer. – jezrael, Nov 10 at 14:20
• @PratiekMalhotra - If your question was answered, please accept the most helpful answer by clicking the grey check to the left of that answer to toggle it green. It helps the community. Thanks. – jezrael, Nov 10 at 14:24
• @jezrael, in dataframes where I need to aggregate values from multiple columns, I cannot use the "columns" parameter, as it would aggregate all the values based on that. – Pratiek Malhotra, Nov 10 at 14:31
• @PratiekMalhotra - Can you create a minimal, complete, and verifiable example? I am not sure I understand. Thank you. – jezrael, Nov 10 at 14:36

Another method without using pivot_table. Use np.where with groupby+agg:

# Assumes: import numpy as np; import pandas as pd
# Replace Condition with the city name for 'Good' rows and NaN otherwise,
# so that nunique counts only the unique good cities
df['Condition'] = np.where(df['Condition'] == 'Good', df['City'], np.nan)
df = (df.groupby('Area')
        .agg({'City': 'nunique', 'Condition': 'nunique', 'Population': 'sum'})
        .rename(columns={'City': 'city_count', 'Condition': 'good_city_count'}))
df.loc['All', :] = df.sum()
df = df.astype(int).reset_index()

print(df)
  Area  city_count  good_city_count  Population
0    A           4                2         940
1    B           1                1          50
2    C           1                1         170
3    D           1                1          80
4  All           7                5        1240
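
One caveat when the "All" row is built by summing the per-area rows: the unique counts are computed within each area, so a city name that occurs in two areas is counted once per area. The sketch below uses hypothetical data (city "X" placed in both area A and area B, which does not happen in the question's data) to compare the summed total with a count over the whole frame. The names per_area, summed_total and global_total are illustrative only.

```python
import pandas as pd

# Hypothetical data: city "X" appears in both area A and area B.
df = pd.DataFrame({
    "Area": ["A", "A", "B", "B"],
    "City": ["X", "Y", "X", "Z"],
    "Condition": ["Good", "Bad", "Good", "Good"],
})

# City name where Condition is "Good", missing elsewhere.
good_city = df["City"].where(df["Condition"] == "Good")

per_area = (
    df.assign(good_city=good_city)
      .groupby("Area")
      .agg(city_count=("City", "nunique"),
           good_city_count=("good_city", "nunique"))
)

# Total obtained by adding the per-area rows: "X" is counted once per area.
summed_total = per_area.sum()

# Total computed over the whole frame: each city name is counted once.
global_total = pd.Series({
    "city_count": df["City"].nunique(),
    "good_city_count": good_city.nunique(),
})

print(summed_total.to_dict())
print(global_total.to_dict())
```

With the question's data both totals coincide, because no city name is shared between areas; which of the two is the intended "All" value depends on how cities are identified.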






answered Nov 10 at 13:48 by Sandeep Kadapa, edited Nov 11 at 2:44

• When using this, the good city count will not be unique, as it will be summing all the duplicate city appearances for the Area as well. – Pratiek Malhotra, Nov 10 at 20:48
• @PratiekMalhotra Check the update. – Sandeep Kadapa, Nov 11 at 2:46
