Return values based on column separated by comma and two other columns










2














I have a table below:



My Table



I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.



Result



I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota










share|improve this question























  • Anybody with answers?
    – UJAY
    Nov 13 '18 at 17:16















2














I have a table below:



My Table



I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.



Result



I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota










share|improve this question























  • Anybody with answers?
    – UJAY
    Nov 13 '18 at 17:16













2












2








2







I have a table below:



My Table



I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.



Result



I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota










share|improve this question















I have a table below:



My Table



I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.



Result



I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota







python pandas






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 13 '18 at 0:48







UJAY

















asked Nov 12 '18 at 23:22









UJAYUJAY

374




374











  • Anybody with answers?
    – UJAY
    Nov 13 '18 at 17:16
















  • Anybody with answers?
    – UJAY
    Nov 13 '18 at 17:16















Anybody with answers?
– UJAY
Nov 13 '18 at 17:16




Anybody with answers?
– UJAY
Nov 13 '18 at 17:16












2 Answers
2






active

oldest

votes


















1














You can use:



df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])

df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota


Details:



First seelct all columns without first 2 by iloc and create boolean mask - compare by ne (!=):



print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True


Then use matrix multiplication by dot with columns names with separator:



print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object


And remove last separator by rstrip:



print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object


If not possible select by position because positions should be changed of columns without 0,1 is possible use drop for remove unecessary columns:



df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')





share|improve this answer




























    0














    df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])


    Creates the following DataFrame:



     a b
    0 0 0
    1 0 1
    2 1 0
    3 1 1


    You can add a new column with a list of column names equal to zero with:



    df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)


    Output:



     a b evaluated
    0 0 0 a,b
    1 0 1 a
    2 1 0 b
    3 1 1





    share|improve this answer




















      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271542%2freturn-values-based-on-column-separated-by-comma-and-two-other-columns%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      1














      You can use:



      df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
      'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
      'Ferrari, Toyota','Ferrari, Toyota'],
      'Ferrari': [1,1,0,0,1,1],
      'Toyota': [0,1,1,0,1,0])

      df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
      print (df)
      Names Cars Ferrari Toyota Evaluation1
      0 John Ferrari, Toyota 1 0 Toyota
      1 John Ferrari 1 1
      2 Peter Ferrari, Toyota 0 1 Ferrari
      3 Peter Ferrari 0 0 Ferrari, Toyota
      4 Sam Ferrari, Toyota 1 1
      5 Sam Ferrari, Toyota 1 0 Toyota


      Details:



      First seelct all columns without first 2 by iloc and create boolean mask - compare by ne (!=):



      print (df.iloc[:, 2:].ne(1))
      Ferrari Toyota
      0 False True
      1 False False
      2 True False
      3 True True
      4 False False
      5 False True


      Then use matrix multiplication by dot with columns names with separator:



      print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
      0 Toyota,
      1
      2 Ferrari,
      3 Ferrari, Toyota,
      4
      5 Toyota,
      dtype: object


      And remove last separator by rstrip:



      print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
      0 Toyota
      1
      2 Ferrari
      3 Ferrari, Toyota
      4
      5 Toyota
      dtype: object


      If not possible select by position because positions should be changed of columns without 0,1 is possible use drop for remove unecessary columns:



      df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
      df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')





      share|improve this answer

























        1














        You can use:



        df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
        'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
        'Ferrari, Toyota','Ferrari, Toyota'],
        'Ferrari': [1,1,0,0,1,1],
        'Toyota': [0,1,1,0,1,0])

        df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
        print (df)
        Names Cars Ferrari Toyota Evaluation1
        0 John Ferrari, Toyota 1 0 Toyota
        1 John Ferrari 1 1
        2 Peter Ferrari, Toyota 0 1 Ferrari
        3 Peter Ferrari 0 0 Ferrari, Toyota
        4 Sam Ferrari, Toyota 1 1
        5 Sam Ferrari, Toyota 1 0 Toyota


        Details:



        First seelct all columns without first 2 by iloc and create boolean mask - compare by ne (!=):



        print (df.iloc[:, 2:].ne(1))
        Ferrari Toyota
        0 False True
        1 False False
        2 True False
        3 True True
        4 False False
        5 False True


        Then use matrix multiplication by dot with columns names with separator:



        print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
        0 Toyota,
        1
        2 Ferrari,
        3 Ferrari, Toyota,
        4
        5 Toyota,
        dtype: object


        And remove last separator by rstrip:



        print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
        0 Toyota
        1
        2 Ferrari
        3 Ferrari, Toyota
        4
        5 Toyota
        dtype: object


        If not possible select by position because positions should be changed of columns without 0,1 is possible use drop for remove unecessary columns:



        df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
        df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')





        share|improve this answer























          1












          1








          1






          You can use:



          df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
          'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
          'Ferrari, Toyota','Ferrari, Toyota'],
          'Ferrari': [1,1,0,0,1,1],
          'Toyota': [0,1,1,0,1,0])

          df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
          print (df)
          Names Cars Ferrari Toyota Evaluation1
          0 John Ferrari, Toyota 1 0 Toyota
          1 John Ferrari 1 1
          2 Peter Ferrari, Toyota 0 1 Ferrari
          3 Peter Ferrari 0 0 Ferrari, Toyota
          4 Sam Ferrari, Toyota 1 1
          5 Sam Ferrari, Toyota 1 0 Toyota


          Details:



          First seelct all columns without first 2 by iloc and create boolean mask - compare by ne (!=):



          print (df.iloc[:, 2:].ne(1))
          Ferrari Toyota
          0 False True
          1 False False
          2 True False
          3 True True
          4 False False
          5 False True


          Then use matrix multiplication by dot with columns names with separator:



          print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
          0 Toyota,
          1
          2 Ferrari,
          3 Ferrari, Toyota,
          4
          5 Toyota,
          dtype: object


          And remove last separator by rstrip:



          print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
          0 Toyota
          1
          2 Ferrari
          3 Ferrari, Toyota
          4
          5 Toyota
          dtype: object


          If not possible select by position because positions should be changed of columns without 0,1 is possible use drop for remove unecessary columns:



          df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
          df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')





          share|improve this answer












          You can use:



          df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
          'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
          'Ferrari, Toyota','Ferrari, Toyota'],
          'Ferrari': [1,1,0,0,1,1],
          'Toyota': [0,1,1,0,1,0])

          df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
          print (df)
          Names Cars Ferrari Toyota Evaluation1
          0 John Ferrari, Toyota 1 0 Toyota
          1 John Ferrari 1 1
          2 Peter Ferrari, Toyota 0 1 Ferrari
          3 Peter Ferrari 0 0 Ferrari, Toyota
          4 Sam Ferrari, Toyota 1 1
          5 Sam Ferrari, Toyota 1 0 Toyota


          Details:



          First seelct all columns without first 2 by iloc and create boolean mask - compare by ne (!=):



          print (df.iloc[:, 2:].ne(1))
          Ferrari Toyota
          0 False True
          1 False False
          2 True False
          3 True True
          4 False False
          5 False True


          Then use matrix multiplication by dot with columns names with separator:



          print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
          0 Toyota,
          1
          2 Ferrari,
          3 Ferrari, Toyota,
          4
          5 Toyota,
          dtype: object


          And remove last separator by rstrip:



          print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
          0 Toyota
          1
          2 Ferrari
          3 Ferrari, Toyota
          4
          5 Toyota
          dtype: object


          If not possible select by position because positions should be changed of columns without 0,1 is possible use drop for remove unecessary columns:



          df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
          df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 15 '18 at 6:21









          jezraeljezrael

          322k23265342




          322k23265342























              0














              df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])


              Creates the following DataFrame:



               a b
              0 0 0
              1 0 1
              2 1 0
              3 1 1


              You can add a new column with a list of column names equal to zero with:



              df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)


              Output:



               a b evaluated
              0 0 0 a,b
              1 0 1 a
              2 1 0 b
              3 1 1





              share|improve this answer

























                0














                df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])


                Creates the following DataFrame:



                 a b
                0 0 0
                1 0 1
                2 1 0
                3 1 1


                You can add a new column with a list of column names equal to zero with:



                df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)


                Output:



                 a b evaluated
                0 0 0 a,b
                1 0 1 a
                2 1 0 b
                3 1 1





                share|improve this answer























                  0












                  0








                  0






                  df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])


                  Creates the following DataFrame:



                   a b
                  0 0 0
                  1 0 1
                  2 1 0
                  3 1 1


                  You can add a new column with a list of column names equal to zero with:



                  df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)


                  Output:



                   a b evaluated
                  0 0 0 a,b
                  1 0 1 a
                  2 1 0 b
                  3 1 1





                  share|improve this answer












                  df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])


                  Creates the following DataFrame:



                   a b
                  0 0 0
                  1 0 1
                  2 1 0
                  3 1 1


                  You can add a new column with a list of column names equal to zero with:



                  df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)


                  Output:



                   a b evaluated
                  0 0 0 a,b
                  1 0 1 a
                  2 1 0 b
                  3 1 1






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 12 '18 at 23:40









                  miles82miles82

                  4,8303023




                  4,8303023



























                      draft saved

                      draft discarded
















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.





                      Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                      Please pay close attention to the following guidance:


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271542%2freturn-values-based-on-column-separated-by-comma-and-two-other-columns%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      這個網誌中的熱門文章

                      How to read a connectionString WITH PROVIDER in .NET Core?

                      In R, how to develop a multiplot heatmap.2 figure showing key labels successfully

                      Museum of Modern and Contemporary Art of Trento and Rovereto