Return values based on column separated by comma and two other columns
I have a table below:
I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.
I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota
python pandas
add a comment |
I have a table below:
I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.
I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota
python pandas
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16
add a comment |
I have a table below:
I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.
I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota
python pandas
I have a table below:
I want to add a column (Evaluation) that returns one of the elements in the cars columns (separated by comma). The element to return will be based on the Ferrari and Toyota columns. The Evaluation column returns the element that the individual does not have. So take the first row, for example, John has one Ferrari and no Toyota. Since John has no Toyota, the evaluation column returns Toyota.
I would prefer to have the decision made using the cars column, separating the text by comma and looking up the text against the values under Ferrari and Toyota
python pandas
python pandas
edited Nov 13 '18 at 0:48
UJAY
asked Nov 12 '18 at 23:22
UJAYUJAY
374
374
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16
add a comment |
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16
add a comment |
2 Answers
2
active
oldest
votes
You can use:
df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])
df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota
Details:
First seelct all columns without first 2 by iloc
and create boolean mask - compare by ne
(!=
):
print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True
Then use matrix multiplication by dot
with columns names with separator:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object
And remove last separator by rstrip
:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object
If not possible select by position because positions should be changed of columns without 0,1
is possible use drop
for remove unecessary columns:
df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')
add a comment |
df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])
Creates the following DataFrame:
a b
0 0 0
1 0 1
2 1 0
3 1 1
You can add a new column with a list of column names equal to zero with:
df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)
Output:
a b evaluated
0 0 0 a,b
1 0 1 a
2 1 0 b
3 1 1
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271542%2freturn-values-based-on-column-separated-by-comma-and-two-other-columns%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can use:
df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])
df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota
Details:
First seelct all columns without first 2 by iloc
and create boolean mask - compare by ne
(!=
):
print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True
Then use matrix multiplication by dot
with columns names with separator:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object
And remove last separator by rstrip
:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object
If not possible select by position because positions should be changed of columns without 0,1
is possible use drop
for remove unecessary columns:
df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')
add a comment |
You can use:
df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])
df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota
Details:
First seelct all columns without first 2 by iloc
and create boolean mask - compare by ne
(!=
):
print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True
Then use matrix multiplication by dot
with columns names with separator:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object
And remove last separator by rstrip
:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object
If not possible select by position because positions should be changed of columns without 0,1
is possible use drop
for remove unecessary columns:
df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')
add a comment |
You can use:
df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])
df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota
Details:
First seelct all columns without first 2 by iloc
and create boolean mask - compare by ne
(!=
):
print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True
Then use matrix multiplication by dot
with columns names with separator:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object
And remove last separator by rstrip
:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object
If not possible select by position because positions should be changed of columns without 0,1
is possible use drop
for remove unecessary columns:
df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')
You can use:
df = pd.DataFrame('Names':['John'] * 2 + ['Peter'] * 2 + ['Sam'] * 2,
'Cars':['Ferrari, Toyota','Ferrari','Ferrari, Toyota','Ferrari',
'Ferrari, Toyota','Ferrari, Toyota'],
'Ferrari': [1,1,0,0,1,1],
'Toyota': [0,1,1,0,1,0])
df['Evaluation1'] = df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', ')
print (df)
Names Cars Ferrari Toyota Evaluation1
0 John Ferrari, Toyota 1 0 Toyota
1 John Ferrari 1 1
2 Peter Ferrari, Toyota 0 1 Ferrari
3 Peter Ferrari 0 0 Ferrari, Toyota
4 Sam Ferrari, Toyota 1 1
5 Sam Ferrari, Toyota 1 0 Toyota
Details:
First seelct all columns without first 2 by iloc
and create boolean mask - compare by ne
(!=
):
print (df.iloc[:, 2:].ne(1))
Ferrari Toyota
0 False True
1 False False
2 True False
3 True True
4 False False
5 False True
Then use matrix multiplication by dot
with columns names with separator:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', '))
0 Toyota,
1
2 Ferrari,
3 Ferrari, Toyota,
4
5 Toyota,
dtype: object
And remove last separator by rstrip
:
print (df.iloc[:, 2:].ne(1).dot(df.columns[2:] + ', ').str.rstrip(', '))
0 Toyota
1
2 Ferrari
3 Ferrari, Toyota
4
5 Toyota
dtype: object
If not possible select by position because positions should be changed of columns without 0,1
is possible use drop
for remove unecessary columns:
df1 = df.drop(['Names','Ferrari'], axis=1).ne(1)
df['Evaluation2'] = df1.dot(df1.columns + ', ').str.rstrip(', ')
answered Nov 15 '18 at 6:21
jezraeljezrael
322k23265342
322k23265342
add a comment |
add a comment |
df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])
Creates the following DataFrame:
a b
0 0 0
1 0 1
2 1 0
3 1 1
You can add a new column with a list of column names equal to zero with:
df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)
Output:
a b evaluated
0 0 0 a,b
1 0 1 a
2 1 0 b
3 1 1
add a comment |
df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])
Creates the following DataFrame:
a b
0 0 0
1 0 1
2 1 0
3 1 1
You can add a new column with a list of column names equal to zero with:
df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)
Output:
a b evaluated
0 0 0 a,b
1 0 1 a
2 1 0 b
3 1 1
add a comment |
df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])
Creates the following DataFrame:
a b
0 0 0
1 0 1
2 1 0
3 1 1
You can add a new column with a list of column names equal to zero with:
df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)
Output:
a b evaluated
0 0 0 a,b
1 0 1 a
2 1 0 b
3 1 1
df = pd.DataFrame('a': [0,0,1,1], 'b': [0,1,0,1])
Creates the following DataFrame:
a b
0 0 0
1 0 1
2 1 0
3 1 1
You can add a new column with a list of column names equal to zero with:
df['evaluated'] = df.apply(lambda x: ','.join(df.columns[x == 0]), axis=1)
Output:
a b evaluated
0 0 0 a,b
1 0 1 a
2 1 0 b
3 1 1
answered Nov 12 '18 at 23:40
miles82miles82
4,8303023
4,8303023
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271542%2freturn-values-based-on-column-separated-by-comma-and-two-other-columns%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Anybody with answers?
– UJAY
Nov 13 '18 at 17:16