Iterating over rules using if else

I have a main table with the rates of each product and their respective categories for package and risk.

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

On the second table, I get the user options, and the user will be able to create as many rules as he wants for these products based on the other product rates. And these rules can apply for only a certain package or risk bin.

So, for the example below, the product B will have the rate of product A plus 5% only for basic package and good/mid risk. The product C will have the rates of D plus 10% for all packages and only for bad risk.

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk': 0:'good/mid', 1:'bad')
 rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

Because I can have as many rules as the user wants I need to create a loop, and then pass the values accordingly to the relationships defined.

df2 = df.reset_index()

rules_nc = rules['rule'].get_values()
nc_cnt = rules_nc.size 

for i in range(nc_cnt):
 if pd.isnull(rules['rule'][i]):
 break
 product_1 = rules['product1'][i]
 product_2 = rules['product2'][i]
 sym = str(rules['symbol'][i])
 val = rules['value'][i]
 pack= rules['package'][i]
 risk = rules['risk'][i] 

if (df2['risk_bin']==risk) & (df2['package']==pack):
 if sym=='+':
 df2[product_1] = df2[product_2] + val
 if sym=='-':
 df2[product_1] = df2[product_2] - val 
else:
 df2[product_1] = df2[product_1]

When I do this, I get the error below:

 The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This is the output that I'm expecting for this set of rules.

 results = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.062,1:0.023,2:0.033,3:0.1,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.153,4:0.163,5:0.173,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)

results = results[results.columns[[4,5,0,1,2,3]]]

Can you please help me?
Thanks a lot!

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

1

Can you provide the output you'd expect given the input you provided? This is likely a job for pandas.merge() rather than a iteration: pandas.pydata.org/pandas-docs/stable/generated/…

– smj
Nov 14 '18 at 0:10

@smj the rules and number of rules can change accordingly to the user, how I suppose to use pandas.merge() to solve this? Can you give an example? I just added the output expected. Thanks

– Ingrid
Nov 14 '18 at 14:17

Could you try to explain in words what is your loop trying to do? It's not quite clear to me.

– user32185
Nov 14 '18 at 14:36

@user32185 just added in words. thanks

– Ingrid
Nov 14 '18 at 14:43

If you want I can share a solution that works per batch of rules. But it could lead to weird results if the rules are not univocal.

– user32185
Nov 14 '18 at 15:52

|
show 1 more comment

I have a main table with the rates of each product and their respective categories for package and risk.

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk': 0:'good/mid', 1:'bad')
 rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

Because I can have as many rules as the user wants I need to create a loop, and then pass the values accordingly to the relationships defined.

df2 = df.reset_index()

rules_nc = rules['rule'].get_values()
nc_cnt = rules_nc.size 

for i in range(nc_cnt):
 if pd.isnull(rules['rule'][i]):
 break
 product_1 = rules['product1'][i]
 product_2 = rules['product2'][i]
 sym = str(rules['symbol'][i])
 val = rules['value'][i]
 pack= rules['package'][i]
 risk = rules['risk'][i] 

if (df2['risk_bin']==risk) & (df2['package']==pack):
 if sym=='+':
 df2[product_1] = df2[product_2] + val
 if sym=='-':
 df2[product_1] = df2[product_2] - val 
else:
 df2[product_1] = df2[product_1]

When I do this, I get the error below:

 The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This is the output that I'm expecting for this set of rules.

 results = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.062,1:0.023,2:0.033,3:0.1,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.153,4:0.163,5:0.173,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)

results = results[results.columns[[4,5,0,1,2,3]]]

Can you please help me?
Thanks a lot!

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

1

Can you provide the output you'd expect given the input you provided? This is likely a job for pandas.merge() rather than a iteration: pandas.pydata.org/pandas-docs/stable/generated/…

– smj
Nov 14 '18 at 0:10

@smj the rules and number of rules can change accordingly to the user, how I suppose to use pandas.merge() to solve this? Can you give an example? I just added the output expected. Thanks

– Ingrid
Nov 14 '18 at 14:17

Could you try to explain in words what is your loop trying to do? It's not quite clear to me.

– user32185
Nov 14 '18 at 14:36

@user32185 just added in words. thanks

– Ingrid
Nov 14 '18 at 14:43

If you want I can share a solution that works per batch of rules. But it could lead to weird results if the rules are not univocal.

– user32185
Nov 14 '18 at 15:52

|
show 1 more comment

I have a main table with the rates of each product and their respective categories for package and risk.

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk': 0:'good/mid', 1:'bad')
 rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

Because I can have as many rules as the user wants I need to create a loop, and then pass the values accordingly to the relationships defined.

df2 = df.reset_index()

rules_nc = rules['rule'].get_values()
nc_cnt = rules_nc.size 

for i in range(nc_cnt):
 if pd.isnull(rules['rule'][i]):
 break
 product_1 = rules['product1'][i]
 product_2 = rules['product2'][i]
 sym = str(rules['symbol'][i])
 val = rules['value'][i]
 pack= rules['package'][i]
 risk = rules['risk'][i] 

if (df2['risk_bin']==risk) & (df2['package']==pack):
 if sym=='+':
 df2[product_1] = df2[product_2] + val
 if sym=='-':
 df2[product_1] = df2[product_2] - val 
else:
 df2[product_1] = df2[product_1]

When I do this, I get the error below:

 The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This is the output that I'm expecting for this set of rules.

 results = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.062,1:0.023,2:0.033,3:0.1,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.153,4:0.163,5:0.173,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)

results = results[results.columns[[4,5,0,1,2,3]]]

Can you please help me?
Thanks a lot!

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

I have a main table with the rates of each product and their respective categories for package and risk.

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk': 0:'good/mid', 1:'bad')
 rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

Because I can have as many rules as the user wants I need to create a loop, and then pass the values accordingly to the relationships defined.

df2 = df.reset_index()

rules_nc = rules['rule'].get_values()
nc_cnt = rules_nc.size 

for i in range(nc_cnt):
 if pd.isnull(rules['rule'][i]):
 break
 product_1 = rules['product1'][i]
 product_2 = rules['product2'][i]
 sym = str(rules['symbol'][i])
 val = rules['value'][i]
 pack= rules['package'][i]
 risk = rules['risk'][i] 

if (df2['risk_bin']==risk) & (df2['package']==pack):
 if sym=='+':
 df2[product_1] = df2[product_2] + val
 if sym=='-':
 df2[product_1] = df2[product_2] - val 
else:
 df2[product_1] = df2[product_1]

When I do this, I get the error below:

 The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This is the output that I'm expecting for this set of rules.

 results = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.062,1:0.023,2:0.033,3:0.1,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.153,4:0.163,5:0.173,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)

results = results[results.columns[[4,5,0,1,2,3]]]

Can you please help me?
Thanks a lot!

python pandas loops if-statement

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

edited Nov 14 '18 at 14:42

asked Nov 13 '18 at 23:21

Ingrid

226

asked Nov 13 '18 at 23:21

Ingrid

226

asked Nov 13 '18 at 23:21

Ingrid

226

1

Can you provide the output you'd expect given the input you provided? This is likely a job for pandas.merge() rather than a iteration: pandas.pydata.org/pandas-docs/stable/generated/…

– smj
Nov 14 '18 at 0:10

@smj the rules and number of rules can change accordingly to the user, how I suppose to use pandas.merge() to solve this? Can you give an example? I just added the output expected. Thanks

– Ingrid
Nov 14 '18 at 14:17

Could you try to explain in words what is your loop trying to do? It's not quite clear to me.

– user32185
Nov 14 '18 at 14:36

@user32185 just added in words. thanks

– Ingrid
Nov 14 '18 at 14:43

If you want I can share a solution that works per batch of rules. But it could lead to weird results if the rules are not univocal.

– user32185
Nov 14 '18 at 15:52

|
show 1 more comment

1

Can you provide the output you'd expect given the input you provided? This is likely a job for pandas.merge() rather than a iteration: pandas.pydata.org/pandas-docs/stable/generated/…

– smj
Nov 14 '18 at 0:10

@smj the rules and number of rules can change accordingly to the user, how I suppose to use pandas.merge() to solve this? Can you give an example? I just added the output expected. Thanks

– Ingrid
Nov 14 '18 at 14:17

Could you try to explain in words what is your loop trying to do? It's not quite clear to me.

– user32185
Nov 14 '18 at 14:36

@user32185 just added in words. thanks

– Ingrid
Nov 14 '18 at 14:43

If you want I can share a solution that works per batch of rules. But it could lead to weird results if the rules are not univocal.

– user32185
Nov 14 '18 at 15:52

Can you provide the output you'd expect given the input you provided? This is likely a job for pandas.merge() rather than a iteration: pandas.pydata.org/pandas-docs/stable/generated/…

– smj
Nov 14 '18 at 0:10

@smj the rules and number of rules can change accordingly to the user, how I suppose to use pandas.merge() to solve this? Can you give an example? I just added the output expected. Thanks

– Ingrid
Nov 14 '18 at 14:17

Could you try to explain in words what is your loop trying to do? It's not quite clear to me.

– user32185
Nov 14 '18 at 14:36

@user32185 just added in words. thanks

– Ingrid
Nov 14 '18 at 14:43

If you want I can share a solution that works per batch of rules. But it could lead to weird results if the rules are not univocal.

– user32185
Nov 14 '18 at 15:52

|
show 1 more comment

1 Answer
1

active

oldest

votes

This is one possible solution. Is not ideal as it using apply, which is faster than a loop but not as fast as a vectorial solution. I renamed risk as risk_bin in rules.

import pandas as pd

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk_bin': 0:'good/mid', 1:'bad')
rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

def fun(row):
 if row["symbol"] == "+":
 row[row["product1"]] = row[row["product2"]] + row["value"]
 else:
 row[row["product1"]] = row[row["product2"]] - row["value"]
 return row

# here you look for all the rows where rules match with the given columns
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# here you what a rule for `all` package
df2 = pd.merge(df.reset_index(),
 rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
 on=["risk_bin"])
# now you apply the function to both df
df1 = df1.apply(lambda x: fun(x), axis=1)
df2 = df2.apply(lambda x: fun(x), axis=1)

#select the indices in df1 and df2
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())

#concat all together
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]],ignore_index=True)

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53290946%2fiterating-over-rules-using-if-else%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

This is one possible solution. Is not ideal as it using apply, which is faster than a loop but not as fast as a vectorial solution. I renamed risk as risk_bin in rules.

import pandas as pd

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk_bin': 0:'good/mid', 1:'bad')
rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

def fun(row):
 if row["symbol"] == "+":
 row[row["product1"]] = row[row["product2"]] + row["value"]
 else:
 row[row["product1"]] = row[row["product2"]] - row["value"]
 return row

# here you look for all the rows where rules match with the given columns
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# here you what a rule for `all` package
df2 = pd.merge(df.reset_index(),
 rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
 on=["risk_bin"])
# now you apply the function to both df
df1 = df1.apply(lambda x: fun(x), axis=1)
df2 = df2.apply(lambda x: fun(x), axis=1)

#select the indices in df1 and df2
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())

#concat all together
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]],ignore_index=True)

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

add a comment |

This is one possible solution. Is not ideal as it using apply, which is faster than a loop but not as fast as a vectorial solution. I renamed risk as risk_bin in rules.

import pandas as pd

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk_bin': 0:'good/mid', 1:'bad')
rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

def fun(row):
 if row["symbol"] == "+":
 row[row["product1"]] = row[row["product2"]] + row["value"]
 else:
 row[row["product1"]] = row[row["product2"]] - row["value"]
 return row

# here you look for all the rows where rules match with the given columns
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# here you what a rule for `all` package
df2 = pd.merge(df.reset_index(),
 rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
 on=["risk_bin"])
# now you apply the function to both df
df1 = df1.apply(lambda x: fun(x), axis=1)
df2 = df2.apply(lambda x: fun(x), axis=1)

#select the indices in df1 and df2
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())

#concat all together
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]],ignore_index=True)

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

add a comment |

This is one possible solution. Is not ideal as it using apply, which is faster than a loop but not as fast as a vectorial solution. I renamed risk as risk_bin in rules.

import pandas as pd

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk_bin': 0:'good/mid', 1:'bad')
rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

def fun(row):
 if row["symbol"] == "+":
 row[row["product1"]] = row[row["product2"]] + row["value"]
 else:
 row[row["product1"]] = row[row["product2"]] - row["value"]
 return row

# here you look for all the rows where rules match with the given columns
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# here you what a rule for `all` package
df2 = pd.merge(df.reset_index(),
 rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
 on=["risk_bin"])
# now you apply the function to both df
df1 = df1.apply(lambda x: fun(x), axis=1)
df2 = df2.apply(lambda x: fun(x), axis=1)

#select the indices in df1 and df2
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())

#concat all together
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]],ignore_index=True)

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

This is one possible solution. Is not ideal as it using apply, which is faster than a loop but not as fast as a vectorial solution. I renamed risk as risk_bin in rules.

import pandas as pd

df = pd.DataFrame('package': 0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium',
 'risk_bin': 0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad',
 'A': 0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07,
 'B': 0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071,
 'C': 0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072,
 'D': 0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073)
df = df[df.columns[[4,5,0,1,2,3]]]

rules = pd.DataFrame('rule': 0: '1', 1: '2',
 'product1': 0: 'B', 1: 'C',
 'relantioship': 0:'=',1:'=',
 'product2': 0:'A',1:'D',
 'symbol': 0:'+',1:'-',
 'value': 0:0.05,1:0.10,
 'package':0:'basic',1:'all',
 'risk_bin': 0:'good/mid', 1:'bad')
rules = rules[rules.columns[[5,1,3,2,6,7,0,4]]]

def fun(row):
 if row["symbol"] == "+":
 row[row["product1"]] = row[row["product2"]] + row["value"]
 else:
 row[row["product1"]] = row[row["product2"]] - row["value"]
 return row

# here you look for all the rows where rules match with the given columns
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# here you what a rule for `all` package
df2 = pd.merge(df.reset_index(),
 rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
 on=["risk_bin"])
# now you apply the function to both df
df1 = df1.apply(lambda x: fun(x), axis=1)
df2 = df2.apply(lambda x: fun(x), axis=1)

#select the indices in df1 and df2
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())

#concat all together
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]],ignore_index=True)

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

edited Nov 14 '18 at 17:29

answered Nov 14 '18 at 17:26

user32185

1,3611824

answered Nov 14 '18 at 17:26

user32185

1,3611824

answered Nov 14 '18 at 17:26

user32185

1,3611824

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

add a comment |

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

Thanks a lot @user32185

– Ingrid
Nov 14 '18 at 17:55

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

uhdMvhYHEPqN6EIOtLHvO sdQ,N1Ht EZOSxpuEhH7KyiGt

搜尋此網誌

Odtnhj