How to extract the characters from a string that are inside parentheses?
Picture of the DataFrame:
I have one column named contracting and another named contractor inside a DataFrame.
I need to split a column, for example contractor, into two new columns: one containing the Fiscal Number that is inside the parentheses and another containing everything else (the description).
Example:
Contractor: Meo(504615947)
I need it to become:
Contractor_Name: Meo and Contractor_Number:504615947
I tried to do this:
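# Split on the first "(" into description and the remainder
proc2013[['contractor_description', 'contractor_NIF']] = proc2013['contractor'].str.split('(', n=1, expand=True)
# Keep only the first run of digits from the remainder
proc2013['contractor_NIF'] = proc2013['contractor_NIF'].str.extract(r'(\d+)', expand=False)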
Problem 1:
The name can itself contain a description in parentheses, followed by the number I am trying to extract, for example: Trabalhadores em Funções Públicas (ADSE) (600000303).
Problem 2:
If the contractor is from a foreign country, the Fiscal Number sometimes begins with a letter, so it does not consist only of digits as I assumed in my second line of code.
All Fiscal Numbers have 9 digits.
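For reference, the two problems can be reproduced in text form with a small sketch like the one below. The first two strings are quoted in this thread; the third row ("Acme Ltd" with a number starting with a letter) is a made-up illustration of the foreign-contractor case, and the variable names `parts`, `description` and `nif` are only for this example.

```python
import pandas as pd

# Sample rows: the first two strings appear in the thread; the third
# (a foreign-style number starting with a letter) is a hypothetical example.
proc2013 = pd.DataFrame({
    "contractor": [
        "Meo(504615947)",
        "Trabalhadores em Funções Públicas (ADSE) (600000303)",
        "Acme Ltd (A12345678)",
    ]
})

# The attempt from the question: split on the first "(" and pull out digits.
parts = proc2013["contractor"].str.split("(", n=1, expand=True)
description = parts[0]
nif = parts[1].str.extract(r"(\d+)", expand=False)

print(description.tolist())
print(nif.tolist())
```

With this data the description of the second row loses its "(ADSE)" part (Problem 1), and the third number comes back as 12345678, without its leading letter (Problem 2).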
python pandas
Please give a proper Minimal, Complete, and Verifiable example, in text form.
– jonrsharpe
Nov 11 at 15:16
Please don't post images of your code, see meta.stackoverflow.com/questions/374700/…
– quant
Nov 11 at 15:29
Sorry, it's my first post. I will keep your suggestions in mind next time.
– jess
Nov 11 at 16:16
asked Nov 11 at 15:12 by jess; edited Nov 11 at 16:59 by Akash Ranjan
2 Answers
Accepted answer
As far as I could understand your question, this is one possible solution:
# Name: everything before the first "("
df['contractor_name'] = list(map(lambda x: x.split('(')[0], df['contractor']))
# Number: the 9 characters just before the final ")"
df['contractor_number'] = list(map(lambda x: x.split('(')[-1][-10:-1], df['contractor']))
Hope this helps.
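As a rough sketch of how the slicing above behaves on the two strings quoted in this thread, here is the same idea written with the pandas `.str` accessor. The `rsplit` variant for the name column is an assumption added for illustration, not part of the answer: it splits on the last "(" so that an earlier parenthesised part such as "(ADSE)" stays in the name.

```python
import pandas as pd

df = pd.DataFrame({
    "contractor": [
        "Meo(504615947)",
        "Trabalhadores em Funções Públicas (ADSE) (600000303)",
    ]
})

# Slicing as in the answer: take the text after the last "(" and keep
# the 9 characters that precede the closing ")".
df["contractor_number"] = df["contractor"].str.split("(").str[-1].str[-10:-1]

# Variant (an assumption, not from the answer): split on the LAST "(" only,
# so "(ADSE)" remains part of the name; strip trailing whitespace.
df["contractor_name"] = df["contractor"].str.rsplit("(", n=1).str[0].str.strip()

print(df[["contractor_name", "contractor_number"]])
```

The slice relies on every number being exactly 9 characters long and on the string ending with ")"; rows that break either assumption would need separate handling.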
Thanks for the help Akash! But I still have a problem. I have the following string: "Trabalhadores em Funções Públicas (ADSE) (600000303)". My goal is to split this into two new columns: Contractor_Name: "Trabalhadores em Funções Públicas (ADSE) " and Contractor_Number: 600000303. Using the solution you provided, I obtain Contractor_Number: "ADSE)". Can you still help me, please?
– jess
Nov 11 at 16:26
I have updated the code to give you the required contractor_number. Please confirm whether you need (ADSE) in your contractor_name as well.
– Akash Ranjan
Nov 11 at 16:36
It's working now exactly as I wanted. Thank you so so much Akash for your great help!!!!
– jess
Nov 11 at 16:44
No, thanks. Didn't know that. Keep coding :)
– Akash Ranjan
Nov 11 at 17:09
It seems to have worked now. I was able to click on the arrow to vote +1. Thanks again! :)
– jess
Nov 11 at 17:12
Second answer
You could change \d to \w to match any alphanumeric character, like:
proc2013['contractor_NIF'] = proc2013['contractor_NIF'].str.extract(r'(\w+)', expand=False)
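To illustrate the difference between taking the first alphanumeric group and anchoring the match at the end of the string, here is a minimal sketch. The second pattern is an assumption that goes beyond this answer: it relies on the statement in the question that every Fiscal Number is 9 characters long, and the third sample string is hypothetical.

```python
import pandas as pd

s = pd.Series([
    "Meo(504615947)",
    "Trabalhadores em Funções Públicas (ADSE) (600000303)",
    "Acme Ltd (A12345678)",  # hypothetical foreign-style number
])

# First parenthesised alphanumeric group: returns "ADSE" for the second row.
first_group = s.str.extract(r"\((\w+)\)", expand=False)

# Anchored variant: a 9-character group in parentheses at the very end
# of the string, so earlier groups such as "(ADSE)" are skipped.
last_group = s.str.extract(r"\((\w{9})\)\s*$", expand=False)

print(first_group.tolist())
print(last_group.tolist())
```

Passing `expand=False` makes `str.extract` return a Series for a single capture group, which can be assigned directly to one column.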
Thanks a lot Franco! Your comment helped me to solve my second problem. I still need to understand what to do when I have something like: Trabalhadores em Funções Públicas (ADSE) (600000303). I am getting ADSE as the fiscal number instead of 600000303.
– jess
Nov 11 at 15:41
Welcome! It would be good if you could clarify, with examples, which inputs you are having trouble with.
– Franco Piccolo
Nov 11 at 15:46
Akash already helped me to solve the problem. Thanks again Franco for the help and sorry if I wasn't very clear!
– jess
Nov 11 at 16:45
Accepted answer by Akash Ranjan: answered Nov 11 at 15:45, edited Nov 11 at 16:51
Second answer by Franco Piccolo: answered Nov 11 at 15:31, edited Nov 11 at 15:36