Calculate percentages after joining 2 tables
I have two tables. For each book, I'd like to see the total number of customers and how many of those customers are male and how many are female. From those counts, I'd like to compute the percentage of customers of each gender per book.
Here is the Customer table:
Cus_Id  Fname  Gender
12      Sam    male
13      Waqas  female
14      Sim    male
15      Rwan   female
Here is the Books_cust table:
Book_Id  Cus_Id  Rating
348      12      5
342      13      8
323      13      4
434      15      9
Here is the code I have so far:
LOAD1 = LOAD '/user/Customer.txt' USING PigStorage() AS (Cus_Id:int, Fname:chararray, Gender:chararray);
LOAD2 = LOAD '/user/Books_cust.txt' USING PigStorage() AS (Book_Id:int, Cus_Id:int, Rating:int);
JOIN1 = JOIN LOAD1 BY Cus_Id, LOAD2 BY Cus_Id;
GROUP1 = GROUP JOIN1 BY (Book_Id, Gender);
GENERATE1 = FOREACH GROUP1 GENERATE FLATTEN(group), COUNT(JOIN1);
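One possible way to get from these counts to percentages is sketched below. It is only a sketch under some assumptions: the input files are tab-delimited with no header row, each customer rates a given book at most once (so counting joined rows equals counting customers), and "percentage" means the share of a book's customers that have a given gender. The relation and field names (customers, ratings, pairs, gender_cnt, total_cnt, pct) are made up for illustration and are not part of the script above.

```pig
-- Load both tables. PigStorage() with no argument splits on tabs (assumed file format).
customers = LOAD '/user/Customer.txt' USING PigStorage() AS (Cus_Id:int, Fname:chararray, Gender:chararray);
ratings   = LOAD '/user/Books_cust.txt' USING PigStorage() AS (Book_Id:int, Cus_Id:int, Rating:int);

-- Inner join on the customer id, then keep only the two columns needed for counting.
joined = JOIN customers BY Cus_Id, ratings BY Cus_Id;
pairs  = FOREACH joined GENERATE ratings::Book_Id AS Book_Id, customers::Gender AS Gender;

-- Number of customers per (book, gender).
by_book_gender = GROUP pairs BY (Book_Id, Gender);
gender_counts  = FOREACH by_book_gender GENERATE
                     FLATTEN(group) AS (Book_Id, Gender),
                     COUNT(pairs) AS gender_cnt;

-- Total number of customers per book.
by_book     = GROUP pairs BY Book_Id;
book_totals = FOREACH by_book GENERATE group AS Book_Id, COUNT(pairs) AS total_cnt;

-- Attach each book's total to its per-gender rows and compute the percentage.
-- Multiplying by 100.0 first forces floating-point division.
with_totals = JOIN gender_counts BY Book_Id, book_totals BY Book_Id;
percentages = FOREACH with_totals GENERATE
                  gender_counts::Book_Id    AS Book_Id,
                  gender_counts::Gender     AS Gender,
                  gender_counts::gender_cnt AS gender_cnt,
                  book_totals::total_cnt    AS total_cnt,
                  100.0 * gender_counts::gender_cnt / book_totals::total_cnt AS pct;

DUMP percentages;
```

With the sample rows shown above, every book has exactly one rating, so each (book, gender) row would have a count of 1 out of a total of 1, i.e. 100 percent. Customer 14 has no ratings and drops out of the inner join.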
apache-pig
Please specify the expected output for your tables so that it is clear what is required.
– Rajeev Atmakuri
Nov 11 at 10:01
asked Nov 10 at 20:02
Fahed