MySQL: Efficient way computing set powers of Venn-Diagram

up vote
3
down vote

favorite

Given the 4 tables, each containing items and representing one set, how to get the count of the items in each compartment required to draw a Venn diagram as shown below. The calculation should take place in the MySQL server avoiding transmission of items to the application server.

Example tables:

s1: s2: s3: s4:
+------+ +------+ +------+ +------+
| item | | item | | item | | item |
+------+ +------+ +------+ +------+
| a | | a | | a | | a |
+------+ +------+ +------+ +------+
| b | | b | | b | | c |
+------+ +------+ +------+ +------+
| c | | c | | d | | d |
+------+ +------+ +------+ +------+
| d | | e | | e | | e |
+------+ +------+ +------+ +------+
| ... | | ... | | ... | | ... |

Now, I think I would calculate some set powers. Some examples with I corresponding to s1, II to s2, III to s3 and IV to s4:

quadruple Venn Diagram - Venn diagram made from 4 sets using ellipses

If I reinterpret sx as being a set, I would write:

|s1 ∩ s2 ∩ s3 ∩ s4| - the white 25 in the center

|(s1 ∩ s2 ∩ s4) s3| - the white 15 below right in relation to the center

|(s1 ∩ s4) (s2 ∪ s3)| - the white 5 on the bottom

|s1 (s2 ∪ s3 ∪ s4)| - the dark blue 60 on the blue ground

... till 15.

How to calculate those powers efficiently on the MySQL server? Does MySQL provide a function aiding in the calculation?

A naive approach would be running a query for 1.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s3
INTERSECT
SELECT item FROM s4);

and another query for 2.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s4
EXCEPT
SELECT item FROM s3);

and so on, resulting in 15 queries.

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
– Rainer Rillke
Nov 10 at 0:41

1

There is no INTERSECT and EXCEPT in MySQL. So, you could use other RDBMS, which provides these features.
– Madhur Bhaiya
Nov 10 at 9:26

@MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
– Rainer Rillke
Nov 10 at 11:04

Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11
– Rainer Rillke
Nov 11 at 14:40

add a comment |

up vote
3
down vote

favorite

Example tables:

s1: s2: s3: s4:
+------+ +------+ +------+ +------+
| item | | item | | item | | item |
+------+ +------+ +------+ +------+
| a | | a | | a | | a |
+------+ +------+ +------+ +------+
| b | | b | | b | | c |
+------+ +------+ +------+ +------+
| c | | c | | d | | d |
+------+ +------+ +------+ +------+
| d | | e | | e | | e |
+------+ +------+ +------+ +------+
| ... | | ... | | ... | | ... |

Now, I think I would calculate some set powers. Some examples with I corresponding to s1, II to s2, III to s3 and IV to s4:

quadruple Venn Diagram - Venn diagram made from 4 sets using ellipses

If I reinterpret sx as being a set, I would write:

|s1 ∩ s2 ∩ s3 ∩ s4| - the white 25 in the center

|(s1 ∩ s2 ∩ s4) s3| - the white 15 below right in relation to the center

|(s1 ∩ s4) (s2 ∪ s3)| - the white 5 on the bottom

|s1 (s2 ∪ s3 ∪ s4)| - the dark blue 60 on the blue ground

... till 15.

How to calculate those powers efficiently on the MySQL server? Does MySQL provide a function aiding in the calculation?

A naive approach would be running a query for 1.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s3
INTERSECT
SELECT item FROM s4);

and another query for 2.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s4
EXCEPT
SELECT item FROM s3);

and so on, resulting in 15 queries.

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
– Rainer Rillke
Nov 10 at 0:41

1

There is no INTERSECT and EXCEPT in MySQL. So, you could use other RDBMS, which provides these features.
– Madhur Bhaiya
Nov 10 at 9:26

@MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
– Rainer Rillke
Nov 10 at 11:04

Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11
– Rainer Rillke
Nov 11 at 14:40

add a comment |

up vote
3
down vote

favorite

Example tables:

s1: s2: s3: s4:
+------+ +------+ +------+ +------+
| item | | item | | item | | item |
+------+ +------+ +------+ +------+
| a | | a | | a | | a |
+------+ +------+ +------+ +------+
| b | | b | | b | | c |
+------+ +------+ +------+ +------+
| c | | c | | d | | d |
+------+ +------+ +------+ +------+
| d | | e | | e | | e |
+------+ +------+ +------+ +------+
| ... | | ... | | ... | | ... |

Now, I think I would calculate some set powers. Some examples with I corresponding to s1, II to s2, III to s3 and IV to s4:

quadruple Venn Diagram - Venn diagram made from 4 sets using ellipses

If I reinterpret sx as being a set, I would write:

|s1 ∩ s2 ∩ s3 ∩ s4| - the white 25 in the center

|(s1 ∩ s2 ∩ s4) s3| - the white 15 below right in relation to the center

|(s1 ∩ s4) (s2 ∪ s3)| - the white 5 on the bottom

|s1 (s2 ∪ s3 ∪ s4)| - the dark blue 60 on the blue ground

... till 15.

How to calculate those powers efficiently on the MySQL server? Does MySQL provide a function aiding in the calculation?

A naive approach would be running a query for 1.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s3
INTERSECT
SELECT item FROM s4);

and another query for 2.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s4
EXCEPT
SELECT item FROM s3);

and so on, resulting in 15 queries.

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

Example tables:

s1: s2: s3: s4:
+------+ +------+ +------+ +------+
| item | | item | | item | | item |
+------+ +------+ +------+ +------+
| a | | a | | a | | a |
+------+ +------+ +------+ +------+
| b | | b | | b | | c |
+------+ +------+ +------+ +------+
| c | | c | | d | | d |
+------+ +------+ +------+ +------+
| d | | e | | e | | e |
+------+ +------+ +------+ +------+
| ... | | ... | | ... | | ... |

Now, I think I would calculate some set powers. Some examples with I corresponding to s1, II to s2, III to s3 and IV to s4:

quadruple Venn Diagram - Venn diagram made from 4 sets using ellipses

If I reinterpret sx as being a set, I would write:

|s1 ∩ s2 ∩ s3 ∩ s4| - the white 25 in the center

|(s1 ∩ s2 ∩ s4) s3| - the white 15 below right in relation to the center

|(s1 ∩ s4) (s2 ∪ s3)| - the white 5 on the bottom

|s1 (s2 ∪ s3 ∪ s4)| - the dark blue 60 on the blue ground

... till 15.

How to calculate those powers efficiently on the MySQL server? Does MySQL provide a function aiding in the calculation?

A naive approach would be running a query for 1.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s3
INTERSECT
SELECT item FROM s4);

and another query for 2.

SELECT count(*) FROM(
SELECT item FROM s1
INTERSECT
SELECT item FROM s2
INTERSECT
SELECT item FROM s4
EXCEPT
SELECT item FROM s3);

and so on, resulting in 15 queries.

mysql venn-diagram set-intersection

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

edited Nov 11 at 12:53

asked Nov 10 at 0:27

Rainer Rillke

957819

asked Nov 10 at 0:27

Rainer Rillke

957819

asked Nov 10 at 0:27

Rainer Rillke

957819

If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
– Rainer Rillke
Nov 10 at 0:41

1

There is no INTERSECT and EXCEPT in MySQL. So, you could use other RDBMS, which provides these features.
– Madhur Bhaiya
Nov 10 at 9:26

@MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
– Rainer Rillke
Nov 10 at 11:04

Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11
– Rainer Rillke
Nov 11 at 14:40

add a comment |

If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
– Rainer Rillke
Nov 10 at 0:41

1

There is no INTERSECT and EXCEPT in MySQL. So, you could use other RDBMS, which provides these features.
– Madhur Bhaiya
Nov 10 at 9:26

@MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
– Rainer Rillke
Nov 10 at 11:04

Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11
– Rainer Rillke
Nov 11 at 14:40

If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
– Rainer Rillke
Nov 10 at 0:41

There is no INTERSECT and EXCEPT in MySQL. So, you could use other RDBMS, which provides these features.
– Madhur Bhaiya
Nov 10 at 9:26

@MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
– Rainer Rillke
Nov 10 at 11:04

Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11
– Rainer Rillke
Nov 11 at 14:40

add a comment |

3 Answers
3

active

oldest

votes

up vote
1
down vote

accepted

Try something like this:

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

Disclaimer: I only tested this in SQLite. You might need to SET sql_mode='PIPES_AS_CONCAT' for the ANSI string concatenation to work in MySQL, or use the concat function instead. The WITH syntax is only supported starting from version 8.0 of MySQL, but you can use temporary tables or nested queries appropriately instead.

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

add a comment |

up vote
0
down vote

Following procedure:

Created a stored procedure, which creates temporary in-memory tables containing the sets.

Mind that MySQL does not allow you refer to a temporary in-memory table more than one time in a query.

As noted, MySQL does not have an INTERSECT or EXCEPT. But you can emulate them. By removing duplicates from your raw data/ raw sets, emulation can be even more simplified.

Decided to store the computed value into a variable each and output a table consisting of all 15 of those values corresponding to components.

What I came up with is currently https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

answered Nov 14 at 10:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

The question is a little complex so the answers are. Let me explain K.T.'s answer

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a |
+------+
| b |
+------+
| c |
+------+
| d |
+------+
| e |
+------+
| ... |
+------+

Then, s1, s2, s3 and s4 are joined

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a | a | a | a | a |
+------+---------+---------+---------+---------+
| b | b | b | b | NULL |
+------+---------+---------+---------+---------+
| c | c | c | NULL | c |
+------+---------+---------+---------+---------+
| d | d | NULL | d | d |
+------+---------+---------+---------+---------+
| e | NULL | e | e | e |
+------+---------+---------+---------+---------+
| ... | ... | ... | ... | ... |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a | 1111 |
+------+--------+
| b | 1110 |
+------+--------+
| c | 1101 |
+------+--------+
| d | 1011 |
+------+--------+
| e | 0111 |
+------+--------+
| ... | ... |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111 | 1 |
+--------+-------+
| 1110 | 1 |
+--------+-------+
| 1101 | 1 |
+--------+-------+
| 1011 | 1 |
+--------+-------+
| 0111 | 1 |
+--------+-------+
| ... | |
+--------+-------+

Note that regions having 0 set elements in them do not show up in the results and 0000 never will (=item not part of any set s1, s2, s3, s4) so there are 15 regions.

4-set venn diagram with regions in binary representation

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53234943%2fmysql-efficient-way-computing-set-powers-of-venn-diagram%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

up vote
1
down vote

accepted

Try something like this:

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

add a comment |

up vote
1
down vote

accepted

Try something like this:

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

add a comment |

up vote
1
down vote

accepted

Try something like this:

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

Try something like this:

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

edited Nov 17 at 14:24

answered Nov 16 at 23:07

KT.

5,18222455

answered Nov 16 at 23:07

KT.

5,18222455

answered Nov 16 at 23:07

KT.

5,18222455

add a comment |

up vote
0
down vote

Following procedure:

Created a stored procedure, which creates temporary in-memory tables containing the sets.

Mind that MySQL does not allow you refer to a temporary in-memory table more than one time in a query.

As noted, MySQL does not have an INTERSECT or EXCEPT. But you can emulate them. By removing duplicates from your raw data/ raw sets, emulation can be even more simplified.

Decided to store the computed value into a variable each and output a table consisting of all 15 of those values corresponding to components.

What I came up with is currently https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

answered Nov 14 at 10:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

Following procedure:

Created a stored procedure, which creates temporary in-memory tables containing the sets.

Mind that MySQL does not allow you refer to a temporary in-memory table more than one time in a query.

As noted, MySQL does not have an INTERSECT or EXCEPT. But you can emulate them. By removing duplicates from your raw data/ raw sets, emulation can be even more simplified.

Decided to store the computed value into a variable each and output a table consisting of all 15 of those values corresponding to components.

What I came up with is currently https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

answered Nov 14 at 10:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

Following procedure:

Created a stored procedure, which creates temporary in-memory tables containing the sets.

Mind that MySQL does not allow you refer to a temporary in-memory table more than one time in a query.

As noted, MySQL does not have an INTERSECT or EXCEPT. But you can emulate them. By removing duplicates from your raw data/ raw sets, emulation can be even more simplified.

Decided to store the computed value into a variable each and output a table consisting of all 15 of those values corresponding to components.

What I came up with is currently https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

answered Nov 14 at 10:26

Rainer Rillke

957819

Following procedure:

Created a stored procedure, which creates temporary in-memory tables containing the sets.

Mind that MySQL does not allow you refer to a temporary in-memory table more than one time in a query.

As noted, MySQL does not have an INTERSECT or EXCEPT. But you can emulate them. By removing duplicates from your raw data/ raw sets, emulation can be even more simplified.

Decided to store the computed value into a variable each and output a table consisting of all 15 of those values corresponding to components.

What I came up with is currently https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

answered Nov 14 at 10:26

Rainer Rillke

957819

answered Nov 14 at 10:26

Rainer Rillke

957819

answered Nov 14 at 10:26

Rainer Rillke

957819

answered Nov 14 at 10:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

The question is a little complex so the answers are. Let me explain K.T.'s answer

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a |
+------+
| b |
+------+
| c |
+------+
| d |
+------+
| e |
+------+
| ... |
+------+

Then, s1, s2, s3 and s4 are joined

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a | a | a | a | a |
+------+---------+---------+---------+---------+
| b | b | b | b | NULL |
+------+---------+---------+---------+---------+
| c | c | c | NULL | c |
+------+---------+---------+---------+---------+
| d | d | NULL | d | d |
+------+---------+---------+---------+---------+
| e | NULL | e | e | e |
+------+---------+---------+---------+---------+
| ... | ... | ... | ... | ... |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a | 1111 |
+------+--------+
| b | 1110 |
+------+--------+
| c | 1101 |
+------+--------+
| d | 1011 |
+------+--------+
| e | 0111 |
+------+--------+
| ... | ... |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111 | 1 |
+--------+-------+
| 1110 | 1 |
+--------+-------+
| 1101 | 1 |
+--------+-------+
| 1011 | 1 |
+--------+-------+
| 0111 | 1 |
+--------+-------+
| ... | |
+--------+-------+

Note that regions having 0 set elements in them do not show up in the results and 0000 never will (=item not part of any set s1, s2, s3, s4) so there are 15 regions.

4-set venn diagram with regions in binary representation

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

The question is a little complex so the answers are. Let me explain K.T.'s answer

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a |
+------+
| b |
+------+
| c |
+------+
| d |
+------+
| e |
+------+
| ... |
+------+

Then, s1, s2, s3 and s4 are joined

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a | a | a | a | a |
+------+---------+---------+---------+---------+
| b | b | b | b | NULL |
+------+---------+---------+---------+---------+
| c | c | c | NULL | c |
+------+---------+---------+---------+---------+
| d | d | NULL | d | d |
+------+---------+---------+---------+---------+
| e | NULL | e | e | e |
+------+---------+---------+---------+---------+
| ... | ... | ... | ... | ... |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a | 1111 |
+------+--------+
| b | 1110 |
+------+--------+
| c | 1101 |
+------+--------+
| d | 1011 |
+------+--------+
| e | 0111 |
+------+--------+
| ... | ... |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111 | 1 |
+--------+-------+
| 1110 | 1 |
+--------+-------+
| 1101 | 1 |
+--------+-------+
| 1011 | 1 |
+--------+-------+
| 0111 | 1 |
+--------+-------+
| ... | |
+--------+-------+

Note that regions having 0 set elements in them do not show up in the results and 0000 never will (=item not part of any set s1, s2, s3, s4) so there are 15 regions.

4-set venn diagram with regions in binary representation

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

add a comment |

up vote
0
down vote

The question is a little complex so the answers are. Let me explain K.T.'s answer

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a |
+------+
| b |
+------+
| c |
+------+
| d |
+------+
| e |
+------+
| ... |
+------+

Then, s1, s2, s3 and s4 are joined

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a | a | a | a | a |
+------+---------+---------+---------+---------+
| b | b | b | b | NULL |
+------+---------+---------+---------+---------+
| c | c | c | NULL | c |
+------+---------+---------+---------+---------+
| d | d | NULL | d | d |
+------+---------+---------+---------+---------+
| e | NULL | e | e | e |
+------+---------+---------+---------+---------+
| ... | ... | ... | ... | ... |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a | 1111 |
+------+--------+
| b | 1110 |
+------+--------+
| c | 1101 |
+------+--------+
| d | 1011 |
+------+--------+
| e | 0111 |
+------+--------+
| ... | ... |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111 | 1 |
+--------+-------+
| 1110 | 1 |
+--------+-------+
| 1101 | 1 |
+--------+-------+
| 1011 | 1 |
+--------+-------+
| 0111 | 1 |
+--------+-------+
| ... | |
+--------+-------+

Note that regions having 0 set elements in them do not show up in the results and 0000 never will (=item not part of any set s1, s2, s3, s4) so there are 15 regions.

4-set venn diagram with regions in binary representation

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

The question is a little complex so the answers are. Let me explain K.T.'s answer

with universe as (
 select * from s1 
 union
 select * from s2
 union
 select * from s3
 union
 select * from s4
),
regions as (
 select
 case when s1.item is null then '0' else '1' end
 ||
 case when s2.item is null then '0' else '1' end
 ||
 case when s3.item is null then '0' else '1' end
 ||
 case when s4.item is null then '0' else '1' end as Region
 from universe u
 left join s1 on u.item = s1.item
 left join s2 on u.item = s2.item
 left join s3 on u.item = s3.item
 left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a |
+------+
| b |
+------+
| c |
+------+
| d |
+------+
| e |
+------+
| ... |
+------+

Then, s1, s2, s3 and s4 are joined

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a | a | a | a | a |
+------+---------+---------+---------+---------+
| b | b | b | b | NULL |
+------+---------+---------+---------+---------+
| c | c | c | NULL | c |
+------+---------+---------+---------+---------+
| d | d | NULL | d | d |
+------+---------+---------+---------+---------+
| e | NULL | e | e | e |
+------+---------+---------+---------+---------+
| ... | ... | ... | ... | ... |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a | 1111 |
+------+--------+
| b | 1110 |
+------+--------+
| c | 1101 |
+------+--------+
| d | 1011 |
+------+--------+
| e | 0111 |
+------+--------+
| ... | ... |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111 | 1 |
+--------+-------+
| 1110 | 1 |
+--------+-------+
| 1101 | 1 |
+--------+-------+
| 1011 | 1 |
+--------+-------+
| 0111 | 1 |
+--------+-------+
| ... | |
+--------+-------+

Note that regions having 0 set elements in them do not show up in the results and 0000 never will (=item not part of any set s1, s2, s3, s4) so there are 15 regions.

4-set venn diagram with regions in binary representation

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

edited Nov 25 at 20:36

answered Nov 25 at 20:26

Rainer Rillke

957819

answered Nov 25 at 20:26

Rainer Rillke

957819

answered Nov 25 at 20:26

Rainer Rillke

957819

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

ZUqk62RmtztTQwr pD9,OxWS,Czfp 5fDwkJ

搜尋此網誌

Odtnhj