ES - Sub buckets based on values in bucket property rather than document values
I am new to Elasticsearch and I am trying to bucket objects returned by a search by hierarchical categories.
I apologize in advance for the length of the question but I wanted to give ample samples and information to make the need as clear as possible.
What I am Trying to Achieve
The problem is that categories form a hierarchy but are represented as a flat array of objects, each with a depth. I would like to generate an aggregation that would bucket by category and category depth.
Here is a simplified mapping for the document that contains only the minimum data:
{
  "mappings": {
    "_doc": {
      "properties": {
        "categoriesList": {
          "properties": {
            "depth": {
              "type": "long"
            },
            "title": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
  }
}
Here is a simplified sample document:
{
  "_index": "x",
  "_type": "_doc",
  "_id": "wY0w5GYBOIOl7fi31c_b",
  "_score": 22.72073,
  "_source": {
    "categoriesList": [
      {
        "title": "category_lvl_2_2",
        "depth": 2
      },
      {
        "title": "category_lvl_2",
        "depth": 2
      },
      {
        "title": "category_lvl_1",
        "depth": 1
      }
    ]
  }
}
Now, what I am trying to achieve is hierarchical buckets of categories based on their depth, i.e. I want one bucket that contains all titles of categories of depth 1 across all hits, then another bucket (or sub-bucket) with the titles of just the categories of depth 2 across all hits, and so on.
Something like:
{
  "aggregations": {
    "depth": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 1,
          "doc_count": 47,
          "name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "category_lvl_1",
                "doc_count": 47,
                "depth_1": {
                  "doc_count": 47
                }
              }
            ]
          }
        },
        {
          "key": 2,
          "doc_count": 47,
          "name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "category_lvl_2_1",
                "doc_count": 47
              },
              {
                "key": "category_lvl_2_2",
                "doc_count": 33
              }
            ]
          }
        }
      ]
    }
  }
}
What I have tried
At first I tried to simply create nested aggregations as follows:
{
  "aggs": {
    "depth": {
      "terms": {
        "field": "categoriesList.depth"
      },
      "aggs": {
        "name": {
          "terms": {
            "field": "categoriesList.title.keyword"
          }
        }
      }
    }
  }
}
This, of course, did not give what I wanted. It gave me buckets keyed by depth, but each one contained the titles of all categories regardless of their depth; the contents of every bucket were the same. Something like the following:
{
  "aggregations": {
    "depth": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 1,
          "doc_count": 47,
          "name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "category_lvl_1",
                "doc_count": 47
              },
              {
                "key": "category_lvl_2_1",
                "doc_count": 33
              },
              {
                "key": "category_lvl_2_2",
                "doc_count": 15
              }
            ]
          }
        },
        {
          "key": 2,
          "doc_count": 47,
          "name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "category_lvl_1",
                "doc_count": 47
              },
              {
                "key": "category_lvl_2_1",
                "doc_count": 33
              },
              {
                "key": "category_lvl_2_2",
                "doc_count": 15
              }
            ]
          }
        }
      ]
    }
  }
}
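A rough way to see why every depth bucket ends up with the same titles is to model the indexing in plain Python. This is only a simplified sketch, not Elasticsearch code: it assumes that an array of plain objects is indexed as one multi-valued set per leaf field, and that a terms aggregation with a terms sub-aggregation counts documents per pair of values. The sample documents and the helper names `flatten` and `terms_then_terms` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents; only the shape matches the question.
docs = [
    {"categoriesList": [
        {"title": "category_lvl_2_1", "depth": 2},
        {"title": "category_lvl_1", "depth": 1},
    ]},
    {"categoriesList": [
        {"title": "category_lvl_2_2", "depth": 2},
        {"title": "category_lvl_1", "depth": 1},
    ]},
]


def flatten(doc):
    """Model a non-nested object array: one set of values per leaf field."""
    cats = doc["categoriesList"]
    return {
        "categoriesList.depth": {c["depth"] for c in cats},
        "categoriesList.title.keyword": {c["title"] for c in cats},
    }


def terms_then_terms(docs, outer, inner):
    """Count documents per (outer value, inner value) pair."""
    buckets = defaultdict(lambda: defaultdict(int))
    for flat in map(flatten, docs):
        # The link between a depth and its title is gone at this point,
        # so every depth of a document pairs with every title of it.
        for outer_value in flat[outer]:
            for inner_value in flat[inner]:
                buckets[outer_value][inner_value] += 1
    return {key: dict(value) for key, value in buckets.items()}


result = terms_then_terms(
    docs, "categoriesList.depth", "categoriesList.title.keyword"
)
print(result)
```

In this model the buckets for depth 1 and depth 2 are identical, which matches the shape of the output above.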
Then I tried to see if a filtered aggregation would work by trying to filter one sub-bucket by value of depth 1:
{
  "aggs": {
    "depth": {
      "terms": {
        "field": "categoriesList.depth"
      },
      "aggs": {
        "name": {
          "terms": {
            "field": "categoriesList.title.keyword"
          },
          "aggs": {
            "depth_1": {
              "filter": {
                "term": {
                  "categoriesList.depth": 1
                }
              }
            }
          }
        }
      }
    }
  }
}
This gave the same results as the simple aggregation query above but with an extra nesting level that served no purpose.
The question
With my current understanding of ES, what I am seeing makes sense: it goes over each document returned by the search and creates buckets based on category depth, but since each document has at least one category at each depth, the document's entire category list ends up in every bucket.
Is what I am trying to do possible with ES? I get the feeling that this will not work because I am basically trying to bucket and filter the properties used by the initial bucketing query rather than working on the document properties.
I could also bucket myself directly in code since we are getting the categories results but I wanted to know if it was possible to get this done on ES' side which would save me from modifying quite a bit of existing code we have.
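For comparison, the client-side alternative mentioned above could look roughly like the following sketch. It assumes the hits have the same shape as the sample document; the function name `bucket_by_depth` and the sample hits are hypothetical. It only sees the hits actually returned by the search, so its counts cover the current page of results rather than every matching document.

```python
from collections import Counter, defaultdict


def bucket_by_depth(hits):
    """Return {depth: Counter({title: number of hits containing it})}."""
    buckets = defaultdict(Counter)
    for hit in hits:
        # A set ensures a category repeated within one hit is counted once.
        seen = {
            (category["depth"], category["title"])
            for category in hit["_source"]["categoriesList"]
        }
        for depth, title in seen:
            buckets[depth][title] += 1
    return dict(buckets)


# Hypothetical hits, shaped like the sample document in the question.
hits = [
    {"_source": {"categoriesList": [
        {"title": "category_lvl_2_1", "depth": 2},
        {"title": "category_lvl_1", "depth": 1},
    ]}},
    {"_source": {"categoriesList": [
        {"title": "category_lvl_2_2", "depth": 2},
        {"title": "category_lvl_1", "depth": 1},
    ]}},
]

buckets = bucket_by_depth(hits)
for depth in sorted(buckets):
    print(depth, dict(buckets[depth]))
```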
Thanks!
elasticsearch
Can you change your mapping? If so set it to nested type for categoriesList and a nested aggregation then would do it
– sramalingam24
Nov 13 '18 at 23:10
That did it! I had failed to notice in the documentation that ES flattens arrays of objects, which makes it impossible to keep the association between the fields of each sub-object. Making categoriesList a nested field allowed me to set up a nested aggregation with sub-aggregations to achieve what I wanted. Do you care to post your comment as an answer? Otherwise I can post my final solution.
– Ouamer
Nov 14 '18 at 14:05
Go ahead and post your solution and Mark it as answer, since you already have it
– sramalingam24
Nov 14 '18 at 17:12
asked Nov 13 '18 at 16:18
Ouamer
1 Answer
Based on sramalingam24's comment I did the following to get it working:
Create an index with a mapping specifying nested types
I changed the mapping to tell ES that the categoriesList property was a nested object. To do so I created a new index with the following mapping:
{
  "mappings": {
    "_doc": {
      "properties": {
        "categoriesList": {
          "type": "nested",
          "properties": {
            "depth": {
              "type": "long"
            },
            "title": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
  }
}
Reindex into the new index
Then I reindexed from the old index into the new one by sending the following body to the Reindex API (POST _reindex):
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "index_with_nested_mapping"
  }
}
Use a nested aggregation
Then I used a nested aggregation similar to this:
{
  "aggs": {
    "categories": {
      "nested": {
        "path": "categoriesList"
      },
      "aggs": {
        "depth": {
          "terms": {
            "field": "categoriesList.depth"
          },
          "aggs": {
            "sub-categories": {
              "terms": {
                "field": "categoriesList.title.keyword"
              }
            }
          }
        }
      }
    }
  }
}
Which gave me the results I desired:
{
  "aggregations": {
    "categories": {
      "doc_count": 96,
      "depth": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": 2,
            "doc_count": 49,
            "sub-categories": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "category_lvl_2_1",
                  "doc_count": 33
                },
                {
                  "key": "category_lvl_2_2",
                  "doc_count": 15
                }
              ]
            }
          },
          {
            "key": 1,
            "doc_count": 47,
            "sub-categories": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "category_lvl_1",
                  "doc_count": 47
                }
              ]
            }
          }
        ]
      }
    }
  }
}
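If the calling code only needs the titles per depth, a response of this shape can be reduced to a plain mapping. The following is a minimal sketch that assumes the response has already been parsed into a Python dict and uses the aggregation names from the query above (`categories`, `depth`, `sub-categories`); the helper name `titles_by_depth` is hypothetical.

```python
def titles_by_depth(response):
    """Reduce the nested aggregation response to {depth: {title: doc_count}}."""
    depth_buckets = response["aggregations"]["categories"]["depth"]["buckets"]
    return {
        bucket["key"]: {
            sub["key"]: sub["doc_count"]
            for sub in bucket["sub-categories"]["buckets"]
        }
        for bucket in depth_buckets
    }


# Trimmed copy of the response shown above (error-bound fields omitted).
response = {
    "aggregations": {
        "categories": {
            "doc_count": 96,
            "depth": {
                "buckets": [
                    {
                        "key": 2,
                        "doc_count": 49,
                        "sub-categories": {
                            "buckets": [
                                {"key": "category_lvl_2_1", "doc_count": 33},
                                {"key": "category_lvl_2_2", "doc_count": 15},
                            ]
                        },
                    },
                    {
                        "key": 1,
                        "doc_count": 47,
                        "sub-categories": {
                            "buckets": [
                                {"key": "category_lvl_1", "doc_count": 47},
                            ]
                        },
                    },
                ]
            },
        }
    }
}

result = titles_by_depth(response)
print(result)
```

Note that inside a nested aggregation the doc_count values count nested category objects rather than top-level documents.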
answered Nov 15 '18 at 15:29
Ouamer