E-commerce Product Management (3): Aggregation Analysis with group by + avg + sort
Published: January 16, 2021

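This material comes from the Bilibili course "Elasticsearch 顶尖高手系列课程核心知识篇" (Elasticsearch Top Expert Series: Core Knowledge) by 中华石杉. I can't speak for anyone else, but I found it somewhat hard to follow and hard to remember. Fortunately this is only a first example, and the course explains these features in detail later.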
First analysis requirement
1. Count the number of products under each tag
GET /ecommerce/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}
The request fails, because tags is a text field and aggregations on text fields are disabled by default:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [tags] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    ...
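The error message itself names an alternative to enabling fielddata: aggregate on a keyword field. The following is a minimal Python sketch that only builds the two request bodies and prints them; it does not contact a cluster. The `keyword` sub-field name and the `tags.keyword` path are assumptions for illustration, not something the original mapping is known to contain.

```python
import json

# Hypothetical mapping: keep `tags` as analyzed text for full-text search,
# and add a `keyword` sub-field (an assumed name) for aggregations and sorting.
mapping_body = {
    "properties": {
        "tags": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword"}},
        }
    }
}

# Terms aggregation that targets the sub-field instead of the text field.
search_body = {
    "size": 0,
    "aggs": {"group_by_tags": {"terms": {"field": "tags.keyword"}}},
}

# Print the bodies so they can be pasted into a console.
print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

A keyword field stores each tag as a single unanalyzed value, so it avoids the memory cost the error message warns about. Documents indexed before such a sub-field is added would likely need to be reindexed before they appear in its buckets.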
Set the fielddata property of the text field to true
POST ecommerce/_mapping
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}
Then run the aggregation again, with "size": 0 so that only the aggregation results are returned and no hits:
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 6, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        { "key" : "fangzhu", "doc_count" : 2 },
        ...
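To make the bucket counts concrete, here is a plain-Python sketch of what a terms aggregation computes: one bucket per distinct tag, holding the number of documents that carry it. The `products` list is hypothetical sample data, not the contents of the `ecommerce` index, and breaking ties by key is a simplification.

```python
from collections import Counter

# Hypothetical sample documents (not the real index contents).
products = [
    {"name": "gaolujie yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "zhonghua yagao", "price": 40, "tags": ["qingxin"]},
]

def terms_buckets(docs, field):
    """Count documents per distinct value of a multi-valued field."""
    counts = Counter()
    for doc in docs:
        # set() so a document counts once per bucket even if a value repeats.
        counts.update(set(doc[field]))
    # Order by doc_count descending, then by key for a stable result.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]

buckets = terms_buckets(products, "tags")
print(buckets)
```

A document with two tags lands in two buckets, which is why the doc_count values can add up to more than the number of documents. With fielddata on a text field the buckets are built from analyzed tokens, so a tag made of several words would be split into several buckets.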
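Second aggregation analysis requirement
2. For products whose name contains "yagao" (牙膏, toothpaste), count the number of products under each tag
GET ecommerce/_search
{
  "size": 0,
  "query": {
    "match": { "name": "yagao" }
  },
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}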
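The point of this request is that the query narrows the set of documents before the aggregation runs, so the buckets only reflect matching products. A rough Python sketch of that scoping follows; the sample documents are hypothetical, and `match_name` is only a whitespace-token stand-in for a real match query.

```python
from collections import Counter

# Hypothetical sample documents; the last one is not a toothpaste.
products = [
    {"name": "gaolujie yagao", "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "tags": ["fangzhu"]},
    {"name": "special shuazi", "tags": ["fangzhu", "qingxin"]},
]

def match_name(doc, term):
    """Rough stand-in for a match query: whitespace-tokenized term lookup."""
    return term in doc["name"].lower().split()

def tag_counts(docs):
    """Count documents per tag, like a terms aggregation on `tags`."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc["tags"]))
    return dict(counts)

# Aggregating everything versus aggregating only the query matches.
all_counts = tag_counts(products)
matched = [doc for doc in products if match_name(doc, "yagao")]
yagao_counts = tag_counts(matched)
print(all_counts)
print(yagao_counts)
```

In this sample the `qingxin` bucket disappears and `fangzhu` drops from 3 to 2 once the query is applied, because the non-matching document never reaches the aggregation.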
Third aggregation analysis requirement
3. Group first, then compute an average per group: the average price of the products under each tag
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
{
  "took" : 103,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 6, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
Fourth data analysis requirement
4. Compute the average price of the products under each tag, sorted by average price in descending order
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags",
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
{
  "took" : 119,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 6, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
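Since the responses above are truncated before the buckets, a small Python sketch may help show what "group by tag, average the price, order by that average" produces. The sample data is hypothetical, and the function is a simulation of the idea, not how Elasticsearch executes it.

```python
from collections import defaultdict

# Hypothetical sample documents (not the real index contents).
products = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 45, "tags": ["qingxin"]},
    {"price": 50, "tags": ["meibai"]},
]

def avg_price_by_tag(docs):
    """Group documents by tag, then average the price inside each group."""
    prices = defaultdict(list)
    for doc in docs:
        for tag in set(doc["tags"]):
            prices[tag].append(doc["price"])
    buckets = [
        {"key": tag, "doc_count": len(vals), "avg_price": sum(vals) / len(vals)}
        for tag, vals in prices.items()
    ]
    # Counterpart of "order": {"avg_price": "desc"} in the terms aggregation.
    buckets.sort(key=lambda b: b["avg_price"], reverse=True)
    return buckets

buckets = avg_price_by_tag(products)
for b in buckets:
    print(b["key"], b["doc_count"], b["avg_price"])
```

The name used in `order` has to match the name of the sub-aggregation (`avg_price` in the request), which is what lets the outer buckets be sorted by a value computed inside them.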
Fifth data analysis requirement
5. Group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 20 },
          { "from": 20, "to": 40 },
          { "from": 40, "to": 60 }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": { "field": "tags" },
          "aggs": {
            "average_price": {
              "avg": { "field": "price" }
            }
          }
        }
      }
    }
  }
}
{
  "took" : 62,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 6, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_price" : {
      "buckets" : [
        {
          "key" : "0.0-20.0",
          "from" : 0.0,
          "to" : 20.0,
          "doc_count" : 0,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ ]
          }
        },
        ...
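The nesting can be read as three passes: bucket by price range, bucket by tag inside each range, then average. The Python sketch below simulates that on hypothetical sample data. It treats each range as including `from` and excluding `to`, which is how the range aggregation handles its bounds; the bucket keys are simplified to `"0-20"` rather than the `"0.0-20.0"` form shown in the response.

```python
from collections import defaultdict

# Hypothetical sample documents (not the real index contents).
products = [
    {"price": 20, "tags": ["fangzhu"]},
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
    {"price": 50, "tags": ["meibai"]},
]
ranges = [(0, 20), (20, 40), (40, 60)]

def range_then_tag_avg(docs, ranges):
    """Bucket by price range, then by tag, then average the price."""
    result = {}
    for lo, hi in ranges:
        by_tag = defaultdict(list)
        for doc in docs:
            # `from` is inclusive and `to` is exclusive.
            if lo <= doc["price"] < hi:
                for tag in set(doc["tags"]):
                    by_tag[tag].append(doc["price"])
        result[f"{lo}-{hi}"] = {
            tag: sum(vals) / len(vals) for tag, vals in by_tag.items()
        }
    return result

result = range_then_tag_avg(products, ranges)
for key, tag_avgs in result.items():
    print(key, tag_avgs)
```

In this sample the product priced at exactly 20 falls into the 20-40 bucket, leaving 0-20 empty, which is the same shape as the empty first bucket in the response above.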
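Copyright notice: this is an original article by InfoQ author escray.
Original link: http://xie.infoq.cn/article/2b531c88505b5085b3be300ee
This article is licensed under CC-BY 4.0. When republishing, please keep the original source and this copyright notice.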