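E-commerce Product Management (3): Aggregation Analysis with group by + avg + sort
Published: January 16, 2021
This material comes from 中华石杉's course 《Elasticsearch 顶尖高手系列课程核心知识篇》 on Bilibili. I can't speak for anyone else, but I found it a bit hard to follow and hard to remember. Fortunately this is only a worked example, and a detailed explanation comes later in the course.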
First aggregation requirement
1. Count the number of products under each tag
GET /ecommerce/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}
The request fails with the following error (truncated):
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [tags] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    ...
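The error message names two ways out: aggregate on a keyword field, or enable fielddata. The keyword route might look like the sketch below, written as Python dicts that mirror the request bodies. The sub-field name `keyword` is an assumption (it is what dynamic mapping creates for strings by default, but the actual mapping of this index is not shown in the article), and the variable names are hypothetical.

```python
import json

# Assumption: `tags` is a text field that gets a keyword sub-field named "keyword".
# A sub-field added to an existing index only covers documents indexed afterwards.
mapping_body = {
    "properties": {
        "tags": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword"}},
        }
    }
}

# Aggregate on the keyword sub-field instead of the analyzed text field.
search_body = {
    "size": 0,
    "aggs": {"all_tags": {"terms": {"field": "tags.keyword"}}},
}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

A keyword field buckets on the whole stored value, while fielddata on a text field buckets on the analyzed tokens. For single-word tags the two give the same buckets.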
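As the error message explains, aggregations on text fields are disabled by default. The fix used here is to set the fielddata property of the text field to true: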
POST ecommerce/_mapping
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}
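Then run the aggregation again, this time with size set to 0 so that only the aggregation results are returned:
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}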
Response (truncated):
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 2
        },
        ...
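As a rough mental model, a terms aggregation behaves like a group-by count. The following Python sketch reproduces that idea in memory. The sample documents and the helper name `count_by_tag` are hypothetical, since the article does not show its data set.

```python
from collections import Counter

# Hypothetical sample documents; the article's real data set is not shown.
products = [
    {"name": "gaolujie yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "zhonghua yagao", "price": 40, "tags": ["qingxin"]},
]


def count_by_tag(docs):
    """Count documents per tag, like doc_count in a terms aggregation."""
    counts = Counter()
    for doc in docs:
        # A document with several tags is counted once in each tag's bucket.
        counts.update(set(doc["tags"]))
    return counts


tag_counts = count_by_tag(products)
# Terms buckets are ordered by doc_count descending by default.
for tag, doc_count in tag_counts.most_common():
    print(tag, doc_count)
```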
Second aggregation requirement
2. For products whose name contains "yagao" (toothpaste), count the number of products under each tag
GET ecommerce/_search
{
  "size": 0,
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
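When a request carries both a query and aggs, the aggregation only sees the documents that match the query. A minimal in-memory sketch of that order of operations follows. The sample documents are hypothetical, and `matches_name` is only a crude stand-in for a match query (lowercase plus whitespace split), not how Elasticsearch actually analyzes text.

```python
from collections import Counter

# Hypothetical sample documents, not the article's data.
products = [
    {"name": "gaolujie yagao", "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "tags": ["fangzhu"]},
    {"name": "special shampoo", "tags": ["qingxin"]},
]


def matches_name(doc, term):
    # Rough stand-in for a match query: lowercase and split on whitespace.
    return term.lower() in doc["name"].lower().split()


def count_tags_for(docs, term):
    counts = Counter()
    for doc in docs:
        if matches_name(doc, term):  # the query narrows the document set first
            counts.update(set(doc["tags"]))  # the aggregation runs on what is left
    return counts


yagao_tag_counts = count_tags_for(products, "yagao")
print(dict(yagao_tag_counts))
```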
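Third aggregation requirement
3. Group first, then compute the average for each group: calculate the average price of the products under each tag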
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
Response (truncated):
{
  "took" : 103,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
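The nested aggs block means "for each tag bucket, compute a metric over the documents in that bucket". A small Python sketch of the same two-step idea, assuming hypothetical sample documents and a hypothetical helper `avg_price_by_tag`:

```python
from collections import defaultdict

# Hypothetical sample documents; prices and tags are invented for illustration.
products = [
    {"name": "gaolujie yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "zhonghua yagao", "price": 40, "tags": ["qingxin"]},
]


def avg_price_by_tag(docs):
    """Group documents by tag, then average the price inside each group."""
    prices = defaultdict(list)
    for doc in docs:
        for tag in set(doc["tags"]):
            prices[tag].append(doc["price"])  # step 1: bucket by tag
    # step 2: one avg metric per bucket
    return {tag: sum(vals) / len(vals) for tag, vals in prices.items()}


avg_prices = avg_price_by_tag(products)
print(avg_prices)
```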
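Fourth aggregation requirement
4. Calculate the average price of the products under each tag, and sort the tags by average price in descending order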
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags",
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
Response (truncated):
{
  "took" : 119,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
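In a response to this kind of request, each terms bucket carries its sub-aggregation under the sub-aggregation's name, with the metric in a `value` field. The sketch below reads such a structure in Python. The `response` dict is a hypothetical fragment with invented numbers (the article's real response is truncated), and `bucket_averages` is a hypothetical helper.

```python
# Hypothetical response fragment; keys follow the aggregation names used in
# the request, values are invented for illustration only.
response = {
    "aggregations": {
        "all_tags": {
            "buckets": [
                {"key": "qingxin", "doc_count": 1, "avg_price": {"value": 40.0}},
                {"key": "meibai", "doc_count": 1, "avg_price": {"value": 30.0}},
                {"key": "fangzhu", "doc_count": 2, "avg_price": {"value": 27.5}},
            ]
        }
    }
}


def bucket_averages(resp, agg_name="all_tags", metric="avg_price"):
    """Return (tag, average) pairs in the order the buckets were returned."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return [(b["key"], b[metric]["value"]) for b in buckets]


rows = bucket_averages(response)
# With "order": {"avg_price": "desc"} the averages should never increase.
is_descending = all(a[1] >= b[1] for a, b in zip(rows, rows[1:]))
print(rows, is_descending)
```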
Fifth aggregation requirement
5. Group by the specified price ranges, then group by tag within each range, and finally calculate the average price of each group
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 20 },
          { "from": 20, "to": 40 },
          { "from": 40, "to": 60 }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "average_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
Response (truncated):
{
  "took" : 62,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_price" : {
      "buckets" : [
        {
          "key" : "0.0-20.0",
          "from" : 0.0,
          "to" : 20.0,
          "doc_count" : 0,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ ]
          }
        },
        ...
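The three levels here (range, then terms, then avg) can be pictured as nested loops. In a range aggregation the from value is included and the to value is excluded, so a price of exactly 40 lands in the 40-60 bucket. The sketch below mimics that in memory; the sample documents and the helper `range_then_tag_avg` are hypothetical.

```python
from collections import defaultdict

RANGES = [(0, 20), (20, 40), (40, 60)]  # same boundaries as the request

# Hypothetical sample documents; prices and tags are invented for illustration.
products = [
    {"name": "gaolujie yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "zhonghua yagao", "price": 40, "tags": ["qingxin"]},
]


def range_then_tag_avg(docs, ranges):
    """Bucket by price range, then by tag, then average the price per tag."""
    result = {}
    for low, high in ranges:
        key = f"{float(low)}-{float(high)}"  # same style as the "0.0-20.0" key
        by_tag = defaultdict(list)
        for doc in docs:
            if low <= doc["price"] < high:  # from is inclusive, to is exclusive
                for tag in set(doc["tags"]):
                    by_tag[tag].append(doc["price"])
        result[key] = {tag: sum(p) / len(p) for tag, p in by_tag.items()}
    return result


nested = range_then_tag_avg(products, RANGES)
for key, tag_avgs in nested.items():
    print(key, tag_avgs)
```

An empty range still produces a bucket with no tag entries, which matches the empty `buckets` array shown for 0.0-20.0 in the response.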
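Copyright notice: This is an original article by InfoQ author escray.
Original link: http://xie.infoq.cn/article/2b531c88505b5085b3be300ee
This article is licensed under CC-BY 4.0. When reposting, please retain the original source and this copyright notice.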