
000 ES suggest - English

小林-1025
Published: April 29, 2021

Suggesters cover two use cases: auto-completion and spelling correction. In basic use, they handle both well.


  1. Create the index, paying attention to the field type

PUT /blogs
{
  "mappings": {
    "properties": {
      "body": {
        "type": "text"
      }
    }
  }
}


  2. Index the data

POST _bulk/?refresh=true
{ "index" : { "_index" : "blogs" } }
{ "body": "Lucene is cool" }
{ "index" : { "_index" : "blogs" } }
{ "body": "Elasticsearch builds on top of lucene" }
{ "index" : { "_index" : "blogs" } }
{ "body": "Elasticsearch rocks" }
{ "index" : { "_index" : "blogs" } }
{ "body": "Elastic is the company behind ELK stack" }
{ "index" : { "_index" : "blogs" } }
{ "body": "elk rocks" }
{ "index" : { "_index" : "blogs" } }
{ "body": "elasticsearch is rock solid" }
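When the documents come from application code rather than a console, the bulk body has to be assembled as newline-delimited JSON. The following is a minimal sketch of that step; the helper name build_bulk_body is made up for illustration, and sending the body over HTTP is left out.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body for the _bulk API.

    Each document contributes two lines: an action line and a source line.
    (build_bulk_body is a hypothetical helper, not part of any client library.)
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


bodies = ["Lucene is cool", "Elasticsearch rocks"]
bulk_body = build_bulk_body("blogs", [{"body": b} for b in bodies])
print(bulk_body)
```

The resulting string can be posted to the _bulk endpoint with the Content-Type header set to application/x-ndjson.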


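  3. Term suggester (missing mode). The result shows what missing mode means: suggestions are generated only for terms that are absent from the index. "lucne" gets an option, while "rock", which is indexed, comes back with an empty options array.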

POST /blogs/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}

# result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "my-suggestion" : [
      {
        "text" : "lucne",
        "offset" : 0,
        "length" : 5,
        "options" : [
          { "text" : "lucene", "score" : 0.8, "freq" : 2 }
        ]
      },
      {
        "text" : "rock",
        "offset" : 6,
        "length" : 4,
        "options" : [ ]
      }
    ]
  }
}
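The term suggester returns one entry per token, with offset and length pointing back into the input text. A common next step is to stitch a corrected string together from those entries. The sketch below does this for the response shown above; the function name apply_top_options is an assumption for illustration, and it assumes the first option of each entry is the best one (options are sorted by score by default).

```python
def apply_top_options(text, entries):
    """Replace each token that has suggestions with its first option.

    `entries` is the list found under suggest["my-suggestion"] in the response.
    (apply_top_options is a hypothetical helper written for this example.)
    """
    result = []
    cursor = 0
    for entry in entries:
        start = entry["offset"]
        end = start + entry["length"]
        result.append(text[cursor:start])  # keep the text between tokens
        options = entry["options"]
        # Fall back to the original token when there is no suggestion.
        result.append(options[0]["text"] if options else text[start:end])
        cursor = end
    result.append(text[cursor:])
    return "".join(result)


# Entries copied from the missing-mode response above.
missing_mode_entries = [
    {"text": "lucne", "offset": 0, "length": 5,
     "options": [{"text": "lucene", "score": 0.8, "freq": 2}]},
    {"text": "rock", "offset": 6, "length": 4, "options": []},
]

print(apply_top_options("lucne rock", missing_mode_entries))  # lucene rock
```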


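  4. Term suggester (popular mode). This time "rock" also gets an option: popular mode suggests terms that occur in more documents than the original term, and "rocks" (freq 2) is more frequent than "rock".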

POST /blogs/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne rock",
      "term": {
        "suggest_mode": "popular",
        "field": "body"
      }
    }
  }
}

# result
{
  "took" : 582,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "my-suggestion" : [
      {
        "text" : "lucne",
        "offset" : 0,
        "length" : 5,
        "options" : [
          { "text" : "lucene", "score" : 0.8, "freq" : 2 }
        ]
      },
      {
        "text" : "rock",
        "offset" : 6,
        "length" : 4,
        "options" : [
          { "text" : "rocks", "score" : 0.75, "freq" : 2 }
        ]
      }
    ]
  }
}
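To make the difference between the two modes concrete, here is a toy model in Python. It is not how Elasticsearch implements the term suggester: it uses plain Levenshtein distance over a hand-written document-frequency table and ignores real-world details such as scoring, minimum word length, and per-shard statistics. The table DOC_FREQ and the function term_suggest are assumptions made up for this sketch; the frequencies are counted from the six sample documents.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical document frequencies for a few terms of the sample index.
DOC_FREQ = {"lucene": 2, "elasticsearch": 3, "rocks": 2, "rock": 1, "solid": 1, "elk": 2}


def term_suggest(term, doc_freq, mode="missing", max_edits=2):
    """Toy version of suggest_mode: 'missing', 'popular', or 'always'."""
    own_freq = doc_freq.get(term, 0)
    if mode == "missing" and own_freq > 0:
        return []  # the term exists in the index, so nothing is suggested
    candidates = [t for t in doc_freq
                  if t != term and edit_distance(term, t) <= max_edits]
    if mode == "popular":
        # Keep only terms that occur in more documents than the input term.
        candidates = [t for t in candidates if doc_freq[t] > own_freq]
    return sorted(candidates, key=lambda t: (edit_distance(term, t), -doc_freq[t]))


print(term_suggest("rock", DOC_FREQ, mode="missing"))   # []
print(term_suggest("rock", DOC_FREQ, mode="popular"))   # ['rocks']
print(term_suggest("lucne", DOC_FREQ, mode="missing"))  # ['lucene']
```

The toy reproduces the pattern seen in the two responses: "lucne" is corrected in both modes, while "rock" only gets a suggestion in popular mode.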


  5. Phrase suggester

POST /blogs/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne and elasticsear rock",
      "phrase": {
        "field": "body",
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}

# result
{
  "took" : 221,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "my-suggestion" : [
      {
        "text" : "lucne and elasticsear rock",
        "offset" : 0,
        "length" : 26,
        "options" : [
          {
            "text" : "lucene and elasticsearch rock",
            "highlighted" : "<em>lucene</em> and <em>elasticsearch</em> rock",
            "score" : 0.004993905
          },
          {
            "text" : "lucne and elasticsearch rock",
            "highlighted" : "lucne and <em>elasticsearch</em> rock",
            "score" : 0.0033391973
          },
          {
            "text" : "lucene and elasticsear rock",
            "highlighted" : "<em>lucene</em> and elasticsear rock",
            "score" : 0.0029183894
          }
        ]
      }
    ]
  }
}
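On the client side, a phrase suggestion is typically used for a "did you mean" hint. The sketch below picks the highest-scoring option from the response above and extracts the words that were changed, using the highlight tags from the request. The helper name corrected_words is hypothetical.

```python
import re


def corrected_words(highlighted, pre_tag="<em>", post_tag="</em>"):
    """Return the fragments wrapped in the highlight tags.

    (corrected_words is a hypothetical helper written for this example.)
    """
    pattern = re.escape(pre_tag) + r"(.*?)" + re.escape(post_tag)
    return re.findall(pattern, highlighted)


# Options copied from the phrase suggester response above.
options = [
    {"text": "lucene and elasticsearch rock",
     "highlighted": "<em>lucene</em> and <em>elasticsearch</em> rock",
     "score": 0.004993905},
    {"text": "lucne and elasticsearch rock",
     "highlighted": "lucne and <em>elasticsearch</em> rock",
     "score": 0.0033391973},
    {"text": "lucene and elasticsear rock",
     "highlighted": "<em>lucene</em> and elasticsear rock",
     "score": 0.0029183894},
]

best = max(options, key=lambda option: option["score"])
print(best["text"])                          # lucene and elasticsearch rock
print(corrected_words(best["highlighted"]))  # ['lucene', 'elasticsearch']
```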


  6. Completion suggester (prefix completion)

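    The main use case is auto-completion. Here a query is sent to the backend on every keystroke to look up matches, so when the user types quickly the backend has to respond with very low latency. For that reason the completion suggester uses a different data structure from the previous two suggesters: instead of an inverted index, the analyzed data is encoded into an FST (finite state transducer) that is stored alongside the index. For an open index, ES loads the whole FST into memory, which makes prefix lookups extremely fast. The FST only supports prefix lookups, though, and that is the main limitation of the completion suggester.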

# create index
PUT /blogs_completion
{
  "mappings": {
    "properties": {
      "body": {
        "type": "completion"
      }
    }
  }
}

# add data
POST _bulk/?refresh=true
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Lucene is cool" }
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Elasticsearch builds on top of lucene" }
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Elasticsearch rocks" }
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Elastic is the company behind ELK stack" }
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "the elk stack rocks" }
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "elasticsearch is rock solid" }

# prefix completion
POST blogs_completion/_search?pretty
{
  "size": 0,
  "suggest": {
    "blog-suggest": {
      "prefix": "elastic i",
      "completion": {
        "field": "body"
      }
    }
  }
}

# prefix completion result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "blog-suggest" : [
      {
        "text" : "elastic i",
        "offset" : 0,
        "length" : 9,
        "options" : [
          {
            "text" : "Elastic is the company behind ELK stack",
            "_index" : "blogs_completion",
            "_type" : "_doc",
            "_id" : "dllnEnkBgt5AqocFAwm-",
            "_score" : 1.0,
            "_source" : {
              "body" : "Elastic is the company behind ELK stack"
            }
          }
        ]
      }
    ]
  }
}
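The prefix-only behaviour is easy to see with a small in-memory model. The sketch below is not an FST: it swaps in a sorted list with binary search, which gives the same kind of prefix lookup for illustration purposes. The class name PrefixIndex and its lowercasing step are assumptions, not Elasticsearch internals.

```python
from bisect import bisect_left


class PrefixIndex:
    """Toy prefix lookup over a sorted list (a stand-in for the FST)."""

    def __init__(self, inputs):
        # Store (normalized key, original text) pairs sorted by key.
        self._entries = sorted((text.lower(), text) for text in inputs)
        self._keys = [key for key, _ in self._entries]

    def complete(self, prefix, size=5):
        prefix = prefix.lower()
        # Binary search for the first key that is >= the prefix.
        start = bisect_left(self._keys, prefix)
        matches = []
        for key, original in self._entries[start:]:
            if not key.startswith(prefix) or len(matches) == size:
                break
            matches.append(original)
        return matches


index = PrefixIndex([
    "Lucene is cool",
    "Elasticsearch builds on top of lucene",
    "Elasticsearch rocks",
    "Elastic is the company behind ELK stack",
    "the elk stack rocks",
    "elasticsearch is rock solid",
])

print(index.complete("elastic i"))  # ['Elastic is the company behind ELK stack']
print(index.complete("elk"))        # [] because no input starts with "elk"
```

The second lookup shows the limitation: "elk" appears inside "the elk stack rocks", but not at the start, so a prefix structure cannot find it.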


  7. Completion suggester caveats

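    One thing to note: the completion suggester also runs the original data through the analysis stage at index time. Depending on the analyzer chosen, some words may be transformed and others removed, which changes the FST encoding and therefore which prefixes match.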
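The effect of analysis on prefix matching can be illustrated with a toy analyzer. This is only a sketch: the analyze function and the STOPWORDS set below are invented for the example and are much simpler than real Elasticsearch analyzers, but they show how removing words at index time changes which prefixes match.

```python
# Hypothetical, deliberately tiny stopword list for this example.
STOPWORDS = {"the", "is", "on", "of"}


def analyze(text, remove_stopwords):
    """Toy analyzer: lowercase, split on whitespace, optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [token for token in tokens if token not in STOPWORDS]
    return " ".join(tokens)


def matches_prefix(indexed_text, prefix, remove_stopwords):
    """Apply the same analysis to the indexed input and to the query prefix."""
    analyzed_input = analyze(indexed_text, remove_stopwords)
    analyzed_prefix = analyze(prefix, remove_stopwords)
    return analyzed_input.startswith(analyzed_prefix)


doc = "the elk stack rocks"
print(matches_prefix(doc, "elk", remove_stopwords=False))  # False
print(matches_prefix(doc, "elk", remove_stopwords=True))   # True
```

With stopwords kept, the stored form starts with "the", so the prefix "elk" finds nothing. With stopwords removed, the stored form starts with "elk" and the same prefix matches.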
