000 ES suggest- 英文
发布于: 2021 年 04 月 29 日
Suggest 使用场景,自动补全或者智能纠错,基本使用下来,可以很好的满足
建立索引,注意 type 的选择
PUT /blogs
{
"mappings": {
"properties": {
"body": {
"type": "text"
}
}
}
}
复制代码
写入数据
POST _bulk/?refresh=true
{ "index" : { "_index" : "blogs" } }
{ "body": "Lucene is cool"}
{ "index" : { "_index" : "blogs" } }
{ "body": "Elasticsearch builds on top of lucene"}
{ "index" : { "_index" : "blogs" } }
{ "body": "Elasticsearch rocks"}
{ "index" : { "_index" : "blogs" } }
{ "body": "Elastic is the company behind ELK stack"}
{ "index" : { "_index" : "blogs" } }
{ "body": "elk rocks"}
{ "index" : { "_index" : "blogs" } }
{ "body": "elasticsearch is rock solid"}
复制代码
Term suggest (missing model),大家可以从返回的 result 推测出 missing mode 的意思,返回结果中并没有 options
POST /blogs/_search
{
"suggest": {
"my-suggestion": {
"text": "lucne rock",
"term": {
"suggest_mode": "missing",
"field": "body"
}
}
}
}
# result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"my-suggestion" : [
{
"text" : "lucne",
"offset" : 0,
"length" : 5,
"options" : [
{
"text" : "lucene",
"score" : 0.8,
"freq" : 2
}
]
},
{
"text" : "rock",
"offset" : 6,
"length" : 4,
"options" : [ ]
}
]
}
}
复制代码
Term suggest (popular model),此时是有返回 options 的
POST /blogs/_search
{
"suggest": {
"my-suggestion": {
"text": "lucne rock",
"term": {
"suggest_mode": "popular",
"field": "body"
}
}
}
}
# result
{
"took" : 582,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"my-suggestion" : [
{
"text" : "lucne",
"offset" : 0,
"length" : 5,
"options" : [
{
"text" : "lucene",
"score" : 0.8,
"freq" : 2
}
]
},
{
"text" : "rock",
"offset" : 6,
"length" : 4,
"options" : [
{
"text" : "rocks",
"score" : 0.75,
"freq" : 2
}
]
}
]
}
}
复制代码
Phrase suggest
POST /blogs/_search
{
"suggest": {
"my-suggestion": {
"text": "lucne and elasticsear rock",
"phrase": {
"field": "body",
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
}
}
}
# result
{
"took" : 221,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"my-suggestion" : [
{
"text" : "lucne and elasticsear rock",
"offset" : 0,
"length" : 26,
"options" : [
{
"text" : "lucene and elasticsearch rock",
"highlighted" : "<em>lucene</em> and <em>elasticsearch</em> rock",
"score" : 0.004993905
},
{
"text" : "lucne and elasticsearch rock",
"highlighted" : "lucne and <em>elasticsearch</em> rock",
"score" : 0.0033391973
},
{
"text" : "lucene and elasticsear rock",
"highlighted" : "<em>lucene</em> and elasticsear rock",
"score" : 0.0029183894
}
]
}
]
}
}
复制代码
Completion Suggester(prefix completion)
主要针对的应用场景就是"Auto Completion"。 此场景下用户每输入一个字符的时候,就需要即时发送一次查询请求到后端查找匹配项,在用户输入速度较高的情况下对后端响应速度要求比较苛刻。因此实现上它和前面两个 Suggester 采用了不同的数据结构,索引并非通过倒排来完成,而是将 analyze 过的数据编码成 FST 和索引一起存放。对于一个 open 状态的索引,FST 会被 ES 整个装载到内存里的,进行前缀查找速度极快。但是 FST 只能用于前缀查找,这也是 Completion Suggester 的局限所在。
# create index
PUT /blogs_completion/
{
"mappings": {
"properties": {
"body": {
"type": "completion"
}
}
}
}
# add data
POST _bulk/?refresh=true
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Lucene is cool"}
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Elasticsearch builds on top of lucene"}
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "Elasticsearch rocks"}
{ "index" : { "_index" : "blogs_completion"} }
{ "body": "Elastic is the company behind ELK stack"}
{ "index" : { "_index" : "blogs_completion" } }
{ "body": "the elk stack rocks"}
{ "index" : { "_index" : "blogs_completion"} }
{ "body": "elasticsearch is rock solid”}
# prefix completion
POST blogs_completion/_search?pretty
{ "size": 0,
"suggest": {
"blog-suggest": {
"prefix": "elastic i",
"completion": {
"field": "body"
}
}
}
}
# prefix completion result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"blog-suggest" : [
{
"text" : "elastic i",
"offset" : 0,
"length" : 9,
"options" : [
{
"text" : "Elastic is the company behind ELK stack",
"_index" : "blogs_completion",
"_type" : "_doc",
"_id" : "dllnEnkBgt5AqocFAwm-",
"_score" : 1.0,
"_source" : {
"body" : "Elastic is the company behind ELK stack"
}
}
]
}
]
}
}
复制代码
Completion suggest 注意事项
值得注意的一点是 Completion Suggester 在索引原始数据的时候也要经过 analyze 阶段,取决于选用的 analyzer 不同,某些词可能会被转换,某些词可能被去除,这些会影响 FST 编码结果,也会影响查找匹配的效果。
划线
评论
复制
发布于: 2021 年 04 月 29 日阅读数: 13
小林-1025
关注
还未添加个人签名 2018.03.01 加入
还未添加个人简介
评论