Elasticsearch Query String Analysis

escray

Published: February 13, 2021

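Notes on how Elasticsearch analyzes query strings. The commentary comes from the Elasticsearch advanced course by 中华石杉 on Bilibili; the English passages are quoted from the official documentation.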

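Query string analysis

A query string must be analyzed with the same analyzer that was used when the index was built.

Newer versions appear to allow a different analyzer at search time, but doing so affects the search results.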

Index and search analysis

In most cases, the same analyzer should be used at index and search time. This ensures the values and query strings for a field are changed into the same form of tokens. In turn, this ensures the tokens match as expected during a search.

While less common, it sometimes makes sense to use different analyzers at index and search time. To enable this, Elasticsearch allows you to specify a separate search analyzer.

Generally, a separate search analyzer should only be specified when using the same form of tokens for field values and query strings would create unexpected or irrelevant search matches.
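As a rough illustration of the separate search analyzer described above, the sketch below builds a create-index body in which one text field uses different analyzers at index time and at search time. The `analyzer` and `search_analyzer` mapping parameters belong to Elasticsearch; the index name `my-index`, the field name `title`, and the choice of the `whitespace` and `simple` analyzers are placeholders picked for illustration. The script only prints the request and does not contact a cluster.

```python
import json

# Hypothetical index name; only the "analyzer" and "search_analyzer"
# mapping parameters come from Elasticsearch itself.
INDEX_NAME = "my-index"


def build_mapping() -> dict:
    """Return a create-index body whose text field uses one analyzer
    for indexing and another for query strings."""
    return {
        "mappings": {
            "properties": {
                "title": {  # hypothetical field name
                    "type": "text",
                    "analyzer": "whitespace",     # applied when documents are indexed
                    "search_analyzer": "simple",  # applied to the query string
                }
            }
        }
    }


# Print the request as it could be pasted into a console.
print(f"PUT /{INDEX_NAME}")
print(json.dumps(build_mapping(), indent=2))
```

When `search_analyzer` is omitted, the field's `analyzer` is used for both sides, which is the default behavior the notes above describe.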

How query strings treat exact values and full text differently

date: exact value
_all: full text

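Suppose we have a document with a field whose value is: hello you and me. An inverted index is built from it. When we search the index that holds this document with the text hello me, that search text is the query string.

By default, Elasticsearch analyzes the query string with the same analyzer that was used to build the inverted index for the target field, applying both tokenization and normalization. Only then can the search match correctly.

If dogs was normalized to dog when the inverted index was built, but the search still looks for dogs, nothing will be found. The dogs in the query must therefore also become dog before it can match.

Key point: fields of different types behave differently; some are full text, others are exact values.

post_date (date): exact value
_all: full text, with tokenization and normalization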
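The dogs/dog mismatch can be reproduced with a toy inverted index. This is a minimal sketch, not how Elasticsearch is implemented: the `STEMS` table stands in for a real stemmer, the regex stands in for a real tokenizer, and document 2 is made up for the demonstration.

```python
import re
from collections import defaultdict

# Toy normalization table standing in for a real stemmer (assumption).
STEMS = {"dogs": "dog"}


def analyze(text: str) -> list[str]:
    """Lowercase, split on non-word characters, then normalize each token."""
    tokens = re.findall(r"\w+", text.lower())
    return [STEMS.get(t, t) for t in tokens]


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each analyzed token to the IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str, analyzed: bool = True) -> set[int]:
    """Return IDs of documents matching any query term.

    With analyzed=False the query is only split on whitespace, which
    simulates searching without the index-time analyzer.
    """
    terms = analyze(query) if analyzed else query.split()
    hits: set[int] = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


# Document 1 is the example from the text; document 2 is hypothetical.
docs = {1: "hello you and me", 2: "Dogs are friendly"}
index = build_index(docs)

print(search(index, "hello me"))              # {1}
print(search(index, "dogs", analyzed=False))  # set()
print(search(index, "dogs"))                  # {2}
```

The unanalyzed query misses because the index only contains the normalized term dog; running the query through the same `analyze` function restores the match.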

Mapping example

GET /_search?q=2017

This searches the _all field: all fields of a document are concatenated into one long string, which is then analyzed.

2017-01-02 my second article this is my second article in this website 11400

Searching _all for 2017 therefore matches 3 documents.
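The concatenation step can be sketched as follows. This is a simplified model of the `_all` field as it existed in older Elasticsearch versions (it was deprecated in 6.0). Document 2 reproduces the concatenated string shown above; the field names and documents 1 and 3 are assumptions, and the regex is only a rough stand-in for the standard analyzer.

```python
import re

# Doc 2's values follow the concatenated string shown above; the field
# names and docs 1 and 3 are assumptions made for illustration.
DOCS = {
    1: {"post_date": "2017-01-01", "title": "my first article",
        "content": "this is my first article in this website", "author_id": 11400},
    2: {"post_date": "2017-01-02", "title": "my second article",
        "content": "this is my second article in this website", "author_id": 11400},
    3: {"post_date": "2017-01-03", "title": "my third article",
        "content": "this is my third article in this website", "author_id": 11400},
}


def all_field(doc: dict) -> str:
    """Concatenate every field value into one string, like the _all field."""
    return " ".join(str(value) for value in doc.values())


def tokenize(text: str) -> set[str]:
    """Rough stand-in for the standard analyzer: lowercase word/number runs."""
    return set(re.findall(r"\w+", text.lower()))


def search_all(query: str) -> list[int]:
    """Return IDs of documents whose concatenated text shares a term with the query."""
    terms = tokenize(query)
    return [doc_id for doc_id, doc in DOCS.items()
            if terms & tokenize(all_field(doc))]


print(all_field(DOCS[2]))
print(search_all("2017"))  # [1, 2, 3]
```

Because the date string is split into 2017, 01, and 02 in this model, the term 2017 appears in every document's concatenated text.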

GET /_search?q=2017-01-01

Against _all, the query string 2017-01-01 is analyzed with the same analyzer that was used to build the inverted index.

GET /_search?q=post_date:2017-01-01

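A date field is indexed as an exact value.

With post_date:2017-01-01, the value 2017-01-01 matches exactly one document, doc1.

GET /_search?q=post_date:2017 is not covered here, because its behavior comes from an optimization introduced after Elasticsearch 5.2.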
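To contrast the two matching styles, the sketch below runs the same query, 2017-01-01, once as analyzed full text and once as an exact value. The three `post_date` values are hypothetical, the regex tokenizer only approximates an analyzer, and plain string equality is a simplification of how Elasticsearch actually stores and compares dates.

```python
import re

# Hypothetical post_date values for three documents (assumption).
POST_DATES = {1: "2017-01-01", 2: "2017-01-02", 3: "2017-01-03"}


def tokenize(text: str) -> set[str]:
    """Rough stand-in for an analyzer: lowercase word/number runs."""
    return set(re.findall(r"\w+", text.lower()))


def match_full_text(query: str) -> list[int]:
    """Full text: analyze both sides and match if any term is shared."""
    terms = tokenize(query)
    return [doc_id for doc_id, value in POST_DATES.items()
            if terms & tokenize(value)]


def match_exact(query: str) -> list[int]:
    """Exact value: the whole stored value must equal the query."""
    return [doc_id for doc_id, value in POST_DATES.items() if value == query]


print(match_full_text("2017-01-01"))  # [1, 2, 3]
print(match_exact("2017-01-01"))      # [1]
```

In this toy model the analyzed query breaks into the terms 2017 and 01, which every document shares, while the exact comparison only accepts doc1.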

Testing an analyzer

GET /_analyze
{
  "analyzer": "standard",
  "text": "Text to analyze"
}
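The same analyze call can be issued from a script. The sketch below uses only the Python standard library and assumes an unauthenticated cluster at `http://localhost:9200`; that address and the helper names are assumptions. It sends the body with POST, which the `_analyze` endpoint accepts alongside GET. Only the request construction runs here; `analyze` needs a reachable cluster.

```python
import json
import urllib.request

# Assumed address of a local, unauthenticated cluster.
ES_URL = "http://localhost:9200"


def build_analyze_request(text: str, analyzer: str = "standard") -> urllib.request.Request:
    """Build (but do not send) an _analyze request for the given text."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text: str, analyzer: str = "standard") -> list[str]:
    """Send the request and return the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(text, analyzer)) as resp:
        payload = json.load(resp)
    return [entry["token"] for entry in payload["tokens"]]


# Inspect the request without contacting a cluster.
req = build_analyze_request("Text to analyze")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With a cluster running, `analyze("Text to analyze")` would return the tokens the chosen analyzer produces, which makes it easy to compare analyzers on the same input.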

Original link: http://xie.infoq.cn/article/4a97b9524fd89c28156daaaa7
