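Elasticsearch Search Options
Notes on Elasticsearch search options. The material comes from the core-knowledge part of 中华石杉's Elasticsearch expert course series on Bilibili, and the English passages come from Elasticsearch: The Definitive Guide [2.x]. Some of it appears to be dated, but I think the underlying principles are largely the same. Corrections are welcome.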
A few optional query-string parameters can influence the search process.
preference
The preference parameter allows you to control which shards or nodes are used to handle the search request. Values include:
_primary
_primary_first
_local
_only_node:xyz
_prefer_node:xyz
_shards:2,3
Bouncing Results
The bouncing results problem: suppose two documents have the same value in the field being sorted on. Search requests are round-robined across the copies of each shard (the primary and its replicas), and different copies may return those two documents in a different order. As a result, every time the user refreshes the page, the results can appear in a different order.
Bouncing results can be avoided by always using the same shards for the same user, which can be done by setting the preference parameter to an arbitrary string like the user's session ID or user ID.
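As a minimal sketch of that fix, the snippet below builds a search URL that carries a per-user preference string. The host, index name (`tweets`), session ID, and the helper name `search_url` are all hypothetical. The guard against a leading underscore is there on the assumption that such names are reserved for built-in values like `_local`.

```python
from urllib.parse import urlencode


def search_url(base, index, query, session_id):
    """Build a search URL whose `preference` is tied to one user session."""
    # Assumption: custom preference strings should not start with "_",
    # since that prefix is used by built-in values such as _local.
    if session_id.startswith("_"):
        raise ValueError("custom preference must not start with '_'")
    params = {"q": query, "preference": session_id}
    return f"{base}/{index}/_search?{urlencode(params)}"


# Hypothetical host, index, and session ID, for illustration only.
url = search_url("http://localhost:9200", "tweets", "elasticsearch", "session-abc123")
print(url)
```

Sending the same string on every request from a given user is what keeps that user on the same shard copies; the string itself carries no meaning.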
timeout
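The main purpose is to cap the search time: whatever data has been collected within the limit is returned directly, so that a query cannot run for too long.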
By default, shards process all the data they have before returning a response to the coordinating node, which will in turn merge these responses to build the final response.
If one node is having trouble, it could slow down the response to all search requests.
The timeout parameter tells shards how long they are allowed to process data before returning a response to the coordinating node. If there was not enough time to process all the data, results for this shard will be partial, possibly even empty.
The response to a search request indicates whether any shards returned a partial response through the timed_out property.
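A rough sketch of how a client might use this: pass `timeout` as a query-string parameter, then check `timed_out` in the response body before trusting the results as complete. The host, index name, helper names, and the sample response below are hand-written assumptions, not output captured from a real cluster.

```python
import json
from urllib.parse import urlencode


def timed_search_url(base, index, query, timeout):
    """Build a search URL with a per-shard time budget such as '10ms'."""
    params = {"q": query, "timeout": timeout}
    return f"{base}/{index}/_search?{urlencode(params)}"


def is_partial(response):
    """Return True when the response reports that the search timed out."""
    return bool(response.get("timed_out", False))


# Hand-written sample body; a real response contains more fields.
sample_response = json.loads("""
{
  "took": 12,
  "timed_out": true,
  "hits": {"total": 3, "hits": []}
}
""")

url = timed_search_url("http://localhost:9200", "tweets", "elasticsearch", "10ms")
print(url)
if is_partial(sample_response):
    print("warning: results may be incomplete")
```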
routing
By default a document is routed by its _id. With a custom value such as routing=user_id, all data belonging to the same user goes to the same shard.
At index time, a custom routing parameter can be provided to ensure that all related documents, such as the documents belonging to a single user, are stored on a single shard.
At search time, instead of searching on all the shards of an index, you can specify one or more routing values to limit the search to just those shards.
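The sketch below illustrates both halves of this idea. `toy_shard_for` shows why a routing value pins documents to one shard: the value is hashed and taken modulo the number of primary shards. It uses `zlib.crc32` purely as a stand-in, not Elasticsearch's actual hash function. `routed_search_url` builds a search URL with a comma-separated list of routing values. The host, index name, shard count, and user IDs are hypothetical.

```python
import zlib
from urllib.parse import urlencode

NUM_PRIMARY_SHARDS = 5  # assumed index setting, for illustration only


def toy_shard_for(routing_value, num_shards=NUM_PRIMARY_SHARDS):
    """Map a routing value to a shard number (stand-in hash, not the real one)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def routed_search_url(base, index, routing_values):
    """Build a search URL restricted to the shards of the given routing values."""
    params = {"routing": ",".join(routing_values)}
    # safe="," keeps the comma-separated list unescaped in the URL.
    return f"{base}/{index}/_search?{urlencode(params, safe=',')}"


url = routed_search_url("http://localhost:9200", "tweets", ["user_1", "user_2"])
print(url)
# The same routing value always maps to the same shard.
print(toy_shard_for("user_1") == toy_shard_for("user_1"))
```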
search_type
The default search type is query_then_fetch.
Setting search_type to dfs_query_then_fetch can improve the accuracy of relevance scoring.
The dfs_query_then_fetch search type has a prequery phase that fetches the term frequencies from all involved shards to calculate global term frequencies.
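To see why global term frequencies matter, here is a toy calculation with made-up per-shard statistics for a single term. It assumes a classic TF/IDF-style inverse document frequency, `1 + ln(numDocs / (docFreq + 1))`; the actual scoring depends on the similarity configured for the index. When each shard scores on its own statistics, the same term gets very different weights on different shards. Pooling the statistics first, which is what the prequery phase is for, gives one consistent weight.

```python
import math

# Hypothetical statistics for one term: (docs containing the term, total docs).
shard_stats = {"shard_0": (1, 100), "shard_1": (50, 100)}


def idf(doc_freq, doc_count):
    """Classic TF/IDF-style inverse document frequency (assumed formula)."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


# Without the prequery phase: each shard uses only its own statistics.
local_idf = {shard: idf(df, n) for shard, (df, n) in shard_stats.items()}

# With the prequery phase: statistics are summed across shards first.
global_df = sum(df for df, _ in shard_stats.values())
global_n = sum(n for _, n in shard_stats.values())
global_idf = idf(global_df, global_n)

for shard, value in local_idf.items():
    print(f"{shard}: local idf = {value:.3f}")
print(f"global idf = {global_idf:.3f}")
```

With evenly distributed data the local and global values are usually close, so the extra round trip tends to pay off mainly on small or skewed indices.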
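Copyright notice: this is an original article by InfoQ author escray.
Original link: http://xie.infoq.cn/article/5dc42b89c0a1f1f5d31f3d902
This article is licensed under CC-BY 4.0. When republishing, please retain the original source and this copyright notice.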