
Elasticsearch Compound Queries

escray
Published: 2021-02-22
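Notes on Elasticsearch compound queries. Some of the content comes from the core-knowledge part (核心知识篇) of 中华石杉's Elasticsearch 顶尖高手系列 course on Bilibili; most of it, including all of the English text, is quoted from the official documentation. TL;DR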

Compound queries


Compound queries wrap other compound or leaf queries, either to combine their results and scores, to change their behaviour, or to switch from query to filter context.


  • bool query

  • boosting query

  • constant_score query

  • dis_max query

  • function_score query



Boolean query


The default query for combining multiple leaf or compound query clauses, as must, should, must_not, or filter clauses. The must and should clauses have their scores combined -- the more matching clauses, the better -- while the must_not and filter clauses are executed in filter context.


A query that matches documents matching boolean combinations of other queries. The bool query maps to Lucene BooleanQuery. It is built using one or more boolean clauses, each clause with a typed occurrence.


  • must: The clause (query) must appear in matching documents and will contribute to the score.

  • filter: The clause (query) must appear in matching documents. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.

  • should: The clause (query) should appear in the matching document.

  • must_not: The clause (query) must not appear in the matching documents. ...Because scoring is ignored, a score of 0 for all documents is returned.


The bool query takes a more-matches-is-better approach, so the score from each matching must or should clause will be added together to provide the final _score for each document.
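The "more-matches-is-better" rule above can be sketched in a few lines of Python. This is a simplified model, not the Lucene implementation: the function name `bool_score` and the convention that `None` means "the clause did not match" are assumptions made for illustration, and the boost parameter is ignored.

```python
from typing import List, Optional


def bool_score(must: List[Optional[float]],
               should: List[Optional[float]]) -> Optional[float]:
    """Simplified model of bool scoring.

    Each list holds one score per clause; None means the clause did not match.
    Returns None when the document is excluded (a must clause failed),
    otherwise the sum of the scores of all matching must and should clauses.
    """
    if any(score is None for score in must):
        return None  # every must clause has to match
    matching_should = [score for score in should if score is not None]
    return sum(must) + sum(matching_should)


# One must clause matched with 1.5; one of two should clauses matched with 0.5.
print(bool_score([1.5], [0.5, None]))
```

A document that also matches the second should clause would simply add that clause's score on top, which is why matching more clauses ranks it higher.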


GET /website/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "elasticsearch" } }
      ],
      "should": [
        { "match": { "content": "elasticsearch" } }
      ],
      "must_not": [
        { "match": { "author_id": 111 } }
      ]
    }
  }
}

GET /website/_search
{
  "query": {
    "bool": {
      "must": { "match": { "title": "how to make millions" } },
      "must_not": { "match": { "tag": "spam" } },
      "should": [
        { "match": { "tag": "starred" } }
      ],
      "filter": { "range": { "date": { "gte": "2014-01-01" } } }
    }
  }
}


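Each sub-query computes a relevance score for a document, and the bool query then combines those scores into a single score. Filter clauses, of course, do not compute a score.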


GET /website/_search
{
  "query": {
    "bool": {
      "must": { "match": { "title": "how to make millions" } },
      "must_not": { "match": { "tag": "spam" } },
      "should": [
        { "match": { "tag": "starred" } }
      ],
      "filter": {
        "bool": {
          "must": [
            { "range": { "date": { "gte": "2014-01-01" } } },
            { "range": { "price": { "lte": "29.99" } } }
          ],
          "must_not": [
            { "term": { "category": "ebooks" } }
          ]
        }
      }
    }
  }
}

GET /employee/_search
{
  "query": {
    "constant_score": {
      "filter": { "range": { "age": { "gte": 30 } } }
    }
  }
}


POST _search
{
  "query": {
    "bool": {
      "must": {
        "term": { "user.id": "kimchy" }
      },
      "filter": {
        "term": { "tags": "production" }
      },
      "must_not": {
        "range": {
          "age": { "gte": 10, "lte": 20 }
        }
      },
      "should": [
        { "term": { "tags": "env1" } },
        { "term": { "tags": "deployed" } }
      ],
      "minimum_should_match": 1,
      "boost": 1.0
    }
  }
}


Using minimum_should_match


You can use the minimum_should_match parameter to specify the number or percentage of should clauses returned documents must match.


If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0.
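The default rule just quoted is easy to misread, so here is a small Python sketch of it. The helper name `default_minimum_should_match` is made up for illustration; it only inspects the body of a bool query given as a plain dict and does not talk to Elasticsearch.

```python
def default_minimum_should_match(bool_query: dict) -> int:
    """Return the default minimum_should_match for the body of a bool query.

    Rule from the documentation: 1 if there is at least one should clause
    and no must or filter clause, otherwise 0.
    """
    has_should = bool(bool_query.get("should"))
    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    return 1 if has_should and not has_must_or_filter else 0


only_should = {"should": [{"term": {"tags": "env1"}}]}
with_filter = {
    "filter": {"term": {"tags": "production"}},
    "should": [{"term": {"tags": "env1"}}],
}
print(default_minimum_should_match(only_should))
print(default_minimum_should_match(with_filter))
```

In the second case the should clause becomes purely optional: it can raise the score of a document but is no longer required for the document to match.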


Scoring with bool.filter


Queries specified under the filter element have no effect on scoring -- scores are returned as 0. Scores are only affected by the query that has been specified.


// assigns a score of 0 to all documents, as no scoring query has been specified
GET _search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "status": "active" }
      }
    }
  }
}

// bool query has a match_all query, which assigns a score of 1.0 to all documents
GET _search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "term": { "status": "active" }
      }
    }
  }
}

// constant_score query assigns a score of 1.0 to all documents matched by the filter
GET _search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "status": "active" }
      }
    }
  }
}


Named queries


Each query accepts a _name in its top level definition. You can use named queries to track which queries matched returned documents.
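In the search response, the names of the queries that matched are reported per hit in a matched_queries array. A minimal Python sketch of reading it follows; the helper name `matched_query_names` and the sample response (IDs, scores and names) are invented for illustration and are not real output.

```python
def matched_query_names(response: dict) -> dict:
    """Map each hit's _id to the list of named queries it matched."""
    return {
        hit["_id"]: hit.get("matched_queries", [])
        for hit in response["hits"]["hits"]
    }


# Hypothetical, trimmed-down search response used only to exercise the helper.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_score": 1.3, "matched_queries": ["first", "test"]},
            {"_id": "2", "_score": 0.0, "matched_queries": ["test"]},
        ]
    }
}

for doc_id, names in matched_query_names(sample_response).items():
    print(doc_id, names)
```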


GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name.first": { "query": "shay", "_name": "first" } } },
        { "match": { "name.last": { "query": "banon", "_name": "last" } } }
      ],
      "filter": {
        "terms": {
          "name.last": ["banon", "kimchy"],
          "_name": "test"
        }
      }
    }
  }
}

Boosting query


Returns documents matching a positive query while reducing the relevance score of documents that also match a negative query.
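As a rough model of that behaviour: a document keeps its score from the positive query, and if it also matches the negative query the score is multiplied by negative_boost. The Python below is only a sketch of this arithmetic; the function name `boosting_score` and its boolean `matches_negative` argument are assumptions for illustration.

```python
def boosting_score(positive_score: float,
                   matches_negative: bool,
                   negative_boost: float) -> float:
    """Score of a document that matched the positive query of a boosting query."""
    if not 0.0 <= negative_boost <= 1.0:
        raise ValueError("negative_boost must be between 0 and 1.0")
    if matches_negative:
        # Demote, but do not exclude, documents that match the negative query.
        return positive_score * negative_boost
    return positive_score


print(boosting_score(2.0, matches_negative=True, negative_boost=0.5))
print(boosting_score(2.0, matches_negative=False, negative_boost=0.5))
```

Unlike must_not, the negative query never removes a document from the results; it only pushes it down the ranking.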


GET /_search
{
  "query": {
    "boosting": {
      "positive": {
        "term": { "text": "apple" }
      },
      "negative": {
        "term": { "text": "pie tart fruit crumble tree" }
      },
      "negative_boost": 0.5
    }
  }
}

Constant score query


Wraps a filter query and returns every matching document with a relevance score equal to the boost parameter value.


GET /_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "user.id": "kimchy" }
      },
      "boost": 1.2
    }
  }
}

Disjunction max query


Returns documents matching one or more wrapped queries, called query clauses or clauses.


If a returned document matches multiple query clauses, the dis_max query assigns the document the highest relevance score from any matching clause, plus a tie breaking increment for any additional matching subqueries.
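The scoring rule described above comes down to "best clause wins, the others add a little". A minimal Python sketch, assuming per-clause scores are already known and that `None` marks a clause that did not match (the function name `dis_max_score` is made up for illustration):

```python
from typing import List, Optional


def dis_max_score(clause_scores: List[Optional[float]],
                  tie_breaker: float = 0.0) -> Optional[float]:
    """Simplified dis_max scoring.

    Takes the highest score among the matching clauses and adds
    tie_breaker times the score of every other matching clause.
    Returns None when no clause matched.
    """
    matched = [score for score in clause_scores if score is not None]
    if not matched:
        return None
    best = max(matched)
    others = sum(matched) - best
    return best + tie_breaker * others


print(dis_max_score([2.0, 1.0], tie_breaker=0.5))
print(dis_max_score([2.0, 1.0]))
```

With tie_breaker at 0 only the best clause counts; at 1.0 the result is the plain sum of the matching clauses, which is what a bool query with should clauses would give.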


GET /_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "term": { "title": "Quick pets" } },
        { "term": { "body": "Quick pets" } }
      ],
      "tie_breaker": 0.7
    }
  }
}

Function score query


The function_score allows you to modify the score of documents that are retrieved by a query.
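How the function's result is merged with the original query score is controlled by boost_mode. The Python sketch below models that last step only; the helper name `combine_scores` is hypothetical, and the list of modes (multiply, replace, sum, avg, max, min) is taken from the official documentation rather than from this article.

```python
def combine_scores(query_score: float,
                   function_score: float,
                   boost_mode: str = "multiply") -> float:
    """Combine the query score with the score function's result."""
    if boost_mode == "multiply":
        return query_score * function_score
    if boost_mode == "replace":
        return function_score  # the query score is ignored
    if boost_mode == "sum":
        return query_score + function_score
    if boost_mode == "avg":
        return (query_score + function_score) / 2
    if boost_mode == "max":
        return max(query_score, function_score)
    if boost_mode == "min":
        return min(query_score, function_score)
    raise ValueError(f"unknown boost_mode: {boost_mode}")


print(combine_scores(2.0, 3.0))             # default mode: multiply
print(combine_scores(2.0, 3.0, "replace"))
```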


// The function_score query provides several types of score functions:
// script_score, weight, random_score, field_value_factor, decay functions (gauss, linear, exp)
GET /_search
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost": "5",
      "random_score": {},
      "boost_mode": "multiply"
    }
  }
}

