
Elasticsearch Hands-On Trilogy, Part 3: Search Operations

"description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "






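  1. If what we actually wanted was only the results matching "Core Java", the results above clearly fall short. Adding the "operator":"and" attribute to the query condition makes the query return only documents that match all of the keywords. Note that the JSON structure changes slightly: the title attribute used to hold the search text directly, but it is now a JSON object whose query attribute holds the original search text: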

GET englishbooks/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Core Java",
        "operator": "and"
      }
    }
  }
}

This time the result contains only the record that matches both terms, "core" and "java". (Why are core and java lowercase? Because "Core Java" is lowercased during analysis before the search runs.)

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "3",
        "_score": 0.5753642,
        "_source": {
          "id": "3",
          "title": "Core Java",
          "language": "java",
          "author": "Horstmann",
          "price": 85.9,
          "publish_time": "2016-06-01",
          "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
        }
      }
    ]
  }
}

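The difference between the default `or` behavior and `"operator":"and"` can be sketched in a few lines of Python. This is only an illustration of the matching rule, not how Elasticsearch is implemented: `analyze` is a rough stand-in for the real analyzer (lowercase, split on non-word characters), and the title "Java Basics" is a hypothetical document added for contrast.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match(field_value, query, operator="or"):
    """Return True if the analyzed field satisfies the analyzed query.

    operator="or":  at least one query term must be present.
    operator="and": every query term must be present.
    """
    field_terms = set(analyze(field_value))
    hits = [term in field_terms for term in analyze(query)]
    return all(hits) if operator == "and" else any(hits)


# "Java Basics" is a hypothetical title; the other titles appear in this article.
titles = ["Deep Learning", "Compilers", "Core Java", "Java Basics"]

print([t for t in titles if match(t, "Core Java", "or")])   # ['Core Java', 'Java Basics']
print([t for t in titles if match(t, "Core Java", "and")])  # ['Core Java']
```

With `or`, any title sharing a single term with the query is a hit; with `and`, only the title containing both "core" and "java" survives.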

match_phrase search

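A match_phrase search is similar to the match search above, with two distinguishing characteristics:

  1. Every term produced by analysis must match, which is the same effect as the "operator":"and" attribute above;

  2. The analyzed terms must appear in the same order in the searched field for a document to match;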

GET englishbooks/_search
{
  "query": {
    "match_phrase": {"title": "Core Java"}
  }
}



The query above returns a result, but changing "Core Java" to "Java Core" returns nothing, whereas a match query with "Java Core" still finds the document.
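The two rules of match_phrase (all terms present, in the same order) amount to looking for the query terms as a consecutive run inside the field's terms. Here is a minimal sketch, assuming the same simplified `analyze` stand-in for the analyzer and ignoring the `slop` parameter (treated as 0):

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_phrase(field_value, phrase):
    """Return True if the phrase terms occur consecutively and in order in the field."""
    field_terms = analyze(field_value)
    phrase_terms = analyze(phrase)
    n = len(phrase_terms)
    return n > 0 and any(
        field_terms[i:i + n] == phrase_terms
        for i in range(len(field_terms) - n + 1)
    )


print(match_phrase("Core Java", "Core Java"))  # True
print(match_phrase("Core Java", "Java Core"))  # False: same terms, wrong order
```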

match_phrase_prefix search

match_phrase_prefix works like match_phrase above, except that it treats the last term as a prefix. As shown below, the search text "Core J" finds nothing with match_phrase, but match_phrase_prefix finds the document because "J" matches "Java" as a prefix:

GET englishbooks/_search
{
  "query": {
    "match_phrase_prefix": {"title": "Core J"}
  }
}


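The prefix rule can be sketched as a small variation on phrase matching: every term except the last must match exactly, and the last one only needs to be a prefix of the term at that position. As before, `analyze` is a simplified stand-in for the real analyzer, and this is an illustration of the rule rather than Elasticsearch's implementation.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_phrase_prefix(field_value, phrase):
    """Phrase match where the last query term is treated as a prefix."""
    field_terms = analyze(field_value)
    phrase_terms = analyze(phrase)
    if not phrase_terms:
        return False
    *head, last = phrase_terms
    n = len(phrase_terms)
    for i in range(len(field_terms) - n + 1):
        window = field_terms[i:i + n]
        # All leading terms must be equal; the final one only has to start with `last`.
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False


print(match_phrase_prefix("Core Java", "Core J"))  # True
print(match_phrase_prefix("Core Java", "J Core"))  # False
```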

multi_match search

multi_match builds on match by searching multiple fields. The following query uses the two terms "1986" and "deep" to search both the title and description fields:

GET englishbooks/_search
{
  "query": {
    "multi_match": {
      "query": "1986 deep",
      "fields": ["title", "description"]
    }
  }
}




The response is shown below. Every document whose title or description contains the term "1986" or "deep" is returned:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.79237825,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "2",
        "_score": 0.79237825,
        "_source": {
          "id": "2",
          "title": "Compilers",
          "language": "c",
          "author": "Alfred V.Aho",
          "price": 62.5,
          "publish_time": "2011-01-01",
          "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
        }
      },
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "id": "1",
          "title": "Deep Learning",
          "language": "python",
          "author": "Yoshua Bengio",
          "price": 549,
          "publish_time": "2016-11-18",
          "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
        }
      }
    ]
  }
}

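To see why both documents come back, the multi-field rule can be sketched as "any query term found in any listed field". The sketch below uses the titles and descriptions shown in this article, a simplified `analyze` stand-in for the analyzer, and ignores scoring entirely.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def multi_match(doc, query, fields):
    """Return True if any query term appears in any of the given fields."""
    query_terms = set(analyze(query))
    return any(query_terms & set(analyze(str(doc.get(f, "")))) for f in fields)


books = [
    {"id": "1", "title": "Deep Learning",
     "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."},
    {"id": "2", "title": "Compilers",
     "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."},
    {"id": "3", "title": "Core Java",
     "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "},
]

matched = [b["id"] for b in books if multi_match(b, "1986 deep", ["title", "description"])]
print(matched)  # ['1', '2']
```

Restricting `fields` to `["title"]` leaves only document 1, since "1986" appears only in a description.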

terms query

terms extends the term query so that several terms can be searched at once:

GET englishbooks/_search
{
  "query": {
    "terms": {
      "title": ["deep", "core"]
    }
  }
}




The response is shown below. Documents whose title contains deep or core are all found:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "1",
        "_score": 1,
        "_source": {
          "id": "1",
          "title": "Deep Learning",
          "language": "python",
          "author": "Yoshua Bengio",
          "price": 549,
          "publish_time": "2016-11-18",
          "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
        }
      },
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "3",
        "_score": 1,
        "_source": {
          "id": "3",
          "title": "Core Java",
          "language": "java",
          "author": "Horstmann",
          "price": 85.9,
          "publish_time": "2016-06-01",
          "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
        }
      }
    ]
  }
}

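One detail worth illustrating: term-level queries such as terms do not analyze the search input, so the values are compared verbatim against the indexed terms. The sketch below assumes the title field is indexed through a lowercasing analyzer (modeled by the simplified `analyze` helper), which is why "deep" matches but "Deep" would not.

```python
import re


def analyze(text):
    """Rough stand-in for the index-time analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def terms_query(field_value, values):
    """Return True if any of the given values equals an indexed term exactly.

    The search values are NOT analyzed, mirroring term-level query behavior.
    """
    indexed_terms = set(analyze(field_value))
    return any(value in indexed_terms for value in values)


titles = ["Deep Learning", "Compilers", "Core Java"]
print([t for t in titles if terms_query(t, ["deep", "core"])])  # ['Deep Learning', 'Core Java']
print([t for t in titles if terms_query(t, ["Deep", "Core"])])  # []
```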


A range query searches by range, for example documents whose publish_time falls between "2016-01-01" and "2016-12-31":

GET englishbooks/_search
{
  "query": {
    "range": {
      "publish_time": {
        "gte": "2016-01-01",
        "lte": "2016-12-31"
      }
    }
  }
}






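The inclusive bounds `gte` and `lte` behave like ordinary date comparisons. As a small sketch using the publish_time values that appear in this article's responses (the helper name `in_range` is made up for illustration):

```python
from datetime import date


def in_range(value, gte=None, lte=None):
    """Return True if the ISO date string lies within the inclusive bounds."""
    day = date.fromisoformat(value)
    if gte is not None and day < date.fromisoformat(gte):
        return False
    if lte is not None and day > date.fromisoformat(lte):
        return False
    return True


# Document id -> publish_time, taken from the responses in this article.
publish_times = {"1": "2016-11-18", "2": "2011-01-01", "3": "2016-06-01"}

matched = [doc_id for doc_id, day in publish_times.items()
           if in_range(day, gte="2016-01-01", lte="2016-12-31")]
print(matched)  # ['1', '3']
```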

exists query

An exists query returns documents that have at least one non-null value in the field:

GET englishbooks/_search
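What "at least one non-null value" means can be sketched as follows. The documents and the choice of `description` as the tested field are hypothetical, and the sketch assumes no `null_value` is configured in the mapping: `null`, a missing field, `[]`, and `[null]` count as absent, while an empty string still counts as a value.

```python
def exists(doc, field):
    """Return True if the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)


# Hypothetical documents, used only to exercise the rule.
docs = [
    {"id": "a", "description": "some text"},
    {"id": "b", "description": None},
    {"id": "c"},
    {"id": "d", "description": []},
    {"id": "e", "description": [None, "text"]},
    {"id": "f", "description": ""},
]

print([d["id"] for d in docs if exists(d, "description")])  # ['a', 'e', 'f']
```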










GET englishbooks/_search








The request above finds the document whose title field is "Core Java":

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "3",
        "_score": 1,
        "_source": {
          "id": "3",
          "title": "Core Java",
          "language": "java",
          "author": "Horstmann",
          "price": 85.9,
          "publish_time": "2016-06-01",
          "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
        }
      }
    ]
  }
}


The following query finds documents whose title field contains "core". Also note that "?" matches exactly one character and "*" matches zero or more characters:

GET englishbooks/_search
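The two wildcard characters can be illustrated by translating a pattern into a regular expression and matching it against a whole term. The patterns used here (such as `cor?` and `c*e`) are hypothetical examples, not the pattern from the original request, and the helper names are made up for this sketch.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern: '?' -> exactly one character, '*' -> zero or more."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_match(term, pattern):
    """Return True if the pattern matches the entire term."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_match("core", "cor?"))  # True: '?' stands for the final 'e'
print(wildcard_match("core", "c*e"))   # True: '*' stands for 'or'
print(wildcard_match("core", "co?"))   # False: one character is still unmatched
```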









The regexp attribute runs a regular-expression query, for example finding documents whose description field contains a term made up of 4 digits:

GET englishbooks/_search








The result is shown below; the description field contains the number 1986:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "2",
        "_score": 1,
        "_source": {
          "id": "2",
          "title": "Compilers",
          "language": "c",
          "author": "Alfred V.Aho",
          "price": 62.5,
          "publish_time": "2011-01-01",
          "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
        }
      }
    ]
  }
}

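A regexp query is evaluated against individual terms, and the expression has to match a whole term rather than a substring. The sketch below imitates that with Python's `re.fullmatch`. The pattern `[0-9]{4}` is an assumed example of "a 4-digit term" and not necessarily the one used in the original request. Lucene's regular-expression syntax also differs from Python's in places, although this simple pattern is valid in both.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def regexp_query(field_value, pattern):
    """Return True if the pattern matches at least one whole term of the field."""
    compiled = re.compile(pattern)
    return any(compiled.fullmatch(term) for term in analyze(field_value))


compilers = "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
deep_learning = "written by three experts in the field, deep learning is the only comprehensive book on the subject."

print(regexp_query(compilers, r"[0-9]{4}"))      # True: the term "1986"
print(regexp_query(deep_learning, r"[0-9]{4}"))  # False: no 4-digit term
print(regexp_query(compilers, r"[0-9]{3}"))      # False: must match the whole term
```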

Fuzzy query

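A fuzzy query finds results by computing the edit distance between the search term and the terms in the documents. For example, if you meant to search the description field for the term "1986" but accidentally typed "1987", a fuzzy query still returns the document, only with a lower score. The request is as follows: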

GET englishbooks/_search
{
  "query": {
    "fuzzy": {
      "description": "1987"
    }
  }
}

The matching document is shown below. Its score is only 0.5942837, lower than the 0.79237825 obtained when querying with "1986":

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5942837,
    "hits": [
      {
        "_index": "englishbooks",
        "_type": "IT",
        "_id": "2",
        "_score": 0.5942837,
        "_source": {
          "id": "2",
          "title": "Compilers",
          "language": "c",
          "author": "Alfred V.Aho",
          "price": 62.5,
          "publish_time": "2011-01-01",
          "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
        }
      }
    ]
  }
}

Note that fuzzy queries consume considerable resources.
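The edit-distance idea behind fuzzy matching can be sketched with a plain Levenshtein distance (insertions, deletions, and substitutions each cost 1). This naive version compares the search term against every term of a field, which hints at why fuzzy matching is expensive. It is an illustration under simplifying assumptions (a fixed `max_edits` of 1, the simplified `analyze` helper, no scoring), not Elasticsearch's actual algorithm.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_match(field_value, term, max_edits=1):
    """Return True if any field term is within max_edits of the search term."""
    return any(edit_distance(term, t) <= max_edits for t in analyze(field_value))


compilers = "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."

print(edit_distance("1987", "1986"))               # 1
print(fuzzy_match(compilers, "1987"))              # True
print(fuzzy_match(compilers, "1987", max_edits=0)) # False: exact match required
```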


The most commonly used compound query is the bool query, which combines conditions using the clauses in the following table:

| Attribute | Purpose |
| --- | --- |
| must | Must match; comparable to AND in SQL |
| should | May match; comparable to OR in SQL |
| must_not | Must not match |
| filter | Same as must, but does not contribute to the score |
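How the four clause types combine can be sketched with plain predicates. The sketch ignores scoring, so `must` and `filter` behave identically here. It also assumes the usual rule that `should` clauses become mandatory (at least one must match) only when there is no `must` or `filter` clause. The helper names and the title "Java Basics" are hypothetical.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def title_contains(term):
    """Build a predicate that checks whether a document's title has the given term."""
    return lambda doc: term in analyze(doc["title"])


def bool_match(doc, must=(), should=(), must_not=(), filter_=()):
    """Decide whether a document satisfies a bool-style combination of predicates."""
    if not all(clause(doc) for clause in (*must, *filter_)):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if should and not (must or filter_):
        # With no must/filter clauses, at least one should clause has to match.
        return any(clause(doc) for clause in should)
    return True


# "Java Basics" is a hypothetical document; the other titles appear in this article.
books = [{"title": "Core Java"}, {"title": "Java Basics"}, {"title": "Deep Learning"}]

hits = [b["title"] for b in books
        if bool_match(b, must=[title_contains("java")], must_not=[title_contains("core")])]
print(hits)  # ['Java Basics']
```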

The following conditions search for documents whose title contains java but does not contain core:

GET englishbooks/_search













In the returned documents, those containing the term core have been filtered out:


"took": 3,

"timed_out": false,

"_shards": {

"total": 5,


