Elasticsearch in Practice Trilogy, Part 3: Search Operations
"description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
}
}
]
}
}
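If the intent was to get only the results matching "Core Java", the response above clearly does not meet the requirement. In that case, add the "operator":"and" property to the query so that only documents matching all of the terms are returned. Note that the JSON structure changes slightly: the value of title used to be the search text itself, and now it is a JSON object whose query property holds the original search text: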
GET englishbooks/_search
{
"query":{
"match":{
"title":{
"query":"Core Java",
"operator":"and"
}
}
}
}
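This time the result contains only the record that matches both terms "core" and "java" (why are core and java lowercase? Because "Core Java" is lowercased during analysis before the search is executed):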
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "3",
"_score": 0.5753642,
"_source": {
"id": "3",
"title": "Core Java",
"language": "java",
"author": "Horstmann",
"price": 85.9,
"publish_time": "2016-06-01",
"description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
}
}
]
}
}
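The difference between the default "or" behavior and "operator":"and" can be illustrated with a small Python model. This is only a sketch under stated assumptions: `analyze` is a rough stand-in for the analyzer (lowercase, split on non-alphanumerics), `match` is a hypothetical helper rather than an Elasticsearch API, scoring is ignored, and the title "Java Basics" is made up for the example.

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer: lowercase, then split on
    # anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())

def match(field_text, query_text, operator="or"):
    # "or": at least one query term occurs in the field.
    # "and": every query term occurs in the field.
    field_terms = set(analyze(field_text))
    hits = [term in field_terms for term in analyze(query_text)]
    return all(hits) if operator == "and" else any(hits)

# "Core Java" is a title from the sample data; "Java Basics" is hypothetical.
for title in ["Core Java", "Java Basics"]:
    print(title, match(title, "Core Java", "or"), match(title, "Core Java", "and"))
```

Both titles pass the "or" check because each contains "java", but only "Core Java" passes the "and" check.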
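match_phrase search
match_phrase search is similar to the match search above, with the following two characteristics:
All terms produced by analysis must match, which is the effect of the "operator":"and" property shown earlier;
The analyzed terms must appear in the field in the same order as in the search text for the document to match;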
GET englishbooks/_search
{
"query":{
"match_phrase":{"title":"Core Java"}
}
}
The query above returns a result, but changing "Core Java" to "Java Core" returns nothing, whereas a match query with "Java Core" still returns a result;
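The ordering requirement can be sketched in Python as a search for the query terms as a consecutive run inside the field's terms. This is a simplified model, not Elasticsearch's implementation: `analyze` and `match_phrase` are hypothetical helpers, and the sketch assumes the terms must be adjacent (no slop).

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer: lowercase and split into terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def match_phrase(field_text, query_text):
    # True if the query terms occur consecutively, in the same order,
    # somewhere in the field's term list.
    field_terms = analyze(field_text)
    query_terms = analyze(query_text)
    n = len(query_terms)
    if n == 0:
        return False
    return any(field_terms[i:i + n] == query_terms
               for i in range(len(field_terms) - n + 1))

print(match_phrase("Core Java", "Core Java"))
print(match_phrase("Core Java", "Java Core"))
```

The first call succeeds and the second fails, mirroring the "Core Java" versus "Java Core" behavior described above.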
match_phrase_prefix search
match_phrase_prefix works like the match_phrase search above, except that it treats the last term as a prefix. As shown below, the search text "Core J" finds nothing with match_phrase, but match_phrase_prefix does find a result, because "J" matches "Java" as a prefix:
GET englishbooks/_search
{
"query":{
"match_phrase_prefix":{"title":"Core J"}
}
}
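A rough Python model of the prefix rule: every term except the last must match exactly and in order, while the last term only needs to be a prefix of the term at that position. `analyze` and `match_phrase_prefix` are hypothetical helpers written for illustration, not Elasticsearch APIs.

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer: lowercase and split into terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def match_phrase_prefix(field_text, query_text):
    field_terms = analyze(field_text)
    query_terms = analyze(query_text)
    if not query_terms:
        return False
    *head, last = query_terms
    n = len(query_terms)
    for i in range(len(field_terms) - n + 1):
        window = field_terms[i:i + n]
        # Leading terms must be equal; the final term is matched as a prefix.
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False

print(match_phrase_prefix("Core Java", "Core J"))
```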
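multi_match search
multi_match builds on match by supporting searches across multiple fields. The following query uses the two terms "1986" and "deep" to search the title and description fields at the same time: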
GET englishbooks/_search
{
"query":{
"multi_match":{
"query":"1986 deep",
"fields":["title", "description"]
}
}
}
The response is shown below. Every document whose title or description contains the term "1986" or "deep" is returned:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.79237825,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "2",
"_score": 0.79237825,
"_source": {
"id": "2",
"title": "Compilers",
"language": "c",
"author": "Alfred V.Aho",
"price": 62.5,
"publish_time": "2011-01-01",
"description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
}
},
{
"_index": "englishbooks",
"_type": "IT",
"_id": "1",
"_score": 0.2876821,
"_source": {
"id": "1",
"title": "Deep Learning",
"language": "python",
"author": "Yoshua Bengio",
"price": 549,
"publish_time": "2016-11-18",
"description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
}
}
]
}
}
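The multi-field behavior can be modeled in Python as "any query term appears in any listed field". This is a sketch only: `analyze` and `multi_match` are hypothetical helpers, scoring and result ordering are ignored, and the description strings are shortened excerpts of the sample documents.

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer: lowercase and split into terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def multi_match(doc, query_text, fields):
    # A document matches if any query term occurs in any of the fields.
    query_terms = analyze(query_text)
    return any(term in analyze(doc[field])
               for field in fields
               for term in query_terms)

# Shortened excerpts of the sample documents shown in this section.
docs = [
    {"id": "1", "title": "Deep Learning",
     "description": "deep learning is the only comprehensive book on the subject."},
    {"id": "2", "title": "Compilers",
     "description": "In the time since the 1986 edition of this book"},
    {"id": "3", "title": "Core Java",
     "description": "write useful Java applications and applets."},
]

matched = [d["id"] for d in docs
           if multi_match(d, "1986 deep", ["title", "description"])]
print(matched)
```

In this model documents 1 and 2 match and document 3 does not, which agrees with the two hits in the response above.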
terms query
terms is an extension of the term query that looks up several terms at once:
GET englishbooks/_search
{
"query":{
"terms":{
"title":["deep", "core"]
}
}
}
The response is shown below. Documents whose title contains deep or core are all returned:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "1",
"_score": 1,
"_source": {
"id": "1",
"title": "Deep Learning",
"language": "python",
"author": "Yoshua Bengio",
"price": 549,
"publish_time": "2016-11-18",
"description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
}
},
{
"_index": "englishbooks",
"_type": "IT",
"_id": "3",
"_score": 1,
"_source": {
"id": "3",
"title": "Core Java",
"language": "java",
"author": "Horstmann",
"price": 85.9,
"publish_time": "2016-06-01",
"description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
}
}
]
}
}
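One detail worth illustrating is that term-level queries such as terms compare the given values against the indexed terms as-is, without analyzing the search input. The Python sketch below models that; `analyze` and `terms_query` are hypothetical helpers, and `analyze` is only a rough approximation of how the title was indexed.

```python
import re

def analyze(text):
    # Applied to the stored field only, mimicking index-time analysis.
    return re.findall(r"[a-z0-9]+", text.lower())

def terms_query(field_text, values):
    # The query values are NOT analyzed: they must equal an indexed term exactly.
    indexed_terms = set(analyze(field_text))
    return any(value in indexed_terms for value in values)

print(terms_query("Deep Learning", ["deep", "core"]))
print(terms_query("Deep Learning", ["Deep"]))
```

The second call fails in this model because the indexed term is the lowercase "deep", which is why the request above uses lowercase values.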
Range query
range query searches by range, for example to find documents whose publish_time falls between "2016-01-01" and "2016-12-31":
GET englishbooks/_search
{
"query":{
"range":{
"publish_time":{
"gte":"2016-01-01",
"lte":"2016-12-31",
"format":"yyyy-MM-dd"
}
}
}
}
For brevity, the response is omitted here;
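The inclusive bounds gte and lte can be illustrated with plain date comparisons in Python. This sketch covers only the three documents whose publish_time appears in the responses of this section (the index may contain others), and `in_range` is a hypothetical helper.

```python
from datetime import date

def in_range(value, gte, lte):
    # Both bounds are inclusive, like "gte" and "lte" in the request.
    return date.fromisoformat(gte) <= date.fromisoformat(value) <= date.fromisoformat(lte)

# publish_time values of the documents shown in this section, keyed by _id.
publish_times = {"1": "2016-11-18", "2": "2011-01-01", "3": "2016-06-01"}

matched = sorted(doc_id for doc_id, day in publish_times.items()
                 if in_range(day, "2016-01-01", "2016-12-31"))
print(matched)
```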
exists query
exists query returns documents that have at least one non-null value in the given field:
GET englishbooks/_search
{
"query":{
"exists":{
"field":"author"
}
}
}
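What counts as "at least one non-null value" can be sketched in Python as follows. `exists` is a hypothetical helper over plain dictionaries; it assumes that a missing field, null, an empty array, and an array holding only nulls are treated as absent, while an empty string still counts as a value.

```python
def exists(doc, field):
    value = doc.get(field)
    if value is None:
        # Field missing or explicitly null.
        return False
    if isinstance(value, list):
        # An array needs at least one non-null element.
        return any(item is not None for item in value)
    # Any other value, including an empty string, counts as present.
    return True

print(exists({"author": "Horstmann"}, "author"))
print(exists({"author": None}, "author"))
```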
Prefix query
Used to find documents in which a field contains a term that starts with the given prefix:
GET englishbooks/_search
{
"query":{
"prefix":{
"title":"cor"
}
}
}
The request above finds the document whose title field is "Core Java":
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "3",
"_score": 1,
"_source": {
"id": "3",
"title": "Core Java",
"language": "java",
"author": "Horstmann",
"price": 85.9,
"publish_time": "2016-06-01",
"description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
}
}
]
}
}
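A small Python model of the prefix check, under the assumption that the title is indexed as lowercase terms and the prefix itself is not analyzed. `analyze` and `prefix_query` are hypothetical helpers. Note that the prefix is tested against each term, not against the start of the whole field value.

```python
import re

def analyze(text):
    # Rough stand-in for index-time analysis of the field.
    return re.findall(r"[a-z0-9]+", text.lower())

def prefix_query(field_text, prefix):
    # The prefix is compared as-is against every indexed term.
    return any(term.startswith(prefix) for term in analyze(field_text))

print(prefix_query("Core Java", "cor"))
print(prefix_query("Core Java", "jav"))
print(prefix_query("Core Java", "Cor"))
```

In this model "cor" and "jav" both match "Core Java", while the capitalized "Cor" does not, because the indexed terms are lowercase.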
Wildcard query
The following query finds documents whose title field contains "core". Note that "?" matches exactly one character and "*" matches zero or more characters:
GET englishbooks/_search
{
"query":{
"wildcard":{
"title":"cor?"
}
}
}
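The two wildcard characters can be tried out with Python's `fnmatch.fnmatchcase`, which gives "?" and "*" the same meaning. This is an approximation only: `fnmatchcase` also interprets `[...]` character sets, which goes beyond the two wildcards described above, and `analyze` and `wildcard_query` are hypothetical helpers.

```python
import re
from fnmatch import fnmatchcase

def analyze(text):
    # Rough stand-in for index-time analysis of the field.
    return re.findall(r"[a-z0-9]+", text.lower())

def wildcard_query(field_text, pattern):
    # The pattern must match a whole indexed term, not a substring of it.
    return any(fnmatchcase(term, pattern) for term in analyze(field_text))

print(wildcard_query("Core Java", "cor?"))
print(wildcard_query("Core Java", "cor"))
print(wildcard_query("Core Java", "co*"))
```

"cor?" matches the term "core", "cor" alone matches nothing because no wildcard fills in the missing character, and "co*" matches again.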
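Regular expression query
Use the regexp property to run a regular expression query, for example to find documents whose description field contains a term consisting of exactly 4 digits: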
GET englishbooks/_search
{
"query":{
"regexp":{
"description":"[0-9]{4}"
}
}
}
The result is shown below. The description field contains the number 1986:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "2",
"_score": 1,
"_source": {
"id": "2",
"title": "Compilers",
"language": "c",
"author": "Alfred V.Aho",
"price": 62.5,
"publish_time": "2011-01-01",
"description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
}
}
]
}
}
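The pattern is applied to individual terms and has to match a term in full. The Python sketch below models this with `re.fullmatch`; it is an approximation, since Python's regular expression syntax is not identical to the one Elasticsearch uses, and `analyze` and `regexp_query` are hypothetical helpers.

```python
import re

def analyze(text):
    # Rough stand-in for index-time analysis of the field.
    return re.findall(r"[a-z0-9]+", text.lower())

def regexp_query(field_text, pattern):
    compiled = re.compile(pattern)
    # fullmatch: the whole term must match, so "19860" does not satisfy [0-9]{4}.
    return any(compiled.fullmatch(term) is not None
               for term in analyze(field_text))

print(regexp_query("In the time since the 1986 edition of this book", "[0-9]{4}"))
print(regexp_query("an edition numbered 19860", "[0-9]{4}"))
```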
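Fuzzy query
A fuzzy query finds results by computing the edit distance between the search term and the terms in the documents. For example, suppose you want documents whose description field contains the term "1986" but accidentally type "1987". A fuzzy query still returns the document, only with a lower score. The request looks like this: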
GET englishbooks/_search
{
"query":{
"fuzzy":{
"description":"1987"
}
}
}
The matching document is shown below. Its score is only 0.5942837, lower than the 0.79237825 obtained when searching with "1986":
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5942837,
"hits": [
{
"_index": "englishbooks",
"_type": "IT",
"_id": "2",
"_score": 0.5942837,
"_source": {
"id": "2",
"title": "Compilers",
"language": "c",
"author": "Alfred V.Aho",
"price": 62.5,
"publish_time": "2011-01-01",
"description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
}
}
]
}
}
Note that fuzzy queries consume a relatively large amount of resources;
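To make "edit distance" concrete, here is a minimal Python sketch using the classic Levenshtein distance. It assumes the AUTO fuzziness setting, where terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, and longer terms allow two. Unlike this sketch, Elasticsearch can also count a swap of two adjacent characters as a single edit. `edit_distance`, `auto_fuzziness` and `fuzzy_match` are hypothetical helpers.

```python
def edit_distance(a, b):
    # Levenshtein distance computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    # Allowed number of edits under the assumed AUTO setting.
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_match(indexed_term, query_term):
    return edit_distance(indexed_term, query_term) <= auto_fuzziness(query_term)

print(edit_distance("1986", "1987"))
print(fuzzy_match("1986", "1987"))
```

"1987" is one substitution away from "1986", which is within the single edit allowed for a four-character term, so the document is still found.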
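Compound queries
The most commonly used compound query is the bool query, which combines conditions using the clauses in the table below:
| Property | Purpose |
| --- | --- |
| must | Must match; comparable to AND in SQL |
| should | May match; comparable to OR in SQL |
| must_not | Must not match |
| filter | Same as must, but does not contribute to the score |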
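
The way these clauses combine can be sketched in Python. This is a simplified model, not Elasticsearch's implementation: clauses are plain predicates over a set of terms, the "score" is just the number of matching must and should clauses (an assumption made purely for illustration), and a should clause is only required when there are no must or filter clauses. `term` and `bool_match` are hypothetical helpers.

```python
def term(value):
    # Build a predicate that checks whether a term is present.
    return lambda terms: value in terms

def bool_match(terms, must=(), should=(), must_not=(), filter_=()):
    # Returns (matched, score).
    if not all(clause(terms) for clause in must):
        return (False, 0)
    if not all(clause(terms) for clause in filter_):
        return (False, 0)
    if any(clause(terms) for clause in must_not):
        return (False, 0)
    should_hits = sum(1 for clause in should if clause(terms))
    # With only should clauses, at least one of them has to match.
    if should and not must and not filter_ and should_hits == 0:
        return (False, 0)
    # filter and must_not clauses never add to the score.
    return (True, len(must) + should_hits)

print(bool_match({"core", "java"}, must=[term("java")], must_not=[term("core")]))
print(bool_match({"java"}, filter_=[term("java")]))
```

The first call is rejected because of the must_not clause, and the second matches with a score of 0 because filter clauses do not score.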
The following query searches for documents whose title contains java but does not contain core:
GET englishbooks/_search
{
"query":{
"bool":{
"must":{
"term":{"title":"java"}
},
"must_not":[
{"term":{"title":"core"}}
]
}
}
}
In the returned documents, those containing the term core have been filtered out:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,