ElasticSearch.04 - Basic Operations

insight

Published: February 18, 2021

Basic ES usage

ES is operated through a RESTful API, and requests and responses use JSON. The basic syntax is:

Basic API format: http://<ip>:<port>/<index>/<type>/<document id>
Common HTTP verbs: GET/PUT/POST/DELETE
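
As a small illustration of the URL format above, the following Python sketch assembles a document URL from its parts. The function name document_url, the host localhost, and port 9200 are assumptions made for the example; they do not come from the article.

```python
from urllib.parse import quote


def document_url(host, port, index, doc_type, doc_id):
    """Build http://<ip>:<port>/<index>/<type>/<document id>."""
    # Percent-encode each path segment so ids containing spaces or slashes stay valid.
    path = "/".join(quote(str(part), safe="") for part in (index, doc_type, doc_id))
    return f"http://{host}:{port}/{path}"


# Hypothetical host and port, used only for illustration.
print(document_url("localhost", 9200, "customer", "_doc", 1))
```

The same URL can then be combined with any of the HTTP verbs listed above.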

Using Postman

Locate the resource through the address bar
Select "Body > raw > JSON" to write the request data

Using Kibana

Kibana's Dev Tools make it easy to send RESTful requests to ES, for example:

# Insert a document
PUT index/type/1
{
  "body": "here"
}

Queries

ES queries fall into three categories:

Simple queries
Conditional queries
Aggregation queries

Simple queries

GET /customer/_doc/1

{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 26,
  "_primary_term" : 4,
  "found" : true,
  "_source" : {
    "name": "John Doe"
  }
}

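Conditional queries

To run a conditional query, name the index to search in the request URI and send the request to the _search endpoint.
Use the Elasticsearch Query DSL in the request body to specify the search criteria.

Match all

GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "account_number": "asc" }
  ]
}

query: the key that holds the query definition
match_all: matches every document in the index
sort: specifies the sort order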

{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value": 1000,
      "relation": "eq"
    },
    "max_score" : null,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "0",
      "sort": [0],
      "_score" : null,
      "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
    }, {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "1",
      "sort": [1],
      "_score" : null,
      "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
    }, ...
    ]
  }
}

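took: how long the query took, in milliseconds
timed_out: whether the request timed out
_shards: how many shards were searched, and how many of them succeeded, failed, or were skipped
max_score: the score of the most relevant document found
hits.total.value: the total number of matching documents
hits.sort: the document's sort position (when not sorting by relevance score)
hits._score: the document's relevance score (not applicable when using match_all)
hits: the search results; by default, the first 10 matching documents are returned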

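To make the response fields concrete, here is a minimal Python sketch that reads them out of a parsed _search response. The helper name summarize_search and the keys of the returned summary are made up for this example, and the sample dictionary is a trimmed copy of the response shown above.

```python
def summarize_search(response):
    """Extract the commonly inspected fields from a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": response["_shards"]["failed"],
        "total": hits["total"]["value"],
        "max_score": hits["max_score"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Trimmed sample: only the fields that the helper reads are kept.
sample = {
    "took": 63,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1000, "relation": "eq"},
        "max_score": None,
        "hits": [{"_id": "0"}, {"_id": "1"}],
    },
}

print(summarize_search(sample))
```
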
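Adding a query condition

GET book/_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  }
}

This request finds documents whose title field contains Elasticsearch.

To match a whole phrase (all of its terms, adjacent and in order) rather than any one of the analyzed terms, replace match with match_phrase.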
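
The two query types differ only in the key used inside query, which the following Python sketch shows by building both request bodies. The helper name match_query and the example search text "quick brown fox" are assumptions for illustration.

```python
import json


def match_query(field, text, phrase=False):
    """Return a search body using match, or match_phrase when phrase=True."""
    kind = "match_phrase" if phrase else "match"
    return {"query": {kind: {field: text}}}


# Body for: GET book/_search with a match on title.
print(json.dumps(match_query("title", "Elasticsearch")))
# Same shape, but the terms must appear together as a phrase.
print(json.dumps(match_query("title", "quick brown fox", phrase=True)))
```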

Selecting returned fields

Setting _source in the request controls which fields are returned.

The following example returns only the three fields "firstname", "balance", and "lastname":

GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "_source": [
    "firstname", "balance", "lastname"
  ]
}

{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value": 1000,
      "relation": "eq"
    },
    "max_score" : null,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "0",
      "sort": [0],
      "_score" : null,
      "_source" : {
        "balance":16623,
        "firstname":"Bradshaw",
        "lastname":"Mckenzie"
      }
    }, {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "1",
      "sort": [1],
      "_score" : null,
      "_source" : {
        "balance":39225,
        "firstname":"Amber",
        "lastname":"Duke"
      }
    }, ...
    ]
  }
}

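Paginated queries

Each search request is self-contained: Elasticsearch does not keep any state between requests. To page through search results, specify the from and size parameters in the request.

For example, the following request returns hits 10 through 19 (from is a zero-based offset):

GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ],
  "from": 10,
  "size": 10
}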
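
Because from is an offset rather than a page number, client code usually converts one into the other. A minimal Python sketch, assuming 1-based page numbers; the helper name page_body and its defaults are invented for this example:

```python
def page_body(page, size=10, sort_field="account_number"):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "sort": [{sort_field: "asc"}],
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,
    }


# Page 2 with 10 hits per page corresponds to from=10, size=10.
print(page_body(2))
```

Sorting on a field keeps the order of hits stable from one page request to the next, which is why the sketch always includes a sort clause.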

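Complex queries

Complex queries use a bool query to combine several conditions. The example below uses three kinds of clause:

must: the document must match
should: the document is preferred to match; if it does, its _score is higher
must_not: the document must not match. It behaves like a filter: it decides whether a document appears in the results but does not affect the _score.

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "quick" }}
      ],
      "must_not": [
        { "match": { "title": "lazy" }}
      ],
      "should": [
        { "match": { "title": "brown" }},
        { "match": { "title": "dog" }}
      ]
    }
  }
}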

This query returns any document whose title field contains the term quick but not lazy. The two should clauses do not exclude anything: a document does not have to contain brown or dog, but if it does, it is considered more relevant:

{
  "hits": [
    {
      "_id": "3",
      "_score": 0.70134366,
      "_source": {
        "title": "The quick brown fox jumps over the quick dog"
      }
    },
    {
      "_id": "1",
      "_score": 0.3312608,
      "_source": {
        "title": "The quick brown fox"
      }
    }
  ]
}

Document 3 scores higher than document 1 because it contains both brown and dog. By default, Elasticsearch sorts results by this relevance score.

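A filter clause can also be used directly to restrict results. For example, the following request uses a range filter to limit the results to accounts with a balance between $20,000 and $30,000, inclusive at both ends.

GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}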
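
When bool queries are assembled in application code, a small builder keeps the clause types apart. The following Python sketch is one possible shape; the helper names bool_query and match are assumptions, not part of any Elasticsearch client.

```python
def match(field, text):
    """A single match clause."""
    return {"match": {field: text}}


def bool_query(must=None, should=None, must_not=None, filters=None):
    """Assemble a bool query, leaving out clause types that are empty."""
    clauses = {"must": must, "should": should, "must_not": must_not, "filter": filters}
    return {"query": {"bool": {name: items for name, items in clauses.items() if items}}}


# The quick / lazy / brown / dog example.
title_query = bool_query(
    must=[match("title", "quick")],
    must_not=[match("title", "lazy")],
    should=[match("title", "brown"), match("title", "dog")],
)

# The balance range example: a filter restricts results without scoring them.
balance_query = bool_query(
    must=[{"match_all": {}}],
    filters=[{"range": {"balance": {"gte": 20000, "lte": 30000}}}],
)

print(title_query)
print(balance_query)
```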

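Term Query: exact matching

term is used for exact matching: ES returns documents whose value in the given field matches the supplied term exactly. Term queries are therefore typically used to find documents by a precise value such as a price, a product ID, or a username.

Note: avoid using the term query on text fields.
By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult.
To search text field values, use match instead.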
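
The effect of analysis on term queries can be imitated locally. The Python sketch below is only a rough stand-in: simple_analyze lowercases and splits on word characters, which approximates but does not reproduce Elasticsearch's analyzers, and all function names and the sample string are invented for the illustration. The term_query helper shows the shape of a term query body.

```python
import re


def simple_analyze(text):
    """Very rough stand-in for text analysis: lowercase and split into words."""
    return re.findall(r"\w+", text.lower())


def term_matches(indexed_terms, value):
    """A term query compares the search value, unanalyzed, with the indexed terms."""
    return value in indexed_terms


def term_query(field, value):
    """Search body for a term query on one field."""
    return {"query": {"term": {field: {"value": value}}}}


text_terms = simple_analyze("Quick Brown Foxes!")  # how a text field might be indexed
keyword_terms = ["Quick Brown Foxes!"]             # a keyword field keeps the value whole

print(term_matches(text_terms, "Quick Brown Foxes!"))     # False
print(term_matches(keyword_terms, "Quick Brown Foxes!"))  # True
print(term_matches(text_terms, "quick"))                  # True
print(term_query("country", "China"))
```

Under these assumptions the original string is no longer present among the analyzed terms, which is why a term query for the full value finds nothing on the text field while it succeeds on the keyword field.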

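Creating an index

An index is comparable to a database in a relational system. When creating an index, you can specify its number of shards and replicas and define a structured mapping. The body below is sent with a PUT request to the index name (people, as the response further down shows):

{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "man": {
      "properties": {
        "name": {
          "type": "text"
        },
        "country": {
          "type": "keyword"
        },
        "age": {
          "type": "integer"
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}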

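number_of_shards: the number of primary shards
number_of_replicas: the number of replica copies
mappings: defines the structured types. The man key is a mapping type; only older Elasticsearch versions accept several types in one index, and mapping types are deprecated from 7.0 onward.
format: a date field can accept several formats, separated by ||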

On success, the following data is returned:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "people"
}
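
If the index definition is generated by a program, the same body can be built as a dictionary and serialized. A minimal Python sketch, assuming the typed mapping layout used in this article; the helper name index_body is hypothetical:

```python
import json


def index_body(shards, replicas, type_name, properties):
    """Build a create-index body with settings and one typed mapping."""
    return {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        # The type_name level follows the article's example; versions without
        # mapping types would place "properties" directly under "mappings".
        "mappings": {type_name: {"properties": properties}},
    }


body = index_body(3, 1, "man", {
    "name": {"type": "text"},
    "country": {"type": "keyword"},
    "age": {"type": "integer"},
    "date": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"},
})

print(json.dumps(body, indent=2))
```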

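Inserting

There are two ways to insert a document:

Insert with a specified document id
Insert with an automatically generated document id

Both are described below.

Insert with a specified document id

Use the PUT method to insert data with an id of your own choosing.

Request

PUT /customer/_doc/1
{
  "name": "John Doe"
}

Response

{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 26,
  "_primary_term" : 4
}

As "_id": "1" shows, the document was stored with the specified id 1.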

Insert with an automatically generated document id

When the POST method is used without an id, ES assigns the id automatically.

Request

POST /people/man
{
  "name": "insight",
  "age": 23,
  "country": "China",
  "date": "1996-12-01"
}

Response

{
  "_index": "people",
  "_type": "man",
  "_id": "Fw25E2gB3PQKZKHhk0Lj",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

The _id is a randomly generated string.
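
The choice between the two insert styles comes down to whether an id is supplied. The following Python sketch picks the HTTP method and path accordingly; the helper name index_request and its tuple return value are assumptions for the example.

```python
def index_request(index, doc_type, doc, doc_id=None):
    """Return (method, path, body) for indexing one document."""
    if doc_id is None:
        # No id given: POST to the type and let ES generate one.
        return ("POST", f"/{index}/{doc_type}", doc)
    # Explicit id: PUT to the full document path.
    return ("PUT", f"/{index}/{doc_type}/{doc_id}", doc)


print(index_request("customer", "_doc", {"name": "John Doe"}, doc_id=1))
print(index_request("people", "man", {"name": "insight", "age": 23}))
```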

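Bulk insert

When a large amount of data has to be inserted, use the bulk API, which performs multiple insert or delete operations in a single API call. This reduces network overhead and can greatly increase indexing speed.

Either endpoint form can be used:

POST /_bulk
POST /<target>/_bulk

Request body (one JSON object per line):

{ "index":{"_index":"twitter", "_type":"doc" }}
{"word_count" : 1000,"author" : "李四","title" :"Elasticsearch大法好","public_date" : "2010-10-01"}
{ "index":{"_index":"twitter", "_type":"doc" }}
{"word_count" : 2000,"author" : "李三","title" : "Java入门","public_date" : "2010-10-01"}

Each inserted document is made up of two {} objects:

- The first {}: names the target index (comparable to a table) and the type of the inserted data

- The second {}: the data itself

The optimal batch size depends on many factors: document size and complexity, the indexing and search load, and the resources available to the cluster. A commonly recommended starting point is batches of 1,000 to 5,000 documents, with a total payload between 5MB and 15MB.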
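
A bulk body is newline-delimited JSON, so it is easy to generate. The Python sketch below renders the action and document lines and splits a document list into batches; the helper names bulk_body and batches and the default batch size of 1,000 are assumptions chosen to match the guidance above.

```python
import json


def bulk_body(index, docs, doc_type=None):
    """Render docs as a newline-delimited bulk body of index actions."""
    action = {"_index": index}
    if doc_type is not None:
        action["_type"] = doc_type  # only meaningful on versions that still use types
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": action}))        # first {}: where to index
        lines.append(json.dumps(doc, ensure_ascii=False))  # second {}: the data itself
    return "\n".join(lines) + "\n"  # the body must end with a newline


def batches(docs, size=1000):
    """Yield consecutive slices of at most size documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


docs = [
    {"word_count": 1000, "author": "李四", "title": "Elasticsearch大法好", "public_date": "2010-10-01"},
    {"word_count": 2000, "author": "李三", "title": "Java入门", "public_date": "2010-10-01"},
]

print(bulk_body("twitter", docs, doc_type="doc"), end="")
```

Each batch produced by batches would be sent as its own bulk request; the right size still has to be found by measurement on the actual cluster.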

Updating

There are two ways to update a document:

Direct document updates
Scripted document updates

Direct document updates

Request

POST /people/man/1/_update
{
  "doc": {
    "name": "我是谁"
  }
}

Append /_update to the resource path when updating; this kind of update changes only the specified fields
The new values go inside the "doc" field

Response

{
  "_index" : "people",
  "_type" : "man",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 3
}
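
To show what "only the specified fields" means, the following Python sketch imitates a partial update locally. It assumes the update merges the doc object recursively into the stored document, which is how the update API is generally described; the helper names update_body and apply_partial and the stored sample document are invented for the illustration.

```python
def update_body(changes):
    """Wrap the changed fields in the "doc" key expected by _update."""
    return {"doc": changes}


def apply_partial(existing, changes):
    """Merge changes into a copy of existing: nested objects merge, other values are replaced."""
    merged = dict(existing)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical stored document, modelled on the insert example.
stored = {"name": "insight", "age": 23, "country": "China"}
body = update_body({"name": "我是谁"})

print(body)
print(apply_partial(stored, body["doc"]))
```

Fields that are not mentioned in doc, such as age and country here, keep their previous values.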

Deleting

Deleting a document

Request

DELETE /people/man/1

Response

{
  "_index" : "people",
  "_type" : "man",
  "_id" : "1",
  "_version" : 4,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 3,
  "_primary_term" : 3
}

Deleting an index

Request

DELETE /people

Response

{
  "acknowledged" : true
}

Original link: http://xie.infoq.cn/article/5caef427991a61e261d97a254 (please contact the author before reposting).
