A Detailed Tutorial on Integrating Spring Boot with Elasticsearch

2022-11-26
Beijing

陈老老老板
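About this article: now that I am working, I keep picking up new tech stacks and running into problems on the job, and I write things down as I learn. Let's keep at it together. The points that need attention were marked in red in the original post, and some resources are shared along the way. This article introduces how to use Elasticsearch and how to integrate it with Spring Boot.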

1. Introduction to ES

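Note: most companies, ours included, use ES mainly to manage log data. This is a brief walkthrough; a more complete ES series will follow.

About ES (Elasticsearch): Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. It is written in Java and was originally released as open source under the Apache License (from 7.11 on, the default distribution is under the Elastic License and SSPL instead), and it is a popular enterprise search engine. It is widely used in cloud environments and offers near-real-time search that is stable, reliable, fast, and easy to install and use. The one thing to remember: it is a distributed full-text search engine, and the key part is full-text search.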

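Full-text search: suppose a user searches for articles with the keyword Java. Whether the keyword appears in a book title, an article title, the author's name, or the abstract, any record that contains it is returned to the user. That is full-text search. The search condition is no longer compared against a single field; it is compared against many fields of a record, and a match in any of them puts the record in the result. ES is one technology that implements this.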
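
To make the idea of matching a keyword against several fields concrete, here is a minimal sketch in plain Java. It is a naive scan and not how ES works internally; the Article class, its fields, and the sample data are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class MultiFieldSearchDemo {

    /** Hypothetical record with several text fields; not part of any ES API. */
    static final class Article {
        final int id;
        final String title;
        final String author;
        final String summary;

        Article(int id, String title, String author, String summary) {
            this.id = id;
            this.title = title;
            this.author = author;
            this.summary = summary;
        }
    }

    /** Returns the ids of articles where any field contains the keyword, ignoring case. */
    static List<Integer> search(List<Article> articles, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Integer> ids = new ArrayList<>();
        for (Article a : articles) {
            // Join the fields with a separator so a match cannot span two fields.
            String haystack = String.join("\n", a.title, a.author, a.summary).toLowerCase(Locale.ROOT);
            if (haystack.contains(needle)) {
                ids.add(a.id);
            }
        }
        return ids;
    }

    /** Made-up sample data. */
    static List<Article> sampleArticles() {
        return Arrays.asList(
                new Article(1, "Java in Practice", "Li", "Collections and streams"),
                new Article(2, "Cooking at Home", "Wang", "Weeknight recipes"),
                new Article(3, "Notes on Concurrency", "Zhao", "Threads in JAVA explained"));
    }

    public static void main(String[] args) {
        System.out.println(search(sampleArticles(), "java")); // prints [1, 3]
    }
}
```

Scanning every field of every record like this gets slow as the data grows, which is the problem the inverted index in the next section addresses.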

2. How full-text search is implemented: the inverted index

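ES implements full-text search with an inverted index (the structure comes from Lucene underneath) instead of comparing every field of every record at query time. The process is as follows: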

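(1) Tokenize

All the text in each field to be searched is split into a number of terms.

For example, “我不想上班” (“I don't want to go to work”) is split into three terms: “我” (I), “不想” (don't want), and “上班” (go to work). The technical name for this step is tokenization. Different strategies split the text differently; each strategy is called an analyzer, and IK, used later in this article, is one of them.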

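(2) Store the matching ids

The terms produced by tokenization are stored together with the id of the record they came from.

For example, if the record with id 1 has the name “我不想上班”, tokenization yields “我” for id 1, “不想” for id 1, and “上班” for id 1.
If the record with id 2 has the name “上班真的快乐” (“going to work is really fun”), tokenization yields “上班” for id 2, “真的” for id 2, and “快乐” for id 2.
Every document is tokenized this way. Note that tokenization is not applied to one field only: every field that takes part in searching is tokenized, and the results are merged into a single table. For the two records above the table is:

我: 1
不想: 1
上班: 1, 2
真的: 2
快乐: 2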

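(3) Look up results by id

When a query arrives with “上班” as the condition, it is matched against the table above, which gives the ids 1 and 2. The actual records are then fetched by those ids.

Note: a keyword lookup against the tokenization results does not return the full records, only their ids; getting the data itself takes a second lookup. This term-to-id structure is called an inverted index.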
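
The three steps above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not ES's or Lucene's actual data structure. The documents are assumed to be tokenized already, and the English tokens are stand-ins for 我 / 不想 / 上班 / 真的 / 快乐; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexDemo {

    /** Step 2: map each term to the sorted set of ids of the documents that contain it. */
    static Map<String, SortedSet<Integer>> buildIndex(Map<Integer, List<String>> tokenizedDocs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> doc : tokenizedDocs.entrySet()) {
            for (String term : doc.getValue()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    /** Step 3: a query term yields document ids, not the documents themselves. */
    static SortedSet<Integer> lookup(Map<String, SortedSet<Integer>> index, String term) {
        return index.getOrDefault(term, Collections.emptySortedSet());
    }

    /** Step 1 is assumed done: each document id maps to its list of terms. */
    static Map<Integer, List<String>> sampleDocs() {
        Map<Integer, List<String>> docs = new LinkedHashMap<>();
        docs.put(1, Arrays.asList("I", "dont-want", "work"));   // stand-in for 我 / 不想 / 上班
        docs.put(2, Arrays.asList("work", "really", "happy"));  // stand-in for 上班 / 真的 / 快乐
        return docs;
    }

    public static void main(String[] args) {
        Map<String, SortedSet<Integer>> index = buildIndex(sampleDocs());
        System.out.println(lookup(index, "work")); // prints [1, 2]
    }
}
```

A real engine would then fetch the documents with ids 1 and 2 in a second step, which matches the note above.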

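3. Installation

(1) Download ES

Download page for the Windows package: https://www.elastic.co/cn/downloads/elasticsearch

(2) Unzip

The package is a zip file that works as soon as it is unpacked. Unpacking produces the following directories: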

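bin directory: all executable commands
config directory: the configuration files used by the ES server
jdk directory: a complete bundled JDK, version 17. Because each ES release ships its own up-to-date JDK, an upgraded ES never ends up on a JDK version it does not support
lib directory: the jar files ES depends on at runtime
logs directory: all log files produced while ES runs
modules directory: the functional modules of ES itself, also packaged as jars. Unlike the lib directory, which holds the jars ES depends on at runtime, modules holds ES's own feature jars
plugins directory: plugins installed into ES; empty by default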

(3) Start the server

Go to the bin directory, open a command prompt there, and run:

<b><b><b><code class="language-CMD">elasticsearch.bat</code></b></b></b>

Double-clicking elasticsearch.bat also starts the ES server. The default service port is 9200. Open http://localhost:9200 in a browser; if you see information like the following, the server has started normally.

<b><b><b><code class="language-CMD">{
  "name" : "CZBK-**********",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "j137DSswTPG8U4Yb-0T1Mg",
  "version" : {
    "number" : "7.16.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "2b937c44140b6559905130a8650c64dbd0879cfb",
    "build_date" : "2021-12-18T19:42:46.604893745Z",
    "build_snapshot" : false,
    "lucene_version" : "8.10.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}</code></b></b></b>
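
The same check can be done from code instead of a browser. The sketch below uses only the JDK's built-in java.net.http client (Java 11+), assumes the server is at http://localhost:9200 without security enabled, and pulls the version number out of the response with a simple regular expression rather than a JSON parser. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EsPingDemo {
    // Matches the "number" field of the "version" object in the root response.
    private static final Pattern VERSION = Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Builds a GET request for the server root, e.g. http://localhost:9200. */
    static HttpRequest rootRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl)).GET().build();
    }

    /** Extracts the version number from the response body, if there is one. */
    static Optional<String> extractVersion(String body) {
        Matcher m = VERSION.matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient http = HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(2)).build();
        try {
            HttpResponse<String> resp =
                    http.send(rootRequest("http://localhost:9200"), HttpResponse.BodyHandlers.ofString());
            System.out.println("status " + resp.statusCode()
                    + ", ES version " + extractVersion(resp.body()).orElse("unknown"));
        } catch (IOException e) {
            // Connection refused, timeout, etc.: the server is probably not running.
            System.out.println("ES is not reachable: " + e);
        }
    }
}
```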

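(4) Basic operations

ES stores the data to be queried, only in a different format from a database. In ES you first create an index (an inverted index, whose role is somewhat like a table in a database) and then add data to it; each piece of added data is called a document. So to work with ES, create the index first and then add documents; only then can you run queries.

ES is operated through REST-style requests: sending one request performs one operation. Creating an index and deleting an index, for example, are both done by sending a request.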

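(1) Create an index

Send a PUT request to http://localhost:9200/user, where user is the index name. Note that it must be a PUT request.
If the response reports "acknowledged": true, the index was created.

Note: creating an index that already exists returns an error, and the reason property describes the cause.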

<b><b><b><code class="language-json">{
    "error": {
        "root_cause": [
            {
                "type": "resource_already_exists_exception",
                "reason": "index [user/VgC_XMVAQmedaiBNSgO2-w] already exists",
                "index_uuid": "VgC_XMVAQmedaiBNSgO2-w",
                "index": "user"
            }
        ],
        "type": "resource_already_exists_exception",
        "reason": "index [user/VgC_XMVAQmedaiBNSgO2-w] already exists",  # the user index already exists
        "index_uuid": "VgC_XMVAQmedaiBNSgO2-w",
        "index": "user"
    },
    "status": 400
}</code></b></b></b>

(2) Query an index

<b><b><b><code class="language-CMD">GET    http://localhost:9200/user</code></b></b></b>

Querying an index returns information about it, for example:

<b><b><b><code class="language-json">{
    "user": {
        "aliases": {},
        "mappings": {},
        "settings": {
            "index": {
                "routing": {
                    "allocation": {
                        "include": {
                            "_tier_preference": "data_content"
                        }
                    }
                },
                "number_of_shards": "1",
                "provided_name": "user",
                "creation_date": "1645768584849",
                "number_of_replicas": "1",
                "uuid": "VgC_XMVAQmedaiBNSgO2-w",
                "version": {
                    "created": "7160299"
                }
            }
        }
    }
}</code></b></b></b>

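Note: querying an index that does not exist returns an error. Here the index name book is used as an example of a missing index.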

<b><b><b><code class="language-json">{
    "error": {
        "root_cause": [
            {
                "type": "index_not_found_exception",
                "reason": "no such index [book]",
                "resource.type": "index_or_alias",
                "resource.id": "book",
                "index_uuid": "_na_",
                "index": "book"
            }
        ],
        "type": "index_not_found_exception",
        "reason": "no such index [book]",    # there is no book index
        "resource.type": "index_or_alias",
        "resource.id": "book",
        "index_uuid": "_na_",
        "index": "book"
    },
    "status": 404
}</code></b></b></b>

(3) Delete an index

<b><b><b><code class="language-CMD">DELETE    http://localhost:9200/user</code></b></b></b>

After the index is deleted, the result is returned:

<b><b><b><code class="language-json">{
    "acknowledged": true
}</code></b></b></b>

Note: deleting the index a second time returns an error, and again the reason property describes the specific cause.

<b><b><b><code class="language-JSON">{
    "error": {
        "root_cause": [
            {
                "type": "index_not_found_exception",
                "reason": "no such index [user]",
                "resource.type": "index_or_alias",
                "resource.id": "user",
                "index_uuid": "_na_",
                "index": "user"
            }
        ],
        "type": "index_not_found_exception",
        "reason": "no such index [user]",    # there is no user index any more
        "resource.type": "index_or_alias",
        "resource.id": "user",
        "index_uuid": "_na_",
        "index": "user"
    },
    "status": 404
}</code></b></b></b>
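
The three index operations differ only in the HTTP method, which a short Java sketch makes easy to see. It uses the JDK's java.net.http client (Java 11+) and assumes a local, unsecured server at http://localhost:9200; the class and method names are made up. By default the program only prints the requests, because the last one deletes the index; pass --send to actually run them.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;
import java.util.List;

class IndexRequestsDemo {
    static final String BASE = "http://localhost:9200"; // assumed local server, as in this article

    /** PUT /{index} with an empty body creates the index with default settings. */
    static HttpRequest createIndex(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** GET /{index} returns aliases, mappings and settings of the index. */
    static HttpRequest getIndex(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).GET().build();
    }

    /** DELETE /{index} removes the index and all documents in it. */
    static HttpRequest deleteIndex(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).DELETE().build();
    }

    public static void main(String[] args) throws Exception {
        List<HttpRequest> steps = Arrays.asList(createIndex("user"), getIndex("user"), deleteIndex("user"));
        boolean send = args.length > 0 && args[0].equals("--send"); // opt in, since the last step deletes data
        HttpClient http = HttpClient.newHttpClient();
        for (HttpRequest request : steps) {
            System.out.println(request.method() + " " + request.uri());
            if (send) {
                HttpResponse<String> resp = http.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println("  -> " + resp.statusCode() + " " + resp.body());
            }
        }
    }
}
```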

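(4) Create an index with an analyzer

The indexes created so far did not specify an analyzer. An analyzer can be set by adding request parameters when the index is created. The IK analyzer is currently popular in China; download the release that matches your ES version before using it.

IK analyzer downloads: https://github.com/medcl/elasticsearch-analysis-ik/releases

After downloading, unzip the analyzer into the plugins directory of the ES installation (typically into its own subfolder, for example plugins/ik), then restart the ES server.

Request format for creating an index with the IK analyzer (delete the comments before sending, otherwise the request fails):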

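<b><b><b><code class="language-json">PUT    http://localhost:9200/user

The request parameters are as follows (note that they are in JSON format):
{
    "mappings":{                            # define the mappings property; it replaces the default mappings of the new index
        "properties":{                      # define the properties (fields) the index contains
            "id":{                          # the index has an id property
                "type":"keyword"            # this property can be searched directly (exact match)
            },
            "name":{                        # the index has a name property
                "type":"text",              # this property is text and is tokenized
                "analyzer":"ik_max_word",   # tokenize with the IK analyzer
                "copy_to":"all"             # copy the value into the all property
            },
            "type":{
                "type":"keyword"
            },
            "description":{
                "type":"text",
                "analyzer":"ik_max_word",
                "copy_to":"all"
            },
            "all":{                         # a property that collects the terms of several fields; it can be searched
                "type":"text",
                "analyzer":"ik_max_word"
            }
        }
    }
}</code></b></b></b>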

The response is the same as when creating an index without an analyzer.

Querying the index now shows that the mappings sent in the request have become part of the index definition:

<b><b><b><code class="language-json">{
    "user": {
        "aliases": {},
        "mappings": {            # the mappings property has been replaced
            "properties": {
                "all": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                },
                "description": {
                    "type": "text",
                    "copy_to": [
                        "all"
                    ],
                    "analyzer": "ik_max_word"
                },
                "id": {
                    "type": "keyword"
                },
                "name": {
                    "type": "text",
                    "copy_to": [
                        "all"
                    ],
                    "analyzer": "ik_max_word"
                },
                "type": {
                    "type": "keyword"
                }
            }
        },
        "settings": {
            "index": {
                "routing": {
                    "allocation": {
                        "include": {
                            "_tier_preference": "data_content"
                        }
                    }
                },
                "number_of_shards": "1",
                "provided_name": "user",
                "creation_date": "1645769809521",
                "number_of_replicas": "1",
                "uuid": "DohYKvr_SZO4KRGmbZYmTQ",
                "version": {
                    "created": "7160299"
                }
            }
        }
    }
}</code></b></b></b>
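
Since the annotated request body only works once the # comments are removed, a small helper can do that mechanically. The sketch below is one possible way, not an ES feature: it assumes every comment starts with # and runs to the end of its line, and it leaves # characters inside JSON strings alone. The class name is made up.

```java
class JsonCommentStripper {

    /** Removes '#' comments (up to the end of the line) that appear outside JSON string literals. */
    static String strip(String annotated) {
        StringBuilder out = new StringBuilder(annotated.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous character inside the literal was a backslash
        int i = 0;
        while (i < annotated.length()) {
            char c = annotated.charAt(i);
            if (inString) {
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                i++;
            } else if (c == '"') {
                inString = true;
                out.append(c);
                i++;
            } else if (c == '#') {
                // Skip the comment but keep the newline that ends it.
                while (i < annotated.length() && annotated.charAt(i) != '\n') {
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String annotated = "{\n"
                + "  \"name\": { \"type\": \"text\" },  # tokenized field\n"
                + "  \"tag\": \"c#\"\n"
                + "}";
        System.out.println(strip(annotated)); // the comment is gone, "c#" is untouched
    }
}
```

The helper leaves trailing spaces where a comment used to be, which JSON parsers ignore.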

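We now have an index, but it holds no data yet, so data has to be added first. ES calls a piece of data a document; the document operations follow.

a. Add a document (three ways)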

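<b><b><b><code class="language-json">POST    http://localhost:9200/user/_doc         # id generated by ES
POST    http://localhost:9200/user/_create/1    # explicit id; fails if a document with this id already exists
POST    http://localhost:9200/user/_doc/1       # explicit id; creates the document if absent, replaces it if present (version increments)

The document is passed as the request body, in JSON:
{
    "name":"cllb",
    "type":"bozhu",
    "description":"xihuan java"
}</code></b></b></b>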

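b. Query documents

Note: set the request body to none when sending these requests, otherwise an error is returned.

<b><b><b><code class="language-json">GET    http://localhost:9200/user/_doc/1     # query a single document
GET    http://localhost:9200/user/_search    # query all documents</code></b></b></b>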

c. Conditional query

<b><b><b><code class="language-json">GET    http://localhost:9200/user/_search?q=name:cllb    # q=field name:field value</code></b></b></b>

d. Update a document (full replacement)

<b><b><b><code class="language-json">PUT    http://localhost:9200/user/_doc/1

The document is passed as the request body, in JSON:
{
    "name":"ccc",
    "type":"bb",
    "description":"123"
}</code></b></b></b>

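e. Update a document (partial update)

<b><b><b><code class="language-json">POST    http://localhost:9200/user/_update/1

The document is passed as the request body, in JSON:
{
    "doc":{                      # a partial update does not send the whole document; the fields to change go inside the doc property
        "name":"springboot"      # only the supplied fields are updated; fields that are not supplied stay unchanged
    }
}</code></b></b></b>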

f. Delete a document

<b><b><b><code class="language-json">DELETE    http://localhost:9200/user/_doc/1</code></b></b></b>
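
For reference, here is how the document requests above could be built from Java with the JDK's java.net.http client (Java 11+). It is a sketch under the same assumptions as the article (local server at http://localhost:9200, index user, no security); the class and helper names are made up. The program prints the requests and only sends them when started with --send.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;
import java.util.List;

class DocumentRequestsDemo {
    static final String BASE = "http://localhost:9200"; // assumed local server, as in this article

    /** Builds a request; a JSON body and Content-Type header are attached only when body is non-null. */
    static HttpRequest request(String method, String path, String body) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(BASE + path));
        if (body == null) {
            return builder.method(method, HttpRequest.BodyPublishers.noBody()).build();
        }
        return builder.header("Content-Type", "application/json")
                .method(method, HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    static HttpRequest addWithGeneratedId(String index, String docJson) {
        return request("POST", "/" + index + "/_doc", docJson);
    }

    static HttpRequest addOrReplace(String index, String id, String docJson) {
        return request("PUT", "/" + index + "/_doc/" + id, docJson);
    }

    static HttpRequest get(String index, String id) {
        return request("GET", "/" + index + "/_doc/" + id, null);
    }

    /** Wraps the changed fields in the "doc" object used by the _update endpoint. */
    static String updateBody(String changedFieldsJson) {
        return "{\"doc\":" + changedFieldsJson + "}";
    }

    static HttpRequest partialUpdate(String index, String id, String changedFieldsJson) {
        return request("POST", "/" + index + "/_update/" + id, updateBody(changedFieldsJson));
    }

    static HttpRequest delete(String index, String id) {
        return request("DELETE", "/" + index + "/_doc/" + id, null);
    }

    public static void main(String[] args) throws Exception {
        String doc = "{\"name\":\"cllb\",\"type\":\"bozhu\",\"description\":\"xihuan java\"}";
        List<HttpRequest> steps = Arrays.asList(
                addOrReplace("user", "1", doc),
                get("user", "1"),
                partialUpdate("user", "1", "{\"name\":\"springboot\"}"),
                delete("user", "1"));
        boolean send = args.length > 0 && args[0].equals("--send"); // opt in, since this modifies data
        HttpClient http = HttpClient.newHttpClient();
        for (HttpRequest r : steps) {
            System.out.println(r.method() + " " + r.uri());
            if (send) {
                HttpResponse<String> resp = http.send(r, HttpResponse.BodyHandlers.ofString());
                System.out.println("  -> " + resp.statusCode() + " " + resp.body());
            }
        }
    }
}
```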

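4. Integration (the earlier, lower-level version)

Integrating ES works much like integrating Redis or MongoDB. The steps for integrating ES with Spring Boot are as follows:

(1) Add the starter coordinate for the Spring Boot ES integration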

<b><b><b><code class="language-xml"><dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency></code></b></b></b>

(2) Basic configuration

<b><b><b><code class="language-yaml">spring:
  elasticsearch:
    rest:
      uris: http://localhost:9200</code></b></b></b>

This sets the ES server address, port 9200. (From Spring Boot 2.6 on, the same setting is spring.elasticsearch.uris; the rest prefix is deprecated.)

(3) Use ElasticsearchRestTemplate, the dedicated client class of the Spring Boot ES integration, to operate on ES

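<b><b><b><code class="language-java">import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;

@SpringBootTest
class Springboot18EsApplicationTests {
    // Injected by Spring Boot's auto-configuration
    @Autowired
    private ElasticsearchRestTemplate template;
}</code></b></b></b>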

(4) Define the pojo layer

<b><b><b><code class="language-java">package com.test;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

// Maps this class to documents in the "user" index.
@Document(indexName = "user")
public class User {
    @Id // the field used as the document id
    private Integer id;
    private String name;
    private String type;
    private String description;

    public User() {
    }

    public User(Integer id, String name, String type, String description) {
        this.id = id;
        this.name = name;
        this.type = type;
        this.description = description;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getType() {
        return type;
    }

    public void setType(String type) {
        this.type = type;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    @Override
    public String toString() {
        return "User{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", type='" + type + '\'' +
                ", description='" + description + '\'' +
                '}';
    }
}</code></b></b></b>

(5) Define the dao layer

<b><b><b><code class="language-java">package com.test;

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Spring Data generates the implementation; User is the document type, Integer the id type.
public interface Esresposity extends ElasticsearchRepository<User, Integer> {
}</code></b></b></b>

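Note: the approach above is the one Spring Boot integrated first, working through Spring Data's template and repository abstraction. ES also ships its own clients: the Low Level Client (RestClient) and, built on top of it, the High Level Client (RestHighLevelClient), which is released in step with each ES version. In Spring Data Elasticsearch 4.x, ElasticsearchRestTemplate itself delegates to RestHighLevelClient. In enterprise development it is common to use the High Level Client directly, which is what the next section does.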

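5. Integration (the newer, high-level version)

The steps below integrate ES with Spring Boot through the High Level Client. (Elastic has deprecated RestHighLevelClient since 7.15 in favor of the Java API Client, but it works with the 7.16.2 server used here.)

(1) Add the coordinate of the ES High Level Client

There is currently no starter for this style, so the dependency has to be looked up and added by hand.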

<b><b><b><code class="language-xml"><dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency></code></b></b></b>

(2) Configure the ES server connection in code and obtain the client object

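<b><b><b><code class="language-java">import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class HighClientTest {
    private RestHighLevelClient client;

    @Test
    void testCreateClient() throws IOException {
        // Point the client at the local ES server
        HttpHost host = HttpHost.create("http://localhost:9200");
        RestClientBuilder builder = RestClient.builder(host);
        client = new RestHighLevelClient(builder);

        // Release the underlying connections when done
        client.close();
    }
}</code></b></b></b>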

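Note: remember to close the client by hand when you are done with it. The server address and port 9200 are set in code; because this client is created and maintained by hand, it cannot be loaded through auto-wiring.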
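
One way to make sure close() is never forgotten is try-with-resources, which works for any class that implements Closeable, as RestHighLevelClient does. The sketch below shows the pattern with a made-up stand-in class, FakeClient, so that it runs without an ES server or any ES dependency; it is not an ES API.

```java
import java.io.Closeable;

class TryWithResourcesDemo {

    /** Hypothetical stand-in for a client that holds a connection; not an ES class. */
    static final class FakeClient implements Closeable {
        boolean closed = false;

        String ping() {
            if (closed) {
                throw new IllegalStateException("client is closed");
            }
            return "pong";
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Uses the client inside try-with-resources, so close() runs even if the body throws. */
    static FakeClient useAndClose() {
        FakeClient client = new FakeClient();
        try (FakeClient c = client) {
            c.ping();
        }
        return client; // returned only so the caller can inspect the closed flag
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + useAndClose().closed); // prints closed after use: true
    }
}
```

With the real client the shape would be the same: create the RestHighLevelClient in the try header and do the work in the body.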

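(3) Operate on ES with the client object

For example, creating an index (run the delete-index operation shown earlier first, otherwise this reports an error because the index already exists):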

<b><b><b><code class="language-java">import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.indices.CreateIndexRequest;

@SpringBootTest
class HighClientTest {
    private RestHighLevelClient client;

    @Test
    void testCreateIndex() throws IOException {
        HttpHost host = HttpHost.create("http://localhost:9200");
        RestClientBuilder builder = RestClient.builder(host);
        client = new RestHighLevelClient(builder);

        // Create the "user" index with default settings
        CreateIndexRequest request = new CreateIndexRequest("user");
        client.indices().create(request, RequestOptions.DEFAULT);

        client.close();
    }
}</code></b></b></b>

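Note: the first step is always to obtain a RestHighLevelClient object, and the last step is always to close it. The request object for creating an index is CreateIndexRequest; other operations have their own dedicated Request objects. That suggests extracting the common steps into methods: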

<b><b><b><code class="language-JAVA">import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;

@SpringBootTest
class Springboot18EsApplicationTests {
    private RestHighLevelClient client;

    @BeforeEach    // runs before each test method in this class
    void setUp() {
        HttpHost host = HttpHost.create("http://localhost:9200");
        RestClientBuilder builder = RestClient.builder(host);
        client = new RestHighLevelClient(builder);
    }

    @AfterEach    // runs after each test method in this class
    void tearDown() throws IOException {
        client.close();
    }

    @Test
    void testCreateIndex() throws IOException {
        CreateIndexRequest request = new CreateIndexRequest("user");
        client.indices().create(request, RequestOptions.DEFAULT);
    }
}</code></b></b></b>

This is much shorter and more sensible. The remaining ES operations below all follow the same pattern.

Create an index (IK analyzer):

<b><b><b><code class="language-java">@Test
void testCreateIndexByIK() throws IOException {
    // Delete any existing "user" index first, otherwise the create call fails
    CreateIndexRequest request = new CreateIndexRequest("user");
    String json = "{\n" +
            "    \"mappings\":{\n" +
            "        \"properties\":{\n" +
            "            \"id\":{\n" +
            "                \"type\":\"keyword\"\n" +
            "            },\n" +
            "            \"name\":{\n" +
            "                \"type\":\"text\",\n" +
            "                \"analyzer\":\"ik_max_word\",\n" +
            "                \"copy_to\":\"all\"\n" +
            "            },\n" +
            "            \"type\":{\n" +
            "                \"type\":\"keyword\"\n" +
            "            },\n" +
            "            \"description\":{\n" +
            "                \"type\":\"text\",\n" +
            "                \"analyzer\":\"ik_max_word\",\n" +
            "                \"copy_to\":\"all\"\n" +
            "            },\n" +
            "            \"all\":{\n" +
            "                \"type\":\"text\",\n" +
            "                \"analyzer\":\"ik_max_word\"\n" +
            "            }\n" +
            "        }\n" +
            "    }\n" +
            "}";
    // Set the request parameters (XContentType is org.elasticsearch.common.xcontent.XContentType)
    request.source(json, XContentType.JSON);
    client.indices().create(request, RequestOptions.DEFAULT);
}</code></b></b></b>

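Note: the IK analyzer is configured through request parameters, which are set with the source method of the request object. What the parameters are depends on the kind of operation. Whenever a request needs parameters, they can be set in this form.

Add a document: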

<b><b><b><code class="language-java">@Test
// Add a document
void testCreateDoc() throws IOException {
    // userDao is assumed to be an injected data-access object (selectById suggests a MyBatis-Plus mapper);
    // JSON is assumed to be fastjson's com.alibaba.fastjson.JSON
    User user = userDao.selectById(1);
    IndexRequest request = new IndexRequest("user").id(user.getId().toString());
    String json = JSON.toJSONString(user);
    request.source(json, XContentType.JSON);
    client.index(request, RequestOptions.DEFAULT);
}</code></b></b></b>

Adding a document uses the request object IndexRequest, which is different from the one used to create an index.

Add documents in bulk:

<b><b><b><code class="language-java">@Test
// Add documents in bulk
void testCreateDocAll() throws IOException {
    List<User> userList = userDao.selectList(null);
    BulkRequest bulk = new BulkRequest();
    for (User user : userList) {
        // One IndexRequest per document, keyed by the user's id
        IndexRequest request = new IndexRequest("user").id(user.getId().toString());
        String json = JSON.toJSONString(user);
        request.source(json, XContentType.JSON);
        bulk.add(request);
    }
    // Send all collected requests in a single call
    client.bulk(bulk, RequestOptions.DEFAULT);
}</code></b></b></b>

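Note: for a batch, first create a BulkRequest object, which can be thought of as a container for request objects. Initialize all the requests, add them to the BulkRequest, and then pass it to the client's bulk method to execute them all in one call.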
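
On the wire, the "container" idea corresponds to the body of the _bulk REST endpoint: newline-delimited JSON in which every document takes two lines, an action line and the document source. The plain-Java sketch below builds such a body for index actions. It assumes index names and ids need no escaping and that each source is single-line JSON; the class and method names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BulkBodyDemo {

    /**
     * Builds an NDJSON body for POST /_bulk: one "index" action line followed by
     * one source line per document. Every line, including the last, ends with '\n'.
     */
    static String bulkIndexBody(String index, Map<String, String> sourceById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> doc : sourceById.entrySet()) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                    .append("\",\"_id\":\"").append(doc.getKey()).append("\"}}\n");
            body.append(doc.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>(); // keeps insertion order
        docs.put("1", "{\"name\":\"cllb\",\"type\":\"bozhu\"}");
        docs.put("2", "{\"name\":\"springboot\",\"type\":\"framework\"}");
        System.out.print(bulkIndexBody("user", docs));
    }
}
```

The BulkRequest in the test above does this assembly for you, so the sketch is only meant to show what "one call for many requests" looks like underneath.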

Query a document by id:

<b><b><b><code class="language-java">@Test
// Query by id
void testGet() throws IOException {
    GetRequest request = new GetRequest("user", "1");
    GetResponse response = client.get(request, RequestOptions.DEFAULT);
    // The stored document as a JSON string
    String json = response.getSourceAsString();
    System.out.println(json);
}</code></b></b></b>

Querying a document by id uses the request object GetRequest.

Query documents by condition:

<b><b><b><code class="language-java">@Test
// Query by condition
void testSearch() throws IOException {
    SearchRequest request = new SearchRequest("user");

    // Build the query: documents whose merged "all" field contains the term "spring"
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.termQuery("all", "spring"));
    request.source(builder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    SearchHits hits = response.getHits();
    for (SearchHit hit : hits) {
        String source = hit.getSourceAsString();
        //System.out.println(source);
        // Convert each hit back into a User object
        User user = JSON.parseObject(source, User.class);
        System.out.println(user);
    }
}</code></b></b></b>

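Note: the request object for a conditional query is SearchRequest. The query itself is built with QueryBuilders.termQuery and attached through a SearchSourceBuilder. termQuery takes the name of the field to search, and the merged field is supported here too, that is, the all property added when the index properties were defined.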
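
For comparison with the REST requests from section 3, the term query above corresponds to a small JSON body posted to the _search endpoint. The plain-Java sketch below builds that body and the matching request with the JDK's java.net.http classes (Java 11+) without sending it. The escape helper handles only backslashes and double quotes, so it is not a full JSON encoder, and the class and method names are made up.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class TermQueryBodyDemo {

    /** Escapes backslashes and double quotes; enough for simple field names and values. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the query DSL body for a term query on a single field. */
    static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{\"" + escape(field) + "\":\"" + escape(value) + "\"}}}";
    }

    /** Builds POST {baseUrl}/{index}/_search carrying the given JSON body. */
    static HttpRequest searchRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = termQuery("all", "spring");
        HttpRequest request = searchRequest("http://localhost:9200", "user", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body); // prints {"query":{"term":{"all":"spring"}}}
    }
}
```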

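Summary: ES exists to make queries fast. More detailed posts about ES will follow. I hope this helps, and thanks for reading.

Closing words: once nudity becomes art, it is the most sacred of things; once morality sinks into hypocrisy, it is the most vulgar. Be brave, do what you believe is right, and do not let gossip trouble you.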


Original link: 【http://xie.infoq.cn/article/77d40c057b969545f818fc8b7】. Please contact the author before reposting.
