
Elasticsearch in Practice: Common Errors and Detailed Solutions

  • 2023-11-02
    Zhejiang


1. "read_only_allow_delete": "true"

When indexing a document, you may occasionally (though rarely) hit the following error:


{
  "error": {
    "root_cause": [
      {
        "type": "cluster_block_exception",
        "reason": "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
      }
    ],
    "type": "cluster_block_exception",
    "reason": "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
  },
  "status": 403
}


The error says the index is currently in read-only mode. Checking the index settings at this point confirms it:


GET z1/_settings

# Result:
{
  "z1" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "5",
        "blocks" : {
          "read_only_allow_delete" : "true"
        },
        "provided_name" : "z1",
        "creation_date" : "1556204559161",
        "number_of_replicas" : "1",
        "uuid" : "3PEevS9xSm-r3tw54p0o9w",
        "version" : {
          "created" : "6050499"
        }
      }
    }
  }
}


The setting "read_only_allow_delete" : "true" means documents can no longer be inserted into the index. We can also reproduce the error deliberately:


PUT z1
{
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "text"
        }
      }
    }
  },
  "settings": {
    "index.blocks.read_only_allow_delete": true
  }
}

PUT z1/doc/1
{
  "title": "es真难学"
}


Any insert into the index now fails with the error shown at the start. So how do we fix it?


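  • Free up disk space so that usage drops below 85%. Elasticsearch applies this block automatically when a node runs low on disk.

  • Adjust the setting manually; see the official documentation for details.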
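
To make the disk thresholds concrete, here is a minimal sketch that classifies a disk-usage percentage against Elasticsearch's default disk watermarks (85% low, 90% high, 95% flood stage). The function names and return labels are illustrative and not part of any Elasticsearch API, the path "." is a placeholder for the node's data directory, and a cluster may override the defaults.

```python
import shutil

# Default disk watermarks in Elasticsearch (cluster settings may override them).
LOW_WATERMARK = 85.0
HIGH_WATERMARK = 90.0
FLOOD_STAGE_WATERMARK = 95.0


def classify_disk_usage(used_percent):
    """Return which default watermark, if any, the usage has crossed."""
    if used_percent > FLOOD_STAGE_WATERMARK:
        return "flood_stage"  # indices on the node get read_only_allow_delete
    if used_percent > HIGH_WATERMARK:
        return "high"  # Elasticsearch tries to relocate shards off the node
    if used_percent > LOW_WATERMARK:
        return "low"  # shard allocation to the node is restricted
    return "ok"


def local_disk_used_percent(path="."):
    """Percentage of the disk holding `path` that is in use (placeholder path)."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    percent = local_disk_used_percent()
    print(f"{percent:.1f}% used -> {classify_disk_usage(percent)}")
```

Running this on a data node gives a quick hint about whether disk pressure is the likely cause of the block before changing any index setting.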


One manual fix is to reset the setting to null:


PUT z1/_settings
{
  "index.blocks.read_only_allow_delete": null
}


The index settings now look normal again, and inserts and queries work as usual.
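
The same check and reset can be scripted. The sketch below is one possible shape, not an official recipe: `blocked_indices` and `clear_block` are hypothetical helper names, and `es` is assumed to be an elasticsearch-py client created by the caller, used with the same `put_settings(index=..., body=...)` call style that appears later in this article.

```python
def blocked_indices(settings_response):
    """Return names of indices whose settings carry read_only_allow_delete=true."""
    blocked = []
    for name, body in settings_response.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # The settings API returns values as strings, e.g. "true".
        if str(blocks.get("read_only_allow_delete", "false")).lower() == "true":
            blocked.append(name)
    return blocked


# Body for PUT <index>/_settings; null (None in Python) resets the setting.
RESET_BLOCK_BODY = {"index.blocks.read_only_allow_delete": None}


def clear_block(es, index):
    """Reset the block through a client object supplied by the caller."""
    return es.indices.put_settings(index=index, body=RESET_BLOCK_BODY)
```

A typical use would be to pass the result of `es.indices.get_settings(index="z1")` to `blocked_indices` and call `clear_block` for each name it returns, after disk space has been freed.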

2. illegal_argument_exception

Aggregations sometimes fail with the following error:


{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [age] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "z2",
        "node": "NRwiP9PLRFCTJA7w3H9eqA",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [age] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [age] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [age] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    }
  },
  "status": 400
}


What happened? By default, an aggregation cannot run on a field of type text. Consider the following example:


PUT z2/doc/1
{
  "age": "18"
}

PUT z2/doc/2
{
  "age": 20
}

GET z2/doc/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "my_sum": {
      "sum": {
        "field": "age"
      }
    }
  }
}


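When a document is indexed, Elasticsearch first creates the index if it does not exist yet; otherwise it adds or updates the document in the existing index. It then checks the mapping type of the age field. In the example above, the z2 index does not exist when the first document is added, so Elasticsearch creates it automatically and generates a mapping for age by guessing the type from the value. The value "18" is a string, so the field is mapped as text. From then on the mapping is fixed: the second document's age is also handled as text, not as the long its unquoted value suggests. The index mapping confirms this: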


GET z2/_mapping

# The mapping is:
{
  "z2" : {
    "mappings" : {
      "doc" : {
        "properties" : {
          "age" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}


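The result shows that age is mapped as text, a type that does not support aggregations by default, hence the error. There are two ways to avoid it: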


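  • If you rely on dynamic mapping, remember that each field's type is determined by the first document that contains it. Sending the value quoted the first time and unquoted the second time does not switch the field from text to long, so make sure the first document carries the correct types.

  • If that feels too fragile, create the mapping explicitly and declare each field's type up front, which prevents this kind of error later.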
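
Both points can be illustrated with a small sketch. `guess_dynamic_type` and `infer_mapping` are hypothetical helpers that model, in a deliberately simplified way, how dynamic mapping picks a type from the first value it sees; they ignore date detection, arrays, and nulls and are not Elasticsearch code. `EXPLICIT_MAPPING` is an assumed request body for creating the z2 index with a numeric age field, using the 6.x `doc` type from the examples in this article.

```python
def guess_dynamic_type(value):
    """Simplified model of the type dynamic mapping picks for a JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"  # Elasticsearch also adds a keyword sub-field
    if isinstance(value, dict):
        return "object"
    raise ValueError(f"value not covered by this simplified model: {value!r}")


def infer_mapping(documents):
    """The first document that contains a field fixes that field's type."""
    mapping = {}
    for doc in documents:
        for field, value in doc.items():
            mapping.setdefault(field, guess_dynamic_type(value))
    return mapping


# Explicit mapping body for a 6.x index with a `doc` type, so `age` is numeric.
EXPLICIT_MAPPING = {
    "mappings": {"doc": {"properties": {"age": {"type": "long"}}}}
}

print(infer_mapping([{"age": "18"}, {"age": 20}]))  # {'age': 'text'}
print(infer_mapping([{"age": 18}, {"age": "20"}]))  # {'age': 'long'}
```

Swapping the order of the two documents changes the inferred type, which is exactly why the sum aggregation failed on z2. Sending `EXPLICIT_MAPPING` when creating the index removes the dependence on document order.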

3. Result window is too large

A single query can match a very large number of documents. How many hits Elasticsearch returns per request is controlled by the size parameter:


GET e2/doc/_search
{
  "size": 100000,
  "query": {
    "match_all": {}
  }
}


By default at most 10,000 hits can be returned, so a request for more than that (say 100,000) fails with:


Result window is too large, from + size must be less than or equal to: [10000] but was [100000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.


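In other words, the request asks for too large a result window. You can either switch to the scroll API or raise the upper limit on from + size through the index.max_result_window setting: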


# Set it in Kibana
PUT e2/_settings
{
  "index": {
    "max_result_window": "100000"
  }
}

# Set it in Python
from elasticsearch import Elasticsearch

es = Elasticsearch()
es.indices.put_settings(index='e2', body={"index": {"max_result_window": 100000}})


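In the example above we raised the maximum result window of index e2 to 100,000, so any query whose from + size stays within 100,000 is answered in a single response. Note that this setting persists for index e2 until it is changed again.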
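
The error message also points to the scroll API as the more efficient route for large result sets. The following is a minimal sketch of that alternative, assuming an elasticsearch-py client whose `search`, `scroll`, and `clear_scroll` methods accept the keyword arguments shown (the `body=` style matches the client call used above). `scroll_all`, `page_size`, and `keep_alive` are names chosen for this illustration.

```python
def scroll_all(es, index, query, page_size=1000, keep_alive="1m"):
    """Yield every hit for `query`, fetching `page_size` hits per request."""
    response = es.search(
        index=index,
        body={"query": query},
        scroll=keep_alive,  # how long the server keeps the search context
        size=page_size,     # stays far below index.max_result_window
    )
    scroll_id = response["_scroll_id"]
    try:
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            response = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response["_scroll_id"]
    finally:
        # Release the server-side search context as soon as we are done.
        es.clear_scroll(scroll_id=scroll_id)
```

With a real client this could be used as `for hit in scroll_all(es, "e2", {"match_all": {}}): ...`, which walks all documents without touching index.max_result_window. The client library also ships a ready-made wrapper, `elasticsearch.helpers.scan`, which is likely the better choice in production code.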

4. More to come

For more content, follow the official account 汀丶人工智能, which shares related resources and articles for free.
