使用腾讯云大数据 Elasticsearch 8.8.1 实现：NLP+Vector Search+GAI

2024-01-03
广东
本文字数：7516 字
阅读完需：约 25 分钟

导读

腾讯云大数据 Elasticsearch Service 在最近上线了 8.8.1 版本。该版本中的核心能力，是为 AI 革命提供高级搜索能力！该版本特别引入了 Elasticsearch Relevance Engine™（ESRE™）—— 一款强大的 AI 增强搜索引擎，为搜索与分析带来全新的前沿体验。

虽然大家会觉得现在遍地开花的向量库，这个发布也仅仅是“其中之一”。但如果我给大家强调一下其中的重点，或许会让大家有一个更清楚的认知和定位：

这是目前中国公有云上仅有的提供从自然语言处理，到向量化，再到向量搜索，并能与大模型集成的端到端的搜索与分析平台：

图一

1、不是所有的向量库，都能在单一接口调用中实现多路召回混合排序！

2、也不是所有的搜索引擎，能在执行向量搜索之后能再进行聚合操作！

图二

当然，本文的重点不是介绍，而是应用与实践。接下来，本文将展示如何在腾讯云上创建 Elasticsearch 8.8.1 集群，并部署与使用 NLP 模型结合，并在向量搜索的基础上，与大模型进行结合。

一、创建 Elasticsearch 8.8.1 集群

建的过程很简单，与以往一样，选择对应的版本即可。这里需要强调的是，因为我们要将各种 NLP 模型，embedding 模型部署到集群当中，因此需要尽量选择足够的内存用于模型的部署。

图三

图四

二、部署 NLP 模型

无论是执行向量搜索还是通过执行命名实体识别等 NLP 任务的来提取文本中的信息，都需要执行推理任务。腾讯云 Elasticsearch 8.8.1 最大的区别在于，您无需再搭建一套机器学习环境用于数据的处理和推理，您可以直接在 Elasticsearch 上，通过在管道中集成不同的 Processor，灵活的处理数据。

图五、在 ingest pipeline 中执行处理和推理

并且保证我们的查询和写入的数据都是使用同样的一个模型来处理数据。以简化模型在使用，更新和维护上的成本。

图六

而部署的方式非常简单。我们提供了一款叫做 eland 的工具来实现模型的上传和部署：

eland_import_hub_model --url https://es-7cu6zx9m.public.tencentelasticsearch.com:9200  --insecure -u elastic -p changeme --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 --task-type text_embedding --start --insecure

复制代码

在部署时，如果是在你自己的个人电脑上使用 eland_import_hub_model（因为如果是从 huggingFace 上下载模型，需要互联网的访问能力），则需要提供腾讯云 Elasticsearch 的公网访问接口：

图七

当然，你也可以在腾讯云上购买一个 CVM，然后使用内网访问地址：

eland_import_hub_model --url https://172.27.0.11:9200  --insecure -u elastic -p changeme --hub-model-id canIjoin/datafun --task-type ner --start

复制代码

但需要注意的是，CVM 上有时无法访问 huggingFace 或者访问超时，都可能会造成模型无法上传和部署。

与互联网访问受限类似，如果你是自己训练的自有模型，又不想发布到 huggingFace，则可以参考本文如何将本地 transformer 模型部署到 Elasticsearch，以将本地的 NLP 模型上传和部署。

如果正确执行了模型的上传，你会看到如下的打印：

eland_import_hub_model --url https://es-7cu6zx9m.public.tencentelasticsearch.com:9200  --insecure -u elastic -p changeme  --hub-model-id distilbert-base-uncased-finetuned-sst-2-english --task-type text_classification --start --insecure2023-07-13 10:06:23,354 WARNING : NOTE: Redirects are currently not supported in Windows or MacOs.2023-07-13 10:06:24,358 INFO : Establishing connection to Elasticsearch/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/elasticsearch/_sync/client/__init__.py:394: SecurityWarning: Connecting to 'https://es-7cu6zx9m.public.tencentelasticsearch.com:9200' using TLS with verify_certs=False is insecure  _transport = transport_class(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(2023-07-13 10:06:24,535 INFO : Connected to cluster named 'es-7cu6zx9m' (version: 8.8.1)2023-07-13 10:06:24,537 INFO : Loading HuggingFace transformer tokenizer and model 'distilbert-base-uncased-finetuned-sst-2-english'Downloading pytorch_model.bin: 100%|████████████████████████████████████████████████████████████| 268M/268M [00:19<00:00, 13.6MB/s]/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/distilbert/modeling_distilbert.py:223: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.  mask, torch.tensor(torch.finfo(scores.dtype).min)/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(2023-07-13 10:06:48,795 INFO : Creating model with id 'distilbert-base-uncased-finetuned-sst-2-english'/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(2023-07-13 10:06:48,855 INFO : Uploading model definition  0%|                                                                                                   | 0/64 [00:00<?, ? parts/s]/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(  2%|█                                                                                         | 1/64 [00:01<01:25,  1.36s/ parts]/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(  3%|██                                                                                        | 2/64 [00:01<00:53,  1.16 parts/s]/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings...100%|██████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:45<00:00,  1.42 parts/s]2023-07-13 10:07:34,021 INFO : Uploading model vocabulary/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(2023-07-13 10:07:34,110 INFO : Starting model deployment/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings  warnings.warn(2023-07-13 10:07:41,163 INFO : Model successfully imported with id 'distilbert-base-uncased-finetuned-sst-2-english'

复制代码

三、模型管理和测试

模型上传之后，您可以直接在腾讯云 Elasticsearch Service 的 Kibana 界面上管理和测试模型：

图八

四、在应用中通过 Elasticsearch 实现 NLP+Vector Search+GAI

当我们完成模型的部署和调试之后，就可以在应用中集成这种能力。比如，我们要实现一个论文的问答系统，可以按照如下的步骤实施：

图九

其中的一些核心代码为：

async def nlp_blog_search():    # 判断模型是否在Elasticsaerch中加载    global app_models    is_model_up_and_running(INFER_MODEL_TEXT_EMBEDDINGS)    is_model_up_and_running(INFER_MODEL_Q_AND_A)
    qa_model = True if app_models.get(        INFER_MODEL_Q_AND_A) == 'started' else False    index_name = INDEX_BLOG_SEARCH
    if not es.indices.exists(index=index_name):        return render_template('nlp_blog_search.html', title='Blog search', te_model_up=False,                               index_name=index_name, missing_index=True, qa_model_up=qa_model)
    if app_models.get(INFER_MODEL_TEXT_EMBEDDINGS) == 'started':        form = SearchBlogsForm()
        # Check for method        if request.method == 'POST':
            if form.validate_on_submit():                if 'filter_by_author' in request.form:                    form.searchboxAuthor.data = request.form['filter_by_author']
                if form.searchboxBlogWindow.data is None or len(form.searchboxBlogWindow.data) == 0:                    # 对查询进行embedding转换                    embeddings_response = infer_trained_model(                        form.searchbox.data, INFER_MODEL_TEXT_EMBEDDINGS)                    # 执行向量搜索（）混合搜索、/混合搜索                                         search_response = knn_blogs_embeddings(embeddings_response['predicted_value'],                                                           form.searchboxAuthor.data)                    cfg = {                        "question_answering": {                            "question": form.searchbox.data,                            "max_answer_length": 30                        }                    }
                    hits_with_answers = search_response['hits']['hits']                    #使用QA模型做第一遍过滤                    answers = executor.map(q_and_a, map(lambda hit: hit["_id"], hits_with_answers),                                           map(lambda hit: form.searchbox.data, hits_with_answers),                                           map(lambda hit: get_text(hit=hit), hits_with_answers))
                    best_answer = None                    for i in range(0, len(hits_with_answers)):                        hit_with_answer = hits_with_answers[i]
                        matched_answer = next(                            (obj['result'] for obj in answers if obj["_id"] == hit_with_answer["_id"]), None)
                        if (matched_answer is not None):                            hit_with_answer['answer'] = matched_answer                            if (best_answer is None or (                                    matched_answer is not None and 'prediction_probability' in matched_answer and                                    matched_answer['prediction_probability'] > best_answer['prediction_probability'])):                                best_answer = matched_answer
                            start_idx = matched_answer['start_offset']                            end_idx = matched_answer['end_offset']
                            text = hits_with_answers[i]['fields']['body_content_window'][0]                            text_with_highlighted_answer = Markup(''.join([text[0:start_idx - 1],                                                                           "<b>", text[start_idx -                                                                                       1:end_idx],                                                                           "</b>", text[end_idx:]]))                            hits_with_answers[i]['fields']['body_content_window'][0] = text_with_highlighted_answer
                    # 将结果交给大模型进行总结                    messages = blogs_convert_es_response_to_messages(search_response,                                                                            form.searchbox.data)                    # Send a request to the OpenAI API                    try:                        response_ai = openai.ChatCompletion.create(                            engine="gpt-35-turbo",                            temperature=0,                            messages=messages                        )
                        answer_openai = response_ai["choices"][0]["message"]["content"]                    except RateLimitError as e:                        print(e.error.message)                        answer_openai =  e.error.message                    except APIConnectionError as e:                        print(e.error.message)                        answer_openai = e.error.message
                    return render_template('nlp_blog_search.html', title='Blog search', form=form,                                           search_results=hits_with_answers,                                           best_answer=best_answer, openai_answer=answer_openai,                                           query=form.searchbox.data, te_model_up=True, qa_model_up=qa_model,                                           missing_index=False)                else:                    search_response = q_and_a(                        question=form.searchbox.data, full_text=form.searchboxBlogWindow.data)                    return render_template('nlp_blog_search.html', title='Blog search', form=form,                                           qa_results=search_response,                                           query=form.searchbox.data, te_model_up=True, qa_model_up=qa_model,                                           missing_index=False)            else:                return redirect(url_for('nlp_blog_search'))        else:  # GET            return render_template('nlp_blog_search.html', title='Blog Search', form=form, te_model_up=True,                               qa_model_up=qa_model, missing_index=False)    else:        return render_template('nlp_blog_search.html', title='Blog search', te_model_up=False, qa_model_up=qa_model,                               model_name=INFER_MODEL_TEXT_EMBEDDINGS, missing_index=False)

复制代码

通过这些集成，可以实现如下效果：

图十

也就是说，通过腾讯云 Elasticsearch 8.8.1，我们可以在应用轻松实现仅通过调用 Elasticsearch 的接口，即可实现向量搜索并将结果交给 QnA 模型来抓取重点。并进一步把内容交给大模型来进行总结。

五、总结

腾讯云大数据 Elasticsearch Service 发布的这个最新版本 8.8.1，它引入了 Elasticsearch Relevance Engine™（ESRE™），提供高级搜索和 AI 增强搜索功能。这个版本支持在单一端到端搜索与分析平台中实现自然语言处理、向量搜索以及与大模型的集成。使用该服务，你可以轻松创建集群、部署 NLP 模型，并进行搜索和推理任务。你还可以在 Kibana 界面上进行模型的管理和测试。通过腾讯云 Elasticsearch，你可以实现由 AI 驱动的高级搜索能力，并帮助你更好地利用 AI 技术。赶快体验吧！

发布于: 刚刚阅读数: 3

腾讯云大数据

关注

还未添加个人签名 2020-06-19 加入

欢迎关注，邀您一起探索数据的无限潜能！

发布

暂无评论

创作场景

使用腾讯云大数据 Elasticsearch 8.8.1 实现：NLP+Vector Search+GAI

一、创建 Elasticsearch 8.8.1 集群

二、部署 NLP 模型

三、模型管理和测试

四、在应用中通过 Elasticsearch 实现 NLP+Vector Search+GAI

五、总结

腾讯云大数据

评论