LLM: Trying Out the Baichuan2-7B-Chat-4bits Model with SPO Triples and Prompt Engineering

2024-05-13
Henan

Overview

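I have recently been thinking about application scenarios for LLMs, and one of the simplest is customer service. Suppose all of the knowledge currently lives in a single Excel document. The idea is to first convert it into subject-predicate-object (SPO) triples, then combine those triples with prompt engineering to put questions to the LLM and see whether it returns the expected answers.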
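The row-to-triple conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the article: the column names, the sample rows, and the helper `rows_to_triples` are all hypothetical, and the rows are given as plain dictionaries rather than read from a real Excel file.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def rows_to_triples(rows: List[Dict[str, str]], subject_key: str) -> List[Triple]:
    """Turn each table row into (subject, predicate, object) triples.

    The value in `subject_key` becomes the subject; every other non-empty
    column contributes one triple with the column name as the predicate.
    """
    triples: List[Triple] = []
    for row in rows:
        subject = str(row[subject_key])
        for column, value in row.items():
            if column == subject_key or value in (None, ""):
                continue  # skip the subject column itself and empty cells
            triples.append((subject, column, str(value)))
    return triples


# Hypothetical product rows standing in for the Excel sheet.
rows = [
    {"product": "Phone A", "price": "2999", "color": "black"},
    {"product": "Phone B", "price": "3999", "color": ""},
]

triples = rows_to_triples(rows, subject_key="product")
for triple in triples:
    print(triple)
```

Treating one column as the subject and the remaining column headers as predicates is only one possible mapping; a sheet with several entity columns would need a different rule.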

Results

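On the far left is an Excel sheet containing product information. The text in the middle is the Excel data converted into SPO triples, with the prompt shown in the figure added on top. On the right is the result returned by the model, which shows that it can return data in the requested form.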
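As a rough sketch of how triples and a prompt might be combined into a chat message, consider the following. The prompt wording, the sample triples, and the helpers `build_prompt` and `build_messages` are assumptions made for illustration, since the exact prompt used in the figure is not reproduced in the text. Only the `{"role": "user", "content": ...}` message shape comes from the calling code in this article.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_prompt(triples: List[Triple], question: str) -> str:
    """Render the triples as a fact list followed by instructions and the question."""
    facts = "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)
    return (
        "You are a customer-service assistant. Answer using only the facts below, "
        "given as (subject, predicate, object) triples. "
        "If the facts do not contain the answer, reply \"I don't know\".\n\n"
        f"Facts:\n{facts}\n\n"
        f"Question: {question}"
    )


def build_messages(triples: List[Triple], question: str) -> List[Dict[str, str]]:
    """Wrap the prompt in the single-turn message list format used for chat calls."""
    return [{"role": "user", "content": build_prompt(triples, question)}]


# Hypothetical triples; in practice these would come from the Excel data.
sample_triples = [("Phone A", "price", "2999"), ("Phone A", "color", "black")]
messages = build_messages(sample_triples, "How much is Phone A?")
print(messages[0]["content"])
```

The resulting `messages` list could then be passed to the model in place of a hand-written question.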

Invocation

After installing Baichuan2-7B-Chat-4bits, call it with the following code to get a response.

import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

# Directory that contains this script
current_dir = os.path.dirname(os.path.abspath(__file__))
# Full path to the model folder, located under 'model' next to this script
model_path = os.path.join(current_dir, 'model/Baichuan2-7B-Chat-4bits')
print('model_path=', model_path)

if __name__ == '__main__':
    # trust_remote_code=True is needed because the model ships its own Python code
    tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
    model.generation_config = GenerationConfig.from_pretrained(model_path)
    # Single-turn conversation; the prompt asks the model to explain the saying "温故而知新"
    messages = []
    messages.append({"role": "user", "content": "解释一下“温故而知新”"})
    response = model.chat(tokenizer, messages)
    print(response)


Deployment

git clone https://github.com/baichuan-inc/Baichuan2.git
cd Baichuan2
pip install -r requirements.txt
## Work around package compatibility issues for Baichuan2-7B-Chat-4bits
pip install bitsandbytes==0.41.1
pip install accelerate==0.25.0
## The Python script above can likewise be placed in the ${Baichuan2} folder.
## To match that script, download the model files into ${Baichuan2}/model/Baichuan2-7B-Chat-4bits.
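Because the setup depends on specific package versions, a small check run before loading the model can catch a mismatched environment early. The sketch below uses only the standard library's `importlib.metadata`; the helper `check_pins` and the `PINNED` table are hypothetical names, and the two versions are simply the ones pinned in the commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, List

# Versions taken from the pip commands in this article.
PINNED: Dict[str, str] = {"bitsandbytes": "0.41.1", "accelerate": "0.25.0"}


def check_pins(pins: Dict[str, str],
               get_version: Callable[[str], str] = version) -> List[str]:
    """Return one human-readable problem per package that is missing or mismatched."""
    problems: List[str] = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name}: found {found}, want {wanted}")
    return problems


# Report any problems in the current environment; prints nothing if all pins match.
for problem in check_pins(PINNED):
    print(problem)
```

The `get_version` parameter exists only so the check can be exercised without the packages installed; in normal use the default lookup is enough.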


Summary

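In actual deployment, Baichuan's quantized Baichuan2-7B-Chat-4bits model used about 10 GB of GPU memory, slightly more than others have reported in their experiments, so it still places some demands on consumer-grade graphics cards.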
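For a sense of scale, the memory needed just to store the weights can be estimated with simple arithmetic. The sketch below assumes a nominal 7 billion parameters (the exact parameter count is an assumption here) and counts only weight storage. It does not account for anything else a running model needs, such as activations, the key-value cache, or framework overhead, so it is a lower bound rather than a prediction of the figure observed above.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed to store the weights alone, in GiB (2**30 bytes)."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 7e9  # assumption: nominal "7B", not an exact parameter count

for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit weights account for only a few GiB, which suggests that a large share of the observed usage comes from sources other than weight storage. This article did not measure where that share goes.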

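I originally chose the quantized Baichuan2-7B-Chat-4bits model in order to keep hardware requirements as low as possible, but in practice the hardware requirements turned out to be higher than expected.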

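This experiment did not use a private knowledge base and made no comparison against one; a follow-up will implement that approach and compare the two.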

References


Original link: http://xie.infoq.cn/article/3ed6f652cdf1c2399c3dc48e2

alexgaoyh


https://gitee.com/alexgaoyh
