写点什么

私密离线聊天新体验!llama-gpt 聊天机器人:极速、安全、搭载 Llama 2

  • 2023-10-11
    浙江
  • 本文字数:1734 字

    阅读完需:约 6 分钟

私密离线聊天新体验!llama-gpt聊天机器人:极速、安全、搭载Llama 2

“私密离线聊天新体验!llama-gpt 聊天机器人:极速、安全、搭载 Llama 2,尽享 Code Llama 支持!”

一个自托管的、离线的、类似 chatgpt 的聊天机器人。由美洲驼提供动力。100%私密,没有数据离开您的设备。

Demo

https://github.com/getumbrel/llama-gpt/assets/10330103/5d1a76b8-ed03-4a51-90bd-12ebfaf1e6cd

1.支持模型

Currently, LlamaGPT supports the following models. Support for running custom models is on the roadmap.


1.1 安装 LlamaGPT 在 umbrelOS

Running LlamaGPT on an umbrelOS home server is one click. Simply install it from the Umbrel App Store.


1.2 安装 LlamaGPT on M1/M2 Mac

Make sure your have Docker and Xcode installed.


Then, clone this repo and cd into it:


git clone https://github.com/getumbrel/llama-gpt.gitcd llama-gpt
复制代码


Run LlamaGPT with the following command:


./run-mac.sh --model 7b
复制代码


You can access LlamaGPT at http://localhost:3000.


To run 13B or 70B chat models, replace 7b with 13b or 70b respectively.To run 7B, 13B or 34B Code Llama models, replace 7b with code-7b, code-13b or code-34b respectively.


To stop LlamaGPT, do Ctrl + C in Terminal.

1.3 在 Docker 上安装

You can run LlamaGPT on any x86 or arm64 system. Make sure you have Docker installed.


Then, clone this repo and cd into it:


git clone https://github.com/getumbrel/llama-gpt.gitcd llama-gpt
复制代码


Run LlamaGPT with the following command:


./run.sh --model 7b
复制代码


Or if you have an Nvidia GPU, you can run LlamaGPT with CUDA support using the --with-cuda flag, like:


./run.sh --model 7b --with-cuda
复制代码


You can access LlamaGPT at http://localhost:3000.


To run 13B or 70B chat models, replace 7b with 13b or 70b respectively.To run Code Llama 7B, 13B or 34B models, replace 7b with code-7b, code-13b or code-34b respectively.


To stop LlamaGPT, do Ctrl + C in Terminal.


Note: On the first run, it may take a while for the model to be downloaded to the /models directory. You may also see lots of output like this for a few minutes, which is normal:

llama-gpt-llama-gpt-ui-1       | [INFO  wait] Host [llama-gpt-api-13b:8000] not yet available...

After the model has been automatically downloaded and loaded, and the API server is running, you'll see an output like:

llama-gpt-ui_1   | ready - started server on 0.0.0.0:3000, url: http://localhost:3000

You can then access LlamaGPT at http://localhost:3000.



1.4 在 Kubernetes 安装

First, make sure you have a running Kubernetes cluster and kubectl is configured to interact with it.


Then, clone this repo and cd into it.


To deploy to Kubernetes first create a namespace:


kubectl create ns llama
复制代码


Then apply the manifests under the /deploy/kubernetes directory with


kubectl apply -k deploy/kubernetes/. -n llama
复制代码


Expose your service however you would normally do that.

2.OpenAI 兼容 API

Thanks to llama-cpp-python, a drop-in replacement for OpenAI API is available at http://localhost:3001. Open http://localhost:3001/docs to see the API documentation.


  • 基线


We've tested LlamaGPT models on the following hardware with the default system prompt, and user prompt: "How does the universe expand?" at temperature 0 to guarantee deterministic results. Generation speed is averaged over the first 10 generations.


Feel free to add your own benchmarks to this table by opening a pull request.

2.1 Nous Hermes Llama 2 7B Chat (GGML q4_0)

2.2 Nous Hermes Llama 2 13B Chat (GGML q4_0)

2.3 Nous Hermes Llama 2 70B Chat (GGML q4_0)

2.4 Code Llama 7B Chat (GGUF Q4_K_M)

2.5 Code Llama 13B Chat (GGUF Q4_K_M)

2.6 Phind Code Llama 34B Chat (GGUF Q4_K_M)


更多优质内容请关注公号:汀丶人工智能;会提供一些相关的资源和优质文章,免费获取阅读。

发布于: 刚刚阅读数: 5
用户头像

本博客将不定期更新关于NLP等领域相关知识 2022-01-06 加入

本博客将不定期更新关于机器学习、强化学习、数据挖掘以及NLP等领域相关知识,以及分享自己学习到的知识技能,感谢大家关注!

评论

发布
暂无评论
私密离线聊天新体验!llama-gpt聊天机器人:极速、安全、搭载Llama 2_人工智能_汀丶人工智能_InfoQ写作社区