
How to use ollama elegantly | JD Cloud technical team

Author: JD Cloud developer

The best tool for getting started with open-source large language models is ollama, a simple local deployment framework for large models. It supports running a variety of large language models from the command line, and it provides Python and JS SDKs that make it easy to build a Chatbot UI. Taking the JD Cloud Intelligent Computing Platform as an example (other platforms are similar, and ollama can even run on a local computer), this article walks through the basics, starting with installing ollama with one click.
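To illustrate the SDK/API mentioned above, here is a minimal Python sketch that talks to a locally running ollama server over its REST API. The default port 11434 and the /api/chat endpoint are ollama's documented defaults; the model name and prompt are just examples.

```python
import json
import urllib.request

# ollama serves a local REST API on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_payload(model, prompt, stream=False):
    """Build the JSON body for a single-turn /api/chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def chat(model, prompt):
    """Send the request to a running ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_chat_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the reply under message.content.
        return json.loads(resp.read())["message"]["content"]
```

With a server running, `chat("llama3:latest", "Why is the sky blue?")` returns the model's reply as a string; the official Python SDK (`pip install ollama`) wraps this same endpoint.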

First, create a GPU instance in the console, and once the instance is in the Running state, install the ollama application with one click. If you prefer to install it manually, refer to ollama's official website, but downloading the models will take some time.

[Screenshot: installing the ollama application with one click from the console]

After the installation is complete, click Custom Application to open the ollama Web UI. The platform comes preloaded with the llama2-7b (latest), llama3-8b (latest), llama3-70b, and qwen-4b (latest) models. If you need more models, you can download them with the ollama pull command; ollama's official model library is here: https://ollama.com/library.
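Pulling an additional model from the library is a single command. A minimal sketch (the qwen:4b tag is one of the models the article mentions; pulling needs the ollama daemon running, so the snippet first checks that the CLI exists):

```shell
# Model tag to fetch (example; see https://ollama.com/library for available tags)
MODEL="qwen:4b"

if command -v ollama >/dev/null 2>&1; then
  # Download the model, then confirm it shows up locally
  ollama pull "$MODEL"
  ollama list
else
  echo "ollama CLI not found; would run: ollama pull $MODEL"
fi
```

In practice you would just run `ollama pull qwen:4b` directly; the guard is only defensive.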

[Screenshot: the ollama Web UI with the preloaded models]

The GPU I rented is a 4090 (the platform calls it GN-FP32-83, 24G * 1 card). Running llama3-8b, llama2, and qwen is very fast, but running llama3-70b is very slow. Fortunately, llama3-70b does not waste words and answers the question directly. In the figure below, the first answer is from llama3-8b and the second from llama3-70b.

[Screenshot: the first answer is from llama3-8b, the second from llama3-70b]

We can build our own ollama model with the command-line tool ollama provides, using its Modelfile feature. A Modelfile is similar to a Dockerfile in both principle and syntax. Below, I demonstrate how to build a role-playing chatbot that answers like a kindergarten teacher, based on the model's system message capability.

Create a file named Modelfile in a directory (the file does not have to be named Modelfile, since the path is passed explicitly). For example, my file path is /data/Modelfile, and its content is as follows (the Chinese SYSTEM prompt instructs the model to answer the wide-ranging questions of 2-to-6-year-olds in the lively, patient voice of a kindergarten teacher, using simple words, analogies, and examples from children's cartoons and picture books):

FROM llama3:latest
SYSTEM """
你是一名育儿专家,会以幼儿园老师的方式回答2~6岁孩子提出的各种天马行空的问题。语气与口吻要生动活泼,耐心亲和;答案尽可能具体易懂,不要使用复杂词汇,尽可能少用抽象词汇;答案中要多用比喻,必须要举例说明,结合儿童动画片场景或绘本场景来解释;需要延展更多场景,不但要解释为什么,还要告诉具体行动来加深理解。
"""           

In the terminal that comes with JupyterLab, use the ollama command-line tool to build the model:

ollama create teacher -f /data/Modelfile           

Once the build is complete, you can see the newly generated teacher model via the ollama list command:

(ollama) root@dep-ns-5e24bda738cf-1715268602511-d6d46545-cht86:/data/apps/ollama# ollama list
NAME            ID              SIZE    MODIFIED       
llama2:latest   78e26419b446    3.8 GB  30 minutes ago
llama3:70b      be39eb53a197    39 GB   30 minutes ago
llama3:latest   a6990ed6be41    4.7 GB  30 minutes ago
qwen:latest     d53d04290064    2.3 GB  30 minutes ago
teacher:latest  480a154551b5    4.7 GB  13 seconds ago           
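Before switching to the Web UI, you can also try the model built above directly from the command line with ollama run (again, this needs the ollama daemon running, so the snippet guards on the CLI being present):

```shell
MODEL="teacher"
PROMPT="Why is the sky blue?"

if command -v ollama >/dev/null 2>&1; then
  # Pass a single prompt non-interactively; a plain `ollama run teacher`
  # would open an interactive chat session instead.
  ollama run "$MODEL" "$PROMPT"
else
  echo "ollama CLI not found; would run: ollama run $MODEL \"$PROMPT\""
fi
```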

Click Custom Application in the console, and on the page that opens you can chat with the new model in the Web UI. The effect is as follows:

[Screenshot: chatting with the teacher model in the Web UI]

Compare this with the original llama3's answer, and the difference is easy to see.

[Screenshot: the original llama3's answer to the same question]

While using llama3, I found that its Chinese support is not good: it understands Chinese, but its answers are always in English. How to build your own Chinese llama3 model is something we will analyze in a follow-up article.

Author: Peng Jianhong (JD Technology)

Source: JD Cloud Developer Community