Preface
In the previous article, I introduced a simple RAG practice using SemanticKernel/C#. There I used an online API compatible with the OpenAI format, but many scenarios call for running fully offline. Today I would like to show you how to use the chat model and the embedding model from Ollama in SemanticKernel/C# for local offline scenarios.
Start practicing
The chat model used in this article is gemma2:2b and the embedding model is all-minilm:latest; you can download both in Ollama ahead of time.
On February 8, 2024, Ollama added compatibility with the OpenAI Chat Completions API; see https://ollama.com/blog/openai-compatibility for details.
Therefore, using Ollama's chat model in SemanticKernel/C# is relatively simple.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: "gemma2:2b", apiKey: null, endpoint: new Uri("http://localhost:11434"))
    .Build();
This is how you can build the kernel.
Let's try it out:
public async Task<string> Praise()
{
    var skPrompt = """
        You are an expert at complimenting people. Reply with one sentence of praise.
        Your reply should be a single sentence, neither too long nor too short.
        """;
    var result = await _kernel.InvokePromptAsync(skPrompt);
    var str = result.ToString();
    return str;
}
With that, we have successfully used Ollama's chat model in SemanticKernel.
Now let's look at the embedding model. Since Ollama's embeddings API was not compatible with OpenAI's format at the time of writing, it cannot be used directly.
Ollama's format looks like this (the original screenshot is unavailable; this is Ollama's documented /api/embeddings request, which returns a bare embedding array):
curl http://localhost:11434/api/embeddings -d '{
  "model": "all-minilm:latest",
  "prompt": "Your text string goes here"
}'
OpenAI's request format looks like this:
curl https://api.openai.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-3-small"
  }'
OpenAI's return format looks like this:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        ... (omitted for spacing)
        -4.547132266452536e-05,
        -0.024047505110502243
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
Because the request and response formats differ, simply forwarding requests to Ollama won't work.
Someone raised this question in an Ollama GitHub issue before, and there also seems to be a ready-made implementation that adds OpenAI compatibility to the embeddings endpoint, but when I tried it, it was still not compatible.
To use Ollama's embedding model in SemanticKernel, I would need to implement a few interfaces myself, but after searching I found that someone has already done this. GitHub address: https://github.com/BLaZeKiLL/Codeblaze.SemanticKernel.
For usage, see https://github.com/BLaZeKiLL/Codeblaze.SemanticKernel/tree/main/dotnet/Codeblaze.SemanticKernel.Connectors.Ollama
The author implemented ChatCompletion, EmbeddingGeneration, and a TextGenerationService. If you only need EmbeddingGeneration, you can read the source and add a few classes to your own project to cut down on package dependencies.
Here, for convenience, I install the Codeblaze.SemanticKernel.Connectors.Ollama package directly:
Build an ISemanticTextMemory:
public async Task<ISemanticTextMemory> GetTextMemory3()
{
    var builder = new MemoryBuilder();
    var embeddingEndpoint = "http://localhost:11434";
    builder.WithHttpClient(new HttpClient());
    // Use Ollama's local embedding model
    builder.WithOllamaTextEmbeddingGeneration("all-minilm:latest", embeddingEndpoint);
    // Persist vectors in a local SQLite database
    IMemoryStore memoryStore = await SqliteMemoryStore.ConnectAsync("memstore.db");
    builder.WithMemoryStore(memoryStore);
    var textMemory = builder.Build();
    return textMemory;
}
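As a quick sketch of how the memory built above can be used — the collection name "private-docs" and the sample text are my own illustrations, and this assumes Ollama is running locally:

```csharp
var textMemory = await GetTextMemory3();

// Save a piece of text: it is embedded via all-minilm and persisted in memstore.db.
await textMemory.SaveInformationAsync(
    collection: "private-docs",
    text: "The Q&A QQ group number is 123123123.",
    id: "doc-1");

// Search semantically; SearchAsync returns an IAsyncEnumerable<MemoryQueryResult>.
await foreach (var result in textMemory.SearchAsync(
    "private-docs", "What is the QQ group number?", limit: 1, minRelevanceScore: 0.3))
{
    Console.WriteLine($"{result.Relevance:F2}: {result.Metadata.Text}");
}
```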
Now let's try it out. Building on the previous article's example, today I upload a txt file.
The private document looks like this (private information has been replaced):
Dear students:
Hello. To help everyone spend their university years safely and smoothly, the school has introduced an "Internet+" campus safety education service platform, which offers online micro-courses on safety knowledge that can be studied anytime, anywhere on a mobile phone. University life is rich and colorful; firmly master safety knowledge and comprehensively improve your safety skills and awareness. Please be sure to complete the course and the exam within the prescribed study period.
Please complete the study and exam on your own as follows:
1. Mobile platform entry: follow the WeChat official account "XX University" or scan the QR code below, then tap the menu item [Academic Navigation] → [XX Micro-courses], enter your account (student ID) and password (student ID), tap [Login] to bind your information and enter the learning platform.
2. Web platform entry: open a browser and log in at www.xxx.cn; once on the platform you can start studying the safety course.
3. Platform availability: April 1, 2024 – April 30, 2024. You must finish all the course content before taking the exam. The exam has 50 questions, with a full score of 100 and a passing score of 80; you have 3 attempts and the best score counts.
4. Q&A QQ group number: 123123123.
Learning platform login process
1. Mobile platform entry:
Scan the QR code below and follow the WeChat official account "XX University";
In the account menu, tap [Academic Navigation] → [XX Micro-courses], select your school, enter your account (student ID) and password (student ID), then tap [Login] to bind your information and enter the learning platform;
If you run into problems, tap [Online Support] or [FAQ] for help (support hours: Monday–Sunday 8:30–17:00).
2. Web platform entry:
Open a browser and log in at www.xxx.cn; once on the platform you can start studying.
3. Safety micro-course study and exam
1) Micro-course study
On the home page, tap [2024 Spring Safety Education] under [Learning Tasks] to enter the course;
Expand the micro-course list and tap a micro-course to start studying;
Most micro-courses advance by tapping Continue; a few advance by swiping up or left;
When a micro-course is finished, a message "Congratulations, you have completed this micro-course" appears; you must tap [OK] and then [Return to course list] for the completion status to be recorded;
2) Final exam
After finishing all the micro-courses in the project, tap [Exam Arrangement] → [Take Exam] to take the final exam.
Upload the document:
It is cut into three chunks:
Store the data:
Ask it a question, such as "What is the Q&A QQ group number?":
Although it took a while, on the order of tens of seconds, the answer was correct:
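Under the hood, answering such a question follows the usual RAG pattern: search the memory for relevant chunks, stuff them into the prompt, and let the chat model answer. A rough sketch — the prompt wording and the "private-docs" collection name are illustrative, not the article's exact code:

```csharp
var question = "What is the Q&A QQ group number?";

// 1. Retrieve relevant chunks from the local embedding store.
var context = new StringBuilder();
await foreach (var item in textMemory.SearchAsync(
    "private-docs", question, limit: 3, minRelevanceScore: 0.3))
{
    context.AppendLine(item.Metadata.Text);
}

// 2. Ask the chat model, grounding it in the retrieved context.
var prompt = $"""
    Answer the question using only the context below.
    Context: {context}
    Question: {question}
    """;
var answer = await _kernel.InvokePromptAsync(prompt);
Console.WriteLine(answer);
```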
Let's try another question:
The answer quality is not great, and because my machine is underpowered, local inference is also very slow. If you have the hardware, you can switch to a larger model; if not, and fully offline operation isn't required, you can use a free online API for chat combined with the local embedding model.
Switching to the online API with Qwen/Qwen2-7B-Instruct, the results are quite good:
Summary
The main takeaway from this practice is how to use Ollama's chat model and embedding model in SemanticKernel for local offline scenarios. While practicing RAG, I found two main settings that affect the quality of the results.
The first is the chunk size used when splitting the text:
var lines = TextChunker.SplitPlainTextLines(input, 20);
var paragraphs = TextChunker.SplitPlainTextParagraphs(lines, 100);
The second is how many relevant records to retrieve and the minimum relevance setting:
var memoryResults = textMemory.SearchAsync(index, input, limit: 3, minRelevanceScore: 0.3);
If minRelevanceScore is set too high, not even one record may be retrieved; too low, and irrelevant content gets pulled into the prompt.
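SearchAsync returns an IAsyncEnumerable, so the results above are consumed with await foreach. A small sketch — the variable names follow the article's code, while the loop body is illustrative:

```csharp
var memoryResults = textMemory.SearchAsync(index, input, limit: 3, minRelevanceScore: 0.3);

await foreach (var item in memoryResults)
{
    // item.Relevance is the similarity score; inspect it when tuning minRelevanceScore.
    Console.WriteLine($"{item.Relevance:F2}: {item.Metadata.Text}");
}
// If minRelevanceScore is set too high, this loop yields nothing at all.
```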