Code Collector (代码收藏家) Technical Tutorial 2025-01-28

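Ollama Python Library Usage Guide

Introduction

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama. It supports both synchronous and asynchronous operation, so you can talk to Ollama models from either kind of code.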

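Prerequisites

Ollama is installed and the Ollama service is running

Pull the model you need with ollama pull <model> (for example: ollama pull llama3.2)

For more information on available models, visit Ollama.com

Installation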

pip install ollama

Basic Usage

Simple Chat

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
# Or access the fields directly on the response object
print(response.message.content)
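
Each chat() call only sees the messages passed to it, so a multi-turn conversation means resending the history. The following is a minimal sketch of that, assuming the caller keeps the history list itself. add_turn and ask are hypothetical helper names, not part of the library, and ask is defined but not called because it needs a running Ollama server.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history list with one more message appended."""
    return history + [{'role': role, 'content': content}]


def ask(history: List[Message], question: str, model: str = 'llama3.2') -> List[Message]:
    """Send the whole history plus a new question; return the updated history.

    Requires a running Ollama server, so it is defined here but not called.
    """
    from ollama import chat  # imported lazily so add_turn works without ollama

    history = add_turn(history, 'user', question)
    response = chat(model=model, messages=history)
    return add_turn(history, 'assistant', response.message.content)


# The pure helper can be tried without a server (the reply text is made up).
history = add_turn([], 'user', 'Why is the sky blue?')
history = add_turn(history, 'assistant', 'Because of Rayleigh scattering.')
print([m['role'] for m in history])
```

With a server running, history = ask(history, 'And why are sunsets red?') would append both the question and the model's answer.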

Streaming Responses

from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
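
The loop above prints the chunks but does not keep them. If the full reply is also needed afterwards, one option is to accumulate the pieces while printing. This is a sketch only: collect_stream is a hypothetical helper, and the chunks below are hand-written stand-ins shaped like the streamed ones, not real model output.

```python
from typing import Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping]) -> str:
    """Print each chunk's text as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        text = chunk['message']['content']
        print(text, end='', flush=True)
        parts.append(text)
    print()  # finish the line once the stream ends
    return ''.join(parts)


# Stand-in chunks (made-up text) so the helper can be tried without a server.
fake_stream = [
    {'message': {'role': 'assistant', 'content': 'Hello'}},
    {'message': {'role': 'assistant', 'content': ', world'}},
]
full_text = collect_stream(fake_stream)
```

With a running server, the stream object returned by chat(..., stream=True) can be passed in place of fake_stream.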

Custom Clients

Synchronous Client

from ollama import Client
client = Client(
    host='http://localhost:11434',
    headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])

Asynchronous Client

import asyncio
from ollama import AsyncClient

async def chat():
    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    response = await AsyncClient().chat(model='llama3.2', messages=[message])
    # Print the reply so the example produces visible output
    print(response.message.content)

asyncio.run(chat())
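
The asynchronous client can also stream: passing stream=True makes the awaited chat() call return an asynchronous generator. A minimal sketch follows, in which print_stream, stream_chat, and fake_parts are hypothetical names. stream_chat needs a running server and is therefore only defined, while the offline run uses made-up chunks.

```python
import asyncio
from typing import AsyncIterator, Mapping


async def print_stream(parts: AsyncIterator[Mapping]) -> str:
    """Consume an async stream of chunks, echoing the text and returning it joined."""
    pieces = []
    async for part in parts:
        text = part['message']['content']
        print(text, end='', flush=True)
        pieces.append(text)
    print()
    return ''.join(pieces)


async def stream_chat(model: str = 'llama3.2') -> str:
    """Stream a reply from a running Ollama server (defined, not called here)."""
    from ollama import AsyncClient  # lazy import: needs the ollama package and a server

    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    stream = await AsyncClient().chat(model=model, messages=[message], stream=True)
    return await print_stream(stream)


async def fake_parts():
    """Stand-in async generator with made-up chunks, for trying the helper offline."""
    for text in ('Rayleigh ', 'scattering'):
        yield {'message': {'content': text}}


result = asyncio.run(print_stream(fake_parts()))
```

Against a real server, asyncio.run(stream_chat()) would print the reply as it is generated.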

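API Functions

Main Functions

chat(): hold a conversation with a model

generate(): generate text from a prompt

list(): list local models

show(): show information about a model

create(): create a model

copy(): copy a model

delete(): delete a model

pull(): pull a model

push(): push a model

embed(): generate embedding vectors

ps(): list the models that are currently running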
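
As one illustration of these functions, embed() returns vectors that are commonly compared with cosine similarity. The sketch below is an assumption-laden example rather than library code: cosine_similarity and embed_texts are hypothetical helpers, the model name is simply the one used elsewhere in this guide, and embed_texts is defined but not called because it needs a running server.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed_texts(texts: List[str], model: str = 'llama3.2') -> List[Sequence[float]]:
    """Embed several texts in one call (needs a running server; not called here)."""
    from ollama import embed  # lazy import so the math helper runs without ollama

    response = embed(model=model, input=texts)
    return response['embeddings']


# Offline check with hand-written vectors (not real embeddings).
same = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
print(same, orthogonal)
```

With a server available, the vectors returned by embed_texts(['first text', 'second text']) could be passed to cosine_similarity in the same way.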

Error Handling

import ollama

model = 'does-not-exist-model'

try:
    ollama.chat(model)
except ollama.ResponseError as e:
    print('Error:', e.error)
    # A 404 means the model is not available locally, so pull it
    if e.status_code == 404:
        ollama.pull(model)
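
The pattern above pulls the missing model but does not retry the request. A possible extension is to wrap both steps in one helper. This is a sketch under stated assumptions: chat_with_auto_pull, FakeClient, and FakeMissingModelError are hypothetical names, and the error is detected by its status_code attribute so the example runs without the ollama package. With the real library, one would catch ollama.ResponseError instead.

```python
def chat_with_auto_pull(client, model, messages):
    """Call client.chat(); on a 404-style error, pull the model and retry once."""
    try:
        return client.chat(model=model, messages=messages)
    except Exception as e:
        # With the real library this would be `except ollama.ResponseError`;
        # a duck-typed check is used here so the sketch runs without ollama.
        if getattr(e, 'status_code', None) != 404:
            raise
        client.pull(model)
        return client.chat(model=model, messages=messages)


class FakeMissingModelError(Exception):
    """Stand-in for ollama.ResponseError, carrying only a status code."""
    status_code = 404


class FakeClient:
    """Hypothetical test double: chat() fails until pull() has been called."""

    def __init__(self):
        self.pulled = set()

    def pull(self, model):
        self.pulled.add(model)

    def chat(self, model, messages):
        if model not in self.pulled:
            raise FakeMissingModelError(model)
        return {'message': {'role': 'assistant', 'content': 'ok'}}


fake = FakeClient()
reply = chat_with_auto_pull(fake, 'llama3.2', [{'role': 'user', 'content': 'hi'}])
print(reply['message']['content'])
```

In real use, an ollama.Client() instance would be passed in place of FakeClient(). Retrying only once keeps a persistent failure from looping forever.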