Examples refactor (#329)
* Examples and README updates

---------

Co-authored-by: fujitatomoya <[email protected]>
Co-authored-by: Michael Yang <[email protected]>
3 people authored Nov 21, 2024
1 parent 139c89e commit 64c1eb7
Showing 28 changed files with 457 additions and 282 deletions.
139 changes: 79 additions & 60 deletions README.md

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with [Ollama](https://github.com/ollama/ollama).

## Prerequisites

- [Ollama](https://ollama.com/download) should be installed and running
- Pull a model to use with the library: `ollama pull <model>`, e.g. `ollama pull llama3.2`
- See [Ollama.com](https://ollama.com/search) for more information on the models available.

## Install

```sh
pip install ollama
```

## Usage

```python
from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
```

See [_types.py](ollama/_types.py) for more information on the response types.

## Streaming responses

Response streaming can be enabled by setting `stream=True`, which makes the call return a Python generator where each part is an object in the stream.

> [!NOTE]
> Streaming Tool/Function calling is not yet supported.
```python
from ollama import chat

stream = chat(
  model='llama3.2',
  messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
  stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
```

## Custom client

A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.

All extra keyword arguments are passed into the [`httpx.Client`](https://www.python-httpx.org/api/#client).

```python
from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
```

## Async client

The `AsyncClient` class is used to make asynchronous requests. It can be configured with the same fields as the `Client` class.

```python
import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama3.2', messages=[message])

asyncio.run(chat())
```

Setting `stream=True` modifies functions to return a Python asynchronous generator:

```python
import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())
```

## API

The Ollama Python library's API is designed around the [Ollama REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).

### Chat

```python
ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
```

### Generate

```python
ollama.generate(model='llama3.2', prompt='Why is the sky blue?')
```

### List

```python
ollama.list()
```

### Show

```python
ollama.show('llama3.2')
```

### Create

```python
modelfile='''
FROM llama3.2
SYSTEM You are mario from super mario bros.
'''


ollama.create(model='example', modelfile=modelfile)
```

### Copy

```python
ollama.copy('llama3.2', 'user/llama3.2')
```

### Delete

```python
ollama.delete('llama3.2')
```

### Pull

```python
ollama.pull('llama3.2')
```

### Push

```python
ollama.push('user/llama3.2')
```

### Embed

```python
ollama.embed(model='llama3.2', input='The sky is blue because of rayleigh scattering')
```

### Embed (batch)

```python
ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])
```
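
In both cases the vectors come back one per input string. A short sketch of reading them, assuming the response exposes them under an `embeddings` field as in the REST API (dimensionality depends on the model):

```python
import ollama

response = ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])
for vector in response['embeddings']:
  print(len(vector), vector[:3])  # vector length and a short preview
```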

### Ps

```python
ollama.ps()
```


## Errors

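Failed requests, such as chatting with a model that has not been pulled, raise an error that can be caught. A minimal sketch of handling it, assuming the library's `ollama.ResponseError` with its `error` and `status_code` fields:

```python
import ollama

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  if e.status_code == 404:
    ollama.pull(model)  # pull the missing model, then retry as needed
```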
57 changes: 57 additions & 0 deletions examples/README.md
# Running Examples

Run the examples in this directory with:
```sh
# Run example
python3 examples/<example>.py
```

### Chat - Chat with a model
- [chat.py](chat.py)
- [async-chat.py](async-chat.py)
- [chat-stream.py](chat-stream.py) - Streamed outputs
- [chat-with-history.py](chat-with-history.py) - Chat with a model while maintaining the conversation history (see the sketch below)
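
Maintaining history boils down to keeping one running `messages` list and appending both the user turn and the assistant reply before the next call. A minimal sketch of that pattern (the model name and prompts are illustrative):

```python
from ollama import chat

messages = []  # running conversation history

for user_input in ['Why is the sky blue?', 'Explain that to a five year old.']:
  messages.append({'role': 'user', 'content': user_input})
  response = chat('llama3.2', messages=messages)
  print(response['message']['content'])
  # keep the assistant reply so the next turn has the full context
  messages.append({'role': 'assistant', 'content': response['message']['content']})
```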


### Generate - Generate text with a model
- [generate.py](generate.py)
- [async-generate.py](async-generate.py)
- [generate-stream.py](generate-stream.py) - Streamed outputs
- [fill-in-middle.py](fill-in-middle.py) - Given a prefix and suffix, fill in the middle
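
Fill-in-the-middle gives the model the code before and after a gap and asks it to complete the span in between. A rough sketch, assuming the client's `generate` accepts a `suffix` keyword (mirroring the REST API) and that a FIM-capable code model is available; the model name is only an example:

```python
from ollama import generate

prefix = '''def remove_non_ascii(s: str) -> str:
    """ '''
suffix = '''
    return result
'''

# 'codellama:7b-code' is an illustrative choice of fill-in-middle capable model
response = generate(
  model='codellama:7b-code',
  prompt=prefix,
  suffix=suffix,
  options={'num_predict': 128, 'temperature': 0},
)
print(response['response'])
```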


### Tools/Function Calling - Call a function with a model
- [tools.py](tools.py) - Simple example of Tools/Function Calling (see the sketch below)
- [async-tools.py](async-tools.py)
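
A compressed sketch of the tool-calling flow: this assumes the client accepts plain Python functions (or JSON schema dicts) via `tools=` and exposes any `tool_calls` on the response message; the model name is a placeholder for a tool-capable model you have pulled.

```python
from ollama import chat


def add_two_numbers(a: int, b: int) -> int:
  """Add two integers and return the result."""
  return a + b


messages = [{'role': 'user', 'content': 'What is three plus one?'}]
response = chat('llama3.1', messages=messages, tools=[add_two_numbers])

for call in response.message.tool_calls or []:
  if call.function.name == 'add_two_numbers':
    # arguments arrive as a mapping of parameter names to values
    print('Result:', add_two_numbers(**call.function.arguments))
```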


### Multimodal with Images - Chat with a multimodal (image chat) model
- [multimodal_chat.py](multimodal_chat.py) (see the sketch below)
- [multimodal_generate.py](multimodal_generate.py)
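
Image inputs are passed through the `images` field of a message, as file paths or raw bytes, and need a vision-capable model. A short sketch; the model name and image path are placeholders:

```python
from ollama import chat

response = chat(
  model='llama3.2-vision',  # placeholder; use any multimodal model you have pulled
  messages=[
    {
      'role': 'user',
      'content': 'What is in this image?',
      'images': ['path/to/image.png'],  # file paths or raw bytes
    },
  ],
)
print(response['message']['content'])
```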


### Ollama List - List all downloaded models and their properties
- [list.py](list.py)


### Ollama ps - Show model status with CPU/GPU usage
- [ps.py](ps.py)


### Ollama Pull - Pull a model from Ollama
Requirement: `pip install tqdm`
- [pull.py](pull.py)
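
With `stream=True`, `pull` yields progress updates; pull.py wires them into a tqdm bar, while the sketch below simply prints them, assuming each update exposes `status`, `completed`, and `total` fields:

```python
from ollama import pull

for progress in pull('llama3.2', stream=True):
  if progress.total:
    print(f'{progress.status}: {progress.completed or 0}/{progress.total} bytes')
  else:
    print(progress.status)
```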


### Ollama Create - Create a model from a Modelfile
```sh
python create.py <model> <modelfile>
```
- [create.py](create.py)

See [ollama/docs/modelfile.md](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) for more information on the Modelfile format.


### Ollama Embed - Generate embeddings with a model
- [embed.py](embed.py)

3 changes: 0 additions & 3 deletions examples/async-chat-stream/README.md

This file was deleted.

59 changes: 0 additions & 59 deletions examples/async-chat-stream/main.py

This file was deleted.

19 changes: 19 additions & 0 deletions examples/async-chat.py
import asyncio
from ollama import AsyncClient


async def main():
  messages = [
    {
      'role': 'user',
      'content': 'Why is the sky blue?',
    },
  ]

  client = AsyncClient()
  response = await client.chat('llama3.2', messages=messages)
  print(response['message']['content'])


if __name__ == '__main__':
  asyncio.run(main())
15 changes: 15 additions & 0 deletions examples/async-generate.py
import asyncio
import ollama


async def main():
  client = ollama.AsyncClient()
  response = await client.generate('llama3.2', 'Why is the sky blue?')
  print(response['response'])


if __name__ == '__main__':
  try:
    asyncio.run(main())
  except KeyboardInterrupt:
    print('\nGoodbye!')