Streams
The .stream()
method returns a generator that yields the model’s response one chunk at a time. This makes it a better choice than .chat()
for programming bespoke experiences such as chatbot apps, where some other framework is responsible for consuming and displaying the response.
import chatlas as ctl
= ctl.ChatOpenAI(
chat ="gpt-4o-mini",
model="You are a helpful assistant.",
system_prompt
)
= chat.stream("My name is Chatlas.")
stream for chunk in stream:
print(chunk)
Hello
Chatlas.
How
can
I
help?
Just like .chat()
, once the model is done responding (i.e., the generator is exhausted), the user and assistant turns are stored on the Chat
instance. That means, you can save and manage them same as you would with .chat()
.
len(chat.get_turns())
2
.stream()
accepts multi-modal input, just like .chat()
.
Content types
.stream()
also provides access to rich content types beyond just text. To gain access to these, set the content
parameter to "all"
. If the response includes things like tool calls, the stream will yield the relevant content types as they are generated. As we’ll learn later, this can be useful for displaying tool calls in something like a chatbot app.
def get_current_weather(lat: float, lng: float):
"Get the current weather for a given location."
return "sunny"
chat.register_tool(get_current_weather)
= chat.stream(
stream "How's the weather in San Francisco?",
="all"
content
)for chunk in stream:
print(type(chunk))
<class 'chatlas._content.ContentToolRequest'>
<class 'chatlas._content.ContentToolResult'>
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
Wrapping generators
Sometimes it’s useful to wrap the .stream()
generator up into another generator function.
This is useful for adding additional functionality, such as:
- Adding a delay between chunks
- Filtering out certain content types
- Adding custom formatting
- etc.
import time
def stream_with_delay(prompt: str, delay: float = 0.5):
"""
Stream the model's response with a delay between chunks.
"""
= chat.stream(prompt)
stream for chunk in stream:
time.sleep(delay)yield chunk
for chunk in stream_with_delay("How's the weather in San Francisco?"):
print(chunk)
The current weather in San Francisco is sunny.