Writer supports streaming responses when interacting with the Chat Completion API.
The SDK provides a client.chat.stream() method that wraps client.chat.chat(stream=True), providing a more granular event API and automatic accumulation of each delta.
Unlike client.chat.chat(stream=True), the stream() method must be used within a context manager, which prevents an accidentally leaked response:
```python
from writerai import AsyncWriter

client = AsyncWriter()

async with client.chat.stream(
    model='palmyra-x-004',
    messages=[...],
) as stream:
    async for event in stream:
        if event.type == 'content.delta':
            print(event.delta, flush=True, end='')
```

When the context manager is entered, a ChatCompletionStream / AsyncChatCompletionStream instance is returned. Like client.chat.chat(stream=True), it is an iterator in the sync client and an async iterator in the async client. The full list of events yielded by the iterator is outlined below.

When the context manager exits, the response is closed; the stream instance itself, however, remains available outside the context manager.
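To illustrate what the automatic accumulation mentioned above amounts to, here is a minimal, SDK-independent sketch: each content delta carries only the new text, and the snapshot is the concatenation of all deltas so far. This is a simplification of what the stream class does internally, not its actual implementation.

```python
def accumulate(deltas):
    """Fold a sequence of content deltas into (snapshot_history, final_text).

    Mimics, in simplified form, how the stream helper builds the
    `snapshot` it attaches to each `content.delta` event.
    """
    snapshot = ''
    history = []
    for delta in deltas:
        snapshot += delta          # append the new text from this chunk
        history.append(snapshot)   # the accumulated content so far
    return history, snapshot

history, final = accumulate(['Hel', 'lo, ', 'world'])
# Each event's snapshot grows by one delta:
# history == ['Hel', 'Hello, ', 'Hello, world'], final == 'Hello, world'
```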
These events allow you to track the progress of the chat completion generation, access partial results, and handle different aspects of the stream separately.
Below is a list of the different event types you may encounter:
Emitted for every chunk received from the API.
- `type`: `"chunk"`
- `chunk`: The raw `ChatCompletionChunk` object received from the API
- `snapshot`: The current accumulated state of the chat completion
Emitted for every chunk containing new content.
- `type`: `"content.delta"`
- `delta`: The new content string received in this chunk
- `snapshot`: The accumulated content so far
- `parsed`: The partially parsed content (if applicable)
Emitted when the content generation is complete. May be fired multiple times if there are multiple choices.
- `type`: `"content.done"`
- `content`: The full generated content
- `parsed`: The fully parsed content (if applicable)
Emitted when a chunk contains part of a content refusal.
- `type`: `"refusal.delta"`
- `delta`: The new refusal content string received in this chunk
- `snapshot`: The accumulated refusal content string so far
Emitted when the refusal content is complete.
- `type`: `"refusal.done"`
- `refusal`: The full refusal content
Emitted when a chunk contains new content log probabilities.
- `type`: `"logprobs.content.delta"`
- `content`: A list of the new log probabilities received in this chunk
- `snapshot`: A list of the accumulated log probabilities so far
Emitted when all content log probabilities have been received.
- `type`: `"logprobs.content.done"`
- `content`: The full list of token log probabilities for the content
Emitted when a chunk contains new refusal log probabilities.
- `type`: `"logprobs.refusal.delta"`
- `refusal`: A list of the new log probabilities received in this chunk
- `snapshot`: A list of the accumulated log probabilities so far
Emitted when all refusal log probabilities have been received.
- `type`: `"logprobs.refusal.done"`
- `refusal`: The full list of token log probabilities for the refusal
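As a sketch of how these events might be consumed, the handler below dispatches on `event.type`. The event shape (a `type` attribute plus the per-event fields listed above) follows this documentation, but the handler itself is illustrative and not part of the SDK; the stand-in `SimpleNamespace` objects are only for demonstration.

```python
from types import SimpleNamespace


def handle_event(event):
    """Illustrative dispatcher over the stream event types listed above.

    `event` is any object with a `type` attribute plus the per-event
    fields documented here; this helper is not part of the SDK.
    """
    if event.type == 'content.delta':
        # New text arrived; `snapshot` is the accumulated content so far.
        return f'content so far: {event.snapshot}'
    if event.type == 'content.done':
        # Generation finished; `content` is the full generated text.
        return f'done: {event.content}'
    if event.type == 'refusal.done':
        return f'refused: {event.refusal}'
    # Other events (raw chunks, logprobs, refusal deltas) are ignored here.
    return None


# Stand-in event object for demonstration:
evt = SimpleNamespace(type='content.delta', delta='world', snapshot='hello world')
print(handle_event(evt))  # content so far: hello world
```

Inside the context manager, the same function could be called once per yielded event.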
A handful of helper methods are provided on the stream class for additional convenience:
`.get_final_completion()`

Returns the accumulated `ParsedChatCompletion` object.

```python
async with client.chat.stream(...) as stream:
    ...

completion = await stream.get_final_completion()
print(completion.choices[0].message)
```

`.until_done()`
If you want to wait for the stream to complete, you can use the .until_done() method.
```python
async with client.chat.stream(...) as stream:
    await stream.until_done()
    # stream is now finished
```