Streaming allows for a more human-feeling response from our LLMs: rather than waiting for the full answer, we receive chunks of data as they are generated, similar to how we talk. After covering text streaming, we will build a system where the LLM emits updates about the events that occur during a run, keeping the user informed of the agent's activities.
!pip install -qU \
"openai-agents==0.1.0"
First, we need an OPENAI_API_KEY set up. To get one, create an account on the OpenAI platform and grab your API key.
import os
import getpass

# Read the key from the environment, or prompt for it interactively if unset
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or \
    getpass.getpass("OpenAI API Key: ")
Streaming Text Events
In this section we will quickly cover the basics of streaming text straight from an agent.
First, we need to import the Agent class and define our agent object. Here we only need the basic setup, as in previous tutorials.
from agents import Agent
agent = Agent(
    name="Streamer Agent",
    instructions="You are a helpful assistant.",
    model="gpt-4.1-mini",
)
To run our agent asynchronously and with streaming, we will use the run_streamed method rather than the default run method, allowing us to stream tokens or events to our user/console as soon as they are received from OpenAI.
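For contrast, here is a minimal sketch of the default run method, reusing the agent we defined above. It blocks until the complete response is ready and returns it in one piece:

from agents import Runner

# Blocking call: nothing is available until the full response has been generated
result = await Runner.run(agent, input="Hello!")
print(result.final_output)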
Now we can create an async for loop over the stream's events. Whenever an event is a chunk of text, i.e. a ResponseTextDeltaEvent, we print it to the console immediately using flush=True. We also use end="" to avoid printing each individual token on a new line.
When you run this code you should see tokens being streamed to the output below, rather than being printed in a single large chunk.
from openai.types.responses import ResponseTextDeltaEvent
from agents import Runner
result = Runner.run_streamed(
    agent,
    input="Tell me what caused the stock market to crash in 2008."
)

# Print each token as it arrives rather than waiting for the full response
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)
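Note that the top-level async for above relies on an already-running event loop, as in a Jupyter notebook. In a plain Python script, a sketch of the same loop wrapped for asyncio.run would look like:

import asyncio

async def main():
    result = Runner.run_streamed(
        agent,
        input="Tell me what caused the stock market to crash in 2008."
    )
    # Print each text delta as soon as it arrives
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())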
Streaming Event Information
Now we want to stream event information. This can include anything from an agent change, to a tool call, to the final output becoming ready.
First, we will create a tool function using the function_tool decorator. This will be a simple tool that returns the current time as a formatted string.
from agents import function_tool
from datetime import datetime
@function_tool()
async def fetch_time() -> str:
    """Fetch the current time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")
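Note that the decorator replaces fetch_time with a FunctionTool object rather than a plain coroutine. A quick sketch to inspect what was generated (the attribute names below come from the SDK's FunctionTool dataclass):

# The function name and docstring become the tool's name and description
print(fetch_time.name)                # fetch_time
print(fetch_time.description)         # Fetch the current time.
print(fetch_time.params_json_schema)  # JSON schema for the (empty) parameters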
Next, we want a lower-level agent that has access to this tool. We create it by defining a new agent from the Agent class as we did previously, supplying instructions so the agent is aware of the role it plays.
time_agent = Agent(
    name="Time-Agent",
    instructions=(
        "You are a time agent that fetches the current time. Make sure when returning "
        "your response you include the agent that provided the information along with "
        "any additional tool calls used within the agent."
    ),
    tools=[fetch_time],
    model="gpt-4.1-mini",
)
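Before wiring this agent into an orchestrator, we can sanity-check it on its own. A quick sketch using the standard run method:

from agents import Runner

# Confirm the agent can call its fetch_time tool
result = await Runner.run(time_agent, input="What time is it?")
print(result.final_output)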
Next, we define our top-level agent and expose the agent from the previous step as a tool via its as_tool method, which will generate additional events in our stream.
orchestrator_agent = Agent(
    name="Orchestrator-Agent",
    instructions="""
    You are an orchestrator agent that uses the tools given to you to complete the user's query.
    You have access to the `Time-Agent` tool.
    """,
    tools=[
        time_agent.as_tool(
            tool_name="Time-Agent",
            tool_description="Fetch the current time",
        )
    ],
    model="gpt-4.1-mini",
)
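As a quick check, as_tool wraps the entire agent in the same FunctionTool structure we saw earlier; from the orchestrator's perspective the nested agent is just another tool (a sketch):

# The nested agent appears as a regular tool on the orchestrator
print(orchestrator_agent.tools[0].name)         # Time-Agent
print(orchestrator_agent.tools[0].description)  # Fetch the current time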
Now we can use the run_streamed method from our Runner to begin streaming events. We then use the example given by the Agents SDK team to filter through the event information, which can describe anything from a handoff to a tool call to the final message output.
from agents import ItemHelpers
result = Runner.run_streamed(
    orchestrator_agent,
    input="what time is it?",
)

print("=== Run starting ===")

async for event in result.stream_events():
    if event.type == "raw_response_event":  # ignore any raw events as this is just the LLM output
        continue
    elif event.type == "agent_updated_stream_event":  # when a handoff or agent change occurs
        print(f"Agent updated from {event.new_agent.name} to {event.new_agent.tools[0].name}")
        continue
    elif event.type == "run_item_stream_event":  # when items are generated
        if event.item.type == "tool_call_item":  # if the item is a tool call
            print("-- Tool was called")
        elif event.item.type == "tool_call_output_item":  # if the item is a tool call output
            print(f"-- Tool output: {event.item.output}")
        elif event.item.type == "message_output_item":  # if the item is a message output
            print(f"-- Message output:\n {ItemHelpers.text_message_output(event.item)}")
        else:
            pass  # ignore everything else
=== Run starting ===
Agent updated from Orchestrator-Agent to Time-Agent
-- Tool was called
-- Tool output: The current time is 2025-04-10 08:51:29. (Agent: Time Agent)
-- Message output:
The current time is 8:51 AM on April 10, 2025.
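The two approaches also combine naturally: we can stream the raw text deltas of the output while still surfacing run items as they happen. A sketch, reusing the orchestrator above:

from openai.types.responses import ResponseTextDeltaEvent

result = Runner.run_streamed(
    orchestrator_agent,
    input="what time is it?",
)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        # Token-by-token text from the LLM
        print(event.data.delta, end="", flush=True)
    elif event.type == "run_item_stream_event" and event.item.type == "tool_call_item":
        # Surface tool calls as they happen
        print("\n-- Tool was called")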