Streaming allows for a more human-feeling response from our LLMs: rather than waiting for the full answer, we receive chunks of data as they are generated, similar to how we talk. After covering text streaming, we will build a system where the LLM emits updates about the events that occur during a run, keeping the user informed of the agent's activities.
!pip install -qU \
"openai-agents==0.1.0"
First, we need an OPENAI_API_KEY set up. To get one, create an account on the OpenAI platform and grab your API key.
import os
import getpass

# Read the key from the environment, or prompt for it interactively if unset
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or \
    getpass.getpass("OpenAI API Key: ")
Streaming Text Events
In this section we will quickly cover the basics of streaming text straight from an agent.
First, we need to import the Agent class and define our agent object. Here we only need the basic setup, as in previous tutorials.
from agents import Agent
agent = Agent(
    name="Streamer Agent",
    instructions="You are a helpful assistant.",
    model="gpt-4.1-mini",
)
To run our agent asynchronously and with streaming, we will use the run_streamed method rather than the default run method, allowing us to stream tokens or events to our user/console as soon as they are received from OpenAI.
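For contrast, here is a minimal sketch of the default run method, reusing the agent we defined above. It blocks until the complete response is ready and returns it in one piece:

from agents import Runner

# Blocking call: nothing is available until the full response has been generated
result = await Runner.run(agent, input="Hello!")
print(result.final_output)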
Now we can create an async for loop over the stream's events. Whenever an event is a chunk of text, i.e. a ResponseTextDeltaEvent, we print it to the console immediately using flush=True. We also use end="" to avoid printing each individual token on a new line.
When you run this code you should see tokens being streamed to the output below, rather than being printed in a single large chunk.
from openai.types.responses import ResponseTextDeltaEvent
from agents import Runner
result = Runner.run_streamed(
    agent,
    input="Tell me what caused the stock market to crash in 2008."
)

# Print each token as it arrives rather than waiting for the full response
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)
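Note that the top-level async for above relies on an already-running event loop, as in a Jupyter notebook. In a plain Python script, a sketch of the same loop wrapped for asyncio.run would look like:

import asyncio

async def main():
    result = Runner.run_streamed(
        agent,
        input="Tell me what caused the stock market to crash in 2008."
    )
    # Print each text delta as soon as it arrives
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())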
Streaming Event Information
Now we want to stream event information. This can include anything from an agent change, to a tool call, to the final output becoming ready.
First, we will create a tool function using the function_tool decorator. This will be a simple tool that returns the current time as a formatted string.
from agents import function_tool
from datetime import datetime
@function_tool()
async def fetch_time() -> str:
    """Fetch the current time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")
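Note that the decorator replaces fetch_time with a FunctionTool object rather than a plain coroutine. A quick sketch to inspect what was generated (the attribute names below come from the SDK's FunctionTool dataclass):

# The function name and docstring become the tool's name and description
print(fetch_time.name)                # fetch_time
print(fetch_time.description)         # Fetch the current time.
print(fetch_time.params_json_schema)  # JSON schema for the (empty) parameters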
Next, we want a lower-level agent that has access to this tool. We create it by defining a new agent from the Agent class as we did previously, supplying instructions so the agent is aware of the role it plays.
time_agent = Agent(
    name="Time-Agent",
    instructions=(
        "You are a time agent that fetches the current time. Make sure when returning "
        "your response you include the agent that provided the information along with "
        "any additional tool calls used within the agent."
    ),
    tools=[fetch_time],
    model="gpt-4.1-mini",
)
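Before wiring this agent into an orchestrator, we can sanity-check it on its own. A quick sketch using the standard run method:

from agents import Runner

# Confirm the agent can call its fetch_time tool
result = await Runner.run(time_agent, input="What time is it?")
print(result.final_output)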
Next, we define our top-level agent and expose the agent from the previous step as a tool via its as_tool method, which will generate additional events in our stream.
orchestrator_agent = Agent(
    name="Orchestrator-Agent",
    instructions="""
    You are an orchestrator agent that uses the tools given to you to complete the user's query.
    You have access to the `Time-Agent` tool.
    """,
    tools=[
        time_agent.as_tool(
            tool_name="Time-Agent",
            tool_description="Fetch the current time",
        )
    ],
    model="gpt-4.1-mini",
)
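As a quick check, as_tool wraps the entire agent in the same FunctionTool structure we saw earlier; from the orchestrator's perspective the nested agent is just another tool (a sketch):

# The nested agent appears as a regular tool on the orchestrator
print(orchestrator_agent.tools[0].name)         # Time-Agent
print(orchestrator_agent.tools[0].description)  # Fetch the current time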
Now we can use the run_streamed method from our Runner to begin streaming events. We then use the example given by the Agents SDK team to filter through the event information, which can describe anything from a handoff to a tool call to the final message output.
from agents import ItemHelpers
result = Runner.run_streamed(
    orchestrator_agent,
    input="what time is it?",
)

print("=== Run starting ===")

async for event in result.stream_events():
    if event.type == "raw_response_event":  # ignore any raw events as this is just the LLM output
        continue
    elif event.type == "agent_updated_stream_event":  # when a handoff or agent change occurs
        print(f"Agent updated from {event.new_agent.name} to {event.new_agent.tools[0].name}")
        continue
    elif event.type == "run_item_stream_event":  # when items are generated
        if event.item.type == "tool_call_item":  # if the item is a tool call
            print("-- Tool was called")
        elif event.item.type == "tool_call_output_item":  # if the item is a tool call output
            print(f"-- Tool output: {event.item.output}")
        elif event.item.type == "message_output_item":  # if the item is a message output
            print(f"-- Message output:\n {ItemHelpers.text_message_output(event.item)}")
        else:
            pass  # ignore everything else
=== Run starting ===
Agent updated from Orchestrator-Agent to Time-Agent
-- Tool was called
-- Tool output: The current time is 2025-04-10 08:51:29. (Agent: Time Agent)
-- Message output:
The current time is 8:51 AM on April 10, 2025.
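The two approaches also combine naturally: we can stream the raw text deltas of the output while still surfacing run items as they happen. A sketch, reusing the orchestrator above:

from openai.types.responses import ResponseTextDeltaEvent

result = Runner.run_streamed(
    orchestrator_agent,
    input="what time is it?",
)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        # Token-by-token text from the LLM
        print(event.data.delta, end="", flush=True)
    elif event.type == "run_item_stream_event" and event.item.type == "tool_call_item":
        # Surface tool calls as they happen
        print("\n-- Tool was called")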