In OpenAI's Agents SDK, we can build multi-agent workflows in two ways. The first is agents-as-tools, which follows an orchestrator-subagent pattern. The second is handoffs, which let one agent pass control of the conversation to another. In this example, we'll build both types of multi-agent system and compare the two approaches.
To get started, we'll install the necessary packages:
text
!pip install -qU \
openai-agents==0.0.13 \
linkup-sdk==0.2.4
Now let's set our OPENAI_API_KEY, which we'll use throughout the article. You can get a key from the OpenAI Platform.
python
import os
import getpass
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or \
getpass.getpass("OpenAI API Key: ")
Orchestrator-Subagent
We will build a multi-agent system structured with an orchestrator-subagent pattern. In such a system, the orchestrator agent controls which subagents are executed and in which order, and it handles all communication in and out of the system with its users. Each subagent is built to handle a particular scenario or task. The orchestrator triggers subagents as needed and responds to the user once it deems it has the information necessary to answer.
Subagents
We'll begin by defining our subagents. We will create three of them:
Web Search Subagent will have access to the Linkup web search tool.
Internal Docs Subagent will have access to some "internal" company documents.
Code Execution Subagent can write and execute simple Python code for us.
Let's start with our first subagent!
Web Search Subagent
The web search subagent will take a user's query and use it to search the web. The agent will collect information from various sources and then merge that into a single text response that the SDK passes back to our orchestrator.
OpenAI's built-in web search is not great, so we'll use another web search API called Linkup. This service does require an account, but you will receive more than enough free credits to follow along.
We initialize our Linkup client using an API key like so:
python
import os
from getpass import getpass
from linkup import LinkupClient
os.environ["LINKUP_API_KEY"] = os.getenv("LINKUP_API_KEY") or \
getpass("Enter your Linkup API Key: ")
linkup_client = LinkupClient()
We perform an async search like so:
python
response = await linkup_client.async_search(
query= "Latest world news",
depth= "standard",
output_type= "searchResults",
)
response
text
LinkupSearchResults(
    results=[
        LinkupSearchTextResult(
            type='text',
            name="Snooker-China's Zhao takes big lead over Williams in world championship final",
            content='SHEFFIELD, England (Reuters) -Chinese qualifier Zhao Xintong took an imposing 11-6 lead over Welshman Mark Williams on the opening day of the World Snooker Championship final at the Crucible Theatre on Sunday.'
        ),
        LinkupSearchTextResult(
            type='text',
            name='Sidney Crosby to play for Canada at IIHF World Championship for first time in a decade',
            content='Crosby, 37, previously played at worlds in 2006 (fourth place, after being left off the Olympic team at age 18) and 2015 (tournament champion). He will become the oldest Canadian man to play at worlds since Ray Whitney in 2010.'
        ),
        LinkupSearchTextResult(
            type='text',
            name="Munsey's 78 ensures Scotland sink UAE in World Cup League 2",
            content='In an interview with "EWTN News In Depth" from St. Peter\'s Square in Rome, Curtis Martin spoke on the current state of evangelization in the Catholic Church.'
        ),
        LinkupSearchTextResult(
            type='text',
            name='WTA Rankings Update: Coco Gauff up to World No.3 as American duo loom over Iga Swiatek',
            content='The WTA Rankings have been updated post Madrid Open with Aryna Sabalenka still very much the top name and will be the foreseeable after scooping a third title at the Caja Magica.'
        ),
        LinkupSearchTextResult(
            type='text',
            name='WORLD NEWS',
            url='https://www.clickondetroit.com/news/world/',
            content="Read full article: The Latest: Francis is remembered as a 'pope among the people' in his funeral Mass World dignitaries and Catholic faithful have attended Pope Francis' funeral in St. Peter ..."
        ),
        LinkupSearchTextResult(
            type='text',
            name='Sidney Crosby & Nathan MacKinnon Joining Team Canada at World Championships',
            content='Team Canada announced some massive additions to their 2025 World Championship roster tonight, officially bringing in both Sidney Crosby and Nathan MacKinnon.'
        ),
        ...
    ]
)
"You are a web search agent that can search the web for information. Once"
"you have the required information, summarize it with cleanly formatted links"
"sourcing each bit of information. Ensure you answer the question accurately"
"and use markdown formatting."
),
tools=[search_web],
)
We can talk directly to our subagent to confirm it works:
python
from IPython.display import Markdown, display
from agents import Runner
result = await Runner.run(
    starting_agent=web_search_agent,
    input="How is the weather in Tokyo?"
)
display(Markdown(result.final_output))
markdown
The current weather in Tokyo is mild with a temperature around 63°F (about 22°C). The sky has a few clouds. The forecast shows a mix of mostly cloudy conditions, some rain, and passing showers in the coming days.
Sources:
- [Time and Date - Tokyo Weather](https://www.timeanddate.com/weather/japan/tokyo/ext)
- [The Weather Network - Tokyo Current Weather](https://www.theweathernetwork.com/en/city/jp/tokyo/tokyo/current?_guid_iss_=1)
Great! Now, let's move on to our next subagent.
Internal Docs Subagent
In many corporate environments, we will find that our agents will need access to internal information that we cannot find on the web. To do this, we would typically build a Retrieval Augmented Generation (RAG) pipeline, which can often be as simple as adding a vector search tool to our agents.
To support a full vector search tool over internal documents, we would need to work through various data processing and indexing steps. However, that would add a lot of complexity to this example, so we will create a "dummy" search tool for some fake internal documents.
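For context, a production-grade version of this tool would embed document chunks and retrieve the most similar ones at query time. The sketch below is purely illustrative and isn't used in this example: the doc_chunks list, the vector_search_internal_docs name, and the text-embedding-3-small model choice are all hypothetical stand-ins.
python
import numpy as np
from agents import function_tool
from openai import OpenAI

client = OpenAI()

# hypothetical pre-chunked internal documents
doc_chunks = [
    "Q1 2025 revenue by product line ...",
    "Operating costs for the quarter ...",
]

def embed(texts: list[str]) -> np.ndarray:
    res = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in res.data])

chunk_vectors = embed(doc_chunks)

@function_tool
def vector_search_internal_docs(query: str) -> str:
    """Return the internal doc chunks most relevant to the query."""
    query_vector = embed([query])[0]
    # cosine similarity between the query and every chunk
    scores = (chunk_vectors @ query_vector) / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(query_vector)
    )
    top_idx = np.argsort(scores)[::-1][:2]
    return "\n\n".join(doc_chunks[i] for i in top_idx)
For this walkthrough, though, the dummy version below is all we need.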
Our docs will discuss revenue figures for our wildly successful AI and robotics company called Skynet - you can find the revenue report here.
python
with open("../assets/skynet-fy25-q1.md", "r") as file:
"You are an agent with access to internal company documents. User's will ask"
"you questions about the company and you will use the provided internal docs"
"to answer the question. Ensure you answer the question accurately and use"
"markdown formatting."
),
tools=[search_internal_docs],
)
Let's confirm it works:
python
result = await Runner.run(
    starting_agent=internal_docs_agent,
    input="What was our revenue in Q1 2025?"
)
display(Markdown(result.final_output))
text
Our revenue in Q1 2025 was as follows (in USD millions):
- T-800 Combat Units: 2,400
- T-1000 Infiltration Units: 1,150
- Hunter-Killer Drone Manufacturing: 880
- Neural Net Command & Control Systems: 1,620
- Skynet Core Infrastructure Maintenance: 540
- Time Displacement R&D Division: 310
The total revenue for Q1 2025 sums up to approximately 6,900 million USD.
Let me know if you want a more detailed breakdown or additional insights!
Perfect! Now onto our final subagent.
Code Execution Subagent
Our code execution subagent will be able to execute code for us. We'll focus on simple calculations, but State-of-the-Art (SotA) LLMs can write far more complex code, as the growing prominence of AI code editors makes clear.
To run the generated code, we will use Python's built-in exec function, passing an empty dictionary as the namespace (namespace={}) so the code runs without access to our global variables.
python
@function_tool
def execute_code(code: str) -> str:
    """Execute Python code and return the output. The output must
    be assigned to a variable called `result`.
    """
    display(Markdown(f"Code to execute:\n```python\n{code}\n```"))
    try:
        namespace = {}
        exec(code, namespace)
        return namespace["result"]
    except Exception as e:
        return f"Error executing code: {e}"
Now, let's define our Code Execution Subagent. To maximize performance during code writing tasks, we will use gpt-4.1 rather than gpt-4.1-mini.
python
code_execution_agent = Agent(
name= "Code Execution Agent",
model="gpt-4.1",
instructions=(
"You are an agent with access to a code execution environment. You will be"
"given a question and you will need to write code to answer the question. "
"Ensure you write the code in a way that is easy to understand and use."
),
tools=[execute_code],
)
We can test our subagent with a simple math question:
python
result = await Runner.run(
    starting_agent=code_execution_agent,
    input=(
        "If I have four apples and I multiply them by seventy-one and one tenth "
        "bananas, how many do I have?"
    )
)
display(Markdown(result.final_output))
text
Code to execute:
# Given values
apples = 4
bananas = 71.1
# Multiplying apples by bananas
result = apples * bananas
result
If you multiply four apples by seventy-one and one tenth (71.1) bananas, you get 284.4. Note that in real life, multiplying apples and bananas doesn't give a meaningful physical quantity, but mathematically, 4 × 71.1 = 284.4.
We now have all three subagents - it's time to create our orchestrator.
Orchestrator
Our orchestrator will control the input and output of information to our subagents in the same way that our subagents control the input and output of information to our tools. In reality, our subagents become tools in the orchestrator-subagent pattern. To turn agents into tools we call the as_tool method and provide a name and description for our agents-as-tools.
We will first define our instructions for the orchestrator, explaining its role in our multi-agent system.
python
ORCHESTRATOR_PROMPT = (
    "You are the orchestrator of a multi-agent system. Your task is to take the user's query and "
    "pass it to the appropriate agent tool. The agent tools will see the input you provide and "
    "use it to get all of the information that you need to answer the user's query. You may need "
    "to call multiple agents to get all of the information you need. Do not mention or draw "
    "attention to the fact that this is a multi-agent system in your conversation with the user."
)
Now we define the orchestrator, including our subagents, using the as_tool method — note that
we can also add normal tools to our orchestrator.
python
from datetime import datetime

@function_tool
def get_current_date():
    """Use this tool to get the current date and time."""
    # return the date as a readable string
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

orchestrator = Agent(
    name="Orchestrator",
    model="gpt-4.1-mini",
    instructions=ORCHESTRATOR_PROMPT,
    tools=[
        web_search_agent.as_tool(
            tool_name="web_search_agent",  # cannot include whitespace in tool name
            tool_description="Search the web for up-to-date information"
        ),
        internal_docs_agent.as_tool(
            tool_name="internal_docs_agent",
            tool_description="Search the internal docs for information"
        ),
        code_execution_agent.as_tool(
            tool_name="code_execution_agent",
            tool_description="Execute code to answer the question"
        ),
        get_current_date,
    ],
)
Let's test our agent with a few queries. Our first query will require our orchestrator to call multiple tools.
python
result = await Runner.run(
    starting_agent=orchestrator,
    input="How long ago from today was it when we got our last revenue report?"
)
display(Markdown(result.final_output))
text
The last revenue report was released on April 2, 2025. Today is May 5, 2025, so it was 33 days ago.
In our traces dashboard on the OpenAI Platform, we should see that our agent answered the question using both the internal_docs_agent and get_current_date tools.
Let's ask another question:
python
result = await Runner.run(
    starting_agent=orchestrator,
    input=(
        "What is our current revenue, and what percentage of revenue comes from the "
        "T-1000 units?"
    )
)
display(Markdown(result.final_output))
text
Our current revenue for Q1 2025 is approximately $6.9 billion. About 16.67% of that revenue comes from the T-1000 units.
Our orchestrator-subagent workflow is working well. Now we can move on to handoffs.
Handoffs
When we use handoffs in the Agents SDK, an agent hands over control of the entire workflow to another agent. Handoffs differ from the orchestrator-subagent pattern — with orchestrator-subagent, the orchestrator retains control: each subagent must ultimately respond to the orchestrator, and the orchestrator decides the flow of information and generates the final response to the user. With handoffs, once a "subagent" gains control of the workflow, the flow of information and the final answer are under its control.
Using the handoff structure, any of our agents may answer the user directly, and our subagents can see the entire chat history with the steps taken so far.
A significant positive here is latency. To answer a query that requires a single web search with the orchestrator-subagent pattern, we need three LLM generations: one for the orchestrator to decide which subagent to call, one for the web search subagent to produce its answer from the search results, and one for the orchestrator to compose the final response. With a handoff, we only need two: the main agent decides to hand off, and the web search agent answers the user directly.
Because we are using fewer LLM generations to produce our answer, we can generate it much more quickly, as we skip the return trip through the orchestrator.
The handoff speed improvement is terrific, but it's not all sunshine and rainbows. With handoffs, our workflow can no longer handle queries requiring multiple agents. The pros and cons of each structure will need to be considered when deciding what structure to use for a particular use case.
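If you want to reproduce the latency comparisons quoted below, one simple way is to wrap each run with a timer, for example:
python
import time

# time a single run of the workflow end-to-end
start = time.perf_counter()
result = await Runner.run(
    starting_agent=orchestrator,
    input="How is the weather in Tokyo?"
)
print(f"Runtime: {time.perf_counter() - start:.1f}s")
Exact numbers will vary between runs, but the relative gap between the two patterns should be visible.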
Let's jump into implementing our handoff agents workflow.
Using Handoffs
There are three key things we need when defining our main_agent (the equivalent of our earlier orchestrator agent):
Update our instructions prompt to explain handoffs and how the LLM can use them. OpenAI provides a default prompt prefix that we can use.
Set the handoffs parameter, which is a list of agents that we use as handoffs.
Set the handoff_description parameter; this is an additional prompt where we should describe to the main_agent when it should use the handoffs.
First, let's check the preset prompt prefix:
python
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
display(Markdown(RECOMMENDED_PROMPT_PREFIX))
markdown
# System context
You are part of a multi-agent system called the Agents SDK, designed to make agent coordination and execution easy. Agents uses two primary abstraction: **Agents** and **Handoffs**. An agent encompasses instructions and tools and can hand off a conversation to another agent when appropriate. Handoffs are achieved by calling a handoff function, generally named `transfer_to_<agent_name>`. Transfers between agents are handled seamlessly in the background; do not mention or draw attention to these transfers in your conversation with the user.
"Handoff to the relevant agent for queries where we need additional information"
"from the web or internal docs, or when we need to make calculations."
),
tools=[get_current_date],
)
We'll run the same queries as earlier and see how the response time differs.
python
result = await Runner.run(
    starting_agent=main_agent,
    input="How long ago from today was it when we got our last revenue report?"
)
display(Markdown(result.final_output))
text
The last revenue report we received was the Quarterly Revenue Report for Q1 2025, dated April 2, 2025.
Since today is May 5, 2025, the last revenue report was about 1 month and 3 days ago.
That's correct, and we got a 6.4s runtime vs. the orchestrator-subagent's 7.5s for the same query. Let's try another:
python
result = await Runner.run(
    starting_agent=main_agent,
    input=(
        "What is our current revenue, and what percentage of revenue comes from the "
        "T-1000 units?"
    )
)
display(Markdown(result.final_output))
text
- The current revenue for Skynet Inc. in Q1 2025 is $6.9 billion.
- Of this, approximately 16.67% of revenue comes from T-1000 Infiltration Units.
If you need a breakdown or more details, let me know!
The answer is correct again, and we get a runtime of 7.6s vs. the orchestrator-subagent runtime of 8.6s, another notable improvement to latency.
Other Handoff Features
There are a few other handoff-specific features that we can use. We can use these features in various ways, but they are particularly handy during the development and debugging of multi-agent workflows. These features are:
on_handoff is a callback executed whenever a handoff occurs. It could be used in a production setting to maintain a record of handoffs in a database or in telemetry. In development, this can be a handy place to add print or logger.debug statements.
input_type allows us to define a structured input format for generated information that our LLM will pass to our handoff callback.
input_filter allows us to restrict the information passed to our handoff agents.
We can set all of these via a handoff object, which wraps around our handoff agents and which we then provide via the Agent(handoffs=...) parameter. Let's start with the on_handoff parameter:
python
from agents import RunContextWrapper, handoff

# we define a function that will be called when the handoff is made
def on_handoff(ctx: RunContextWrapper[None]):
    print("Handoff called")

# wrap each subagent in a handoff object so the callback fires on every handoff
main_agent = Agent(
    name="Main Agent",
    model="gpt-4.1-mini",
    instructions=RECOMMENDED_PROMPT_PREFIX,
    handoffs=[
        handoff(agent=web_search_agent, on_handoff=on_handoff),
        handoff(agent=internal_docs_agent, on_handoff=on_handoff),
        handoff(agent=code_execution_agent, on_handoff=on_handoff),
    ],
    tools=[get_current_date],
)
Now let's see what happens when querying the main_agent:
python
result = await Runner.run(
    starting_agent=main_agent,
    input="How long ago from today was it when we got our last revenue report?"
)
display(Markdown(result.final_output))
text
Handoff called
The last revenue report was for Q1 2025 and the report date is April 2, 2025.
From today, May 5, 2025, the last revenue report was released 33 days ago.
Now we can see if and when the handoff occurs. However, we don't get much information other than that the handoff occurred. Fortunately, we can provide more information by using the input_type parameter. We will define a pydantic BaseModel with the information we want to include.
python
from pydantic import BaseModel, Field

class HandoffInfo(BaseModel):
    subagent_name: str = Field(description="The name of the subagent being called.")
    reason: str = Field(description="The reason for the handoff.")

# we redefine the on_handoff to include the HandoffInfo
def on_handoff(ctx: RunContextWrapper[None], input_data: HandoffInfo):
    print(f"Handoff to '{input_data.subagent_name}' because '{input_data.reason}'")

# rebuild the handoff objects, now passing input_type=HandoffInfo
main_agent = Agent(
    name="Main Agent",
    model="gpt-4.1-mini",
    instructions=RECOMMENDED_PROMPT_PREFIX,
    handoffs=[
        handoff(agent=web_search_agent, on_handoff=on_handoff, input_type=HandoffInfo),
        handoff(agent=internal_docs_agent, on_handoff=on_handoff, input_type=HandoffInfo),
        handoff(agent=code_execution_agent, on_handoff=on_handoff, input_type=HandoffInfo),
    ],
    tools=[get_current_date],
)

result = await Runner.run(
    starting_agent=main_agent,
    input="How long ago from today was it when we got our last revenue report?"
)
display(Markdown(result.final_output))
text
Handoff to 'Internal Docs Agent' because 'The user is asking for the date of the last revenue report to calculate how long ago it was.'
The last revenue report was for Q1 2025, dated April 2, 2025.
Considering today is May 5, 2025, the last revenue report was about 33 days ago.
We're now seeing much more information. The final handoff feature we will test is input_filter, together with the prebuilt handoff_filters. Filters work by removing information sent to the handoff agents. By default, our workflow provides the new handoff agent with everything the previous agent saw, including all chat history messages and all tool calls made so far.
In some cases, we may want to filter this information. For example, with a weaker LLM, too much information can reduce its performance, so it is often a good idea to only share the information that is absolutely necessary.
If we have various tool calls in our chat history, these may confuse a smaller LLM. In this scenario, we can strip all tool calls from the history using the handoff_filters.remove_all_tools filter:
python
from agents.extensions import handoff_filters
# now redefine the handoff objects, this time adding the input_filter parameter
main_agent = Agent(
    name="Main Agent",
    model="gpt-4.1-mini",
    instructions=RECOMMENDED_PROMPT_PREFIX,
    handoffs=[
        handoff(agent=web_search_agent, on_handoff=on_handoff,
                input_type=HandoffInfo, input_filter=handoff_filters.remove_all_tools),
        handoff(agent=internal_docs_agent, on_handoff=on_handoff,
                input_type=HandoffInfo, input_filter=handoff_filters.remove_all_tools),
        handoff(agent=code_execution_agent, on_handoff=on_handoff,
                input_type=HandoffInfo, input_filter=handoff_filters.remove_all_tools),
    ],
    tools=[get_current_date],
)
Now, when asking for the time difference, we will see that our handoff agent cannot give us an accurate answer. This is because the current date is first found by our main_agent via the get_current_date tool and stored in the chat history as a tool call, which is exactly the information that is lost when we filter tool calls out of the chat history.
python
result = await Runner.run(
    starting_agent=main_agent,
    input="How long ago from today was it when we got our last revenue report?"
)
display(Markdown(result.final_output))
text
Handoff to 'Internal Docs Agent' because 'To identify the date of the last revenue report to determine how long ago it was from today (2025-05-05).'
The last revenue report we received was for Q1 2025, dated April 2, 2025. Today is April 27, 2025, so the report was received about 25 days ago.
We should see an incorrect date above: the agent believes today is April 27 rather than May 5, because the get_current_date tool output never reached it.
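As a final note, input_filter accepts any callable that receives the handoff input data and returns a (possibly modified) version of it, so we aren't limited to the prebuilt filters. The log_and_remove_tools function and docs_handoff variable below are hypothetical examples, not part of the SDK; they simply log each handoff before delegating to remove_all_tools:
python
# hypothetical custom filter: log the handoff, then strip tool calls as before
def log_and_remove_tools(handoff_data):
    print("Filtering handoff input before passing it to the next agent")
    return handoff_filters.remove_all_tools(handoff_data)

# used exactly like the prebuilt filter
docs_handoff = handoff(
    agent=internal_docs_agent,
    on_handoff=on_handoff,
    input_type=HandoffInfo,
    input_filter=log_and_remove_tools,
)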
That's it for this deep dive into multi-agent systems with OpenAI's Agents SDK. We've covered a broad range of SDK multi-agent features and how we can use them to build orchestrator-subagent workflows and handoff workflows, both of which have their own pros and cons.