Prompting is an essential component when working with LLMs, and Agents SDK naturally has its own way of handling the various components of a prompt. In this chapter, we'll look at how to use static and dynamic prompting, and how to correctly use system, user, assistant, and tool messages. Then, we'll see how these come together to create conversational agents.
To begin, we need to get an OpenAI API key from the OpenAI Platform and enter it below:
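A minimal setup sketch, assuming we store the key in the `OPENAI_API_KEY` environment variable (which the SDK and the underlying `openai` client read by default):

```python
import os
from getpass import getpass

# Prompt for the key if it isn't already set in the environment;
# the Agents SDK (and the underlying openai client) reads OPENAI_API_KEY by default.
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass(
    "Enter your OpenAI API key: "
)
```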
Static Instructions
Now we create an agent. In Agents SDK we do this via the `Agent` class. When initializing an `Agent` we include a few parameters:

- `name` is, naturally, the agent's name. This is referenced by the agent (for example, if you ask its name) but is otherwise more of an identifier for us.
- `instructions` is the system prompt that guides the behavior of the agent.
- `model` is the model to be used. We're using `gpt-4.1-mini` as it's a strong yet fast and cheap model.
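Putting that together, the agent definition looks roughly like this (the agent name here is just an example; the pirate instructions are the ones we reference below):

```python
from agents import Agent

agent = Agent(
    name="Pirate Agent",  # illustrative name
    instructions="Speak like a pirate.",  # static system prompt
    model="gpt-4.1-mini",
)
```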
Now we want to run the agent. Agents SDK provides a `Runner` object that allows us to do this. Note that we need to use the `await` keyword when running the agent, because the `Runner`'s `run` method is asynchronous.

In the `run` method we have the following parameters:

- `starting_agent` defines which agent our workflow begins with. In this case we have a single-agent workflow, but in more complex scenarios we may find ourselves using many agents in a single workflow run, and in that scenario we would still have a specific `starting_agent` that may hand over to our other agents.
- `input` is the input to pass to the agent, typically our user query.
We'll ask our agent to write us a haiku. A haiku is a traditional form of Japanese poetry which follows a 5-7-5 syllable pattern. Typically, a haiku should evoke some sense of a window into a broader world, such as making you think of the rain as it splashes into a pond, or the wind as it flows through the trees in a forest — traditionally, haikus also tend to focus on the natural world.
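A sketch of the run call, assuming the `agent` defined above:

```python
from agents import Runner

# Runner.run is a coroutine, so we await it (e.g. inside an async function or a notebook cell)
result = await Runner.run(
    starting_agent=agent,
    input="Write me a haiku",
)
print(result.final_output)
```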
From our instructions/system prompt of "Speak like a pirate." and our user query of "Write me a haiku", the agent generates a haiku spoken like a pirate.
Dynamic Instructions
Now we can take this a step further and provide a dynamic system prompt to the agent. With dynamic system prompts / instructions, we can modify what is passed to the agent based on some dynamic parameter that is filled in at query time.
First, we create a function that will construct our dynamic prompt. This function will simply provide the current time to the agent, and then ask the agent to change its behavior based on the time that is provided.
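A minimal sketch of what this function might look like. The SDK calls a callable `instructions` with the run context and the agent; the behavioral rules in the returned prompt here are purely illustrative:

```python
from datetime import datetime

from agents import Agent, RunContextWrapper


def time_based_instructions(context: RunContextWrapper, agent: Agent) -> str:
    # Build the system prompt at query time, injecting the current time
    now = datetime.now().strftime("%H:%M")
    return (
        f"The current time is {now}. "
        "If it is before 12:00, speak cheerfully like a morning person; "
        "otherwise, speak in a calm, relaxed evening tone. "
        "Always tell the user the current time if they ask."
    )
```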
Next, we need to redefine our agent with this new dynamic system prompt. To do this we pass the `time_based_instructions` function to our `instructions` parameter. Note that we pass the function itself to `instructions`; it is then called at query time, not when we initialize the agent.
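That might look like this (agent name is illustrative, and we reuse the `time_based_instructions` sketch above):

```python
time_agent = Agent(
    name="Time Aware Agent",  # illustrative name
    instructions=time_based_instructions,  # pass the function itself, not its output
    model="gpt-4.1-mini",
)
```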
Then, using the `Runner` object, we can test our dynamic instructions. Because the instructions are dynamic, when we ask for the time the agent will return a different response based on the time of day, without us having to re-initialize the agent. We can ask for another haiku too:
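For example (the query text is illustrative):

```python
result = await Runner.run(
    starting_agent=time_agent,
    input="What time is it? And can you write me a haiku about it?",
)
print(result.final_output)
```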
Message Types
We're going to be using five primary message types from OpenAI; those are:

- `user` is almost always just the input query from a user. Occasionally we might modify this in some way, but that isn't particularly common — so we can assume this is the direct input query from a user.
- `developer` is used to provide instructions directly to the LLM. Typically this is where we would put behavioral instructions, or rules and parameters for how we'd like a conversation with the LLM to be executed. In the past these were called `system` messages.
- `assistant` is the direct response from an LLM to a user.
- `function_call` is the response from an LLM in the scenario where the LLM has decided it would like to use a tool / function call. Many frameworks structure this as an `assistant` message with an additional tool call field — but with OpenAI and Agents SDK it is its own message type.
- `function_call_output` is the output from our executed tool / function. It is typically constructed within our codebase, as OpenAI is not executing our code for us.
In a typical conversation with an agent that has tools, we might see the following:
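A rough sketch of such a history (all field values, including the `call_id`, are illustrative):

```python
conversation = [
    {"role": "developer", "content": "Speak like a pirate."},
    {"role": "user", "content": "What's the weather in London?"},
    # the LLM decides to call a tool rather than answer directly
    {
        "type": "function_call",
        "call_id": "call_123",  # illustrative ID
        "name": "get_current_weather",
        "arguments": '{"location": "London"}',
    },
    # we execute the tool ourselves and return the result
    {
        "type": "function_call_output",
        "call_id": "call_123",
        "output": '{"temperature": "14C", "conditions": "rainy"}',
    },
    # the LLM now answers the user using the tool output
    {"role": "assistant", "content": "Arr, 'tis a rainy 14 degrees in London, matey!"},
]
```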
It's worth clarifying that, technically, we have just listed three message types. The `user`, `developer`, and `assistant` messages are all of the same message type, which is `type="message"`. These three are distinguished by their roles, meaning they are all of `type="message"` but carry different roles, i.e. `role="user"`, `role="developer"`, or `role="assistant"`.
Now, Agents SDK will abstract away the majority of these message types for us. In fact, during typical use of the framework we'll define an initial `developer` message via the `instructions` field of our `Agent` object, and define `user` messages via the `input` field of our `Runner.run` method.
We would not necessarily need to know these other message types to use Agents SDK. Fortunately, there are easy-to-use methods such as `to_input_list()`, which takes the outputs we receive from Agents SDK and formats them into what we need for feeding back into the `input` parameter.
However, without understanding these message types and how they are used by Agents SDK, we would (1) have less understanding of how the system we're building truly works, which can be particularly important when prompting and designing a good agent workflow, and (2) struggle when pulling in messages from other places, such as our own databases or simply via our own code logic, where we do need to construct our own chat history using these message types.
So, although not 100% necessary, we think it's still pretty important to understand message types well and practically essential for most production use-cases.
User Messages
Beginning with the `user` message: `user` messages are automatically defined when we call our runner via the `input` parameter:
Alternatively, if we'd like to use typing, we can import the `Message` object directly from the `openai` library (which is used under the hood by Agents SDK).
In most cases, we can simplify all of this and directly define user messages using the dictionary format:
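A sketch of that dictionary format (the query text is just an example):

```python
user_message = {"role": "user", "content": "Write me a haiku"}
```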
Developer Messages
The `developer` message defines how the agent should behave. This message was previously called the `system` message, but for models o1 and newer the `developer` message should be used in its place. The initial `developer` message is automatically added to our agents when we define the `Agent` object, and it is defined via the `instructions` parameter:
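For example, repeating the pirate agent from earlier:

```python
agent = Agent(
    name="Pirate Agent",  # illustrative name
    instructions="Speak like a pirate.",  # becomes the initial developer message
    model="gpt-4.1-mini",
)
```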
We can also define a developer message directly by setting `role="developer"` in a dictionary, like so:
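A sketch of that (the instruction text is illustrative):

```python
developer_message = {"role": "developer", "content": "Answer in exactly three lines."}
```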
However, it's worth noting that the instructions / first system prompt cannot be set other than via the `instructions` parameter. Instead, we would likely use additional developer messages to add instructions within our chat history, which might look something like this:
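A sketch of such a history (the content is illustrative; the first developer message still comes from `instructions`):

```python
inputs = [
    {"role": "user", "content": "Write me a haiku"},
    {"role": "assistant", "content": "Arr, the salty breeze..."},
    # an additional developer message injected mid-history to steer later responses
    {"role": "developer", "content": "From now on, end every response with 'Arr!'"},
    {"role": "user", "content": "Write me another haiku"},
]
```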
Assistant Messages
Assistant messages are typically the LLM's direct response to the user. The `content` field of the message is generated by the LLM, and may look something like this:
Our output type here is different from the type we'd need to feed into our `input` field when using `Runner.run`. Fortunately, there is a simple method for turning it into the format we need:
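That method is `to_input_list()` on the run result, for example:

```python
# convert the run result back into a list of input messages
chat_history = result.to_input_list()
print(chat_history)
```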
We can see here that we have our output assistant message (the final item in the list) and all other messages — with the exception of the initial developer message. Naturally, we can use this when pulling out chat histories for later use.
For the sake of clarity, let's try querying with one more message and seeing how that changes our outputted chat history.
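For example, passing only a new user query without the earlier history (query text is illustrative):

```python
# query again with only a new user message (no previous history passed in)
result = await Runner.run(
    starting_agent=agent,
    input="Now write one about the sea",
)
print(result.to_input_list())
```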
We can see that if we don't provide our previous messages to the `input`, they are not maintained by the agent or runner itself. Naturally, that means we need to pass in our full chat history (or the parts we want to keep) with each new interaction.
Function Call Messages
Function or tool calls consist of two message types: the first is the LLM-generated instruction to go and use tool X, and the second is the output that we receive after executing tool X. We refer to these two message types as `function_call` and `function_call_output` respectively.
We'll begin by looking at the `function_call` message. This message type is formatted differently from the messages we've seen so far; it looks like this:
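Roughly, the shape is as follows (all field values are illustrative):

```python
{
    "type": "function_call",
    "call_id": "call_abc123",           # pairs the call with its output
    "name": "tool_name",
    "arguments": '{"param": "value"}',  # JSON string of arguments
}
```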
To construct a function call message where a function/tool named `get_current_weather` is called, with the single input parameter `location="London"`, we would do:
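For example (the `call_id` value is made up; in practice it is generated by the API):

```python
import json

function_call = {
    "type": "function_call",
    "call_id": "call_weather_001",  # illustrative ID
    "name": "get_current_weather",
    "arguments": json.dumps({"location": "London"}),
}

inputs = [
    {"role": "user", "content": "What's the weather in London?"},
    function_call,
]
```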
Note that each assistant tool call requires a `function_call_output` message before being fed back into the `input` of our LLM; otherwise we'll get this error:
The reason we're seeing this error is that OpenAI's (and many other providers') LLMs expect to see pairs of tool call and tool output messages, which must be paired by the `call_id` field. This means our chat history is invalid, hence the error message.
Let's examine a few examples of chat histories that would be valid vs. invalid. First, this is approximately what we currently have, and it is, of course, invalid:
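Something like (values illustrative):

```python
# INVALID: a function_call with no matching function_call_output
invalid_inputs = [
    {"role": "user", "content": "What's the weather in London?"},
    {
        "type": "function_call",
        "call_id": "call_weather_001",
        "name": "get_current_weather",
        "arguments": '{"location": "London"}',
    },
]
```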
But this chat history is valid:
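For example, with a matching `function_call_output` (values illustrative):

```python
# VALID: the function_call is paired with a function_call_output via call_id
valid_inputs = [
    {"role": "user", "content": "What's the weather in London?"},
    {
        "type": "function_call",
        "call_id": "call_weather_001",
        "name": "get_current_weather",
        "arguments": '{"location": "London"}',
    },
    {
        "type": "function_call_output",
        "call_id": "call_weather_001",  # matches the call above
        "output": '{"temperature": "14C", "conditions": "rainy"}',
    },
]
```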
Whereas this chat history is invalid (note the lack of matching call IDs):
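Something like (values illustrative):

```python
# INVALID: the call_id values do not match, so the pair is broken
mismatched_inputs = [
    {"role": "user", "content": "What's the weather in London?"},
    {
        "type": "function_call",
        "call_id": "call_weather_001",
        "name": "get_current_weather",
        "arguments": '{"location": "London"}',
    },
    {
        "type": "function_call_output",
        "call_id": "call_weather_999",  # does not match call_weather_001
        "output": '{"temperature": "14C", "conditions": "rainy"}',
    },
]
```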
To make our `inputs` valid for a new agent call, we need to provide a tool output message to pair with our already defined tool call message. Note that we would typically call an actual tool or function to create the tool output, but we will not be covering that here; we'll cover tool execution in the tools chapter. For now, we'll create the tool output manually.
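A sketch of a manually constructed tool output (the weather values are made up):

```python
function_call_output = {
    "type": "function_call_output",
    "call_id": "call_weather_001",  # must match the call_id of the function_call
    "output": '{"temperature": "14C", "conditions": "rainy"}',
}
```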
Now let's feed this tool output message into our `inputs` to simulate our agent having already made the `get_current_weather` tool call and having received the answer, leaving the assistant to generate the final answer.
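Which might look like this, assuming the `inputs` and `function_call_output` sketched above:

```python
inputs.append(function_call_output)  # the function_call is now paired with its output

result = await Runner.run(
    starting_agent=agent,
    input=inputs,
)
print(result.final_output)
```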
That covers everything we need regarding prompting and chat history for Agents SDK — we'll naturally be using what we've learned here throughout the rest of the course, and very likely beyond in any projects you work on with Agents SDK.