State-of-the-Art (SotA) LLMs are no longer limited to running on big AI superclusters, locked away behind an API key. A variety of open-source and open-weight LLMs, together with open-source software such as LM Studio and LiteLLM, now allow us to build locally.
In this post, we'll show you how to do exactly that using the excellent Cogito V1 models, LM Studio, and a dash of tool-calling magic to build your own async-capable local agent.
By the end of this tutorial, you'll have:
- A local LLM server running Cogito V1
- A working dev environment with async streaming completions
- Tool/function calling support (e.g., live web search)
- An agent abstraction for iterative reasoning
Getting Started: LM Studio + Cogito V1
The first step is to grab LM Studio, a local LLM runner that's dead simple to use. Download it from lmstudio.ai.
Next, go to the Discover tab inside LM Studio and search for `cogito`. Pick the model `cogito-v1-preview-qwen-32b` and hit the green Download button.
Once the download is done:
- Switch to the Server tab
- Start the server on port `1234`
- Load your chosen model

You should now be able to query the LLM locally at `http://localhost:1234/v1`.
Confirm the Server Is Running
Run this quick check in your terminal:
curl http://localhost:1234/v1/models
You should see a list of available models, including `cogito-v1-preview-qwen-32b`:
{
  "data": [
    {
      "id": "cogito-v1-preview-qwen-32b",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "cogito-v1-preview-llama-70b",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "unsloth/llama-4-scout-17b-16e-instruct",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "lmstudio-community/llama-4-scout-17b-16e-instruct",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "text-embedding-nomic-embed-text-v1.5",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "mistral-small-3.1-24b-instruct-2503",
      "object": "model",
      "owned_by": "organization_owner"
    }
  ],
  "object": "list"
}
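If you'd rather run this check from Python, you can parse the same payload and collect the model ids before choosing one. Here's a small sketch using a trimmed copy of the response above (in practice you'd fetch it from `http://localhost:1234/v1/models`):

```python
import json

# a trimmed copy of the /v1/models response shown above
payload = json.loads("""
{
  "data": [
    {"id": "cogito-v1-preview-qwen-32b", "object": "model", "owned_by": "organization_owner"},
    {"id": "cogito-v1-preview-llama-70b", "object": "model", "owned_by": "organization_owner"}
  ],
  "object": "list"
}
""")

model_ids = [model["id"] for model in payload["data"]]
print(model_ids)  # ['cogito-v1-preview-qwen-32b', 'cogito-v1-preview-llama-70b']
```

This is handy for failing fast in scripts when the model you expect isn't loaded.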
Set Up Your Project
Clone the repo from the Aurelio Labs Cookbook:
git clone https://github.com/aurelio-labs/cookbook.git
cd cookbook/gen_ai/local_lm_studio
Use UV to manage your Python environment:
brew install uv # Mac only
uv venv --python 3.12.7
source .venv/bin/activate
uv sync
You now have a ready-to-go environment with all dependencies installed, including `litellm` and `graphai`.
First Completion Call
This code snippet initializes the LiteLLM client, sets the environment variable for the local server, and sends a basic prompt to the LLM:
from litellm import completion
import os

MODEL = "lm_studio/cogito-v1-preview-qwen-32b"
os.environ["LM_STUDIO_API_BASE"] = "http://localhost:1234/v1"

response = completion(
    model=MODEL,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_key="sk-dummy-key"
)
print(response)
From this we get a `ModelResponse` object, which includes the returned assistant message:
ModelResponse(
    model='lm_studio/cogito-v1-preview-qwen-32b',
    choices=[Choices(
        message=Message(
            role='assistant',
            content="I'm doing well, thank you for asking! How can I help you today?"
        )
    )]
)
We access the content with:

response.choices[0].message.content
Async Streaming with LiteLLM
In most use-cases we're likely to be writing async code to enable a more scalable application, and we'll often want streaming too, which allows us to build more responsive, user-friendly interfaces. For async we use the `acompletion` function from LiteLLM, and we stream the tokens by setting `stream=True`.
We can then parse and print each token as our LLM generates it like so:
from litellm import acompletion

response = await acompletion(
    model=MODEL,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    stream=True,
    api_key="sk-dummy-key"
)

async for chunk in response:
    if (token := chunk.choices[0].delta.content):
        print(token, end="", flush=True)
Once streaming is complete, we should see a full response:
I'm doing well, thank you for asking! How can I help you today?
Tool Calls and Function Calling
Now let's enable function calling (aka tool use). Not all models support tool use, and LiteLLM provides the `supports_function_calling` function to check whether an LLM supports it. This check isn't particularly reliable for LM Studio models, though, and for the Cogito v1 models it returns `False`:
from litellm import supports_function_calling
supports_function_calling(MODEL)
Returns:
False
Cogito v1 does support tool use, so LiteLLM is wrong here, but we do need to modify how we call the model. For tool use with Cogito v1 on LM Studio, we proxy through LiteLLM's OpenAI provider so that our endpoint receives OpenAI-standard requests. To do this, we replace the `lm_studio` prefix with `openai`:
OAI_MODEL = MODEL.replace("lm_studio/", "openai/")
Then pass the `base_url` to tell LiteLLM to call our LM Studio endpoint (`http://localhost:1234/v1`) rather than the default OpenAI endpoint (`https://api.openai.com/v1`):
response = completion(
    model=OAI_MODEL,
    messages=[{"role": "user", "content": "Hello!"}],
    api_key="sk-dummy-key",
    base_url="http://localhost:1234/v1"
)
Expected response:
ModelResponse(
    model='openai/cogito-v1-preview-qwen-32b',
    choices=[Choices(
        message=Message(
            role='assistant',
            content="I'm doing well, thank you! How are you today?"
        )
    )]
)
Add Web Search as a Tool
Now that we've put together our completion and proxy calls, let's define a tool. We'll use SerpAPI to build a simple web search tool. We do need an API key for this, but it comes with 100 free calls per month.
We'll also be using `aiohttp` to make the HTTP request asynchronously, keeping our full LLM and tool execution pipeline async. To call SerpAPI we do:
from getpass import getpass
import aiohttp

SERPAPI_API_KEY = getpass("Enter your SerpAPI API key: ")

# define our search parameters
params = {
    "api_key": SERPAPI_API_KEY,
    "engine": "google",
    "q": "latest world news"
}

async with aiohttp.ClientSession() as session:
    async with session.get(
        "https://serpapi.com/search",
        params=params
    ) as response:
        results = await response.json()

results["organic_results"]
The results are a list of 10 records, each with a `title`, `link`, `snippet`, and `source`, among other fields.
[{'position': 1,
'title': 'World | Latest News & Updates',
'link': 'https://www.bbc.com/news/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.bbc.com/news/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCAQAQ',
'displayed_link': 'https://www.bbc.com › news › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaac23c298b281c232418cab8d216c47a55.png',
'date': '5 hours ago',
'snippet': 'British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".',
'snippet_highlighted_words': ['British couple killed in cable car crash'],
'sitelinks': {'inline': [{'title': 'BBC World News',
'link': 'https://www.bbc.com/news/world_radio_and_tv'},
{'title': 'Europe', 'link': 'https://www.bbc.com/news/world/europe'},
{'title': 'Africa', 'link': 'https://www.bbc.com/news/world/africa'},
{'title': 'Middle East',
'link': 'https://www.bbc.com/news/world/middle_east'}]},
'source': 'BBC'},
{'position': 2,
'title': 'World news - breaking news, video, headlines and opinion',
'link': 'https://www.cnn.com/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.cnn.com/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCMQAQ',
'displayed_link': 'https://www.cnn.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa6c4f5045df1314fc3a0e874d127b2c2e.png',
'date': '5 hours ago',
'snippet': "Turkey begins mass trials following protests over Istanbul mayor's detention · US will abandon Ukraine peace efforts 'within days' if no progress made, Rubio ...",
'snippet_highlighted_words': ["Turkey begins mass trials following protests over Istanbul mayor's detention"],
'sitelinks': {'inline': [{'title': 'Impact Your World',
'link': 'https://edition.cnn.com/world/impact-your-world'},
{'title': 'US is ‘destroying’ world order...',
'link': 'https://www.cnn.com/2025/03/06/europe/us-world-order-ukraine-zaluzhnyi-intl/index.html'},
{'title': 'Europe', 'link': 'https://www.cnn.com/world/europe'},
{'title': 'Americas', 'link': 'https://www.cnn.com/world/americas'}]},
'source': 'CNN'},
{'position': 3,
'title': 'World News | Latest Top Stories',
'link': 'https://www.reuters.com/world/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.reuters.com/world/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCQQAQ',
'displayed_link': 'https://www.reuters.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa1f9c74f3e3216f8467771437cd888e82.png',
'date': '3 hours ago',
'snippet': 'Reuters.com is your online source for the latest world news stories and current events, ensuring our readers up to date with any breaking news developments.',
'snippet_highlighted_words': ['Reuters.com is your online source for the latest world news stories'],
'sitelinks': {'inline': [{'title': 'Europe',
'link': 'https://www.reuters.com/world/europe/'},
{'title': 'Econ World',
'link': 'https://www.reuters.com/markets/econ-world/'},
{'title': 'United States', 'link': 'https://www.reuters.com/world/us/'},
{'title': 'China', 'link': 'https://www.reuters.com/world/china/'}]},
'source': 'Reuters'},
{'position': 4,
'title': 'Latest news from around the world',
'link': 'https://www.theguardian.com/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.theguardian.com/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDIQAQ',
'displayed_link': 'https://www.theguardian.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa48b2ea00a209294a3ba58a436d46087f.png',
'date': '5 hours ago',
'snippet': 'Most viewed in world news · British woman among four killed in Italian cable car crash · Live · US ready to abandon Ukraine peace deal if there is no progress, ...',
'snippet_highlighted_words': ['British woman among four killed in Italian cable car crash'],
'sitelinks': {'inline': [{'title': 'Europe',
'link': 'https://www.theguardian.com/world/europe-news'},
{'title': 'Americas',
'link': 'https://www.theguardian.com/world/americas'},
{'title': 'World Bank announces...',
'link': 'https://www.theguardian.com/global-development/2025/apr/03/world-bank-multimillion-dollar-redress-killings-and-abuse-claims-tanzania-project-ruaha-national-park'},
{'title': '‘Shame’ on world leaders for...',
'link': 'https://www.theguardian.com/global-development/2025/apr/05/democratic-republic-congo-goma-shame-subhuman-neglect-displaced-civilians-norwegian-refugee-council-egeland'}]},
'source': 'The Guardian'},
{'position': 5,
'title': 'Breaking News, World News and Video from Al Jazeera',
'link': 'https://www.aljazeera.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.aljazeera.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECEMQAQ',
'displayed_link': 'https://www.aljazeera.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa0a1198d7119b4fba775a96c1ae4609f9.png',
'date': '4 hours ago',
'snippet': "Maguire's last-minute goal sends Man Utd into Europa League semifinals · 'It's just incredible': Van Dijk extends Liverpool contract until 2027.",
'snippet_highlighted_words': ["Maguire's last-minute goal sends Man Utd into Europa League semifinals"],
'sitelinks': {'inline': [{'title': 'News',
'link': 'https://www.aljazeera.com/news/'},
{'title': 'US & Canada News',
'link': 'https://www.aljazeera.com/us-canada/'},
{'title': 'Middle East News',
'link': 'https://www.aljazeera.com/middle-east/'},
{'title': 'Live', 'link': 'https://www.aljazeera.com/live'}]},
'source': 'Al Jazeera'},
{'position': 6,
'title': 'NBC News - Breaking News & Top Stories - Latest World, US ...',
'link': 'https://www.nbcnews.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.nbcnews.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDgQAQ',
'displayed_link': 'https://www.nbcnews.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaade72b784dd8cd9f7222b9918f97a84a6.png',
'snippet': 'Go to NBCNews.com for breaking news, videos, and the latest top stories in world news, business, politics, health and pop culture.',
'snippet_highlighted_words': ['breaking news, videos, and the latest top stories in world news'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.nbcnews.com/world'},
{'title': 'U.S. News', 'link': 'https://www.nbcnews.com/us-news'},
{'title': 'Latest Stories',
'link': 'https://www.nbcnews.com/latest-stories'},
{'title': 'Nightly News',
'link': 'https://www.nbcnews.com/nightly-news'}]},
'source': 'NBC News'},
{'position': 7,
'title': 'Google News',
'link': 'https://news.google.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://news.google.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDUQAQ',
'displayed_link': 'https://news.google.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa3549b89513fe8355867d18051abe5e52.png',
'date': '1 hour ago',
'snippet': "Court denies White House appeal of 'shocking' Abrego Garcia deportation case · What to know about the shooting at Florida State University · U.S. citizen released ...",
'snippet_highlighted_words': ["Court denies White House appeal of 'shocking' Abrego Garcia deportation case"],
'source': 'Google News'},
{'position': 8,
'title': 'CNN: Breaking News, Latest News and Videos',
'link': 'https://www.cnn.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.cnn.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDEQAQ',
'displayed_link': 'https://www.cnn.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa17224953f58f6bc829748cf5b3dc70f6.png',
'snippet': 'View the latest news and breaking news today for U.S., world, weather, entertainment, politics and health at CNN.com.',
'snippet_highlighted_words': ['breaking news today'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.cnn.com/world'},
{'title': 'US', 'link': 'https://www.cnn.com/us'},
{'title': 'CNN', 'link': 'https://edition.cnn.com/'},
{'title': 'Latest Videos', 'link': 'https://www.cnn.com/videos'}]},
'source': 'CNN'},
{'position': 9,
'title': 'BBC Home - Breaking News, World News, US News, Sports ...',
'link': 'https://www.bbc.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.bbc.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDsQAQ',
'displayed_link': 'https://www.bbc.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaae061d74b1e6b3f7c85913924b4a41647.png',
'date': '4 hours ago',
'snippet': 'Visit BBC for trusted reporting on the latest world and US news, sports, business, climate, innovation, culture and much more.',
'snippet_highlighted_words': ['latest world and US news'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.bbc.com/news/world'},
{'title': 'News', 'link': 'https://www.bbc.com/news'},
{'title': 'US & Canada', 'link': 'https://www.bbc.com/news/us-canada'},
{'title': 'Europe', 'link': 'https://www.bbc.com/news/world/europe'}]},
'source': 'BBC'}]
The results are fairly messy, so we'll clean them up and organize them into a Pydantic `BaseModel`, which we'll call `Article` and which will include attributes for `title`, `source`, `link`, and `snippet`.
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    source: str
    link: str
    snippet: str

    @classmethod
    def from_serpapi_result(cls, result: dict) -> "Article":
        return cls(
            title=result["title"],
            source=result["source"],
            link=result["link"],
            snippet=result["snippet"],
        )

    def __str__(self) -> str:
        return f"## {self.title} - ({self.source})\n_{self.link}_\n{self.snippet}\n"
We also define the classmethod `from_serpapi_result` to convert the raw SerpAPI results into our `Article` object, and the `__str__` method to format the object as a markdown string, which we will provide back to our LLM.
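To see exactly what the LLM will receive, we can render a made-up record. The sketch below uses a plain dataclass stand-in for the pydantic `Article` (with the same `__str__` logic) so it runs anywhere; the sample values are invented:

```python
from dataclasses import dataclass

# plain-dataclass stand-in for the pydantic Article, same __str__ logic
@dataclass
class ArticleSketch:
    title: str
    source: str
    link: str
    snippet: str

    def __str__(self) -> str:
        return f"## {self.title} - ({self.source})\n_{self.link}_\n{self.snippet}\n"

article = ArticleSketch(
    title="Example headline",
    source="Example News",
    link="https://example.com/story",
    snippet="A short summary of the story.",
)
print(article)
# ## Example headline - (Example News)
# _https://example.com/story_
# A short summary of the story.
```

The markdown heading, italicized link, and snippet give the LLM a compact, structured view of each result.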
To create a list of `Article` objects from the SerpAPI results we do:
articles = [Article.from_serpapi_result(result) for result in results["organic_results"]]
articles
Returning a list of `Article` objects:
[
Article(
title='World | Latest News & Updates',
source='BBC',
link='https://www.bbc.com/news/world',
snippet='British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".'
),
Article(
title='World news - breaking news, video, headlines and opinion',
source='CNN',
link='https://www.cnn.com/world',
snippet="Turkey begins mass trials following protests over Istanbul mayor's detention · US will abandon Ukraine peace efforts 'within days' if no progress made, Rubio ..."
),
Article(
title='World News | Latest Top Stories',
source='Reuters',
link='https://www.reuters.com/world/',
snippet='Reuters.com is your online source for the latest world news stories and current events, ensuring our readers up to date with any breaking news developments.'
),
Article(
title='Latest news from around the world',
source='The Guardian',
link='https://www.theguardian.com/world',
snippet='Most viewed in world news · British woman among four killed in Italian cable car crash · Live · US ready to abandon Ukraine peace deal if there is no progress, ...'
),
Article(
title='Breaking News, World News and Video from Al Jazeera',
source='Al Jazeera',
link='https://www.aljazeera.com/',
snippet="Maguire's last-minute goal sends Man Utd into Europa League semifinals · 'It's just incredible': Van Dijk extends Liverpool contract until 2027."
),
Article(
title='NBC News - Breaking News & Top Stories - Latest World, US ...',
source='NBC News',
link='https://www.nbcnews.com/',
snippet='Go to NBCNews.com for breaking news, videos, and the latest top stories in world news, business, politics, health and pop culture.'
),
Article(
title='Google News',
source='Google News',
link='https://news.google.com/',
snippet="Court denies White House appeal of 'shocking' Abrego Garcia deportation case · What to know about the shooting at Florida State University · U.S. citizen released ..."
),
Article(
title='CNN: Breaking News, Latest News and Videos',
source='CNN',
link='https://www.cnn.com/',
snippet='View the latest news and breaking news today for U.S., world, weather, entertainment, politics and health at CNN.com.'
),
Article(
title='BBC Home - Breaking News, World News, US News, Sports ...',
source='BBC',
link='https://www.bbc.com/',
snippet='Visit BBC for trusted reporting on the latest world and US news, sports, business, climate, innovation, culture and much more.'
)
]
Let's display one of those in markdown:
from IPython.display import Markdown, display
display(Markdown(str(articles[0])))
Giving us this:
### World | Latest News & Updates - (BBC)
[https://www.bbc.com/news/world](https://www.bbc.com/news/world) British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".
Finally, we can refactor all of this into a single async function that our LLM can call:
async def web_search(query: str) -> str:
    """Use this function to search the web for information. Provide natural language to the
    query with as much context as possible to get the best results.
    """
    params = {
        "api_key": SERPAPI_API_KEY,
        "engine": "google",
        "q": query
    }
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "https://serpapi.com/search",
            params=params
        ) as response:
            results = await response.json()
    articles = [Article.from_serpapi_result(result) for result in results["organic_results"]]
    # join the articles into a single markdown string for the LLM
    return "\n".join(str(article) for article in articles)
Our LLM doesn't call this function directly. Instead, given a set of function schemas, the LLM decides which functions/tools to call and which arguments to provide. To generate these schemas we use the `get_schemas` function from `graphai-lib`:
from graphai.utils import get_schemas
tools = get_schemas([web_search])
This gives us a list of function schemas in tools
which look like this:
[{'type': 'function',
'function': {'name': 'web_search',
'description': 'Use this function to search the web for information. Provide natural language to the\nquery with as much context as possible to get the best results.',
'parameters': {'type': 'object',
'properties': {'query': {'description': None, 'type': 'string'}},
'required': ['query']}}}]
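The `get_schemas` helper builds these for us, but it's worth seeing roughly how such a schema can be derived from a function's signature and docstring. The sketch below is a simplified illustration using only the standard library, not graphai's actual implementation:

```python
import inspect

def sketch_schema(fn) -> dict:
    """Derive a minimal OpenAI-style function schema from a Python function."""
    py_to_json = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": py_to_json.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the LLM must supply it
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn),
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

# redeclared minimally here so the sketch is self-contained
async def web_search(query: str) -> str:
    """Use this function to search the web for information."""
    ...

schema = sketch_schema(web_search)
print(schema["function"]["name"])        # web_search
print(schema["function"]["parameters"])  # {'type': 'object', 'properties': {'query': {'type': 'string'}}, 'required': ['query']}
```

Note that the real `get_schemas` output above also includes a per-parameter `description` field.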
We then execute our query, passing our schemas to the `tools` parameter like so:
query = {"role": "user", "content": "tell me about the latest world news"}

response = completion(
    model=OAI_MODEL,
    messages=[query],
    tools=tools,
    tool_choice="auto",
    api_key="sk-some-api-key",
    base_url="http://localhost:1234/v1",
)
(
    response.choices[0].message.tool_calls[0].function.name,
    response.choices[0].message.tool_calls[0].function.arguments,
)
Returning:
('web_search', '{"query":"tell me about the latest world news"}')
Our LLM has generated the tool choice and the input parameters for our tool, but it has not executed the tool; we must handle that ourselves. To do so, we create a mapping from tool names to their functions:
tool_map = {
    "web_search": web_search
    # when using multiple tools, we would add them here
}
Now we execute the tool. Note that `function.arguments` is a JSON string, so we parse it into keyword arguments before calling the function:

import json

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
tool_out = await tool_map[tool_call.function.name](**args)
We then format this and the initial tool call from our LLM into messages, and feed them back into our LLM for a final response.
tool_call = {
    "role": "assistant",
    "content": response.choices[0].message.content,
    "tool_calls": response.choices[0].message.tool_calls,
}
tool_exec = {
    "role": "tool",
    "content": tool_out,
    "tool_call_id": response.choices[0].message.tool_calls[0].id,
}
We feed this into our LLM:
messages = [query, tool_call, tool_exec]
response = completion(
model=OAI_MODEL,
messages=messages,
tool_choice="auto",
api_key="sk-some-api-key",
base_url="http://localhost:1234/v1",
tools=tools
)
response.choices[0]
Giving us:
Choices(
message=Message(
content='Here are some of the latest world news headlines from various sources:\n\n1. **Trump Tariffs Live Coverage** - Yahoo News has been following updates on Trump\'s tariffs.\n \n2. **Israel-Hamas Conflict** - AP News provides coverage of the ongoing developments in the Israel-Hamas conflict.\n\n3. **Seiko Anniversary Watch Release** - Men\'s Journal covers Seiko\'s new anniversary watch release as part of world news.\n\n4. **Good Friday 2025 & JEE Mains Result** - Livemint reports on upcoming events and exams in India.\n\n5. **Cross LOC Surgical Strike Preparation by Indian Army** - The Express Tribune discusses military preparedness, specifically focusing on cross Line of Control (LOC) operations.\n\n6. **Australian Man\'s World Record Attempt** - Rayo news mentions an Australian man who attempted a 73-hour world record but faced an unexpected issue.\n\n7. **Trump on \'What is a Woman\'** - Hindustan Times reports that Trump evoked laughter with quips about defining "woman," then shifted to more serious topics.\n\nThese headlines cover various aspects of global events, from political developments and military actions to cultural happenings around the world. For detailed information, you can visit the respective news sources mentioned above.',
role='assistant',
tool_calls=None,
function_call=None,
provider_specific_fields={'refusal': None}
),
finish_reason='stop',
index=0
)
Building a Simple Agent
We can wrap all of this up into some easier-to-use agentic logic that keeps track of the conversation, executes tools when needed, and so on:
import json

class Agent:
    def __init__(self, tools):
        self.tools = tools
        self.schemas = get_schemas(tools)
        self.mapping = {fn.__name__: fn for fn in tools}
        self.messages = [{"role": "system", "content": "You are a helpful assistant."}]

    async def __call__(self, query: str):
        self.messages.append({"role": "user", "content": query})
        for _ in range(3):  # cap tool-use iterations to avoid endless loops
            resp = await acompletion(
                model=OAI_MODEL,
                messages=self.messages,
                tools=self.schemas,
                api_key="sk-dummy-key",
                base_url="http://localhost:1234/v1"
            )
            choice = resp.choices[0].message
            self.messages.append(choice)
            if choice.tool_calls:
                fn = self.mapping[choice.tool_calls[0].function.name]
                args = json.loads(choice.tool_calls[0].function.arguments)
                tool_out = await fn(**args)
                self.messages.append({
                    "role": "tool",
                    "content": tool_out,
                    "tool_call_id": choice.tool_calls[0].id
                })
            else:
                break
        return self.messages[-1]["content"]
Usage:
agent = Agent([web_search])
out = await agent("What's going on in the world today?")
print(out)
Sample output:
Based on the latest world news, here are some major headlines:
1. **Ukraine Peace Talks**: The US has indicated it will "move on" from Ukraine peace talks if no progress is made soon.
2. **Turkey and Protests**: Turkey has begun mass trials following protests over the detention of Istanbul's mayor.
3. **US Deportation Case**: There have been developments in a case where an American was mistakenly deported to El Salvador, with a US senator meeting the individual.
4. **Sudan Situation**: The UK is holding a conference on Sudan as part of ongoing efforts regarding that country's situation.
5. **South Africa Kidnapping**: A US pastor was kidnapped during a sermon in South Africa but was later rescued after a shootout.
6. **Criminal Investigation in Germany**: Prosecutors in Berlin are investigating a doctor who allegedly killed palliative care patients and set fire to some of their homes.
These stories represent just some of the major international developments happening right now. For more detailed coverage, you can visit news websites like BBC, CNN, Reuters, The Guardian, or NBC News.
Wrapping Up
We just:
- Ran Cogito V1 locally using LM Studio
- Built a completion and async streaming pipeline
- Wired up tool calling
- Created a reusable async agent
The best part? We're running everything locally.
Now swap in your favorite quantized models (Mistral, LLaMA, etc.) and extend with more tools, memory, and reasoning flows.