Updated on April 22, 2025

Building a Fully Local LLM Agent with LM Studio and Cogito V1

State-of-the-Art (SotA) LLMs are no longer limited to running on big AI superclusters, locked away behind an API key. Open-source and open-weight LLMs, together with tools such as LM Studio and LiteLLM, let us build entirely locally.

In this post, we'll show you how to do exactly that using the excellent Cogito V1 models, LM Studio, and a dash of tool-calling magic to build your own async-capable local agent.

By the end of this tutorial, you'll have:

  • A local LLM server running Cogito V1
  • A working dev environment with async streaming completions
  • Tool/function calling support (e.g., live web search)
  • An agent abstraction for iterative reasoning

Getting Started: LM Studio + Cogito V1

The first step is to grab LM Studio, a local LLM runner that's dead simple to use. Download it here.

Next, go to the Discover tab inside LM Studio and search for cogito. Pick the model cogito-v1-preview-qwen-32b and hit that green Download button.

Once the download is done:

  • Switch to the Server tab
  • Start the server on port 1234
  • Load your chosen model

You should now be able to query the LLM locally at http://localhost:1234/v1.


Confirm the Server Is Running

Run this quick check in your terminal:

bash
curl http://localhost:1234/v1/models

You should see a list of available models, including cogito-v1-preview-qwen-32b:

json
{
"data": [
{
"id": "cogito-v1-preview-qwen-32b",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "cogito-v1-preview-llama-70b",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "unsloth/llama-4-scout-17b-16e-instruct",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "lmstudio-community/llama-4-scout-17b-16e-instruct",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "text-embedding-nomic-embed-text-v1.5",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "mistral-small-3.1-24b-instruct-2503",
"object": "model",
"owned_by": "organization_owner"
}
],
"object": "list"
}
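
The same check can be scripted. Below is a minimal sketch using only the Python standard library, assuming the server is reachable at the address used in this post. The helper names `list_model_ids` and `fetch_models` are illustrative and not part of LM Studio or LiteLLM. The example runs against a sample payload shaped like the output above, so it needs no running server:

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:1234/v1"  # LM Studio server address used in this post


def list_model_ids(payload: dict) -> list[str]:
    """Extract model IDs from a /v1/models response body."""
    return [model["id"] for model in payload.get("data", [])]


def fetch_models(base_url: str = BASE_URL) -> dict:
    """Fetch and decode the /v1/models listing (requires the server to be running)."""
    with urlopen(f"{base_url}/models", timeout=5) as resp:
        return json.load(resp)


# Sample payload shaped like the output above (no network needed)
sample = {
    "data": [
        {"id": "cogito-v1-preview-qwen-32b", "object": "model"},
        {"id": "text-embedding-nomic-embed-text-v1.5", "object": "model"},
    ],
    "object": "list",
}
print("cogito-v1-preview-qwen-32b" in list_model_ids(sample))  # True
```

With the server running, `list_model_ids(fetch_models())` should return the IDs of the models you have downloaded.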

Set Up Your Project

Clone the repo from the Aurelio Labs Cookbook:

bash
git clone https://github.com/aurelio-labs/cookbook.git
cd cookbook/gen_ai/local_lm_studio

Use uv to manage your Python environment:

bash
brew install uv # Mac only
uv venv --python 3.12.7
source .venv/bin/activate
uv sync

You now have a ready-to-go environment with all dependencies installed, including litellm and graphai.


First Completion Call

This code snippet initializes the LiteLLM client, sets the environment variable for the local server, and sends a basic prompt to the LLM:

python
from litellm import completion
import os

MODEL = "lm_studio/cogito-v1-preview-qwen-32b"
os.environ["LM_STUDIO_API_BASE"] = "http://localhost:1234/v1"

response = completion(
    model=MODEL,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_key="sk-dummy-key",  # placeholder key
)
print(response)

From this, we will get a ModelResponse object, which includes the returned assistant message:

python
ModelResponse(
    model='lm_studio/cogito-v1-preview-qwen-32b',
    choices=[Choices(
        message=Message(
            role='assistant',
            content="I'm doing well, thank you for asking! How can I help you today?"
        )
    )]
)

We access the content with:

python
response.choices[0].message.content

Async Streaming with LiteLLM

Most applications benefit from async code, which scales better, and from streaming, which lets us build more responsive, user-friendly interfaces. For async we use the acompletion function from LiteLLM, and we stream the tokens by setting stream=True.

We can then parse and print each token as our LLM generates it like so:

python
from litellm import acompletion

response = await acompletion(
    model=MODEL,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    stream=True,
    api_key="sk-dummy-key",
)

async for chunk in response:
    # delta.content can be None (e.g. on the final chunk), so skip empty tokens
    if (token := chunk.choices[0].delta.content):
        print(token, end="", flush=True)

Once streaming is complete, we should see a full response:

text
I'm doing well, thank you for asking! How can I help you today?
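
If you need the full text after streaming (for example, to append it to a message history), you can accumulate the tokens as they arrive. The sketch below assumes each chunk exposes `choices[0].delta.content`, as in the loop above. `collect_stream` and `fake_stream` are hypothetical helpers, and `fake_stream` stands in for the real LiteLLM stream so that the example runs offline:

```python
import asyncio
from types import SimpleNamespace


async def collect_stream(stream) -> str:
    """Concatenate the text deltas from an async stream of chunks."""
    parts = []
    async for chunk in stream:
        token = chunk.choices[0].delta.content
        if token:  # skip None or empty deltas
            parts.append(token)
    return "".join(parts)


async def fake_stream(tokens):
    """Stand-in for a streamed response; yields chunks with the same attribute path."""
    for token in tokens:
        delta = SimpleNamespace(content=token)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


full_text = asyncio.run(collect_stream(fake_stream(["I'm ", "doing ", None, "well!"])))
print(full_text)  # I'm doing well!
```

In a notebook, where an event loop is already running, replace `asyncio.run(...)` with a plain `await`.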

Tool Calls and Function Calling

Now let's enable function calling (aka tool use). Not all models support tool use, and LiteLLM provides the supports_function_calling function to check whether a given LLM does. However, this check isn't particularly reliable for LM Studio models: for the Cogito v1 models it returns False:

python
from litellm import supports_function_calling

supports_function_calling(MODEL)

Returns:

python
False

Cogito v1 does support tool use, so LiteLLM is wrong here. However, we do need to change how we call the model. For tool use with Cogito v1 on LM Studio, we treat the server as an OpenAI-compatible endpoint so that LiteLLM sends it OpenAI-standard requests. To do this, we replace our lm_studio prefix with openai:

python
OAI_MODEL = MODEL.replace("lm_studio/", "openai/")

Then pass the base_url to tell LiteLLM to call our LM Studio endpoint (http://localhost:1234/v1) rather than the default OpenAI endpoint (https://api.openai.com/v1):

python
response = completion(
    model=OAI_MODEL,
    messages=[{"role": "user", "content": "Hello!"}],
    api_key="sk-dummy-key",
    base_url="http://localhost:1234/v1",
)

Expected response:

python
ModelResponse(
    model='openai/cogito-v1-preview-qwen-32b',
    choices=[Choices(
        message=Message(
            role='assistant',
            content="I'm doing well, thank you! How are you today?"
        )
    )]
)

Add Web Search as a Tool

Now that we've put together our completion and proxy calls, let's define a tool. We'll use SerpAPI to build a simple web search tool. We do need an API key for this, but it comes with 100 free calls per month.

We'll also be using aiohttp to make the HTTP request asynchronously, keeping our full LLM and tool execution pipeline asynchronous. To call SerpAPI we do:

python
from getpass import getpass
import aiohttp

SERPAPI_API_KEY = getpass("Enter your SerpAPI API key: ")

# define our search parameters
params = {
    "api_key": SERPAPI_API_KEY,
    "engine": "google",
    "q": "latest world news",
}

async with aiohttp.ClientSession() as session:
    async with session.get(
        "https://serpapi.com/search",
        params=params,
    ) as response:
        results = await response.json()

results["organic_results"]

This returns a list of records, each with a title, link, snippet, and source, among other fields:

python
[{'position': 1,
'title': 'World | Latest News & Updates',
'link': 'https://www.bbc.com/news/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.bbc.com/news/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCAQAQ',
'displayed_link': 'https://www.bbc.com › news › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaac23c298b281c232418cab8d216c47a55.png',
'date': '5 hours ago',
'snippet': 'British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".',
'snippet_highlighted_words': ['British couple killed in cable car crash'],
'sitelinks': {'inline': [{'title': 'BBC World News',
'link': 'https://www.bbc.com/news/world_radio_and_tv'},
{'title': 'Europe', 'link': 'https://www.bbc.com/news/world/europe'},
{'title': 'Africa', 'link': 'https://www.bbc.com/news/world/africa'},
{'title': 'Middle East',
'link': 'https://www.bbc.com/news/world/middle_east'}]},
'source': 'BBC'},
{'position': 2,
'title': 'World news - breaking news, video, headlines and opinion',
'link': 'https://www.cnn.com/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.cnn.com/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCMQAQ',
'displayed_link': 'https://www.cnn.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa6c4f5045df1314fc3a0e874d127b2c2e.png',
'date': '5 hours ago',
'snippet': "Turkey begins mass trials following protests over Istanbul mayor's detention · US will abandon Ukraine peace efforts 'within days' if no progress made, Rubio ...",
'snippet_highlighted_words': ["Turkey begins mass trials following protests over Istanbul mayor's detention"],
'sitelinks': {'inline': [{'title': 'Impact Your World',
'link': 'https://edition.cnn.com/world/impact-your-world'},
{'title': 'US is ‘destroying’ world order...',
'link': 'https://www.cnn.com/2025/03/06/europe/us-world-order-ukraine-zaluzhnyi-intl/index.html'},
{'title': 'Europe', 'link': 'https://www.cnn.com/world/europe'},
{'title': 'Americas', 'link': 'https://www.cnn.com/world/americas'}]},
'source': 'CNN'},
{'position': 3,
'title': 'World News | Latest Top Stories',
'link': 'https://www.reuters.com/world/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.reuters.com/world/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECCQQAQ',
'displayed_link': 'https://www.reuters.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa1f9c74f3e3216f8467771437cd888e82.png',
'date': '3 hours ago',
'snippet': 'Reuters.com is your online source for the latest world news stories and current events, ensuring our readers up to date with any breaking news developments.',
'snippet_highlighted_words': ['Reuters.com is your online source for the latest world news stories'],
'sitelinks': {'inline': [{'title': 'Europe',
'link': 'https://www.reuters.com/world/europe/'},
{'title': 'Econ World',
'link': 'https://www.reuters.com/markets/econ-world/'},
{'title': 'United States', 'link': 'https://www.reuters.com/world/us/'},
{'title': 'China', 'link': 'https://www.reuters.com/world/china/'}]},
'source': 'Reuters'},
{'position': 4,
'title': 'Latest news from around the world',
'link': 'https://www.theguardian.com/world',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.theguardian.com/world&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDIQAQ',
'displayed_link': 'https://www.theguardian.com › world',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa48b2ea00a209294a3ba58a436d46087f.png',
'date': '5 hours ago',
'snippet': 'Most viewed in world news · British woman among four killed in Italian cable car crash · Live · US ready to abandon Ukraine peace deal if there is no progress, ...',
'snippet_highlighted_words': ['British woman among four killed in Italian cable car crash'],
'sitelinks': {'inline': [{'title': 'Europe',
'link': 'https://www.theguardian.com/world/europe-news'},
{'title': 'Americas',
'link': 'https://www.theguardian.com/world/americas'},
{'title': 'World Bank announces...',
'link': 'https://www.theguardian.com/global-development/2025/apr/03/world-bank-multimillion-dollar-redress-killings-and-abuse-claims-tanzania-project-ruaha-national-park'},
{'title': '‘Shame’ on world leaders for...',
'link': 'https://www.theguardian.com/global-development/2025/apr/05/democratic-republic-congo-goma-shame-subhuman-neglect-displaced-civilians-norwegian-refugee-council-egeland'}]},
'source': 'The Guardian'},
{'position': 5,
'title': 'Breaking News, World News and Video from Al Jazeera',
'link': 'https://www.aljazeera.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.aljazeera.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECEMQAQ',
'displayed_link': 'https://www.aljazeera.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa0a1198d7119b4fba775a96c1ae4609f9.png',
'date': '4 hours ago',
'snippet': "Maguire's last-minute goal sends Man Utd into Europa League semifinals · 'It's just incredible': Van Dijk extends Liverpool contract until 2027.",
'snippet_highlighted_words': ["Maguire's last-minute goal sends Man Utd into Europa League semifinals"],
'sitelinks': {'inline': [{'title': 'News',
'link': 'https://www.aljazeera.com/news/'},
{'title': 'US & Canada News',
'link': 'https://www.aljazeera.com/us-canada/'},
{'title': 'Middle East News',
'link': 'https://www.aljazeera.com/middle-east/'},
{'title': 'Live', 'link': 'https://www.aljazeera.com/live'}]},
'source': 'Al Jazeera'},
{'position': 6,
'title': 'NBC News - Breaking News & Top Stories - Latest World, US ...',
'link': 'https://www.nbcnews.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.nbcnews.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDgQAQ',
'displayed_link': 'https://www.nbcnews.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaade72b784dd8cd9f7222b9918f97a84a6.png',
'snippet': 'Go to NBCNews.com for breaking news, videos, and the latest top stories in world news, business, politics, health and pop culture.',
'snippet_highlighted_words': ['breaking news, videos, and the latest top stories in world news'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.nbcnews.com/world'},
{'title': 'U.S. News', 'link': 'https://www.nbcnews.com/us-news'},
{'title': 'Latest Stories',
'link': 'https://www.nbcnews.com/latest-stories'},
{'title': 'Nightly News',
'link': 'https://www.nbcnews.com/nightly-news'}]},
'source': 'NBC News'},
{'position': 7,
'title': 'Google News',
'link': 'https://news.google.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://news.google.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDUQAQ',
'displayed_link': 'https://news.google.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa3549b89513fe8355867d18051abe5e52.png',
'date': '1 hour ago',
'snippet': "Court denies White House appeal of 'shocking' Abrego Garcia deportation case · What to know about the shooting at Florida State University · U.S. citizen released ...",
'snippet_highlighted_words': ["Court denies White House appeal of 'shocking' Abrego Garcia deportation case"],
'source': 'Google News'},
{'position': 8,
'title': 'CNN: Breaking News, Latest News and Videos',
'link': 'https://www.cnn.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.cnn.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDEQAQ',
'displayed_link': 'https://www.cnn.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaa17224953f58f6bc829748cf5b3dc70f6.png',
'snippet': 'View the latest news and breaking news today for U.S., world, weather, entertainment, politics and health at CNN.com.',
'snippet_highlighted_words': ['breaking news today'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.cnn.com/world'},
{'title': 'US', 'link': 'https://www.cnn.com/us'},
{'title': 'CNN', 'link': 'https://edition.cnn.com/'},
{'title': 'Latest Videos', 'link': 'https://www.cnn.com/videos'}]},
'source': 'CNN'},
{'position': 9,
'title': 'BBC Home - Breaking News, World News, US News, Sports ...',
'link': 'https://www.bbc.com/',
'redirect_link': 'https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.bbc.com/&ved=2ahUKEwjBiILP6eGMAxXMFVkFHbyQDvgQFnoECDsQAQ',
'displayed_link': 'https://www.bbc.com',
'favicon': 'https://serpapi.com/searches/6802660d7e6fc2987a91e511/images/16c8f2982b167f589c032bebd52ffcaae061d74b1e6b3f7c85913924b4a41647.png',
'date': '4 hours ago',
'snippet': 'Visit BBC for trusted reporting on the latest world and US news, sports, business, climate, innovation, culture and much more.',
'snippet_highlighted_words': ['latest world and US news'],
'sitelinks': {'inline': [{'title': 'World',
'link': 'https://www.bbc.com/news/world'},
{'title': 'News', 'link': 'https://www.bbc.com/news'},
{'title': 'US & Canada', 'link': 'https://www.bbc.com/news/us-canada'},
{'title': 'Europe', 'link': 'https://www.bbc.com/news/world/europe'}]},
'source': 'BBC'}]

The results are fairly messy so we can clean them up and organize them into a Pydantic BaseModel, which we'll call Article and will include attributes for title, source, link, and snippet.

python
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    source: str
    link: str
    snippet: str

    @classmethod
    def from_serpapi_result(cls, result: dict) -> "Article":
        # keep only the fields we need from the raw SerpAPI record
        return cls(
            title=result["title"],
            source=result["source"],
            link=result["link"],
            snippet=result["snippet"],
        )

    def __str__(self) -> str:
        # markdown representation that we pass back to the LLM
        return f"## {self.title} - ({self.source})\n_{self.link}_\n{self.snippet}\n"

We also define the classmethod from_serpapi_result to convert the raw SerpAPI results into our Article object, and the __str__ method to format the object as a markdown string which we will provide back to our LLM.
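
Search results are not guaranteed to include every field, and indexing a missing key raises a KeyError. One possible safeguard, sketched here with plain dictionaries so it runs without Pydantic, is to select the four fields with a default before constructing the object. `FIELDS` and `pick_fields` are hypothetical names introduced for illustration:

```python
FIELDS = ("title", "source", "link", "snippet")


def pick_fields(result: dict, default: str = "") -> dict:
    """Keep only the four Article fields, filling any missing or empty one with a default."""
    return {name: result.get(name) or default for name in FIELDS}


# A record with no snippet and some extra fields we do not need
raw = {
    "position": 6,
    "title": "NBC News",
    "link": "https://www.nbcnews.com/",
    "source": "NBC News",
}
cleaned = pick_fields(raw)
print(cleaned["snippet"] == "")  # True
```

The cleaned dictionary could then be passed on as `Article(**pick_fields(result))`.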

To create a list of Article objects from the SerpAPI results we do:

python
articles = [Article.from_serpapi_result(result) for result in results["organic_results"]]
articles

Returning a list of Article objects:

python
[
Article(
title='World | Latest News & Updates',
source='BBC',
link='https://www.bbc.com/news/world',
snippet='British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".'
),
Article(
title='World news - breaking news, video, headlines and opinion',
source='CNN',
link='https://www.cnn.com/world',
snippet="Turkey begins mass trials following protests over Istanbul mayor's detention · US will abandon Ukraine peace efforts 'within days' if no progress made, Rubio ..."
),
Article(
title='World News | Latest Top Stories',
source='Reuters',
link='https://www.reuters.com/world/',
snippet='Reuters.com is your online source for the latest world news stories and current events, ensuring our readers up to date with any breaking news developments.'
),
Article(
title='Latest news from around the world',
source='The Guardian',
link='https://www.theguardian.com/world',
snippet='Most viewed in world news · British woman among four killed in Italian cable car crash · Live · US ready to abandon Ukraine peace deal if there is no progress, ...'
),
Article(
title='Breaking News, World News and Video from Al Jazeera',
source='Al Jazeera',
link='https://www.aljazeera.com/',
snippet="Maguire's last-minute goal sends Man Utd into Europa League semifinals · 'It's just incredible': Van Dijk extends Liverpool contract until 2027."
),
Article(
title='NBC News - Breaking News & Top Stories - Latest World, US ...',
source='NBC News',
link='https://www.nbcnews.com/',
snippet='Go to NBCNews.com for breaking news, videos, and the latest top stories in world news, business, politics, health and pop culture.'
),
Article(
title='Google News',
source='Google News',
link='https://news.google.com/',
snippet="Court denies White House appeal of 'shocking' Abrego Garcia deportation case · What to know about the shooting at Florida State University · U.S. citizen released ..."
),
Article(
title='CNN: Breaking News, Latest News and Videos',
source='CNN',
link='https://www.cnn.com/',
snippet='View the latest news and breaking news today for U.S., world, weather, entertainment, politics and health at CNN.com.'
),
Article(
title='BBC Home - Breaking News, World News, US News, Sports ...',
source='BBC',
link='https://www.bbc.com/',
snippet='Visit BBC for trusted reporting on the latest world and US news, sports, business, climate, innovation, culture and much more.'
)
]

Let's display one of those in markdown:

python
from IPython.display import Markdown, display

display(Markdown(str(articles[0])))

Giving us this:

markdown
### World | Latest News & Updates - (BBC)

[https://www.bbc.com/news/world](https://www.bbc.com/news/world) British couple killed in cable car crash, Italian police say. Four people died in the incident at Mount Faito, while another was "extremely seriously injured".

Finally, we can refactor all of this into a single async function that our LLM can call:

python
async def web_search(query: str) -> str:
    """Use this function to search the web for information. Provide natural language to the
    query with as much context as possible to get the best results.
    """
    params = {
        "api_key": SERPAPI_API_KEY,
        "engine": "google",
        "q": query,
    }

    async with aiohttp.ClientSession() as session:
        async with session.get(
            "https://serpapi.com/search",
            params=params,
        ) as response:
            results = await response.json()

    articles = [Article.from_serpapi_result(result) for result in results["organic_results"]]
    # return markdown text rather than objects, since the output goes back to the LLM
    return "\n".join(str(article) for article in articles)
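
A failed search (for example, an invalid key or an exhausted quota) will not contain organic_results, so the lookup above would raise a KeyError. The sketch below shows one way to guard against that. It assumes that error responses carry an `error` field in the JSON body, which is worth confirming against the SerpAPI documentation. `extract_organic` is a hypothetical helper:

```python
def extract_organic(results: dict) -> list[dict]:
    """Return the organic results, or raise if the response reports an error."""
    if "error" in results:  # assumed shape of a failed response
        raise RuntimeError(f"Search failed: {results['error']}")
    # a successful response with no matches may simply omit the field
    return results.get("organic_results", [])


ok = {"organic_results": [{"title": "World | Latest News & Updates"}]}
print(len(extract_organic(ok)))  # 1
print(extract_organic({}))  # []
```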

Our LLM doesn't call this function directly. Instead, we give it a set of function schemas, and it decides which functions (tools) to call and which arguments to provide. To generate these schemas we use the get_schemas function from graphai-lib:

python
from graphai.utils import get_schemas

tools = get_schemas([web_search])

This gives us a list of function schemas in tools which look like this:

python
[{'type': 'function',
'function': {'name': 'web_search',
'description': 'Use this function to search the web for information. Provide natural language to the\nquery with as much context as possible to get the best results.',
'parameters': {'type': 'object',
'properties': {'query': {'description': None, 'type': 'string'}},
'required': ['query']}}}]
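
To make the schema less of a black box, here is a rough sketch of how one could be derived from a function's signature and docstring using the standard library. This is not how graphai implements get_schemas. `sketch_schema`, `JSON_TYPES`, and `demo_search` are hypothetical names, and the type mapping covers only a few primitive types:

```python
import inspect

# Hypothetical mapping from Python annotations to JSON Schema type names
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(fn) -> dict:
    """Build an OpenAI-style function schema from a signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # unknown or missing annotations fall back to "string" in this sketch
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is required
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


async def demo_search(query: str, max_results: int = 5) -> str:
    """Search the web for information."""
    return ""


schema = sketch_schema(demo_search)
print(schema["function"]["parameters"]["required"])  # ['query']
```

The docstring becomes the description the LLM sees, which is why the docstring of web_search is written as an instruction to the model.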

We then execute our query with the tools passed to the tools parameter like so:

python
query = {"role": "user", "content": "tell me about the latest world news"}

response = completion(
    model=OAI_MODEL,
    messages=[query],
    tools=tools,
    tool_choice="auto",
    api_key="sk-some-api-key",
    base_url="http://localhost:1234/v1",
)

(
    response.choices[0].message.tool_calls[0].function.name,
    response.choices[0].message.tool_calls[0].function.arguments,
)

Returning:

python
('web_search', '{"query":"tell me about the latest world news"}')

Our LLM has generated the tool choice and input parameters, but it has not executed the tool. We must handle that ourselves. To do so, we create a mapping from tool names to their functions:

python
tool_map = {
    "web_search": web_search,
    # when using multiple tools, we would add them here
}

Now we execute the tool like so:

python
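import json

# the arguments arrive as a JSON string, so decode them into keyword arguments
requested = response.choices[0].message.tool_calls[0]
tool_args = json.loads(requested.function.arguments)
tool_out = await tool_map[requested.function.name](**tool_args)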
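
The arguments field is a JSON string, and a model can occasionally emit malformed JSON or name a tool that doesn't exist. A small dispatch helper can absorb both cases and report the problem as text that can be fed back to the model. This is a sketch under those assumptions. `run_tool_call` and `echo_search` are hypothetical, and the stand-in objects only mimic the attribute layout of a tool call:

```python
import asyncio
import json
from types import SimpleNamespace


async def run_tool_call(tool_call, tool_map: dict) -> str:
    """Look up the requested tool, decode its JSON arguments, and await it."""
    name = tool_call.function.name
    if name not in tool_map:
        return f"Error: unknown tool '{name}'"
    try:
        kwargs = json.loads(tool_call.function.arguments or "{}")
    except json.JSONDecodeError as exc:
        return f"Error: could not parse arguments ({exc.msg})"
    return await tool_map[name](**kwargs)


async def echo_search(query: str) -> str:
    """Stand-in tool so the sketch runs without network access."""
    return f"results for: {query}"


fake_call = SimpleNamespace(
    id="call_0",
    function=SimpleNamespace(name="web_search", arguments='{"query": "latest world news"}'),
)
tool_result = asyncio.run(run_tool_call(fake_call, {"web_search": echo_search}))
print(tool_result)  # results for: latest world news
```

Returning the error as a string, rather than raising, gives the model a chance to correct its own call on the next turn.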

We then format this and the initial tool call from our LLM into messages, and feed them back into our LLM for a final response.

python
llm_message = response.choices[0].message
tool_call_id = llm_message.tool_calls[0].id

# the assistant message that requested the tool
tool_call = {
    "role": "assistant",
    "content": llm_message.content,
    "tool_calls": llm_message.tool_calls,
    "tool_call_id": tool_call_id,
}
# the tool result, linked to the request by tool_call_id
tool_exec = {"role": "tool", "content": tool_out, "tool_call_id": tool_call_id}

We feed this into our LLM:

python
messages = [query, tool_call, tool_exec]

response = completion(
    model=OAI_MODEL,
    messages=messages,
    tool_choice="auto",
    api_key="sk-some-api-key",
    base_url="http://localhost:1234/v1",
    tools=tools,
)
response.choices[0]

Giving us:

python
Choices(
message=Message(
content='Here are some of the latest world news headlines from various sources:\n\n1. **Trump Tariffs Live Coverage** - Yahoo News has been following updates on Trump\'s tariffs.\n \n2. **Israel-Hamas Conflict** - AP News provides coverage of the ongoing developments in the Israel-Hamas conflict.\n\n3. **Seiko Anniversary Watch Release** - Men\'s Journal covers Seiko\'s new anniversary watch release as part of world news.\n\n4. **Good Friday 2025 & JEE Mains Result** - Livemint reports on upcoming events and exams in India.\n\n5. **Cross LOC Surgical Strike Preparation by Indian Army** - The Express Tribune discusses military preparedness, specifically focusing on cross Line of Control (LOC) operations.\n\n6. **Australian Man\'s World Record Attempt** - Rayo news mentions an Australian man who attempted a 73-hour world record but faced an unexpected issue.\n\n7. **Trump on \'What is a Woman\'** - Hindustan Times reports that Trump evoked laughter with quips about defining "woman," then shifted to more serious topics.\n\nThese headlines cover various aspects of global events, from political developments and military actions to cultural happenings around the world. For detailed information, you can visit the respective news sources mentioned above.',
role='assistant',
tool_calls=None,
function_call=None,
provider_specific_fields={'refusal': None}
),
finish_reason='stop',
index=0
)

Building a Simple Agent

We can wrap all of this into some simple agent logic that keeps track of the conversation and executes tools when needed, like so:

python
import json

class Agent:
    def __init__(self, tools):
        self.tools = tools
        self.schemas = get_schemas(tools)
        self.mapping = {fn.__name__: fn for fn in tools}
        self.messages = [{"role": "system", "content": "You are a helpful assistant."}]

    async def __call__(self, query: str):
        self.messages.append({"role": "user", "content": query})
        # allow up to three LLM turns so the model can call a tool and then answer
        for _ in range(3):
            resp = await acompletion(
                model=OAI_MODEL,
                messages=self.messages,
                tools=self.schemas,
                api_key="sk-dummy-key",
                base_url="http://localhost:1234/v1",
            )
            choice = resp.choices[0].message
            self.messages.append(choice)
            if not choice.tool_calls:
                break
            # execute every requested tool and feed each result back to the LLM
            for call in choice.tool_calls:
                fn = self.mapping[call.function.name]
                args = json.loads(call.function.arguments)
                tool_out = await fn(**args)
                self.messages.append({"role": "tool", "content": tool_out, "tool_call_id": call.id})
        return self.messages[-1]["content"]

Usage:

python
agent = Agent([web_search])
out = await agent("What's going on in the world today?")
print(out)

Sample output:

markdown
Based on the latest world news, here are some major headlines:

1. **Ukraine Peace Talks**: The US has indicated it will "move on" from Ukraine peace talks if no progress is made soon.

2. **Turkey and Protests**: Turkey has begun mass trials following protests over the detention of Istanbul's mayor.

3. **US Deportation Case**: There have been developments in a case where an American was mistakenly deported to El Salvador, with a US senator meeting the individual.

4. **Sudan Situation**: The UK is holding a conference on Sudan as part of ongoing efforts regarding that country's situation.

5. **South Africa Kidnapping**: A US pastor was kidnapped during a sermon in South Africa but was later rescued after a shootout.

6. **Criminal Investigation in Germany**: Prosecutors in Berlin are investigating a doctor who allegedly killed palliative care patients and set fire to some of their homes.

These stories represent just some of the major international developments happening right now. For more detailed coverage, you can visit news websites like BBC, CNN, Reuters, The Guardian, or NBC News.
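
When a model requests several tool calls in a single turn, they can be run concurrently instead of one at a time. The sketch below uses asyncio.gather for this. `run_tool_calls` and `fake_search` are hypothetical, and the stand-in call objects only mimic the `id`, `function.name`, and `function.arguments` attributes used earlier:

```python
import asyncio
import json
from types import SimpleNamespace


async def run_tool_calls(tool_calls, mapping: dict) -> list[dict]:
    """Run every tool call concurrently and return one tool message per call."""

    async def run_one(call) -> dict:
        fn = mapping[call.function.name]
        args = json.loads(call.function.arguments)
        content = await fn(**args)
        return {"role": "tool", "content": str(content), "tool_call_id": call.id}

    # gather preserves input order, so the messages line up with the tool calls
    return await asyncio.gather(*(run_one(call) for call in tool_calls))


async def fake_search(query: str) -> str:
    """Stand-in tool that simulates a short network delay."""
    await asyncio.sleep(0.01)
    return f"results for: {query}"


calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(name="web_search", arguments='{"query": "ukraine"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(name="web_search", arguments='{"query": "sudan"}')),
]
tool_messages = asyncio.run(run_tool_calls(calls, {"web_search": fake_search}))
print([m["tool_call_id"] for m in tool_messages])  # ['call_1', 'call_2']
```

Each returned dictionary has the same shape as the tool messages appended in the agent, so the list can be added to the message history with `extend`.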

Wrapping Up

We just:

  • Ran Cogito V1 locally using LM Studio
  • Built a completion and async streaming pipeline
  • Wired up tool calling
  • Created a reusable async agent

The best part? We're running everything locally.

Now swap in your favorite quantized models (Mistral, LLaMA, etc.) and extend with more tools, memory, and reasoning flows.