Archon V6 - Tool/Example/MCP Library

This commit is contained in:
Cole Medin
2025-04-01 07:21:25 -05:00
parent 3c51aa297b
commit 2802b549bd
84 changed files with 6967 additions and 17 deletions

View File

@@ -6,14 +6,12 @@
<h3>🚀 **CURRENT VERSION** 🚀</h3>
**[ V5 - Multi-Agent Coding Workflow ]**
*Specialized agents for different parts of the agent creation process*
**[ V6 - Tool Library and MCP Integration ]**
*Prebuilt tools, examples, and MCP server integration*
</div>
> **🔄 IMPORTANT UPDATE (March 20th)**: Archon now uses a multi-agent workflow with specialized refiner agents for autonomous prompt, tools, and agent definition improvements. The primary coding agent still creates the initial agent by itself, but you can then say 'refine' (or something along those lines) as a follow-up prompt to kick off the specialized agents in parallel.
> **🔄 IMPORTANT UPDATE (March 31st)**: Archon now includes a library of prebuilt tools, examples, and MCP server integrations. Archon can now incorporate these resources when building new agents, significantly enhancing capabilities and reducing hallucinations. Note that the examples/tool library for Archon is just starting out. Please feel free to contribute examples, MCP servers, and prebuilt tools!
Archon is the world's first **"Agenteer"**, an AI agent designed to autonomously build, refine, and optimize other AI agents.
@@ -24,7 +22,7 @@ Through its iterative development, Archon showcases the power of planning, feedb
## Important Links
- The current version of Archon is V5 as mentioned above - see [V5 Documentation](iterations/v5-parallel-specialized-agents/README.md) for details.
- The current version of Archon is V6 as mentioned above - see [V6 Documentation](iterations/v6-tool-library-integration/README.md) for details.
- I **just** created the [Archon community](https://thinktank.ottomator.ai/c/archon/30) forum over in the oTTomator Think Tank! Please post any questions you have there!
@@ -38,9 +36,11 @@ Archon demonstrates three key principles in modern AI development:
2. **Domain Knowledge Integration**: Seamless embedding of frameworks like Pydantic AI and LangGraph within autonomous workflows
3. **Scalable Architecture**: Modular design supporting maintainability, cost optimization, and ethical AI practices
## Getting Started with V5 (current version)
## Getting Started with V6 (current version)
Since V5 is the current version of Archon, all the code for V5 is in both the main directory and `archon/iterations/v5-parallel-specialized-agents` directory.
Since V6 is the current version of Archon, all the code for V6 is in both the main directory and `archon/iterations/v6-tool-library-integration` directory.
Note that the examples/tool library for Archon is just starting out. Please feel free to contribute examples, MCP servers, and prebuilt tools!
### Prerequisites
- Docker (optional but preferred)
@@ -179,7 +179,7 @@ This ensures you're always running the most recent version of Archon with all th
- MCP configuration through the UI
- [Learn more about V4](iterations/v4-streamlit-ui-overhaul/README.md)
### V5: Current - Multi-Agent Coding Workflow
### V5: Multi-Agent Coding Workflow
- Specialized refiner agents for autonomously improving different parts of the initially generated agent
- Prompt refiner agent for optimizing system prompts
- Tools refiner agent for specialized tool implementation
@@ -188,8 +188,17 @@ This ensures you're always running the most recent version of Archon with all th
- Improved workflow orchestration with LangGraph
- [Learn more about V5](iterations/v5-parallel-specialized-agents/README.md)
### V6: Current - Tool Library and MCP Integration
- Comprehensive library of prebuilt tools, examples, and agent templates
- Integration with MCP servers, providing access to a vast collection of prebuilt tools
- Advisor agent that recommends relevant tools and examples based on user requirements
- Automatic incorporation of prebuilt components into new agents
- Specialized tools refiner agent also validates and optimizes MCP server configurations
- Streamlined access to external services through MCP integration
- Reduced development time through component reuse
- [Learn more about V6](iterations/v6-tool-library-integration/README.md)
### Future Iterations
- V6: Tool Library and Example Integration - Pre-built external tool and agent examples incorporation
- V7: LangGraph Documentation - Allow Archon to build Pydantic AI AND LangGraph agents
- V8: Self-Feedback Loop - Automated validation and error correction
- V9: Self Agent Execution - Testing and iterating on agents in an isolated environment
@@ -313,3 +322,4 @@ For version-specific details:
- [V3 Documentation](iterations/v3-mcp-support/README.md)
- [V4 Documentation](iterations/v4-streamlit-ui-overhaul/README.md)
- [V5 Documentation](iterations/v5-parallel-specialized-agents/README.md)
- [V6 Documentation](iterations/v6-tool-library-integration/README.md)

View File

@@ -0,0 +1,173 @@
from __future__ import annotations as _annotations
import asyncio
import os
from dataclasses import dataclass
from typing import Any, List, Dict
import tempfile
from pathlib import Path
from dotenv import load_dotenv
import shutil
import time
import re
import json
import httpx
import logfire
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel
from devtools import debug
load_dotenv()
llm = os.getenv('LLM_MODEL', 'deepseek/deepseek-chat')
model = OpenAIModel(
llm,
provider=OpenAIProvider(base_url="https://openrouter.ai/api/v1", api_key=os.getenv('OPEN_ROUTER_API_KEY'))
) if os.getenv('OPEN_ROUTER_API_KEY', None) else OpenAIModel(llm)
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class GitHubDeps:
client: httpx.AsyncClient
github_token: str | None = None
system_prompt = """
You are a coding expert with access to GitHub to help the user manage their repository and get information from it.
Your only job is to assist with this and you don't answer other questions besides describing what you are able to do.
Don't ask the user before taking an action, just do it. Always make sure you look at the repository with the provided tools before answering the user's question unless you have already.
When answering a question about the repo, always start your answer with the full repo URL in brackets and then give your answer on a newline. Like:
[Using https://github.com/[repo URL from the user]]
Your answer here...
"""
github_agent = Agent(
model,
system_prompt=system_prompt,
deps_type=GitHubDeps,
retries=2
)
@github_agent.tool
async def get_repo_info(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get repository information including size and description using GitHub API.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Repository information as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository info: {response.text}"
data = response.json()
size_mb = data['size'] / 1024
return (
f"Repository: {data['full_name']}\n"
f"Description: {data['description']}\n"
f"Size: {size_mb:.1f}MB\n"
f"Stars: {data['stargazers_count']}\n"
f"Language: {data['language']}\n"
f"Created: {data['created_at']}\n"
f"Last Updated: {data['updated_at']}"
)
@github_agent.tool
async def get_repo_structure(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get the directory structure of a GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Directory structure as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/main?recursive=1',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/master?recursive=1',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository structure: {response.text}"
data = response.json()
tree = data['tree']
# Build directory structure
structure = []
for item in tree:
if not any(excluded in item['path'] for excluded in ['.git/', 'node_modules/', '__pycache__/']):
structure.append(f"{'📁 ' if item['type'] == 'tree' else '📄 '}{item['path']}")
return "\n".join(structure)
@github_agent.tool
async def get_file_content(ctx: RunContext[GitHubDeps], github_url: str, file_path: str) -> str:
"""Get the content of a specific file from the GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
file_path: Path to the file within the repository.
Returns:
str: File content as a string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/main/{file_path}',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/master/{file_path}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get file content: {response.text}"
return response.text

View File

@@ -0,0 +1,33 @@
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.mcp import MCPServerStdio
from pydantic_ai import Agent
from dotenv import load_dotenv
import asyncio
import os
load_dotenv()
def get_model():
llm = os.getenv('MODEL_CHOICE', 'gpt-4o-mini')
base_url = os.getenv('BASE_URL', 'https://api.openai.com/v1')
api_key = os.getenv('LLM_API_KEY', 'no-api-key-provided')
return OpenAIModel(llm, provider=OpenAIProvider(base_url=base_url, api_key=api_key))
server = MCPServerStdio(
'npx',
['-y', '@modelcontextprotocol/server-brave-search', 'stdio'],
env={"BRAVE_API_KEY": os.getenv("BRAVE_API_KEY")}
)
agent = Agent(get_model(), mcp_servers=[server])
async def main():
async with agent.run_mcp_servers():
result = await agent.run('What is new with Gemini 2.5 Pro?')
print(result.data)
user_input = input("Press enter to quit...")
if __name__ == '__main__':
asyncio.run(main())

View File

@@ -0,0 +1,110 @@
from __future__ import annotations as _annotations
import asyncio
import os
from dataclasses import dataclass
from datetime import datetime
from typing import Any
import logfire
from devtools import debug
from httpx import AsyncClient
from dotenv import load_dotenv
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai import Agent, ModelRetry, RunContext
load_dotenv()
llm = os.getenv('LLM_MODEL', 'gpt-4o')
client = AsyncOpenAI(
base_url = 'http://localhost:11434/v1',
api_key='ollama'
)
model = OpenAIModel(llm) if llm.lower().startswith("gpt") else OpenAIModel(llm, openai_client=client)
# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class Deps:
client: AsyncClient
brave_api_key: str | None
web_search_agent = Agent(
model,
system_prompt=f'You are an expert at researching the web to answer user questions. The current date is: {datetime.now().strftime("%Y-%m-%d")}',
deps_type=Deps,
retries=2
)
@web_search_agent.tool
async def search_web(
ctx: RunContext[Deps], web_query: str
) -> str:
"""Search the web given a query defined to answer the user's question.
Args:
ctx: The context.
web_query: The query for the web search.
Returns:
str: The search results as a formatted string.
"""
if ctx.deps.brave_api_key is None:
return "This is a test web search result. Please provide a Brave API key to get real search results."
headers = {
'X-Subscription-Token': ctx.deps.brave_api_key,
'Accept': 'application/json',
}
with logfire.span('calling Brave search API', query=web_query) as span:
r = await ctx.deps.client.get(
'https://api.search.brave.com/res/v1/web/search',
params={
'q': web_query,
'count': 5,
'text_decorations': True,
'search_lang': 'en'
},
headers=headers
)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
results = []
# Add web results in a nice formatted way
web_results = data.get('web', {}).get('results', [])
for item in web_results[:3]:
title = item.get('title', '')
description = item.get('description', '')
url = item.get('url', '')
if title and description:
results.append(f"Title: {title}\nSummary: {description}\nSource: {url}\n")
return "\n".join(results) if results else "No results found for the query."
async def main():
async with AsyncClient() as client:
brave_api_key = os.getenv('BRAVE_API_KEY', None)
deps = Deps(client=client, brave_api_key=brave_api_key)
result = await web_search_agent.run(
'Give me some articles talking about the new release of React 19.', deps=deps
)
debug(result)
print('Response:', result.data)
if __name__ == '__main__':
asyncio.run(main())

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"airtable": {
"command": "npx",
"args": [
"-y",
"airtable-mcp-server"
],
"env": {
"AIRTABLE_API_KEY": "pat123.abc123"
}
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-brave-search"
],
"env": {
"BRAVE_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"chroma": {
"command": "uvx",
"args": [
"chroma-mcp",
"--client-type",
"persistent",
"--data-dir",
"/full/path/to/your/data/directory"
]
}
}
}

View File

@@ -0,0 +1,13 @@
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-filesystem",
"/Users/username/Desktop",
"/path/to/other/allowed/dir"
]
}
}
}

View File

@@ -0,0 +1,19 @@
{
"mcpServers": {
"mcp-server-firecrawl": {
"command": "npx",
"args": ["-y", "firecrawl-mcp"],
"env": {
"FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",
"FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
"FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
"FIRECRAWL_RETRY_MAX_DELAY": "30000",
"FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",
"FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
"FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
}
}
}
}

View File

@@ -0,0 +1,16 @@
{
"mcpServers": {
"git": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"--mount", "type=bind,src=/Users/username/Desktop,dst=/projects/Desktop",
"--mount", "type=bind,src=/path/to/other/allowed/dir,dst=/projects/other/allowed/dir,ro",
"--mount", "type=bind,src=/path/to/file.txt,dst=/projects/path/to/file.txt",
"mcp/git"
]
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"github": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-github"
],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "<YOUR_TOKEN>"
}
}
}
}

View File

@@ -0,0 +1,8 @@
{
"mcpServers": {
"gdrive": {
"command": "docker",
"args": ["run", "-i", "--rm", "-v", "mcp-gdrive:/gdrive-server", "-e", "GDRIVE_CREDENTIALS_PATH=/gdrive-server/credentials.json", "mcp/gdrive"]
}
}
}

View File

@@ -0,0 +1,12 @@
{
"qdrant": {
"command": "uvx",
"args": ["mcp-server-qdrant"],
"env": {
"QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
"QDRANT_API_KEY": "your_api_key",
"COLLECTION_NAME": "your-collection-name",
"EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
}
}
}

View File

@@ -0,0 +1,12 @@
{
"mcpServers": {
"redis": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-redis",
"redis://localhost:6379"
]
}
}
}

View File

@@ -0,0 +1,15 @@
{
"mcpServers": {
"slack": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-slack"
],
"env": {
"SLACK_BOT_TOKEN": "xoxb-your-bot-token",
"SLACK_TEAM_ID": "T01234567"
}
}
}
}

View File

@@ -0,0 +1,17 @@
{
"mcpServers": {
"sqlite": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"-v",
"mcp-test:/mcp",
"mcp/sqlite",
"--db-path",
"/mcp/test.db"
]
}
}
}

View File

@@ -0,0 +1,34 @@
@github_agent.tool
async def get_file_content(ctx: RunContext[GitHubDeps], github_url: str, file_path: str) -> str:
"""Get the content of a specific file from the GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
file_path: Path to the file within the repository.
Returns:
str: File content as a string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/main/{file_path}',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/master/{file_path}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get file content: {response.text}"
return response.text

View File

@@ -0,0 +1,42 @@
@github_agent.tool
async def get_repo_structure(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get the directory structure of a GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Directory structure as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/main?recursive=1',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/master?recursive=1',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository structure: {response.text}"
data = response.json()
tree = data['tree']
# Build directory structure
structure = []
for item in tree:
if not any(excluded in item['path'] for excluded in ['.git/', 'node_modules/', '__pycache__/']):
structure.append(f"{'📁 ' if item['type'] == 'tree' else '📄 '}{item['path']}")
return "\n".join(structure)

View File

@@ -0,0 +1,38 @@
@github_agent.tool
async def get_repo_info(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get repository information including size and description using GitHub API.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Repository information as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository info: {response.text}"
data = response.json()
size_mb = data['size'] / 1024
return (
f"Repository: {data['full_name']}\n"
f"Description: {data['description']}\n"
f"Size: {size_mb:.1f}MB\n"
f"Stars: {data['stargazers_count']}\n"
f"Language: {data['language']}\n"
f"Created: {data['created_at']}\n"
f"Last Updated: {data['updated_at']}"
)

View File

@@ -0,0 +1,48 @@
@web_search_agent.tool
async def search_web(
ctx: RunContext[Deps], web_query: str
) -> str:
"""Search the web given a query defined to answer the user's question.
Args:
ctx: The context.
web_query: The query for the web search.
Returns:
str: The search results as a formatted string.
"""
if ctx.deps.brave_api_key is None:
return "This is a test web search result. Please provide a Brave API key to get real search results."
headers = {
'X-Subscription-Token': ctx.deps.brave_api_key,
'Accept': 'application/json',
}
with logfire.span('calling Brave search API', query=web_query) as span:
r = await ctx.deps.client.get(
'https://api.search.brave.com/res/v1/web/search',
params={
'q': web_query,
'count': 5,
'text_decorations': True,
'search_lang': 'en'
},
headers=headers
)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
results = []
# Add web results in a nice formatted way
web_results = data.get('web', {}).get('results', [])
for item in web_results[:3]:
title = item.get('title', '')
description = item.get('description', '')
url = item.get('url', '')
if title and description:
results.append(f"Title: {title}\nSummary: {description}\nSource: {url}\n")
return "\n".join(results) if results else "No results found for the query."

archon/advisor_agent.py
View File

@@ -0,0 +1,67 @@
from __future__ import annotations as _annotations
from dataclasses import dataclass
from dotenv import load_dotenv
import logfire
import asyncio
import httpx
import os
import sys
import json
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from openai import AsyncOpenAI
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var
from archon.agent_prompts import advisor_prompt
from archon.agent_tools import get_file_content_tool
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class AdvisorDeps:
file_list: List[str]
advisor_agent = Agent(
model,
system_prompt=advisor_prompt,
deps_type=AdvisorDeps,
retries=2
)
@advisor_agent.system_prompt
def add_file_list(ctx: RunContext[AdvisorDeps]) -> str:
return f"""
\n\nHere is the list of all the files that you can pull the contents of with the
'get_file_content' tool if the example/tool/MCP server is relevant to the
agent the user is trying to build:\n
{"\n".join(ctx.deps.file_list)}
"""
@advisor_agent.tool_plain
def get_file_content(file_path: str) -> str:
"""
Retrieves the content of a specific file. Use this to get the contents of an example, tool, config for an MCP server
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
return get_file_content_tool(file_path)

View File

@@ -1,3 +1,60 @@
advisor_prompt = """
You are an AI agent engineer specialized in using example code and prebuilt tools/MCP servers
and synthesizing these prebuilt components into a recommended starting point for the primary coding agent.
You will be given a prompt from the user for the AI agent they want to build, along with a list of examples,
prebuilt tools, and MCP servers you can use to aid in creating the agent so that as little code as possible
has to be written from scratch.
Use the file name to determine if the example/tool/MCP server is relevant to the agent the user is requesting.
Examples will be in the examples/ folder. These are examples of AI agents to use as a starting point if applicable.
Prebuilt tools will be in the tools/ folder. Use some or none of these depending on if any of the prebuilt tools
would be needed for the agent.
MCP servers will be in the mcps/ folder. These are all config files that show the necessary parameters to set up each
server. MCP servers are just pre-packaged tools that you can include in the agent.
Take a look at examples/pydantic_mcp_agent.py to see how to incorporate MCP servers into the agents.
For example, if the Brave Search MCP config is:
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-brave-search"
],
"env": {
"BRAVE_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Then the way to connect that into the agent is:
server = MCPServerStdio(
'npx',
['-y', '@modelcontextprotocol/server-brave-search', 'stdio'],
env={"BRAVE_API_KEY": os.getenv("BRAVE_API_KEY")}
)
agent = Agent(get_model(), mcp_servers=[server])
So you can see how you would map the config parameters to the MCPServerStdio instantiation.
You are given a single tool to look at the contents of any file, so call this as many times as you need to look
at the different files given to you that you think are relevant for the AI agent being created.
IMPORTANT: Only look at a few examples/tools/servers. Keep your search concise.
Your primary job at the end of looking at examples/tools/MCP servers is to provide a recommendation for a starting
point of an AI agent that uses applicable resources you pulled. Only focus on the examples/tools/servers that
are actually relevant to the AI agent the user requested.
"""
prompt_refiner_prompt = """
You are an AI agent engineer specialized in refining prompts for the agents.
@@ -18,8 +75,10 @@ Output the new prompt and nothing else.
tools_refiner_prompt = """
You are an AI agent engineer specialized in refining tools for the agents.
You have comprehensive access to the Pydantic AI documentation, including API references, usage guides, and implementation examples.
You also have access to a list of files mentioned below that give you examples, prebuilt tools, and MCP servers
you can reference when validating the tools and MCP servers given to the current agent.
Your only job is to take the current tools from the conversation, and refine them so the agent being created
Your only job is to take the current tools/MCP servers from the conversation, and refine them so the agent being created
has the optimal tooling to fulfill its role and tasks. Also make sure the tools are coded properly
and allow the agent to solve the problems they are meant to help with.
@@ -31,9 +90,16 @@ For each tool, ensure that it:
4. Is coded properly (uses API calls correctly for the services, returns the correct data, etc.)
5. Handles errors properly
Only change what is necessary to refine the tools, don't go overboard unless of course the tools are broken and need a lot of fixing.
For each MCP server:
Output the new code for the tools and nothing else.
1. Get the contents of the JSON config for the server
2. Make sure the name of the server and arguments match what is in the config
3. Make sure the correct environment variables are used
Only change what is necessary to refine the tools and MCP server definitions, don't go overboard
unless of course the tools are broken and need a lot of fixing.
Output the new code for the tools/MCP servers and nothing else.
"""
agent_refiner_prompt = """

View File

@@ -121,3 +121,21 @@ async def get_page_content_tool(supabase: Client, url: str) -> str:
except Exception as e:
print(f"Error retrieving page content: {e}")
return f"Error retrieving page content: {str(e)}"
def get_file_content_tool(file_path: str) -> str:
"""
Retrieves the content of a specific file. Use this to get the contents of an example, tool, config for an MCP server
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
try:
with open(file_path, "r") as file:
file_contents = file.read()
return file_contents
except Exception as e:
print(f"Error retrieving file contents: {e}")
return f"Error retrieving file contents: {str(e)}"

View File

@@ -22,6 +22,7 @@ from pydantic_ai.messages import (
# Add the parent directory to Python path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from archon.pydantic_ai_coder import pydantic_ai_coder, PydanticAIDeps
from archon.advisor_agent import advisor_agent, AdvisorDeps
from archon.refiner_agents.prompt_refiner_agent import prompt_refiner_agent
from archon.refiner_agents.tools_refiner_agent import tools_refiner_agent, ToolsRefinerDeps
from archon.refiner_agents.agent_refiner_agent import agent_refiner_agent, AgentRefinerDeps
@@ -71,6 +72,8 @@ class AgentState(TypedDict):
messages: Annotated[List[bytes], lambda x, y: x + y]
scope: str
advisor_output: str
file_list: List[str]
refined_prompt: str
refined_tools: str
@@ -113,13 +116,40 @@ async def define_scope_with_reasoner(state: AgentState):
return {"scope": scope}
# Advisor agent - create a starting point based on examples and prebuilt tools/MCP servers
async def advisor_with_examples(state: AgentState):
# Get the directory one level up from the current file (archon_graph.py)
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
# The agent-resources folder sits alongside the archon folder, in the parent directory of archon_graph.py
agent_resources_dir = os.path.join(parent_dir, "agent-resources")
# Get a list of all files in the agent-resources directory and its subdirectories
file_list = []
for root, dirs, files in os.walk(agent_resources_dir):
for file in files:
# Get the full path to the file
file_path = os.path.join(root, file)
# Use the full path instead of relative path
file_list.append(file_path)
# Then, prompt the advisor with the list of files it can use for examples and tools
deps = AdvisorDeps(file_list=file_list)
result = await advisor_agent.run(state['latest_user_message'], deps=deps)
advisor_output = result.data
return {"file_list": file_list, "advisor_output": advisor_output}
# Coding Node with Feedback Handling
async def coder_agent(state: AgentState, writer):
# Prepare dependencies
deps = PydanticAIDeps(
supabase=supabase,
embedding_client=embedding_client,
reasoner_output=state['scope']
reasoner_output=state['scope'],
advisor_output=state['advisor_output']
)
# Get the message history into the format for Pydantic AI
@@ -219,7 +249,8 @@ async def refine_tools(state: AgentState):
# Prepare dependencies
deps = ToolsRefinerDeps(
supabase=supabase,
embedding_client=embedding_client
embedding_client=embedding_client,
file_list=state['file_list']
)
# Get the message history into the format for Pydantic AI
@@ -282,6 +313,7 @@ builder = StateGraph(AgentState)
# Add nodes
builder.add_node("define_scope_with_reasoner", define_scope_with_reasoner)
builder.add_node("advisor_with_examples", advisor_with_examples)
builder.add_node("coder_agent", coder_agent)
builder.add_node("get_next_user_message", get_next_user_message)
builder.add_node("refine_prompt", refine_prompt)
@@ -291,7 +323,9 @@ builder.add_node("finish_conversation", finish_conversation)
# Set edges
builder.add_edge(START, "define_scope_with_reasoner")
builder.add_edge(START, "advisor_with_examples")
builder.add_edge("define_scope_with_reasoner", "coder_agent")
builder.add_edge("advisor_with_examples", "coder_agent")
builder.add_edge("coder_agent", "get_next_user_message")
builder.add_conditional_edges(
"get_next_user_message",

View File

@@ -42,6 +42,7 @@ class PydanticAIDeps:
supabase: Client
embedding_client: AsyncOpenAI
reasoner_output: str
advisor_output: str
pydantic_ai_coder = Agent(
model,
@@ -56,6 +57,9 @@ def add_reasoner_output(ctx: RunContext[str]) -> str:
\n\nAdditional thoughts/instructions from the reasoner LLM.
This scope includes documentation pages for you to search as well:
{ctx.deps.reasoner_output}
Recommended starting point from the advisor agent:
{ctx.deps.advisor_output}
"""
@pydantic_ai_coder.tool

View File

@@ -23,7 +23,8 @@ from archon.agent_prompts import tools_refiner_prompt
from archon.agent_tools import (
retrieve_relevant_documentation_tool,
list_documentation_pages_tool,
get_page_content_tool
get_page_content_tool,
get_file_content_tool
)
load_dotenv()
@@ -42,6 +43,7 @@ logfire.configure(send_to_logfire='if-token-present')
class ToolsRefinerDeps:
supabase: Client
embedding_client: AsyncOpenAI
file_list: List[str]
tools_refiner_agent = Agent(
model,
@@ -50,6 +52,16 @@ tools_refiner_agent = Agent(
retries=2
)
@tools_refiner_agent.system_prompt
def add_file_list(ctx: RunContext[ToolsRefinerDeps]) -> str:
return f"""
\n\nHere is the list of all the files that you can pull the contents of with the
'get_file_content' tool if the example/tool/MCP server is relevant to the
agent the user is trying to build:\n
{"\n".join(ctx.deps.file_list)}
"""
@tools_refiner_agent.tool
async def retrieve_relevant_documentation(ctx: RunContext[ToolsRefinerDeps], query: str) -> str:
"""
@@ -89,4 +101,17 @@ async def get_page_content(ctx: RunContext[ToolsRefinerDeps], url: str) -> str:
Returns:
str: The complete page content with all chunks combined in order
"""
return await get_page_content_tool(ctx.deps.supabase, url)
return await get_page_content_tool(ctx.deps.supabase, url)
@tools_refiner_agent.tool_plain
def get_file_content(file_path: str) -> str:
"""
Retrieves the content of a specific file. Use this to get the contents of an example, tool, config for an MCP server
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
return get_file_content_tool(file_path)

View File

@@ -0,0 +1,38 @@
# Ignore specified folders
iterations/
venv/
.langgraph_api/
.github/
__pycache__/
.env
# Git related
.git/
.gitignore
.gitattributes
# Python cache
*.pyc
*.pyo
*.pyd
.Python
*.so
.pytest_cache/
# Environment files
.env.local
.env.development.local
.env.test.local
.env.production.local
# Logs
*.log
# IDE specific files
.idea/
.vscode/
*.swp
*.swo
# Keep the example env file for reference
!.env.example

View File

@@ -0,0 +1,43 @@
# Base URL for the OpenAI instance (default is https://api.openai.com/v1)
# OpenAI: https://api.openai.com/v1
# Ollama (example): http://localhost:11434/v1
# OpenRouter: https://openrouter.ai/api/v1
# Anthropic: https://api.anthropic.com/v1
BASE_URL=
# For OpenAI: https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key
# For Anthropic: https://console.anthropic.com/account/keys
# For OpenRouter: https://openrouter.ai/keys
# For Ollama, no need to set this unless you specifically configured an API key
LLM_API_KEY=
# Get your OpenAI API key by following these instructions -
# https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key
# Even if using Anthropic or OpenRouter, you still need to set this for the embedding model.
# No need to set this if using Ollama.
OPENAI_API_KEY=
# For the Supabase version (sample_supabase_agent.py), set your Supabase URL and Service Key.
# Get your SUPABASE_URL from the API section of your Supabase project settings -
# https://supabase.com/dashboard/project/<your project ID>/settings/api
SUPABASE_URL=
# Get your SUPABASE_SERVICE_KEY from the API section of your Supabase project settings -
# https://supabase.com/dashboard/project/<your project ID>/settings/api
# On this page it is called the service_role secret.
SUPABASE_SERVICE_KEY=
# The LLM you want to use for the reasoner (o3-mini, R1, QwQ, etc.).
# Example: o3-mini
# Example: deepseek-r1:7b-8k
REASONER_MODEL=
# The LLM you want to use for the primary agent/coder.
# Example: gpt-4o-mini
# Example: qwen2.5:14b-instruct-8k
PRIMARY_MODEL=
# Embedding model you want to use
# Example for Ollama: nomic-embed-text
# Example for OpenAI: text-embedding-3-small
EMBEDDING_MODEL=

View File

@@ -0,0 +1,28 @@
FROM python:3.12-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements first for better caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application
COPY . .
# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app
# Expose port for Streamlit
EXPOSE 8501
# Expose port for the Archon Service (started within Streamlit)
EXPOSE 8100
# Set the entrypoint to run Streamlit directly
CMD ["streamlit", "run", "streamlit_ui.py", "--server.port=8501", "--server.address=0.0.0.0"]

View File

@@ -0,0 +1,151 @@
# Archon V6 - Tool Library and MCP Integration
This is the sixth iteration of the Archon project, building upon V5 by implementing a comprehensive library of prebuilt tools, examples, and MCP server integrations. The system retains the multi-agent coding workflow with specialized refiner agents from V5, but now adds a powerful advisor agent that can recommend and incorporate prebuilt components.
What makes V6 special is its approach to reducing development time and hallucinations through component reuse. The advisor agent can now analyze the user's requirements and recommend relevant prebuilt tools, examples, and MCP server integrations from the agent-resources library. This significantly enhances Archon's capabilities and reduces the need to create everything from scratch.
The main enhancements in V6 are:
1. **Advisor Agent**: Recommends relevant prebuilt tools, examples, and MCP servers
2. **Tools Refiner Agent**: Now also validates and optimizes MCP server configurations
3. **Prebuilt Components Library**: Growing collection of reusable agent components
The core remains an intelligent multi-agent system built using Pydantic AI, LangGraph, and Supabase, but now with access to a library of prebuilt components that can be incorporated into new agents.
## Key Features
- **Prebuilt Tools Library**: Collection of ready-to-use tools for common agent tasks
- **Example Agents**: Reference implementations that can be adapted for new agents
- **MCP Server Integrations**: Preconfigured connections to various external services
- **Advisor Agent**: Recommends relevant prebuilt components based on requirements
- **Enhanced Tools Refiner**: Validates and optimizes MCP server configurations
- **Component Reuse**: Significantly reduces development time and hallucinations
- **Multi-Agent Workflow**: Retains the specialized refiner agents from V5
- **Streamlined External Access**: Easy integration with various services through MCP
## Architecture
The V6 architecture enhances the V5 workflow with prebuilt component integration:
1. **Initial Request**: User describes the AI agent they want to create
2. **Scope Definition**: Reasoner LLM creates a high-level scope for the agent
3. **Component Recommendation**: Advisor agent analyzes requirements and recommends relevant prebuilt components
4. **Initial Agent Creation**: Primary coding agent creates a cohesive initial agent, incorporating recommended components
5. **User Interaction**: User can provide feedback or request refinement
6. **Specialized Refinement**: When "refine" is requested, three specialized agents work in parallel:
- Prompt Refiner Agent optimizes the system prompt
- Tools Refiner Agent improves the agent's tools and validates MCP configurations
- Agent Refiner Agent enhances the agent configuration
7. **Integrated Improvements**: Primary coding agent incorporates all refinements
8. **Iterative Process**: Steps 5-7 repeat until the user is satisfied
9. **Finalization**: Archon provides the complete code with execution instructions
### Agent Graph
The LangGraph workflow orchestrates the entire process:
![Archon Graph](../../public/ArchonGraph.png)
The graph shows how control flows between different agents and how the advisor agent now contributes to the initial agent creation process.
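For readers who prefer code to a diagram, below is a minimal, self-contained sketch of the parallel fan-out added in V6. Node and edge names mirror `archon/archon_graph.py` in this commit, but the node bodies are replaced with stubs, so treat it as an illustration rather than the actual implementation.
```python
# Sketch of the V6 fan-out: the reasoner and the advisor both start from
# START, and the coder only runs once both have written to the shared state.
from typing import List, TypedDict

from langgraph.graph import StateGraph, START, END


class AgentState(TypedDict):
    latest_user_message: str
    scope: str
    advisor_output: str
    file_list: List[str]


async def define_scope_with_reasoner(state: AgentState):
    # Stub: the real node calls the reasoner LLM to produce a scope document
    return {"scope": "...high-level scope..."}


async def advisor_with_examples(state: AgentState):
    # Stub: the real node walks agent-resources/ and runs the advisor agent
    return {"file_list": [], "advisor_output": "...recommended prebuilt components..."}


async def coder_agent(state: AgentState):
    # Both state["scope"] and state["advisor_output"] are available here
    return {}


builder = StateGraph(AgentState)
builder.add_node("define_scope_with_reasoner", define_scope_with_reasoner)
builder.add_node("advisor_with_examples", advisor_with_examples)
builder.add_node("coder_agent", coder_agent)

builder.add_edge(START, "define_scope_with_reasoner")
builder.add_edge(START, "advisor_with_examples")
builder.add_edge("define_scope_with_reasoner", "coder_agent")
builder.add_edge("advisor_with_examples", "coder_agent")
builder.add_edge("coder_agent", END)

graph = builder.compile()
```
Because the two upstream nodes write different keys of the state (`scope` and `advisor_output`), LangGraph can run them concurrently and start `coder_agent` only after both complete; the real graph then continues from the coder into the user-feedback and refinement loop instead of ending.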
## Prebuilt Components
### Agent-Resources Library
- Located in the `agent-resources` directory at the project root
- Organized into three main categories, all of which are passed to the advisor agent as a flat file list (see the sketch after this list):
- `examples/`: Complete agent implementations that can be adapted
- `tools/`: Individual tools for specific tasks
- `mcps/`: Configuration files for MCP server integrations
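The advisor does not index these files anywhere; it is simply handed a flat list of their paths and pulls individual contents on demand with its `get_file_content` tool. Here is a rough sketch of how that list is assembled and passed in, mirroring `advisor_with_examples` in `archon/archon_graph.py` (the user request string is only a placeholder):
```python
# Sketch: hand the advisor agent the full list of agent-resources files.
import asyncio
import os

from archon.advisor_agent import advisor_agent, AdvisorDeps


async def main():
    # agent-resources/ sits at the project root, next to the archon/ package
    agent_resources_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "agent-resources")

    file_list = []
    for root, _dirs, files in os.walk(agent_resources_dir):
        for file in files:
            file_list.append(os.path.join(root, file))

    deps = AdvisorDeps(file_list=file_list)
    result = await advisor_agent.run(
        "Build an agent that can search the web with Brave",  # placeholder request
        deps=deps,
    )
    print(result.data)  # recommended starting point for the primary coding agent


if __name__ == "__main__":
    asyncio.run(main())
```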
### MCP Server Integrations
- Preconfigured connections to various external services
- JSON configuration files that define server capabilities
- Includes integrations for:
- Brave Search
- GitHub
- File System
- Git
- And many more (the sketch after this list shows how one of these configs is wired into an agent)
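To show how one of these configs becomes agent code, here is a sketch adapted from the Brave Search config and `examples/pydantic_mcp_agent.py` in this commit: the `command`, `args`, and `env` entries map directly onto an `MCPServerStdio` instance (the model name below is only illustrative).
```python
# Sketch: wiring the Brave Search MCP config into a Pydantic AI agent.
import asyncio
import os

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# command/args/env are lifted straight from the brave-search JSON config
server = MCPServerStdio(
    'npx',
    ['-y', '@modelcontextprotocol/server-brave-search', 'stdio'],
    env={"BRAVE_API_KEY": os.getenv("BRAVE_API_KEY")},
)

agent = Agent('openai:gpt-4o-mini', mcp_servers=[server])  # model choice is illustrative


async def main():
    # The MCP server must be running for the duration of the agent run
    async with agent.run_mcp_servers():
        result = await agent.run('What is new with Gemini 2.5 Pro?')
        print(result.data)


if __name__ == '__main__':
    asyncio.run(main())
```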
### Example Agents
- Complete agent implementations that can be used as templates
- Includes examples for:
- GitHub integration
- MCP server usage
- Web search functionality
### Prebuilt Tools
- Ready-to-use tools for common agent tasks
- Includes tools for:
- GitHub file access
- Web search
- And more
## Using the Prebuilt Components
To leverage the prebuilt components in V6:
1. Start a conversation with Archon and describe the agent you want to create
2. Archon will automatically analyze your requirements through the advisor agent
3. Relevant prebuilt components will be recommended and incorporated into your agent
4. You can request refinement to further optimize the agent
5. The tools refiner agent will validate and optimize any MCP server configurations
6. When satisfied, ask Archon to finalize the agent
## Core Files
### Advisor Components
- `archon/advisor_agent.py`: Agent that recommends relevant prebuilt components
- `agent-resources/`: Directory at the project root containing prebuilt tools, examples, and MCP configurations
### Refiner Agents
- `archon/refiner_agents/tools_refiner_agent.py`: Enhanced to validate MCP configurations
### Workflow Orchestration
- `archon/archon_graph.py`: Updated LangGraph workflow with advisor integration
## Contributing
Contributions are welcome! The prebuilt component library is just starting out, so please feel free to contribute examples, MCP servers, and prebuilt tools by submitting a Pull Request.
## Prerequisites
- Docker (optional but preferred)
- Python 3.11+
- Supabase account (for vector database)
- OpenAI/Anthropic/OpenRouter API key or Ollama for local LLMs
## Installation
### Option 1: Docker (Recommended)
1. Clone the repository:
```bash
git clone https://github.com/coleam00/archon.git
cd archon
```
2. Run the Docker setup script:
```bash
# This will build both containers and start Archon
python run_docker.py
```
3. Access the Streamlit UI at http://localhost:8501.
### Option 2: Local Python Installation
1. Clone the repository:
```bash
git clone https://github.com/coleam00/archon.git
cd archon
```
2. Install dependencies:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
3. Start the Streamlit UI:
```bash
streamlit run streamlit_ui.py
```
4. Access the Streamlit UI at http://localhost:8501.

View File

@@ -0,0 +1,173 @@
from __future__ import annotations as _annotations
import asyncio
import os
from dataclasses import dataclass
from typing import Any, List, Dict
import tempfile
from pathlib import Path
from dotenv import load_dotenv
import shutil
import time
import re
import json
import httpx
import logfire
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel
from devtools import debug
load_dotenv()
llm = os.getenv('LLM_MODEL', 'deepseek/deepseek-chat')
model = OpenAIModel(
llm,
provider=OpenAIProvider(base_url="https://openrouter.ai/api/v1", api_key=os.getenv('OPEN_ROUTER_API_KEY'))
) if os.getenv('OPEN_ROUTER_API_KEY', None) else OpenAIModel(llm)
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class GitHubDeps:
client: httpx.AsyncClient
github_token: str | None = None
system_prompt = """
You are a coding expert with access to GitHub to help the user manage their repository and get information from it.
Your only job is to assist with this and you don't answer other questions besides describing what you are able to do.
Don't ask the user before taking an action, just do it. Always make sure you look at the repository with the provided tools before answering the user's question unless you have already.
When answering a question about the repo, always start your answer with the full repo URL in brackets and then give your answer on a newline. Like:
[Using https://github.com/[repo URL from the user]]
Your answer here...
"""
github_agent = Agent(
model,
system_prompt=system_prompt,
deps_type=GitHubDeps,
retries=2
)
@github_agent.tool
async def get_repo_info(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get repository information including size and description using GitHub API.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Repository information as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository info: {response.text}"
data = response.json()
size_mb = data['size'] / 1024
return (
f"Repository: {data['full_name']}\n"
f"Description: {data['description']}\n"
f"Size: {size_mb:.1f}MB\n"
f"Stars: {data['stargazers_count']}\n"
f"Language: {data['language']}\n"
f"Created: {data['created_at']}\n"
f"Last Updated: {data['updated_at']}"
)
@github_agent.tool
async def get_repo_structure(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get the directory structure of a GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Directory structure as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/main?recursive=1',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/master?recursive=1',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository structure: {response.text}"
data = response.json()
tree = data['tree']
# Build directory structure
structure = []
for item in tree:
if not any(excluded in item['path'] for excluded in ['.git/', 'node_modules/', '__pycache__/']):
structure.append(f"{'📁 ' if item['type'] == 'tree' else '📄 '}{item['path']}")
return "\n".join(structure)
@github_agent.tool
async def get_file_content(ctx: RunContext[GitHubDeps], github_url: str, file_path: str) -> str:
"""Get the content of a specific file from the GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
file_path: Path to the file within the repository.
Returns:
str: File content as a string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/main/{file_path}',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/master/{file_path}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get file content: {response.text}"
return response.text

View File

@@ -0,0 +1,33 @@
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.mcp import MCPServerStdio
from pydantic_ai import Agent
from dotenv import load_dotenv
import asyncio
import os
load_dotenv()
def get_model():
llm = os.getenv('MODEL_CHOICE', 'gpt-4o-mini')
base_url = os.getenv('BASE_URL', 'https://api.openai.com/v1')
api_key = os.getenv('LLM_API_KEY', 'no-api-key-provided')
return OpenAIModel(llm, provider=OpenAIProvider(base_url=base_url, api_key=api_key))
server = MCPServerStdio(
'npx',
['-y', '@modelcontextprotocol/server-brave-search', 'stdio'],
env={"BRAVE_API_KEY": os.getenv("BRAVE_API_KEY")}
)
agent = Agent(get_model(), mcp_servers=[server])
async def main():
async with agent.run_mcp_servers():
result = await agent.run('What is new with Gemini 2.5 Pro?')
print(result.data)
user_input = input("Press enter to quit...")
if __name__ == '__main__':
asyncio.run(main())

View File

@@ -0,0 +1,110 @@
from __future__ import annotations as _annotations
import asyncio
import os
from dataclasses import dataclass
from datetime import datetime
from typing import Any
import logfire
from devtools import debug
from httpx import AsyncClient
from dotenv import load_dotenv
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai import Agent, ModelRetry, RunContext
load_dotenv()
llm = os.getenv('LLM_MODEL', 'gpt-4o')
client = AsyncOpenAI(
base_url = 'http://localhost:11434/v1',
api_key='ollama'
)
model = OpenAIModel(llm) if llm.lower().startswith("gpt") else OpenAIModel(llm, openai_client=client)
# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class Deps:
client: AsyncClient
brave_api_key: str | None
web_search_agent = Agent(
model,
system_prompt=f'You are an expert at researching the web to answer user questions. The current date is: {datetime.now().strftime("%Y-%m-%d")}',
deps_type=Deps,
retries=2
)
@web_search_agent.tool
async def search_web(
ctx: RunContext[Deps], web_query: str
) -> str:
"""Search the web given a query defined to answer the user's question.
Args:
ctx: The context.
web_query: The query for the web search.
Returns:
str: The search results as a formatted string.
"""
if ctx.deps.brave_api_key is None:
return "This is a test web search result. Please provide a Brave API key to get real search results."
headers = {
'X-Subscription-Token': ctx.deps.brave_api_key,
'Accept': 'application/json',
}
with logfire.span('calling Brave search API', query=web_query) as span:
r = await ctx.deps.client.get(
'https://api.search.brave.com/res/v1/web/search',
params={
'q': web_query,
'count': 5,
'text_decorations': True,
'search_lang': 'en'
},
headers=headers
)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
results = []
# Add web results in a nice formatted way
web_results = data.get('web', {}).get('results', [])
for item in web_results[:3]:
title = item.get('title', '')
description = item.get('description', '')
url = item.get('url', '')
if title and description:
results.append(f"Title: {title}\nSummary: {description}\nSource: {url}\n")
return "\n".join(results) if results else "No results found for the query."
async def main():
async with AsyncClient() as client:
brave_api_key = os.getenv('BRAVE_API_KEY', None)
deps = Deps(client=client, brave_api_key=brave_api_key)
result = await web_search_agent.run(
'Give me some articles talking about the new release of React 19.', deps=deps
)
debug(result)
print('Response:', result.data)
if __name__ == '__main__':
asyncio.run(main())

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"airtable": {
"command": "npx",
"args": [
"-y",
"airtable-mcp-server"
],
"env": {
"AIRTABLE_API_KEY": "pat123.abc123"
}
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-brave-search"
],
"env": {
"BRAVE_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"chroma": {
"command": "uvx",
"args": [
"chroma-mcp",
"--client-type",
"persistent",
"--data-dir",
"/full/path/to/your/data/directory"
]
}
}
}

View File

@@ -0,0 +1,13 @@
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-filesystem",
"/Users/username/Desktop",
"/path/to/other/allowed/dir"
]
}
}
}

View File

@@ -0,0 +1,19 @@
{
"mcpServers": {
"mcp-server-firecrawl": {
"command": "npx",
"args": ["-y", "firecrawl-mcp"],
"env": {
"FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",
"FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
"FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
"FIRECRAWL_RETRY_MAX_DELAY": "30000",
"FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",
"FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
"FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
}
}
}
}

View File

@@ -0,0 +1,16 @@
{
"mcpServers": {
"git": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"--mount", "type=bind,src=/Users/username/Desktop,dst=/projects/Desktop",
"--mount", "type=bind,src=/path/to/other/allowed/dir,dst=/projects/other/allowed/dir,ro",
"--mount", "type=bind,src=/path/to/file.txt,dst=/projects/path/to/file.txt",
"mcp/git"
]
}
}
}

View File

@@ -0,0 +1,14 @@
{
"mcpServers": {
"github": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-github"
],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "<YOUR_TOKEN>"
}
}
}
}

View File

@@ -0,0 +1,8 @@
{
"mcpServers": {
"gdrive": {
"command": "docker",
"args": ["run", "-i", "--rm", "-v", "mcp-gdrive:/gdrive-server", "-e", "GDRIVE_CREDENTIALS_PATH=/gdrive-server/credentials.json", "mcp/gdrive"]
}
}
}

View File

@@ -0,0 +1,12 @@
{
"qdrant": {
"command": "uvx",
"args": ["mcp-server-qdrant"],
"env": {
"QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
"QDRANT_API_KEY": "your_api_key",
"COLLECTION_NAME": "your-collection-name",
"EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
}
}
}

View File

@@ -0,0 +1,12 @@
{
"mcpServers": {
"redis": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-redis",
"redis://localhost:6379"
]
}
}
}

View File

@@ -0,0 +1,15 @@
{
"mcpServers": {
"slack": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-slack"
],
"env": {
"SLACK_BOT_TOKEN": "xoxb-your-bot-token",
"SLACK_TEAM_ID": "T01234567"
}
}
}
}

View File

@@ -0,0 +1,17 @@
{
"mcpServers": {
"sqlite": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"-v",
"mcp-test:/mcp",
"mcp/sqlite",
"--db-path",
"/mcp/test.db"
]
}
}
}

View File

@@ -0,0 +1,34 @@
@github_agent.tool
async def get_file_content(ctx: RunContext[GitHubDeps], github_url: str, file_path: str) -> str:
"""Get the content of a specific file from the GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
file_path: Path to the file within the repository.
Returns:
str: File content as a string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/main/{file_path}',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://raw.githubusercontent.com/{owner}/{repo}/master/{file_path}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get file content: {response.text}"
return response.text

View File

@@ -0,0 +1,42 @@
@github_agent.tool
async def get_repo_structure(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get the directory structure of a GitHub repository.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Directory structure as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/main?recursive=1',
headers=headers
)
if response.status_code != 200:
# Try with master branch if main fails
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}/git/trees/master?recursive=1',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository structure: {response.text}"
data = response.json()
tree = data['tree']
# Build directory structure
structure = []
for item in tree:
if not any(excluded in item['path'] for excluded in ['.git/', 'node_modules/', '__pycache__/']):
structure.append(f"{'📁 ' if item['type'] == 'tree' else '📄 '}{item['path']}")
return "\n".join(structure)

View File

@@ -0,0 +1,38 @@
@github_agent.tool
async def get_repo_info(ctx: RunContext[GitHubDeps], github_url: str) -> str:
"""Get repository information including size and description using GitHub API.
Args:
ctx: The context.
github_url: The GitHub repository URL.
Returns:
str: Repository information as a formatted string.
"""
match = re.search(r'github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$', github_url)
if not match:
return "Invalid GitHub URL format"
owner, repo = match.groups()
headers = {'Authorization': f'token {ctx.deps.github_token}'} if ctx.deps.github_token else {}
response = await ctx.deps.client.get(
f'https://api.github.com/repos/{owner}/{repo}',
headers=headers
)
if response.status_code != 200:
return f"Failed to get repository info: {response.text}"
data = response.json()
size_mb = data['size'] / 1024
return (
f"Repository: {data['full_name']}\n"
f"Description: {data['description']}\n"
f"Size: {size_mb:.1f}MB\n"
f"Stars: {data['stargazers_count']}\n"
f"Language: {data['language']}\n"
f"Created: {data['created_at']}\n"
f"Last Updated: {data['updated_at']}"
)
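
These prebuilt GitHub tools are decorated with `@github_agent.tool` and read `ctx.deps.client` and `ctx.deps.github_token`, but the agent and dependency definitions (and the `re` import their URL parsing relies on) live outside the snippets shown here. A minimal sketch of the assumed surroundings, with the dataclass fields inferred from the tool bodies and the model string and system prompt as placeholders:

```python
from dataclasses import dataclass

from httpx import AsyncClient
from pydantic_ai import Agent, RunContext


@dataclass
class GitHubDeps:
    client: AsyncClient               # async HTTP client the tools call
    github_token: str | None = None   # optional token for authenticated GitHub requests


github_agent = Agent(
    'openai:gpt-4o-mini',  # placeholder model identifier
    system_prompt='Answer questions about GitHub repositories using the provided tools.',
    deps_type=GitHubDeps,
    retries=2,
)
```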

View File

@@ -0,0 +1,48 @@
@web_search_agent.tool
async def search_web(
ctx: RunContext[Deps], web_query: str
) -> str:
"""Search the web given a query defined to answer the user's question.
Args:
ctx: The context.
web_query: The query for the web search.
Returns:
str: The search results as a formatted string.
"""
if ctx.deps.brave_api_key is None:
return "This is a test web search result. Please provide a Brave API key to get real search results."
headers = {
'X-Subscription-Token': ctx.deps.brave_api_key,
'Accept': 'application/json',
}
with logfire.span('calling Brave search API', query=web_query) as span:
r = await ctx.deps.client.get(
'https://api.search.brave.com/res/v1/web/search',
params={
'q': web_query,
'count': 5,
'text_decorations': True,
'search_lang': 'en'
},
headers=headers
)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
results = []
# Add web results in a nice formatted way
web_results = data.get('web', {}).get('results', [])
for item in web_results[:3]:
title = item.get('title', '')
description = item.get('description', '')
url = item.get('url', '')
if title and description:
results.append(f"Title: {title}\nSummary: {description}\nSource: {url}\n")
return "\n".join(results) if results else "No results found for the query."

View File

@@ -0,0 +1,67 @@
from __future__ import annotations as _annotations
from dataclasses import dataclass
from dotenv import load_dotenv
import logfire
import asyncio
import httpx
import os
import sys
import json
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from openai import AsyncOpenAI
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var
from archon.agent_prompts import advisor_prompt
from archon.agent_tools import get_file_content_tool
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class AdvisorDeps:
file_list: List[str]
advisor_agent = Agent(
model,
system_prompt=advisor_prompt,
deps_type=AdvisorDeps,
retries=2
)
@advisor_agent.system_prompt
def add_file_list(ctx: RunContext[AdvisorDeps]) -> str:
return f"""
\n\nHere is the list of all the files that you can pull the contents of with the
'get_file_content' tool if the example/tool/MCP server is relevant to the
agent the user is trying to build:\n
{"\n".join(ctx.deps.file_list)}
"""
@advisor_agent.tool_plain
def get_file_content(file_path: str) -> str:
"""
Retrieves the content of a specific file. Use this to get the contents of an example, tool, config for an MCP server
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
return get_file_content_tool(file_path)
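
A hedged usage sketch for this agent, mirroring how `archon_graph.py` invokes it; the file paths below are illustrative placeholders, not real library entries:

```python
import asyncio


async def run_advisor_example():
    # Illustrative file list; archon_graph.py builds the real one by walking agent-resources/.
    deps = AdvisorDeps(file_list=[
        'agent-resources/examples/example_agent.py',
        'agent-resources/mcps/example_mcp.json',
    ])
    result = await advisor_agent.run(
        'Build an agent that can search my Qdrant collection.',
        deps=deps,
    )
    print(result.data)


if __name__ == '__main__':
    asyncio.run(run_advisor_example())
```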

View File

@@ -0,0 +1,334 @@
advisor_prompt = """
You are an AI agent engineer specialized in using example code and prebuilt tools/MCP servers
and synthesizing these prebuilt components into a recommended starting point for the primary coding agent.
You will be given a prompt from the user for the AI agent they want to build, and also a list of examples,
prebuilt tools, and MCP servers you can use to aid in creating the agent so the least amount of code possible
has to be recreated.
Use the file name to determine if the example/tool/MCP server is relevant to the agent the user is requesting.
Examples will be in the examples/ folder. These are examples of AI agents to use as a starting point if applicable.
Prebuilt tools will be in the tools/ folder. Use some or none of these depending on if any of the prebuilt tools
would be needed for the agent.
MCP servers will be in the mcps/ folder. These are all config files that show the necessary parameters to set up each
server. MCP servers are just pre-packaged tools that you can include in the agent.
Take a look at examples/pydantic_mpc_agent.py to see how to incorporate MCP servers into the agents.
For example, if the Brave Search MCP config is:
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-brave-search"
],
"env": {
"BRAVE_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Then the way to connect that into the agent is:
server = MCPServerStdio(
'npx',
['-y', '@modelcontextprotocol/server-brave-search', 'stdio'],
env={"BRAVE_API_KEY": os.getenv("BRAVE_API_KEY")}
)
agent = Agent(get_model(), mcp_servers=[server])
So you can see how you would map the config parameters to the MCPServerStdio instantiation.
You are given a single tool to look at the contents of any file, so call this as many times as you need to look
at the different files given to you that you think are relevant for the AI agent being created.
IMPORTANT: Only look at a few examples/tools/servers. Keep your search concise.
Your primary job at the end of looking at examples/tools/MCP servers is to provide a recommendation for a starting
point of an AI agent that uses applicable resources you pulled. Only focus on the examples/tools/servers that
are actually relevant to the AI agent the user requested.
"""
prompt_refiner_prompt = """
You are an AI agent engineer specialized in refining prompts for the agents.
Your only job is to take the current prompt from the conversation, and refine it so the agent being created
has optimal instructions to carry out its role and tasks.
You want the prompt to:
1. Clearly describe the role of the agent
2. Provide concise and easy to understand goals
3. Help the agent understand when and how to use each tool provided
4. Give interaction guidelines
5. Provide instructions for handling issues/errors
Output the new prompt and nothing else.
"""
tools_refiner_prompt = """
You are an AI agent engineer specialized in refining tools for the agents.
You have comprehensive access to the Pydantic AI documentation, including API references, usage guides, and implementation examples.
You also have access to a list of files mentioned below that give you examples, prebuilt tools, and MCP servers
you can reference when validating the tools and MCP servers given to the current agent.
Your only job is to take the current tools/MCP servers from the conversation, and refine them so the agent being created
has the optimal tooling to fulfill its role and tasks. Also make sure the tools are coded properly
and allow the agent to solve the problems they are meant to help with.
For each tool, ensure that it:
1. Has a clear docstring to help the agent understand when and how to use it
2. Has correct arguments
3. Uses the run context properly if applicable (not all tools need run context)
4. Is coded properly (uses API calls correctly for the services, returns the correct data, etc.)
5. Handles errors properly
For each MCP server:
1. Get the contents of the JSON config for the server
2. Make sure the name of the server and arguments match what is in the config
3. Make sure the correct environment variables are used
Only change what is necessary to refine the tools and MCP server definitions, don't go overboard
unless of course the tools are broken and need a lot of fixing.
Output the new code for the tools/MCP servers and nothing else.
"""
agent_refiner_prompt = """
You are an AI agent engineer specialized in refining agent definitions in code.
There are other agents that handle refining the prompt and tools, so your job is to make sure the
higher-level definition of the agent (dependencies, setting the LLM, etc.) is all correct.
You have comprehensive access to the Pydantic AI documentation, including API references, usage guides, and implementation examples.
Your only job is to take the current agent definition from the conversation, and refine it so the agent being created
has dependencies, the LLM, the prompt, etc. all configured correctly. Use the Pydantic AI documentation tools to
confirm that the agent is set up properly, and only change the current definition if it doesn't align with
the documentation.
Output the agent dependency and definition code if it needs to change, and nothing else.
"""
primary_coder_prompt = """
[ROLE AND CONTEXT]
You are a specialized AI agent engineer focused on building robust Pydantic AI agents. You have comprehensive access to the Pydantic AI documentation, including API references, usage guides, and implementation examples.
[CORE RESPONSIBILITIES]
1. Agent Development
- Create new agents from user requirements
- Complete partial agent implementations
- Optimize and debug existing agents
- Guide users through agent specification if needed
2. Documentation Integration
- Systematically search documentation using RAG before any implementation
- Cross-reference multiple documentation pages for comprehensive understanding
- Validate all implementations against current best practices
- Notify users if documentation is insufficient for any requirement
[CODE STRUCTURE AND DELIVERABLES]
All new agents must include these files with complete, production-ready code:
1. agent.py
- Primary agent definition and configuration
- Core agent logic and behaviors
- No tool implementations allowed here
2. agent_tools.py
- All tool function implementations
- Tool configurations and setup
- External service integrations
3. agent_prompts.py
- System prompts
- Task-specific prompts
- Conversation templates
- Instruction sets
4. .env.example
- Required environment variables
- Clear setup instructions in a comment above each variable explaining how to set it
- API configuration templates
5. requirements.txt
- Core dependencies without versions
- User-specified packages included
[DOCUMENTATION WORKFLOW]
1. Initial Research
- Begin with RAG search for relevant documentation
- List all documentation pages using list_documentation_pages
- Retrieve specific page content using get_page_content
- Cross-reference the weather agent example for best practices
2. Implementation
- Provide complete, working code implementations
- Never leave placeholder functions
- Include all necessary error handling
- Implement proper logging and monitoring
3. Quality Assurance
- Verify all tool implementations are complete
- Ensure proper separation of concerns
- Validate environment variable handling
- Test critical path functionality
[INTERACTION GUIDELINES]
- Take immediate action without asking for permission
- Always verify documentation before implementation
- Provide honest feedback about documentation gaps
- Include specific enhancement suggestions
- Request user feedback on implementations
- Maintain code consistency across files
- After providing code, ask the user at the end if they want you to refine the agent autonomously,
otherwise they can give feedback for you to use. They can specifically say 'refine' for you to continue
working on the agent through self-reflection.
[ERROR HANDLING]
- Implement robust error handling in all tools
- Provide clear error messages
- Include recovery mechanisms
- Log important state changes
[BEST PRACTICES]
- Follow Pydantic AI naming conventions
- Implement proper type hints
- Include comprehensive docstrings; the agent uses these to understand what each tool is for.
- Maintain clean code structure
- Use consistent formatting
Here is a good example of a Pydantic AI agent:
```python
from __future__ import annotations as _annotations
import asyncio
import os
from dataclasses import dataclass
from typing import Any
import logfire
from devtools import debug
from httpx import AsyncClient
from pydantic_ai import Agent, ModelRetry, RunContext
# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class Deps:
client: AsyncClient
weather_api_key: str | None
geo_api_key: str | None
weather_agent = Agent(
'openai:gpt-4o',
# 'Be concise, reply with one sentence.' is enough for some models (like openai) to use
# the below tools appropriately, but others like anthropic and gemini require a bit more direction.
system_prompt=(
'Be concise, reply with one sentence.'
'Use the `get_lat_lng` tool to get the latitude and longitude of the locations, '
'then use the `get_weather` tool to get the weather.'
),
deps_type=Deps,
retries=2,
)
@weather_agent.tool
async def get_lat_lng(
ctx: RunContext[Deps], location_description: str
) -> dict[str, float]:
\"\"\"Get the latitude and longitude of a location.
Args:
ctx: The context.
location_description: A description of a location.
\"\"\"
if ctx.deps.geo_api_key is None:
# if no API key is provided, return a dummy response (London)
return {'lat': 51.1, 'lng': -0.1}
params = {
'q': location_description,
'api_key': ctx.deps.geo_api_key,
}
with logfire.span('calling geocode API', params=params) as span:
r = await ctx.deps.client.get('https://geocode.maps.co/search', params=params)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
if data:
return {'lat': data[0]['lat'], 'lng': data[0]['lon']}
else:
raise ModelRetry('Could not find the location')
@weather_agent.tool
async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]:
\"\"\"Get the weather at a location.
Args:
ctx: The context.
lat: Latitude of the location.
lng: Longitude of the location.
\"\"\"
if ctx.deps.weather_api_key is None:
# if no API key is provided, return a dummy response
return {'temperature': '21 °C', 'description': 'Sunny'}
params = {
'apikey': ctx.deps.weather_api_key,
'location': f'{lat},{lng}',
'units': 'metric',
}
with logfire.span('calling weather API', params=params) as span:
r = await ctx.deps.client.get(
'https://api.tomorrow.io/v4/weather/realtime', params=params
)
r.raise_for_status()
data = r.json()
span.set_attribute('response', data)
values = data['data']['values']
# https://docs.tomorrow.io/reference/data-layers-weather-codes
code_lookup = {
...
}
return {
'temperature': f'{values["temperatureApparent"]:0.0f}°C',
'description': code_lookup.get(values['weatherCode'], 'Unknown'),
}
async def main():
async with AsyncClient() as client:
# create a free API key at https://www.tomorrow.io/weather-api/
weather_api_key = os.getenv('WEATHER_API_KEY')
# create a free API key at https://geocode.maps.co/
geo_api_key = os.getenv('GEO_API_KEY')
deps = Deps(
client=client, weather_api_key=weather_api_key, geo_api_key=geo_api_key
)
result = await weather_agent.run(
'What is the weather like in London and in Wiltshire?', deps=deps
)
debug(result)
print('Response:', result.data)
if __name__ == '__main__':
asyncio.run(main())
```
"""

View File

@@ -0,0 +1,141 @@
from typing import Dict, Any, List, Optional
from openai import AsyncOpenAI
from supabase import Client
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var
embedding_model = get_env_var('EMBEDDING_MODEL') or 'text-embedding-3-small'
async def get_embedding(text: str, embedding_client: AsyncOpenAI) -> List[float]:
"""Get embedding vector from OpenAI."""
try:
response = await embedding_client.embeddings.create(
model=embedding_model,
input=text
)
return response.data[0].embedding
except Exception as e:
print(f"Error getting embedding: {e}")
return [0] * 1536 # Return zero vector on error
async def retrieve_relevant_documentation_tool(supabase: Client, embedding_client: AsyncOpenAI, user_query: str) -> str:
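"""Retrieve the most relevant Pydantic AI documentation chunks for a query using RAG (query embedding + the match_site_pages RPC)."""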
try:
# Get the embedding for the query
query_embedding = await get_embedding(user_query, embedding_client)
# Query Supabase for relevant documents
result = supabase.rpc(
'match_site_pages',
{
'query_embedding': query_embedding,
'match_count': 4,
'filter': {'source': 'pydantic_ai_docs'}
}
).execute()
if not result.data:
return "No relevant documentation found."
# Format the results
formatted_chunks = []
for doc in result.data:
chunk_text = f"""
# {doc['title']}
{doc['content']}
"""
formatted_chunks.append(chunk_text)
# Join all chunks with a separator
return "\n\n---\n\n".join(formatted_chunks)
except Exception as e:
print(f"Error retrieving documentation: {e}")
return f"Error retrieving documentation: {str(e)}"
async def list_documentation_pages_tool(supabase: Client) -> List[str]:
"""
Function to retrieve a list of all available Pydantic AI documentation pages.
This is called by the list_documentation_pages tool and also externally
to fetch documentation pages for the reasoner LLM.
Returns:
List[str]: List of unique URLs for all documentation pages
"""
try:
# Query Supabase for unique URLs where source is pydantic_ai_docs
result = supabase.from_('site_pages') \
.select('url') \
.eq('metadata->>source', 'pydantic_ai_docs') \
.execute()
if not result.data:
return []
# Extract unique URLs
urls = sorted(set(doc['url'] for doc in result.data))
return urls
except Exception as e:
print(f"Error retrieving documentation pages: {e}")
return []
async def get_page_content_tool(supabase: Client, url: str) -> str:
"""
Retrieve the full content of a specific documentation page by combining all its chunks.
Args:
supabase: The Supabase client
url: The URL of the page to retrieve
Returns:
str: The complete page content with all chunks combined in order
"""
try:
# Query Supabase for all chunks of this URL, ordered by chunk_number
result = supabase.from_('site_pages') \
.select('title, content, chunk_number') \
.eq('url', url) \
.eq('metadata->>source', 'pydantic_ai_docs') \
.order('chunk_number') \
.execute()
if not result.data:
return f"No content found for URL: {url}"
# Format the page with its title and all chunks
page_title = result.data[0]['title'].split(' - ')[0] # Get the main title
formatted_content = [f"# {page_title}\n"]
# Add each chunk's content
for chunk in result.data:
formatted_content.append(chunk['content'])
# Join everything together but limit the characters in case the page is massive (there are a couple of big ones)
# This will be improved later so if the page is too big RAG will be performed on the page itself
return "\n\n".join(formatted_content)[:20000]
except Exception as e:
print(f"Error retrieving page content: {e}")
return f"Error retrieving page content: {str(e)}"
def get_file_content_tool(file_path: str) -> str:
"""
Retrieves the content of a specific file. Use this to get the contents of an example, tool, config for an MCP server
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
try:
with open(file_path, "r") as file:
file_contents = file.read()
return file_contents
except Exception as e:
print(f"Error retrieving file contents: {e}")
return f"Error retrieving file contents: {str(e)}"

View File

@@ -0,0 +1,342 @@
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai import Agent, RunContext
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from typing import TypedDict, Annotated, List, Any
from langgraph.config import get_stream_writer
from langgraph.types import interrupt
from dotenv import load_dotenv
from openai import AsyncOpenAI
from supabase import Client
import logfire
import os
import sys
# Import the message classes from Pydantic AI
from pydantic_ai.messages import (
ModelMessage,
ModelMessagesTypeAdapter
)
# Add the parent directory to Python path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from archon.pydantic_ai_coder import pydantic_ai_coder, PydanticAIDeps
from archon.advisor_agent import advisor_agent, AdvisorDeps
from archon.refiner_agents.prompt_refiner_agent import prompt_refiner_agent
from archon.refiner_agents.tools_refiner_agent import tools_refiner_agent, ToolsRefinerDeps
from archon.refiner_agents.agent_refiner_agent import agent_refiner_agent, AgentRefinerDeps
from archon.agent_tools import list_documentation_pages_tool
from utils.utils import get_env_var, get_clients
# Load environment variables
load_dotenv()
# Configure logfire to suppress warnings (optional)
logfire.configure(send_to_logfire='never')
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
is_anthropic = provider == "Anthropic"
is_openai = provider == "OpenAI"
reasoner_llm_model_name = get_env_var('REASONER_MODEL') or 'o3-mini'
reasoner_llm_model = AnthropicModel(reasoner_llm_model_name, api_key=api_key) if is_anthropic else OpenAIModel(reasoner_llm_model_name, base_url=base_url, api_key=api_key)
reasoner = Agent(
reasoner_llm_model,
system_prompt='You are an expert at coding AI agents with Pydantic AI and defining the scope for doing so.',
)
primary_llm_model_name = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
primary_llm_model = AnthropicModel(primary_llm_model_name, api_key=api_key) if is_anthropic else OpenAIModel(primary_llm_model_name, base_url=base_url, api_key=api_key)
router_agent = Agent(
primary_llm_model,
system_prompt='Your job is to route the user message either to the end of the conversation or to continue coding the AI agent.',
)
end_conversation_agent = Agent(
primary_llm_model,
system_prompt='Your job is to end a conversation for creating an AI agent by giving instructions for how to execute the agent and then saying a nice goodbye to the user.',
)
# Initialize clients
embedding_client, supabase = get_clients()
# Define state schema
class AgentState(TypedDict):
latest_user_message: str
messages: Annotated[List[bytes], lambda x, y: x + y]
scope: str
advisor_output: str
file_list: List[str]
refined_prompt: str
refined_tools: str
refined_agent: str
# Scope Definition Node with Reasoner LLM
async def define_scope_with_reasoner(state: AgentState):
# First, get the documentation pages so the reasoner can decide which ones are necessary
documentation_pages = await list_documentation_pages_tool(supabase)
documentation_pages_str = "\n".join(documentation_pages)
# Then, use the reasoner to define the scope
prompt = f"""
User AI Agent Request: {state['latest_user_message']}
Create detailed scope document for the AI agent including:
- Architecture diagram
- Core components
- External dependencies
- Testing strategy
Also based on these documentation pages available:
{documentation_pages_str}
Include a list of documentation pages that are relevant to creating this agent for the user in the scope document.
"""
result = await reasoner.run(prompt)
scope = result.data
# Get the directory one level up from the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
scope_path = os.path.join(parent_dir, "workbench", "scope.md")
os.makedirs(os.path.join(parent_dir, "workbench"), exist_ok=True)
with open(scope_path, "w", encoding="utf-8") as f:
f.write(scope)
return {"scope": scope}
# Advisor agent - create a starting point based on examples and prebuilt tools/MCP servers
async def advisor_with_examples(state: AgentState):
# Get the directory one level up from the current file (archon_graph.py)
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
# The agent-resources folder is adjacent to the parent folder of archon_graph.py
agent_resources_dir = os.path.join(parent_dir, "agent-resources")
# Get a list of all files in the agent-resources directory and its subdirectories
file_list = []
for root, dirs, files in os.walk(agent_resources_dir):
for file in files:
# Get the full path to the file
file_path = os.path.join(root, file)
# Use the full path instead of relative path
file_list.append(file_path)
# Then, prompt the advisor with the list of files it can use for examples and tools
deps = AdvisorDeps(file_list=file_list)
result = await advisor_agent.run(state['latest_user_message'], deps=deps)
advisor_output = result.data
return {"file_list": file_list, "advisor_output": advisor_output}
# Coding Node with Feedback Handling
async def coder_agent(state: AgentState, writer):
# Prepare dependencies
deps = PydanticAIDeps(
supabase=supabase,
embedding_client=embedding_client,
reasoner_output=state['scope'],
advisor_output=state['advisor_output']
)
# Get the message history into the format for Pydantic AI
message_history: list[ModelMessage] = []
for message_row in state['messages']:
message_history.extend(ModelMessagesTypeAdapter.validate_json(message_row))
# The prompt either needs to be the user message (initial agent request or feedback)
# or the refined prompt/tools/agent if we are in that stage of the agent creation process
if 'refined_prompt' in state and state['refined_prompt']:
prompt = f"""
I need you to refine the agent you created.
Here is the refined prompt:\n
{state['refined_prompt']}\n\n
Here are the refined tools:\n
{state['refined_tools']}\n
And finally, here are the changes to the agent definition to make if any:\n
{state['refined_agent']}\n\n
Output any changes necessary to the agent code based on these refinements.
"""
else:
prompt = state['latest_user_message']
# Run the agent in a stream
if not is_openai:
writer = get_stream_writer()
result = await pydantic_ai_coder.run(prompt, deps=deps, message_history=message_history)
writer(result.data)
else:
async with pydantic_ai_coder.run_stream(
prompt,  # use the constructed prompt so refinements are also included in the streaming path
deps=deps,
message_history=message_history
) as result:
# Stream partial text as it arrives
async for chunk in result.stream_text(delta=True):
writer(chunk)
# print(ModelMessagesTypeAdapter.validate_json(result.new_messages_json()))
# Add the new conversation history (including tool calls)
# Reset the refined properties in case they were just used to refine the agent
return {
"messages": [result.new_messages_json()],
"refined_prompt": "",
"refined_tools": "",
"refined_agent": ""
}
# Interrupt the graph to get the user's next message
def get_next_user_message(state: AgentState):
value = interrupt({})
# Set the user's latest message for the LLM to continue the conversation
return {
"latest_user_message": value
}
# Determine if the user is finished creating their AI agent or not
async def route_user_message(state: AgentState):
prompt = f"""
The user has sent a message:
{state['latest_user_message']}
If the user wants to end the conversation, respond with just the text "finish_conversation".
If the user wants to continue coding the AI agent and gave feedback, respond with just the text "coder_agent".
If the user asks specifically to "refine" the agent, respond with just the text "refine".
"""
result = await router_agent.run(prompt)
if result.data == "finish_conversation": return "finish_conversation"
if result.data == "refine": return ["refine_prompt", "refine_tools", "refine_agent"]
return "coder_agent"
# Refines the prompt for the AI agent
async def refine_prompt(state: AgentState):
# Get the message history into the format for Pydantic AI
message_history: list[ModelMessage] = []
for message_row in state['messages']:
message_history.extend(ModelMessagesTypeAdapter.validate_json(message_row))
prompt = "Based on the current conversation, refine the prompt for the agent."
# Run the agent to refine the prompt for the agent being created
result = await prompt_refiner_agent.run(prompt, message_history=message_history)
return {"refined_prompt": result.data}
# Refines the tools for the AI agent
async def refine_tools(state: AgentState):
# Prepare dependencies
deps = ToolsRefinerDeps(
supabase=supabase,
embedding_client=embedding_client,
file_list=state['file_list']
)
# Get the message history into the format for Pydantic AI
message_history: list[ModelMessage] = []
for message_row in state['messages']:
message_history.extend(ModelMessagesTypeAdapter.validate_json(message_row))
prompt = "Based on the current conversation, refine the tools for the agent."
# Run the agent to refine the tools for the agent being created
result = await tools_refiner_agent.run(prompt, deps=deps, message_history=message_history)
return {"refined_tools": result.data}
# Refines the definition for the AI agent
async def refine_agent(state: AgentState):
# Prepare dependencies
deps = AgentRefinerDeps(
supabase=supabase,
embedding_client=embedding_client
)
# Get the message history into the format for Pydantic AI
message_history: list[ModelMessage] = []
for message_row in state['messages']:
message_history.extend(ModelMessagesTypeAdapter.validate_json(message_row))
prompt = "Based on the current conversation, refine the agent definition."
# Run the agent to refine the definition for the agent being created
result = await agent_refiner_agent.run(prompt, deps=deps, message_history=message_history)
return {"refined_agent": result.data}
# End of conversation agent to give instructions for executing the agent
async def finish_conversation(state: AgentState, writer):
# Get the message history into the format for Pydantic AI
message_history: list[ModelMessage] = []
for message_row in state['messages']:
message_history.extend(ModelMessagesTypeAdapter.validate_json(message_row))
# Run the agent in a stream
if not is_openai:
writer = get_stream_writer()
result = await end_conversation_agent.run(state['latest_user_message'], message_history=message_history)
writer(result.data)
else:
async with end_conversation_agent.run_stream(
state['latest_user_message'],
message_history=message_history
) as result:
# Stream partial text as it arrives
async for chunk in result.stream_text(delta=True):
writer(chunk)
return {"messages": [result.new_messages_json()]}
# Build workflow
builder = StateGraph(AgentState)
# Add nodes
builder.add_node("define_scope_with_reasoner", define_scope_with_reasoner)
builder.add_node("advisor_with_examples", advisor_with_examples)
builder.add_node("coder_agent", coder_agent)
builder.add_node("get_next_user_message", get_next_user_message)
builder.add_node("refine_prompt", refine_prompt)
builder.add_node("refine_tools", refine_tools)
builder.add_node("refine_agent", refine_agent)
builder.add_node("finish_conversation", finish_conversation)
# Set edges
builder.add_edge(START, "define_scope_with_reasoner")
builder.add_edge(START, "advisor_with_examples")
builder.add_edge("define_scope_with_reasoner", "coder_agent")
builder.add_edge("advisor_with_examples", "coder_agent")
builder.add_edge("coder_agent", "get_next_user_message")
builder.add_conditional_edges(
"get_next_user_message",
route_user_message,
["coder_agent", "finish_conversation", "refine_prompt", "refine_tools", "refine_agent"]
)
builder.add_edge("refine_prompt", "coder_agent")
builder.add_edge("refine_tools", "coder_agent")
builder.add_edge("refine_agent", "coder_agent")
builder.add_edge("finish_conversation", END)
# Configure persistence
memory = MemorySaver()
agentic_flow = builder.compile(checkpointer=memory)
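
A minimal local-run sketch for the compiled flow, following the same pattern `graph_service.py` uses over HTTP; the thread id and messages are illustrative:

```python
import asyncio
from langgraph.types import Command


async def run_flow_example():
    config = {"configurable": {"thread_id": "demo-thread"}}

    # First message: fans out to the reasoner scope node and the advisor, then the coder agent.
    async for chunk in agentic_flow.astream(
        {"latest_user_message": "Build an agent that summarizes GitHub repositories."},
        config,
        stream_mode="custom",
    ):
        print(chunk, end="")

    # Follow-up: resumes from the get_next_user_message interrupt; "refine" triggers the refiner agents.
    async for chunk in agentic_flow.astream(
        Command(resume="refine"),
        config,
        stream_mode="custom",
    ):
        print(chunk, end="")


if __name__ == "__main__":
    asyncio.run(run_flow_example())
```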

View File

@@ -0,0 +1,513 @@
import os
import sys
import asyncio
import threading
import subprocess
import requests
import json
import time
from typing import List, Dict, Any, Optional, Callable
from xml.etree import ElementTree
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlparse
from dotenv import load_dotenv
from openai import AsyncOpenAI
import re
import html2text
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var, get_clients
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode
load_dotenv()
# Initialize embedding and Supabase clients
embedding_client, supabase = get_clients()
# Define the embedding model for embedding the documentation for RAG
embedding_model = get_env_var('EMBEDDING_MODEL') or 'text-embedding-3-small'
# LLM client setup
llm_client = None
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-api-key-provided'
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
# Setup OpenAI client for LLM
if provider == "Ollama":
if api_key == "NOT_REQUIRED":
api_key = "ollama" # Use a dummy key for Ollama
llm_client = AsyncOpenAI(base_url=base_url, api_key=api_key)
else:
llm_client = AsyncOpenAI(base_url=base_url, api_key=api_key)
# Initialize HTML to Markdown converter
html_converter = html2text.HTML2Text()
html_converter.ignore_links = False
html_converter.ignore_images = False
html_converter.ignore_tables = False
html_converter.body_width = 0 # No wrapping
@dataclass
class ProcessedChunk:
url: str
chunk_number: int
title: str
summary: str
content: str
metadata: Dict[str, Any]
embedding: List[float]
class CrawlProgressTracker:
"""Class to track progress of the crawling process."""
def __init__(self,
progress_callback: Optional[Callable[[Dict[str, Any]], None]] = None):
"""Initialize the progress tracker.
Args:
progress_callback: Function to call with progress updates
"""
self.progress_callback = progress_callback
self.urls_found = 0
self.urls_processed = 0
self.urls_succeeded = 0
self.urls_failed = 0
self.chunks_stored = 0
self.logs = []
self.is_running = False
self.start_time = None
self.end_time = None
def log(self, message: str):
"""Add a log message and update progress."""
timestamp = datetime.now().strftime("%H:%M:%S")
log_entry = f"[{timestamp}] {message}"
self.logs.append(log_entry)
print(message) # Also print to console
# Call the progress callback if provided
if self.progress_callback:
self.progress_callback(self.get_status())
def start(self):
"""Mark the crawling process as started."""
self.is_running = True
self.start_time = datetime.now()
self.log("Crawling process started")
# Call the progress callback if provided
if self.progress_callback:
self.progress_callback(self.get_status())
def complete(self):
"""Mark the crawling process as completed."""
self.is_running = False
self.end_time = datetime.now()
duration = self.end_time - self.start_time if self.start_time else None
duration_str = str(duration).split('.')[0] if duration else "unknown"
self.log(f"Crawling process completed in {duration_str}")
# Call the progress callback if provided
if self.progress_callback:
self.progress_callback(self.get_status())
def get_status(self) -> Dict[str, Any]:
"""Get the current status of the crawling process."""
return {
"is_running": self.is_running,
"urls_found": self.urls_found,
"urls_processed": self.urls_processed,
"urls_succeeded": self.urls_succeeded,
"urls_failed": self.urls_failed,
"chunks_stored": self.chunks_stored,
"progress_percentage": (self.urls_processed / self.urls_found * 100) if self.urls_found > 0 else 0,
"logs": self.logs,
"start_time": self.start_time,
"end_time": self.end_time
}
@property
def is_completed(self) -> bool:
"""Return True if the crawling process is completed."""
return not self.is_running and self.end_time is not None
@property
def is_successful(self) -> bool:
"""Return True if the crawling process completed successfully."""
return self.is_completed and self.urls_failed == 0 and self.urls_succeeded > 0
def chunk_text(text: str, chunk_size: int = 5000) -> List[str]:
"""Split text into chunks, respecting code blocks and paragraphs."""
chunks = []
start = 0
text_length = len(text)
while start < text_length:
# Calculate end position
end = start + chunk_size
# If we're at the end of the text, just take what's left
if end >= text_length:
chunks.append(text[start:].strip())
break
# Try to find a code block boundary first (```)
chunk = text[start:end]
code_block = chunk.rfind('```')
if code_block != -1 and code_block > chunk_size * 0.3:
end = start + code_block
# If no code block, try to break at a paragraph
elif '\n\n' in chunk:
# Find the last paragraph break
last_break = chunk.rfind('\n\n')
if last_break > chunk_size * 0.3: # Only break if we're past 30% of chunk_size
end = start + last_break
# If no paragraph break, try to break at a sentence
elif '. ' in chunk:
# Find the last sentence break
last_period = chunk.rfind('. ')
if last_period > chunk_size * 0.3: # Only break if we're past 30% of chunk_size
end = start + last_period + 1
# Extract chunk and clean it up
chunk = text[start:end].strip()
if chunk:
chunks.append(chunk)
# Move start position for next chunk
start = max(start + 1, end)
return chunks
async def get_title_and_summary(chunk: str, url: str) -> Dict[str, str]:
"""Extract title and summary using GPT-4."""
system_prompt = """You are an AI that extracts titles and summaries from documentation chunks.
Return a JSON object with 'title' and 'summary' keys.
For the title: If this seems like the start of a document, extract its title. If it's a middle chunk, derive a descriptive title.
For the summary: Create a concise summary of the main points in this chunk.
Keep both title and summary concise but informative."""
try:
response = await llm_client.chat.completions.create(
model=get_env_var("PRIMARY_MODEL") or "gpt-4o-mini",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": f"URL: {url}\n\nContent:\n{chunk[:1000]}..."} # Send first 1000 chars for context
],
response_format={ "type": "json_object" }
)
return json.loads(response.choices[0].message.content)
except Exception as e:
print(f"Error getting title and summary: {e}")
return {"title": "Error processing title", "summary": "Error processing summary"}
async def get_embedding(text: str) -> List[float]:
"""Get embedding vector from OpenAI."""
try:
response = await embedding_client.embeddings.create(
model=embedding_model,
input=text
)
return response.data[0].embedding
except Exception as e:
print(f"Error getting embedding: {e}")
return [0] * 1536 # Return zero vector on error
async def process_chunk(chunk: str, chunk_number: int, url: str) -> ProcessedChunk:
"""Process a single chunk of text."""
# Get title and summary
extracted = await get_title_and_summary(chunk, url)
# Get embedding
embedding = await get_embedding(chunk)
# Create metadata
metadata = {
"source": "pydantic_ai_docs",
"chunk_size": len(chunk),
"crawled_at": datetime.now(timezone.utc).isoformat(),
"url_path": urlparse(url).path
}
return ProcessedChunk(
url=url,
chunk_number=chunk_number,
title=extracted['title'],
summary=extracted['summary'],
content=chunk, # Store the original chunk content
metadata=metadata,
embedding=embedding
)
async def insert_chunk(chunk: ProcessedChunk):
"""Insert a processed chunk into Supabase."""
try:
data = {
"url": chunk.url,
"chunk_number": chunk.chunk_number,
"title": chunk.title,
"summary": chunk.summary,
"content": chunk.content,
"metadata": chunk.metadata,
"embedding": chunk.embedding
}
result = supabase.table("site_pages").insert(data).execute()
print(f"Inserted chunk {chunk.chunk_number} for {chunk.url}")
return result
except Exception as e:
print(f"Error inserting chunk: {e}")
return None
async def process_and_store_document(url: str, markdown: str, tracker: Optional[CrawlProgressTracker] = None):
"""Process a document and store its chunks in parallel."""
# Split into chunks
chunks = chunk_text(markdown)
if tracker:
tracker.log(f"Split document into {len(chunks)} chunks for {url}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Split document into {len(chunks)} chunks for {url}")
# Process chunks in parallel
tasks = [
process_chunk(chunk, i, url)
for i, chunk in enumerate(chunks)
]
processed_chunks = await asyncio.gather(*tasks)
if tracker:
tracker.log(f"Processed {len(processed_chunks)} chunks for {url}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Processed {len(processed_chunks)} chunks for {url}")
# Store chunks in parallel
insert_tasks = [
insert_chunk(chunk)
for chunk in processed_chunks
]
await asyncio.gather(*insert_tasks)
if tracker:
tracker.chunks_stored += len(processed_chunks)
tracker.log(f"Stored {len(processed_chunks)} chunks for {url}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Stored {len(processed_chunks)} chunks for {url}")
def fetch_url_content(url: str) -> str:
"""Fetch content from a URL using requests and convert to markdown."""
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}
try:
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
# Convert HTML to Markdown
markdown = html_converter.handle(response.text)
# Clean up the markdown
markdown = re.sub(r'\n{3,}', '\n\n', markdown) # Remove excessive newlines
return markdown
except Exception as e:
raise Exception(f"Error fetching {url}: {str(e)}")
async def crawl_parallel_with_requests(urls: List[str], tracker: Optional[CrawlProgressTracker] = None, max_concurrent: int = 5):
"""Crawl multiple URLs in parallel with a concurrency limit using direct HTTP requests."""
# Create a semaphore to limit concurrency
semaphore = asyncio.Semaphore(max_concurrent)
async def process_url(url: str):
async with semaphore:
if tracker:
tracker.log(f"Crawling: {url}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Crawling: {url}")
try:
# Use a thread pool to run the blocking HTTP request
loop = asyncio.get_running_loop()
if tracker:
tracker.log(f"Fetching content from: {url}")
else:
print(f"Fetching content from: {url}")
markdown = await loop.run_in_executor(None, fetch_url_content, url)
if markdown:
if tracker:
tracker.urls_succeeded += 1
tracker.log(f"Successfully crawled: {url}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Successfully crawled: {url}")
await process_and_store_document(url, markdown, tracker)
else:
if tracker:
tracker.urls_failed += 1
tracker.log(f"Failed: {url} - No content retrieved")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Failed: {url} - No content retrieved")
except Exception as e:
if tracker:
tracker.urls_failed += 1
tracker.log(f"Error processing {url}: {str(e)}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Error processing {url}: {str(e)}")
finally:
if tracker:
tracker.urls_processed += 1
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
await asyncio.sleep(2)  # brief pause between URLs without blocking the event loop
# Process all URLs in parallel with limited concurrency
if tracker:
tracker.log(f"Processing {len(urls)} URLs with concurrency {max_concurrent}")
# Ensure UI gets updated
if tracker.progress_callback:
tracker.progress_callback(tracker.get_status())
else:
print(f"Processing {len(urls)} URLs with concurrency {max_concurrent}")
await asyncio.gather(*[process_url(url) for url in urls])
def get_pydantic_ai_docs_urls() -> List[str]:
"""Get URLs from Pydantic AI docs sitemap."""
sitemap_url = "https://ai.pydantic.dev/sitemap.xml"
try:
response = requests.get(sitemap_url)
response.raise_for_status()
# Parse the XML
root = ElementTree.fromstring(response.content)
# Extract all URLs from the sitemap
namespace = {'ns': 'http://www.sitemaps.org/schemas/sitemap/0.9'}
urls = [loc.text for loc in root.findall('.//ns:loc', namespace)]
return urls
except Exception as e:
print(f"Error fetching sitemap: {e}")
return []
def clear_existing_records():
"""Clear all existing records with source='pydantic_ai_docs' from the site_pages table."""
try:
result = supabase.table("site_pages").delete().eq("metadata->>source", "pydantic_ai_docs").execute()
print("Cleared existing pydantic_ai_docs records from site_pages")
return result
except Exception as e:
print(f"Error clearing existing records: {e}")
return None
async def main_with_requests(tracker: Optional[CrawlProgressTracker] = None):
"""Main function using direct HTTP requests instead of browser automation."""
try:
# Start tracking if tracker is provided
if tracker:
tracker.start()
else:
print("Starting crawling process...")
# Clear existing records first
if tracker:
tracker.log("Clearing existing Pydantic AI docs records...")
else:
print("Clearing existing Pydantic AI docs records...")
clear_existing_records()
if tracker:
tracker.log("Existing records cleared")
else:
print("Existing records cleared")
# Get URLs from Pydantic AI docs
if tracker:
tracker.log("Fetching URLs from Pydantic AI sitemap...")
else:
print("Fetching URLs from Pydantic AI sitemap...")
urls = get_pydantic_ai_docs_urls()
if not urls:
if tracker:
tracker.log("No URLs found to crawl")
tracker.complete()
else:
print("No URLs found to crawl")
return
if tracker:
tracker.urls_found = len(urls)
tracker.log(f"Found {len(urls)} URLs to crawl")
else:
print(f"Found {len(urls)} URLs to crawl")
# Crawl the URLs using direct HTTP requests
await crawl_parallel_with_requests(urls, tracker)
# Mark as complete if tracker is provided
if tracker:
tracker.complete()
else:
print("Crawling process completed")
except Exception as e:
if tracker:
tracker.log(f"Error in crawling process: {str(e)}")
tracker.complete()
else:
print(f"Error in crawling process: {str(e)}")
def start_crawl_with_requests(progress_callback: Optional[Callable[[Dict[str, Any]], None]] = None) -> CrawlProgressTracker:
"""Start the crawling process using direct HTTP requests in a separate thread and return the tracker."""
tracker = CrawlProgressTracker(progress_callback)
def run_crawl():
try:
asyncio.run(main_with_requests(tracker))
except Exception as e:
print(f"Error in crawl thread: {e}")
tracker.log(f"Thread error: {str(e)}")
tracker.complete()
# Start the crawling process in a separate thread
thread = threading.Thread(target=run_crawl)
thread.daemon = True
thread.start()
return tracker
if __name__ == "__main__":
# Run the main function directly
print("Starting crawler...")
asyncio.run(main_with_requests())
print("Crawler finished.")

View File

@@ -0,0 +1,7 @@
{
"dependencies": ["."],
"graphs": {
"agent": "./archon_graph.py:agentic_flow"
},
"env": "../.env"
}

View File

@@ -0,0 +1,101 @@
from __future__ import annotations as _annotations
from dataclasses import dataclass
from dotenv import load_dotenv
import logfire
import asyncio
import httpx
import os
import sys
import json
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from openai import AsyncOpenAI
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var
from archon.agent_prompts import primary_coder_prompt
from archon.agent_tools import (
retrieve_relevant_documentation_tool,
list_documentation_pages_tool,
get_page_content_tool
)
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class PydanticAIDeps:
supabase: Client
embedding_client: AsyncOpenAI
reasoner_output: str
advisor_output: str
pydantic_ai_coder = Agent(
model,
system_prompt=primary_coder_prompt,
deps_type=PydanticAIDeps,
retries=2
)
@pydantic_ai_coder.system_prompt
def add_reasoner_output(ctx: RunContext[PydanticAIDeps]) -> str:
return f"""
\n\nAdditional thoughts/instructions from the reasoner LLM.
This scope includes documentation pages for you to search as well:
{ctx.deps.reasoner_output}
Recommended starting point from the advisor agent:
{ctx.deps.advisor_output}
"""
@pydantic_ai_coder.tool
async def retrieve_relevant_documentation(ctx: RunContext[PydanticAIDeps], user_query: str) -> str:
"""
Retrieve relevant documentation chunks based on the query with RAG.
Args:
ctx: The context including the Supabase client and OpenAI client
user_query: The user's question or query
Returns:
A formatted string containing the top 4 most relevant documentation chunks
"""
return await retrieve_relevant_documentation_tool(ctx.deps.supabase, ctx.deps.embedding_client, user_query)
@pydantic_ai_coder.tool
async def list_documentation_pages(ctx: RunContext[PydanticAIDeps]) -> List[str]:
"""
Retrieve a list of all available Pydantic AI documentation pages.
Returns:
List[str]: List of unique URLs for all documentation pages
"""
return await list_documentation_pages_tool(ctx.deps.supabase)
@pydantic_ai_coder.tool
async def get_page_content(ctx: RunContext[PydanticAIDeps], url: str) -> str:
"""
Retrieve the full content of a specific documentation page by combining all its chunks.
Args:
ctx: The context including the Supabase client
url: The URL of the page to retrieve
Returns:
str: The complete page content with all chunks combined in order
"""
return await get_page_content_tool(ctx.deps.supabase, url)

View File

@@ -0,0 +1,92 @@
from __future__ import annotations as _annotations
from dataclasses import dataclass
from dotenv import load_dotenv
import logfire
import asyncio
import httpx
import os
import sys
import json
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from openai import AsyncOpenAI
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from utils.utils import get_env_var
from archon.agent_prompts import agent_refiner_prompt
from archon.agent_tools import (
retrieve_relevant_documentation_tool,
list_documentation_pages_tool,
get_page_content_tool
)
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
embedding_model = get_env_var('EMBEDDING_MODEL') or 'text-embedding-3-small'
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class AgentRefinerDeps:
supabase: Client
embedding_client: AsyncOpenAI
agent_refiner_agent = Agent(
model,
system_prompt=agent_refiner_prompt,
deps_type=AgentRefinerDeps,
retries=2
)
@agent_refiner_agent.tool
async def retrieve_relevant_documentation(ctx: RunContext[AgentRefinerDeps], query: str) -> str:
"""
Retrieve relevant documentation chunks based on the query with RAG.
Make sure your searches always focus on implementing the agent itself.
Args:
ctx: The context including the Supabase client and OpenAI client
query: Your query to retrieve relevant documentation for implementing agents
Returns:
A formatted string containing the top 4 most relevant documentation chunks
"""
return await retrieve_relevant_documentation_tool(ctx.deps.supabase, ctx.deps.embedding_client, query)
@agent_refiner_agent.tool
async def list_documentation_pages(ctx: RunContext[AgentRefinerDeps]) -> List[str]:
"""
Retrieve a list of all available Pydantic AI documentation pages.
This will give you all pages available, but focus on the ones related to configuring agents and their dependencies.
Returns:
List[str]: List of unique URLs for all documentation pages
"""
return await list_documentation_pages_tool(ctx.deps.supabase)
@agent_refiner_agent.tool
async def get_page_content(ctx: RunContext[AgentRefinerDeps], url: str) -> str:
"""
Retrieve the full content of a specific documentation page by combining all its chunks.
Only use this tool to get pages related to setting up agents with Pydantic AI.
Args:
ctx: The context including the Supabase client
url: The URL of the page to retrieve
Returns:
str: The complete page content with all chunks combined in order
"""
return await get_page_content_tool(ctx.deps.supabase, url)

View File

@@ -0,0 +1,31 @@
from __future__ import annotations as _annotations
import logfire
import os
import sys
from pydantic_ai import Agent
from dotenv import load_dotenv
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from utils.utils import get_env_var
from archon.agent_prompts import prompt_refiner_prompt
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
logfire.configure(send_to_logfire='if-token-present')
prompt_refiner_agent = Agent(
model,
system_prompt=prompt_refiner_prompt
)

View File

@@ -0,0 +1,117 @@
from __future__ import annotations as _annotations
from dataclasses import dataclass
from dotenv import load_dotenv
import logfire
import asyncio
import httpx
import os
import sys
import json
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.openai import OpenAIModel
from openai import AsyncOpenAI
from supabase import Client
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from utils.utils import get_env_var
from archon.agent_prompts import tools_refiner_prompt
from archon.agent_tools import (
retrieve_relevant_documentation_tool,
list_documentation_pages_tool,
get_page_content_tool,
get_file_content_tool
)
load_dotenv()
provider = get_env_var('LLM_PROVIDER') or 'OpenAI'
llm = get_env_var('PRIMARY_MODEL') or 'gpt-4o-mini'
base_url = get_env_var('BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('LLM_API_KEY') or 'no-llm-api-key-provided'
model = AnthropicModel(llm, api_key=api_key) if provider == "Anthropic" else OpenAIModel(llm, base_url=base_url, api_key=api_key)
embedding_model = get_env_var('EMBEDDING_MODEL') or 'text-embedding-3-small'
logfire.configure(send_to_logfire='if-token-present')
@dataclass
class ToolsRefinerDeps:
supabase: Client
embedding_client: AsyncOpenAI
file_list: List[str]
tools_refiner_agent = Agent(
model,
system_prompt=tools_refiner_prompt,
deps_type=ToolsRefinerDeps,
retries=2
)
@tools_refiner_agent.system_prompt
def add_file_list(ctx: RunContext[ToolsRefinerDeps]) -> str:
return f"""
\n\nHere is the list of all the files that you can pull the contents of with the
'get_file_content' tool if the example/tool/MCP server is relevant to the
agent the user is trying to build:\n
{"\n".join(ctx.deps.file_list)}
"""
@tools_refiner_agent.tool
async def retrieve_relevant_documentation(ctx: RunContext[ToolsRefinerDeps], query: str) -> str:
"""
Retrieve relevant documentation chunks based on the query with RAG.
Make sure your searches always focus on implementing tools.
Args:
ctx: The context including the Supabase client and OpenAI client
query: Your query to retrieve relevant documentation for implementing tools
Returns:
A formatted string containing the top 4 most relevant documentation chunks
"""
return await retrieve_relevant_documentation_tool(ctx.deps.supabase, ctx.deps.embedding_client, query)
@tools_refiner_agent.tool
async def list_documentation_pages(ctx: RunContext[ToolsRefinerDeps]) -> List[str]:
"""
Retrieve a list of all available Pydantic AI documentation pages.
This will give you all pages available, but focus on the ones related to tools.
Returns:
List[str]: List of unique URLs for all documentation pages
"""
return await list_documentation_pages_tool(ctx.deps.supabase)
@tools_refiner_agent.tool
async def get_page_content(ctx: RunContext[ToolsRefinerDeps], url: str) -> str:
"""
Retrieve the full content of a specific documentation page by combining all its chunks.
Only use this tool to get pages related to using tools with Pydantic AI.
Args:
ctx: The context including the Supabase client
url: The URL of the page to retrieve
Returns:
str: The complete page content with all chunks combined in order
"""
return await get_page_content_tool(ctx.deps.supabase, url)
@tools_refiner_agent.tool_plain
def get_file_content(file_path: str) -> str:
"""
    Retrieves the content of a specific file. Use this to get the contents of an example, tool, or config for an MCP server.
Args:
file_path: The path to the file
Returns:
The raw contents of the file
"""
return get_file_content_tool(file_path)
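
A hypothetical standalone invocation of the tools refiner, showing how the dependencies might be assembled outside the workflow (Archon's graph builds these itself; the message format below is an assumption):

```python
# Assumes tools_refiner_agent, ToolsRefinerDeps, and get_env_var from the module above are in scope.
import asyncio
from typing import List
from openai import AsyncOpenAI
from supabase import create_client

async def refine_tools(agent_code: str, file_list: List[str]) -> str:
    # Build the dependencies the agent's tools need (Supabase + embedding client + file list)
    deps = ToolsRefinerDeps(
        supabase=create_client(get_env_var("SUPABASE_URL"), get_env_var("SUPABASE_SERVICE_KEY")),
        embedding_client=AsyncOpenAI(api_key=get_env_var("LLM_API_KEY")),
        file_list=file_list,
    )
    result = await tools_refiner_agent.run(
        f"Review and refine the tools for this agent:\n\n{agent_code}",
        deps=deps,
    )
    return result.data
```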

View File

@@ -0,0 +1,70 @@
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional, Dict, Any
from archon.archon_graph import agentic_flow
from langgraph.types import Command
from utils.utils import write_to_log
app = FastAPI()
class InvokeRequest(BaseModel):
message: str
thread_id: str
is_first_message: bool = False
config: Optional[Dict[str, Any]] = None
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {"status": "ok"}
@app.post("/invoke")
async def invoke_agent(request: InvokeRequest):
"""Process a message through the agentic flow and return the complete response.
    The agent streams its response internally, but this endpoint waits for the full output
    before returning, so the call is effectively synchronous for MCP.
    A separate endpoint that fully streams the response from the API will be added later.
Args:
request: The InvokeRequest containing message and thread info
Returns:
dict: Contains the complete response from the agent
"""
try:
config = request.config or {
"configurable": {
"thread_id": request.thread_id
}
}
response = ""
if request.is_first_message:
write_to_log(f"Processing first message for thread {request.thread_id}")
async for msg in agentic_flow.astream(
{"latest_user_message": request.message},
config,
stream_mode="custom"
):
response += str(msg)
else:
write_to_log(f"Processing continuation for thread {request.thread_id}")
async for msg in agentic_flow.astream(
Command(resume=request.message),
config,
stream_mode="custom"
):
response += str(msg)
write_to_log(f"Final response for thread {request.thread_id}: {response}")
return {"response": response}
except Exception as e:
print(f"Exception invoking Archon for thread {request.thread_id}: {str(e)}")
write_to_log(f"Error processing message for thread {request.thread_id}: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8100)
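
Once the service is running (e.g. `python graph_service.py`), the `/invoke` endpoint can be exercised directly. A sketch using httpx; the MCP server below does the equivalent with `requests`:

```python
# Minimal client sketch; the thread_id is just a client-generated UUID and
# port 8100 matches the uvicorn.run call above.
import uuid
import httpx

def invoke_archon(message: str) -> str:
    payload = {
        "message": message,
        "thread_id": str(uuid.uuid4()),
        "is_first_message": True,
    }
    response = httpx.post("http://localhost:8100/invoke", json=payload, timeout=300)
    response.raise_for_status()
    return response.json()["response"]
```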

View File

@@ -0,0 +1,38 @@
# Ignore specified folders
iterations/
venv/
.langgraph_api/
.github/
__pycache__/
.env
# Git related
.git/
.gitignore
.gitattributes
# Python cache
*.pyc
*.pyo
*.pyd
.Python
*.so
.pytest_cache/
# Environment files
.env.local
.env.development.local
.env.test.local
.env.production.local
# Logs
*.log
# IDE specific files
.idea/
.vscode/
*.swp
*.swo
# Keep the example env file for reference
!.env.example

View File

@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
# Copy requirements file and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the MCP server files
COPY . .
# Expose port for MCP server
EXPOSE 8100
# Command to run the MCP server
CMD ["python", "mcp_server.py"]

View File

@@ -0,0 +1,126 @@
from mcp.server.fastmcp import FastMCP
from datetime import datetime
from dotenv import load_dotenv
from typing import Dict, List
import threading
import requests
import asyncio
import uuid
import sys
import os
# Load environment variables from .env file
load_dotenv()
# Initialize FastMCP server with ERROR logging level
mcp = FastMCP("archon", log_level="ERROR")
# Store active threads
active_threads: Dict[str, List[str]] = {}
# FastAPI service URL
GRAPH_SERVICE_URL = os.getenv("GRAPH_SERVICE_URL", "http://localhost:8100")
def write_to_log(message: str):
"""Write a message to the logs.txt file in the workbench directory.
Args:
message: The message to log
"""
# Get the directory one level up from the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
workbench_dir = os.path.join(parent_dir, "workbench")
log_path = os.path.join(workbench_dir, "logs.txt")
os.makedirs(workbench_dir, exist_ok=True)
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
log_entry = f"[{timestamp}] {message}\n"
with open(log_path, "a", encoding="utf-8") as f:
f.write(log_entry)
@mcp.tool()
async def create_thread() -> str:
"""Create a new conversation thread for Archon.
Always call this tool before invoking Archon for the first time in a conversation.
(if you don't already have a thread ID)
Returns:
str: A unique thread ID for the conversation
"""
thread_id = str(uuid.uuid4())
active_threads[thread_id] = []
write_to_log(f"Created new thread: {thread_id}")
return thread_id
def _make_request(thread_id: str, user_input: str, config: dict) -> str:
"""Make synchronous request to graph service"""
try:
response = requests.post(
f"{GRAPH_SERVICE_URL}/invoke",
json={
"message": user_input,
"thread_id": thread_id,
"is_first_message": not active_threads[thread_id],
"config": config
},
timeout=300 # 5 minute timeout for long-running operations
)
response.raise_for_status()
return response.json()
except requests.exceptions.Timeout:
write_to_log(f"Request timed out for thread {thread_id}")
raise TimeoutError("Request to graph service timed out. The operation took longer than expected.")
except requests.exceptions.RequestException as e:
write_to_log(f"Request failed for thread {thread_id}: {str(e)}")
raise
@mcp.tool()
async def run_agent(thread_id: str, user_input: str) -> str:
"""Run the Archon agent with user input.
Only use this tool after you have called create_thread in this conversation to get a unique thread ID.
If you already created a thread ID in this conversation, do not create another one. Reuse the same ID.
After you receive the code from Archon, always implement it into the codebase unless asked not to.
After using this tool and implementing the code it gave back, ask the user if they want you to refine the agent
autonomously (they can just say 'refine') or they can just give feedback and you'll improve the agent that way.
If they want to refine the agent, just give 'refine' for user_input.
Args:
thread_id: The conversation thread ID
user_input: The user's message to process
Returns:
str: The agent's response which generally includes the code for the agent
"""
if thread_id not in active_threads:
write_to_log(f"Error: Thread not found - {thread_id}")
raise ValueError("Thread not found")
write_to_log(f"Processing message for thread {thread_id}: {user_input}")
config = {
"configurable": {
"thread_id": thread_id
}
}
try:
result = await asyncio.to_thread(_make_request, thread_id, user_input, config)
active_threads[thread_id].append(user_input)
return result['response']
    except Exception:
        raise
if __name__ == "__main__":
write_to_log("Starting MCP server")
# Run MCP server
mcp.run(transport='stdio')
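
For local testing, a hypothetical stdio client built with the same `mcp` package pinned in requirements could drive this server; real MCP hosts (IDE assistants, etc.) provide this plumbing through their own server configuration:

```python
# Sketch only - assumes the mcp client API (ClientSession, stdio_client) as shipped with mcp==1.2.1.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def demo():
    params = StdioServerParameters(command="python", args=["mcp_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Create a conversation thread, mirroring how an MCP host would call the tool
            thread = await session.call_tool("create_thread", {})
            print(thread.content)

# asyncio.run(demo())
```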

View File

@@ -0,0 +1,3 @@
mcp==1.2.1
python-dotenv==1.0.1
requests==2.32.3

Binary file not shown.

After

Width:  |  Height:  |  Size: 576 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 327 KiB

View File

@@ -0,0 +1,176 @@
aiofiles==24.1.0
aiohappyeyeballs==2.4.4
aiohttp==3.11.11
aiosignal==1.3.2
aiosqlite==0.20.0
altair==5.5.0
annotated-types==0.7.0
anthropic==0.42.0
anyio==4.8.0
attrs==24.3.0
beautifulsoup4==4.12.3
blinker==1.9.0
cachetools==5.5.0
certifi==2024.12.14
cffi==1.17.1
charset-normalizer==3.4.1
click==8.1.8
cohere==5.13.12
colorama==0.4.6
Crawl4AI==0.4.247
cryptography==43.0.3
Deprecated==1.2.15
deprecation==2.1.0
distro==1.9.0
dnspython==2.7.0
email_validator==2.2.0
eval_type_backport==0.2.2
executing==2.1.0
fake-http-header==0.3.5
fastapi==0.115.8
fastapi-cli==0.0.7
fastavro==1.10.0
filelock==3.16.1
frozenlist==1.5.0
fsspec==2024.12.0
gitdb==4.0.12
GitPython==3.1.44
google-auth==2.37.0
googleapis-common-protos==1.66.0
gotrue==2.11.1
greenlet==3.1.1
griffe==1.5.4
groq==0.15.0
h11==0.14.0
h2==4.1.0
hpack==4.0.0
html2text==2024.2.26
httpcore==1.0.7
httptools==0.6.4
httpx==0.27.2
httpx-sse==0.4.0
huggingface-hub==0.27.1
hyperframe==6.0.1
idna==3.10
importlib_metadata==8.5.0
iniconfig==2.0.0
itsdangerous==2.2.0
Jinja2==3.1.5
jiter==0.8.2
joblib==1.4.2
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jsonschema_rs==0.25.1
langchain-core==0.3.33
langgraph==0.2.69
langgraph-checkpoint==2.0.10
langgraph-cli==0.1.71
langgraph-sdk==0.1.51
langsmith==0.3.6
litellm==1.57.8
logfire==3.1.0
logfire-api==3.1.0
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mcp==1.2.1
mdurl==0.1.2
mistralai==1.2.6
mockito==1.5.3
msgpack==1.1.0
multidict==6.1.0
mypy-extensions==1.0.0
narwhals==1.21.1
nltk==3.9.1
numpy==2.2.1
openai==1.59.6
opentelemetry-api==1.29.0
opentelemetry-exporter-otlp-proto-common==1.29.0
opentelemetry-exporter-otlp-proto-http==1.29.0
opentelemetry-instrumentation==0.50b0
opentelemetry-proto==1.29.0
opentelemetry-sdk==1.29.0
opentelemetry-semantic-conventions==0.50b0
orjson==3.10.15
packaging==24.2
pandas==2.2.3
pillow==10.4.0
playwright==1.49.1
pluggy==1.5.0
postgrest==0.19.1
propcache==0.2.1
protobuf==5.29.3
psutil==6.1.1
pyarrow==18.1.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.10.5
pydantic-ai==0.0.22
pydantic-ai-slim==0.0.22
pydantic-extra-types==2.10.2
pydantic-graph==0.0.22
pydantic-settings==2.7.1
pydantic_core==2.27.2
pydeck==0.9.1
pyee==12.0.0
Pygments==2.19.1
PyJWT==2.10.1
pyOpenSSL==24.3.0
pytest==8.3.4
pytest-mockito==0.0.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.20
pytz==2024.2
PyYAML==6.0.2
rank-bm25==0.2.2
realtime==2.1.0
referencing==0.35.1
regex==2024.11.6
requests==2.32.3
requests-toolbelt==1.0.0
rich==13.9.4
rich-toolkit==0.13.2
rpds-py==0.22.3
rsa==4.9
shellingham==1.5.4
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
snowballstemmer==2.2.0
soupsieve==2.6
sse-starlette==2.1.3
starlette==0.45.3
storage3==0.11.0
streamlit==1.41.1
StrEnum==0.4.15
structlog==24.4.0
supabase==2.11.0
supafunc==0.9.0
tenacity==9.0.0
tf-playwright-stealth==1.1.0
tiktoken==0.8.0
tokenizers==0.21.0
toml==0.10.2
tornado==6.4.2
tqdm==4.67.1
typer==0.15.1
types-requests==2.32.0.20241016
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.2
ujson==5.10.0
urllib3==2.3.0
uvicorn==0.34.0
watchdog==6.0.0
watchfiles==1.0.4
websockets==13.1
wrapt==1.17.1
xxhash==3.5.0
yarl==1.18.3
zipp==3.21.0
zstandard==0.23.0

View File

@@ -0,0 +1,152 @@
#!/usr/bin/env python
"""
Simple script to build and run Archon Docker containers.
"""
import os
import subprocess
import platform
import time
from pathlib import Path
def run_command(command, cwd=None):
"""Run a command and print output in real-time."""
print(f"Running: {' '.join(command)}")
process = subprocess.Popen(
command,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=False,
cwd=cwd
)
for line in process.stdout:
try:
decoded_line = line.decode('utf-8', errors='replace')
print(decoded_line.strip())
except Exception as e:
print(f"Error processing output: {e}")
process.wait()
return process.returncode
def check_docker():
"""Check if Docker is installed and running."""
try:
subprocess.run(
["docker", "--version"],
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE
)
return True
except (subprocess.SubprocessError, FileNotFoundError):
print("Error: Docker is not installed or not in PATH")
return False
def main():
"""Main function to build and run Archon containers."""
# Check if Docker is available
if not check_docker():
return 1
# Get the base directory
base_dir = Path(__file__).parent.absolute()
# Check for .env file
env_file = base_dir / ".env"
env_args = []
if env_file.exists():
print(f"Using environment file: {env_file}")
env_args = ["--env-file", str(env_file)]
else:
print("No .env file found. Continuing without environment variables.")
# Build the MCP container
print("\n=== Building Archon MCP container ===")
mcp_dir = base_dir / "mcp"
if run_command(["docker", "build", "-t", "archon-mcp:latest", "."], cwd=mcp_dir) != 0:
print("Error building MCP container")
return 1
# Build the main Archon container
print("\n=== Building main Archon container ===")
if run_command(["docker", "build", "-t", "archon:latest", "."], cwd=base_dir) != 0:
print("Error building main Archon container")
return 1
# Check if the container exists (running or stopped)
try:
result = subprocess.run(
["docker", "ps", "-a", "-q", "--filter", "name=archon-container"],
check=True,
capture_output=True,
text=True
)
if result.stdout.strip():
print("\n=== Removing existing Archon container ===")
container_id = result.stdout.strip()
print(f"Found container with ID: {container_id}")
# Check if the container is running
running_check = subprocess.run(
["docker", "ps", "-q", "--filter", "id=" + container_id],
check=True,
capture_output=True,
text=True
)
# If running, stop it first
if running_check.stdout.strip():
print("Container is running. Stopping it first...")
stop_result = run_command(["docker", "stop", container_id])
if stop_result != 0:
print("Warning: Failed to stop container gracefully, will try force removal")
# Remove the container with force flag to ensure it's removed
print("Removing container...")
rm_result = run_command(["docker", "rm", "-f", container_id])
if rm_result != 0:
print("Error: Failed to remove container. Please remove it manually with:")
print(f" docker rm -f {container_id}")
return 1
print("Container successfully removed")
except subprocess.SubprocessError as e:
print(f"Error checking for existing containers: {e}")
pass
# Run the Archon container
print("\n=== Starting Archon container ===")
cmd = [
"docker", "run", "-d",
"--name", "archon-container",
"-p", "8501:8501",
"-p", "8100:8100",
"--add-host", "host.docker.internal:host-gateway"
]
# Add environment variables if .env exists
if env_args:
cmd.extend(env_args)
# Add image name
cmd.append("archon:latest")
if run_command(cmd) != 0:
print("Error starting Archon container")
return 1
# Wait a moment for the container to start
time.sleep(2)
# Print success message
print("\n=== Archon is now running! ===")
print("-> Access the Streamlit UI at: http://localhost:8501")
print("-> MCP container is ready to use - see the MCP tab in the UI.")
print("\nTo stop Archon, run: docker stop archon-container && docker rm archon-container")
return 0
if __name__ == "__main__":
exit(main())

View File

@@ -0,0 +1 @@
# This file makes the streamlit_ui directory a Python package

View File

@@ -0,0 +1,230 @@
import streamlit as st
import subprocess
import threading
import platform
import queue
import time
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import reload_archon_graph
def agent_service_tab():
"""Display the agent service interface for managing the graph service"""
st.header("MCP Agent Service")
st.write("Start, restart, and monitor the Archon agent service for MCP.")
# Initialize session state variables if they don't exist
if "service_process" not in st.session_state:
st.session_state.service_process = None
if "service_running" not in st.session_state:
st.session_state.service_running = False
if "service_output" not in st.session_state:
st.session_state.service_output = []
if "output_queue" not in st.session_state:
st.session_state.output_queue = queue.Queue()
# Function to check if the service is running
def is_service_running():
if st.session_state.service_process is None:
return False
# Check if process is still running
return st.session_state.service_process.poll() is None
# Function to kill any process using port 8100
def kill_process_on_port(port):
try:
if platform.system() == "Windows":
# Windows: use netstat to find the process using the port
result = subprocess.run(
f'netstat -ano | findstr :{port}',
shell=True,
capture_output=True,
text=True
)
if result.stdout:
# Extract the PID from the output
for line in result.stdout.splitlines():
if f":{port}" in line and "LISTENING" in line:
parts = line.strip().split()
pid = parts[-1]
# Kill the process
subprocess.run(f'taskkill /F /PID {pid}', shell=True)
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Killed any existing process using port {port} (PID: {pid})\n")
return True
else:
# Unix-like systems: use lsof to find the process using the port
result = subprocess.run(
f'lsof -i :{port} -t',
shell=True,
capture_output=True,
text=True
)
if result.stdout:
                    # lsof may return multiple PIDs (one per line) - kill each of them
                    for pid in result.stdout.strip().split():
                        subprocess.run(f'kill -9 {pid}', shell=True)
                        st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Killed process using port {port} (PID: {pid})\n")
return True
return False
except Exception as e:
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Error killing process on port {port}: {str(e)}\n")
return False
# Update service status
st.session_state.service_running = is_service_running()
# Process any new output in the queue
try:
while not st.session_state.output_queue.empty():
line = st.session_state.output_queue.get_nowait()
if line:
st.session_state.service_output.append(line)
except Exception:
pass
# Create button text based on service status
button_text = "Restart Agent Service" if st.session_state.service_running else "Start Agent Service"
# Create columns for buttons
col1, col2 = st.columns([1, 1])
# Start/Restart button
with col1:
if st.button(button_text, use_container_width=True):
# Stop existing process if running
if st.session_state.service_running:
try:
st.session_state.service_process.terminate()
time.sleep(1) # Give it time to terminate
if st.session_state.service_process.poll() is None:
# Force kill if still running
st.session_state.service_process.kill()
except Exception as e:
st.error(f"Error stopping service: {str(e)}")
# Clear previous output
st.session_state.service_output = []
st.session_state.output_queue = queue.Queue()
# Kill any process using port 8100
kill_process_on_port(8100)
# Start new process
try:
# Get the absolute path to the graph service script
base_path = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))
graph_service_path = os.path.join(base_path, 'graph_service.py')
# Start the process with output redirection
process = subprocess.Popen(
[sys.executable, graph_service_path],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
universal_newlines=True
)
st.session_state.service_process = process
st.session_state.service_running = True
# Start threads to read output
def read_output(stream, queue_obj):
for line in iter(stream.readline, ''):
queue_obj.put(line)
stream.close()
# Start threads for stdout and stderr
threading.Thread(target=read_output, args=(process.stdout, st.session_state.output_queue), daemon=True).start()
threading.Thread(target=read_output, args=(process.stderr, st.session_state.output_queue), daemon=True).start()
# Add startup message
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Agent service started\n")
st.success("Agent service started successfully!")
st.rerun()
except Exception as e:
st.error(f"Error starting service: {str(e)}")
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Error: {str(e)}\n")
# Stop button
with col2:
stop_button = st.button("Stop Agent Service", disabled=not st.session_state.service_running, use_container_width=True)
if stop_button and st.session_state.service_running:
try:
st.session_state.service_process.terminate()
time.sleep(1) # Give it time to terminate
if st.session_state.service_process.poll() is None:
# Force kill if still running
st.session_state.service_process.kill()
st.session_state.service_running = False
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Agent service stopped\n")
st.success("Agent service stopped successfully!")
st.rerun()
except Exception as e:
st.error(f"Error stopping service: {str(e)}")
st.session_state.output_queue.put(f"[{time.strftime('%H:%M:%S')}] Error stopping: {str(e)}\n")
# Service status indicator
status_color = "🟢" if st.session_state.service_running else "🔴"
status_text = "Running" if st.session_state.service_running else "Stopped"
st.write(f"**Service Status:** {status_color} {status_text}")
# Add auto-refresh option
auto_refresh = st.checkbox("Auto-refresh output (uncheck this before copying any error message)", value=True)
# Display output in a scrollable container
st.subheader("Service Output")
# Calculate height based on number of lines, but cap it
output_height = min(400, max(200, len(st.session_state.service_output) * 20))
# Create a scrollable container for the output
with st.container():
# Join all output lines and display in the container
output_text = "".join(st.session_state.service_output)
# For auto-scrolling, we'll use a different approach
if auto_refresh and st.session_state.service_running and output_text:
# We'll reverse the output text so the newest lines appear at the top
# This way they're always visible without needing to scroll
lines = output_text.splitlines()
reversed_lines = lines[::-1] # Reverse the lines
output_text = "\n".join(reversed_lines)
            # Add a note at the top of the reversed output to explain the ordering
note = "--- SHOWING NEWEST LOGS FIRST (AUTO-SCROLL MODE) ---\n\n"
output_text = note + output_text
# Use a text area for scrollable output
st.text_area(
label="Realtime Logs from Archon Service",
value=output_text,
height=output_height,
disabled=True,
key="output_text_area" # Use a fixed key to maintain state between refreshes
)
# Add a toggle for reversed mode
if auto_refresh and st.session_state.service_running:
st.caption("Logs are shown newest-first for auto-scrolling. Disable auto-refresh to see logs in chronological order.")
# Add a clear output button
if st.button("Clear Output"):
st.session_state.service_output = []
st.rerun()
# Auto-refresh if enabled and service is running
if auto_refresh and st.session_state.service_running:
time.sleep(0.1) # Small delay to prevent excessive CPU usage
st.rerun()

View File

@@ -0,0 +1,86 @@
from langgraph.types import Command
import streamlit as st
import uuid
import sys
import os
# Add the current directory to Python path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from archon.archon_graph import agentic_flow
@st.cache_resource
def get_thread_id():
return str(uuid.uuid4())
thread_id = get_thread_id()
async def run_agent_with_streaming(user_input: str):
"""
Run the agent with streaming text for the user_input prompt,
while maintaining the entire conversation in `st.session_state.messages`.
"""
config = {
"configurable": {
"thread_id": thread_id
}
}
# First message from user
if len(st.session_state.messages) == 1:
async for msg in agentic_flow.astream(
{"latest_user_message": user_input}, config, stream_mode="custom"
):
yield msg
# Continue the conversation
else:
async for msg in agentic_flow.astream(
Command(resume=user_input), config, stream_mode="custom"
):
yield msg
async def chat_tab():
"""Display the chat interface for talking to Archon"""
st.write("Describe to me an AI agent you want to build and I'll code it for you with Pydantic AI.")
st.write("Example: Build me an AI agent that can search the web with the Brave API.")
# Initialize chat history in session state if not present
if "messages" not in st.session_state:
st.session_state.messages = []
# Add a clear conversation button
if st.button("Clear Conversation"):
st.session_state.messages = []
st.rerun()
# Display chat messages from history on app rerun
for message in st.session_state.messages:
message_type = message["type"]
if message_type in ["human", "ai", "system"]:
with st.chat_message(message_type):
st.markdown(message["content"])
# Chat input for the user
user_input = st.chat_input("What do you want to build today?")
if user_input:
# We append a new request to the conversation explicitly
st.session_state.messages.append({"type": "human", "content": user_input})
# Display user prompt in the UI
with st.chat_message("user"):
st.markdown(user_input)
# Display assistant response in chat message container
response_content = ""
with st.chat_message("assistant"):
message_placeholder = st.empty() # Placeholder for updating the message
# Add a spinner while loading
with st.spinner("Archon is thinking..."):
# Run the async generator to fetch responses
async for chunk in run_agent_with_streaming(user_input):
response_content += chunk
# Update the placeholder with the current response content
message_placeholder.markdown(response_content)
st.session_state.messages.append({"type": "ai", "content": response_content})
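
A hypothetical sketch of how these tab functions might be wired into the main Streamlit entry point; the module paths and tab layout are assumptions, since streamlit_ui.py itself is not part of this diff:

```python
# Assumed module paths - the real file names inside the streamlit_ui package may differ.
import asyncio
import streamlit as st

from streamlit_ui.chat import chat_tab
from streamlit_ui.agent_service import agent_service_tab

chat, service = st.tabs(["Chat", "Agent Service"])
with chat:
    asyncio.run(chat_tab())  # chat_tab is async, so it is driven with asyncio.run
with service:
    agent_service_tab()
```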

View File

@@ -0,0 +1,180 @@
import streamlit as st
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import get_env_var
@st.cache_data
def load_sql_template():
"""Load the SQL template file and cache it"""
with open(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "utils", "site_pages.sql"), "r") as f:
return f.read()
def get_supabase_sql_editor_url(supabase_url):
"""Get the URL for the Supabase SQL Editor"""
try:
# Extract the project reference from the URL
# Format is typically: https://<project-ref>.supabase.co
if '//' in supabase_url and 'supabase' in supabase_url:
parts = supabase_url.split('//')
if len(parts) > 1:
domain_parts = parts[1].split('.')
if len(domain_parts) > 0:
project_ref = domain_parts[0]
return f"https://supabase.com/dashboard/project/{project_ref}/sql/new"
# Fallback to a generic URL
return "https://supabase.com/dashboard"
except Exception:
return "https://supabase.com/dashboard"
def show_manual_sql_instructions(sql, vector_dim, recreate=False):
"""Show instructions for manually executing SQL in Supabase"""
st.info("### Manual SQL Execution Instructions")
# Provide a link to the Supabase SQL Editor
supabase_url = get_env_var("SUPABASE_URL")
if supabase_url:
dashboard_url = get_supabase_sql_editor_url(supabase_url)
st.markdown(f"**Step 1:** [Open Your Supabase SQL Editor with this URL]({dashboard_url})")
else:
st.markdown("**Step 1:** Open your Supabase Dashboard and navigate to the SQL Editor")
st.markdown("**Step 2:** Create a new SQL query")
if recreate:
st.markdown("**Step 3:** Copy and execute the following SQL:")
drop_sql = f"DROP FUNCTION IF EXISTS match_site_pages(vector({vector_dim}), int, jsonb);\nDROP TABLE IF EXISTS site_pages CASCADE;"
st.code(drop_sql, language="sql")
st.markdown("**Step 4:** Then copy and execute this SQL:")
st.code(sql, language="sql")
else:
st.markdown("**Step 3:** Copy and execute the following SQL:")
st.code(sql, language="sql")
st.success("After executing the SQL, return to this page and refresh to see the updated table status.")
def database_tab(supabase):
"""Display the database configuration interface"""
st.header("Database Configuration")
st.write("Set up and manage your Supabase database tables for Archon.")
# Check if Supabase is configured
if not supabase:
st.error("Supabase is not configured. Please set your Supabase URL and Service Key in the Environment tab.")
return
# Site Pages Table Setup
st.subheader("Site Pages Table")
st.write("This table stores web page content and embeddings for semantic search.")
# Add information about the table
with st.expander("About the Site Pages Table", expanded=False):
st.markdown("""
This table is used to store:
- Web page content split into chunks
- Vector embeddings for semantic search
- Metadata for filtering results
The table includes:
- URL and chunk number (unique together)
- Title and summary of the content
- Full text content
- Vector embeddings for similarity search
- Metadata in JSON format
It also creates:
- A vector similarity search function
- Appropriate indexes for performance
- Row-level security policies for Supabase
""")
# Check if the table already exists
table_exists = False
table_has_data = False
try:
# Try to query the table to see if it exists
response = supabase.table("site_pages").select("id").limit(1).execute()
table_exists = True
# Check if the table has data
count_response = supabase.table("site_pages").select("*", count="exact").execute()
row_count = count_response.count if hasattr(count_response, 'count') else 0
table_has_data = row_count > 0
st.success("✅ The site_pages table already exists in your database.")
if table_has_data:
st.info(f"The table contains data ({row_count} rows).")
else:
st.info("The table exists but contains no data.")
except Exception as e:
error_str = str(e)
if "relation" in error_str and "does not exist" in error_str:
st.info("The site_pages table does not exist yet. You can create it below.")
else:
st.error(f"Error checking table status: {error_str}")
st.info("Proceeding with the assumption that the table needs to be created.")
table_exists = False
# Vector dimensions selection
st.write("### Vector Dimensions")
st.write("Select the embedding dimensions based on your embedding model:")
vector_dim = st.selectbox(
"Embedding Dimensions",
options=[1536, 768, 384, 1024],
index=0,
help="Use 1536 for OpenAI embeddings, 768 for nomic-embed-text with Ollama, or select another dimension based on your model."
)
# Get the SQL with the selected vector dimensions
sql_template = load_sql_template()
# Replace the vector dimensions in the SQL
sql = sql_template.replace("vector(1536)", f"vector({vector_dim})")
# Also update the match_site_pages function dimensions
sql = sql.replace("query_embedding vector(1536)", f"query_embedding vector({vector_dim})")
# Show the SQL
with st.expander("View SQL", expanded=False):
st.code(sql, language="sql")
# Create table button
if not table_exists:
if st.button("Get Instructions for Creating Site Pages Table"):
show_manual_sql_instructions(sql, vector_dim)
else:
# Option to recreate the table or clear data
col1, col2 = st.columns(2)
with col1:
st.warning("⚠️ Recreating will delete all existing data.")
if st.button("Get Instructions for Recreating Site Pages Table"):
show_manual_sql_instructions(sql, vector_dim, recreate=True)
with col2:
if table_has_data:
st.warning("⚠️ Clear all data but keep structure.")
if st.button("Clear Table Data"):
try:
with st.spinner("Clearing table data..."):
# Use the Supabase client to delete all rows
response = supabase.table("site_pages").delete().neq("id", 0).execute()
st.success("✅ Table data cleared successfully!")
st.rerun()
except Exception as e:
st.error(f"Error clearing table data: {str(e)}")
# Fall back to manual SQL
truncate_sql = "TRUNCATE TABLE site_pages;"
st.code(truncate_sql, language="sql")
st.info("Execute this SQL in your Supabase SQL Editor to clear the table data.")
# Provide a link to the Supabase SQL Editor
supabase_url = get_env_var("SUPABASE_URL")
if supabase_url:
dashboard_url = get_supabase_sql_editor_url(supabase_url)
st.markdown(f"[Open Your Supabase SQL Editor with this URL]({dashboard_url})")

View File

@@ -0,0 +1,158 @@
import streamlit as st
import time
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from archon.crawl_pydantic_ai_docs import start_crawl_with_requests, clear_existing_records
from utils.utils import get_env_var, create_new_tab_button
def documentation_tab(supabase_client):
"""Display the documentation interface"""
st.header("Documentation")
# Create tabs for different documentation sources
doc_tabs = st.tabs(["Pydantic AI Docs", "Future Sources"])
with doc_tabs[0]:
st.subheader("Pydantic AI Documentation")
st.markdown("""
This section allows you to crawl and index the Pydantic AI documentation.
The crawler will:
1. Fetch URLs from the Pydantic AI sitemap
2. Crawl each page and extract content
3. Split content into chunks
4. Generate embeddings for each chunk
5. Store the chunks in the Supabase database
This process may take several minutes depending on the number of pages.
""")
# Check if the database is configured
supabase_url = get_env_var("SUPABASE_URL")
supabase_key = get_env_var("SUPABASE_SERVICE_KEY")
if not supabase_url or not supabase_key:
st.warning("⚠️ Supabase is not configured. Please set up your environment variables first.")
create_new_tab_button("Go to Environment Section", "Environment", key="goto_env_from_docs")
else:
# Initialize session state for tracking crawl progress
if "crawl_tracker" not in st.session_state:
st.session_state.crawl_tracker = None
if "crawl_status" not in st.session_state:
st.session_state.crawl_status = None
if "last_update_time" not in st.session_state:
st.session_state.last_update_time = time.time()
# Create columns for the buttons
col1, col2 = st.columns(2)
with col1:
# Button to start crawling
if st.button("Crawl Pydantic AI Docs", key="crawl_pydantic") and not (st.session_state.crawl_tracker and st.session_state.crawl_tracker.is_running):
try:
# Define a callback function to update the session state
def update_progress(status):
st.session_state.crawl_status = status
# Start the crawling process in a separate thread
st.session_state.crawl_tracker = start_crawl_with_requests(update_progress)
st.session_state.crawl_status = st.session_state.crawl_tracker.get_status()
# Force a rerun to start showing progress
st.rerun()
except Exception as e:
st.error(f"❌ Error starting crawl: {str(e)}")
with col2:
# Button to clear existing Pydantic AI docs
if st.button("Clear Pydantic AI Docs", key="clear_pydantic"):
with st.spinner("Clearing existing Pydantic AI docs..."):
try:
# Run the function to clear records
clear_existing_records()
st.success("✅ Successfully cleared existing Pydantic AI docs from the database.")
# Force a rerun to update the UI
st.rerun()
except Exception as e:
st.error(f"❌ Error clearing Pydantic AI docs: {str(e)}")
# Display crawling progress if a crawl is in progress or has completed
if st.session_state.crawl_tracker:
# Create a container for the progress information
progress_container = st.container()
with progress_container:
# Get the latest status
current_time = time.time()
# Update status every second
if current_time - st.session_state.last_update_time >= 1:
st.session_state.crawl_status = st.session_state.crawl_tracker.get_status()
st.session_state.last_update_time = current_time
status = st.session_state.crawl_status
# Display a progress bar
if status and status["urls_found"] > 0:
progress = status["urls_processed"] / status["urls_found"]
st.progress(progress)
# Display status metrics
col1, col2, col3, col4 = st.columns(4)
if status:
col1.metric("URLs Found", status["urls_found"])
col2.metric("URLs Processed", status["urls_processed"])
col3.metric("Successful", status["urls_succeeded"])
col4.metric("Failed", status["urls_failed"])
else:
col1.metric("URLs Found", 0)
col2.metric("URLs Processed", 0)
col3.metric("Successful", 0)
col4.metric("Failed", 0)
# Display logs in an expander
with st.expander("Crawling Logs", expanded=True):
if status and "logs" in status:
logs_text = "\n".join(status["logs"][-20:]) # Show last 20 logs
st.code(logs_text)
else:
st.code("No logs available yet...")
# Show completion message
if status and not status["is_running"] and status["end_time"]:
if status["urls_failed"] == 0:
st.success("✅ Crawling process completed successfully!")
else:
st.warning(f"⚠️ Crawling process completed with {status['urls_failed']} failed URLs.")
# Auto-refresh while crawling is in progress
if not status or status["is_running"]:
st.rerun()
# Display database statistics
st.subheader("Database Statistics")
try:
# Query the count of Pydantic AI docs
result = supabase_client.table("site_pages").select("count", count="exact").eq("metadata->>source", "pydantic_ai_docs").execute()
count = result.count if hasattr(result, "count") else 0
# Display the count
st.metric("Pydantic AI Docs Chunks", count)
# Add a button to view the data
if count > 0 and st.button("View Indexed Data", key="view_pydantic_data"):
# Query a sample of the data
sample_data = supabase_client.table("site_pages").select("url,title,summary,chunk_number").eq("metadata->>source", "pydantic_ai_docs").limit(10).execute()
# Display the sample data
st.dataframe(sample_data.data)
st.info("Showing up to 10 sample records. The database contains more records.")
except Exception as e:
st.error(f"Error querying database: {str(e)}")
with doc_tabs[1]:
st.info("Additional documentation sources will be available in future updates.")

View File

@@ -0,0 +1,362 @@
import streamlit as st
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import (
get_env_var, save_env_var, reload_archon_graph,
get_current_profile, set_current_profile, get_all_profiles,
create_profile, delete_profile, get_profile_env_vars
)
def environment_tab():
# Get all available profiles and current profile
profiles = get_all_profiles()
current_profile = get_current_profile()
# Profile management section
st.subheader("Profile Management")
st.write("Profiles allow you to store different sets of environment variables for different providers or use cases.")
col1, col2 = st.columns([3, 1])
with col1:
# Profile selector
selected_profile = st.selectbox(
"Select Profile",
options=profiles,
index=profiles.index(current_profile) if current_profile in profiles else 0,
key="profile_selector"
)
if selected_profile != current_profile:
if set_current_profile(selected_profile):
# Clear provider session state variables to force them to reload from the new profile
if "llm_provider" in st.session_state:
del st.session_state.llm_provider
if "embedding_provider" in st.session_state:
del st.session_state.embedding_provider
st.success(f"Switched to profile: {selected_profile}, reloading...")
reload_archon_graph(show_reload_success=False)
st.rerun()
else:
st.error("Failed to switch profile.")
with col2:
# Add CSS for precise margin control
st.markdown("""
<style>
div[data-testid="stChatInput"] {
margin-top: 10px !important;
}
</style>
""", unsafe_allow_html=True)
# New profile creation with CSS applied directly to the chat input
new_profile_name = st.chat_input("New Profile Name", key="new_profile_name")
# Add a button to create the profile
if new_profile_name:
if new_profile_name in profiles:
st.error(f"Profile '{new_profile_name}' already exists.")
else:
if create_profile(new_profile_name):
# Clear provider session state variables for the new profile
if "llm_provider" in st.session_state:
del st.session_state.llm_provider
if "embedding_provider" in st.session_state:
del st.session_state.embedding_provider
st.success(f"Created profile: {new_profile_name}")
st.rerun()
else:
st.error("Failed to create profile.")
# Delete profile option (not for default)
if selected_profile != "default" and selected_profile == current_profile:
if st.button("Delete Current Profile", key="delete_profile"):
if delete_profile(selected_profile):
# Clear provider session state variables to force them to reload from the default profile
if "llm_provider" in st.session_state:
del st.session_state.llm_provider
if "embedding_provider" in st.session_state:
del st.session_state.embedding_provider
st.success(f"Deleted profile: {selected_profile}, reloading...")
reload_archon_graph(show_reload_success=False)
st.rerun()
else:
st.error("Failed to delete profile.")
st.markdown("---")
# Environment variables section
st.subheader(f"Environment Variables for Profile: {current_profile}")
st.write("- Configure your environment variables for Archon. These settings will be saved and used for future sessions.")
st.write("- NOTE: Press 'enter' to save after inputting a variable, otherwise click the 'save' button at the bottom.")
st.write("- HELP: Hover over the '?' icon on the right for each environment variable for help/examples.")
st.warning("⚠️ If your agent service for MCP is already running, you'll need to restart it after changing environment variables.")
# Get current profile's environment variables
profile_env_vars = get_profile_env_vars()
# Define default URLs for providers
llm_default_urls = {
"OpenAI": "https://api.openai.com/v1",
"Anthropic": "https://api.anthropic.com/v1",
"OpenRouter": "https://openrouter.ai/api/v1",
"Ollama": "http://localhost:11434/v1"
}
embedding_default_urls = {
"OpenAI": "https://api.openai.com/v1",
"Ollama": "http://localhost:11434/v1"
}
# Initialize session state for provider selections if not already set
if "llm_provider" not in st.session_state:
st.session_state.llm_provider = profile_env_vars.get("LLM_PROVIDER", "OpenAI")
if "embedding_provider" not in st.session_state:
st.session_state.embedding_provider = profile_env_vars.get("EMBEDDING_PROVIDER", "OpenAI")
# 1. Large Language Models Section - Provider Selection (outside form)
st.subheader("1. Select Your LLM Provider")
# LLM Provider dropdown
llm_providers = ["OpenAI", "Anthropic", "OpenRouter", "Ollama"]
selected_llm_provider = st.selectbox(
"LLM Provider",
options=llm_providers,
index=llm_providers.index(st.session_state.llm_provider) if st.session_state.llm_provider in llm_providers else 0,
key="llm_provider_selector"
)
# Update session state if provider changed
if selected_llm_provider != st.session_state.llm_provider:
st.session_state.llm_provider = selected_llm_provider
st.rerun() # Force a rerun to update the form
# 2. Embedding Models Section - Provider Selection (outside form)
st.subheader("2. Select Your Embedding Model Provider")
# Embedding Provider dropdown
embedding_providers = ["OpenAI", "Ollama"]
selected_embedding_provider = st.selectbox(
"Embedding Provider",
options=embedding_providers,
index=embedding_providers.index(st.session_state.embedding_provider) if st.session_state.embedding_provider in embedding_providers else 0,
key="embedding_provider_selector"
)
# Update session state if provider changed
if selected_embedding_provider != st.session_state.embedding_provider:
st.session_state.embedding_provider = selected_embedding_provider
st.rerun() # Force a rerun to update the form
# 3. Set environment variables (within the form)
st.subheader("3. Set All Environment Variables")
# Create a form for the environment variables
with st.form("env_vars_form"):
updated_values = {}
# Store the selected providers in the updated values
updated_values["LLM_PROVIDER"] = selected_llm_provider
updated_values["EMBEDDING_PROVIDER"] = selected_embedding_provider
# 1. Large Language Models Section - Settings
st.subheader("LLM Settings")
# BASE_URL
base_url_help = "Base URL for your LLM provider:\n\n" + \
"OpenAI: https://api.openai.com/v1\n\n" + \
"Anthropic: https://api.anthropic.com/v1\n\n" + \
"OpenRouter: https://openrouter.ai/api/v1\n\n" + \
"Ollama: http://localhost:11434/v1"
# Get current BASE_URL or use default for selected provider
current_base_url = profile_env_vars.get("BASE_URL", llm_default_urls.get(selected_llm_provider, ""))
# If provider changed or BASE_URL is empty, use the default
if not current_base_url or profile_env_vars.get("LLM_PROVIDER", "") != selected_llm_provider:
current_base_url = llm_default_urls.get(selected_llm_provider, "")
llm_base_url = st.text_input(
"BASE_URL:",
value=current_base_url,
help=base_url_help,
key="input_BASE_URL"
)
updated_values["BASE_URL"] = llm_base_url
# API_KEY
api_key_help = "API key for your LLM provider:\n\n" + \
"For OpenAI: https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key\n\n" + \
"For Anthropic: https://console.anthropic.com/account/keys\n\n" + \
"For OpenRouter: https://openrouter.ai/keys\n\n" + \
"For Ollama, no need to set this unless you specifically configured an API key"
# Get current API_KEY or set default for Ollama
current_api_key = profile_env_vars.get("LLM_API_KEY", "")
# If provider is Ollama and LLM_API_KEY is empty or provider changed, set to NOT_REQUIRED
if selected_llm_provider == "Ollama" and (not current_api_key or profile_env_vars.get("LLM_PROVIDER", "") != selected_llm_provider):
current_api_key = "NOT_REQUIRED"
        # If there's already a value, indicate that in the placeholder without revealing it
placeholder = current_api_key if current_api_key == "NOT_REQUIRED" else "Set but hidden" if current_api_key else ""
api_key = st.text_input(
"API_KEY:",
type="password" if current_api_key != "NOT_REQUIRED" else "default",
help=api_key_help,
key="input_LLM_API_KEY",
placeholder=placeholder
)
# Only update if user entered something (to avoid overwriting with empty string)
if api_key:
updated_values["LLM_API_KEY"] = api_key
elif selected_llm_provider == "Ollama" and (not current_api_key or current_api_key == "NOT_REQUIRED"):
updated_values["LLM_API_KEY"] = "NOT_REQUIRED"
# PRIMARY_MODEL
primary_model_help = "The LLM you want to use for the primary agent/coder\n\n" + \
"Example: gpt-4o-mini\n\n" + \
"Example: qwen2.5:14b-instruct-8k"
primary_model = st.text_input(
"PRIMARY_MODEL:",
value=profile_env_vars.get("PRIMARY_MODEL", ""),
help=primary_model_help,
key="input_PRIMARY_MODEL"
)
updated_values["PRIMARY_MODEL"] = primary_model
# REASONER_MODEL
reasoner_model_help = "The LLM you want to use for the reasoner\n\n" + \
"Example: o3-mini\n\n" + \
"Example: deepseek-r1:7b-8k"
reasoner_model = st.text_input(
"REASONER_MODEL:",
value=profile_env_vars.get("REASONER_MODEL", ""),
help=reasoner_model_help,
key="input_REASONER_MODEL"
)
updated_values["REASONER_MODEL"] = reasoner_model
st.markdown("---")
# 2. Embedding Models Section - Settings
st.subheader("Embedding Settings")
# EMBEDDING_BASE_URL
embedding_base_url_help = "Base URL for your embedding provider:\n\n" + \
"OpenAI: https://api.openai.com/v1\n\n" + \
"Ollama: http://localhost:11434/v1"
# Get current EMBEDDING_BASE_URL or use default for selected provider
current_embedding_base_url = profile_env_vars.get("EMBEDDING_BASE_URL", embedding_default_urls.get(selected_embedding_provider, ""))
# If provider changed or EMBEDDING_BASE_URL is empty, use the default
if not current_embedding_base_url or profile_env_vars.get("EMBEDDING_PROVIDER", "") != selected_embedding_provider:
current_embedding_base_url = embedding_default_urls.get(selected_embedding_provider, "")
embedding_base_url = st.text_input(
"EMBEDDING_BASE_URL:",
value=current_embedding_base_url,
help=embedding_base_url_help,
key="input_EMBEDDING_BASE_URL"
)
updated_values["EMBEDDING_BASE_URL"] = embedding_base_url
# EMBEDDING_API_KEY
embedding_api_key_help = "API key for your embedding provider:\n\n" + \
"For OpenAI: https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key\n\n" + \
"For Ollama, no need to set this unless you specifically configured an API key"
# Get current EMBEDDING_API_KEY or set default for Ollama
current_embedding_api_key = profile_env_vars.get("EMBEDDING_API_KEY", "")
# If provider is Ollama and EMBEDDING_API_KEY is empty or provider changed, set to NOT_REQUIRED
if selected_embedding_provider == "Ollama" and (not current_embedding_api_key or profile_env_vars.get("EMBEDDING_PROVIDER", "") != selected_embedding_provider):
current_embedding_api_key = "NOT_REQUIRED"
        # If there's already a value, indicate that in the placeholder without revealing it
placeholder = "Set but hidden" if current_embedding_api_key else ""
embedding_api_key = st.text_input(
"EMBEDDING_API_KEY:",
type="password",
help=embedding_api_key_help,
key="input_EMBEDDING_API_KEY",
placeholder=placeholder
)
# Only update if user entered something (to avoid overwriting with empty string)
if embedding_api_key:
updated_values["EMBEDDING_API_KEY"] = embedding_api_key
elif selected_embedding_provider == "Ollama" and (not current_embedding_api_key or current_embedding_api_key == "NOT_REQUIRED"):
updated_values["EMBEDDING_API_KEY"] = "NOT_REQUIRED"
# EMBEDDING_MODEL
embedding_model_help = "Embedding model you want to use\n\n" + \
"Example for Ollama: nomic-embed-text\n\n" + \
"Example for OpenAI: text-embedding-3-small"
embedding_model = st.text_input(
"EMBEDDING_MODEL:",
value=profile_env_vars.get("EMBEDDING_MODEL", ""),
help=embedding_model_help,
key="input_EMBEDDING_MODEL"
)
updated_values["EMBEDDING_MODEL"] = embedding_model
st.markdown("---")
        # Database Section
        st.subheader("Database Settings")
# SUPABASE_URL
supabase_url_help = "Get your SUPABASE_URL from the API section of your Supabase project settings -\nhttps://supabase.com/dashboard/project/<your project ID>/settings/api"
supabase_url = st.text_input(
"SUPABASE_URL:",
value=profile_env_vars.get("SUPABASE_URL", ""),
help=supabase_url_help,
key="input_SUPABASE_URL"
)
updated_values["SUPABASE_URL"] = supabase_url
# SUPABASE_SERVICE_KEY
supabase_key_help = "Get your SUPABASE_SERVICE_KEY from the API section of your Supabase project settings -\nhttps://supabase.com/dashboard/project/<your project ID>/settings/api\nOn this page it is called the service_role secret."
        # If there's already a value, indicate that in the placeholder without revealing it
placeholder = "Set but hidden" if profile_env_vars.get("SUPABASE_SERVICE_KEY", "") else ""
supabase_key = st.text_input(
"SUPABASE_SERVICE_KEY:",
type="password",
help=supabase_key_help,
key="input_SUPABASE_SERVICE_KEY",
placeholder=placeholder
)
# Only update if user entered something (to avoid overwriting with empty string)
if supabase_key:
updated_values["SUPABASE_SERVICE_KEY"] = supabase_key
# Submit button
submitted = st.form_submit_button("Save Environment Variables")
if submitted:
# Save all updated values to the current profile
success = True
for var_name, value in updated_values.items():
if value or var_name in ["LLM_API_KEY", "EMBEDDING_API_KEY"]: # Allow empty strings for API keys (they might be intentionally cleared)
if not save_env_var(var_name, value):
success = False
st.error(f"Failed to save {var_name}.")
if success:
st.success(f"Environment variables saved successfully to profile: {current_profile}!")
reload_archon_graph()

View File

@@ -0,0 +1,831 @@
import streamlit as st
def future_enhancements_tab():
    """Display the future enhancements and integrations interface"""
st.write("## Future Enhancements")
st.write("Explore what's coming next for Archon - from specialized multi-agent workflows to autonomous framework learning.")
# Future Iterations section
st.write("### Future Iterations")
# V5: Multi-Agent Coding Workflow
with st.expander("V5: Multi-Agent Coding Workflow"):
st.write("Specialized agents for different parts of the agent creation process")
# Create a visual representation of multi-agent workflow
st.write("#### Multi-Agent Coding Architecture")
# Describe the parallel architecture
st.markdown("""
The V5 architecture introduces specialized parallel agents that work simultaneously on different aspects of agent creation:
1. **Reasoner Agent**: Analyzes requirements and plans the overall agent architecture
2. **Parallel Coding Agents**:
- **Prompt Engineering Agent**: Designs optimal prompts for the agent
- **Tool Definition Agent**: Creates tool specifications and interfaces
- **Dependencies Agent**: Identifies required libraries and dependencies
- **Model Selection Agent**: Determines the best model configuration
3. **Final Coding Agent**: Integrates all components into a cohesive agent
4. **Human-in-the-Loop**: Iterative refinement with the final coding agent
""")
# Display parallel agents
st.write("#### Parallel Coding Agents")
col1, col2, col3, col4 = st.columns(4)
with col1:
st.info("**Prompt Engineering Agent**\n\nDesigns optimal prompts for different agent scenarios")
with col2:
st.success("**Tool Definition Agent**\n\nCreates tool specifications and interfaces")
with col3:
st.warning("**Dependencies Agent**\n\nIdentifies required libraries and dependencies")
with col4:
st.error("**Model Selection Agent**\n\nDetermines the best model configuration")
# Updated flow chart visualization with better colors for ovals
st.graphviz_chart('''
digraph {
rankdir=LR;
node [shape=box, style=filled, color=lightblue];
User [label="User Request", shape=ellipse, style=filled, color=purple, fontcolor=black];
Reasoner [label="Reasoner\nAgent"];
subgraph cluster_parallel {
label = "Parallel Coding Agents";
color = lightgrey;
style = filled;
Prompt [label="Prompt\nEngineering\nAgent", color=lightskyblue];
Tools [label="Tool\nDefinition\nAgent", color=green];
Dependencies [label="Dependencies\nAgent", color=yellow];
Model [label="Model\nSelection\nAgent", color=pink];
}
Final [label="Final\nCoding\nAgent"];
Human [label="Human-in-the-Loop\nIteration", shape=ellipse, style=filled, color=orange, fontcolor=black];
User -> Reasoner;
Reasoner -> Prompt;
Reasoner -> Tools;
Reasoner -> Dependencies;
Reasoner -> Model;
Prompt -> Final;
Tools -> Final;
Dependencies -> Final;
Model -> Final;
Final -> Human;
Human -> Final [label="Feedback Loop", color=red, constraint=false];
}
''')
st.write("#### Benefits of Parallel Agent Architecture")
st.markdown("""
- **Specialization**: Each agent focuses on its area of expertise
- **Efficiency**: Parallel processing reduces overall development time
- **Quality**: Specialized agents produce higher quality components
- **Flexibility**: Easy to add new specialized agents as needed
- **Scalability**: Architecture can handle complex agent requirements
""")
# V6: Tool Library and Example Integration
with st.expander("V6: Tool Library and Example Integration"):
st.write("Pre-built external tool and agent examples incorporation")
st.write("""
With pre-built tools, the agent can pull full functions from the tool library so it doesn't have to
create them from scratch. On top of that, pre-built agents will give Archon a starting point
so it doesn't have to build the agent structure from scratch either.
""")
st.write("#### Example Integration Configuration")
# Add tabs for different aspects of V6
tool_tab, example_tab = st.tabs(["Tool Library", "Example Agents"])
with tool_tab:
st.write("##### Example Tool Library Config (could be a RAG implementation too, still deciding)")
sample_config = """
{
"tool_library": {
"web_tools": {
"web_search": {
"type": "search_engine",
"api_key_env": "SEARCH_API_KEY",
"description": "Search the web for information"
},
"web_browser": {
"type": "browser",
"description": "Navigate web pages and extract content"
}
},
"data_tools": {
"database_query": {
"type": "sql_executor",
"description": "Execute SQL queries against databases"
},
"data_analysis": {
"type": "pandas_processor",
"description": "Analyze data using pandas"
}
},
"ai_service_tools": {
"image_generation": {
"type": "text_to_image",
"api_key_env": "IMAGE_GEN_API_KEY",
"description": "Generate images from text descriptions"
},
"text_to_speech": {
"type": "tts_converter",
"api_key_env": "TTS_API_KEY",
"description": "Convert text to spoken audio"
}
}
}
}
"""
st.code(sample_config, language="json")
st.write("##### Pydantic AI Tool Definition Example")
pydantic_tool_example = """
from pydantic_ai import Agent, RunContext, Tool
from typing import Union, List, Dict, Any
import requests
@agent.tool
async def weather_tool(ctx: RunContext[Dict[str, Any]], location: str) -> str:
\"\"\"Get current weather information for a location.
Args:
location: The city and state/country (e.g., 'San Francisco, CA')
Returns:
A string with current weather conditions and temperature
\"\"\"
api_key = ctx.deps.get("WEATHER_API_KEY")
if not api_key:
return "Error: Weather API key not configured"
try:
url = f"https://api.weatherapi.com/v1/current.json?key={api_key}&q={location}"
response = requests.get(url)
data = response.json()
if "error" in data:
return f"Error: {data['error']['message']}"
current = data["current"]
location_name = f"{data['location']['name']}, {data['location']['country']}"
condition = current["condition"]["text"]
temp_c = current["temp_c"]
temp_f = current["temp_f"]
humidity = current["humidity"]
return f"Weather in {location_name}: {condition}, {temp_c}°C ({temp_f}°F), {humidity}% humidity"
except Exception as e:
return f"Error retrieving weather data: {str(e)}"
"""
st.code(pydantic_tool_example, language="python")
st.write("##### Tool Usage in Agent")
tool_usage_example = """
async def use_weather_tool(location: str) -> str:
\"\"\"Search for weather information\"\"\"
tool = agent.get_tool("get_weather")
result = await tool.execute({"location": location})
return result.content
"""
st.code(tool_usage_example, language="python")
with example_tab:
st.write("##### Example Agents")
st.markdown("""
V6 will include pre-built example agents that serve as templates and learning resources. These examples will be baked directly into agent prompts to improve results and consistency.
**Benefits of Example Agents:**
- Provide concrete implementation patterns for common agent types
- Demonstrate best practices for tool usage and error handling
- Serve as starting points that can be customized for specific needs
- Improve consistency in agent behavior and output format
- Reduce the learning curve for new users
""")
st.write("##### Example Agent Types")
example_agents = {
"Research Assistant": {
"description": "Performs comprehensive research on topics using web search and content analysis",
"tools": ["web_search", "web_browser", "summarization"],
"example_prompt": "Research the latest advancements in quantum computing and provide a summary"
},
"Data Analyst": {
"description": "Analyzes datasets, generates visualizations, and provides insights",
"tools": ["database_query", "data_analysis", "chart_generation"],
"example_prompt": "Analyze this sales dataset and identify key trends over the past quarter"
},
"Content Creator": {
"description": "Generates various types of content including text, images, and code",
"tools": ["text_generation", "image_generation", "code_generation"],
"example_prompt": "Create a blog post about sustainable living with accompanying images"
},
"Conversational Assistant": {
"description": "Engages in helpful, informative conversations with natural dialogue",
"tools": ["knowledge_base", "memory_management", "personalization"],
"example_prompt": "I'd like to learn more about machine learning. Where should I start?"
}
}
# Create a table of example agents
example_data = {
"Agent Type": list(example_agents.keys()),
"Description": [example_agents[a]["description"] for a in example_agents],
"Core Tools": [", ".join(example_agents[a]["tools"]) for a in example_agents]
}
st.dataframe(example_data, use_container_width=True)
st.write("##### Example Agent Implementation")
st.code("""
# Example Weather Agent based on Pydantic AI documentation
from pydantic_ai import Agent, RunContext
from typing import Dict, Any
from dataclasses import dataclass
from httpx import AsyncClient
@dataclass
class WeatherDeps:
client: AsyncClient
weather_api_key: str | None
geo_api_key: str | None
# Create the agent with appropriate system prompt
weather_agent = Agent(
'openai:gpt-4o',
system_prompt=(
'Be concise, reply with one sentence. '
'Use the `get_lat_lng` tool to get the latitude and longitude of locations, '
'then use the `get_weather` tool to get the weather.'
),
deps_type=WeatherDeps,
)
@weather_agent.tool
async def get_lat_lng(ctx: RunContext[WeatherDeps], location_description: str) -> Dict[str, float]:
\"\"\"Get the latitude and longitude of a location.
Args:
location_description: A description of a location (e.g., 'London, UK')
Returns:
Dictionary with lat and lng keys
\"\"\"
if ctx.deps.geo_api_key is None:
# Return dummy data if no API key
return {'lat': 51.1, 'lng': -0.1}
# Call geocoding API
params = {'q': location_description, 'api_key': ctx.deps.geo_api_key}
r = await ctx.deps.client.get('https://geocode.maps.co/search', params=params)
r.raise_for_status()
data = r.json()
if data:
return {'lat': float(data[0]['lat']), 'lng': float(data[0]['lon'])}
else:
return {'error': 'Location not found'}
@weather_agent.tool
async def get_weather(ctx: RunContext[WeatherDeps], lat: float, lng: float) -> Dict[str, Any]:
\"\"\"Get the weather at a location.
Args:
lat: Latitude of the location
lng: Longitude of the location
Returns:
Dictionary with temperature and description
\"\"\"
if ctx.deps.weather_api_key is None:
# Return dummy data if no API key
return {'temperature': '21°C', 'description': 'Sunny'}
# Call weather API
params = {
'apikey': ctx.deps.weather_api_key,
'location': f'{lat},{lng}',
'units': 'metric',
}
r = await ctx.deps.client.get(
'https://api.tomorrow.io/v4/weather/realtime',
params=params
)
r.raise_for_status()
data = r.json()
values = data['data']['values']
weather_codes = {
1000: 'Clear, Sunny',
1100: 'Mostly Clear',
1101: 'Partly Cloudy',
4001: 'Rain',
5000: 'Snow',
8000: 'Thunderstorm',
}
return {
'temperature': f'{values["temperatureApparent"]:0.0f}°C',
'description': weather_codes.get(values['weatherCode'], 'Unknown'),
}
# Example usage
async def get_weather_report(location: str) -> str:
\"\"\"Get weather report for a location.\"\"\"
async with AsyncClient() as client:
deps = WeatherDeps(
client=client,
weather_api_key="YOUR_API_KEY", # Replace with actual key
geo_api_key="YOUR_API_KEY", # Replace with actual key
)
result = await weather_agent.run(
f"What is the weather like in {location}?",
deps=deps
)
return result.data
""", language="python")
st.info("""
**In-Context Learning with Examples**
These example agents will be used in the system prompt for Archon, providing concrete examples that help the LLM understand the expected structure and quality of agent code. This approach leverages in-context learning to significantly improve code generation quality and consistency.
""")
# V7: LangGraph Documentation
with st.expander("V7: LangGraph Documentation"):
st.write("Integrating LangGraph for complex agent workflows")
st.markdown("""
### Pydantic AI vs LangGraph with Pydantic AI
V7 will integrate LangGraph to enable complex agent workflows while maintaining compatibility with Pydantic AI agents.
This allows for creating sophisticated multi-agent systems with well-defined state management and workflow control.
""")
col1, col2 = st.columns(2)
with col1:
st.markdown("#### Pydantic AI Agent")
st.markdown("Simple, standalone agent with tools")
pydantic_agent_code = """
# Simple Pydantic AI Weather Agent
from pydantic_ai import Agent, RunContext
from typing import Dict, Any
from dataclasses import dataclass
from httpx import AsyncClient
@dataclass
class WeatherDeps:
client: AsyncClient
weather_api_key: str | None
# Create the agent
weather_agent = Agent(
'openai:gpt-4o',
system_prompt="You provide weather information.",
deps_type=WeatherDeps,
)
@weather_agent.tool
async def get_weather(
ctx: RunContext[WeatherDeps],
location: str
) -> Dict[str, Any]:
\"\"\"Get weather for a location.\"\"\"
# Implementation details...
return {"temperature": "21°C", "description": "Sunny"}
# Usage
async def main():
async with AsyncClient() as client:
deps = WeatherDeps(
client=client,
weather_api_key="API_KEY"
)
result = await weather_agent.run(
"What's the weather in London?",
deps=deps
)
print(result.data)
"""
st.code(pydantic_agent_code, language="python")
with col2:
st.markdown("#### LangGraph with Pydantic AI Agent")
st.markdown("Complex workflow using Pydantic AI agents in a graph")
langgraph_code = """
# LangGraph with Pydantic AI Agents
from pydantic_ai import Agent, RunContext
from typing import TypedDict, Literal
from dataclasses import dataclass
from httpx import AsyncClient
from langgraph.graph import StateGraph, START, END
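# Assumes weather_agent and WeatherDeps are defined as in the Pydantic AI example on the left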
# Define state for LangGraph
class GraphState(TypedDict):
query: str
weather_result: str
verified: bool
response: str
# Create a verifier agent
verifier_agent = Agent(
'openai:gpt-4o',
system_prompt=(
"You verify weather information for accuracy and completeness. "
"Check if the weather report includes temperature, conditions, "
"and is properly formatted."
)
)
# Define nodes for the graph
async def get_weather_info(state: GraphState) -> GraphState:
\"\"\"Use the weather agent to get weather information.\"\"\"
# Simply use the weather agent directly
async with AsyncClient() as client:
deps = WeatherDeps(
client=client,
weather_api_key="API_KEY"
)
result = await weather_agent.run(
state["query"],
deps=deps
)
return {"weather_result": result.data}
async def verify_information(state: GraphState) -> GraphState:
\"\"\"Use the verifier agent to check the weather information.\"\"\"
result = await verifier_agent.run(
f"Verify this weather information: {state['weather_result']}"
)
# Simple verification logic
verified = "accurate" in result.data.lower()
return {"verified": verified}
async def route(state: GraphState) -> Literal["regenerate", "finalize"]:
"\"\"Decide whether to regenerate or finalize based on verification.\"\"\"
if state["verified"]:
return "finalize"
else:
return "regenerate"
async def regenerate_response(state: GraphState) -> GraphState:
\"\"\"Regenerate a better response if verification failed.\"\"\"
result = await weather_agent.run(
f"Please provide more detailed weather information for: {state['query']}"
)
return {"weather_result": result.data, "verified": True}
async def finalize_response(state: GraphState) -> GraphState:
\"\"\"Format the final response.\"\"\"
return {"response": f"Verified Weather Report: {state['weather_result']}"}
# Build the graph
workflow = StateGraph(GraphState)
# Add nodes
workflow.add_node("get_weather", get_weather_info)
workflow.add_node("verify", verify_information)
workflow.add_node("regenerate", regenerate_response)
workflow.add_node("finalize", finalize_response)
# Add edges
workflow.add_edge(START, "get_weather")
workflow.add_edge("get_weather", "verify")
# Add conditional edges based on verification
workflow.add_conditional_edges(
"verify",
route,
{
"regenerate": "regenerate",
"finalize": "finalize"
}
)
workflow.add_edge("regenerate", "finalize")
workflow.add_edge("finalize", END)
# Compile the graph
app = workflow.compile()
# Usage
async def main():
result = await app.ainvoke({
"query": "What's the weather in London?",
"verified": False
})
print(result["response"])
"""
st.code(langgraph_code, language="python")
st.markdown("""
### Key Benefits of Integration
1. **Workflow Management**: LangGraph provides a structured way to define complex agent workflows with clear state transitions.
2. **Reusability**: Pydantic AI agents can be reused within LangGraph nodes, maintaining their tool capabilities.
3. **Visualization**: LangGraph offers built-in visualization of agent workflows, making it easier to understand and debug complex systems.
4. **State Management**: The typed state in LangGraph ensures type safety and clear data flow between nodes.
5. **Parallel Execution**: LangGraph supports parallel execution of nodes, enabling more efficient processing.
6. **Human-in-the-Loop**: Both frameworks support human intervention points, which can be combined for powerful interactive systems.
""")
st.image("https://blog.langchain.dev/content/images/2024/01/simple_multi_agent_diagram--1-.png",
caption="Example LangGraph Multi-Agent Workflow", width=600)
# V8: Self-Feedback Loop
with st.expander("V8: Self-Feedback Loop"):
st.write("Automated validation and error correction")
# Create a visual feedback loop
st.graphviz_chart('''
digraph {
rankdir=TB;
node [shape=box, style=filled, color=lightblue];
Agent [label="Agent Generation"];
Test [label="Automated Testing"];
Validate [label="Validation"];
Error [label="Error Detection"];
Fix [label="Self-Correction"];
Agent -> Test;
Test -> Validate;
Validate -> Error [label="Issues Found"];
Error -> Fix;
Fix -> Agent [label="Regenerate"];
Validate -> Agent [label="Success", color=green];
}
''')
st.write("#### Validation Process")
st.info("""
1. Generate agent code
2. Run automated tests
3. Analyze test results
4. Identify errors or improvement areas
5. Apply self-correction algorithms
6. Regenerate improved code
7. Repeat until validation passes
""")
# V9: Self Agent Execution
with st.expander("V9: Self Agent Execution"):
st.write("Testing and iterating on agents in an isolated environment")
st.write("#### Agent Execution Process")
execution_process = [
{"phase": "Sandbox Creation", "description": "Set up isolated environment using Local AI package"},
{"phase": "Agent Deployment", "description": "Load the generated agent into the testing environment"},
{"phase": "Test Execution", "description": "Run the agent against predefined scenarios and user queries"},
{"phase": "Performance Monitoring", "description": "Track response quality, latency, and resource usage"},
{"phase": "Error Detection", "description": "Identify runtime errors and logical inconsistencies"},
{"phase": "Iterative Improvement", "description": "Refine agent based on execution results"}
]
for i, phase in enumerate(execution_process):
st.write(f"**{i+1}. {phase['phase']}:** {phase['description']}")
st.write("#### Local AI Package Integration")
st.markdown("""
The [Local AI package](https://github.com/coleam00/local-ai-packaged) provides a containerized environment for:
- Running LLMs locally for agent testing
- Simulating API calls and external dependencies
- Monitoring agent behavior in a controlled setting
- Collecting performance metrics for optimization
""")
st.info("This enables Archon to test and refine agents in a controlled environment before deployment, significantly improving reliability and performance through empirical iteration.")
# V10: Multi-Framework Support
with st.expander("V10: Multi-Framework Support"):
st.write("Framework-agnostic agent generation")
frameworks = {
"Pydantic AI": {"status": "Supported", "description": "Native support for function-based agents"},
"LangGraph": {"status": "Coming in V7", "description": "Declarative multi-agent orchestration"},
"LangChain": {"status": "Planned", "description": "Popular agent framework with extensive tools"},
"Agno (Phidata)": {"status": "Planned", "description": "Multi-agent workflow framework"},
"CrewAI": {"status": "Planned", "description": "Role-based collaborative agents"},
"LlamaIndex": {"status": "Planned", "description": "RAG-focused agent framework"}
}
# Create a frameworks comparison table
df_data = {
"Framework": list(frameworks.keys()),
"Status": [frameworks[f]["status"] for f in frameworks],
"Description": [frameworks[f]["description"] for f in frameworks]
}
st.dataframe(df_data, use_container_width=True)
# V11: Autonomous Framework Learning
with st.expander("V11: Autonomous Framework Learning"):
st.write("Self-learning from mistakes and continuous improvement")
st.write("#### Self-Improvement Process")
improvement_process = [
{"phase": "Error Detection", "description": "Identifies patterns in failed agent generations and runtime errors"},
{"phase": "Root Cause Analysis", "description": "Analyzes error patterns to determine underlying issues in prompts or examples"},
{"phase": "Prompt Refinement", "description": "Automatically updates system prompts to address identified weaknesses"},
{"phase": "Example Augmentation", "description": "Adds new examples to the prompt library based on successful generations"},
{"phase": "Tool Enhancement", "description": "Creates or modifies tools to handle edge cases and common failure modes"},
{"phase": "Validation", "description": "Tests improvements against historical failure cases to ensure progress"}
]
for i, phase in enumerate(improvement_process):
st.write(f"**{i+1}. {phase['phase']}:** {phase['description']}")
st.info("This enables Archon to stay updated with the latest AI frameworks without manual intervention.")
# V12: Advanced RAG Techniques
with st.expander("V12: Advanced RAG Techniques"):
st.write("Enhanced retrieval and incorporation of framework documentation")
st.write("#### Advanced RAG Components")
col1, col2 = st.columns(2)
with col1:
st.markdown("#### Document Processing")
st.markdown("""
- **Hierarchical Chunking**: Multi-level chunking strategy that preserves document structure
- **Semantic Headers**: Extraction of meaningful section headers for better context
- **Code-Text Separation**: Specialized embedding models for code vs. natural language
- **Metadata Enrichment**: Automatic tagging with framework version, function types, etc.
""")
st.markdown("#### Query Processing")
st.markdown("""
- **Query Decomposition**: Breaking complex queries into sub-queries
- **Framework Detection**: Identifying which framework the query relates to
- **Intent Classification**: Determining if query is about usage, concepts, or troubleshooting
- **Query Expansion**: Adding relevant framework-specific terminology
""")
with col2:
st.markdown("#### Retrieval Enhancements")
st.markdown("""
- **Hybrid Search**: Combining dense and sparse retrievers for optimal results
- **Re-ranking**: Post-retrieval scoring based on relevance to the specific task
- **Cross-Framework Retrieval**: Finding analogous patterns across different frameworks
- **Code Example Prioritization**: Boosting practical examples in search results
""")
st.markdown("#### Knowledge Integration")
st.markdown("""
- **Context Stitching**: Intelligently combining information from multiple chunks
- **Framework Translation**: Converting patterns between frameworks (e.g., LangChain to LangGraph)
- **Version Awareness**: Handling differences between framework versions
- **Adaptive Retrieval**: Learning from successful and unsuccessful retrievals
""")
st.info("This enables Archon to more effectively retrieve and incorporate framework documentation, leading to more accurate and contextually appropriate agent generation.")
# V13: MCP Agent Marketplace
with st.expander("V13: MCP Agent Marketplace"):
st.write("Integrating Archon agents as MCP servers and publishing to marketplaces")
st.write("#### MCP Integration Process")
mcp_integration_process = [
{"phase": "Protocol Implementation", "description": "Implement the Model Context Protocol to enable IDE integration"},
{"phase": "Agent Conversion", "description": "Transform Archon-generated agents into MCP-compatible servers"},
{"phase": "Specialized Agent Creation", "description": "Build purpose-specific agents for code review, refactoring, and testing"},
{"phase": "Marketplace Publishing", "description": "Package and publish agents to MCP marketplaces for distribution"},
{"phase": "IDE Integration", "description": "Enable seamless operation within Windsurf, Cursor, and other MCP-enabled IDEs"}
]
for i, phase in enumerate(mcp_integration_process):
st.write(f"**{i+1}. {phase['phase']}:** {phase['description']}")
st.info("This enables Archon to create specialized agents that operate directly within IDEs through the MCP protocol, while also making them available through marketplace distribution channels.")
# Future Integrations section
st.write("### Future Integrations")
# LangSmith
with st.expander("LangSmith"):
st.write("Integration with LangChain's tracing and monitoring platform")
st.image("https://docs.smith.langchain.com/assets/images/trace-9510284b5b15ba55fc1cca6af2404657.png", width=600)
st.write("#### LangSmith Benefits")
st.markdown("""
- **Tracing**: Monitor agent execution steps and decisions
- **Debugging**: Identify issues in complex agent workflows
- **Analytics**: Track performance and cost metrics
- **Evaluation**: Assess agent quality with automated testing
- **Feedback Collection**: Gather human feedback to improve agents
""")
# MCP Marketplace
with st.expander("MCP Marketplace"):
st.write("Integration with AI IDE marketplaces")
st.write("#### MCP Marketplace Integration")
st.markdown("""
- Publish Archon itself as a premium agent in MCP marketplaces
- Create specialized Archon variants for different development needs
- Enable one-click installation directly from within IDEs
- Integrate seamlessly with existing development workflows
""")
st.warning("The Model Context Protocol (MCP) is an emerging standard for AI assistant integration with IDEs like Windsurf, Cursor, Cline, and Roo Code.")
# Other Frameworks
with st.expander("Other Frameworks besides Pydantic AI"):
st.write("Support for additional agent frameworks")
st.write("#### Framework Adapter Architecture")
st.graphviz_chart('''
digraph {
rankdir=TB;
node [shape=box, style=filled, color=lightblue];
Archon [label="Archon Core"];
Adapter [label="Framework Adapter Layer"];
Pydantic [label="Pydantic AI", color=lightskyblue];
LangGraph [label="LangGraph", color=lightskyblue];
LangChain [label="LangChain", color=lightskyblue];
Agno [label="Agno", color=lightskyblue];
CrewAI [label="CrewAI", color=lightskyblue];
LlamaIndex [label="LlamaIndex", color=lightskyblue];
Archon -> Adapter;
Adapter -> Pydantic;
Adapter -> LangGraph;
Adapter -> LangChain;
Adapter -> Agno;
Adapter -> CrewAI;
Adapter -> LlamaIndex;
}
''')
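# Hypothetical sketch of the adapter layer diagrammed above: a common interface each
# framework adapter could implement. Class and method names are assumptions.
adapter_sketch = """
from typing import Protocol

class FrameworkAdapter(Protocol):
    name: str

    def scaffold_agent(self, spec: dict) -> str:
        \"\"\"Return starter code for an agent in this framework.\"\"\"
        ...

    def register_tool(self, agent_code: str, tool_code: str) -> str:
        \"\"\"Attach a prebuilt tool to the generated agent code.\"\"\"
        ...

class PydanticAIAdapter:
    name = "Pydantic AI"

    def scaffold_agent(self, spec: dict) -> str:
        return f"agent = Agent('{spec['model']}', system_prompt='{spec['prompt']}')"

    def register_tool(self, agent_code: str, tool_code: str) -> str:
        return agent_code + "\\n" + tool_code
"""
st.code(adapter_sketch, language="python")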
# Vector Databases
with st.expander("Other Vector Databases besides Supabase"):
st.write("Support for additional vector databases")
vector_dbs = {
"Supabase": {"status": "Supported", "features": ["pgvector integration", "SQL API", "Real-time subscriptions"]},
"Pinecone": {"status": "Planned", "features": ["High scalability", "Low latency", "Serverless"]},
"Qdrant": {"status": "Planned", "features": ["Filtering", "Self-hosted option", "REST API"]},
"Milvus": {"status": "Planned", "features": ["Horizontal scaling", "Cloud-native", "Hybrid search"]},
"Chroma": {"status": "Planned", "features": ["Local-first", "Lightweight", "Simple API"]},
"Weaviate": {"status": "Planned", "features": ["GraphQL", "Multi-modal", "RESTful API"]}
}
# Create vector DB comparison table
df_data = {
"Vector Database": list(vector_dbs.keys()),
"Status": [vector_dbs[db]["status"] for db in vector_dbs],
"Key Features": [", ".join(vector_dbs[db]["features"]) for db in vector_dbs]
}
st.dataframe(df_data, use_container_width=True)
# Local AI Package
with st.expander("Local AI Package Integration"):
st.write("Integration with [Local AI Package](https://github.com/coleam00/local-ai-packaged)")
st.markdown("""
The Local AI Package enables running models entirely locally, providing:
- **Complete Privacy**: No data leaves your machine
- **Cost Savings**: Eliminate API usage fees
- **Offline Operation**: Work without internet connectivity
- **Custom Fine-tuning**: Adapt models to specific domains
- **Lower Latency**: Reduce response times for better UX
""")
st.info("This integration will allow Archon to operate fully offline with local models for both agent creation and execution.")

View File

@@ -0,0 +1,140 @@
import streamlit as st
import sys
import os
# Add the parent directory to sys.path to allow importing from the parent directory
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.utils import create_new_tab_button
def intro_tab():
"""Display the introduction and setup guide for Archon"""
# Welcome message
st.markdown("""
Archon is an AI meta-agent designed to autonomously build, refine, and optimize other AI agents.
It serves both as a practical tool for developers and as an educational framework demonstrating the evolution of agentic systems.
Archon is developed in iterations, starting with a simple Pydantic AI agent that can build other Pydantic AI agents,
all the way to a full agentic workflow using LangGraph that can build other AI agents with any framework.
Through its iterative development, Archon showcases the power of planning, feedback loops, and domain-specific knowledge in creating robust AI agents.
""")
# Environment variables update notice
st.warning("""
**🔄 IMPORTANT UPDATE (March 20th):** Archon now uses a multi-agent workflow with specialized refiner agents for autonomous prompt, tools, and agent definition improvements. The primary coding agent still creates the initial agent by itself, but then you can say 'refine' or something along those lines as a follow up prompt to kick off the specialized agents in parallel.
""")
# Setup guide with expandable sections
st.markdown("## Setup Guide")
st.markdown("Follow these concise steps to get Archon up and running (IMPORTANT: come back here after each step):")
# Step 1: Environment Configuration
with st.expander("Step 1: Environment Configuration", expanded=True):
st.markdown("""
### Environment Configuration
First, you need to set up your environment variables:
1. Go to the **Environment** tab
2. Configure the following essential variables:
- `BASE_URL`: API endpoint (OpenAI, OpenRouter, or Ollama)
- `LLM_API_KEY`: Your API key for the LLM service
- `OPENAI_API_KEY`: Required for embeddings
- `SUPABASE_URL`: Your Supabase project URL
- `SUPABASE_SERVICE_KEY`: Your Supabase service key
- `PRIMARY_MODEL`: Main agent model (e.g., gpt-4o-mini)
- `REASONER_MODEL`: Planning model (e.g., o3-mini)
These settings determine how Archon connects to external services and which models it uses.
""")
# Add a button to navigate to the Environment tab
create_new_tab_button("Go to Environment Section (New Tab)", "Environment", key="goto_env", use_container_width=True)
# Step 2: Database Setup
with st.expander("Step 2: Database Setup", expanded=False):
st.markdown("""
### Database Setup
Archon uses Supabase for vector storage and retrieval:
1. Go to the **Database** tab
2. Select your embedding dimensions (1536 for OpenAI, 768 for nomic-embed-text)
3. Follow the instructions to create the `site_pages` table
This creates the necessary tables, indexes, and functions for vector similarity search.
""")
# Add a button to navigate to the Database tab
create_new_tab_button("Go to Database Section (New Tab)", "Database", key="goto_db", use_container_width=True)
# Step 3: Documentation Crawling
with st.expander("Step 3: Documentation Crawling", expanded=False):
st.markdown("""
### Documentation Crawling
Populate the database with framework documentation:
1. Go to the **Documentation** tab
2. Click on "Crawl Pydantic AI Docs"
3. Wait for the crawling process to complete
This step downloads and processes documentation, creating embeddings for semantic search.
""")
# Add a button to navigate to the Documentation tab
create_new_tab_button("Go to the Documentation Section (New Tab)", "Documentation", key="goto_docs", use_container_width=True)
# Step 4: Agent Service
with st.expander("Step 4: Agent Service Setup (for MCP)", expanded=False):
st.markdown("""
### MCP Agent Service Setup
Start the graph service for agent generation:
1. Go to the **Agent Service** tab
2. Click on "Start Agent Service"
3. Verify the service is running
The agent service powers the LangGraph workflow for agent creation.
""")
# Add a button to navigate to the Agent Service tab
create_new_tab_button("Go to Agent Service Section (New Tab)", "Agent Service", key="goto_service", use_container_width=True)
# Step 5: MCP Configuration (Optional)
with st.expander("Step 5: MCP Configuration (Optional)", expanded=False):
st.markdown("""
### MCP Configuration
For integration with AI IDEs:
1. Go to the **MCP** tab
2. Select your IDE (Windsurf, Cursor, or Cline/Roo Code)
3. Follow the instructions to configure your IDE
This enables you to use Archon directly from your AI-powered IDE.
""")
# Add a button to navigate to the MCP tab
create_new_tab_button("Go to MCP Section (New Tab)", "MCP", key="goto_mcp", use_container_width=True)
# Step 6: Using Archon
with st.expander("Step 6: Using Archon", expanded=False):
st.markdown("""
### Using Archon
Once everything is set up:
1. Go to the **Chat** tab
2. Describe the agent you want to build
3. Archon will plan and generate the necessary code
You can also use Archon directly from your AI IDE if you've configured MCP.
""")
# Add a button to navigate to the Chat tab
create_new_tab_button("Go to Chat Section (New Tab)", "Chat", key="goto_chat", use_container_width=True)
# Resources
st.markdown("""
## Additional Resources
- [GitHub Repository](https://github.com/coleam00/archon)
- [Archon Community Forum](https://thinktank.ottomator.ai/c/archon/30)
- [GitHub Kanban Board](https://github.com/users/coleam00/projects/1)
""")

View File

@@ -0,0 +1,171 @@
import streamlit as st
import platform
import json
import os
def get_paths():
# Get the absolute path to the current directory
base_path = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))
# Determine the correct python path based on the OS
if platform.system() == "Windows":
python_path = os.path.join(base_path, 'venv', 'Scripts', 'python.exe')
else: # macOS or Linux
python_path = os.path.join(base_path, 'venv', 'bin', 'python')
server_script_path = os.path.join(base_path, 'mcp', 'mcp_server.py')
return python_path, server_script_path
def generate_mcp_config(ide_type, python_path, server_script_path):
"""
Generate MCP configuration for the selected IDE type.
"""
# Create the config dictionary for Python
python_config = {
"mcpServers": {
"archon": {
"command": python_path,
"args": [server_script_path]
}
}
}
# Create the config dictionary for Docker
docker_config = {
"mcpServers": {
"archon": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"-e",
"GRAPH_SERVICE_URL",
"archon-mcp:latest"
],
"env": {
"GRAPH_SERVICE_URL": "http://host.docker.internal:8100"
}
}
}
}
# Return appropriate configuration based on IDE type
if ide_type == "Windsurf":
return json.dumps(python_config, indent=2), json.dumps(docker_config, indent=2)
elif ide_type == "Cursor":
return f"{python_path} {server_script_path}", f"docker run -i --rm -e GRAPH_SERVICE_URL=http://host.docker.internal:8100 archon-mcp:latest"
elif ide_type == "Cline/Roo Code":
return json.dumps(python_config, indent=2), json.dumps(docker_config, indent=2)
elif ide_type == "Claude Code":
return f"Not Required", "Not Required"
else:
return "Unknown IDE type selected", "Unknown IDE type selected"
def mcp_tab():
"""Display the MCP configuration interface"""
st.header("MCP Configuration")
st.write("Select your AI IDE to get the appropriate MCP configuration:")
# IDE selection with side-by-side buttons
col1, col2, col3, col4 = st.columns(4)
with col1:
windsurf_button = st.button("Windsurf", use_container_width=True, key="windsurf_button")
with col2:
cursor_button = st.button("Cursor", use_container_width=True, key="cursor_button")
with col3:
cline_button = st.button("Cline/Roo Code", use_container_width=True, key="cline_button")
with col4:
claude_button = st.button("Claude Code", use_container_width=True, key="claude_button")
# Initialize session state for selected IDE if not present
if "selected_ide" not in st.session_state:
st.session_state.selected_ide = None
# Update selected IDE based on button clicks
if windsurf_button:
st.session_state.selected_ide = "Windsurf"
elif cursor_button:
st.session_state.selected_ide = "Cursor"
elif cline_button:
st.session_state.selected_ide = "Cline/Roo Code"
elif claude_button:
st.session_state.selected_ide = "Claude Code"
# Display configuration if an IDE is selected
if st.session_state.selected_ide:
selected_ide = st.session_state.selected_ide
st.subheader(f"MCP Configuration for {selected_ide}")
python_path, server_script_path = get_paths()
python_config, docker_config = generate_mcp_config(selected_ide, python_path, server_script_path)
# Configuration type tabs
config_tab1, config_tab2 = st.tabs(["Docker Configuration", "Python Configuration"])
with config_tab1:
st.markdown("### Docker Configuration")
st.code(docker_config, language="json" if selected_ide != "Cursor" else None)
st.markdown("#### Requirements:")
st.markdown("- Docker installed")
st.markdown("- Run the setup script to build and start both containers:")
st.code("python run_docker.py", language="bash")
with config_tab2:
st.markdown("### Python Configuration")
st.code(python_config, language="json" if selected_ide != "Cursor" else None)
st.markdown("#### Requirements:")
st.markdown("- Python 3.11+ installed")
st.markdown("- Virtual environment created and activated")
st.markdown("- All dependencies installed via `pip install -r requirements.txt`")
st.markdown("- Must be running Archon not within a container")
# Instructions based on IDE type
st.markdown("---")
st.markdown("### Setup Instructions")
if selected_ide == "Windsurf":
st.markdown("""
#### How to use in Windsurf:
1. Click on the hammer icon above the chat input
2. Click on "Configure"
3. Paste the JSON from your preferred configuration tab above
4. Click "Refresh" next to "Configure"
""")
elif selected_ide == "Cursor":
st.markdown("""
#### How to use in Cursor:
1. Go to Cursor Settings > Features > MCP
2. Click on "+ Add New MCP Server"
3. Name: Archon
4. Type: command (equivalent to stdio)
5. Command: Paste the command from your preferred configuration tab above
""")
elif selected_ide == "Cline/Roo Code":
st.markdown("""
#### How to use in Cline or Roo Code:
1. From the Cline/Roo Code extension, click the "MCP Server" tab
2. Click the "Edit MCP Settings" button
3. The MCP settings file should be displayed in a tab in VS Code
4. Paste the JSON from your preferred configuration tab above
5. Cline/Roo Code will automatically detect and start the MCP server
""")
elif selected_ide == "Claude Code":
st.markdown(f"""
#### How to use in Claude Code:
1. Deploy and run Archon in Docker
2. In the Archon UI, start the MCP service.
3. Open a terminal and navigate to your work folder.
4. Execute the command:
\tFor Docker: `claude mcp add Archon docker run -i --rm -e GRAPH_SERVICE_URL=http://host.docker.internal:8100 archon-mcp:latest `
\tFor Python: `claude mcp add Archon {python_path} {server_script_path}`
5. Start Claude Code with the command `claude`. When Claude Code starts, the bottom of the welcome section lists the connected MCP servers; Archon should appear with a status of _connected_.
6. You can now use the Archon MCP service in your Claude Code projects
(NOTE: If you close the terminal, or start a session in a new terminal, you will need to re-add the MCP service.)
""")

View File

@@ -0,0 +1,94 @@
"""
This module contains the CSS styles for the Streamlit UI.
"""
import streamlit as st
def load_css():
"""
Load the custom CSS styles for the Archon UI.
"""
st.markdown("""
<style>
:root {
--primary-color: #00CC99; /* Green */
--secondary-color: #EB2D8C; /* Pink */
--text-color: #262730;
}
/* Style the buttons */
.stButton > button {
color: white;
border: 2px solid var(--primary-color);
padding: 0.5rem 1rem;
font-weight: bold;
transition: all 0.3s ease;
}
.stButton > button:hover {
color: white;
border: 2px solid var(--secondary-color);
}
/* Override Streamlit's default focus styles that make buttons red */
.stButton > button:focus,
.stButton > button:focus:hover,
.stButton > button:active,
.stButton > button:active:hover {
color: white !important;
border: 2px solid var(--secondary-color) !important;
box-shadow: none !important;
outline: none !important;
}
/* Style headers */
h1, h2, h3 {
color: var(--primary-color);
}
/* Hide spans within h3 elements */
h1 span, h2 span, h3 span {
display: none !important;
visibility: hidden;
width: 0;
height: 0;
opacity: 0;
position: absolute;
overflow: hidden;
}
/* Style code blocks */
pre {
border-left: 4px solid var(--primary-color);
}
/* Style links */
a {
color: var(--secondary-color);
}
/* Style the chat messages */
.stChatMessage {
border-left: 4px solid var(--secondary-color);
}
/* Style the chat input */
.stChatInput > div {
border: 2px solid var(--primary-color) !important;
}
/* Remove red outline on focus */
.stChatInput > div:focus-within {
box-shadow: none !important;
border: 2px solid var(--secondary-color) !important;
outline: none !important;
}
/* Remove red outline on all inputs when focused */
input:focus, textarea:focus, [contenteditable]:focus {
box-shadow: none !important;
border-color: var(--secondary-color) !important;
outline: none !important;
}
</style>
""", unsafe_allow_html=True)

View File

@@ -0,0 +1,114 @@
from __future__ import annotations
from dotenv import load_dotenv
import streamlit as st
import logfire
import asyncio
# Set page config - must be the first Streamlit command
st.set_page_config(
page_title="Archon - Agent Builder",
page_icon="🤖",
layout="wide",
)
# Utilities and styles
from utils.utils import get_clients
from streamlit_pages.styles import load_css
# Streamlit pages
from streamlit_pages.intro import intro_tab
from streamlit_pages.chat import chat_tab
from streamlit_pages.environment import environment_tab
from streamlit_pages.database import database_tab
from streamlit_pages.documentation import documentation_tab
from streamlit_pages.agent_service import agent_service_tab
from streamlit_pages.mcp import mcp_tab
from streamlit_pages.future_enhancements import future_enhancements_tab
# Load environment variables from .env file
load_dotenv()
# Initialize clients
openai_client, supabase = get_clients()
# Load custom CSS styles
load_css()
# Configure logfire to suppress warnings (optional)
logfire.configure(send_to_logfire='never')
async def main():
# Check for tab query parameter
query_params = st.query_params
if "tab" in query_params:
tab_name = query_params["tab"]
if tab_name in ["Intro", "Chat", "Environment", "Database", "Documentation", "Agent Service", "MCP", "Future Enhancements"]:
st.session_state.selected_tab = tab_name
# Add sidebar navigation
with st.sidebar:
st.image("public/ArchonLightGrey.png", width=1000)
# Navigation options with vertical buttons
st.write("### Navigation")
# Initialize session state for selected tab if not present
if "selected_tab" not in st.session_state:
st.session_state.selected_tab = "Intro"
# Vertical navigation buttons
intro_button = st.button("Intro", use_container_width=True, key="intro_button")
chat_button = st.button("Chat", use_container_width=True, key="chat_button")
env_button = st.button("Environment", use_container_width=True, key="env_button")
db_button = st.button("Database", use_container_width=True, key="db_button")
docs_button = st.button("Documentation", use_container_width=True, key="docs_button")
service_button = st.button("Agent Service", use_container_width=True, key="service_button")
mcp_button = st.button("MCP", use_container_width=True, key="mcp_button")
future_enhancements_button = st.button("Future Enhancements", use_container_width=True, key="future_enhancements_button")
# Update selected tab based on button clicks
if intro_button:
st.session_state.selected_tab = "Intro"
elif chat_button:
st.session_state.selected_tab = "Chat"
elif mcp_button:
st.session_state.selected_tab = "MCP"
elif env_button:
st.session_state.selected_tab = "Environment"
elif service_button:
st.session_state.selected_tab = "Agent Service"
elif db_button:
st.session_state.selected_tab = "Database"
elif docs_button:
st.session_state.selected_tab = "Documentation"
elif future_enhancements_button:
st.session_state.selected_tab = "Future Enhancements"
# Display the selected tab
if st.session_state.selected_tab == "Intro":
st.title("Archon - Introduction")
intro_tab()
elif st.session_state.selected_tab == "Chat":
st.title("Archon - Agent Builder")
await chat_tab()
elif st.session_state.selected_tab == "MCP":
st.title("Archon - MCP Configuration")
mcp_tab()
elif st.session_state.selected_tab == "Environment":
st.title("Archon - Environment Configuration")
environment_tab()
elif st.session_state.selected_tab == "Agent Service":
st.title("Archon - Agent Service")
agent_service_tab()
elif st.session_state.selected_tab == "Database":
st.title("Archon - Database Configuration")
database_tab(supabase)
elif st.session_state.selected_tab == "Documentation":
st.title("Archon - Documentation")
documentation_tab(supabase)
elif st.session_state.selected_tab == "Future Enhancements":
st.title("Archon - Future Enhancements")
future_enhancements_tab()
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,72 @@
-- Enable the pgvector extension
create extension if not exists vector;
-- Create the documentation chunks table
create table site_pages (
id bigserial primary key,
url varchar not null,
chunk_number integer not null,
title varchar not null,
summary varchar not null,
content text not null, -- Added content column
metadata jsonb not null default '{}'::jsonb, -- Added metadata column
embedding vector(1536), -- OpenAI embeddings are 1536 dimensions
created_at timestamp with time zone default timezone('utc'::text, now()) not null,
-- Add a unique constraint to prevent duplicate chunks for the same URL
unique(url, chunk_number)
);
-- Create an index for better vector similarity search performance
create index on site_pages using ivfflat (embedding vector_cosine_ops);
-- Create an index on metadata for faster filtering
create index idx_site_pages_metadata on site_pages using gin (metadata);
-- Create a function to search for documentation chunks
create function match_site_pages (
query_embedding vector(1536),
match_count int default 10,
filter jsonb DEFAULT '{}'::jsonb
) returns table (
id bigint,
url varchar,
chunk_number integer,
title varchar,
summary varchar,
content text,
metadata jsonb,
similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
return query
select
id,
url,
chunk_number,
title,
summary,
content,
metadata,
1 - (site_pages.embedding <=> query_embedding) as similarity
from site_pages
where metadata @> filter
order by site_pages.embedding <=> query_embedding
limit match_count;
end;
$$;
-- Everything above will work for any PostgreSQL database. The below commands are for Supabase security
-- Enable RLS on the table
alter table site_pages enable row level security;
-- Create a policy that allows anyone to read
create policy "Allow public read access"
on site_pages
for select
to public
using (true);

View File

@@ -0,0 +1,409 @@
from supabase import Client, create_client
from openai import AsyncOpenAI
from dotenv import load_dotenv
from datetime import datetime
from functools import wraps
from typing import Optional
import streamlit as st
import webbrowser
import importlib
import inspect
import json
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# Load environment variables from .env file
load_dotenv()
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
workbench_dir = os.path.join(parent_dir, "workbench")
def write_to_log(message: str):
"""Write a message to the logs.txt file in the workbench directory.
Args:
message: The message to log
"""
# Get the directory one level up from the current file
log_path = os.path.join(workbench_dir, "logs.txt")
os.makedirs(workbench_dir, exist_ok=True)
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
log_entry = f"[{timestamp}] {message}\n"
with open(log_path, "a", encoding="utf-8") as f:
f.write(log_entry)
def get_env_var(var_name: str, profile: Optional[str] = None) -> Optional[str]:
"""Get an environment variable from the saved JSON file or from environment variables.
Args:
var_name: The name of the environment variable to retrieve
profile: The profile to use (if None, uses the current profile)
Returns:
The value of the environment variable or None if not found
"""
# Path to the JSON file storing environment variables
env_file_path = os.path.join(workbench_dir, "env_vars.json")
# First try to get from JSON file
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
# If profile is specified, use it; otherwise use current profile
current_profile = profile or env_vars.get("current_profile", "default")
# Get variables for the profile
if "profiles" in env_vars and current_profile in env_vars["profiles"]:
profile_vars = env_vars["profiles"][current_profile]
if var_name in profile_vars and profile_vars[var_name]:
return profile_vars[var_name]
# For backward compatibility, check the root level
if var_name in env_vars and env_vars[var_name]:
return env_vars[var_name]
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
# If not found in JSON, try to get from environment variables
return os.environ.get(var_name)
def save_env_var(var_name: str, value: str, profile: Optional[str] = None) -> bool:
"""Save an environment variable to the JSON file.
Args:
var_name: The name of the environment variable
value: The value to save
profile: The profile to save to (if None, uses the current profile)
Returns:
True if successful, False otherwise
"""
# Path to the JSON file storing environment variables
env_file_path = os.path.join(workbench_dir, "env_vars.json")
os.makedirs(workbench_dir, exist_ok=True)
# Load existing env vars or create empty dict
env_vars = {}
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
# Continue with empty dict if file is corrupted
# Initialize profiles structure if it doesn't exist
if "profiles" not in env_vars:
env_vars["profiles"] = {}
# If no current profile is set, set it to default
if "current_profile" not in env_vars:
env_vars["current_profile"] = "default"
# Determine which profile to use
current_profile = profile or env_vars.get("current_profile", "default")
# Initialize the profile if it doesn't exist
if current_profile not in env_vars["profiles"]:
env_vars["profiles"][current_profile] = {}
# Update the variable in the profile
env_vars["profiles"][current_profile][var_name] = value
# Save back to file
try:
with open(env_file_path, "w") as f:
json.dump(env_vars, f, indent=2)
return True
except IOError as e:
write_to_log(f"Error writing to env_vars.json: {str(e)}")
return False
def get_current_profile() -> str:
"""Get the current environment profile name.
Returns:
The name of the current profile, defaults to "default" if not set
"""
env_file_path = os.path.join(workbench_dir, "env_vars.json")
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
return env_vars.get("current_profile", "default")
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
return "default"
def set_current_profile(profile_name: str) -> bool:
"""Set the current environment profile.
Args:
profile_name: The name of the profile to set as current
Returns:
True if successful, False otherwise
"""
env_file_path = os.path.join(workbench_dir, "env_vars.json")
os.makedirs(workbench_dir, exist_ok=True)
# Load existing env vars or create empty dict
env_vars = {}
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
# Continue with empty dict if file is corrupted
# Initialize profiles structure if it doesn't exist
if "profiles" not in env_vars:
env_vars["profiles"] = {}
# Initialize the profile if it doesn't exist
if profile_name not in env_vars["profiles"]:
env_vars["profiles"][profile_name] = {}
# Set the current profile
env_vars["current_profile"] = profile_name
# Save back to file
try:
with open(env_file_path, "w") as f:
json.dump(env_vars, f, indent=2)
return True
except IOError as e:
write_to_log(f"Error writing to env_vars.json: {str(e)}")
return False
def get_all_profiles() -> list:
"""Get a list of all available environment profiles.
Returns:
List of profile names
"""
env_file_path = os.path.join(workbench_dir, "env_vars.json")
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
if "profiles" in env_vars:
return list(env_vars["profiles"].keys())
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
# Return default if no profiles exist
return ["default"]
def create_profile(profile_name: str) -> bool:
"""Create a new environment profile.
Args:
profile_name: The name of the profile to create
Returns:
True if successful, False otherwise
"""
env_file_path = os.path.join(workbench_dir, "env_vars.json")
os.makedirs(workbench_dir, exist_ok=True)
# Load existing env vars or create empty dict
env_vars = {}
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
# Continue with empty dict if file is corrupted
# Initialize profiles structure if it doesn't exist
if "profiles" not in env_vars:
env_vars["profiles"] = {}
# Create the profile if it doesn't exist
if profile_name not in env_vars["profiles"]:
env_vars["profiles"][profile_name] = {}
# Save back to file
try:
with open(env_file_path, "w") as f:
json.dump(env_vars, f, indent=2)
return True
except IOError as e:
write_to_log(f"Error writing to env_vars.json: {str(e)}")
return False
# Profile already exists
return True
def delete_profile(profile_name: str) -> bool:
"""Delete an environment profile.
Args:
profile_name: The name of the profile to delete
Returns:
True if successful, False otherwise
"""
# Don't allow deleting the default profile
if profile_name == "default":
return False
env_file_path = os.path.join(workbench_dir, "env_vars.json")
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
if "profiles" in env_vars and profile_name in env_vars["profiles"]:
# Delete the profile
del env_vars["profiles"][profile_name]
# If the current profile was deleted, set to default
if env_vars.get("current_profile") == profile_name:
env_vars["current_profile"] = "default"
# Save back to file
with open(env_file_path, "w") as f:
json.dump(env_vars, f, indent=2)
return True
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading/writing env_vars.json: {str(e)}")
return False
def get_profile_env_vars(profile_name: Optional[str] = None) -> dict:
"""Get all environment variables for a specific profile.
Args:
profile_name: The name of the profile (if None, uses the current profile)
Returns:
Dictionary of environment variables for the profile
"""
env_file_path = os.path.join(workbench_dir, "env_vars.json")
if os.path.exists(env_file_path):
try:
with open(env_file_path, "r") as f:
env_vars = json.load(f)
# If profile is specified, use it; otherwise use current profile
current_profile = profile_name or env_vars.get("current_profile", "default")
# Get variables for the profile
if "profiles" in env_vars and current_profile in env_vars["profiles"]:
return env_vars["profiles"][current_profile]
# For backward compatibility, if no profiles structure but we're looking for default
if current_profile == "default" and "profiles" not in env_vars:
# Return all variables except profiles and current_profile
return {k: v for k, v in env_vars.items()
if k not in ["profiles", "current_profile"]}
except (json.JSONDecodeError, IOError) as e:
write_to_log(f"Error reading env_vars.json: {str(e)}")
return {}
def log_node_execution(func):
"""Decorator to log the start and end of graph node execution.
Args:
func: The async function to wrap
"""
@wraps(func)
async def wrapper(*args, **kwargs):
func_name = func.__name__
write_to_log(f"Starting node: {func_name}")
try:
result = await func(*args, **kwargs)
write_to_log(f"Completed node: {func_name}")
return result
except Exception as e:
write_to_log(f"Error in node {func_name}: {str(e)}")
raise
return wrapper
# Helper function to create a button that opens a tab in a new window
def create_new_tab_button(label, tab_name, key=None, use_container_width=False):
"""Create a button that opens a specified tab in a new browser window"""
# Create a unique key if none provided
if key is None:
key = f"new_tab_{tab_name.lower().replace(' ', '_')}"
# Get the base URL
base_url = st.query_params.get("base_url", "")
if not base_url:
# If base_url is not in query params, use the default localhost URL
base_url = "http://localhost:8501"
# Create the URL for the new tab
new_tab_url = f"{base_url}/?tab={tab_name}"
# Create a button that will open the URL in a new tab when clicked
if st.button(label, key=key, use_container_width=use_container_width):
webbrowser.open_new_tab(new_tab_url)
# Function to reload the archon_graph module
def reload_archon_graph(show_reload_success=True):
"""Reload the archon_graph module to apply new environment variables"""
try:
# First reload pydantic_ai_coder
import archon.pydantic_ai_coder
importlib.reload(archon.pydantic_ai_coder)
# Then reload archon_graph which imports pydantic_ai_coder
import archon.archon_graph
importlib.reload(archon.archon_graph)
# Then reload the crawler
import archon.crawl_pydantic_ai_docs
importlib.reload(archon.crawl_pydantic_ai_docs)
if show_reload_success:
st.success("Successfully reloaded Archon modules with new environment variables!")
return True
except Exception as e:
st.error(f"Error reloading Archon modules: {str(e)}")
return False
def get_clients():
# Embedding client setup
embedding_client = None
base_url = get_env_var('EMBEDDING_BASE_URL') or 'https://api.openai.com/v1'
api_key = get_env_var('EMBEDDING_API_KEY') or 'no-api-key-provided'
provider = get_env_var('EMBEDDING_PROVIDER') or 'OpenAI'
# Set up the OpenAI-compatible client used for embeddings
if provider == "Ollama":
if api_key == "NOT_REQUIRED":
api_key = "ollama" # Use a dummy key for Ollama
embedding_client = AsyncOpenAI(base_url=base_url, api_key=api_key)
else:
embedding_client = AsyncOpenAI(base_url=base_url, api_key=api_key)
# Supabase client setup
supabase = None
supabase_url = get_env_var("SUPABASE_URL")
supabase_key = get_env_var("SUPABASE_SERVICE_KEY")
if supabase_url and supabase_key:
try:
supabase: Client = Client(supabase_url, supabase_key)
except Exception as e:
print(f"Failed to initialize Supabase: {e}")
write_to_log(f"Failed to initialize Supabase: {e}")
return embedding_client, supabase

Binary file not shown.

Before

Width: | Height: | Size: 60 KiB

After

Width: | Height: | Size: 80 KiB