---
title: Archon Knowledge Overview
sidebar_position: 1
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';

# 🧠 Archon Knowledge: Your AI's Memory Bank

<div className="hero hero--primary">
  <div className="container">
    <h2 className="hero__subtitle">
      **Build a powerful knowledge base** for your AI assistants. Crawl websites, upload documents, and give your AI instant access to all your technical and business information.
    </h2>
  </div>
</div>

Archon Knowledge transforms your documentation, websites, and files into a searchable knowledge base that your AI coding assistants can instantly access. Never explain the same concept twice - your AI remembers everything.

<Admonition type="tip" icon="🎉" title="Fully Operational RAG System">
The RAG system is **now fully functional** with 14 MCP tools enabled, comprehensive error handling, and threading optimizations for high performance.
</Admonition>

## 🏗️ How RAG Works

```mermaid
%%{init:{
  'theme':'base',
  'themeVariables': {
    'primaryColor':'#1f2937',
    'primaryTextColor':'#ffffff',
    'primaryBorderColor':'#8b5cf6',
    'lineColor':'#a855f7',
    'textColor':'#ffffff',
    'fontFamily':'Inter',
    'fontSize':'14px',
    'background':'#000000',
    'mainBkg':'#1f2937',
    'secondBkg':'#111827',
    'borderColor':'#8b5cf6'
  }
}}%%
flowchart TD
    A[🤖 AI Agent Query] --> B[🧠 Generate Embeddings]
    B --> C[🔍 Vector Search]
    C --> D[📄 Matching Documents]
    D --> E[⚡ Filter & Rank]
    E --> F[📋 Return Results]
```

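The flow in the diagram can be sketched in a few lines of JavaScript. This is a toy illustration, not Archon's implementation: the three-dimensional vectors stand in for real embeddings, and `cosineSimilarity`, `vectorSearch`, and the `minScore` threshold are hypothetical names chosen for the example.

```javascript
// Toy illustration of the retrieval flow; not Archon's actual code.
// Real embeddings come from an embedding model and have many more dimensions.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document, drop weak matches, rank, and keep the top results.
function vectorSearch(queryEmbedding, documents, { matchCount = 5, minScore = 0 } = {}) {
  return documents
    .map((doc) => ({ ...doc, similarity: cosineSimilarity(queryEmbedding, doc.embedding) }))
    .filter((doc) => doc.similarity >= minScore)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

// Hypothetical pre-embedded documents.
const documents = [
  { url: 'https://example.com/auth', embedding: [0.9, 0.1, 0.0] },
  { url: 'https://example.com/cache', embedding: [0.0, 0.2, 0.9] },
  { url: 'https://example.com/login', embedding: [0.8, 0.3, 0.1] },
];

const topMatches = vectorSearch([1, 0, 0], documents, { matchCount: 2, minScore: 0.5 });
console.log(topMatches.map((m) => m.url));
```

The query embedding is compared against every stored vector, matches below the threshold are filtered out, and the rest are ranked by similarity, which mirrors the "Vector Search", "Filter & Rank", and "Return Results" steps.
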
## ⚡ Performance Features

Archon Knowledge is optimized for speed and efficiency:

- **Smart Concurrency**: Adaptive processing based on system resources
- **Batch Processing**: Processes multiple documents efficiently
- **Rate Limiting**: Respects API limits while maximizing throughput
- **Memory Management**: Automatically adjusts to available system memory

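As a rough picture of how batching and bounded concurrency fit together, the sketch below splits work into batches and runs a limited number of them at once. It is an illustration only: `chunkIntoBatches`, `processWithConcurrency`, the batch size, and the concurrency limit are assumptions made for the example, not Archon's internals or defaults.

```javascript
// Split an array into batches of at most `batchSize` items.
function chunkIntoBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Run `worker` over every batch with at most `limit` batches in flight.
async function processWithConcurrency(batches, limit, worker) {
  const results = new Array(batches.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed batch until none remain.
  async function runner() {
    while (next < batches.length) {
      const index = next++;
      results[index] = await worker(batches[index], index);
    }
  }

  const runnerCount = Math.min(limit, batches.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Example: 10 placeholder documents, batches of 4, at most 2 batches at a time.
const docs = Array.from({ length: 10 }, (_, i) => `doc-${i}`);
const batches = chunkIntoBatches(docs, 4);
const batchSizesPromise = processWithConcurrency(batches, 2, async (batch) => batch.length);
batchSizesPromise.then((sizes) => console.log(sizes)); // [ 4, 4, 2 ]
```

An adaptive system would adjust `limit` and `batchSize` at runtime based on available memory and API rate limits; here they are fixed to keep the example small.
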
## 🔍 Using the Knowledge Base

### Basic Search

The `perform_rag_query` tool is the primary interface for semantic search across your knowledge base:

```javascript title="Basic RAG Query"
// Simple search across all sources
await mcp.callTool('perform_rag_query', {
  query: "authentication best practices",
  match_count: 5 // Optional, defaults to 5
});
```

### Filtered Search by Source

Filter results to specific domains or sources:

```javascript title="Source-Filtered Search"
// Search only within a specific domain
await mcp.callTool('perform_rag_query', {
  query: "MCP session management",
  source: "modelcontextprotocol.io", // Filter by domain
  match_count: 10
});

// To see which sources you can filter by, list the available sources
const sources = await mcp.callTool('get_available_sources', {});
// Returns: ["ai.pydantic.dev", "modelcontextprotocol.io", ...]
```

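Checking the source list before filtering avoids silently searching a domain that was never indexed. The sketch below shows one way to combine the two calls. The `mcp` object here is a stub so the example runs on its own, `searchWithinSource` is a hypothetical helper, and the response shapes are assumptions based on the comments in the example above.

```javascript
// Stub client so the sketch runs on its own; replace it with your real MCP client.
// The response shapes are assumptions, not a documented contract.
const mcp = {
  async callTool(name, args) {
    if (name === 'get_available_sources') {
      return ['ai.pydantic.dev', 'modelcontextprotocol.io'];
    }
    if (name === 'perform_rag_query') {
      return [{ content: `stub result for "${args.query}"`, metadata: { source: args.source ?? 'all' } }];
    }
    throw new Error(`Unknown tool: ${name}`);
  },
};

// Hypothetical helper: apply the source filter only if that source is indexed.
async function searchWithinSource(client, query, source, matchCount = 5) {
  const sources = await client.callTool('get_available_sources', {});
  if (!sources.includes(source)) {
    throw new Error(`Source "${source}" is not indexed. Available: ${sources.join(', ')}`);
  }
  return client.callTool('perform_rag_query', { query, source, match_count: matchCount });
}

const filteredSearch = searchWithinSource(mcp, 'MCP session management', 'modelcontextprotocol.io', 10);
filteredSearch.then((results) => console.log(results));
```
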
### Advanced Usage Examples

<Tabs>
<TabItem value="technical" label="Technical Documentation" default>

```javascript
// Search for technical implementation details
await mcp.callTool('perform_rag_query', {
  query: "SSE transport implementation MCP protocol",
  source: "modelcontextprotocol.io",
  match_count: 5
});

// Response includes:
// - Matched content chunks
// - Source URLs
// - Similarity scores
// - Metadata (headers, context)
```

</TabItem>
<TabItem value="code" label="Code Examples">

```javascript
// Search for code examples
await mcp.callTool('search_code_examples', {
  query: "React hooks useState useEffect",
  source_id: "react.dev", // Optional source filter
  match_count: 10
});

// Returns:
// - Code snippets with syntax highlighting
// - AI-generated summaries
// - Full context (before/after code)
// - Source file information
```

</TabItem>
<TabItem value="multi-source" label="Multi-Source Search">

```javascript
// Search across all indexed sources
const results = await mcp.callTool('perform_rag_query', {
  query: "best practices for API design REST GraphQL",
  // No source filter - searches everything
  match_count: 15
});

// Group results by source
// (assumes `results` is an array of matches, each with `metadata.source`)
const groupedResults = results.reduce((acc, result) => {
  const source = result.metadata.source;
  if (!acc[source]) acc[source] = [];
  acc[source].push(result);
  return acc;
}, {});
```

</TabItem>
</Tabs>

## 🔧 Advanced Features

- **Contextual Embeddings**: Enhanced understanding through document context
- **Source Filtering**: Search within specific domains or documentation sources
- **Code Search**: Specialized search for code examples and implementations
- **Multi-Source**: Search across all your indexed knowledge sources simultaneously

## ⚡ Performance

<Admonition type="success" icon="📊" title="Fast & Efficient">
- **Average Query Time**: 200-300ms
- **Optimized Processing**: Smart batching and concurrency
- **Memory Adaptive**: Automatically adjusts to system resources
- **Rate Limited**: Respects API limits for reliable operation
</Admonition>

## 📊 Real-Time Progress

When processing large amounts of content, Archon provides real-time progress updates via Socket.IO:

- **Smooth Progress**: Linear progression from 0% to 100%
- **Batch Details**: Clear information about processing status
- **Real-Time Updates**: Live updates as documents are processed
- **Memory Awareness**: Automatically adjusts based on system resources

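A linear progress value can be derived from batch counts alone. The sketch below shows one way to compute it. `progressPercent`, `buildProgressUpdate`, and the shape of the update object are hypothetical and are not taken from Archon's Socket.IO payloads.

```javascript
// Map completed batches to a 0-100 percentage, clamped so it never overshoots.
function progressPercent(completedBatches, totalBatches) {
  if (totalBatches <= 0) return 0;
  const ratio = completedBatches / totalBatches;
  return Math.round(Math.min(Math.max(ratio, 0), 1) * 100);
}

// Build the kind of update a server might emit after each batch (hypothetical shape).
function buildProgressUpdate(completedBatches, totalBatches) {
  return {
    percentage: progressPercent(completedBatches, totalBatches),
    status: completedBatches >= totalBatches ? 'completed' : 'processing',
    message: `Batch ${completedBatches} of ${totalBatches} processed`,
  };
}

const update = buildProgressUpdate(3, 8);
console.log(update.percentage, update.status); // 38 processing
```

Because the percentage depends only on the ratio of completed to total batches, it advances in equal steps as each batch finishes.
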
## 🗄️ Data Storage

Archon uses a vector database to store and search your knowledge:

- **Vector Embeddings**: Content is converted to high-dimensional vectors for semantic search
- **Source Tracking**: Each document is linked to its original source
- **Code Examples**: Special handling for code snippets with language detection
- **Metadata Storage**: Additional context and headers are preserved

## 🔧 Common Issues

### Performance
- **Slow searches**: Usually caused by large document sets; the system automatically optimizes batch sizes
- **Memory usage**: Adaptive processing automatically adjusts based on available system memory
- **Rate limiting**: Built-in rate limiting prevents API quota issues

### Search Quality
- **Poor results**: Try different search terms or use source filtering to narrow results
- **Missing content**: Ensure documents are properly crawled and indexed
- **Code examples**: Use the specialized `search_code_examples` tool for better code results

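The advice on poor results can be turned into a simple fallback: search one source first, then widen to all sources if too few matches come back. This is a sketch with a stub client. `searchWithFallback` and `minResults` are hypothetical, and the assumption that `perform_rag_query` resolves to an array of matches is not confirmed by this page.

```javascript
// Stub client that returns results only for unfiltered queries, to exercise the fallback.
const stubClient = {
  calls: [],
  async callTool(name, args) {
    this.calls.push({ name, args });
    return args.source ? [] : [{ content: 'match from another source' }];
  },
};

// Try a source-filtered query first; widen to all sources if it returns too little.
async function searchWithFallback(client, query, source, minResults = 1) {
  const filtered = await client.callTool('perform_rag_query', { query, source, match_count: 5 });
  if (filtered.length >= minResults) {
    return { scope: source, results: filtered };
  }
  const broad = await client.callTool('perform_rag_query', { query, match_count: 5 });
  return { scope: 'all sources', results: broad };
}

const fallbackSearch = searchWithFallback(stubClient, 'token refresh', 'example.com');
fallbackSearch.then(({ scope, results }) => console.log(scope, results.length));
```
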
## 🚀 Getting Started

1. **Add Knowledge Sources**: Use MCP tools to crawl websites and upload documents
2. **Search Your Knowledge**: Use `perform_rag_query` to find relevant information
3. **Filter by Source**: Search within specific domains when you need focused results
4. **Find Code Examples**: Use `search_code_examples` for code-specific searches

## 🔮 What's Next

Future enhancements include multi-model processing, hybrid search combining vector and keyword search, and advanced neural reranking for even better results.
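
Hybrid search needs some way to merge a vector ranking with a keyword ranking. One widely used option is reciprocal rank fusion (RRF), sketched below. This only illustrates the general idea: nothing here says which fusion method Archon will adopt, and the document IDs and the constant `k = 60` are illustrative.

```javascript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// Hypothetical rankings from a vector search and a keyword search.
const vectorRanking = ['doc-a', 'doc-b', 'doc-c'];
const keywordRanking = ['doc-b', 'doc-d', 'doc-a'];

const fused = reciprocalRankFusion([vectorRanking, keywordRanking]);
console.log(fused); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that appear near the top of both lists accumulate the highest score, so `doc-b` (ranked second and first) edges out `doc-a` (ranked first and third).
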