* Initial commit for RAG by document
* Phase 2
* Adding migrations
* Fixing page IDs for chunk metadata
* Fixing unit tests, adding tool to list pages for source
* Fixing page storage upsert issues
* Max file length for retrieval
* Fixing title issue
* Fixing tests
* fix: implement CASCADE DELETE for source deletion timeout issue
- Add migration 009 to add CASCADE DELETE constraints to foreign keys
- Simplify delete_source() to only delete parent record
- Database now handles cascading deletes efficiently
- Fixes timeout issues when deleting sources with thousands of pages
* chore: update complete_setup.sql to include CASCADE DELETE constraints
- Add ON DELETE CASCADE to foreign keys in initial setup
- Include migration 009 in the migrations tracking
- Ensures new installations have CASCADE DELETE from the start
Updates crawl4ai dependency to latest stable version with performance
and stability improvements.
Key improvements in 0.7.4:
- LLM-powered table extraction with intelligent chunking
- Fixed dispatcher bug for better concurrent processing
- Resolved browser manager race conditions
- Enhanced URL processing and proxy support
All existing tests pass (18/18). No breaking changes identified.
API remains backward compatible.
⚠️ IMPORTANT: URL Resolution Bug Status
A critical bug in v0.6.2 where ../../ paths only go up ONE directory
instead of TWO has been documented (see crawler-test branch). Status
in v0.7.4 is UNKNOWN - testing required before production deployment.
Test script provided: python/test_url_resolution_fix.py
Related issues fixed in v0.7.x:
- #570: General relative URL handling
- #1268: URLs after redirects
- #1323: Trailing slash base URL handling
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add Anthropic and Grok provider support
* feat: Add crucial GPT-5 and reasoning model support for OpenRouter
- Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
- Add prepare_chat_completion_params() for reasoning model compatibility
- Implement max_tokens → max_completion_tokens conversion for reasoning models
- Add temperature handling for reasoning models (must be 1.0 default)
- Enhanced provider validation and API key security in provider endpoints
- Streamlined retry logic (3→2 attempts) for faster issue detection
- Add failure tracking and circuit breaker analysis for debugging
- Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
- Improved Grok provider empty response handling with structured fallbacks
- Enhanced contextual embedding with provider-aware model selection
Core provider functionality:
- OpenRouter, Grok, Anthropic provider support with full embedding integration
- Provider-specific model defaults and validation
- Secure API connectivity testing endpoints
- Provider context passing for code generation workflows
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fully working model providers, addressing securtiy and code related concerns, throughly hardening our code
* added multiprovider support, embeddings model support, cleaned the pr, need to fix health check, asnyico tasks errors, and contextual embeddings error
* fixed contextual embeddings issue
* - Added inspect-aware shutdown handling so get_llm_client always closes the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/ services/llm_provider_service.py:520).
* - Restructured get_llm_client so client creation and usage live in separate try/finally blocks; fallback clients now close without logging spurious Error creating LLM client when downstream code raises (python/src/server/services/llm_provider_service.py:335-556). - Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559). Robust JSON Parsing - Added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors you saw in logs (python/src/server/services/storage/code_storage_service.py:40-63). - Swapped the direct parse call for the sanitized payload and emit a debug preview when cleanup alters the content (python/src/server/ services/storage/code_storage_service.py:858-864).
* added provider connection support
* added provider api key not being configured warning
* Updated get_llm_client so missing OpenAI keys automatically fall back to Ollama (matching existing tests) and so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError(OpenAI API key not found and Ollama fallback failed) when it cant connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm its None when present).
* Resolved a few needed code rabbit suggestion - Updated the knowledge API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py). - Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/ services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py). - Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
* updated via code rabbit PR review, code rabbit in my IDE found no issues and no nitpicks with the updates! what was done: Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced). Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py). Contextual embedding prompts use real newline characters everywhereboth when constructing the batch prompt and when parsing the models response (python/src/server/services/embeddings/contextual_embedding_service.py). Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there. Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536-column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py). RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor/EMBEDDING_PROVIDER support remains deferred).
* test fix
* enhanced Openrouters parsing logic to automatically detect reasoning models and parse regardless of json output or not. this commit creates a robust way for archons parsing to work throughly with openrouter automatically, regardless of the model youre using, to ensure proper functionality with out breaking any generation capabilities!
* updated ui llm interface, added seprate embeddings provider, made the system fully capabale of mix and matching llm providers (local and non local) for chat & embeddings. updated the ragsettings.tsx ui mainly, along with core functionality
* added warning labels and updated ollama health checks
* ready for review, fixed som error warnings and consildated ollama status health checks
* fixed FAILED test_async_embedding_service.py
* code rabbit fixes
* Separated the code-summary LLM provider from the embedding provider, so code example storage now forwards a dedicated embedding provider override end-to-end without hijacking the embedding pipeline. this fixes code rabbits (Preserve provider override in create_embeddings_batch) suggesting
* - Swapped API credential storage to booleans so decrypted keys never sit in React state (archon-ui-main/src/components/
settings/RAGSettings.tsx).
- Normalized Ollama instance URLs and gated the metrics effect on real state changes to avoid mis-counts and duplicate
fetches (RAGSettings.tsx).
- Tightened crawl progress scaling and indented-block parsing to handle min_length=None safely (python/src/server/
services/crawling/code_extraction_service.py:160, python/src/server/services/crawling/code_extraction_service.py:911).
- Added provider-agnostic embedding rate-limit retries so Google and friends back off gracefully (python/src/server/
services/embeddings/embedding_service.py:427).
- Made the orchestration registry async + thread-safe and updated every caller to await it (python/src/server/services/
crawling/crawling_service.py:34, python/src/server/api_routes/knowledge_api.py:1291).
* Update RAGSettings.tsx - header for 'LLM Settings' is now 'LLM Provider Settings'
* (RAG Settings)
- Ollama Health Checks & Metrics
- Added a 10-second timeout to the health fetch so it doesn't hang.
- Adjusted logic so metric refreshes run for embedding-only Ollama setups too.
- Initial page load now checks Ollama if either chat or embedding provider uses it.
- Metrics and alerts now respect which provider (chat/embedding) is currently selected.
- Provider Sync & Alerts
- Fixed a sync bug so the very first provider change updates settings as expected.
- Alerts now track the active provider (chat vs embedding) rather than only the LLM provider.
- Warnings about missing credentials now skip whichever provider is currently selected.
- Modals & Types
- Normalize URLs before handing them to selection modals to keep consistent data.
- Strengthened helper function types (getDisplayedChatModel, getModelPlaceholder, etc.).
(Crawling Service)
- Made the orchestration registry lock lazy-initialized to avoid issues in Python 3.12 and wrapped registry commands
(register, unregister) in async calls. This keeps things thread-safe even during concurrent crawling and cancellation.
* - migration/complete_setup.sql:101 seeds Google/OpenRouter/Anthropic/Grok API key rows so fresh databases expose every
provider by default.
- migration/0.1.0/009_add_provider_placeholders.sql:1 backfills the same rows for existing Supabase instances and
records the migration.
- archon-ui-main/src/components/settings/RAGSettings.tsx:121 introduces a shared credentialprovider map,
reloadApiCredentials runs through all five providers, and the status poller includes the new keys.
- archon-ui-main/src/components/settings/RAGSettings.tsx:353 subscribes to the archon:credentials-updated browser event
so adding/removing a key immediately refetches credential status and pings the corresponding connectivity test.
- archon-ui-main/src/components/settings/RAGSettings.tsx:926 now treats missing Anthropic/OpenRouter/Grok keys as
missing, preventing stale connected badges when a key is removed.
* - archon-ui-main/src/components/settings/RAGSettings.tsx:90 adds a simple display-name map and reuses one red alert
style.
- archon-ui-main/src/components/settings/RAGSettings.tsx:1016 now shows exactly one red banner when the active provider
- Removed the old duplicate Missing API Key Configuration block, so the panel no longer stacks two warnings.
* Update credentialsService.ts default model
* updated the google embedding adapter for multi dimensional rag querying
* thought this micro fix in the google embedding pushed with the embedding update the other day, it didnt. pushing now
---------
Co-authored-by: Chillbruhhh <joshchesser97@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
* refactor: reorganize features/shared directory structure
- Created organized subdirectories for better code organization:
- api/ - API clients and HTTP utilities (renamed apiWithEtag.ts to apiClient.ts)
- config/ - Configuration files (queryClient, queryPatterns)
- types/ - Shared type definitions (errors)
- utils/ - Pure utility functions (optimistic, clipboard)
- hooks/ - Shared React hooks (already existed)
- Updated all import paths across the codebase (~40+ files)
- Updated all AI documentation in PRPs/ai_docs/ to reflect new structure
- All tests passing, build successful, no functional changes
This improves maintainability and follows vertical slice architecture patterns.
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address PR review comments and code improvements
- Update imports to use @/features alias path for optimistic utils
- Fix optimistic upload item replacement by matching on source_id instead of id
- Clean up test suite naming and remove meta-terms from comments
- Only set Content-Type header on requests with body
- Add explicit TypeScript typing to useProjectFeatures hook
- Complete Phase 4 improvements with proper query typing
* fix: address additional PR review feedback
- Clear feature queries when deleting project to prevent cache memory leaks
- Update KnowledgeCard comments to follow documentation guidelines
- Add explanatory comment for accessibility pattern in KnowledgeCard
---------
Co-authored-by: Claude <noreply@anthropic.com>
Reorganize hook structure to follow vertical slice architecture:
- Move useSmartPolling, useThemeAware, useToast to features/shared/hooks
- Update 38+ import statements across codebase
- Update test file mocks to reference new locations
- Remove old ui/hooks directory
This change aligns shared utilities with the architectural pattern
where truly shared code resides in the shared directory.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <noreply@anthropic.com>
* Preparing migration folder for the migration alert implementation
* Migrations and version APIs initial
* Touching up update instructions in README and UI
* Unit tests for migrations and version APIs
* Splitting up the Ollama migration scripts
* Removing temporary PRPs
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
* Add Anthropic and Grok provider support
* feat: Add crucial GPT-5 and reasoning model support for OpenRouter
- Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
- Add prepare_chat_completion_params() for reasoning model compatibility
- Implement max_tokens → max_completion_tokens conversion for reasoning models
- Add temperature handling for reasoning models (must be 1.0 default)
- Enhanced provider validation and API key security in provider endpoints
- Streamlined retry logic (3→2 attempts) for faster issue detection
- Add failure tracking and circuit breaker analysis for debugging
- Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
- Improved Grok provider empty response handling with structured fallbacks
- Enhanced contextual embedding with provider-aware model selection
Core provider functionality:
- OpenRouter, Grok, Anthropic provider support with full embedding integration
- Provider-specific model defaults and validation
- Secure API connectivity testing endpoints
- Provider context passing for code generation workflows
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fully working model providers, addressing securtiy and code related concerns, throughly hardening our code
* added multiprovider support, embeddings model support, cleaned the pr, need to fix health check, asnyico tasks errors, and contextual embeddings error
* fixed contextual embeddings issue
* - Added inspect-aware shutdown handling so get_llm_client always closes the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/ services/llm_provider_service.py:520).
* - Restructured get_llm_client so client creation and usage live in separate try/finally blocks; fallback clients now close without logging spurious Error creating LLM client when downstream code raises (python/src/server/services/llm_provider_service.py:335-556). - Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559). Robust JSON Parsing - Added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors you saw in logs (python/src/server/services/storage/code_storage_service.py:40-63). - Swapped the direct parse call for the sanitized payload and emit a debug preview when cleanup alters the content (python/src/server/ services/storage/code_storage_service.py:858-864).
* added provider connection support
* added provider api key not being configured warning
* Updated get_llm_client so missing OpenAI keys automatically fall back to Ollama (matching existing tests) and so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError(OpenAI API key not found and Ollama fallback failed) when it cant connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm its None when present).
* Resolved a few needed code rabbit suggestion - Updated the knowledge API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py). - Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/ services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py). - Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
* updated via code rabbit PR review, code rabbit in my IDE found no issues and no nitpicks with the updates! what was done: Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced). Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py). Contextual embedding prompts use real newline characters everywhereboth when constructing the batch prompt and when parsing the models response (python/src/server/services/embeddings/contextual_embedding_service.py). Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there. Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536-column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py). RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor/EMBEDDING_PROVIDER support remains deferred).
* test fix
* enhanced Openrouters parsing logic to automatically detect reasoning models and parse regardless of json output or not. this commit creates a robust way for archons parsing to work throughly with openrouter automatically, regardless of the model youre using, to ensure proper functionality with out breaking any generation capabilities!
---------
Co-authored-by: Chillbruhhh <joshchesser97@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
- Changed default Ollama URL from localhost:11434 to host.docker.internal:11434
- This allows Docker containers to connect to Ollama running on the host machine
- Updated in backend services, frontend components, migration scripts, and documentation
- Most users run Archon in Docker but Ollama as a local binary, making this a better default
* Add Codex MCP configuration instructions
- Added Codex as a supported IDE in the MCP configuration UI
- Removed Augment (duplicate of Cursor configuration)
- Positioned Codex between Gemini and Cursor in the tab order
- Added platform-specific configuration support for Windows vs Linux/macOS
- Includes step-by-step instructions for installing mcp-remote and configuring Codex
- Shows appropriate TOML configuration based on detected platform
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Finalizing Codex instructions
---------
Co-authored-by: Claude <noreply@anthropic.com>
When a file is selected through discovery, it should be crawled as a single file without
following any links contained within it. This preserves the efficiency gains of the
discovery feature.
Changes:
- Skip link extraction when is_discovery_target is true for link collection files
- Return sitemap metadata without crawling URLs when is_discovery_target is true
- Add clear logging to indicate single-file mode is active
This ensures discovered files (llms.txt, sitemap.xml, etc.) are processed as single
authoritative sources rather than starting recursive crawls, which aligns with the
PR's objective of efficient single-file discovery and crawling.
When a user directly provides a URL to a discovery file (sitemap.xml, llms.txt, robots.txt, etc.),
the system now skips the discovery phase and uses the provided file directly.
This prevents unnecessary discovery attempts and respects the user's explicit choice.
Changes:
- Check if the URL is already a discovery target before running discovery
- Skip discovery for: sitemap files, llms variants, robots.txt, well-known files, and any .txt files
- Add logging to indicate when discovery is skipped
Example: When crawling 'xyz.com/sitemap.xml' directly, the system will now use that sitemap
instead of trying to discover a different file like llms.txt
Two critical fixes for the automatic discovery feature:
1. Discovery Service path handling:
- Changed from always using root domain (/) to respecting given URL path
- e.g., for 'supabase.com/docs', now checks 'supabase.com/docs/robots.txt'
- Previously incorrectly checked 'supabase.com/robots.txt'
- Fixed all urljoin calls to use relative paths instead of absolute paths
2. Method signature mismatches:
- Removed start_progress and end_progress parameters from crawl_batch_with_progress
- Removed same parameters from crawl_recursive_with_progress
- Fixed all calls to these methods to match the strategy implementations
These fixes ensure discovery works correctly for subdirectory URLs and prevents TypeError crashes during crawling.
The progress mapper uses Python's round() function which rounds to nearest even number (banker's rounding). Updated test assertions to match actual rounding behavior:
- 3.5 rounds to 4 (not 3)
- 7.63 rounds to 8 (not 7)
- 9.5 rounds to 10 (not 9)
All tests now pass successfully.
- Resolved conflicts in progress_mapper.py to include discovery stage (3-4%)
- Resolved conflicts in crawling_service.py to maintain both discovery feature and main improvements
- Resolved conflicts in test_progress_mapper.py to include tests for discovery stage
- Kept all optimizations and improvements from main
- Maintained discovery feature functionality with proper integration
* chore, cleanup leftovers of tanstack refactoring
* refactor: Complete Phase 5 - Remove manual cache invalidations
- Removed all manual cache invalidations from knowledge queries
- Updated task queries to rely on backend consistency
- Fixed optimistic update utilities to handle edge cases
- Cleaned up unused imports and test utilities
- Fixed minor TypeScript issues in UI components
Backend now ensures data consistency through proper transaction handling,
eliminating the need for frontend cache coordination.
* docs: Enhance TODO comment for knowledge optimistic update issue
- Added comprehensive explanation of the query key mismatch issue
- Documented current behavior and impact on user experience
- Listed potential solutions with tradeoffs
- Created detailed PRP story in PRPs/local/ for future implementation
- References specific line numbers and implementation details
This documents a known limitation where optimistic updates to knowledge
items are invisible because mutations update the wrong query cache.