* Initial commit for RAG by document
* Phase 2
* Adding migrations
* Fixing page IDs for chunk metadata
* Fixing unit tests, adding tool to list pages for source
* Fixing page storage upsert issues
* Max file length for retrieval
* Fixing title issue
* Fixing tests
* fix: implement CASCADE DELETE for source deletion timeout issue
- Add migration 009 to add CASCADE DELETE constraints to foreign keys
- Simplify delete_source() to only delete parent record
- Database now handles cascading deletes efficiently
- Fixes timeout issues when deleting sources with thousands of pages (see the sketch below)
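A minimal sketch of the simplified deletion path, assuming a Supabase-backed service and hypothetical table/column names; the real delete_source() and the CASCADE constraints live in the project's source service and migration 009 respectively:

```python
# Sketch only: table/column names are assumptions, not the project's schema.
from supabase import Client


def delete_source(client: Client, source_id: str) -> None:
    """Delete only the parent row; ON DELETE CASCADE removes pages and chunks."""
    # Previously the service deleted thousands of child rows itself, which could
    # time out. With CASCADE constraints, the database handles the fan-out.
    client.table("archon_sources").delete().eq("source_id", source_id).execute()
```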
* chore: update complete_setup.sql to include CASCADE DELETE constraints
- Add ON DELETE CASCADE to foreign keys in initial setup
- Include migration 009 in the migrations tracking
- Ensures new installations have CASCADE DELETE from the start
Updates crawl4ai dependency to latest stable version with performance
and stability improvements.
Key improvements in 0.7.4:
- LLM-powered table extraction with intelligent chunking
- Fixed dispatcher bug for better concurrent processing
- Resolved browser manager race conditions
- Enhanced URL processing and proxy support
All existing tests pass (18/18). No breaking changes identified.
API remains backward compatible.
⚠️ IMPORTANT: URL Resolution Bug Status
A critical bug in v0.6.2 where ../../ paths only go up ONE directory
instead of TWO has been documented (see crawler-test branch). Status
in v0.7.4 is UNKNOWN - testing required before production deployment.
Test script provided: python/test_url_resolution_fix.py (an illustrative standard-library check follows below)
Related issues fixed in v0.7.x:
- #570: General relative URL handling
- #1268: URLs after redirects
- #1323: Trailing slash base URL handling
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
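For context, correct relative-URL resolution climbs one directory per ../ segment. A standalone standard-library illustration of the expected behavior (this is not the project's test script, which exercises the crawler itself):

```python
# Independent sanity check of ../../ resolution; no crawl4ai involved.
from urllib.parse import urljoin

base = "https://example.com/docs/guide/page/index.html"
resolved = urljoin(base, "../../api/reference.html")

# Correct resolution climbs TWO levels: /docs/guide/page/ -> /docs/
assert resolved == "https://example.com/docs/api/reference.html"
print(resolved)
```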
* Add Anthropic and Grok provider support
* feat: Add crucial GPT-5 and reasoning model support for OpenRouter
- Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
- Add prepare_chat_completion_params() for reasoning model compatibility
- Implement max_tokens → max_completion_tokens conversion for reasoning models
- Add temperature handling for reasoning models (must be 1.0 default); see the sketch below
- Enhanced provider validation and API key security in provider endpoints
- Streamlined retry logic (3→2 attempts) for faster issue detection
- Add failure tracking and circuit breaker analysis for debugging
- Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
- Improved Grok provider empty response handling with structured fallbacks
- Enhanced contextual embedding with provider-aware model selection
Core provider functionality:
- OpenRouter, Grok, Anthropic provider support with full embedding integration
- Provider-specific model defaults and validation
- Secure API connectivity testing endpoints
- Provider context passing for code generation workflows
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
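The parameter handling for reasoning models works roughly as sketched below; the prefix list and internals are illustrative and may differ from the real requires_max_completion_tokens() / prepare_chat_completion_params():

```python
# Illustrative sketch; model prefixes and details are assumptions.
REASONING_MODEL_PREFIXES = ("gpt-5", "o1", "o3", "grok-3")


def requires_max_completion_tokens(model: str) -> bool:
    # OpenRouter-style ids such as "openai/gpt-5-nano" carry a provider prefix.
    name = model.split("/", 1)[-1].lower()
    return name.startswith(REASONING_MODEL_PREFIXES)


def prepare_chat_completion_params(model: str, params: dict) -> dict:
    prepared = dict(params)
    if requires_max_completion_tokens(model):
        # Reasoning models reject max_tokens and expect max_completion_tokens.
        if "max_tokens" in prepared:
            prepared["max_completion_tokens"] = prepared.pop("max_tokens")
        # They also require the default temperature of 1.0.
        prepared["temperature"] = 1.0
    return prepared
```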
* fully working model providers, addressing security and code-related concerns, thoroughly hardening our code
* added multi-provider support and embeddings model support, cleaned up the PR; still need to fix the health check, asyncio task errors, and the contextual embeddings error
* fixed contextual embeddings issue
* - Added inspect-aware shutdown handling so get_llm_client always closes the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/services/llm_provider_service.py:520).
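A rough sketch of the close pattern described here, illustrative only:

```python
# Await whichever aclose()/close() the SDK client exposes; log defensively if
# shutdown happens after the loop is already gone. Not the service's exact code.
import inspect
import logging

logger = logging.getLogger(__name__)


async def _close_client(client) -> None:
    closer = getattr(client, "aclose", None) or getattr(client, "close", None)
    if closer is None:
        return
    try:
        result = closer()
        if inspect.isawaitable(result):
            await result
    except RuntimeError as exc:  # e.g. "Event loop is closed" during late shutdown
        logger.warning("LLM client close ran after loop shutdown: %s", exc)
```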
* - Restructured get_llm_client so client creation and usage live in separate try/finally blocks; fallback clients now close without logging a spurious "Error creating LLM client" when downstream code raises (python/src/server/services/llm_provider_service.py:335-556).
- Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559).
- Robust JSON parsing: added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors seen in logs (python/src/server/services/storage/code_storage_service.py:40-63).
- Swapped the direct parse call for the sanitized payload; a debug preview is now emitted when cleanup alters the content (python/src/server/services/storage/code_storage_service.py:858-864).
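The fence-stripping idea looks roughly like this; the project's _extract_json_payload may differ in details:

```python
# Illustrative helper: pull the JSON body out of a possibly markdown-wrapped reply.
import json
import re

_FENCE_RE = re.compile(r"`{3}(?:json)?\s*(.*?)\s*`{3}", re.DOTALL)


def extract_json_payload(raw: str) -> str:
    match = _FENCE_RE.search(raw)
    candidate = match.group(1) if match else raw
    # Fall back to the outermost braces if extra prose surrounds the object.
    start, end = candidate.find("{"), candidate.rfind("}")
    return candidate[start : end + 1] if start != -1 and end > start else candidate


# Build a markdown-fenced reply without embedding literal fences in this block.
reply = "Sure!\n" + "`" * 3 + "json\n" + '{"examples": []}\n' + "`" * 3
data = json.loads(extract_json_payload(reply))  # parses despite the wrapper
```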
* added provider connection support
* added a warning when a provider API key is not configured
* Updated get_llm_client so missing OpenAI keys automatically fall back to Ollama (matching existing tests) and so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError("OpenAI API key not found and Ollama fallback failed") when it can't connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm it's None when present).
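The fallback decision, sketched under stated assumptions (the helper is a stand-in and the provider set is assumed):

```python
# Behavior sketch only; the real get_llm_client is async and returns SDK clients.
SUPPORTED_PROVIDERS = {"openai", "ollama", "google", "openrouter", "anthropic", "grok"}


def _get_optimal_ollama_instance() -> str:
    # Stand-in for the service helper of the same name, which probes configured instances.
    return "http://host.docker.internal:11434/v1"


def choose_provider(provider: str, api_key: str | None) -> str:
    """Return the provider to actually use, mirroring the fallback described above."""
    if provider not in SUPPORTED_PROVIDERS:
        # Unsupported providers still raise the legacy ValueError the suite expects.
        raise ValueError(f"Unsupported LLM provider: {provider}")
    if provider == "openai" and not api_key:
        try:
            _get_optimal_ollama_instance()  # verify an Ollama instance is reachable
            return "ollama"
        except Exception as exc:
            raise ValueError("OpenAI API key not found and Ollama fallback failed") from exc
    return provider
```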
* Resolved a few needed CodeRabbit suggestions:
- Updated the knowledge API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py).
- Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py).
- Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
* updated via CodeRabbit PR review; CodeRabbit in my IDE found no issues and no nitpicks with the updates! What was done:
- Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced).
- Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py).
- Contextual embedding prompts use real newline characters everywhere, both when constructing the batch prompt and when parsing the model's response (python/src/server/services/embeddings/contextual_embedding_service.py).
- Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there.
- Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536-dimension column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py).
- RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor / EMBEDDING_PROVIDER support remains deferred).
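The dimension check amounts to skipping rather than coercing mismatched vectors; a small sketch with an assumed 1536-dimension target column:

```python
# Illustrative filter; the expected dimension and record shape are assumptions.
import logging

logger = logging.getLogger(__name__)
EXPECTED_EMBEDDING_DIM = 1536


def filter_valid_embeddings(records: list[dict]) -> list[dict]:
    valid = []
    for record in records:
        embedding = record.get("embedding") or []
        if len(embedding) != EXPECTED_EMBEDDING_DIM:
            # Skip instead of forcing a mismatched vector into the 1536-dim column.
            logger.warning("Skipping %s: got %d dims", record.get("id"), len(embedding))
            continue
        valid.append(record)
    return valid
```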
* test fix
* enhanced OpenRouter's parsing logic to automatically detect reasoning models and parse their output whether or not it is JSON. This commit makes Archon's parsing work robustly with OpenRouter regardless of the model you're using, ensuring proper functionality without breaking any generation capabilities.
* updated the UI LLM interface, added a separate embeddings provider, and made the system fully capable of mixing and matching LLM providers (local and non-local) for chat & embeddings. Updated mainly the RAGSettings.tsx UI, along with core functionality.
* added warning labels and updated ollama health checks
* ready for review; fixed some error warnings and consolidated Ollama status health checks
* fixed FAILED test_async_embedding_service.py
* CodeRabbit fixes
* Separated the code-summary LLM provider from the embedding provider, so code example storage now forwards a dedicated embedding provider override end-to-end without hijacking the embedding pipeline. This addresses CodeRabbit's "Preserve provider override in create_embeddings_batch" suggestion.
* - Swapped API credential storage to booleans so decrypted keys never sit in React state (archon-ui-main/src/components/settings/RAGSettings.tsx).
- Normalized Ollama instance URLs and gated the metrics effect on real state changes to avoid mis-counts and duplicate fetches (RAGSettings.tsx).
- Tightened crawl progress scaling and indented-block parsing to handle min_length=None safely (python/src/server/services/crawling/code_extraction_service.py:160, python/src/server/services/crawling/code_extraction_service.py:911).
- Added provider-agnostic embedding rate-limit retries so Google and friends back off gracefully (python/src/server/services/embeddings/embedding_service.py:427); see the retry sketch below.
- Made the orchestration registry async + thread-safe and updated every caller to await it (python/src/server/services/crawling/crawling_service.py:34, python/src/server/api_routes/knowledge_api.py:1291).
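The provider-agnostic retry is essentially exponential backoff keyed off a 429-style error; a generic sketch, not the service's exact implementation:

```python
# Assumes rate limits surface as an exception carrying a status_code attribute.
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_rate_limit_retry(
    call: Callable[[], Awaitable[T]], attempts: int = 3, base_delay: float = 1.0
) -> T:
    for attempt in range(attempts):
        try:
            return await call()
        except Exception as exc:  # provider-agnostic: inspect the error, not its type
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == attempts - 1:
                raise
            # Awaiting the sleep keeps the event loop free for other requests.
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```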
* Update RAGSettings.tsx - header for 'LLM Settings' is now 'LLM Provider Settings'
* (RAG Settings)
- Ollama Health Checks & Metrics
- Added a 10-second timeout to the health fetch so it doesn't hang.
- Adjusted logic so metric refreshes run for embedding-only Ollama setups too.
- Initial page load now checks Ollama if either chat or embedding provider uses it.
- Metrics and alerts now respect which provider (chat/embedding) is currently selected.
- Provider Sync & Alerts
- Fixed a sync bug so the very first provider change updates settings as expected.
- Alerts now track the active provider (chat vs embedding) rather than only the LLM provider.
- Warnings about missing credentials now skip whichever provider is currently selected.
- Modals & Types
- Normalize URLs before handing them to selection modals to keep consistent data.
- Strengthened helper function types (getDisplayedChatModel, getModelPlaceholder, etc.).
(Crawling Service)
- Made the orchestration registry lock lazy-initialized to avoid issues in Python 3.12 and wrapped registry commands (register, unregister) in async calls. This keeps things thread-safe even during concurrent crawling and cancellation (see the sketch below).
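The lazily-locked registry pattern looks roughly like this; the real registry tracks crawl orchestrations and exposes more operations:

```python
# Sketch only; names and stored values are illustrative.
import asyncio


class OrchestrationRegistry:
    def __init__(self) -> None:
        self._entries: dict[str, object] = {}
        self._lock: asyncio.Lock | None = None  # created only once a loop is running

    def _get_lock(self) -> asyncio.Lock:
        # Lazy init defers lock creation until first use inside the event loop,
        # sidestepping the Python 3.12 issue noted above.
        if self._lock is None:
            self._lock = asyncio.Lock()
        return self._lock

    async def register(self, progress_id: str, task: object) -> None:
        async with self._get_lock():
            self._entries[progress_id] = task

    async def unregister(self, progress_id: str) -> None:
        async with self._get_lock():
            self._entries.pop(progress_id, None)
```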
* - migration/complete_setup.sql:101 seeds Google/OpenRouter/Anthropic/Grok API key rows so fresh databases expose every provider by default.
- migration/0.1.0/009_add_provider_placeholders.sql:1 backfills the same rows for existing Supabase instances and records the migration.
- archon-ui-main/src/components/settings/RAGSettings.tsx:121 introduces a shared credential-to-provider map, reloadApiCredentials runs through all five providers, and the status poller includes the new keys.
- archon-ui-main/src/components/settings/RAGSettings.tsx:353 subscribes to the archon:credentials-updated browser event so adding/removing a key immediately refetches credential status and pings the corresponding connectivity test.
- archon-ui-main/src/components/settings/RAGSettings.tsx:926 now treats missing Anthropic/OpenRouter/Grok keys as missing, preventing stale connected badges when a key is removed.
* - archon-ui-main/src/components/settings/RAGSettings.tsx:90 adds a simple display-name map and reuses one red alert style.
- archon-ui-main/src/components/settings/RAGSettings.tsx:1016 now shows exactly one red banner when the active provider is missing its API key.
- Removed the old duplicate Missing API Key Configuration block, so the panel no longer stacks two warnings.
* Update credentialsService.ts default model
* updated the Google embedding adapter for multi-dimensional RAG querying
* thought this micro fix in the Google embedding adapter went out with the embedding update the other day; it didn't. Pushing it now.
---------
Co-authored-by: Chillbruhhh <joshchesser97@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
* refactor: reorganize features/shared directory structure
- Created organized subdirectories for better code organization:
- api/ - API clients and HTTP utilities (renamed apiWithEtag.ts to apiClient.ts)
- config/ - Configuration files (queryClient, queryPatterns)
- types/ - Shared type definitions (errors)
- utils/ - Pure utility functions (optimistic, clipboard)
- hooks/ - Shared React hooks (already existed)
- Updated all import paths across the codebase (~40+ files)
- Updated all AI documentation in PRPs/ai_docs/ to reflect new structure
- All tests passing, build successful, no functional changes
This improves maintainability and follows vertical slice architecture patterns.
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address PR review comments and code improvements
- Update imports to use @/features alias path for optimistic utils
- Fix optimistic upload item replacement by matching on source_id instead of id
- Clean up test suite naming and remove meta-terms from comments
- Only set Content-Type header on requests with body
- Add explicit TypeScript typing to useProjectFeatures hook
- Complete Phase 4 improvements with proper query typing
* fix: address additional PR review feedback
- Clear feature queries when deleting project to prevent cache memory leaks
- Update KnowledgeCard comments to follow documentation guidelines
- Add explanatory comment for accessibility pattern in KnowledgeCard
---------
Co-authored-by: Claude <noreply@anthropic.com>
Reorganize hook structure to follow vertical slice architecture:
- Move useSmartPolling, useThemeAware, useToast to features/shared/hooks
- Update 38+ import statements across codebase
- Update test file mocks to reference new locations
- Remove old ui/hooks directory
This change aligns shared utilities with the architectural pattern
where truly shared code resides in the shared directory.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <noreply@anthropic.com>
* Preparing migration folder for the migration alert implementation
* Migrations and version APIs initial
* Touching up update instructions in README and UI
* Unit tests for migrations and version APIs
* Splitting up the Ollama migration scripts
* Removing temporary PRPs
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
- Changed default Ollama URL from localhost:11434 to host.docker.internal:11434
- This allows Docker containers to connect to Ollama running on the host machine
- Updated in backend services, frontend components, migration scripts, and documentation
- Most users run Archon in Docker but Ollama as a local binary, making this a better default (see the sketch below)
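In practice the default just points the OpenAI-compatible client at the host's Ollama; a hedged sketch (the env var name here is hypothetical):

```python
# Ollama exposes an OpenAI-compatible /v1 API; the api_key value is ignored by Ollama.
import os
from openai import AsyncOpenAI

ollama_base_url = os.getenv("LLM_BASE_URL", "http://host.docker.internal:11434/v1")
client = AsyncOpenAI(base_url=ollama_base_url, api_key="ollama")
```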
* Add Codex MCP configuration instructions
- Added Codex as a supported IDE in the MCP configuration UI
- Removed Augment (duplicate of Cursor configuration)
- Positioned Codex between Gemini and Cursor in the tab order
- Added platform-specific configuration support for Windows vs Linux/macOS
- Includes step-by-step instructions for installing mcp-remote and configuring Codex
- Shows appropriate TOML configuration based on detected platform
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Finalizing Codex instructions
---------
Co-authored-by: Claude <noreply@anthropic.com>