- Add WorkOrderLogsPanel with SSE streaming support
- Add RealTimeStats component for live metrics
- Add useWorkOrderLogs hook for SSE log streaming
- Add useLogStats hook for real-time statistics
- Update WorkOrderDetailView to display logs panel
- Add comprehensive tests for new components
- Configure Vite test environment
- Fix test mocks to use requests.Session for _check_url_exists
- Add url parameter to create_mock_response to prevent MagicMock issues
- Update all test scenarios to mock both requests.get and session.get
- Remove redundant UNSAFE_PROTOCOLS check in URL validation
- Fix test assertions to match new priority order (llms.txt > llms-full.txt)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Backend Improvements
### Discovery Service
- Fix SSRF protection: Use requests.Session() for max_redirects parameter
- Add comprehensive IP validation (_is_safe_ip, _resolve_and_validate_hostname)
- Add hostname DNS resolution validation before requests
- Fix llms.txt link following to crawl ALL same-domain pages (not just llms.txt files)
- Remove unused file variants: llms.md, llms.markdown, sitemap_index.xml, sitemap-index.xml
- Optimize DISCOVERY_PRIORITY based on real-world usage research
- Update priority: llms.txt > llms-full.txt > sitemap.xml > robots.txt
### URL Handler
- Fix .well-known path to be case-sensitive per RFC 8615
- Remove llms.md, llms.markdown, llms.mdx from variant detection
- Simplify link collection patterns to only .txt files (most common)
- Update llms_variants list to only include spec-compliant files
### Crawling Service
- Add tldextract for proper root domain extraction (handles .co.uk, .com.au, etc.)
- Replace naive domain extraction with robust get_root_domain() function
- Add tldextract>=5.0.0 to dependencies
## Frontend Improvements
### Type Safety
- Extend ActiveOperation type with discovery fields (discovered_file, discovered_file_type, linked_files)
- Remove all type casting (operation as any) from CrawlingProgress component
- Add proper TypeScript types for discovery information
### Security
- Create URL validation utility (urlValidation.ts)
- Only render clickable links for validated HTTP/HTTPS URLs
- Reject unsafe protocols (javascript:, data:, vbscript:, file:)
- Display invalid URLs as plain text instead of links
## Testing
- Update test mocks to include history and url attributes for redirect checking
- Fix .well-known case sensitivity tests (must be lowercase per RFC 8615)
- Update discovery priority tests to match new order
- Remove tests for deprecated file variants
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements complete llms.txt link following functionality that crawls
linked llms.txt files on the same domain/subdomain, along with critical
bug fixes for discovery priority and variant detection.
Backend Core Functionality:
- Add _is_same_domain_or_subdomain method for subdomain matching
- Fix is_llms_variant to detect .txt files in /llms/ directories
- Implement llms.txt link extraction and following logic
- Add two-phase discovery: prioritize ALL llms.txt before sitemaps
- Enhanced progress reporting with discovery metadata
Critical Bug Fixes:
- Discovery priority: Fixed sitemap.xml being found before llms.txt
- is_llms_variant: Now matches /llms/guides.txt, /llms/swift.txt, etc.
- These were blocking bugs preventing link following from working
Frontend UI:
- Add discovery and linked files display to CrawlingProgress component
- Update progress types to include discoveredFile, linkedFiles fields
- Add new crawl types: llms_txt_with_linked_files, discovery_*
- Add "discovery" to ProgressStatus enum and active statuses
Testing:
- 8 subdomain matching unit tests (test_crawling_service_subdomain.py)
- 7 integration tests for link following (test_llms_txt_link_following.py)
- All 15 tests passing
- Validated against real Supabase llms.txt structure (1 main + 8 linked)
Files Modified:
Backend:
- crawling_service.py: Core link following logic (lines 744-788, 862-920)
- url_handler.py: Fixed variant detection (lines 633-665)
- discovery_service.py: Two-phase discovery (lines 137-214)
- 2 new comprehensive test files
Frontend:
- progress/types/progress.ts: Updated types with new fields
- progress/components/CrawlingProgress.tsx: Added UI sections
Real-world testing: Crawling supabase.com/docs now discovers
/docs/llms.txt and automatically follows 8 linked llms.txt files,
indexing complete documentation from all files.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- add trailing slashes to agent-work-orders endpoints to prevent FastAPI mount() redirects
- add defensive null check for repository_url in detail view
- fix backend routes to use relative paths with app.mount()
- resolves ERR_NAME_NOT_RESOLVED when accessing agent work orders
* Add Anthropic and Grok provider support
* feat: Add crucial GPT-5 and reasoning model support for OpenRouter
- Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
- Add prepare_chat_completion_params() for reasoning model compatibility
- Implement max_tokens → max_completion_tokens conversion for reasoning models
- Add temperature handling for reasoning models (must be 1.0 default)
- Enhanced provider validation and API key security in provider endpoints
- Streamlined retry logic (3→2 attempts) for faster issue detection
- Add failure tracking and circuit breaker analysis for debugging
- Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
- Improved Grok provider empty response handling with structured fallbacks
- Enhanced contextual embedding with provider-aware model selection
Core provider functionality:
- OpenRouter, Grok, Anthropic provider support with full embedding integration
- Provider-specific model defaults and validation
- Secure API connectivity testing endpoints
- Provider context passing for code generation workflows
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fully working model providers, addressing securtiy and code related concerns, throughly hardening our code
* added multiprovider support, embeddings model support, cleaned the pr, need to fix health check, asnyico tasks errors, and contextual embeddings error
* fixed contextual embeddings issue
* - Added inspect-aware shutdown handling so get_llm_client always closes the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/ services/llm_provider_service.py:520).
* - Restructured get_llm_client so client creation and usage live in separate try/finally blocks; fallback clients now close without logging spurious Error creating LLM client when downstream code raises (python/src/server/services/llm_provider_service.py:335-556). - Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559). Robust JSON Parsing - Added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors you saw in logs (python/src/server/services/storage/code_storage_service.py:40-63). - Swapped the direct parse call for the sanitized payload and emit a debug preview when cleanup alters the content (python/src/server/ services/storage/code_storage_service.py:858-864).
* added provider connection support
* added provider api key not being configured warning
* Updated get_llm_client so missing OpenAI keys automatically fall back to Ollama (matching existing tests) and so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError(OpenAI API key not found and Ollama fallback failed) when it cant connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm its None when present).
* Resolved a few needed code rabbit suggestion - Updated the knowledge API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py). - Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/ services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py). - Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
* updated via code rabbit PR review, code rabbit in my IDE found no issues and no nitpicks with the updates! what was done: Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced). Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py). Contextual embedding prompts use real newline characters everywhereboth when constructing the batch prompt and when parsing the models response (python/src/server/services/embeddings/contextual_embedding_service.py). Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there. Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536-column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py). RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor/EMBEDDING_PROVIDER support remains deferred).
* test fix
* enhanced Openrouters parsing logic to automatically detect reasoning models and parse regardless of json output or not. this commit creates a robust way for archons parsing to work throughly with openrouter automatically, regardless of the model youre using, to ensure proper functionality with out breaking any generation capabilities!
* updated ui llm interface, added seprate embeddings provider, made the system fully capabale of mix and matching llm providers (local and non local) for chat & embeddings. updated the ragsettings.tsx ui mainly, along with core functionality
* added warning labels and updated ollama health checks
* ready for review, fixed som error warnings and consildated ollama status health checks
* fixed FAILED test_async_embedding_service.py
* code rabbit fixes
* Separated the code-summary LLM provider from the embedding provider, so code example storage now forwards a dedicated embedding provider override end-to-end without hijacking the embedding pipeline. this fixes code rabbits (Preserve provider override in create_embeddings_batch) suggesting
* - Swapped API credential storage to booleans so decrypted keys never sit in React state (archon-ui-main/src/components/
settings/RAGSettings.tsx).
- Normalized Ollama instance URLs and gated the metrics effect on real state changes to avoid mis-counts and duplicate
fetches (RAGSettings.tsx).
- Tightened crawl progress scaling and indented-block parsing to handle min_length=None safely (python/src/server/
services/crawling/code_extraction_service.py:160, python/src/server/services/crawling/code_extraction_service.py:911).
- Added provider-agnostic embedding rate-limit retries so Google and friends back off gracefully (python/src/server/
services/embeddings/embedding_service.py:427).
- Made the orchestration registry async + thread-safe and updated every caller to await it (python/src/server/services/
crawling/crawling_service.py:34, python/src/server/api_routes/knowledge_api.py:1291).
* Update RAGSettings.tsx - header for 'LLM Settings' is now 'LLM Provider Settings'
* (RAG Settings)
- Ollama Health Checks & Metrics
- Added a 10-second timeout to the health fetch so it doesn't hang.
- Adjusted logic so metric refreshes run for embedding-only Ollama setups too.
- Initial page load now checks Ollama if either chat or embedding provider uses it.
- Metrics and alerts now respect which provider (chat/embedding) is currently selected.
- Provider Sync & Alerts
- Fixed a sync bug so the very first provider change updates settings as expected.
- Alerts now track the active provider (chat vs embedding) rather than only the LLM provider.
- Warnings about missing credentials now skip whichever provider is currently selected.
- Modals & Types
- Normalize URLs before handing them to selection modals to keep consistent data.
- Strengthened helper function types (getDisplayedChatModel, getModelPlaceholder, etc.).
(Crawling Service)
- Made the orchestration registry lock lazy-initialized to avoid issues in Python 3.12 and wrapped registry commands
(register, unregister) in async calls. This keeps things thread-safe even during concurrent crawling and cancellation.
* - migration/complete_setup.sql:101 seeds Google/OpenRouter/Anthropic/Grok API key rows so fresh databases expose every
provider by default.
- migration/0.1.0/009_add_provider_placeholders.sql:1 backfills the same rows for existing Supabase instances and
records the migration.
- archon-ui-main/src/components/settings/RAGSettings.tsx:121 introduces a shared credentialprovider map,
reloadApiCredentials runs through all five providers, and the status poller includes the new keys.
- archon-ui-main/src/components/settings/RAGSettings.tsx:353 subscribes to the archon:credentials-updated browser event
so adding/removing a key immediately refetches credential status and pings the corresponding connectivity test.
- archon-ui-main/src/components/settings/RAGSettings.tsx:926 now treats missing Anthropic/OpenRouter/Grok keys as
missing, preventing stale connected badges when a key is removed.
* - archon-ui-main/src/components/settings/RAGSettings.tsx:90 adds a simple display-name map and reuses one red alert
style.
- archon-ui-main/src/components/settings/RAGSettings.tsx:1016 now shows exactly one red banner when the active provider
- Removed the old duplicate Missing API Key Configuration block, so the panel no longer stacks two warnings.
* Update credentialsService.ts default model
* updated the google embedding adapter for multi dimensional rag querying
* thought this micro fix in the google embedding pushed with the embedding update the other day, it didnt. pushing now
---------
Co-authored-by: Chillbruhhh <joshchesser97@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
* refactor: reorganize features/shared directory structure
- Created organized subdirectories for better code organization:
- api/ - API clients and HTTP utilities (renamed apiWithEtag.ts to apiClient.ts)
- config/ - Configuration files (queryClient, queryPatterns)
- types/ - Shared type definitions (errors)
- utils/ - Pure utility functions (optimistic, clipboard)
- hooks/ - Shared React hooks (already existed)
- Updated all import paths across the codebase (~40+ files)
- Updated all AI documentation in PRPs/ai_docs/ to reflect new structure
- All tests passing, build successful, no functional changes
This improves maintainability and follows vertical slice architecture patterns.
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address PR review comments and code improvements
- Update imports to use @/features alias path for optimistic utils
- Fix optimistic upload item replacement by matching on source_id instead of id
- Clean up test suite naming and remove meta-terms from comments
- Only set Content-Type header on requests with body
- Add explicit TypeScript typing to useProjectFeatures hook
- Complete Phase 4 improvements with proper query typing
* fix: address additional PR review feedback
- Clear feature queries when deleting project to prevent cache memory leaks
- Update KnowledgeCard comments to follow documentation guidelines
- Add explanatory comment for accessibility pattern in KnowledgeCard
---------
Co-authored-by: Claude <noreply@anthropic.com>
Reorganize hook structure to follow vertical slice architecture:
- Move useSmartPolling, useThemeAware, useToast to features/shared/hooks
- Update 38+ import statements across codebase
- Update test file mocks to reference new locations
- Remove old ui/hooks directory
This change aligns shared utilities with the architectural pattern
where truly shared code resides in the shared directory.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <noreply@anthropic.com>
* Preparing migration folder for the migration alert implementation
* Migrations and version APIs initial
* Touching up update instructions in README and UI
* Unit tests for migrations and version APIs
* Splitting up the Ollama migration scripts
* Removing temporary PRPs
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>