Delete PRPs/openrouter-embeddings-support.md

This commit is contained in:
DIY Smart Code
2025-11-08 23:57:24 +01:00
committed by GitHub
parent 4d2851cb2f
commit 038b00d275

View File

@@ -1,812 +0,0 @@
name: "OpenRouter Embeddings Support PRP"
description: |
Add OpenRouter as an embedding provider option, enabling users to access multiple
embedding models (OpenAI, Google Gemini, Qwen3, Mistral) through a single API key.
---
## Goal
**Feature Goal**: Enable Archon users to use OpenRouter as an embedding provider, accessing multiple embedding models through OpenRouter's unified API while maintaining compatibility with existing embedding infrastructure.
**Deliverable**:
- OpenRouter provider option in Settings UI
- OpenRouter model discovery and selection
- Configuration support for OpenRouter API credentials
- Comprehensive tests for OpenRouter embedding functionality
**Success Definition**:
- Users can select "OpenRouter" as embedding provider in Settings
- Users can authenticate with OpenRouter API key
- Users can discover and select OpenRouter embedding models (OpenAI, Gemini, Qwen3, Mistral)
- Embeddings are generated successfully using OpenRouter API
- All existing embedding tests pass with OpenRouter provider
## User Persona
**Target User**: Archon users who want cost flexibility and access to multiple embedding providers through a single API
**Use Case**:
- Primary: User wants to use Qwen3 embeddings for cost savings ($0.01/1M tokens vs $0.13/1M for OpenAI)
- Secondary: User needs longer context windows (Gemini 20K, Qwen3 32K vs OpenAI 8K)
- Tertiary: User wants a unified API key for multiple embedding providers
**User Journey**:
1. User navigates to Settings → RAG Configuration
2. User selects "OpenRouter" as embedding provider
3. User enters OpenRouter API key (starts with `sk-or-v1-`)
4. User clicks "Discover Models" to fetch available embedding models
5. User selects desired model (e.g., `qwen/qwen3-embedding-8b` for cost savings)
6. User saves configuration
7. User tests embedding generation with sample text
8. System generates embeddings using OpenRouter API
**Pain Points Addressed**:
- Eliminates need for multiple API keys (OpenAI, Google, etc.)
- Provides access to cost-effective embedding models (Qwen3)
- Enables longer context processing (Gemini, Qwen3)
- Simplifies multi-provider strategy with single unified API
## Why
- **Cost Optimization**: Qwen3 models cost $0.01/1M tokens (87% cheaper than OpenAI text-embedding-3-large)
- **Context Length**: Gemini (20K) and Qwen3 (32K) support longer contexts than OpenAI (8K)
- **Unified API**: Single API key for accessing multiple embedding providers
- **Provider Diversity**: Reduces dependency on single provider
- **Existing Infrastructure**: Leverages OpenAI-compatible adapter pattern already in codebase
- **User Flexibility**: Gives users choice based on cost, performance, and context requirements
## What
Add OpenRouter as a first-class embedding provider option with full model discovery and configuration support.
### Success Criteria
- [x] Backend recognizes "openrouter" as valid embedding provider
- [x] Frontend displays "OpenRouter" in provider selection dropdown
- [x] OpenRouter API key validation (format: `sk-or-v1-*`)
- [x] Model discovery fetches OpenRouter embedding models with metadata
- [x] Model names include provider prefixes (e.g., `openai/text-embedding-3-large`)
- [x] Settings UI displays OpenRouter models with dimensions and pricing
- [x] Embedding generation works via OpenRouter API
- [x] Error handling includes OpenRouter-specific errors (rate limits, quota)
- [x] All existing embedding tests pass with OpenRouter provider
- [x] Documentation updated with OpenRouter setup instructions
## All Needed Context
### Context Completeness Check
**Validated**: This PRP provides complete context for implementation without prior codebase knowledge:
- Existing embedding provider patterns documented
- OpenRouter API compatibility assessed
- UI integration patterns identified
- Test patterns extracted
- Configuration patterns mapped
- All file paths and patterns specified
### Documentation & References
```yaml
# MUST READ - OpenRouter API Documentation
- url: https://openrouter.ai/docs/api-reference/embeddings/create-embeddings
why: Primary API reference for embeddings endpoint structure and parameters
critical: Model names MUST include provider prefix (openai/, google/, qwen/, mistralai/)
- url: https://openrouter.ai/docs/api-reference/authentication
why: API key format validation (sk-or-v1-*) and Authorization header pattern
critical: Uses Bearer token auth, not API key header
- url: https://openrouter.ai/docs/api-reference/errors
why: OpenRouter-specific error codes and rate limit structure
critical: Rate limits based on account balance (1 RPS per $1), 429 errors require different handling
- url: https://openrouter.ai/docs/api-reference/limits
why: Rate limiting and account tier information
critical: Free tier very restrictive (20 req/min, 50/day); recommend $10 minimum
# MUST READ - Existing Codebase Patterns
- file: python/src/server/services/embeddings/embedding_service.py
why: Core embedding service with adapter pattern and provider abstraction
pattern: _get_embedding_adapter() factory returns OpenAICompatibleEmbeddingAdapter for OpenRouter
gotcha: OpenRouter uses OpenAI-compatible API, no custom adapter needed
- file: python/src/server/services/llm_provider_service.py
why: Provider validation, client creation, and model selection patterns
pattern: get_llm_client() creates OpenAI-compatible client with custom base_url
gotcha: Must add "openrouter" to _is_valid_provider() list and provider_defaults dict
- file: python/src/server/services/embeddings/provider_error_adapters.py
why: Error sanitization to remove API keys from error messages
pattern: Create OpenRouterErrorAdapter class with sk-or-v1- pattern redaction
gotcha: OpenRouter keys format: sk-or-v1-[long string]
- file: archon-ui-main/src/components/settings/RAGSettings.tsx
why: Provider selection UI and model persistence pattern
pattern: Add "openrouter" to ProviderKey type and EMBEDDING_CAPABLE_PROVIDERS array
gotcha: Per-provider model persistence in localStorage with getDefaultModels()
- file: python/tests/test_async_embedding_service.py
why: Comprehensive embedding service test patterns
pattern: AsyncContextManager for mocking, fixture-based mocking, multi-provider tests
gotcha: Must test with provider prefix in model names (openai/text-embedding-3-large)
# MUST READ - Configuration Patterns
- file: python/src/server/services/credential_service.py
why: Database-backed credential storage and retrieval
pattern: get_active_provider(service_type="embedding") for provider selection
gotcha: API keys stored encrypted in archon_settings table, not .env
- file: python/.env.example
why: Environment variable documentation reference
pattern: Add OPENROUTER_API_KEY with example and description
gotcha: Only for documentation; actual keys stored in database via Settings UI
```
### Current Codebase Tree (Relevant Sections)
```bash
python/
├── src/
│ └── server/
│ ├── services/
│ │ ├── embeddings/
│ │ │ ├── embedding_service.py # Main embedding logic
│ │ │ ├── embedding_exceptions.py # Custom exception hierarchy
│ │ │ └── provider_error_adapters.py # Error sanitization
│ │ ├── llm_provider_service.py # Provider abstraction
│ │ └── credential_service.py # Credential management
│ ├── config/
│ │ └── config.py # API key validation
│ └── api_routes/
│ └── settings_api.py # Settings CRUD API
└── tests/
├── test_async_embedding_service.py # Embedding service tests
└── test_async_llm_provider_service.py # Provider service tests
archon-ui-main/
└── src/
├── components/
│ └── settings/
│ ├── RAGSettings.tsx # Main RAG config UI
│ └── OllamaModelSelectionModal.tsx # Model selection pattern
└── services/
└── credentialsService.ts # Frontend credential API
```
### Desired Codebase Tree (Files to Add/Modify)
```bash
python/
├── src/
│ └── server/
│ ├── services/
│ │ ├── embeddings/
│ │ │ └── provider_error_adapters.py # MODIFY: Add OpenRouterErrorAdapter
│ │ ├── llm_provider_service.py # MODIFY: Add openrouter to valid providers
│ │ └── openrouter_discovery_service.py # CREATE: OpenRouter model discovery
│ ├── config/
│ │ └── config.py # MODIFY: Add validate_openrouter_api_key()
│ └── api_routes/
│ └── openrouter_api.py # CREATE: OpenRouter-specific endpoints
├── tests/
│ ├── test_openrouter_embeddings.py # CREATE: OpenRouter embedding tests
│ └── test_openrouter_discovery.py # CREATE: Model discovery tests
└── .env.example # MODIFY: Add OPENROUTER_API_KEY docs
archon-ui-main/
└── src/
├── components/
│ └── settings/
│ ├── RAGSettings.tsx # MODIFY: Add OpenRouter provider option
│ └── OpenRouterModelModal.tsx # CREATE: Model selection UI
└── services/
└── openrouterService.ts # CREATE: API client for model discovery
```
### Known Gotchas & Library Quirks
```python
# CRITICAL: OpenRouter API Compatibility
# OpenRouter uses OpenAI-compatible API BUT with provider prefixes in model names
# ✅ CORRECT: "openai/text-embedding-3-large"
# ❌ WRONG: "text-embedding-3-large"
# CRITICAL: Rate Limiting Structure
# OpenRouter rate limits scale with account balance (1 RPS per $1)
# Free tier: 20 req/min, 50-1000 requests/day based on balance
# Must handle 429 errors differently than OpenAI (no tier-based retries)
# CRITICAL: API Key Format
# OpenRouter keys: sk-or-v1-[alphanumeric string]
# OpenAI keys: sk-[alphanumeric string]
# Must validate format to prevent user confusion
# CRITICAL: Base URL
# OpenRouter: https://openrouter.ai/api/v1
# Must NOT use /embeddings suffix (full path: /api/v1/embeddings)
# GOTCHA: Archon's Adapter Pattern
# OpenRouter requires NO custom adapter - uses OpenAICompatibleEmbeddingAdapter
# Only needs base_url change and model name prefix handling
# GOTCHA: Credential Storage
# API keys stored in database via credential_service, not .env
# .env.example only for documentation purposes
# GOTCHA: Model Discovery
# Unlike Ollama, OpenRouter doesn't have /models endpoint
# Must maintain hardcoded list or fetch from OpenRouter website
# Alternative: Use OpenAI SDK model discovery with OpenRouter base URL
# GOTCHA: Dimension Support
# OpenRouter models support dimension reduction via API parameter
# OpenAI models: 1536 (reducible), 3072 (reducible)
# Gemini: 768, 1536, 3072 (reducible)
# Qwen3: Dimensions not documented, test required
# GOTCHA: Error Message Sanitization
# Must add sk-or-v1- pattern to provider_error_adapters.py
# Prevents API keys from appearing in logs and error messages
```
## Implementation Blueprint
### Data Models and Structure
```python
# python/src/server/models/openrouter_models.py
from pydantic import BaseModel, Field, field_validator
from typing import Literal
class OpenRouterEmbeddingModel(BaseModel):
"""OpenRouter embedding model metadata."""
id: str = Field(..., description="Full model ID with provider prefix (e.g., openai/text-embedding-3-large)")
provider: Literal["openai", "google", "qwen", "mistralai"] = Field(..., description="Provider name")
name: str = Field(..., description="Display name without prefix")
dimensions: int = Field(..., description="Embedding dimensions")
context_length: int = Field(..., description="Maximum context window in tokens")
pricing_per_1m_tokens: float = Field(..., description="Cost per 1M tokens")
supports_dimension_reduction: bool = Field(default=False, description="Whether model supports dimension parameter")
@field_validator("id")
@classmethod
def validate_model_id_has_prefix(cls, v: str) -> str:
"""Ensure model ID includes provider prefix."""
if "/" not in v:
raise ValueError("OpenRouter model IDs must include provider prefix (e.g., openai/model-name)")
return v
class OpenRouterModelListResponse(BaseModel):
"""Response from OpenRouter model discovery."""
embedding_models: list[OpenRouterEmbeddingModel] = Field(default_factory=list)
total_count: int = Field(..., description="Total number of embedding models")
# No database schema changes needed - uses existing archon_settings table
```
### Implementation Tasks (Ordered by Dependencies)
```yaml
Task 1: MODIFY python/src/server/config/config.py
- IMPLEMENT: validate_openrouter_api_key(api_key: str) -> bool
- PATTERN: Follow validate_openai_api_key() structure
- VALIDATION: Must start with "sk-or-v1-", non-empty, string type
- RAISES: ConfigurationError with descriptive message
- PLACEMENT: After other API key validation functions
- GOTCHA: Different prefix than OpenAI (sk-or-v1- vs sk-)
Task 2: MODIFY python/src/server/services/embeddings/provider_error_adapters.py
- IMPLEMENT: OpenRouterErrorAdapter class inheriting from ProviderErrorAdapter
- PATTERN: Copy OpenAIErrorAdapter structure, change regex patterns
- REGEX: r'sk-or-v1-[a-zA-Z0-9_-]{10,}' → '[REDACTED_KEY]'
- ADD: OpenRouterErrorAdapter to ProviderErrorFactory.get_adapter()
- PLACEMENT: After AnthropicErrorAdapter class
- GOTCHA: Must redact sk-or-v1- prefix format
Task 3: MODIFY python/src/server/services/llm_provider_service.py
- ADD: "openrouter" to _is_valid_provider() validation list
- ADD: "openrouter" to provider_defaults dict in get_embedding_model()
- DEFAULT_MODEL: "openai/text-embedding-3-small" (most cost-effective OpenAI model)
- BASE_URL: "https://openrouter.ai/api/v1"
- PATTERN: Follow existing openai, google, ollama provider pattern
- DEPENDENCIES: None (pure configuration)
- GOTCHA: Model name MUST include provider prefix
Task 4: CREATE python/src/server/services/openrouter_discovery_service.py
- IMPLEMENT: OpenRouterDiscoveryService class with discover_embedding_models() method
- RETURN: List of OpenRouterEmbeddingModel with hardcoded model metadata
- MODELS: Include OpenAI (3-small, 3-large), Gemini (001), Qwen3 (0.6b, 4b, 8b), Mistral
- PATTERN: Follow python/src/server/services/provider_discovery_service.py structure
- FUTURE: Could fetch from OpenRouter API models endpoint if available
- PLACEMENT: New file in src/server/services/
- DEPENDENCIES: Import OpenRouterEmbeddingModel from Task 1
Task 5: CREATE python/src/server/api_routes/openrouter_api.py
- IMPLEMENT: GET /api/openrouter/models endpoint returning OpenRouterModelListResponse
- PATTERN: Follow python/src/server/api_routes/ollama_api.py structure
- AUTHENTICATION: No auth required (public model list)
- DEPENDENCIES: Import OpenRouterDiscoveryService from Task 4
- PLACEMENT: New file in src/server/api_routes/
- REGISTER: Add router to main.py with prefix "/api/openrouter"
Task 6: MODIFY python/src/server/main.py
- IMPORT: from src.server.api_routes.openrouter_api import router as openrouter_router
- REGISTER: app.include_router(openrouter_router)
- PLACEMENT: After other API router registrations
- PRESERVE: Existing router registrations and startup logic
Task 7: CREATE archon-ui-main/src/services/openrouterService.ts
- IMPLEMENT: discoverModels() -> Promise<OpenRouterModelListResponse>
- PATTERN: Follow archon-ui-main/src/services/ollamaService.ts structure
- ENDPOINT: GET /api/openrouter/models
- CACHING: sessionStorage with 5-minute TTL
- ERROR_HANDLING: Try/catch with user-friendly error messages
- PLACEMENT: New file in src/services/
Task 8: MODIFY archon-ui-main/src/components/settings/RAGSettings.tsx
- ADD: "openrouter" to ProviderKey type
- ADD: "openrouter" to EMBEDDING_CAPABLE_PROVIDERS array
- ADD: "OpenRouter" to providerDisplayNames mapping
- ADD: OpenRouter color style (suggested: border-purple-500 bg-purple-500/10)
- ADD: Default models in getDefaultModels(): chatModel "gpt-4.1-nano", embeddingModel "openai/text-embedding-3-small"
- PATTERN: Follow existing provider additions (anthropic, grok)
- GOTCHA: Embedding model includes provider prefix
Task 9: CREATE archon-ui-main/src/components/settings/OpenRouterModelModal.tsx
- IMPLEMENT: Full-screen modal for OpenRouter model selection
- PATTERN: Copy OllamaModelSelectionModal.tsx structure, simplify (no multi-instance)
- FEATURES: Search, filter by provider, sort by cost/dimensions/context
- DISPLAY: Model cards with provider badge, dimensions, context length, pricing
- STATE: selectedModel, searchTerm, providerFilter, sortBy
- INTEGRATION: Call openrouterService.discoverModels() on mount
- PLACEMENT: New file in src/components/settings/
Task 10: CREATE python/tests/test_openrouter_embeddings.py
- IMPLEMENT: Unit tests for OpenRouter embedding functionality
- PATTERN: Copy python/tests/test_async_embedding_service.py structure
- TESTS:
- test_create_embedding_with_openrouter_provider()
- test_openrouter_model_name_includes_prefix()
- test_openrouter_api_key_validation()
- test_openrouter_error_sanitization()
- test_openrouter_rate_limit_handling()
- MOCKS: Mock OpenAI client with base_url="https://openrouter.ai/api/v1"
- FIXTURES: openrouter_provider_config fixture
- ASSERTIONS: Verify model names have provider prefix, API key format, error handling
- PLACEMENT: New file in tests/
Task 11: CREATE python/tests/test_openrouter_discovery.py
- IMPLEMENT: Unit tests for OpenRouter model discovery
- PATTERN: Follow python/tests/test_async_llm_provider_service.py pattern
- TESTS:
- test_discover_embedding_models_returns_valid_list()
- test_all_models_have_provider_prefix()
- test_dimensions_are_positive_integers()
- test_pricing_is_non_negative()
- PLACEMENT: New file in tests/
Task 12: MODIFY python/.env.example
- ADD: OPENROUTER_API_KEY documentation
- FORMAT: "# OpenRouter API key (format: sk-or-v1-...)\nOPENROUTER_API_KEY=your_api_key_here"
- NOTE: Add comment that keys are stored in database via Settings UI
- PLACEMENT: After other API key examples (OPENAI_API_KEY, GOOGLE_API_KEY)
```
### Implementation Patterns & Key Details
```python
# PATTERN: OpenRouter API Key Validation
def validate_openrouter_api_key(api_key: str) -> bool:
"""Validate OpenRouter API key format."""
if not api_key:
raise ConfigurationError("OpenRouter API key cannot be empty")
if not api_key.startswith("sk-or-v1-"):
raise ConfigurationError(
"OpenRouter API key must start with 'sk-or-v1-'. "
"Get your key at https://openrouter.ai/keys"
)
return True
# PATTERN: Error Sanitization for OpenRouter
class OpenRouterErrorAdapter(ProviderErrorAdapter):
"""Sanitize OpenRouter error messages to remove API keys."""
def sanitize_error_message(self, message: str) -> str:
"""Remove OpenRouter API keys from error messages."""
# Pattern: sk-or-v1-[alphanumeric and special chars]
pattern = r'sk-or-v1-[a-zA-Z0-9_-]{10,}'
return re.sub(pattern, '[REDACTED_KEY]', message)
# PATTERN: Provider Configuration in llm_provider_service.py
async def get_embedding_model(provider: str | None = None) -> str:
"""Get embedding model for provider."""
provider_defaults = {
"openai": "text-embedding-3-small",
"ollama": "nomic-embed-text",
"google": "text-embedding-004",
"openrouter": "openai/text-embedding-3-small", # ADD THIS
}
# ... rest of function
def _is_valid_provider(provider: str) -> bool:
"""Validate provider name."""
valid_providers = {
"openai", "ollama", "google", "openrouter", # ADD openrouter
"anthropic", "grok"
}
return provider.lower() in valid_providers
# PATTERN: Model Discovery Service
class OpenRouterDiscoveryService:
"""Discover and manage OpenRouter embedding models."""
async def discover_embedding_models(self) -> list[OpenRouterEmbeddingModel]:
"""
Get available OpenRouter embedding models.
NOTE: Hardcoded for now. Future: fetch from API if available.
"""
return [
OpenRouterEmbeddingModel(
id="openai/text-embedding-3-small",
provider="openai",
name="text-embedding-3-small",
dimensions=1536,
context_length=8191,
pricing_per_1m_tokens=0.02,
supports_dimension_reduction=True,
),
OpenRouterEmbeddingModel(
id="openai/text-embedding-3-large",
provider="openai",
name="text-embedding-3-large",
dimensions=3072,
context_length=8191,
pricing_per_1m_tokens=0.13,
supports_dimension_reduction=True,
),
OpenRouterEmbeddingModel(
id="google/gemini-embedding-001",
provider="google",
name="gemini-embedding-001",
dimensions=3072,
context_length=20000,
pricing_per_1m_tokens=0.15,
supports_dimension_reduction=True,
),
OpenRouterEmbeddingModel(
id="qwen/qwen3-embedding-8b",
provider="qwen",
name="qwen3-embedding-8b",
dimensions=1024, # TODO: Verify actual dimensions
context_length=32768,
pricing_per_1m_tokens=0.01,
supports_dimension_reduction=False,
),
# Add more models as needed
]
# PATTERN: API Endpoint
@router.get("/models", response_model=OpenRouterModelListResponse)
async def get_openrouter_models() -> OpenRouterModelListResponse:
"""Get available OpenRouter embedding models."""
service = OpenRouterDiscoveryService()
models = await service.discover_embedding_models()
return OpenRouterModelListResponse(
embedding_models=models,
total_count=len(models),
)
# PATTERN: Frontend Model Selection (TypeScript)
interface OpenRouterModelCardProps {
model: OpenRouterEmbeddingModel
isSelected: boolean
onSelect: () => void
}
const OpenRouterModelCard: React.FC<OpenRouterModelCardProps> = ({
model,
isSelected,
onSelect,
}) => {
return (
<div
onClick={onSelect}
className={/* selection styles */}
>
{/* Provider badge */}
<Badge>{model.provider.toUpperCase()}</Badge>
{/* Dimensions badge with color coding */}
<Badge variant={model.dimensions >= 3000 ? "success" : "default"}>
{model.dimensions}D
</Badge>
{/* Model name and context */}
<h3>{model.name}</h3>
<p>Context: {model.context_length.toLocaleString()} tokens</p>
{/* Pricing */}
<p className="text-green-400">
${model.pricing_per_1m_tokens.toFixed(3)}/1M tokens
</p>
</div>
)
}
# GOTCHA: Model Name Must Include Prefix
# When sending to OpenRouter API:
embedding_model = "openai/text-embedding-3-large" # ✅ CORRECT
# NOT:
embedding_model = "text-embedding-3-large" # ❌ WRONG - will fail with 404
# GOTCHA: Base URL Must Be Complete
client = openai.AsyncOpenAI(
base_url="https://openrouter.ai/api/v1", # ✅ CORRECT
api_key=openrouter_api_key,
)
# NOT:
base_url = "https://openrouter.ai" # ❌ WRONG - missing /api/v1
```
### Integration Points
```yaml
DATABASE:
- table: archon_settings
- action: No schema changes needed
- credentials: OPENROUTER_API_KEY stored via credential_service
- settings: EMBEDDING_PROVIDER can be set to "openrouter"
CONFIG:
- file: python/.env.example
- add: OPENROUTER_API_KEY with documentation comment
- note: Actual keys stored in database, .env for docs only
ROUTES:
- file: python/src/server/main.py
- add: app.include_router(openrouter_router, prefix="/api/openrouter")
- placement: After other API router registrations
UI_COMPONENTS:
- file: archon-ui-main/src/components/settings/RAGSettings.tsx
- modify: Add "openrouter" to provider types and defaults
- add: OpenRouter color styling and display name
API_SERVICES:
- file: archon-ui-main/src/services/openrouterService.ts
- create: New service for model discovery API calls
- pattern: Follow ollamaService.ts structure
VALIDATION:
- file: python/src/server/config/config.py
- add: validate_openrouter_api_key() function
- pattern: Follow validate_openai_api_key() structure
ERROR_HANDLING:
- file: python/src/server/services/embeddings/provider_error_adapters.py
- add: OpenRouterErrorAdapter class
- register: In ProviderErrorFactory.get_adapter()
```
## Validation Loop
### Level 1: Syntax & Style (Immediate Feedback)
```bash
# Backend validation after each Python file
cd python
uv run ruff check src/server/services/openrouter_discovery_service.py --fix
uv run mypy src/server/services/openrouter_discovery_service.py
uv run ruff format src/server/services/openrouter_discovery_service.py
# Full backend validation
uv run ruff check src/ --fix
uv run mypy src/
uv run ruff format src/
# Frontend validation after each TypeScript file
cd archon-ui-main
npm run biome:fix src/services/openrouterService.ts
npx tsc --noEmit src/services/openrouterService.ts
# Full frontend validation
npm run biome:fix
npm run lint
npx tsc --noEmit
# Expected: Zero errors before proceeding
```
### Level 2: Unit Tests (Component Validation)
```bash
# Test OpenRouter embedding functionality
cd python
uv run pytest tests/test_openrouter_embeddings.py -v
uv run pytest tests/test_openrouter_discovery.py -v
# Test existing embedding service still works
uv run pytest tests/test_async_embedding_service.py -v
# Test provider service recognizes OpenRouter
uv run pytest tests/test_async_llm_provider_service.py::test_get_llm_client_openrouter -v
# Full test suite
uv run pytest tests/ -v
# Coverage check
uv run pytest tests/ --cov=src --cov-report=term-missing
# Expected: All tests pass, >80% coverage for new code
```
### Level 3: Integration Testing (System Validation)
```bash
# Start services
cd python
docker compose up -d # or: uv run python -m src.server.main
# Wait for startup
sleep 5
# Test health endpoint
curl -f http://localhost:8181/health || echo "Service health check failed"
# Test OpenRouter model discovery endpoint
curl http://localhost:8181/api/openrouter/models | jq .
# Expected: JSON with embedding_models array containing OpenRouter models
# Test frontend (separate terminal)
cd archon-ui-main
npm run dev
# Manual UI Testing Checklist:
# 1. Navigate to Settings → RAG Configuration
# 2. Verify "OpenRouter" appears in provider dropdown
# 3. Select OpenRouter as embedding provider
# 4. Enter test API key (or real key if available)
# 5. Click "Discover Models" or "Select Model"
# 6. Verify models appear with provider prefixes (openai/, google/, qwen/)
# 7. Select a model and save configuration
# 8. Test embedding generation (upload document or search)
# 9. Check browser console for errors
# 10. Verify embeddings generated successfully
# Test with real API (if key available)
curl -X POST http://localhost:8181/api/knowledge/upload \
-H "Content-Type: multipart/form-data" \
-F "files=@test.txt" \
-F "provider=openrouter"
# Expected: Successful upload with embeddings generated
# Check Docker logs for errors
docker compose logs archon-server -f
# Expected: No errors, successful embedding generation logs
```
### Level 4: Creative & Domain-Specific Validation
```bash
# Cost Comparison Test
# Compare embedding costs between providers using same test document
# Expected: Qwen3 models show lowest cost per embedding
# Context Length Test
# Test with long documents (>8K tokens)
# Expected: Gemini and Qwen3 handle longer contexts than OpenAI
# Model Prefix Validation
# Ensure all model names in UI and API include provider prefixes
grep -r "text-embedding-3" archon-ui-main/src --include="*.tsx" --include="*.ts"
# Expected: All instances use "openai/text-embedding-3-*" format
# API Key Sanitization Test
# Trigger an error with invalid API key
# Check logs to ensure key is not visible
# Expected: All sk-or-v1-* patterns replaced with [REDACTED_KEY]
# Rate Limit Handling Test (if real API key available)
# Send rapid requests to trigger rate limiting
# Expected: Graceful handling of 429 errors, no crashes
# Multi-Provider Switching Test
# Switch between OpenRouter, OpenAI, Google in Settings
# Test embeddings work with each provider
# Expected: Seamless provider switching, embeddings generated correctly
# Browser DevTools Network Tab
# Verify API calls to /api/openrouter/models
# Verify embedding API calls include correct base URL
# Expected: Correct endpoints, proper Authorization headers
# Session Storage Validation
# Check sessionStorage for model cache
# Verify 5-minute TTL behavior
# Expected: Models cached correctly, refresh after TTL
```
## Final Validation Checklist
### Technical Validation
- [ ] All 4 validation levels completed successfully
- [ ] All tests pass: `uv run pytest tests/ -v`
- [ ] No linting errors: `uv run ruff check src/`
- [ ] No type errors: `uv run mypy src/`
- [ ] No formatting issues: `uv run ruff format src/ --check`
- [ ] Frontend builds without errors: `npm run build`
- [ ] No TypeScript errors: `npx tsc --noEmit`
### Feature Validation
- [ ] OpenRouter appears in provider selection dropdown
- [ ] OpenRouter API key can be saved in Settings UI
- [ ] Model discovery returns OpenRouter embedding models
- [ ] All model names include provider prefixes (openai/, google/, qwen/, mistralai/)
- [ ] Model selection modal displays OpenRouter models with metadata
- [ ] Embeddings successfully generated using OpenRouter API
- [ ] Error messages do not expose API keys
- [ ] Rate limiting handled gracefully (if testable)
- [ ] Configuration persists across browser sessions
- [ ] Switching between providers works correctly
### Code Quality Validation
- [ ] Follows existing embedding provider patterns
- [ ] No custom adapter created (uses OpenAICompatibleEmbeddingAdapter)
- [ ] File placement matches desired codebase tree structure
- [ ] Naming conventions consistent (openrouter lowercase, OpenRouter display name)
- [ ] Error handling includes OpenRouter-specific errors
- [ ] API key validation prevents invalid formats
- [ ] Test coverage >80% for new code
### Documentation & User Experience
- [ ] .env.example includes OPENROUTER_API_KEY with helpful comment
- [ ] Model selection UI clearly shows provider, dimensions, pricing
- [ ] Error messages are user-friendly and actionable
- [ ] Settings UI tooltips explain OpenRouter benefits (cost, context)
- [ ] No breaking changes to existing embedding functionality
---
## Anti-Patterns to Avoid
- ❌ Don't create custom EmbeddingAdapter for OpenRouter (use OpenAI-compatible)
- ❌ Don't omit provider prefix from model names (breaks OpenRouter API)
- ❌ Don't validate OpenAI keys for OpenRouter (different format: sk-or-v1-)
- ❌ Don't hardcode OpenRouter API URL without /api/v1 path
- ❌ Don't skip error message sanitization (prevents key leakage)
- ❌ Don't assume OpenRouter models have same dimensions as OpenAI
- ❌ Don't ignore rate limit differences (balance-based, not tier-based)
- ❌ Don't forget to test with model name prefixes in all tests
- ❌ Don't break existing provider functionality when adding OpenRouter
- ❌ Don't skip session storage caching for model discovery (reduces API calls)
---
## PRP Confidence Score
**Score**: 9/10
**Rationale**:
- ✅ OpenRouter API is fully OpenAI-compatible (minimal integration work)
- ✅ Existing adapter pattern supports OpenRouter without custom code
- ✅ Comprehensive research completed on API, UI patterns, testing
- ✅ Clear implementation path with specific file modifications
- ✅ All patterns extracted from existing codebase
- ⚠️ Minor uncertainty: Qwen3 embedding dimensions not documented (needs testing)
- ⚠️ Rate limit handling may need iteration based on real API behavior
**Implementation Time Estimate**: 4-6 hours for experienced developer
**Risk Level**: Low (leverages existing infrastructure, no breaking changes)