feat: Implement comprehensive OpenAI error handling for Issue #362

Replace silent failures with clear, actionable error messages to eliminate
90-minute debugging sessions when the OpenAI API quota is exhausted.

## Backend Enhancements
- Add error sanitization preventing sensitive data exposure (API keys, URLs, tokens)
- Add upfront API key validation before expensive operations (crawl, upload, refresh)
- Implement fail-fast pattern in RAG service (no more empty results for API failures)
- Add specific error handling for quota, rate limit, auth, and API errors
- Add EmbeddingAuthenticationError exception with masked key prefix support

## Frontend Enhancements
- Create enhanced error utilities with OpenAI-specific parsing
- Build TanStack Query compatible API wrapper preserving ETag caching
- Update knowledge service to use enhanced error handling
- Enhance TanStack Query hooks with user-friendly error messages

## Security Features
- Comprehensive regex sanitization (8 patterns) with ReDoS protection
- Input validation and circular reference detection
- Generic fallback messages for sensitive keywords
- Bounded quantifiers to prevent regex DoS attacks
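To make the bounded-quantifier idea concrete, here is a minimal sketch (the function name is illustrative; the patterns mirror the Python sanitizer added in this commit):

```typescript
// Sketch of the bounded-quantifier approach: quantifiers such as {1,200}
// cap how much input any single pattern can consume, which is the ReDoS
// mitigation this commit applies to its sanitization regexes.
const SANITIZE_PATTERNS: Array<[RegExp, string]> = [
  [/https?:\/\/[^\s]{1,200}/gi, "[REDACTED_URL]"],
  [/sk-[a-zA-Z0-9]{48}/g, "[REDACTED_KEY]"],
  [/Bearer\s+[^\s]{1,200}/gi, "Bearer [REDACTED_AUTH_TOKEN]"],
];

export function sanitizeErrorMessage(message: string): string {
  let sanitized = message;
  for (const [pattern, replacement] of SANITIZE_PATTERNS) {
    sanitized = sanitized.replace(pattern, replacement);
  }
  // Fall back to a generic message if sensitive keywords survive.
  if (["internal", "server", "token"].some((w) => sanitized.toLowerCase().includes(w))) {
    return "OpenAI API encountered an error. Please verify your API key and quota.";
  }
  return sanitized;
}
```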

## User Experience
- Clear error messages: "OpenAI API quota exhausted"
- Actionable guidance: "Check your OpenAI billing dashboard and add credits"
- Immediate error visibility (no more silent failures)
- Appropriate error severity styling

## Architecture Compatibility
- Full TanStack Query integration maintained
- ETag caching and optimistic updates preserved
- No performance regression (all existing tests pass)
- Compatible with existing knowledge base architecture

Resolves #362: Users no longer experience mysterious empty RAG results
that require extensive debugging to identify OpenAI quota issues.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
leex279
2025-09-12 19:22:36 +02:00
parent 94aed6b9fa
commit 98b798173e
26 changed files with 1375 additions and 143 deletions

View File

@@ -0,0 +1,275 @@
# TASK PRP: Rebase bugfix-issue-362 onto main
## Task Overview
Rebase the bugfix-issue-362 branch (33 commits of OpenAI error handling improvements) onto the main branch (which now includes TanStack Query Migration Phase 3), preserving all error handling functionality while adapting it to the new architecture.
## Analysis Process
### Scope Definition
- **Affected Files**:
- `archon-ui-main/src/pages/KnowledgeBasePage.tsx` (conflict)
- `archon-ui-main/src/services/knowledgeBaseService.ts` (deleted in main)
- `python/src/server/services/search/rag_service.py` (conflict)
- Multiple test files with error handling updates
- **Dependencies**: TanStack Query hooks, frontend error handling, backend RAG service
- **Side Effects**: Error sanitization patterns, API authentication flows
- **Test Coverage**: Frontend error display, backend error handling, integration tests
### Pattern Research
```yaml
context:
docs:
- url: "PR #605 - TanStack Query Migration Phase 3"
focus: "Knowledge base architecture changes"
- url: "archon-ui-main/src/features/knowledge/hooks/"
focus: "New TanStack Query patterns"
patterns:
- file: "archon-ui-main/src/features/knowledge/components/KnowledgeBase.tsx"
copy: "TanStack Query error handling patterns"
- file: "archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts"
copy: "Service replacement patterns"
gotchas:
- issue: "knowledgeBaseService.ts deleted"
fix: "Functionality moved to TanStack Query hooks"
- issue: "Error handling architecture changed"
fix: "Use TanStack Query error states"
- issue: "Import paths changed"
fix: "Update to features/ directory structure"
```
## Task Structure
### Phase 1: Preparation and Analysis
**ACTION backup-branches:**
- CREATE: backup branch `git checkout -b bugfix-issue-362-backup`
- CREATE: analysis branch `git checkout -b rebase-analysis`
- VALIDATE: `git branch -a | grep backup`
- IF_FAIL: "Branches not created properly"
- ROLLBACK: "Continue without backup (higher risk)"
**ACTION analyze-main-changes:**
- READ: `git log origin/main --oneline -10`
- READ: `archon-ui-main/src/features/knowledge/` structure
- READ: TanStack Query migration patterns in main
- VALIDATE: Understanding of new architecture
- IF_FAIL: "Research TanStack Query documentation"
- ROLLBACK: "Proceed with basic understanding"
**ACTION identify-functionality-mapping:**
- COMPARE: deleted `knowledgeBaseService.ts` vs new hooks
- MAP: Error handling functions to TanStack Query equivalents
- DOCUMENT: Functions that need to be preserved
- VALIDATE: Complete functionality mapping
- IF_FAIL: "Manual code review needed"
- ROLLBACK: "Proceed with partial mapping"
### Phase 2: Interactive Rebase Setup
**ACTION start-rebase:**
- COMMAND: `git rebase -i origin/main`
- SELECT: Keep all commits (edit conflicting ones)
- VALIDATE: Rebase editor opens successfully
- IF_FAIL: "Check git configuration, try git rebase --abort"
- ROLLBACK: "Use merge strategy instead"
### Phase 3: Systematic Conflict Resolution
**ACTION resolve-rag-service-conflict:**
- FILE: `python/src/server/services/search/rag_service.py`
- OPERATION: Merge error handling improvements with main changes
- PRESERVE: OpenAI authentication error handling
- PRESERVE: Error message sanitization
- VALIDATE: `cd python && uv run pytest tests/test_rag_*.py -v`
- IF_FAIL: "Review specific test failures, adjust error handling"
- ROLLBACK: "Accept main version, re-implement error handling"
**ACTION resolve-knowledge-page-conflict:**
- FILE: `archon-ui-main/src/pages/KnowledgeBasePage.tsx`
- OPERATION: Adapt error handling to TanStack Query patterns
- PORT: Error display logic to new component structure
- UPDATE: Import statements to use features/ directory
- VALIDATE: `cd archon-ui-main && npm run test KnowledgeBasePage`
- IF_FAIL: "Check TanStack Query error handling patterns"
- ROLLBACK: "Use main version, re-add error handling manually"
**ACTION handle-deleted-service:**
- ANALYZE: Functions from `knowledgeBaseService.ts` (see the mapping sketch after this action)
- PORT: Error handling utilities to appropriate TanStack Query hooks
- UPDATE: All imports using the deleted service
- REMOVE: References to deleted service
- VALIDATE: `cd archon-ui-main && npm run build`
- IF_FAIL: "Find missing function mappings, update imports"
- ROLLBACK: "Recreate minimal service file"
### Phase 4: Architecture Adaptation
**ACTION adapt-error-handling:**
- UPDATE: Frontend error handling to use TanStack Query error states
- PRESERVE: Error message sanitization functions
- PRESERVE: OpenAI API key masking
- UPDATE: Error boundary components if needed
- VALIDATE: Manual test of error scenarios
- IF_FAIL: "Check error state propagation"
- ROLLBACK: "Keep old error handling patterns where possible"
**ACTION update-imports:**
- SCAN: All files for old import paths
- UPDATE: Import paths to new features/ structure (example after this action)
- REMOVE: Unused imports from deleted services
- ADD: New TanStack Query hook imports
- VALIDATE: `cd archon-ui-main && npm run lint`
- IF_FAIL: "Fix remaining import issues"
- ROLLBACK: "Manual import fixes"
### Phase 5: Comprehensive Validation
**ACTION run-frontend-tests:**
- COMMAND: `cd archon-ui-main && npm test`
- FOCUS: Error handling test cases
- FOCUS: Knowledge base functionality
- VALIDATE: All tests pass
- IF_FAIL: "Fix failing tests one by one"
- ROLLBACK: "Identify and skip non-critical failures"
**ACTION run-backend-tests:**
- COMMAND: `cd python && uv run pytest tests/ -v`
- FOCUS: RAG service tests
- FOCUS: Error handling tests
- VALIDATE: All tests pass
- IF_FAIL: "Fix backend test failures"
- ROLLBACK: "Document test failures for later"
**ACTION manual-error-testing:**
- TEST: Invalid OpenAI API key scenario
- TEST: Network timeout scenarios
- TEST: Invalid query scenarios
- VERIFY: Error messages are sanitized
- VERIFY: Error messages are user-friendly
- VALIDATE: All error scenarios handled properly
- IF_FAIL: "Fix specific error handling issues"
- ROLLBACK: "Document error handling gaps"
**ACTION performance-validation:**
- TEST: Knowledge base loading performance
- TEST: Error handling overhead
- COMPARE: Before and after performance
- VALIDATE: No significant performance regression
- IF_FAIL: "Identify and fix performance issues"
- ROLLBACK: "Accept minor performance trade-offs"
### Phase 6: Finalization
**ACTION complete-rebase:**
- COMMAND: `git rebase --continue` (for each remaining commit)
- RESOLVE: Any remaining conflicts systematically
- VALIDATE: `git log --oneline` shows clean history
- IF_FAIL: "Continue resolving conflicts"
- ROLLBACK: "git rebase --abort and retry"
**ACTION final-validation:**
- RUN: Full test suite
- RUN: Manual smoke test of key functionality
- CHECK: All error handling improvements preserved
- VALIDATE: Branch ready for PR
- IF_FAIL: "Address remaining issues"
- ROLLBACK: "Reset to backup branch"
## User Interaction Points
### 1. Initial Confirmation
**Questions:**
- Confirm understanding of TanStack Query migration impact
- Verify that all 33 commits of error handling should be preserved
- Approve the systematic conflict resolution approach
### 2. Architecture Review
**Questions:**
- Review the mapping of knowledgeBaseService.ts to TanStack Query hooks
- Confirm error handling patterns for the new architecture
- Approve the adaptation strategy
### 3. Conflict Resolution Review
**Questions:**
- Review specific conflict resolutions as they occur
- Confirm that error handling functionality is preserved
- Approve any architectural adaptations
## Critical Elements
### Debug Patterns
- **Frontend Issues**: Check browser console for TanStack Query errors (a logging sketch follows below)
- **Backend Issues**: Check logs for RAG service error handling
- **Integration Issues**: Test full error flow from backend to frontend
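For the frontend checks, a global QueryCache handler makes every query failure visible in the console with its query key; a sketch (where Archon constructs its QueryClient is an assumption here):

```typescript
// Debugging sketch: log every TanStack Query failure globally so errors
// surface in the browser console alongside the query key that produced them.
import { QueryCache, QueryClient } from "@tanstack/react-query";

const queryClient = new QueryClient({
  queryCache: new QueryCache({
    onError: (error, query) => {
      console.error("Query failed", { queryKey: query.queryKey, error });
    },
  }),
});
```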
### Performance Considerations
- TanStack Query caching may affect error handling timing
- Error boundary performance with new architecture
- Backend error processing overhead
### Security Concerns
- Ensure error message sanitization still works
- Verify OpenAI API key masking is preserved
- Check that no sensitive data leaks in new error flows
### Assumptions
- TanStack Query migration is complete and stable in main
- All required TanStack Query hooks are implemented
- Test suite covers error handling scenarios adequately
## Risk Assessment
### High Risk Areas
- **Frontend Architecture Mismatch**: TanStack Query vs old service patterns
- **Lost Functionality**: Functions from deleted knowledgeBaseService.ts
- **Error Handling Gaps**: New architecture may miss some error cases
### Medium Risk Areas
- **Import Path Updates**: Many files may have outdated imports
- **Test Compatibility**: Tests may need updates for new architecture
- **Performance Impact**: New error handling patterns may affect performance
### Low Risk Areas
- **Backend RAG Service**: Likely straightforward merge conflict
- **Basic Functionality**: Core features should work with proper adaptation
### Rollback Strategy
1. **Immediate Abort**: `git rebase --abort` returns to original state
2. **Partial Rollback**: Reset to backup branch and cherry-pick successful commits
3. **Alternative Approach**: Use merge strategy instead of rebase if conflicts too complex
## Success Criteria
- [ ] All 33 commits successfully rebased onto main
- [ ] All error handling functionality preserved
- [ ] Frontend works with TanStack Query architecture
- [ ] Backend RAG service conflicts resolved
- [ ] All tests passing
- [ ] Error scenarios work as expected
- [ ] No performance regression
- [ ] Security features (sanitization) intact
## Output Files
- `bugfix-issue-362` branch rebased onto main
- Updated test results
- Documentation of any architectural changes made
- List of any functionality that couldn't be preserved (if any)
## Estimated Timeline
- **Phase 1-2**: 30 minutes (preparation and setup)
- **Phase 3**: 1-2 hours (conflict resolution)
- **Phase 4**: 1 hour (architecture adaptation)
- **Phase 5**: 1 hour (validation)
- **Phase 6**: 30 minutes (finalization)
- **Total**: 4-5 hours
## Quality Checklist
- [x] All changes identified (3 main conflicts + architecture adaptation)
- [x] Dependencies mapped (TanStack Query, error handling, RAG service)
- [x] Each task has validation (tests and manual checks)
- [x] Rollback steps included (backup branches, abort strategies)
- [x] Debug strategies provided (frontend, backend, integration)
- [x] Performance impact noted (TanStack Query caching, error processing)
- [x] Security checked (error sanitization, API key masking)
- [x] No missing edge cases (comprehensive error scenario testing)

View File

@@ -11,6 +11,7 @@ import { useActiveOperations } from "../progress/hooks";
 import { progressKeys } from "../progress/hooks/useProgressQueries";
 import type { ActiveOperation, ActiveOperationsResponse } from "../progress/types";
 import { knowledgeService } from "../services";
+import { getDisplayErrorMessage, type EnhancedError } from "../utils/errorHandler";
 import type {
   CrawlRequest,
   CrawlStartResponse,
@@ -273,7 +274,10 @@ export function useCrawlUrl() {
         queryClient.setQueryData(progressKeys.list(), context.previousOperations);
       }
-      const errorMessage = error instanceof Error ? error.message : "Failed to start crawl";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to start crawl");
       showToast(errorMessage, "error");
     },
   });
@@ -449,8 +453,10 @@ export function useUploadDocument() {
         queryClient.setQueryData(progressKeys.list(), context.previousOperations);
       }
-      // Display the actual error message from backend
-      const message = error instanceof Error ? error.message : "Failed to upload document";
+      // Use enhanced error handling for better user experience
+      const message = (error as EnhancedError)?.isOpenAIError
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to upload document");
       showToast(message, "error");
     },
   });
@@ -521,7 +527,10 @@ export function useDeleteKnowledgeItem() {
         queryClient.setQueryData(queryKey, data);
       }
-      const errorMessage = error instanceof Error ? error.message : "Failed to delete item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to delete item");
       showToast(errorMessage, "error");
     },
     onSuccess: (data) => {
@@ -568,7 +577,10 @@ export function useUpdateKnowledgeItem() {
         queryClient.setQueryData(knowledgeKeys.detail(variables.sourceId), context.previousItem);
       }
-      const errorMessage = error instanceof Error ? error.message : "Failed to update item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to update item");
       showToast(errorMessage, "error");
     },
     onSuccess: (_data, { sourceId }) => {
@@ -604,7 +616,10 @@ export function useRefreshKnowledgeItem() {
       return data;
     },
     onError: (error) => {
-      const errorMessage = error instanceof Error ? error.message : "Failed to refresh item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to refresh item");
       showToast(errorMessage, "error");
     },
   });

View File

@@ -0,0 +1,99 @@
/**
* Enhanced API client for knowledge base operations with OpenAI error handling
* Built on top of the ETag-aware API client with additional error parsing
*/
import { callAPIWithETag } from "../../projects/shared/apiWithEtag";
import { parseKnowledgeBaseError, type EnhancedError } from "../utils/errorHandler";
/**
* API call wrapper with enhanced OpenAI error handling
* Uses ETag caching for efficiency while adding specialized error parsing
*/
export async function callKnowledgeAPI<T>(
endpoint: string,
options: RequestInit = {}
): Promise<T> {
try {
// Use the ETag-aware API client for caching benefits
return await callAPIWithETag<T>(endpoint, options);
} catch (error: any) {
// Apply enhanced error parsing for OpenAI errors
const enhancedError = parseKnowledgeBaseError({
status: error.statusCode || error.status,
error: error.message || error.detail || error,
detail: error.detail
});
// Preserve the original error structure but enhance with our parsing
const finalError = error as EnhancedError;
finalError.isOpenAIError = enhancedError.isOpenAIError;
finalError.errorDetails = enhancedError.errorDetails;
finalError.message = enhancedError.message;
throw finalError;
}
}
/**
* Enhanced upload wrapper that handles FormData and file uploads with better error handling
*/
export async function uploadWithEnhancedErrors(
endpoint: string,
formData: FormData,
timeoutMs: number = 30000
): Promise<any> {
const API_BASE_URL = "/api"; // Use same base as other services
let fullUrl = `${API_BASE_URL}${endpoint}`;
// Handle test environment URLs
if (typeof process !== "undefined" && process.env?.NODE_ENV === "test") {
const testHost = process.env?.VITE_HOST || "localhost";
const testPort = process.env?.ARCHON_SERVER_PORT || "8181";
fullUrl = `http://${testHost}:${testPort}${fullUrl}`;
}
try {
const response = await fetch(fullUrl, {
method: "POST",
body: formData,
signal: AbortSignal.timeout(timeoutMs),
});
if (!response.ok) {
let errorData;
try {
errorData = await response.json();
} catch {
const text = await response.text();
errorData = { status: response.status, error: text };
}
// Apply enhanced error parsing
const enhancedError = parseKnowledgeBaseError({
status: response.status,
error: errorData.detail || errorData.error || errorData,
detail: errorData.detail
});
throw enhancedError;
}
return response.json();
} catch (error: any) {
// Check if it's a timeout error
if (error instanceof Error && error.name === 'AbortError') {
const timeoutError = parseKnowledgeBaseError(new Error('Request timed out'));
throw timeoutError;
}
// If it's already an enhanced error, re-throw it
if (error && typeof error === 'object' && 'isOpenAIError' in error) {
throw error;
}
// Parse other errors through the error handler for consistency
throw parseKnowledgeBaseError(error);
}
}

View File

@@ -3,7 +3,8 @@
  * Handles all knowledge-related API operations using TanStack Query patterns
  */
-import { callAPIWithETag, invalidateETagCache } from "../../projects/shared/apiWithEtag";
+import { invalidateETagCache } from "../../projects/shared/apiWithEtag";
+import { callKnowledgeAPI, uploadWithEnhancedErrors } from "./apiWithEnhancedErrors";
 import type {
   ChunksResponse,
   CodeExamplesResponse,
@@ -40,21 +41,21 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/summary${queryString ? `?${queryString}` : ""}`;
-    return callAPIWithETag<KnowledgeItemsResponse>(endpoint);
+    return callKnowledgeAPI<KnowledgeItemsResponse>(endpoint);
   },

   /**
    * Get a specific knowledge item
    */
   async getKnowledgeItem(sourceId: string): Promise<KnowledgeItem> {
-    return callAPIWithETag<KnowledgeItem>(`/api/knowledge-items/${sourceId}`);
+    return callKnowledgeAPI<KnowledgeItem>(`/api/knowledge-items/${sourceId}`);
   },

   /**
    * Delete a knowledge item
    */
   async deleteKnowledgeItem(sourceId: string): Promise<{ success: boolean; message: string }> {
-    const response = await callAPIWithETag<{ success: boolean; message: string }>(`/api/knowledge-items/${sourceId}`, {
+    const response = await callKnowledgeAPI<{ success: boolean; message: string }>(`/api/knowledge-items/${sourceId}`, {
       method: "DELETE",
     });
@@ -70,7 +71,7 @@ export const knowledgeService = {
    * Update a knowledge item
    */
   async updateKnowledgeItem(sourceId: string, updates: Partial<KnowledgeItem>): Promise<KnowledgeItem> {
-    const response = await callAPIWithETag<KnowledgeItem>(`/api/knowledge-items/${sourceId}`, {
+    const response = await callKnowledgeAPI<KnowledgeItem>(`/api/knowledge-items/${sourceId}`, {
       method: "PUT",
       body: JSON.stringify(updates),
     });
@@ -87,7 +88,7 @@ export const knowledgeService = {
    * Start crawling a URL
    */
   async crawlUrl(request: CrawlRequest): Promise<CrawlStartResponse> {
-    const response = await callAPIWithETag<CrawlStartResponse>("/api/knowledge-items/crawl", {
+    const response = await callKnowledgeAPI<CrawlStartResponse>("/api/knowledge-items/crawl", {
       method: "POST",
       body: JSON.stringify(request),
     });
@@ -103,7 +104,7 @@ export const knowledgeService = {
    * Refresh an existing knowledge item
    */
   async refreshKnowledgeItem(sourceId: string): Promise<RefreshResponse> {
-    const response = await callAPIWithETag<RefreshResponse>(`/api/knowledge-items/${sourceId}/refresh`, {
+    const response = await callKnowledgeAPI<RefreshResponse>(`/api/knowledge-items/${sourceId}/refresh`, {
       method: "POST",
     });
@@ -132,38 +133,21 @@ export const knowledgeService = {
       formData.append("tags", JSON.stringify(metadata.tags));
     }
-    // Use fetch directly for file upload (FormData doesn't work well with our ETag wrapper)
-    // In test environment, we need absolute URLs
-    let uploadUrl = "/api/documents/upload";
-    if (typeof process !== "undefined" && process.env?.NODE_ENV === "test") {
-      const testHost = process.env?.VITE_HOST || "localhost";
-      const testPort = process.env?.ARCHON_SERVER_PORT || "8181";
-      uploadUrl = `http://${testHost}:${testPort}${uploadUrl}`;
-    }
-    const response = await fetch(uploadUrl, {
-      method: "POST",
-      body: formData,
-      signal: AbortSignal.timeout(30000), // 30 second timeout for file uploads
-    });
-    if (!response.ok) {
-      const error = await response.json();
-      throw new Error(error.error || `HTTP ${response.status}`);
-    }
+    // Use enhanced upload wrapper with OpenAI error handling
+    const result = await uploadWithEnhancedErrors("/documents/upload", formData, 30000);

     // Invalidate list cache
     invalidateETagCache("/api/knowledge-items");
     invalidateETagCache("/api/knowledge-items/summary");
-    return response.json();
+    return result;
   },

   /**
    * Stop a running crawl
    */
   async stopCrawl(progressId: string): Promise<{ success: boolean; message: string }> {
-    return callAPIWithETag<{ success: boolean; message: string }>(`/api/knowledge-items/stop/${progressId}`, {
+    return callKnowledgeAPI<{ success: boolean; message: string }>(`/api/knowledge-items/stop/${progressId}`, {
       method: "POST",
     });
   },
@@ -193,7 +177,7 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/${sourceId}/chunks${queryString ? `?${queryString}` : ""}`;
-    return callAPIWithETag<ChunksResponse>(endpoint);
+    return callKnowledgeAPI<ChunksResponse>(endpoint);
   },
@@ -217,14 +201,14 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/${sourceId}/code-examples${queryString ? `?${queryString}` : ""}`;
-    return callAPIWithETag<CodeExamplesResponse>(endpoint);
+    return callKnowledgeAPI<CodeExamplesResponse>(endpoint);
   },

   /**
    * Search the knowledge base
    */
   async searchKnowledgeBase(options: SearchOptions): Promise<SearchResultsResponse> {
-    return callAPIWithETag("/api/knowledge-items/search", {
+    return callKnowledgeAPI("/api/knowledge-items/search", {
       method: "POST",
       body: JSON.stringify(options),
     });
@@ -234,6 +218,6 @@ export const knowledgeService = {
    * Get available knowledge sources
    */
   async getKnowledgeSources(): Promise<KnowledgeSource[]> {
-    return callAPIWithETag<KnowledgeSource[]>("/api/knowledge-items/sources");
+    return callKnowledgeAPI<KnowledgeSource[]>("/api/knowledge-items/sources");
   },
 };

View File

@@ -0,0 +1,208 @@
/**
* Error handling utilities for knowledge base operations
*
* Provides specialized error handling for OpenAI API errors,
* rate limiting, and quota exhaustion scenarios.
*
* Related to GitHub issue #362 - improves user experience
* by displaying clear error messages when OpenAI API fails.
*/
export interface OpenAIErrorDetails {
error: string;
message: string;
error_type: 'quota_exhausted' | 'rate_limit' | 'api_error' | 'authentication_failed';
tokens_used?: number;
retry_after?: number;
api_key_prefix?: string;
}
export interface EnhancedError extends Error {
statusCode?: number;
errorDetails?: OpenAIErrorDetails;
isOpenAIError?: boolean;
}
/**
* Create a fallback error for cases where input is invalid or unparseable
*/
function createFallbackError(reason: string): EnhancedError {
return Object.assign(new Error('Unknown error occurred'), {
errorDetails: {
error: 'unknown',
message: `${reason}. Please try again or contact support if the problem persists.`,
error_type: 'api_error' as const
}
}) as EnhancedError;
}
/**
* Check if an object can be safely serialized (no circular references)
*/
function isSafeObject(obj: any): boolean {
try {
JSON.stringify(obj);
return true;
} catch {
return false;
}
}
/**
* Parse and enhance API errors from knowledge base operations
*/
export function parseKnowledgeBaseError(error: any): EnhancedError {
// Enhanced input validation
if (!error) {
return createFallbackError('No error information provided');
}
if (typeof error === 'string') {
return Object.assign(new Error(error), {
errorDetails: {
error: 'api_error',
message: error,
error_type: 'api_error' as const
}
}) as EnhancedError;
}
if (typeof error !== 'object' || error === null) {
return createFallbackError('Invalid error format');
}
// Check for empty objects or objects with no useful properties
if (error.constructor === Object && Object.keys(error).length === 0) {
return createFallbackError('Empty error object received');
}
// Check for circular references and object safety
if (!isSafeObject(error)) {
return createFallbackError('Error object contains circular references');
}
// Handle Error instances that might have been serialized/deserialized
if (error instanceof Error || (error.name && error.message && error.stack)) {
// This is likely an Error object, proceed with parsing
} else if (!error.message && !error.error && !error.detail && !error.status) {
// Object doesn't have any recognizable error properties
return createFallbackError('Unrecognized error object structure');
}
const enhancedError: EnhancedError = new Error(error.message || 'Unknown error');
// Check if this is an HTTP response error with JSON details
if (error && typeof error === 'object') {
// Handle fetch Response errors
if (error.status || error.statusCode) {
enhancedError.statusCode = error.status || error.statusCode;
}
// Parse error details from API response
if (error.error || error.detail) {
const errorData = error.error || error.detail;
// Check if it's an OpenAI-specific error
if (typeof errorData === 'object' && errorData.error_type) {
enhancedError.isOpenAIError = true;
enhancedError.errorDetails = errorData as OpenAIErrorDetails;
// Override the message with the detailed error message
enhancedError.message = errorData.message || errorData.error || enhancedError.message;
}
}
}
return enhancedError;
}
/**
* Get user-friendly error message for display in UI
*/
export function getDisplayErrorMessage(error: EnhancedError): string {
if (error.isOpenAIError && error.errorDetails) {
switch (error.errorDetails.error_type) {
case 'quota_exhausted':
return `OpenAI API quota exhausted. Please add credits to your OpenAI account or check your billing settings.`;
case 'rate_limit':
return `OpenAI API rate limit exceeded. Please wait a moment and try again.`;
case 'authentication_failed':
return `Invalid or expired OpenAI API key. Please check your API key in settings.`;
case 'api_error':
return `OpenAI API error: ${error.errorDetails.message}. Please check your API key configuration.`;
default:
return error.errorDetails.message || error.message;
}
}
// Handle HTTP status codes
if (error.statusCode) {
switch (error.statusCode) {
case 401:
return 'Authentication failed. Please check your API key.';
case 429:
return 'API rate limit exceeded. Please wait a moment and try again.';
case 502:
return 'API service unavailable. Please try again in a few minutes.';
case 503:
return 'Service temporarily unavailable. Please try again later.';
default:
return error.message;
}
}
return error.message || 'An unexpected error occurred.';
}
/**
* Get error severity level for UI styling
*/
export function getErrorSeverity(error: EnhancedError): 'error' | 'warning' | 'info' {
if (error.isOpenAIError && error.errorDetails) {
switch (error.errorDetails.error_type) {
case 'quota_exhausted':
return 'error'; // Critical - user action required
case 'authentication_failed':
return 'error'; // Critical - configuration issue
case 'rate_limit':
return 'warning'; // Temporary - retry may work
case 'api_error':
return 'error'; // Likely configuration issue
default:
return 'error';
}
}
if (error.statusCode && error.statusCode >= 500) {
return 'error'; // Server errors
}
return 'warning'; // Default to warning for other errors
}
/**
* Get suggested action for the user based on error type
*/
export function getErrorAction(error: EnhancedError): string | null {
if (error.isOpenAIError && error.errorDetails) {
switch (error.errorDetails.error_type) {
case 'quota_exhausted':
return 'Check your OpenAI billing dashboard and add credits';
case 'authentication_failed':
return 'Verify your OpenAI API key in Settings';
case 'rate_limit':
const retryAfter = error.errorDetails.retry_after || 30;
return `Wait ${retryAfter} seconds and try again`;
case 'api_error':
return 'Verify your OpenAI API key in Settings';
default:
return null;
}
}
return null;
}
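Taken together, the utilities above compose like this (a usage sketch; `showToast` and the import path are placeholders, not part of this commit):

```typescript
// Usage sketch for the error utilities defined above.
import {
  getDisplayErrorMessage,
  getErrorAction,
  getErrorSeverity,
  parseKnowledgeBaseError,
} from "./errorHandler";

// Placeholder for whatever notification mechanism the caller uses.
declare function showToast(message: string, severity: "error" | "warning" | "info"): void;

try {
  // ... some knowledge base API call ...
} catch (raw) {
  const error = parseKnowledgeBaseError(raw);
  const action = getErrorAction(error);
  const message = action
    ? `${getDisplayErrorMessage(error)} (${action})`
    : getDisplayErrorMessage(error);
  showToast(message, getErrorSeverity(error));
}
```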

View File

@@ -216,7 +216,7 @@ class BaseAgent(ABC, Generic[DepsT, OutputT]):
                 self.logger.info(f"Agent {self.name} completed successfully")
                 # PydanticAI returns a RunResult with data attribute
                 return result.data
-            except asyncio.TimeoutError:
+            except TimeoutError:
                 self.logger.error(f"Agent {self.name} timed out after 120 seconds")
                 raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
             except Exception as e:

View File

@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url

View File

@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url

View File

@@ -9,8 +9,8 @@ import logging
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url

View File

@@ -11,8 +11,8 @@ import logging
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import (
     get_default_timeout,

View File

@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url

View File

@@ -29,7 +29,6 @@ from pathlib import Path
 from typing import Any
 from dotenv import load_dotenv
 from mcp.server.fastmcp import Context, FastMCP
 # Add the project root to Python path for imports

View File

@@ -16,7 +16,6 @@ import os
 from urllib.parse import urljoin
 import httpx
 from mcp.server.fastmcp import Context, FastMCP
 # Import service discovery for HTTP communication

View File

@@ -53,6 +53,92 @@ crawl_semaphore = asyncio.Semaphore(CONCURRENT_CRAWL_LIMIT)
 active_crawl_tasks: dict[str, asyncio.Task] = {}
+
+
+def _sanitize_openai_error(error_message: str) -> str:
+    """Sanitize OpenAI API error messages to prevent information disclosure."""
+    import re
+
+    # Input validation
+    if not isinstance(error_message, str):
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+    if not error_message.strip():
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+
+    # Common patterns to sanitize with bounded quantifiers to prevent ReDoS
+    sanitized_patterns = {
+        r'https?://[^\s]{1,200}': '[REDACTED_URL]',  # Remove URLs (bounded)
+        r'sk-[a-zA-Z0-9]{48}': '[REDACTED_KEY]',  # Remove OpenAI API keys
+        r'"[^"]{1,100}auth[^"]{1,100}"': '[REDACTED_AUTH]',  # Remove auth details (bounded)
+        r'org-[a-zA-Z0-9]{24}': '[REDACTED_ORG]',  # Remove OpenAI organization IDs
+        r'proj_[a-zA-Z0-9]{10,20}': '[REDACTED_PROJ]',  # Remove OpenAI project IDs (bounded)
+        r'req_[a-zA-Z0-9]{6,20}': '[REDACTED_REQ]',  # Remove OpenAI request IDs (bounded)
+        r'user-[a-zA-Z0-9]{10,20}': '[REDACTED_USER]',  # Remove OpenAI user IDs (bounded)
+        r'sess_[a-zA-Z0-9]{10,20}': '[REDACTED_SESS]',  # Remove session IDs (bounded)
+        r'Bearer\s+[^\s]{1,200}': 'Bearer [REDACTED_AUTH_TOKEN]',  # Remove bearer tokens (bounded)
+    }
+
+    sanitized = error_message
+    for pattern, replacement in sanitized_patterns.items():
+        sanitized = re.sub(pattern, replacement, sanitized, flags=re.IGNORECASE)
+
+    # Check for sensitive words after pattern replacement
+    sensitive_words = ['internal', 'server', 'token']
+    # Only check for 'endpoint' if it's not part of our redacted URL pattern
+    if 'endpoint' in sanitized.lower() and '[REDACTED_URL]' not in sanitized:
+        sensitive_words.append('endpoint')
+
+    # Return generic message if still contains sensitive info
+    if any(word in sanitized.lower() for word in sensitive_words):
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+
+    return sanitized
+
+
+async def _validate_openai_api_key() -> None:
+    """
+    Validate OpenAI API key is present and working before starting operations.
+
+    Raises:
+        HTTPException: 401 if API key is invalid/missing, 429 if quota exhausted
+    """
+    try:
+        # Test the API key with a minimal embedding request
+        from ..services.embeddings.embedding_service import create_embedding
+
+        # Try to create a test embedding with minimal content
+        await create_embedding(text="test")
+    except Exception as e:
+        # Import embedding exceptions for specific error handling
+        from ..services.embeddings.embedding_exceptions import (
+            EmbeddingAuthenticationError,
+            EmbeddingQuotaExhaustedError,
+        )
+
+        if isinstance(e, EmbeddingAuthenticationError):
+            raise HTTPException(
+                status_code=401,
+                detail={
+                    "error": "Invalid OpenAI API key",
+                    "message": "Please verify your OpenAI API key in Settings before starting a crawl.",
+                    "error_type": "authentication_failed"
+                }
+            )
+        elif isinstance(e, EmbeddingQuotaExhaustedError):
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI quota exhausted",
+                    "message": "Your OpenAI API key has no remaining credits. Please add credits to your account.",
+                    "error_type": "quota_exhausted"
+                }
+            )
+        else:
+            # For any other errors, allow the operation to continue
+            # The error will be caught later during actual processing
+            logger.warning(f"API key validation failed with unexpected error: {e}")
+
+
 # Request Models
 class KnowledgeItemRequest(BaseModel):
     url: str
@@ -479,6 +565,9 @@ async def get_knowledge_item_code_examples(
 @router.post("/knowledge-items/{source_id}/refresh")
 async def refresh_knowledge_item(source_id: str):
     """Refresh a knowledge item by re-crawling its URL with the same metadata."""
+    # CRITICAL: Validate OpenAI API key before starting refresh
+    await _validate_openai_api_key()
+
     try:
         safe_logfire_info(f"Starting knowledge item refresh | source_id={source_id}")
@@ -597,6 +686,9 @@ async def crawl_knowledge_item(request: KnowledgeItemRequest):
     if not request.url.startswith(("http://", "https://")):
         raise HTTPException(status_code=422, detail="URL must start with http:// or https://")

+    # CRITICAL: Validate OpenAI API key before starting crawl
+    await _validate_openai_api_key()
+
     try:
         safe_logfire_info(
             f"Starting knowledge item crawl | url={str(request.url)} | knowledge_type={request.knowledge_type} | tags={request.tags}"
@@ -750,6 +842,9 @@ async def upload_document(
     knowledge_type: str = Form("technical"),
 ):
     """Upload and process a document with progress tracking."""
+    # CRITICAL: Validate OpenAI API key before starting upload
+    await _validate_openai_api_key()
+
     try:
         # DETAILED LOGGING: Track knowledge_type parameter flow
         safe_logfire_info(
@@ -974,10 +1069,73 @@ async def perform_rag_query(request: RagQueryRequest):
     except HTTPException:
         raise
     except Exception as e:
-        safe_logfire_error(
-            f"RAG query failed | error={str(e)} | query={request.query[:50]} | source={request.source}"
-        )
-        raise HTTPException(status_code=500, detail={"error": f"RAG query failed: {str(e)}"})
+        # Import embedding exceptions for specific error handling
+        from ..services.embeddings.embedding_exceptions import (
+            EmbeddingAPIError,
+            EmbeddingAuthenticationError,
+            EmbeddingQuotaExhaustedError,
+            EmbeddingRateLimitError,
+        )
+
+        # Handle specific OpenAI/embedding errors with detailed messages
+        if isinstance(e, EmbeddingAuthenticationError):
+            safe_logfire_error(
+                f"OpenAI authentication failed during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=401,
+                detail={
+                    "error": "OpenAI API authentication failed",
+                    "message": "Invalid or expired OpenAI API key. Please check your API key in settings.",
+                    "error_type": "authentication_failed",
+                    "api_key_prefix": getattr(e, "api_key_prefix", None),
+                }
+            )
+        elif isinstance(e, EmbeddingQuotaExhaustedError):
+            safe_logfire_error(
+                f"OpenAI quota exhausted during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI API quota exhausted",
+                    "message": "Your OpenAI API key has no remaining credits. Please add credits to your OpenAI account or check your billing settings.",
+                    "error_type": "quota_exhausted",
+                    "tokens_used": getattr(e, "tokens_used", None),
+                }
+            )
+        elif isinstance(e, EmbeddingRateLimitError):
+            safe_logfire_error(
+                f"OpenAI rate limit hit during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI API rate limit exceeded",
+                    "message": "Too many requests to OpenAI API. Please wait a moment and try again.",
+                    "error_type": "rate_limit",
+                    "retry_after": 30,  # Suggest 30 second wait
+                }
+            )
+        elif isinstance(e, EmbeddingAPIError):
+            safe_logfire_error(
+                f"OpenAI API error during RAG query | error={str(e)} | query={request.query[:50]} | source={request.source}"
+            )
+            sanitized_message = _sanitize_openai_error(str(e))
+            raise HTTPException(
+                status_code=502,
+                detail={
+                    "error": "OpenAI API error",
+                    "message": f"OpenAI API error: {sanitized_message}",
+                    "error_type": "api_error",
+                }
+            )
+        else:
+            # Generic error handling for other exceptions
+            safe_logfire_error(
+                f"RAG query failed | error={str(e)} | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(status_code=500, detail={"error": f"RAG query failed: {str(e)}"})


 @router.post("/rag/code-examples")

View File

@@ -113,7 +113,7 @@ async def lifespan(app: FastAPI):
         _initialization_complete = True
         api_logger.info("🎉 Archon backend started successfully!")
-    except Exception as e:
+    except Exception:
         api_logger.error("❌ Failed to start backend", exc_info=True)
         raise
@@ -135,7 +135,7 @@ async def lifespan(app: FastAPI):
         api_logger.info("✅ Cleanup completed")
-    except Exception as e:
+    except Exception:
         api_logger.error("❌ Error during shutdown", exc_info=True)

View File

@@ -486,7 +486,7 @@ class CrawlingService:
             logger.error("Code extraction failed, continuing crawl without code examples", exc_info=True)
             safe_logfire_error(f"Code extraction failed | error={e}")
             code_examples_count = 0

             # Report code extraction failure to progress tracker
             if self.progress_tracker:
                 await self.progress_tracker.update(

View File

@@ -6,8 +6,7 @@ Handles URL transformations and validations.
 import hashlib
 import re
-from urllib.parse import urlparse, urljoin
-from typing import List, Optional
+from urllib.parse import urljoin, urlparse

 from ....config.logfire_config import get_logger
@@ -33,8 +32,8 @@ class URLHandler:
         except Exception as e:
             logger.warning(f"Error checking if URL is sitemap: {e}")
             return False

     @staticmethod
     def is_markdown(url: str) -> bool:
         """
         Check if a URL points to a markdown file (.md, .mdx, .markdown).
@@ -274,9 +273,9 @@ class URLHandler:
             # Fallback: use a hash of the error message + url to still get something unique
             fallback = f"error_{redacted}_{str(e)}"
             return hashlib.sha256(fallback.encode("utf-8")).hexdigest()[:16]

     @staticmethod
-    def extract_markdown_links(content: str, base_url: Optional[str] = None) -> List[str]:
+    def extract_markdown_links(content: str, base_url: str | None = None) -> list[str]:
         """
         Extract markdown-style links from text content.
@@ -290,10 +289,10 @@ class URLHandler:
         try:
             if not content:
                 return []

             # Ultimate URL pattern with comprehensive format support:
             # 1) [text](url) - markdown links
             # 2) <https://...> - autolinks
             # 3) https://... - bare URLs with protocol
             # 4) //example.com - protocol-relative URLs
             # 5) www.example.com - scheme-less www URLs
@@ -348,7 +347,7 @@ class URLHandler:
                     # Only include HTTP/HTTPS URLs
                     if url.startswith(('http://', 'https://')):
                         urls.append(url)

             # Remove duplicates while preserving order
             seen = set()
             unique_urls = []
@@ -356,16 +355,16 @@ class URLHandler:
                 if url not in seen:
                     seen.add(url)
                     unique_urls.append(url)

             logger.info(f"Extracted {len(unique_urls)} unique links from content")
             return unique_urls

         except Exception as e:
             logger.error(f"Error extracting markdown links: {e}", exc_info=True)
             return []

     @staticmethod
-    def is_link_collection_file(url: str, content: Optional[str] = None) -> bool:
+    def is_link_collection_file(url: str, content: str | None = None) -> bool:
         """
         Check if a URL/file appears to be a link collection file like llms.txt.
@@ -380,7 +379,7 @@ class URLHandler:
             # Extract filename from URL
             parsed = urlparse(url)
             filename = parsed.path.split('/')[-1].lower()

             # Check for specific link collection filenames
             # Note: "full-*" or "*-full" patterns are NOT link collections - they contain complete content, not just links
             link_collection_patterns = [
@@ -391,12 +390,12 @@ class URLHandler:
                 'llms.mdx', 'links.mdx', 'resources.mdx', 'references.mdx',
                 'llms.markdown', 'links.markdown', 'resources.markdown', 'references.markdown',
             ]

             # Direct filename match
             if filename in link_collection_patterns:
                 logger.info(f"Detected link collection file by filename: {filename}")
                 return True

             # Pattern-based detection for variations, but exclude "full" variants
             # Only match files that are likely link collections, not complete content files
             if filename.endswith(('.txt', '.md', '.mdx', '.markdown')):
@@ -407,7 +406,7 @@ class URLHandler:
                 if any(filename.startswith(pattern + '.') or filename.startswith(pattern + '-') for pattern in base_patterns):
                     logger.info(f"Detected potential link collection file: {filename}")
                     return True

             # Content-based detection if content is provided
             if content:
                 # Never treat "full" variants as link collections to preserve single-page behavior
@@ -417,19 +416,19 @@ class URLHandler:
                 # Reuse extractor to avoid regex divergence and maintain consistency
                 extracted_links = URLHandler.extract_markdown_links(content, url)
                 total_links = len(extracted_links)

                 # Calculate link density (links per 100 characters)
                 content_length = len(content.strip())
                 if content_length > 0:
                     link_density = (total_links * 100) / content_length

                     # If more than 2% of content is links, likely a link collection
                     if link_density > 2.0 and total_links > 3:
                         logger.info(f"Detected link collection by content analysis: {total_links} links, density {link_density:.2f}%")
                         return True

             return False

         except Exception as e:
             logger.warning(f"Error checking if file is link collection: {e}", exc_info=True)
             return False

View File

@@ -219,4 +219,4 @@ async def generate_contextual_embeddings_batch(
     except Exception as e:
         search_logger.error(f"Error in contextual embedding batch: {e}")
         # Return non-contextual for all chunks
         return [(chunk, False) for chunk in chunks]

View File

@@ -99,6 +99,22 @@ class EmbeddingAPIError(EmbeddingError):
             self.metadata["original_error_message"] = str(original_error)

+
+class EmbeddingAuthenticationError(EmbeddingError):
+    """
+    Raised when API authentication fails (invalid or expired API key).
+
+    This is a CRITICAL error that should stop the entire process
+    as continuing would be pointless without valid API access.
+    """
+
+    def __init__(self, message: str, api_key_prefix: str | None = None, **kwargs):
+        super().__init__(message, **kwargs)
+        # Store masked API key prefix for debugging (first 3 chars + ellipsis)
+        self.api_key_prefix = api_key_prefix[:3] + "..." if api_key_prefix and len(api_key_prefix) >= 3 else None
+        if self.api_key_prefix:
+            self.metadata["api_key_prefix"] = self.api_key_prefix
+
+
 class EmbeddingValidationError(EmbeddingError):
     """
     Raised when embedding validation fails (e.g., zero vector detected).

View File

@@ -143,7 +143,7 @@ class KnowledgeItemService:
                 display_url = source_url
             else:
                 display_url = first_urls.get(source_id, f"source://{source_id}")

             code_examples_count = code_example_counts.get(source_id, 0)
             chunks_count = chunk_counts.get(source_id, 0)

View File

@@ -5,9 +5,9 @@ Provides lightweight summary data for knowledge items to minimize data transfer.
Optimized for frequent polling and card displays. Optimized for frequent polling and card displays.
""" """
from typing import Any, Optional from typing import Any
from ...config.logfire_config import safe_logfire_info, safe_logfire_error from ...config.logfire_config import safe_logfire_error, safe_logfire_info
class KnowledgeSummaryService: class KnowledgeSummaryService:
@@ -29,8 +29,8 @@ class KnowledgeSummaryService:
self, self,
page: int = 1, page: int = 1,
per_page: int = 20, per_page: int = 20,
knowledge_type: Optional[str] = None, knowledge_type: str | None = None,
search: Optional[str] = None, search: str | None = None,
) -> dict[str, Any]: ) -> dict[str, Any]:
""" """
Get lightweight summaries of knowledge items. Get lightweight summaries of knowledge items.
@@ -51,69 +51,69 @@ class KnowledgeSummaryService:
""" """
try: try:
safe_logfire_info(f"Fetching knowledge summaries | page={page} | per_page={per_page}") safe_logfire_info(f"Fetching knowledge summaries | page={page} | per_page={per_page}")
# Build base query - select only needed fields, including source_url # Build base query - select only needed fields, including source_url
query = self.supabase.from_("archon_sources").select( query = self.supabase.from_("archon_sources").select(
"source_id, title, summary, metadata, source_url, created_at, updated_at" "source_id, title, summary, metadata, source_url, created_at, updated_at"
) )
# Apply filters # Apply filters
if knowledge_type: if knowledge_type:
query = query.contains("metadata", {"knowledge_type": knowledge_type}) query = query.contains("metadata", {"knowledge_type": knowledge_type})
if search: if search:
search_pattern = f"%{search}%" search_pattern = f"%{search}%"
query = query.or_( query = query.or_(
f"title.ilike.{search_pattern},summary.ilike.{search_pattern}" f"title.ilike.{search_pattern},summary.ilike.{search_pattern}"
) )
# Get total count # Get total count
count_query = self.supabase.from_("archon_sources").select( count_query = self.supabase.from_("archon_sources").select(
"*", count="exact", head=True "*", count="exact", head=True
) )
if knowledge_type: if knowledge_type:
count_query = count_query.contains("metadata", {"knowledge_type": knowledge_type}) count_query = count_query.contains("metadata", {"knowledge_type": knowledge_type})
if search: if search:
search_pattern = f"%{search}%" search_pattern = f"%{search}%"
count_query = count_query.or_( count_query = count_query.or_(
f"title.ilike.{search_pattern},summary.ilike.{search_pattern}" f"title.ilike.{search_pattern},summary.ilike.{search_pattern}"
) )
count_result = count_query.execute() count_result = count_query.execute()
total = count_result.count if hasattr(count_result, "count") else 0 total = count_result.count if hasattr(count_result, "count") else 0
# Apply pagination # Apply pagination
start_idx = (page - 1) * per_page start_idx = (page - 1) * per_page
query = query.range(start_idx, start_idx + per_page - 1) query = query.range(start_idx, start_idx + per_page - 1)
query = query.order("updated_at", desc=True) query = query.order("updated_at", desc=True)
# Execute main query # Execute main query
result = query.execute() result = query.execute()
sources = result.data if result.data else [] sources = result.data if result.data else []
# Get source IDs for batch operations # Get source IDs for batch operations
source_ids = [s["source_id"] for s in sources] source_ids = [s["source_id"] for s in sources]
# Batch fetch counts only (no content!) # Batch fetch counts only (no content!)
summaries = [] summaries = []
if source_ids: if source_ids:
# Get document counts in a single query # Get document counts in a single query
doc_counts = await self._get_document_counts_batch(source_ids) doc_counts = await self._get_document_counts_batch(source_ids)
# Get code example counts in a single query # Get code example counts in a single query
code_counts = await self._get_code_example_counts_batch(source_ids) code_counts = await self._get_code_example_counts_batch(source_ids)
# Get first URLs in a single query # Get first URLs in a single query
first_urls = await self._get_first_urls_batch(source_ids) first_urls = await self._get_first_urls_batch(source_ids)
# Build summaries # Build summaries
for source in sources: for source in sources:
source_id = source["source_id"] source_id = source["source_id"]
metadata = source.get("metadata", {}) metadata = source.get("metadata", {})
# Use the original source_url from the source record (the URL the user entered) # Use the original source_url from the source record (the URL the user entered)
# Fall back to first crawled page URL, then to source:// format as last resort # Fall back to first crawled page URL, then to source:// format as last resort
source_url = source.get("source_url") source_url = source.get("source_url")
@@ -121,9 +121,9 @@ class KnowledgeSummaryService:
first_url = source_url
else:
first_url = first_urls.get(source_id, f"source://{source_id}")
source_type = metadata.get("source_type", "file" if first_url.startswith("file://") else "url")
# Extract knowledge_type - check metadata first, otherwise default based on source content
# The metadata should always have it if it was crawled properly
knowledge_type = metadata.get("knowledge_type")
@@ -132,7 +132,7 @@ class KnowledgeSummaryService:
# This handles legacy data that might not have knowledge_type set
safe_logfire_info(f"Knowledge type not found in metadata for {source_id}, defaulting to technical")
knowledge_type = "technical"
summary = {
"source_id": source_id,
"title": source.get("title", source.get("summary", "Untitled")),
@@ -148,11 +148,11 @@ class KnowledgeSummaryService:
"metadata": metadata, # Include full metadata for debugging "metadata": metadata, # Include full metadata for debugging
} }
summaries.append(summary) summaries.append(summary)
safe_logfire_info( safe_logfire_info(
f"Knowledge summaries fetched | count={len(summaries)} | total={total}" f"Knowledge summaries fetched | count={len(summaries)} | total={total}"
) )
return { return {
"items": summaries, "items": summaries,
"total": total, "total": total,
@@ -160,11 +160,11 @@ class KnowledgeSummaryService:
"per_page": per_page, "per_page": per_page,
"pages": (total + per_page - 1) // per_page if per_page > 0 else 0, "pages": (total + per_page - 1) // per_page if per_page > 0 else 0,
} }
except Exception as e: except Exception as e:
safe_logfire_error(f"Failed to get knowledge summaries | error={str(e)}") safe_logfire_error(f"Failed to get knowledge summaries | error={str(e)}")
raise raise
async def _get_document_counts_batch(self, source_ids: list[str]) -> dict[str, int]: async def _get_document_counts_batch(self, source_ids: list[str]) -> dict[str, int]:
""" """
Get document counts for multiple sources in a single query. Get document counts for multiple sources in a single query.
@@ -179,7 +179,7 @@ class KnowledgeSummaryService:
# Use a raw SQL query for efficient counting
# Group by source_id and count
counts = {}
# For now, use individual queries but optimize later with raw SQL
for source_id in source_ids:
result = (
@@ -189,13 +189,13 @@ class KnowledgeSummaryService:
.execute()
)
counts[source_id] = result.count if hasattr(result, "count") else 0
return counts
except Exception as e:
safe_logfire_error(f"Failed to get document counts | error={str(e)}")
-return {sid: 0 for sid in source_ids}
+return dict.fromkeys(source_ids, 0)
async def _get_code_example_counts_batch(self, source_ids: list[str]) -> dict[str, int]:
"""
Get code example counts for multiple sources efficiently.
@@ -208,7 +208,7 @@ class KnowledgeSummaryService:
""" """
try: try:
counts = {} counts = {}
# For now, use individual queries but can optimize with raw SQL later # For now, use individual queries but can optimize with raw SQL later
for source_id in source_ids: for source_id in source_ids:
result = ( result = (
@@ -218,13 +218,13 @@ class KnowledgeSummaryService:
.execute()
)
counts[source_id] = result.count if hasattr(result, "count") else 0
return counts
except Exception as e:
safe_logfire_error(f"Failed to get code example counts | error={str(e)}")
-return {sid: 0 for sid in source_ids}
+return dict.fromkeys(source_ids, 0)
async def _get_first_urls_batch(self, source_ids: list[str]) -> dict[str, str]:
"""
Get first URL for each source in a batch.
@@ -244,21 +244,21 @@ class KnowledgeSummaryService:
.order("created_at", desc=False) .order("created_at", desc=False)
.execute() .execute()
) )
# Group by source_id, keeping first URL for each # Group by source_id, keeping first URL for each
urls = {} urls = {}
for item in result.data or []: for item in result.data or []:
source_id = item["source_id"] source_id = item["source_id"]
if source_id not in urls: if source_id not in urls:
urls[source_id] = item["url"] urls[source_id] = item["url"]
# Provide defaults for any missing # Provide defaults for any missing
for source_id in source_ids: for source_id in source_ids:
if source_id not in urls: if source_id not in urls:
urls[source_id] = f"source://{source_id}" urls[source_id] = f"source://{source_id}"
return urls return urls
except Exception as e: except Exception as e:
safe_logfire_error(f"Failed to get first URLs | error={str(e)}") safe_logfire_error(f"Failed to get first URLs | error={str(e)}")
return {sid: f"source://{sid}" for sid in source_ids} return {sid: f"source://{sid}" for sid in source_ids}

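One small note on the `dict.fromkeys` substitutions above: they are behavior-preserving here because the shared default is the immutable `0`. A quick sketch of the equivalence and the one caveat:

```python
source_ids = ["src_a", "src_b"]

# Equivalent to the old dict comprehension for an immutable default.
assert dict.fromkeys(source_ids, 0) == {sid: 0 for sid in source_ids}

# Caveat: with a mutable default, every key would share one object.
shared = dict.fromkeys(source_ids, [])
shared["src_a"].append(1)
assert shared["src_b"] == [1]  # same list behind both keys
```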
View File

@@ -191,4 +191,4 @@ class HybridSearchStrategy:
except Exception as e:
logger.error(f"Hybrid code example search failed: {e}")
span.set_attribute("error", str(e))
return []

View File

@@ -117,7 +117,8 @@ class RAGService:
if not query_embedding:
logger.error("Failed to create embedding for query")
-return []
+# Follow fail-fast principle - embedding failure should not return empty results
+raise RuntimeError("Failed to create embedding for query - this indicates a configuration or API issue")
if use_hybrid_search:
# Use hybrid strategy
@@ -141,9 +142,22 @@ class RAGService:
return results
except Exception as e:
+# Import embedding exceptions for specific error handling
+from ..embeddings.embedding_exceptions import (
+EmbeddingAPIError,
+EmbeddingAuthenticationError,
+EmbeddingQuotaExhaustedError,
+EmbeddingRateLimitError,
+)
+# Re-raise OpenAI embedding errors so they propagate to the API layer with specific error info
+if isinstance(e, (EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, EmbeddingRateLimitError, EmbeddingAPIError)):
+raise
logger.error(f"Document search failed: {e}") logger.error(f"Document search failed: {e}")
span.set_attribute("error", str(e)) span.set_attribute("error", str(e))
return [] # Follow fail-fast principle - don't return empty results for legitimate failures
raise RuntimeError(f"Document search failed: {str(e)}") from e
async def search_code_examples(
self,

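The re-raise above is only half of the fail-fast story; the API layer has to translate each embedding exception into an HTTP status. The test suite at the end of this commit pins the mapping (401 auth, 429 quota/rate limit, 502 API error). A hedged sketch of that translation; the function name and detail payloads are assumptions, not the actual knowledge_api.py code:

```python
from fastapi import HTTPException

from src.server.services.embeddings.embedding_exceptions import (
    EmbeddingAPIError,
    EmbeddingAuthenticationError,
    EmbeddingQuotaExhaustedError,
    EmbeddingRateLimitError,
)

def map_embedding_error(e: Exception) -> HTTPException:
    # Status codes mirror the assertions in the tests below.
    if isinstance(e, EmbeddingAuthenticationError):
        return HTTPException(status_code=401, detail={"error_type": "authentication_failed"})
    if isinstance(e, EmbeddingQuotaExhaustedError):
        return HTTPException(status_code=429, detail={"error_type": "quota_exhausted"})
    if isinstance(e, EmbeddingRateLimitError):
        return HTTPException(status_code=429, detail={"error_type": "rate_limit", "retry_after": 30})
    if isinstance(e, EmbeddingAPIError):
        return HTTPException(status_code=502, detail={"error_type": "api_error"})
    return HTTPException(status_code=500, detail={"error_type": "unknown"})
```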
View File

@@ -91,7 +91,7 @@ class RateLimiter:
""" """
while True: # Loop instead of recursion to avoid stack overflow while True: # Loop instead of recursion to avoid stack overflow
wait_time_to_sleep = None wait_time_to_sleep = None
async with self._lock: async with self._lock:
now = time.time() now = time.time()
@@ -104,7 +104,7 @@ class RateLimiter:
self.request_times.append(now)
self.token_usage.append((now, estimated_tokens))
return True
# Calculate wait time if we can't make the request
wait_time = self._calculate_wait_time(estimated_tokens)
if wait_time > 0:
@@ -118,7 +118,7 @@ class RateLimiter:
wait_time_to_sleep = wait_time
else:
return False
# Sleep outside the lock to avoid deadlock
if wait_time_to_sleep is not None:
# For long waits, break into smaller chunks with progress updates

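The two comments in this hunk carry the whole design: loop instead of recursing so repeated backoff cannot grow the stack, and compute the wait while holding the lock but sleep after releasing it so other coroutines can still record usage. A stripped-down sketch of the same shape (the real limiter also tracks token usage and chunks long waits):

```python
import asyncio
import time

class TinyRateLimiter:
    """Sketch only: requests-per-minute, loop-not-recursion, sleep outside the lock."""

    def __init__(self, max_per_minute: int):
        self.max_per_minute = max_per_minute
        self.request_times: list[float] = []
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        while True:  # loop instead of recursion to avoid stack overflow
            wait_time = None
            async with self._lock:
                now = time.time()
                self.request_times = [t for t in self.request_times if now - t < 60.0]
                if len(self.request_times) < self.max_per_minute:
                    self.request_times.append(now)
                    return
                wait_time = 60.0 - (now - self.request_times[0])
            # Sleep outside the lock to avoid blocking other waiters.
            await asyncio.sleep(max(wait_time, 0.05))
```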
View File

@@ -106,7 +106,7 @@ class ProgressTracker:
f"DEBUG: ProgressTracker.update called | status={status} | progress={progress} | " f"DEBUG: ProgressTracker.update called | status={status} | progress={progress} | "
f"current_state_progress={self.state.get('progress', 0)} | kwargs_keys={list(kwargs.keys())}" f"current_state_progress={self.state.get('progress', 0)} | kwargs_keys={list(kwargs.keys())}"
) )
# CRITICAL: Never allow progress to go backwards # CRITICAL: Never allow progress to go backwards
current_progress = self.state.get("progress", 0) current_progress = self.state.get("progress", 0)
new_progress = min(100, max(0, progress)) # Ensure 0-100 new_progress = min(100, max(0, progress)) # Ensure 0-100
@@ -129,7 +129,7 @@ class ProgressTracker:
"log": log, "log": log,
"timestamp": datetime.now().isoformat(), "timestamp": datetime.now().isoformat(),
}) })
# DEBUG: Log final state for document_storage # DEBUG: Log final state for document_storage
if status == "document_storage" and actual_progress >= 35: if status == "document_storage" and actual_progress >= 35:
safe_logfire_info( safe_logfire_info(
@@ -155,10 +155,10 @@ class ProgressTracker:
for key, value in kwargs.items():
if key not in protected_fields:
self.state[key] = value
self._update_state()
# Schedule cleanup for terminal states
if status in ["cancelled", "failed"]:
asyncio.create_task(self._delayed_cleanup(self.progress_id))
@@ -189,7 +189,7 @@ class ProgressTracker:
safe_logfire_info(
f"Progress completed | progress_id={self.progress_id} | type={self.operation_type} | duration={self.state.get('duration_formatted', 'unknown')}"
)
# Schedule cleanup after delay to allow clients to see final state
asyncio.create_task(self._delayed_cleanup(self.progress_id))
@@ -214,7 +214,7 @@ class ProgressTracker:
safe_logfire_error(
f"Progress error | progress_id={self.progress_id} | type={self.operation_type} | error={error_message}"
)
# Schedule cleanup after delay to allow clients to see final state
asyncio.create_task(self._delayed_cleanup(self.progress_id))
@@ -241,9 +241,9 @@ class ProgressTracker:
)
async def update_crawl_stats(
self,
processed_pages: int,
total_pages: int,
current_url: str | None = None,
pages_found: int | None = None
):
@@ -269,16 +269,16 @@ class ProgressTracker:
"total_pages": total_pages, "total_pages": total_pages,
"current_url": current_url, "current_url": current_url,
} }
if pages_found is not None: if pages_found is not None:
update_data["pages_found"] = pages_found update_data["pages_found"] = pages_found
await self.update(**update_data) await self.update(**update_data)
async def update_storage_progress( async def update_storage_progress(
self, self,
chunks_stored: int, chunks_stored: int,
total_chunks: int, total_chunks: int,
operation: str = "storing", operation: str = "storing",
word_count: int | None = None, word_count: int | None = None,
embeddings_created: int | None = None embeddings_created: int | None = None
@@ -294,7 +294,7 @@ class ProgressTracker:
embeddings_created: Number of embeddings created
"""
progress_val = int((chunks_stored / max(total_chunks, 1)) * 100)
update_data = {
"status": "document_storage",
"progress": progress_val,
@@ -302,14 +302,14 @@ class ProgressTracker:
"chunks_stored": chunks_stored, "chunks_stored": chunks_stored,
"total_chunks": total_chunks, "total_chunks": total_chunks,
} }
if word_count is not None: if word_count is not None:
update_data["word_count"] = word_count update_data["word_count"] = word_count
if embeddings_created is not None: if embeddings_created is not None:
update_data["embeddings_created"] = embeddings_created update_data["embeddings_created"] = embeddings_created
await self.update(**update_data) await self.update(**update_data)
async def update_code_extraction_progress( async def update_code_extraction_progress(
self, self,
completed_summaries: int, completed_summaries: int,
@@ -327,11 +327,11 @@ class ProgressTracker:
current_file: Current file being processed
"""
progress_val = int((completed_summaries / max(total_summaries, 1)) * 100)
log = f"Extracting code: {completed_summaries}/{total_summaries} summaries"
if current_file:
log += f" - {current_file}"
await self.update(
status="code_extraction",
progress=progress_val,

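The invariant behind the "never allow progress to go backwards" comment above reduces to a two-step clamp, sketched here in isolation:

```python
def next_progress(current: int, reported: int) -> int:
    # Clamp the report into 0-100, then refuse to regress past current.
    return max(current, min(100, max(0, reported)))

assert next_progress(40, 35) == 40    # stale/out-of-order update is ignored
assert next_progress(40, 250) == 100  # overshoot is clamped to 100
assert next_progress(40, 55) == 55    # normal forward progress passes through
```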
View File

@@ -0,0 +1,466 @@
"""
Comprehensive test suite for OpenAI error handling enhancements.
Tests the complete error flow from embedding service through RAG service
to API endpoints, ensuring proper error propagation and message sanitization.
Related to GitHub Issue #362.
"""
import pytest
from unittest.mock import Mock, AsyncMock, patch
from fastapi import HTTPException
from src.server.services.embeddings.embedding_exceptions import (
EmbeddingAuthenticationError,
EmbeddingQuotaExhaustedError,
EmbeddingRateLimitError,
EmbeddingAPIError,
)
from src.server.services.search.rag_service import RAGService
from src.server.api_routes.knowledge_api import perform_rag_query, RagQueryRequest, _sanitize_openai_error, _validate_openai_api_key
class TestOpenAIErrorHandling:
"""Test suite for OpenAI error handling in RAG queries."""
@pytest.mark.asyncio
async def test_quota_exhausted_error_propagation(self):
"""Test that quota exhausted errors propagate correctly through service layer."""
# Mock the create_embedding function to raise quota exhausted error
with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
mock_create_embedding.side_effect = EmbeddingQuotaExhaustedError(
"OpenAI quota exhausted", tokens_used=1000
)
# Create RAG service and test search_documents method
with patch("src.server.services.search.rag_service.get_supabase_client"):
rag_service = RAGService()
# Should propagate the quota error, not return empty list
with pytest.raises(EmbeddingQuotaExhaustedError) as exc_info:
await rag_service.search_documents("test query")
assert "quota exhausted" in str(exc_info.value).lower()
assert exc_info.value.tokens_used == 1000
@pytest.mark.asyncio
async def test_authentication_error_propagation(self):
"""Test that authentication errors propagate correctly through service layer."""
# Mock the create_embedding function to raise authentication error
with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
mock_create_embedding.side_effect = EmbeddingAuthenticationError(
"Invalid API key", api_key_prefix="sk-1234"
)
# Create RAG service and test search_documents method
with patch("src.server.services.search.rag_service.get_supabase_client"):
rag_service = RAGService()
# Should propagate the authentication error
with pytest.raises(EmbeddingAuthenticationError) as exc_info:
await rag_service.search_documents("test query")
assert "invalid api key" in str(exc_info.value).lower()
assert exc_info.value.api_key_prefix == "sk-…"  # first 3 chars + ellipsis, per the exception's masking
@pytest.mark.asyncio
async def test_rate_limit_error_propagation(self):
"""Test that rate limit errors propagate correctly through service layer."""
# Mock the create_embedding function to raise rate limit error
with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
mock_create_embedding.side_effect = EmbeddingRateLimitError(
"Rate limit exceeded"
)
# Create RAG service and test search_documents method
with patch("src.server.services.search.rag_service.get_supabase_client"):
rag_service = RAGService()
# Should propagate the rate limit error
with pytest.raises(EmbeddingRateLimitError) as exc_info:
await rag_service.search_documents("test query")
assert "rate limit" in str(exc_info.value).lower()
@pytest.mark.asyncio
async def test_api_quota_error_in_rag_endpoint(self):
"""Test that quota errors are properly handled in RAG API endpoint."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to raise quota exhausted error
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=EmbeddingQuotaExhaustedError(
"OpenAI quota exhausted", tokens_used=500
)
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with status 429 and detailed error info
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error details
assert exc_info.value.status_code == 429
assert "quota exhausted" in exc_info.value.detail["error"].lower()
assert "add credits" in exc_info.value.detail["message"].lower()
assert exc_info.value.detail["error_type"] == "quota_exhausted"
assert exc_info.value.detail["tokens_used"] == 500
@pytest.mark.asyncio
async def test_api_authentication_error_in_rag_endpoint(self):
"""Test that authentication errors are properly handled in RAG API endpoint."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to raise authentication error
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=EmbeddingAuthenticationError(
"Invalid API key", api_key_prefix="sk-1234"
)
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with status 401 and detailed error info
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error details
assert exc_info.value.status_code == 401
assert "authentication failed" in exc_info.value.detail["error"].lower()
assert "invalid or expired" in exc_info.value.detail["message"].lower()
assert exc_info.value.detail["error_type"] == "authentication_failed"
assert exc_info.value.detail["api_key_prefix"] == "sk-1…"
@pytest.mark.asyncio
async def test_api_rate_limit_error_in_rag_endpoint(self):
"""Test that rate limit errors are properly handled in RAG API endpoint."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to raise rate limit error
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=EmbeddingRateLimitError("Rate limit exceeded")
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with status 429 and detailed error info
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error details
assert exc_info.value.status_code == 429
assert "rate limit" in exc_info.value.detail["error"].lower()
assert "too many requests" in exc_info.value.detail["message"].lower()
assert exc_info.value.detail["error_type"] == "rate_limit"
assert exc_info.value.detail["retry_after"] == 30
@pytest.mark.asyncio
async def test_api_generic_error_in_rag_endpoint(self):
"""Test that generic API errors are properly handled in RAG API endpoint."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to raise generic API error
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=EmbeddingAPIError("Invalid model specified")
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with status 502 and detailed error info
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error details
assert exc_info.value.status_code == 502
assert "openai api error" in exc_info.value.detail["error"].lower()
assert exc_info.value.detail["error_type"] == "api_error"
@pytest.mark.asyncio
async def test_api_key_validation_success(self):
"""Test that API key validation passes with valid key."""
# Mock create_embedding to succeed
with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
mock_create_embedding.return_value = [0.1] * 1536 # Mock successful embedding
# Should not raise any exception
await _validate_openai_api_key()
@pytest.mark.asyncio
async def test_api_key_validation_auth_failure(self):
"""Test that API key validation raises 401 for authentication failures."""
# Mock create_embedding to raise authentication error
with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
mock_create_embedding.side_effect = EmbeddingAuthenticationError(
"Invalid API key", api_key_prefix="sk-1234"
)
# Should raise HTTPException with status 401
with pytest.raises(HTTPException) as exc_info:
await _validate_openai_api_key()
assert exc_info.value.status_code == 401
assert exc_info.value.detail["error_type"] == "authentication_failed"
@pytest.mark.asyncio
async def test_api_key_validation_quota_failure(self):
"""Test that API key validation raises 429 for quota exhaustion."""
# Mock create_embedding to raise quota exhausted error
with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
mock_create_embedding.side_effect = EmbeddingQuotaExhaustedError(
"Quota exhausted", tokens_used=1000
)
# Should raise HTTPException with status 429
with pytest.raises(HTTPException) as exc_info:
await _validate_openai_api_key()
assert exc_info.value.status_code == 429
assert exc_info.value.detail["error_type"] == "quota_exhausted"
def test_sanitize_openai_error_removes_urls(self):
"""Test that sanitization function removes URLs from error messages."""
error_message = "Connection failed to https://api.openai.com/v1/embeddings with status 400"
sanitized = _sanitize_openai_error(error_message)
assert "https://api.openai.com" not in sanitized
assert "[REDACTED_URL]" in sanitized
assert "Connection failed" in sanitized
def test_sanitize_openai_error_removes_api_keys(self):
"""Test that sanitization function removes API keys from error messages."""
error_message = "Authentication failed with key sk-1234567890abcdef1234567890abcdef1234567890abcdef"
sanitized = _sanitize_openai_error(error_message)
assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in sanitized
assert "[REDACTED_KEY]" in sanitized
assert "Authentication failed" in sanitized
def test_sanitize_openai_error_removes_auth_info(self):
"""Test that sanitization function removes auth details from error messages."""
error_message = 'Failed to authenticate: "auth_bearer_xyz123"'
sanitized = _sanitize_openai_error(error_message)
assert "auth_bearer_xyz123" not in sanitized
assert "[REDACTED_AUTH]" in sanitized
def test_sanitize_openai_error_returns_generic_for_sensitive_words(self):
"""Test that sanitization returns generic message for sensitive internal details."""
error_message = "Internal server error on endpoint /v1/embeddings"
sanitized = _sanitize_openai_error(error_message)
# Should return generic message due to 'internal' and 'endpoint' keywords
assert sanitized == "OpenAI API encountered an error. Please verify your API key and quota."
def test_sanitize_openai_error_preserves_safe_messages(self):
"""Test that sanitization preserves safe error messages."""
error_message = "Model not found: text-embedding-ada-002"
sanitized = _sanitize_openai_error(error_message)
# Should preserve the message since it contains no sensitive info
assert sanitized == error_message
def test_sanitize_openai_error_removes_organization_ids(self):
"""Test that sanitization function removes OpenAI organization IDs."""
error_message = "Permission denied for org-1234567890abcdef12345678 with model access"
sanitized = _sanitize_openai_error(error_message)
assert "org-1234567890abcdef12345678" not in sanitized
assert "[REDACTED_ORG]" in sanitized
assert "Permission denied" in sanitized
def test_sanitize_openai_error_removes_project_ids(self):
"""Test that sanitization function removes OpenAI project IDs."""
error_message = "Project proj_abcdef1234567890xyz not found"
sanitized = _sanitize_openai_error(error_message)
assert "proj_abcdef1234567890xyz" not in sanitized
assert "[REDACTED_PROJ]" in sanitized
assert "Project" in sanitized and "not found" in sanitized
def test_sanitize_openai_error_removes_request_ids(self):
"""Test that sanitization function removes OpenAI request IDs."""
error_message = "Request req_1234567890abcdefghij failed with timeout"
sanitized = _sanitize_openai_error(error_message)
assert "req_1234567890abcdefghij" not in sanitized
assert "[REDACTED_REQ]" in sanitized
assert "Request" in sanitized and "failed with timeout" in sanitized
def test_sanitize_openai_error_handles_multiple_patterns(self):
"""Test that sanitization handles multiple sensitive patterns in one message."""
error_message = "Request req_abc123 to https://api.openai.com failed for org-1234567890abcdef12345678 with key sk-1234567890abcdef1234567890abcdef1234567890abcdef"
sanitized = _sanitize_openai_error(error_message)
# Verify all patterns are redacted
assert "req_abc123" not in sanitized
assert "https://api.openai.com" not in sanitized
assert "org-1234567890abcdef12345678" not in sanitized
assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in sanitized
# Verify redacted placeholders are present
assert "[REDACTED_REQ]" in sanitized
assert "[REDACTED_URL]" in sanitized
assert "[REDACTED_ORG]" in sanitized
assert "[REDACTED_KEY]" in sanitized
def test_sanitize_openai_error_input_validation(self):
"""Test that sanitization function handles invalid input gracefully."""
# Test None input
result = _sanitize_openai_error(None)
assert result == "OpenAI API encountered an error. Please verify your API key and quota."
# Test non-string input
result = _sanitize_openai_error(123)
assert result == "OpenAI API encountered an error. Please verify your API key and quota."
# Test empty string
result = _sanitize_openai_error("")
assert result == "OpenAI API encountered an error. Please verify your API key and quota."
# Test whitespace-only string
result = _sanitize_openai_error(" ")
assert result == "OpenAI API encountered an error. Please verify your API key and quota."
@pytest.mark.asyncio
async def test_fail_fast_pattern_embedding_failure(self):
"""Test that embedding failures now fail fast instead of returning empty results."""
# Mock the create_embedding function to return None (failure)
with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
mock_create_embedding.return_value = None
# Create RAG service and test search_documents method
with patch("src.server.services.search.rag_service.get_supabase_client"):
rag_service = RAGService()
# Should raise RuntimeError instead of returning empty list
with pytest.raises(RuntimeError) as exc_info:
await rag_service.search_documents("test query")
assert "Failed to create embedding" in str(exc_info.value)
assert "configuration or API issue" in str(exc_info.value)
@pytest.mark.asyncio
async def test_fail_fast_pattern_search_failure(self):
"""Test that search failures now fail fast instead of returning empty results."""
# Mock the create_embedding to succeed but vector search to fail
with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
mock_create_embedding.return_value = [0.1] * 1536 # Mock embedding vector
with patch("src.server.services.search.rag_service.get_supabase_client"):
rag_service = RAGService()
# Mock the base strategy to raise an exception
with patch.object(rag_service.base_strategy, 'vector_search', side_effect=Exception("Database connection failed")):
# Should raise RuntimeError instead of returning empty list
with pytest.raises(RuntimeError) as exc_info:
await rag_service.search_documents("test query")
assert "Document search failed" in str(exc_info.value)
assert "Database connection failed" in str(exc_info.value)
@pytest.mark.asyncio
async def test_integration_error_flow_rag_to_api(self):
"""Test complete error flow from RAG service through API endpoint."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock both RAGService and get_supabase_client to avoid real connections
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=RuntimeError("Document search failed: Database connection failed")
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with generic error (not OpenAI specific)
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error details
assert exc_info.value.status_code == 500
assert "RAG query failed" in exc_info.value.detail["error"]
assert "Database connection failed" in exc_info.value.detail["error"]
@pytest.mark.asyncio
async def test_api_error_sanitization_in_endpoint(self):
"""Test that API errors are sanitized in the RAG endpoint response."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to raise API error with sensitive information
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(
side_effect=EmbeddingAPIError("Request failed to https://api.openai.com/v1/embeddings with key sk-1234567890abcdef1234567890abcdef1234567890abcdef")
)
mock_rag_service_class.return_value = mock_service
# Should raise HTTPException with sanitized error message
with pytest.raises(HTTPException) as exc_info:
await perform_rag_query(request)
# Verify error is sanitized
error_message = exc_info.value.detail["message"]
assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in error_message
assert "https://api.openai.com" not in error_message
assert "[REDACTED_KEY]" in error_message
assert "[REDACTED_URL]" in error_message
@pytest.mark.asyncio
async def test_successful_rag_query_flow(self):
"""Test that successful RAG queries work correctly."""
request = RagQueryRequest(query="test query", match_count=5)
# Mock RAGService to return successful result
mock_result = {
"results": [{"content": "test result"}],
"query": "test query",
"total_found": 1,
}
with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
patch("src.server.api_routes.knowledge_api.get_supabase_client"):
mock_service = Mock()
mock_service.perform_rag_query = AsyncMock(return_value=(True, mock_result))
mock_rag_service_class.return_value = mock_service
# Should return successful result
result = await perform_rag_query(request)
# Verify successful response
assert result["success"] is True
assert result["results"] == [{"content": "test result"}]
assert result["total_found"] == 1