diff --git a/TASK_PRP/PRPs/rebase-bugfix-issue-362.md b/TASK_PRP/PRPs/rebase-bugfix-issue-362.md
new file mode 100644
index 00000000..aa92954c
--- /dev/null
+++ b/TASK_PRP/PRPs/rebase-bugfix-issue-362.md
@@ -0,0 +1,275 @@
+# TASK PRP: Rebase bugfix-issue-362 onto main
+
+## Task Overview
+Rebase the bugfix-issue-362 branch (33 commits of OpenAI error handling improvements) onto main branch (includes TanStack Query Migration Phase 3) while preserving all error handling functionality and adapting to the new architecture.
+
+## Analysis Process
+
+### Scope Definition
+- **Affected Files**: 
+  - `archon-ui-main/src/pages/KnowledgeBasePage.tsx` (conflict)
+  - `archon-ui-main/src/services/knowledgeBaseService.ts` (deleted in main)
+  - `python/src/server/services/search/rag_service.py` (conflict)
+  - Multiple test files with error handling updates
+- **Dependencies**: TanStack Query hooks, frontend error handling, backend RAG service
+- **Side Effects**: Error sanitization patterns, API authentication flows
+- **Test Coverage**: Frontend error display, backend error handling, integration tests
+
+### Pattern Research
+```yaml
+context:
+  docs:
+    - url: "PR #605 - TanStack Query Migration Phase 3"
+      focus: "Knowledge base architecture changes"
+    - url: "archon-ui-main/src/features/knowledge/hooks/"
+      focus: "New TanStack Query patterns"
+
+  patterns:
+    - file: "archon-ui-main/src/features/knowledge/components/KnowledgeBase.tsx"
+      copy: "TanStack Query error handling patterns"
+    - file: "archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts"  
+      copy: "Service replacement patterns"
+
+  gotchas:
+    - issue: "knowledgeBaseService.ts deleted"
+      fix: "Functionality moved to TanStack Query hooks"
+    - issue: "Error handling architecture changed"
+      fix: "Use TanStack Query error states"
+    - issue: "Import paths changed"
+      fix: "Update to features/ directory structure"
+```
+
+## Task Structure
+
+### Phase 1: Preparation and Analysis
+
+**ACTION backup-branches:**
+- CREATE: backup branch `git checkout -b bugfix-issue-362-backup`
+- CREATE: analysis branch `git checkout -b rebase-analysis`
+- VALIDATE: `git branch -a | grep backup`
+- IF_FAIL: "Branches not created properly"
+- ROLLBACK: "Continue without backup (higher risk)"
+
+**ACTION analyze-main-changes:**
+- READ: `git log origin/main --oneline -10`
+- READ: `archon-ui-main/src/features/knowledge/` structure
+- READ: TanStack Query migration patterns in main
+- VALIDATE: Understanding of new architecture
+- IF_FAIL: "Research TanStack Query documentation"
+- ROLLBACK: "Proceed with basic understanding"
+
+**ACTION identify-functionality-mapping:**
+- COMPARE: deleted `knowledgeBaseService.ts` vs new hooks
+- MAP: Error handling functions to TanStack Query equivalents
+- DOCUMENT: Functions that need to be preserved
+- VALIDATE: Complete functionality mapping
+- IF_FAIL: "Manual code review needed"
+- ROLLBACK: "Proceed with partial mapping"
+
+### Phase 2: Interactive Rebase Setup
+
+**ACTION start-rebase:**
+- COMMAND: `git rebase -i origin/main`
+- SELECT: Keep all commits (edit conflicting ones)
+- VALIDATE: Rebase editor opens successfully
+- IF_FAIL: "Check git configuration, try git rebase --abort"
+- ROLLBACK: "Use merge strategy instead"
+
+### Phase 3: Systematic Conflict Resolution
+
+**ACTION resolve-rag-service-conflict:**
+- FILE: `python/src/server/services/search/rag_service.py`
+- OPERATION: Merge error handling improvements with main changes
+- PRESERVE: OpenAI authentication error handling
+- PRESERVE: Error message sanitization
+- VALIDATE: `cd python && uv run pytest tests/test_rag_*.py -v`
+- IF_FAIL: "Review specific test failures, adjust error handling"
+- ROLLBACK: "Accept main version, re-implement error handling"
+
+**ACTION resolve-knowledge-page-conflict:**
+- FILE: `archon-ui-main/src/pages/KnowledgeBasePage.tsx`
+- OPERATION: Adapt error handling to TanStack Query patterns
+- PORT: Error display logic to new component structure
+- UPDATE: Import statements to use features/ directory
+- VALIDATE: `cd archon-ui-main && npm run test KnowledgeBasePage`
+- IF_FAIL: "Check TanStack Query error handling patterns"
+- ROLLBACK: "Use main version, re-add error handling manually"
+
+**ACTION handle-deleted-service:**
+- ANALYZE: Functions from `knowledgeBaseService.ts` 
+- PORT: Error handling utilities to appropriate TanStack Query hooks
+- UPDATE: All imports using the deleted service
+- REMOVE: References to deleted service
+- VALIDATE: `cd archon-ui-main && npm run build`
+- IF_FAIL: "Find missing function mappings, update imports"
+- ROLLBACK: "Recreate minimal service file"
+
+### Phase 4: Architecture Adaptation
+
+**ACTION adapt-error-handling:**
+- UPDATE: Frontend error handling to use TanStack Query error states
+- PRESERVE: Error message sanitization functions
+- PRESERVE: OpenAI API key masking
+- UPDATE: Error boundary components if needed
+- VALIDATE: Manual test of error scenarios
+- IF_FAIL: "Check error state propagation"
+- ROLLBACK: "Keep old error handling patterns where possible"
+
+**ACTION update-imports:**
+- SCAN: All files for old import paths
+- UPDATE: Import paths to new features/ structure
+- REMOVE: Unused imports from deleted services
+- ADD: New TanStack Query hook imports
+- VALIDATE: `cd archon-ui-main && npm run lint`
+- IF_FAIL: "Fix remaining import issues"
+- ROLLBACK: "Manual import fixes"
+
+### Phase 5: Comprehensive Validation
+
+**ACTION run-frontend-tests:**
+- COMMAND: `cd archon-ui-main && npm test`
+- FOCUS: Error handling test cases
+- FOCUS: Knowledge base functionality
+- VALIDATE: All tests pass
+- IF_FAIL: "Fix failing tests one by one"
+- ROLLBACK: "Identify and skip non-critical failures"
+
+**ACTION run-backend-tests:**
+- COMMAND: `cd python && uv run pytest tests/ -v`
+- FOCUS: RAG service tests
+- FOCUS: Error handling tests
+- VALIDATE: All tests pass
+- IF_FAIL: "Fix backend test failures"
+- ROLLBACK: "Document test failures for later"
+
+**ACTION manual-error-testing:**
+- TEST: Invalid OpenAI API key scenario
+- TEST: Network timeout scenarios  
+- TEST: Invalid query scenarios
+- VERIFY: Error messages are sanitized
+- VERIFY: Error messages are user-friendly
+- VALIDATE: All error scenarios handled properly
+- IF_FAIL: "Fix specific error handling issues"
+- ROLLBACK: "Document error handling gaps"
+
+**ACTION performance-validation:**
+- TEST: Knowledge base loading performance
+- TEST: Error handling overhead
+- COMPARE: Before and after performance
+- VALIDATE: No significant performance regression
+- IF_FAIL: "Identify and fix performance issues"
+- ROLLBACK: "Accept minor performance trade-offs"
+
+### Phase 6: Finalization
+
+**ACTION complete-rebase:**
+- COMMAND: `git rebase --continue` (for each remaining commit)
+- RESOLVE: Any remaining conflicts systematically
+- VALIDATE: `git log --oneline` shows clean history
+- IF_FAIL: "Continue resolving conflicts"
+- ROLLBACK: "git rebase --abort and retry"
+
+**ACTION final-validation:**
+- RUN: Full test suite
+- RUN: Manual smoke test of key functionality
+- CHECK: All error handling improvements preserved
+- VALIDATE: Branch ready for PR
+- IF_FAIL: "Address remaining issues"
+- ROLLBACK: "Reset to backup branch"
+
+## User Interaction Points
+
+### 1. Initial Confirmation
+**Questions:**
+- Confirm understanding of TanStack Query migration impact
+- Verify that all 33 commits of error handling should be preserved
+- Approve the systematic conflict resolution approach
+
+### 2. Architecture Review
+**Questions:**
+- Review the mapping of knowledgeBaseService.ts to TanStack Query hooks
+- Confirm error handling patterns for the new architecture
+- Approve the adaptation strategy
+
+### 3. Conflict Resolution Review
+**Questions:**
+- Review specific conflict resolutions as they occur
+- Confirm that error handling functionality is preserved
+- Approve any architectural adaptations
+
+## Critical Elements
+
+### Debug Patterns
+- **Frontend Issues**: Check browser console for TanStack Query errors
+- **Backend Issues**: Check logs for RAG service error handling
+- **Integration Issues**: Test full error flow from backend to frontend
+
+### Performance Considerations
+- TanStack Query caching may affect error handling timing
+- Error boundary performance with new architecture
+- Backend error processing overhead
+
+### Security Concerns
+- Ensure error message sanitization still works
+- Verify OpenAI API key masking is preserved
+- Check that no sensitive data leaks in new error flows
+
+### Assumptions
+- TanStack Query migration is complete and stable in main
+- All required TanStack Query hooks are implemented
+- Test suite covers error handling scenarios adequately
+
+## Risk Assessment
+
+### High Risk Areas
+- **Frontend Architecture Mismatch**: TanStack Query vs old service patterns
+- **Lost Functionality**: Functions from deleted knowledgeBaseService.ts
+- **Error Handling Gaps**: New architecture may miss some error cases
+
+### Medium Risk Areas  
+- **Import Path Updates**: Many files may have outdated imports
+- **Test Compatibility**: Tests may need updates for new architecture
+- **Performance Impact**: New error handling patterns may affect performance
+
+### Low Risk Areas
+- **Backend RAG Service**: Likely straightforward merge conflict
+- **Basic Functionality**: Core features should work with proper adaptation
+
+### Rollback Strategy
+1. **Immediate Abort**: `git rebase --abort` returns to original state
+2. **Partial Rollback**: Reset to backup branch and cherry-pick successful commits
+3. **Alternative Approach**: Use merge strategy instead of rebase if conflicts too complex
+
+## Success Criteria
+- [ ] All 33 commits successfully rebased onto main
+- [ ] All error handling functionality preserved
+- [ ] Frontend works with TanStack Query architecture  
+- [ ] Backend RAG service conflicts resolved
+- [ ] All tests passing
+- [ ] Error scenarios work as expected
+- [ ] No performance regression
+- [ ] Security features (sanitization) intact
+
+## Output Files
+- `bugfix-issue-362` branch rebased onto main
+- Updated test results
+- Documentation of any architectural changes made
+- List of any functionality that couldn't be preserved (if any)
+
+## Estimated Timeline
+- **Phase 1-2**: 30 minutes (preparation and setup)
+- **Phase 3**: 1-2 hours (conflict resolution)
+- **Phase 4**: 1 hour (architecture adaptation)
+- **Phase 5**: 1 hour (validation)
+- **Phase 6**: 30 minutes (finalization)
+- **Total**: 4-5 hours
+
+## Quality Checklist
+- [x] All changes identified (3 main conflicts + architecture adaptation)
+- [x] Dependencies mapped (TanStack Query, error handling, RAG service)
+- [x] Each task has validation (tests and manual checks)
+- [x] Rollback steps included (backup branches, abort strategies)
+- [x] Debug strategies provided (frontend, backend, integration)
+- [x] Performance impact noted (TanStack Query caching, error processing)
+- [x] Security checked (error sanitization, API key masking)
+- [x] No missing edge cases (comprehensive error scenario testing)
\ No newline at end of file
diff --git a/archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts b/archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts
index 1d286630..0df5cf4d 100644
--- a/archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts
+++ b/archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts
@@ -11,6 +11,7 @@ import { useActiveOperations } from "../progress/hooks";
 import { progressKeys } from "../progress/hooks/useProgressQueries";
 import type { ActiveOperation, ActiveOperationsResponse } from "../progress/types";
 import { knowledgeService } from "../services";
+import { getDisplayErrorMessage, type EnhancedError } from "../utils/errorHandler";
 import type {
   CrawlRequest,
   CrawlStartResponse,
@@ -273,7 +274,10 @@ export function useCrawlUrl() {
         queryClient.setQueryData(progressKeys.list(), context.previousOperations);
       }
 
-      const errorMessage = error instanceof Error ? error.message : "Failed to start crawl";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError 
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to start crawl");
       showToast(errorMessage, "error");
     },
   });
@@ -449,8 +453,10 @@ export function useUploadDocument() {
         queryClient.setQueryData(progressKeys.list(), context.previousOperations);
       }
 
-      // Display the actual error message from backend
-      const message = error instanceof Error ? error.message : "Failed to upload document";
+      // Use enhanced error handling for better user experience
+      const message = (error as EnhancedError)?.isOpenAIError 
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to upload document");
       showToast(message, "error");
     },
   });
@@ -521,7 +527,10 @@ export function useDeleteKnowledgeItem() {
         queryClient.setQueryData(queryKey, data);
       }
 
-      const errorMessage = error instanceof Error ? error.message : "Failed to delete item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError 
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to delete item");
       showToast(errorMessage, "error");
     },
     onSuccess: (data) => {
@@ -568,7 +577,10 @@ export function useUpdateKnowledgeItem() {
         queryClient.setQueryData(knowledgeKeys.detail(variables.sourceId), context.previousItem);
       }
 
-      const errorMessage = error instanceof Error ? error.message : "Failed to update item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError 
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to update item");
       showToast(errorMessage, "error");
     },
     onSuccess: (_data, { sourceId }) => {
@@ -604,7 +616,10 @@ export function useRefreshKnowledgeItem() {
       return data;
     },
     onError: (error) => {
-      const errorMessage = error instanceof Error ? error.message : "Failed to refresh item";
+      // Use enhanced error handling for better user experience
+      const errorMessage = (error as EnhancedError)?.isOpenAIError 
+        ? getDisplayErrorMessage(error as EnhancedError)
+        : (error instanceof Error ? error.message : "Failed to refresh item");
       showToast(errorMessage, "error");
     },
   });
diff --git a/archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts b/archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts
new file mode 100644
index 00000000..0c135515
--- /dev/null
+++ b/archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts
@@ -0,0 +1,99 @@
+/**
+ * Enhanced API client for knowledge base operations with OpenAI error handling
+ * Built on top of the ETag-aware API client with additional error parsing
+ */
+
+import { callAPIWithETag } from "../../projects/shared/apiWithEtag";
+import { parseKnowledgeBaseError, type EnhancedError } from "../utils/errorHandler";
+
+/**
+ * API call wrapper with enhanced OpenAI error handling
+ * Uses ETag caching for efficiency while adding specialized error parsing
+ */
+export async function callKnowledgeAPI<T>(
+  endpoint: string,
+  options: RequestInit = {}
+): Promise<T> {
+  try {
+    // Use the ETag-aware API client for caching benefits
+    return await callAPIWithETag<T>(endpoint, options);
+  } catch (error: any) {
+    // Apply enhanced error parsing for OpenAI errors
+    const enhancedError = parseKnowledgeBaseError({
+      status: error.statusCode || error.status,
+      error: error.message || error.detail || error,
+      detail: error.detail
+    });
+    
+    // Preserve the original error structure but enhance with our parsing
+    const finalError = error as EnhancedError;
+    finalError.isOpenAIError = enhancedError.isOpenAIError;
+    finalError.errorDetails = enhancedError.errorDetails;
+    finalError.message = enhancedError.message;
+    
+    throw finalError;
+  }
+}
+
+/**
+ * Enhanced upload wrapper that handles FormData and file uploads with better error handling
+ */
+export async function uploadWithEnhancedErrors(
+  endpoint: string,
+  formData: FormData,
+  timeoutMs: number = 30000
+): Promise<any> {
+  const API_BASE_URL = "/api"; // Use same base as other services
+  
+  let fullUrl = `${API_BASE_URL}${endpoint}`;
+  
+  // Handle test environment URLs
+  if (typeof process !== "undefined" && process.env?.NODE_ENV === "test") {
+    const testHost = process.env?.VITE_HOST || "localhost";
+    const testPort = process.env?.ARCHON_SERVER_PORT || "8181";
+    fullUrl = `http://${testHost}:${testPort}${fullUrl}`;
+  }
+  
+  try {
+    const response = await fetch(fullUrl, {
+      method: "POST",
+      body: formData,
+      signal: AbortSignal.timeout(timeoutMs),
+    });
+
+    if (!response.ok) {
+      let errorData;
+      try {
+        errorData = await response.json();
+      } catch {
+        const text = await response.text();
+        errorData = { status: response.status, error: text };
+      }
+      
+      // Apply enhanced error parsing
+      const enhancedError = parseKnowledgeBaseError({
+        status: response.status,
+        error: errorData.detail || errorData.error || errorData,
+        detail: errorData.detail
+      });
+      
+      throw enhancedError;
+    }
+
+    return response.json();
+  } catch (error: any) {
+    // Check if it's a timeout error
+    if (error instanceof Error && error.name === 'AbortError') {
+      const timeoutError = parseKnowledgeBaseError(new Error('Request timed out'));
+      throw timeoutError;
+    }
+    
+    // If it's already an enhanced error, re-throw it
+    if (error && typeof error === 'object' && 'isOpenAIError' in error) {
+      throw error;
+    }
+    
+    // Parse other errors through the error handler for consistency
+    throw parseKnowledgeBaseError(error);
+  }
+}
\ No newline at end of file
diff --git a/archon-ui-main/src/features/knowledge/services/knowledgeService.ts b/archon-ui-main/src/features/knowledge/services/knowledgeService.ts
index 68cb81af..d6da81ca 100644
--- a/archon-ui-main/src/features/knowledge/services/knowledgeService.ts
+++ b/archon-ui-main/src/features/knowledge/services/knowledgeService.ts
@@ -3,7 +3,8 @@
  * Handles all knowledge-related API operations using TanStack Query patterns
  */
 
-import { callAPIWithETag, invalidateETagCache } from "../../projects/shared/apiWithEtag";
+import { invalidateETagCache } from "../../projects/shared/apiWithEtag";
+import { callKnowledgeAPI, uploadWithEnhancedErrors } from "./apiWithEnhancedErrors";
 import type {
   ChunksResponse,
   CodeExamplesResponse,
@@ -40,21 +41,21 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/summary${queryString ? `?${queryString}` : ""}`;
 
-    return callAPIWithETag<KnowledgeItemsResponse>(endpoint);
+    return callKnowledgeAPI<KnowledgeItemsResponse>(endpoint);
   },
 
   /**
    * Get a specific knowledge item
    */
   async getKnowledgeItem(sourceId: string): Promise<KnowledgeItem> {
-    return callAPIWithETag<KnowledgeItem>(`/api/knowledge-items/${sourceId}`);
+    return callKnowledgeAPI<KnowledgeItem>(`/api/knowledge-items/${sourceId}`);
   },
 
   /**
    * Delete a knowledge item
    */
   async deleteKnowledgeItem(sourceId: string): Promise<{ success: boolean; message: string }> {
-    const response = await callAPIWithETag<{ success: boolean; message: string }>(`/api/knowledge-items/${sourceId}`, {
+    const response = await callKnowledgeAPI<{ success: boolean; message: string }>(`/api/knowledge-items/${sourceId}`, {
       method: "DELETE",
     });
 
@@ -70,7 +71,7 @@ export const knowledgeService = {
    * Update a knowledge item
    */
   async updateKnowledgeItem(sourceId: string, updates: Partial<KnowledgeItem>): Promise<KnowledgeItem> {
-    const response = await callAPIWithETag<KnowledgeItem>(`/api/knowledge-items/${sourceId}`, {
+    const response = await callKnowledgeAPI<KnowledgeItem>(`/api/knowledge-items/${sourceId}`, {
       method: "PUT",
       body: JSON.stringify(updates),
     });
@@ -87,7 +88,7 @@ export const knowledgeService = {
    * Start crawling a URL
    */
   async crawlUrl(request: CrawlRequest): Promise<CrawlStartResponse> {
-    const response = await callAPIWithETag<CrawlStartResponse>("/api/knowledge-items/crawl", {
+    const response = await callKnowledgeAPI<CrawlStartResponse>("/api/knowledge-items/crawl", {
       method: "POST",
       body: JSON.stringify(request),
     });
@@ -103,7 +104,7 @@ export const knowledgeService = {
    * Refresh an existing knowledge item
    */
   async refreshKnowledgeItem(sourceId: string): Promise<RefreshResponse> {
-    const response = await callAPIWithETag<RefreshResponse>(`/api/knowledge-items/${sourceId}/refresh`, {
+    const response = await callKnowledgeAPI<RefreshResponse>(`/api/knowledge-items/${sourceId}/refresh`, {
       method: "POST",
     });
 
@@ -132,38 +133,21 @@ export const knowledgeService = {
       formData.append("tags", JSON.stringify(metadata.tags));
     }
 
-    // Use fetch directly for file upload (FormData doesn't work well with our ETag wrapper)
-    // In test environment, we need absolute URLs
-    let uploadUrl = "/api/documents/upload";
-    if (typeof process !== "undefined" && process.env?.NODE_ENV === "test") {
-      const testHost = process.env?.VITE_HOST || "localhost";
-      const testPort = process.env?.ARCHON_SERVER_PORT || "8181";
-      uploadUrl = `http://${testHost}:${testPort}${uploadUrl}`;
-    }
-
-    const response = await fetch(uploadUrl, {
-      method: "POST",
-      body: formData,
-      signal: AbortSignal.timeout(30000), // 30 second timeout for file uploads
-    });
-
-    if (!response.ok) {
-      const error = await response.json();
-      throw new Error(error.error || `HTTP ${response.status}`);
-    }
+    // Use enhanced upload wrapper with OpenAI error handling
+    const result = await uploadWithEnhancedErrors("/documents/upload", formData, 30000);
 
     // Invalidate list cache
     invalidateETagCache("/api/knowledge-items");
     invalidateETagCache("/api/knowledge-items/summary");
 
-    return response.json();
+    return result;
   },
 
   /**
    * Stop a running crawl
    */
   async stopCrawl(progressId: string): Promise<{ success: boolean; message: string }> {
-    return callAPIWithETag<{ success: boolean; message: string }>(`/api/knowledge-items/stop/${progressId}`, {
+    return callKnowledgeAPI<{ success: boolean; message: string }>(`/api/knowledge-items/stop/${progressId}`, {
       method: "POST",
     });
   },
@@ -193,7 +177,7 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/${sourceId}/chunks${queryString ? `?${queryString}` : ""}`;
 
-    return callAPIWithETag<ChunksResponse>(endpoint);
+    return callKnowledgeAPI<ChunksResponse>(endpoint);
   },
 
   /**
@@ -217,14 +201,14 @@ export const knowledgeService = {
     const queryString = params.toString();
     const endpoint = `/api/knowledge-items/${sourceId}/code-examples${queryString ? `?${queryString}` : ""}`;
 
-    return callAPIWithETag<CodeExamplesResponse>(endpoint);
+    return callKnowledgeAPI<CodeExamplesResponse>(endpoint);
   },
 
   /**
    * Search the knowledge base
    */
   async searchKnowledgeBase(options: SearchOptions): Promise<SearchResultsResponse> {
-    return callAPIWithETag("/api/knowledge-items/search", {
+    return callKnowledgeAPI("/api/knowledge-items/search", {
       method: "POST",
       body: JSON.stringify(options),
     });
@@ -234,6 +218,6 @@ export const knowledgeService = {
    * Get available knowledge sources
    */
   async getKnowledgeSources(): Promise<KnowledgeSource[]> {
-    return callAPIWithETag<KnowledgeSource[]>("/api/knowledge-items/sources");
+    return callKnowledgeAPI<KnowledgeSource[]>("/api/knowledge-items/sources");
   },
 };
diff --git a/archon-ui-main/src/features/knowledge/utils/errorHandler.ts b/archon-ui-main/src/features/knowledge/utils/errorHandler.ts
new file mode 100644
index 00000000..a47cb112
--- /dev/null
+++ b/archon-ui-main/src/features/knowledge/utils/errorHandler.ts
@@ -0,0 +1,208 @@
+/**
+ * Error handling utilities for knowledge base operations
+ * 
+ * Provides specialized error handling for OpenAI API errors,
+ * rate limiting, and quota exhaustion scenarios.
+ * 
+ * Related to GitHub issue #362 - improves user experience
+ * by displaying clear error messages when OpenAI API fails.
+ */
+
+export interface OpenAIErrorDetails {
+  error: string;
+  message: string;
+  error_type: 'quota_exhausted' | 'rate_limit' | 'api_error' | 'authentication_failed';
+  tokens_used?: number;
+  retry_after?: number;
+  api_key_prefix?: string;
+}
+
+export interface EnhancedError extends Error {
+  statusCode?: number;
+  errorDetails?: OpenAIErrorDetails;
+  isOpenAIError?: boolean;
+}
+
+/**
+ * Create a fallback error for cases where input is invalid or unparseable
+ */
+function createFallbackError(reason: string): EnhancedError {
+  return Object.assign(new Error('Unknown error occurred'), {
+    errorDetails: {
+      error: 'unknown',
+      message: `${reason}. Please try again or contact support if the problem persists.`,
+      error_type: 'api_error' as const
+    }
+  }) as EnhancedError;
+}
+
+/**
+ * Check if an object can be safely serialized (no circular references)
+ */
+function isSafeObject(obj: any): boolean {
+  try {
+    JSON.stringify(obj);
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+/**
+ * Parse and enhance API errors from knowledge base operations
+ */
+export function parseKnowledgeBaseError(error: any): EnhancedError {
+  // Enhanced input validation
+  if (!error) {
+    return createFallbackError('No error information provided');
+  }
+  
+  if (typeof error === 'string') {
+    return Object.assign(new Error(error), {
+      errorDetails: {
+        error: 'api_error',
+        message: error,
+        error_type: 'api_error' as const
+      }
+    }) as EnhancedError;
+  }
+  
+  if (typeof error !== 'object' || error === null) {
+    return createFallbackError('Invalid error format');
+  }
+
+  // Check for empty objects or objects with no useful properties
+  if (error.constructor === Object && Object.keys(error).length === 0) {
+    return createFallbackError('Empty error object received');
+  }
+
+  // Check for circular references and object safety
+  if (!isSafeObject(error)) {
+    return createFallbackError('Error object contains circular references');
+  }
+
+  // Handle Error instances that might have been serialized/deserialized
+  if (error instanceof Error || (error.name && error.message && error.stack)) {
+    // This is likely an Error object, proceed with parsing
+  } else if (!error.message && !error.error && !error.detail && !error.status) {
+    // Object doesn't have any recognizable error properties
+    return createFallbackError('Unrecognized error object structure');
+  }
+  
+  const enhancedError: EnhancedError = new Error(error.message || 'Unknown error');
+  
+  // Check if this is an HTTP response error with JSON details
+  if (error && typeof error === 'object') {
+    // Handle fetch Response errors
+    if (error.status || error.statusCode) {
+      enhancedError.statusCode = error.status || error.statusCode;
+    }
+    
+    // Parse error details from API response
+    if (error.error || error.detail) {
+      const errorData = error.error || error.detail;
+      
+      // Check if it's an OpenAI-specific error
+      if (typeof errorData === 'object' && errorData.error_type) {
+        enhancedError.isOpenAIError = true;
+        enhancedError.errorDetails = errorData as OpenAIErrorDetails;
+        
+        // Override the message with the detailed error message
+        enhancedError.message = errorData.message || errorData.error || enhancedError.message;
+      }
+    }
+  }
+  
+  return enhancedError;
+}
+
+/**
+ * Get user-friendly error message for display in UI
+ */
+export function getDisplayErrorMessage(error: EnhancedError): string {
+  if (error.isOpenAIError && error.errorDetails) {
+    switch (error.errorDetails.error_type) {
+      case 'quota_exhausted':
+        return `OpenAI API quota exhausted. Please add credits to your OpenAI account or check your billing settings.`;
+      
+      case 'rate_limit':
+        return `OpenAI API rate limit exceeded. Please wait a moment and try again.`;
+      
+      case 'authentication_failed':
+        return `Invalid or expired OpenAI API key. Please check your API key in settings.`;
+      
+      case 'api_error':
+        return `OpenAI API error: ${error.errorDetails.message}. Please check your API key configuration.`;
+      
+      default:
+        return error.errorDetails.message || error.message;
+    }
+  }
+  
+  // Handle HTTP status codes
+  if (error.statusCode) {
+    switch (error.statusCode) {
+      case 401:
+        return 'Authentication failed. Please check your API key.';
+      case 429:
+        return 'API rate limit exceeded. Please wait a moment and try again.';
+      case 502:
+        return 'API service unavailable. Please try again in a few minutes.';
+      case 503:
+        return 'Service temporarily unavailable. Please try again later.';
+      default:
+        return error.message;
+    }
+  }
+  
+  return error.message || 'An unexpected error occurred.';
+}
+
+/**
+ * Get error severity level for UI styling
+ */
+export function getErrorSeverity(error: EnhancedError): 'error' | 'warning' | 'info' {
+  if (error.isOpenAIError && error.errorDetails) {
+    switch (error.errorDetails.error_type) {
+      case 'quota_exhausted':
+        return 'error'; // Critical - user action required
+      case 'authentication_failed':
+        return 'error'; // Critical - configuration issue
+      case 'rate_limit':
+        return 'warning'; // Temporary - retry may work
+      case 'api_error':
+        return 'error'; // Likely configuration issue
+      default:
+        return 'error';
+    }
+  }
+  
+  if (error.statusCode && error.statusCode >= 500) {
+    return 'error'; // Server errors
+  }
+  
+  return 'warning'; // Default to warning for other errors
+}
+
+/**
+ * Get suggested action for the user based on error type
+ */
+export function getErrorAction(error: EnhancedError): string | null {
+  if (error.isOpenAIError && error.errorDetails) {
+    switch (error.errorDetails.error_type) {
+      case 'quota_exhausted':
+        return 'Check your OpenAI billing dashboard and add credits';
+      case 'authentication_failed':
+        return 'Verify your OpenAI API key in Settings';
+      case 'rate_limit':
+        const retryAfter = error.errorDetails.retry_after || 30;
+        return `Wait ${retryAfter} seconds and try again`;
+      case 'api_error':
+        return 'Verify your OpenAI API key in Settings';
+      default:
+        return null;
+    }
+  }
+  
+  return null;
+}
\ No newline at end of file
diff --git a/python/src/agents/base_agent.py b/python/src/agents/base_agent.py
index 7ea03c03..18680d3a 100644
--- a/python/src/agents/base_agent.py
+++ b/python/src/agents/base_agent.py
@@ -216,7 +216,7 @@ class BaseAgent(ABC, Generic[DepsT, OutputT]):
             self.logger.info(f"Agent {self.name} completed successfully")
             # PydanticAI returns a RunResult with data attribute
             return result.data
-        except asyncio.TimeoutError:
+        except TimeoutError:
             self.logger.error(f"Agent {self.name} timed out after 120 seconds")
             raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
         except Exception as e:
diff --git a/python/src/mcp_server/features/documents/document_tools.py b/python/src/mcp_server/features/documents/document_tools.py
index acc39975..213f00cf 100644
--- a/python/src/mcp_server/features/documents/document_tools.py
+++ b/python/src/mcp_server/features/documents/document_tools.py
@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
+
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url
diff --git a/python/src/mcp_server/features/documents/version_tools.py b/python/src/mcp_server/features/documents/version_tools.py
index b5033c6d..f4cd089f 100644
--- a/python/src/mcp_server/features/documents/version_tools.py
+++ b/python/src/mcp_server/features/documents/version_tools.py
@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
+
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url
diff --git a/python/src/mcp_server/features/feature_tools.py b/python/src/mcp_server/features/feature_tools.py
index 5581a5cc..0a73a539 100644
--- a/python/src/mcp_server/features/feature_tools.py
+++ b/python/src/mcp_server/features/feature_tools.py
@@ -9,8 +9,8 @@ import logging
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
+
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url
diff --git a/python/src/mcp_server/features/projects/project_tools.py b/python/src/mcp_server/features/projects/project_tools.py
index 0f002412..a54a8079 100644
--- a/python/src/mcp_server/features/projects/project_tools.py
+++ b/python/src/mcp_server/features/projects/project_tools.py
@@ -11,8 +11,8 @@ import logging
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
+
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import (
     get_default_timeout,
diff --git a/python/src/mcp_server/features/tasks/task_tools.py b/python/src/mcp_server/features/tasks/task_tools.py
index 1276e357..b0436641 100644
--- a/python/src/mcp_server/features/tasks/task_tools.py
+++ b/python/src/mcp_server/features/tasks/task_tools.py
@@ -11,8 +11,8 @@ from typing import Any
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
+
 from src.mcp_server.utils.error_handling import MCPErrorFormatter
 from src.mcp_server.utils.timeout_config import get_default_timeout
 from src.server.config.service_discovery import get_api_url
diff --git a/python/src/mcp_server/mcp_server.py b/python/src/mcp_server/mcp_server.py
index 5d6002b4..ab19c6b9 100644
--- a/python/src/mcp_server/mcp_server.py
+++ b/python/src/mcp_server/mcp_server.py
@@ -29,7 +29,6 @@ from pathlib import Path
 from typing import Any
 
 from dotenv import load_dotenv
-
 from mcp.server.fastmcp import Context, FastMCP
 
 # Add the project root to Python path for imports
diff --git a/python/src/mcp_server/modules/rag_module.py b/python/src/mcp_server/modules/rag_module.py
index 8686a75c..dec7f6da 100644
--- a/python/src/mcp_server/modules/rag_module.py
+++ b/python/src/mcp_server/modules/rag_module.py
@@ -16,7 +16,6 @@ import os
 from urllib.parse import urljoin
 
 import httpx
-
 from mcp.server.fastmcp import Context, FastMCP
 
 # Import service discovery for HTTP communication
diff --git a/python/src/server/api_routes/knowledge_api.py b/python/src/server/api_routes/knowledge_api.py
index 985f450a..62fd576e 100644
--- a/python/src/server/api_routes/knowledge_api.py
+++ b/python/src/server/api_routes/knowledge_api.py
@@ -53,6 +53,92 @@ crawl_semaphore = asyncio.Semaphore(CONCURRENT_CRAWL_LIMIT)
 active_crawl_tasks: dict[str, asyncio.Task] = {}
 
 
+def _sanitize_openai_error(error_message: str) -> str:
+    """Sanitize OpenAI API error messages to prevent information disclosure."""
+    import re
+
+    # Input validation
+    if not isinstance(error_message, str):
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+    if not error_message.strip():
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+
+    # Common patterns to sanitize with bounded quantifiers to prevent ReDoS
+    sanitized_patterns = {
+        r'https?://[^\s]{1,200}': '[REDACTED_URL]',  # Remove URLs (bounded)
+        r'sk-[a-zA-Z0-9]{48}': '[REDACTED_KEY]',     # Remove OpenAI API keys
+        r'"[^"]{1,100}auth[^"]{1,100}"': '[REDACTED_AUTH]',     # Remove auth details (bounded)
+        r'org-[a-zA-Z0-9]{24}': '[REDACTED_ORG]',   # Remove OpenAI organization IDs
+        r'proj_[a-zA-Z0-9]{10,20}': '[REDACTED_PROJ]', # Remove OpenAI project IDs (bounded)
+        r'req_[a-zA-Z0-9]{6,20}': '[REDACTED_REQ]',   # Remove OpenAI request IDs (bounded)
+        r'user-[a-zA-Z0-9]{10,20}': '[REDACTED_USER]', # Remove OpenAI user IDs (bounded)
+        r'sess_[a-zA-Z0-9]{10,20}': '[REDACTED_SESS]', # Remove session IDs (bounded)
+        r'Bearer\s+[^\s]{1,200}': 'Bearer [REDACTED_AUTH_TOKEN]', # Remove bearer tokens (bounded)
+    }
+
+    sanitized = error_message
+    for pattern, replacement in sanitized_patterns.items():
+        sanitized = re.sub(pattern, replacement, sanitized, flags=re.IGNORECASE)
+
+    # Check for sensitive words after pattern replacement
+    sensitive_words = ['internal', 'server', 'token']
+    # Only check for 'endpoint' if it's not part of our redacted URL pattern
+    if 'endpoint' in sanitized.lower() and '[REDACTED_URL]' not in sanitized:
+        sensitive_words.append('endpoint')
+
+    # Return generic message if still contains sensitive info
+    if any(word in sanitized.lower() for word in sensitive_words):
+        return "OpenAI API encountered an error. Please verify your API key and quota."
+
+    return sanitized
+
+
+async def _validate_openai_api_key() -> None:
+    """
+    Validate OpenAI API key is present and working before starting operations.
+    
+    Raises:
+        HTTPException: 401 if API key is invalid/missing, 429 if quota exhausted
+    """
+    try:
+        # Test the API key with a minimal embedding request
+        from ..services.embeddings.embedding_service import create_embedding
+
+        # Try to create a test embedding with minimal content
+        await create_embedding(text="test")
+
+    except Exception as e:
+        # Import embedding exceptions for specific error handling
+        from ..services.embeddings.embedding_exceptions import (
+            EmbeddingAuthenticationError,
+            EmbeddingQuotaExhaustedError,
+        )
+
+        if isinstance(e, EmbeddingAuthenticationError):
+            raise HTTPException(
+                status_code=401,
+                detail={
+                    "error": "Invalid OpenAI API key",
+                    "message": "Please verify your OpenAI API key in Settings before starting a crawl.",
+                    "error_type": "authentication_failed"
+                }
+            )
+        elif isinstance(e, EmbeddingQuotaExhaustedError):
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI quota exhausted",
+                    "message": "Your OpenAI API key has no remaining credits. Please add credits to your account.",
+                    "error_type": "quota_exhausted"
+                }
+            )
+        else:
+            # For any other errors, allow the operation to continue
+            # The error will be caught later during actual processing
+            logger.warning(f"API key validation failed with unexpected error: {e}")
+            pass
+
+
 # Request Models
 class KnowledgeItemRequest(BaseModel):
     url: str
@@ -479,6 +565,9 @@ async def get_knowledge_item_code_examples(
 @router.post("/knowledge-items/{source_id}/refresh")
 async def refresh_knowledge_item(source_id: str):
     """Refresh a knowledge item by re-crawling its URL with the same metadata."""
+    # CRITICAL: Validate OpenAI API key before starting refresh
+    await _validate_openai_api_key()
+
     try:
         safe_logfire_info(f"Starting knowledge item refresh | source_id={source_id}")
 
@@ -597,6 +686,9 @@ async def crawl_knowledge_item(request: KnowledgeItemRequest):
     if not request.url.startswith(("http://", "https://")):
         raise HTTPException(status_code=422, detail="URL must start with http:// or https://")
 
+    # CRITICAL: Validate OpenAI API key before starting crawl
+    await _validate_openai_api_key()
+
     try:
         safe_logfire_info(
             f"Starting knowledge item crawl | url={str(request.url)} | knowledge_type={request.knowledge_type} | tags={request.tags}"
@@ -750,6 +842,9 @@ async def upload_document(
     knowledge_type: str = Form("technical"),
 ):
     """Upload and process a document with progress tracking."""
+    # CRITICAL: Validate OpenAI API key before starting upload
+    await _validate_openai_api_key()
+
     try:
         # DETAILED LOGGING: Track knowledge_type parameter flow
         safe_logfire_info(
@@ -974,10 +1069,73 @@ async def perform_rag_query(request: RagQueryRequest):
     except HTTPException:
         raise
     except Exception as e:
-        safe_logfire_error(
-            f"RAG query failed | error={str(e)} | query={request.query[:50]} | source={request.source}"
+        # Import embedding exceptions for specific error handling
+        from ..services.embeddings.embedding_exceptions import (
+            EmbeddingAPIError,
+            EmbeddingAuthenticationError,
+            EmbeddingQuotaExhaustedError,
+            EmbeddingRateLimitError,
         )
-        raise HTTPException(status_code=500, detail={"error": f"RAG query failed: {str(e)}"})
+
+        # Handle specific OpenAI/embedding errors with detailed messages
+        if isinstance(e, EmbeddingAuthenticationError):
+            safe_logfire_error(
+                f"OpenAI authentication failed during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=401,
+                detail={
+                    "error": "OpenAI API authentication failed",
+                    "message": "Invalid or expired OpenAI API key. Please check your API key in settings.",
+                    "error_type": "authentication_failed",
+                    "api_key_prefix": getattr(e, "api_key_prefix", None),
+                }
+            )
+        elif isinstance(e, EmbeddingQuotaExhaustedError):
+            safe_logfire_error(
+                f"OpenAI quota exhausted during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI API quota exhausted",
+                    "message": "Your OpenAI API key has no remaining credits. Please add credits to your OpenAI account or check your billing settings.",
+                    "error_type": "quota_exhausted",
+                    "tokens_used": getattr(e, "tokens_used", None),
+                }
+            )
+        elif isinstance(e, EmbeddingRateLimitError):
+            safe_logfire_error(
+                f"OpenAI rate limit hit during RAG query | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "OpenAI API rate limit exceeded",
+                    "message": "Too many requests to OpenAI API. Please wait a moment and try again.",
+                    "error_type": "rate_limit",
+                    "retry_after": 30,  # Suggest 30 second wait
+                }
+            )
+        elif isinstance(e, EmbeddingAPIError):
+            safe_logfire_error(
+                f"OpenAI API error during RAG query | error={str(e)} | query={request.query[:50]} | source={request.source}"
+            )
+            sanitized_message = _sanitize_openai_error(str(e))
+            raise HTTPException(
+                status_code=502,
+                detail={
+                    "error": "OpenAI API error",
+                    "message": f"OpenAI API error: {sanitized_message}",
+                    "error_type": "api_error",
+                }
+            )
+        else:
+            # Generic error handling for other exceptions
+            safe_logfire_error(
+                f"RAG query failed | error={str(e)} | query={request.query[:50]} | source={request.source}"
+            )
+            raise HTTPException(status_code=500, detail={"error": f"RAG query failed: {str(e)}"})
 
 
 @router.post("/rag/code-examples")
diff --git a/python/src/server/main.py b/python/src/server/main.py
index b2269420..fabf8d1b 100644
--- a/python/src/server/main.py
+++ b/python/src/server/main.py
@@ -113,7 +113,7 @@ async def lifespan(app: FastAPI):
         _initialization_complete = True
         api_logger.info("🎉 Archon backend started successfully!")
 
-    except Exception as e:
+    except Exception:
         api_logger.error("❌ Failed to start backend", exc_info=True)
         raise
 
@@ -135,7 +135,7 @@ async def lifespan(app: FastAPI):
 
         api_logger.info("✅ Cleanup completed")
 
-    except Exception as e:
+    except Exception:
         api_logger.error("❌ Error during shutdown", exc_info=True)
 
 
diff --git a/python/src/server/services/crawling/crawling_service.py b/python/src/server/services/crawling/crawling_service.py
index e05cd6ec..ae2576a9 100644
--- a/python/src/server/services/crawling/crawling_service.py
+++ b/python/src/server/services/crawling/crawling_service.py
@@ -486,7 +486,7 @@ class CrawlingService:
                     logger.error("Code extraction failed, continuing crawl without code examples", exc_info=True)
                     safe_logfire_error(f"Code extraction failed | error={e}")
                     code_examples_count = 0
-                    
+
                     # Report code extraction failure to progress tracker
                     if self.progress_tracker:
                         await self.progress_tracker.update(
diff --git a/python/src/server/services/crawling/helpers/url_handler.py b/python/src/server/services/crawling/helpers/url_handler.py
index 33c75c57..dfc15599 100644
--- a/python/src/server/services/crawling/helpers/url_handler.py
+++ b/python/src/server/services/crawling/helpers/url_handler.py
@@ -6,8 +6,7 @@ Handles URL transformations and validations.
 
 import hashlib
 import re
-from urllib.parse import urlparse, urljoin
-from typing import List, Optional
+from urllib.parse import urljoin, urlparse
 
 from ....config.logfire_config import get_logger
 
@@ -33,8 +32,8 @@ class URLHandler:
         except Exception as e:
             logger.warning(f"Error checking if URL is sitemap: {e}")
             return False
-    
-    @staticmethod  
+
+    @staticmethod
     def is_markdown(url: str) -> bool:
         """
         Check if a URL points to a markdown file (.md, .mdx, .markdown).
@@ -274,9 +273,9 @@ class URLHandler:
             # Fallback: use a hash of the error message + url to still get something unique
             fallback = f"error_{redacted}_{str(e)}"
             return hashlib.sha256(fallback.encode("utf-8")).hexdigest()[:16]
-    
+
     @staticmethod
-    def extract_markdown_links(content: str, base_url: Optional[str] = None) -> List[str]:
+    def extract_markdown_links(content: str, base_url: str | None = None) -> list[str]:
         """
         Extract markdown-style links from text content.
         
@@ -290,10 +289,10 @@ class URLHandler:
         try:
             if not content:
                 return []
-            
+
             # Ultimate URL pattern with comprehensive format support:
             #  1) [text](url) - markdown links
-            #  2) <https://...> - autolinks  
+            #  2) <https://...> - autolinks
             #  3) https://... - bare URLs with protocol
             #  4) //example.com - protocol-relative URLs
             #  5) www.example.com - scheme-less www URLs
@@ -348,7 +347,7 @@ class URLHandler:
                 # Only include HTTP/HTTPS URLs
                 if url.startswith(('http://', 'https://')):
                     urls.append(url)
-            
+
             # Remove duplicates while preserving order
             seen = set()
             unique_urls = []
@@ -356,16 +355,16 @@ class URLHandler:
                 if url not in seen:
                     seen.add(url)
                     unique_urls.append(url)
-            
+
             logger.info(f"Extracted {len(unique_urls)} unique links from content")
             return unique_urls
-            
+
         except Exception as e:
             logger.error(f"Error extracting markdown links: {e}", exc_info=True)
             return []
-    
+
     @staticmethod
-    def is_link_collection_file(url: str, content: Optional[str] = None) -> bool:
+    def is_link_collection_file(url: str, content: str | None = None) -> bool:
         """
         Check if a URL/file appears to be a link collection file like llms.txt.
         
@@ -380,7 +379,7 @@ class URLHandler:
             # Extract filename from URL
             parsed = urlparse(url)
             filename = parsed.path.split('/')[-1].lower()
-            
+
             # Check for specific link collection filenames
             # Note: "full-*" or "*-full" patterns are NOT link collections - they contain complete content, not just links
             link_collection_patterns = [
@@ -391,12 +390,12 @@ class URLHandler:
                 'llms.mdx', 'links.mdx', 'resources.mdx', 'references.mdx',
                 'llms.markdown', 'links.markdown', 'resources.markdown', 'references.markdown',
             ]
-            
+
             # Direct filename match
             if filename in link_collection_patterns:
                 logger.info(f"Detected link collection file by filename: {filename}")
                 return True
-            
+
             # Pattern-based detection for variations, but exclude "full" variants
             # Only match files that are likely link collections, not complete content files
             if filename.endswith(('.txt', '.md', '.mdx', '.markdown')):
@@ -407,7 +406,7 @@ class URLHandler:
                     if any(filename.startswith(pattern + '.') or filename.startswith(pattern + '-') for pattern in base_patterns):
                         logger.info(f"Detected potential link collection file: {filename}")
                         return True
-            
+
             # Content-based detection if content is provided
             if content:
                 # Never treat "full" variants as link collections to preserve single-page behavior
@@ -417,19 +416,19 @@ class URLHandler:
                 # Reuse extractor to avoid regex divergence and maintain consistency
                 extracted_links = URLHandler.extract_markdown_links(content, url)
                 total_links = len(extracted_links)
-                
+
                 # Calculate link density (links per 100 characters)
                 content_length = len(content.strip())
                 if content_length > 0:
                     link_density = (total_links * 100) / content_length
-                    
+
                     # If more than 2% of content is links, likely a link collection
                     if link_density > 2.0 and total_links > 3:
                         logger.info(f"Detected link collection by content analysis: {total_links} links, density {link_density:.2f}%")
                         return True
-            
+
             return False
-            
+
         except Exception as e:
             logger.warning(f"Error checking if file is link collection: {e}", exc_info=True)
             return False
diff --git a/python/src/server/services/embeddings/contextual_embedding_service.py b/python/src/server/services/embeddings/contextual_embedding_service.py
index e72d81a5..7469d5ad 100644
--- a/python/src/server/services/embeddings/contextual_embedding_service.py
+++ b/python/src/server/services/embeddings/contextual_embedding_service.py
@@ -219,4 +219,4 @@ async def generate_contextual_embeddings_batch(
     except Exception as e:
         search_logger.error(f"Error in contextual embedding batch: {e}")
         # Return non-contextual for all chunks
-        return [(chunk, False) for chunk in chunks]
\ No newline at end of file
+        return [(chunk, False) for chunk in chunks]
diff --git a/python/src/server/services/embeddings/embedding_exceptions.py b/python/src/server/services/embeddings/embedding_exceptions.py
index 7a2ae6f9..eec3f629 100644
--- a/python/src/server/services/embeddings/embedding_exceptions.py
+++ b/python/src/server/services/embeddings/embedding_exceptions.py
@@ -99,6 +99,22 @@ class EmbeddingAPIError(EmbeddingError):
             self.metadata["original_error_message"] = str(original_error)
 
 
+class EmbeddingAuthenticationError(EmbeddingError):
+    """
+    Raised when API authentication fails (invalid or expired API key).
+    
+    This is a CRITICAL error that should stop the entire process
+    as continuing would be pointless without valid API access.
+    """
+
+    def __init__(self, message: str, api_key_prefix: str | None = None, **kwargs):
+        super().__init__(message, **kwargs)
+        # Store masked API key prefix for debugging (first 3 chars + ellipsis)
+        self.api_key_prefix = api_key_prefix[:3] + "…" if api_key_prefix and len(api_key_prefix) >= 3 else None
+        if self.api_key_prefix:
+            self.metadata["api_key_prefix"] = self.api_key_prefix
+
+
 class EmbeddingValidationError(EmbeddingError):
     """
     Raised when embedding validation fails (e.g., zero vector detected).
diff --git a/python/src/server/services/knowledge/knowledge_item_service.py b/python/src/server/services/knowledge/knowledge_item_service.py
index de8c9e0a..03d220c7 100644
--- a/python/src/server/services/knowledge/knowledge_item_service.py
+++ b/python/src/server/services/knowledge/knowledge_item_service.py
@@ -143,7 +143,7 @@ class KnowledgeItemService:
                     display_url = source_url
                 else:
                     display_url = first_urls.get(source_id, f"source://{source_id}")
-                
+
                 code_examples_count = code_example_counts.get(source_id, 0)
                 chunks_count = chunk_counts.get(source_id, 0)
 
diff --git a/python/src/server/services/knowledge/knowledge_summary_service.py b/python/src/server/services/knowledge/knowledge_summary_service.py
index cee03305..fc2c2088 100644
--- a/python/src/server/services/knowledge/knowledge_summary_service.py
+++ b/python/src/server/services/knowledge/knowledge_summary_service.py
@@ -5,9 +5,9 @@ Provides lightweight summary data for knowledge items to minimize data transfer.
 Optimized for frequent polling and card displays.
 """
 
-from typing import Any, Optional
+from typing import Any
 
-from ...config.logfire_config import safe_logfire_info, safe_logfire_error
+from ...config.logfire_config import safe_logfire_error, safe_logfire_info
 
 
 class KnowledgeSummaryService:
@@ -29,8 +29,8 @@ class KnowledgeSummaryService:
         self,
         page: int = 1,
         per_page: int = 20,
-        knowledge_type: Optional[str] = None,
-        search: Optional[str] = None,
+        knowledge_type: str | None = None,
+        search: str | None = None,
     ) -> dict[str, Any]:
         """
         Get lightweight summaries of knowledge items.
@@ -51,69 +51,69 @@ class KnowledgeSummaryService:
         """
         try:
             safe_logfire_info(f"Fetching knowledge summaries | page={page} | per_page={per_page}")
-            
+
             # Build base query - select only needed fields, including source_url
             query = self.supabase.from_("archon_sources").select(
                 "source_id, title, summary, metadata, source_url, created_at, updated_at"
             )
-            
+
             # Apply filters
             if knowledge_type:
                 query = query.contains("metadata", {"knowledge_type": knowledge_type})
-            
+
             if search:
                 search_pattern = f"%{search}%"
                 query = query.or_(
                     f"title.ilike.{search_pattern},summary.ilike.{search_pattern}"
                 )
-            
+
             # Get total count
             count_query = self.supabase.from_("archon_sources").select(
                 "*", count="exact", head=True
             )
-            
+
             if knowledge_type:
                 count_query = count_query.contains("metadata", {"knowledge_type": knowledge_type})
-            
+
             if search:
                 search_pattern = f"%{search}%"
                 count_query = count_query.or_(
                     f"title.ilike.{search_pattern},summary.ilike.{search_pattern}"
                 )
-            
+
             count_result = count_query.execute()
             total = count_result.count if hasattr(count_result, "count") else 0
-            
+
             # Apply pagination
             start_idx = (page - 1) * per_page
             query = query.range(start_idx, start_idx + per_page - 1)
             query = query.order("updated_at", desc=True)
-            
+
             # Execute main query
             result = query.execute()
             sources = result.data if result.data else []
-            
+
             # Get source IDs for batch operations
             source_ids = [s["source_id"] for s in sources]
-            
+
             # Batch fetch counts only (no content!)
             summaries = []
-            
+
             if source_ids:
                 # Get document counts in a single query
                 doc_counts = await self._get_document_counts_batch(source_ids)
-                
+
                 # Get code example counts in a single query
                 code_counts = await self._get_code_example_counts_batch(source_ids)
-                
+
                 # Get first URLs in a single query
                 first_urls = await self._get_first_urls_batch(source_ids)
-                
+
                 # Build summaries
                 for source in sources:
                     source_id = source["source_id"]
                     metadata = source.get("metadata", {})
-                    
+
                     # Use the original source_url from the source record (the URL the user entered)
                     # Fall back to first crawled page URL, then to source:// format as last resort
                     source_url = source.get("source_url")
@@ -121,9 +121,9 @@ class KnowledgeSummaryService:
                         first_url = source_url
                     else:
                         first_url = first_urls.get(source_id, f"source://{source_id}")
-                    
+
                     source_type = metadata.get("source_type", "file" if first_url.startswith("file://") else "url")
-                    
+
                     # Extract knowledge_type - check metadata first, otherwise default based on source content
                     # The metadata should always have it if it was crawled properly
                     knowledge_type = metadata.get("knowledge_type")
@@ -132,7 +132,7 @@ class KnowledgeSummaryService:
                         # This handles legacy data that might not have knowledge_type set
                         safe_logfire_info(f"Knowledge type not found in metadata for {source_id}, defaulting to technical")
                         knowledge_type = "technical"
-                    
+
                     summary = {
                         "source_id": source_id,
                         "title": source.get("title", source.get("summary", "Untitled")),
@@ -148,11 +148,11 @@ class KnowledgeSummaryService:
                         "metadata": metadata,  # Include full metadata for debugging
                     }
                     summaries.append(summary)
-            
+
             safe_logfire_info(
                 f"Knowledge summaries fetched | count={len(summaries)} | total={total}"
             )
-            
+
             return {
                 "items": summaries,
                 "total": total,
@@ -160,11 +160,11 @@ class KnowledgeSummaryService:
                 "per_page": per_page,
                 "pages": (total + per_page - 1) // per_page if per_page > 0 else 0,
             }
-            
+
         except Exception as e:
             safe_logfire_error(f"Failed to get knowledge summaries | error={str(e)}")
             raise
-    
+
     async def _get_document_counts_batch(self, source_ids: list[str]) -> dict[str, int]:
         """
         Get document counts for multiple sources in a single query.
@@ -179,7 +179,7 @@ class KnowledgeSummaryService:
             # Use a raw SQL query for efficient counting
             # Group by source_id and count
             counts = {}
-            
+
             # For now, use individual queries but optimize later with raw SQL
             for source_id in source_ids:
                 result = (
@@ -189,13 +189,13 @@ class KnowledgeSummaryService:
                     .execute()
                 )
                 counts[source_id] = result.count if hasattr(result, "count") else 0
-            
+
             return counts
-            
+
         except Exception as e:
             safe_logfire_error(f"Failed to get document counts | error={str(e)}")
-            return {sid: 0 for sid in source_ids}
-    
+            return dict.fromkeys(source_ids, 0)
+
     async def _get_code_example_counts_batch(self, source_ids: list[str]) -> dict[str, int]:
         """
         Get code example counts for multiple sources efficiently.
@@ -208,7 +208,7 @@ class KnowledgeSummaryService:
         """
         try:
             counts = {}
-            
+
             # For now, use individual queries but can optimize with raw SQL later
             for source_id in source_ids:
                 result = (
@@ -218,13 +218,13 @@ class KnowledgeSummaryService:
                     .execute()
                 )
                 counts[source_id] = result.count if hasattr(result, "count") else 0
-            
+
             return counts
-            
+
         except Exception as e:
             safe_logfire_error(f"Failed to get code example counts | error={str(e)}")
-            return {sid: 0 for sid in source_ids}
-    
+            return dict.fromkeys(source_ids, 0)
+
     async def _get_first_urls_batch(self, source_ids: list[str]) -> dict[str, str]:
         """
         Get first URL for each source in a batch.
@@ -244,21 +244,21 @@ class KnowledgeSummaryService:
                 .order("created_at", desc=False)
                 .execute()
             )
-            
+
             # Group by source_id, keeping first URL for each
             urls = {}
             for item in result.data or []:
                 source_id = item["source_id"]
                 if source_id not in urls:
                     urls[source_id] = item["url"]
-            
+
             # Provide defaults for any missing
             for source_id in source_ids:
                 if source_id not in urls:
                     urls[source_id] = f"source://{source_id}"
-            
+
             return urls
-            
+
         except Exception as e:
             safe_logfire_error(f"Failed to get first URLs | error={str(e)}")
-            return {sid: f"source://{sid}" for sid in source_ids}
\ No newline at end of file
+            return {sid: f"source://{sid}" for sid in source_ids}
diff --git a/python/src/server/services/search/hybrid_search_strategy.py b/python/src/server/services/search/hybrid_search_strategy.py
index caad26e6..acc660d4 100644
--- a/python/src/server/services/search/hybrid_search_strategy.py
+++ b/python/src/server/services/search/hybrid_search_strategy.py
@@ -191,4 +191,4 @@ class HybridSearchStrategy:
             except Exception as e:
                 logger.error(f"Hybrid code example search failed: {e}")
                 span.set_attribute("error", str(e))
-                return []
\ No newline at end of file
+                return []
diff --git a/python/src/server/services/search/rag_service.py b/python/src/server/services/search/rag_service.py
index cf89cffe..7356c0c0 100644
--- a/python/src/server/services/search/rag_service.py
+++ b/python/src/server/services/search/rag_service.py
@@ -117,7 +117,8 @@ class RAGService:
 
                 if not query_embedding:
                     logger.error("Failed to create embedding for query")
-                    return []
+                    # Follow fail-fast principle - embedding failure should not return empty results
+                    raise RuntimeError("Failed to create embedding for query - this indicates a configuration or API issue")
 
                 if use_hybrid_search:
                     # Use hybrid strategy
@@ -141,9 +142,22 @@ class RAGService:
                 return results
 
             except Exception as e:
+                # Import embedding exceptions for specific error handling
+                from ..embeddings.embedding_exceptions import (
+                    EmbeddingAPIError,
+                    EmbeddingAuthenticationError,
+                    EmbeddingQuotaExhaustedError,
+                    EmbeddingRateLimitError,
+                )
+
+                # Re-raise OpenAI embedding errors so they propagate to the API layer with specific error info
+                if isinstance(e, (EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, EmbeddingRateLimitError, EmbeddingAPIError)):
+                    raise
+
                 logger.error(f"Document search failed: {e}")
                 span.set_attribute("error", str(e))
-                return []
+                # Follow fail-fast principle - don't return empty results for legitimate failures
+                raise RuntimeError(f"Document search failed: {str(e)}") from e
 
     async def search_code_examples(
         self,
diff --git a/python/src/server/services/threading_service.py b/python/src/server/services/threading_service.py
index cc768418..21e199f7 100644
--- a/python/src/server/services/threading_service.py
+++ b/python/src/server/services/threading_service.py
@@ -91,7 +91,7 @@ class RateLimiter:
         """
         while True:  # Loop instead of recursion to avoid stack overflow
             wait_time_to_sleep = None
-            
+
             async with self._lock:
                 now = time.time()
 
@@ -104,7 +104,7 @@ class RateLimiter:
                     self.request_times.append(now)
                     self.token_usage.append((now, estimated_tokens))
                     return True
-                
+
                 # Calculate wait time if we can't make the request
                 wait_time = self._calculate_wait_time(estimated_tokens)
                 if wait_time > 0:
@@ -118,7 +118,7 @@ class RateLimiter:
                     wait_time_to_sleep = wait_time
                 else:
                     return False
-            
+
             # Sleep outside the lock to avoid deadlock
             if wait_time_to_sleep is not None:
                 # For long waits, break into smaller chunks with progress updates
diff --git a/python/src/server/utils/progress/progress_tracker.py b/python/src/server/utils/progress/progress_tracker.py
index 60a79363..6ebb8187 100644
--- a/python/src/server/utils/progress/progress_tracker.py
+++ b/python/src/server/utils/progress/progress_tracker.py
@@ -106,7 +106,7 @@ class ProgressTracker:
                 f"DEBUG: ProgressTracker.update called | status={status} | progress={progress} | "
                 f"current_state_progress={self.state.get('progress', 0)} | kwargs_keys={list(kwargs.keys())}"
             )
-        
+
         # CRITICAL: Never allow progress to go backwards
         current_progress = self.state.get("progress", 0)
         new_progress = min(100, max(0, progress))  # Ensure 0-100
@@ -129,7 +129,7 @@ class ProgressTracker:
             "log": log,
             "timestamp": datetime.now().isoformat(),
         })
-        
+
         # DEBUG: Log final state for document_storage
         if status == "document_storage" and actual_progress >= 35:
             safe_logfire_info(
@@ -155,10 +155,10 @@ class ProgressTracker:
         for key, value in kwargs.items():
             if key not in protected_fields:
                 self.state[key] = value
-        
+
 
         self._update_state()
-        
+
         # Schedule cleanup for terminal states
         if status in ["cancelled", "failed"]:
             asyncio.create_task(self._delayed_cleanup(self.progress_id))
@@ -189,7 +189,7 @@ class ProgressTracker:
         safe_logfire_info(
             f"Progress completed | progress_id={self.progress_id} | type={self.operation_type} | duration={self.state.get('duration_formatted', 'unknown')}"
         )
-        
+
         # Schedule cleanup after delay to allow clients to see final state
         asyncio.create_task(self._delayed_cleanup(self.progress_id))
 
@@ -214,7 +214,7 @@ class ProgressTracker:
         safe_logfire_error(
             f"Progress error | progress_id={self.progress_id} | type={self.operation_type} | error={error_message}"
         )
-        
+
         # Schedule cleanup after delay to allow clients to see final state
         asyncio.create_task(self._delayed_cleanup(self.progress_id))
 
@@ -241,9 +241,9 @@ class ProgressTracker:
         )
 
     async def update_crawl_stats(
-        self, 
-        processed_pages: int, 
-        total_pages: int, 
+        self,
+        processed_pages: int,
+        total_pages: int,
         current_url: str | None = None,
         pages_found: int | None = None
     ):
@@ -269,16 +269,16 @@ class ProgressTracker:
             "total_pages": total_pages,
             "current_url": current_url,
         }
-        
+
         if pages_found is not None:
             update_data["pages_found"] = pages_found
-            
+
         await self.update(**update_data)
 
     async def update_storage_progress(
-        self, 
-        chunks_stored: int, 
-        total_chunks: int, 
+        self,
+        chunks_stored: int,
+        total_chunks: int,
         operation: str = "storing",
         word_count: int | None = None,
         embeddings_created: int | None = None
@@ -294,7 +294,7 @@ class ProgressTracker:
             embeddings_created: Number of embeddings created
         """
         progress_val = int((chunks_stored / max(total_chunks, 1)) * 100)
-        
+
         update_data = {
             "status": "document_storage",
             "progress": progress_val,
@@ -302,14 +302,14 @@ class ProgressTracker:
             "chunks_stored": chunks_stored,
             "total_chunks": total_chunks,
         }
-        
+
         if word_count is not None:
             update_data["word_count"] = word_count
         if embeddings_created is not None:
             update_data["embeddings_created"] = embeddings_created
-            
+
         await self.update(**update_data)
-    
+
     async def update_code_extraction_progress(
         self,
         completed_summaries: int,
@@ -327,11 +327,11 @@ class ProgressTracker:
             current_file: Current file being processed
         """
         progress_val = int((completed_summaries / max(total_summaries, 1)) * 100)
-        
+
         log = f"Extracting code: {completed_summaries}/{total_summaries} summaries"
         if current_file:
             log += f" - {current_file}"
-        
+
         await self.update(
             status="code_extraction",
             progress=progress_val,
diff --git a/python/tests/test_openai_error_handling.py b/python/tests/test_openai_error_handling.py
new file mode 100644
index 00000000..2ad5a16a
--- /dev/null
+++ b/python/tests/test_openai_error_handling.py
@@ -0,0 +1,466 @@
+"""
+Comprehensive test suite for OpenAI error handling enhancements.
+
+Tests the complete error flow from embedding service through RAG service 
+to API endpoints, ensuring proper error propagation and message sanitization.
+
+Related to GitHub Issue #362.
+"""
+
+import pytest
+from unittest.mock import Mock, AsyncMock, patch
+from fastapi import HTTPException
+
+from src.server.services.embeddings.embedding_exceptions import (
+    EmbeddingAuthenticationError,
+    EmbeddingQuotaExhaustedError,
+    EmbeddingRateLimitError,
+    EmbeddingAPIError,
+)
+from src.server.services.search.rag_service import RAGService
+from src.server.api_routes.knowledge_api import perform_rag_query, RagQueryRequest, _sanitize_openai_error, _validate_openai_api_key
+
+
+class TestOpenAIErrorHandling:
+    """Test suite for OpenAI error handling in RAG queries."""
+
+    @pytest.mark.asyncio
+    async def test_quota_exhausted_error_propagation(self):
+        """Test that quota exhausted errors propagate correctly through service layer."""
+        
+        # Mock the create_embedding function to raise quota exhausted error
+        with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
+            mock_create_embedding.side_effect = EmbeddingQuotaExhaustedError(
+                "OpenAI quota exhausted", tokens_used=1000
+            )
+            
+            # Create RAG service and test search_documents method
+            with patch("src.server.services.search.rag_service.get_supabase_client"):
+                rag_service = RAGService()
+                
+                # Should propagate the quota error, not return empty list
+                with pytest.raises(EmbeddingQuotaExhaustedError) as exc_info:
+                    await rag_service.search_documents("test query")
+                
+                assert "quota exhausted" in str(exc_info.value).lower()
+                assert exc_info.value.tokens_used == 1000
+
+    @pytest.mark.asyncio
+    async def test_authentication_error_propagation(self):
+        """Test that authentication errors propagate correctly through service layer."""
+        
+        # Mock the create_embedding function to raise authentication error
+        with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
+            mock_create_embedding.side_effect = EmbeddingAuthenticationError(
+                "Invalid API key", api_key_prefix="sk-1234"
+            )
+            
+            # Create RAG service and test search_documents method
+            with patch("src.server.services.search.rag_service.get_supabase_client"):
+                rag_service = RAGService()
+                
+                # Should propagate the authentication error
+                with pytest.raises(EmbeddingAuthenticationError) as exc_info:
+                    await rag_service.search_documents("test query")
+                
+                assert "invalid api key" in str(exc_info.value).lower()
+                assert exc_info.value.api_key_prefix == "sk-1…"
+
+    @pytest.mark.asyncio
+    async def test_rate_limit_error_propagation(self):
+        """Test that rate limit errors propagate correctly through service layer."""
+        
+        # Mock the create_embedding function to raise rate limit error
+        with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
+            mock_create_embedding.side_effect = EmbeddingRateLimitError(
+                "Rate limit exceeded"
+            )
+            
+            # Create RAG service and test search_documents method
+            with patch("src.server.services.search.rag_service.get_supabase_client"):
+                rag_service = RAGService()
+                
+                # Should propagate the rate limit error
+                with pytest.raises(EmbeddingRateLimitError) as exc_info:
+                    await rag_service.search_documents("test query")
+                
+                assert "rate limit" in str(exc_info.value).lower()
+
+    @pytest.mark.asyncio
+    async def test_api_quota_error_in_rag_endpoint(self):
+        """Test that quota errors are properly handled in RAG API endpoint."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to raise quota exhausted error
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=EmbeddingQuotaExhaustedError(
+                    "OpenAI quota exhausted", tokens_used=500
+                )
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with status 429 and detailed error info
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error details
+            assert exc_info.value.status_code == 429
+            assert "quota exhausted" in exc_info.value.detail["error"].lower()
+            assert "add credits" in exc_info.value.detail["message"].lower()
+            assert exc_info.value.detail["error_type"] == "quota_exhausted"
+            assert exc_info.value.detail["tokens_used"] == 500
+
+    @pytest.mark.asyncio
+    async def test_api_authentication_error_in_rag_endpoint(self):
+        """Test that authentication errors are properly handled in RAG API endpoint."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to raise authentication error
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=EmbeddingAuthenticationError(
+                    "Invalid API key", api_key_prefix="sk-1234"
+                )
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with status 401 and detailed error info
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error details
+            assert exc_info.value.status_code == 401
+            assert "authentication failed" in exc_info.value.detail["error"].lower()
+            assert "invalid or expired" in exc_info.value.detail["message"].lower()
+            assert exc_info.value.detail["error_type"] == "authentication_failed"
+            assert exc_info.value.detail["api_key_prefix"] == "sk-1…"
+
+    @pytest.mark.asyncio
+    async def test_api_rate_limit_error_in_rag_endpoint(self):
+        """Test that rate limit errors are properly handled in RAG API endpoint."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to raise rate limit error
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=EmbeddingRateLimitError("Rate limit exceeded")
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with status 429 and detailed error info
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error details
+            assert exc_info.value.status_code == 429
+            assert "rate limit" in exc_info.value.detail["error"].lower()
+            assert "too many requests" in exc_info.value.detail["message"].lower()
+            assert exc_info.value.detail["error_type"] == "rate_limit"
+            assert exc_info.value.detail["retry_after"] == 30
+
+    @pytest.mark.asyncio
+    async def test_api_generic_error_in_rag_endpoint(self):
+        """Test that generic API errors are properly handled in RAG API endpoint."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to raise generic API error
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=EmbeddingAPIError("Invalid model specified")
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with status 502 and detailed error info
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error details
+            assert exc_info.value.status_code == 502
+            assert "openai api error" in exc_info.value.detail["error"].lower()
+            assert exc_info.value.detail["error_type"] == "api_error"
+
+    @pytest.mark.asyncio
+    async def test_api_key_validation_success(self):
+        """Test that API key validation passes with valid key."""
+        
+        # Mock create_embedding to succeed
+        with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
+            mock_create_embedding.return_value = [0.1] * 1536  # Mock successful embedding
+            
+            # Should not raise any exception
+            await _validate_openai_api_key()
+
+    @pytest.mark.asyncio
+    async def test_api_key_validation_auth_failure(self):
+        """Test that API key validation raises 401 for authentication failures."""
+        
+        # Mock create_embedding to raise authentication error
+        with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
+            mock_create_embedding.side_effect = EmbeddingAuthenticationError(
+                "Invalid API key", api_key_prefix="sk-1234"
+            )
+            
+            # Should raise HTTPException with status 401
+            with pytest.raises(HTTPException) as exc_info:
+                await _validate_openai_api_key()
+            
+            assert exc_info.value.status_code == 401
+            assert exc_info.value.detail["error_type"] == "authentication_failed"
+
+    @pytest.mark.asyncio
+    async def test_api_key_validation_quota_failure(self):
+        """Test that API key validation raises 429 for quota exhaustion."""
+        
+        # Mock create_embedding to raise quota exhausted error
+        with patch("src.server.api_routes.knowledge_api.create_embedding") as mock_create_embedding:
+            mock_create_embedding.side_effect = EmbeddingQuotaExhaustedError(
+                "Quota exhausted", tokens_used=1000
+            )
+            
+            # Should raise HTTPException with status 429
+            with pytest.raises(HTTPException) as exc_info:
+                await _validate_openai_api_key()
+            
+            assert exc_info.value.status_code == 429
+            assert exc_info.value.detail["error_type"] == "quota_exhausted"
+
+    def test_sanitize_openai_error_removes_urls(self):
+        """Test that sanitization function removes URLs from error messages."""
+        error_message = "Connection failed to https://api.openai.com/v1/embeddings with status 400"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "https://api.openai.com" not in sanitized
+        assert "[REDACTED_URL]" in sanitized
+        assert "Connection failed" in sanitized
+
+    def test_sanitize_openai_error_removes_api_keys(self):
+        """Test that sanitization function removes API keys from error messages."""
+        error_message = "Authentication failed with key sk-1234567890abcdef1234567890abcdef1234567890abcdef"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in sanitized
+        assert "[REDACTED_KEY]" in sanitized
+        assert "Authentication failed" in sanitized
+
+    def test_sanitize_openai_error_removes_auth_info(self):
+        """Test that sanitization function removes auth details from error messages."""
+        error_message = 'Failed to authenticate: "auth_bearer_xyz123"'
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "auth_bearer_xyz123" not in sanitized
+        assert "[REDACTED_AUTH]" in sanitized
+
+    def test_sanitize_openai_error_returns_generic_for_sensitive_words(self):
+        """Test that sanitization returns generic message for sensitive internal details."""
+        error_message = "Internal server error on endpoint /v1/embeddings"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        # Should return generic message due to 'internal' and 'endpoint' keywords
+        assert sanitized == "OpenAI API encountered an error. Please verify your API key and quota."
+
+    def test_sanitize_openai_error_preserves_safe_messages(self):
+        """Test that sanitization preserves safe error messages."""
+        error_message = "Model not found: text-embedding-ada-002"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        # Should preserve the message since it contains no sensitive info
+        assert sanitized == error_message
+
+    def test_sanitize_openai_error_removes_organization_ids(self):
+        """Test that sanitization function removes OpenAI organization IDs."""
+        error_message = "Permission denied for org-1234567890abcdef12345678 with model access"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "org-1234567890abcdef12345678" not in sanitized
+        assert "[REDACTED_ORG]" in sanitized
+        assert "Permission denied" in sanitized
+
+    def test_sanitize_openai_error_removes_project_ids(self):
+        """Test that sanitization function removes OpenAI project IDs."""
+        error_message = "Project proj_abcdef1234567890xyz not found"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "proj_abcdef1234567890xyz" not in sanitized
+        assert "[REDACTED_PROJ]" in sanitized
+        assert "Project" in sanitized and "not found" in sanitized
+
+    def test_sanitize_openai_error_removes_request_ids(self):
+        """Test that sanitization function removes OpenAI request IDs."""
+        error_message = "Request req_1234567890abcdefghij failed with timeout"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        assert "req_1234567890abcdefghij" not in sanitized
+        assert "[REDACTED_REQ]" in sanitized
+        assert "Request" in sanitized and "failed with timeout" in sanitized
+
+    def test_sanitize_openai_error_handles_multiple_patterns(self):
+        """Test that sanitization handles multiple sensitive patterns in one message."""
+        error_message = "Request req_abc123 to https://api.openai.com failed for org-1234567890abcdef12345678 with key sk-1234567890abcdef1234567890abcdef1234567890abcdef"
+        sanitized = _sanitize_openai_error(error_message)
+        
+        # Verify all patterns are redacted
+        assert "req_abc123" not in sanitized
+        assert "https://api.openai.com" not in sanitized
+        assert "org-1234567890abcdef12345678" not in sanitized
+        assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in sanitized
+        
+        # Verify redacted placeholders are present
+        assert "[REDACTED_REQ]" in sanitized
+        assert "[REDACTED_URL]" in sanitized
+        assert "[REDACTED_ORG]" in sanitized
+        assert "[REDACTED_KEY]" in sanitized
+
+    def test_sanitize_openai_error_input_validation(self):
+        """Test that sanitization function handles invalid input gracefully."""
+        # Test None input
+        result = _sanitize_openai_error(None)
+        assert result == "OpenAI API encountered an error. Please verify your API key and quota."
+        
+        # Test non-string input
+        result = _sanitize_openai_error(123)
+        assert result == "OpenAI API encountered an error. Please verify your API key and quota."
+        
+        # Test empty string
+        result = _sanitize_openai_error("")
+        assert result == "OpenAI API encountered an error. Please verify your API key and quota."
+        
+        # Test whitespace-only string
+        result = _sanitize_openai_error("   ")
+        assert result == "OpenAI API encountered an error. Please verify your API key and quota."
+
+    @pytest.mark.asyncio
+    async def test_fail_fast_pattern_embedding_failure(self):
+        """Test that embedding failures now fail fast instead of returning empty results."""
+        
+        # Mock the create_embedding function to return None (failure)
+        with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
+            mock_create_embedding.return_value = None
+            
+            # Create RAG service and test search_documents method
+            with patch("src.server.services.search.rag_service.get_supabase_client"):
+                rag_service = RAGService()
+                
+                # Should raise RuntimeError instead of returning empty list
+                with pytest.raises(RuntimeError) as exc_info:
+                    await rag_service.search_documents("test query")
+                
+                assert "Failed to create embedding" in str(exc_info.value)
+                assert "configuration or API issue" in str(exc_info.value)
+
+    @pytest.mark.asyncio
+    async def test_fail_fast_pattern_search_failure(self):
+        """Test that search failures now fail fast instead of returning empty results."""
+        
+        # Mock the create_embedding to succeed but vector search to fail
+        with patch("src.server.services.search.rag_service.create_embedding") as mock_create_embedding:
+            mock_create_embedding.return_value = [0.1] * 1536  # Mock embedding vector
+            
+            with patch("src.server.services.search.rag_service.get_supabase_client"):
+                rag_service = RAGService()
+                
+                # Mock the base strategy to raise an exception
+                with patch.object(rag_service.base_strategy, 'vector_search', side_effect=Exception("Database connection failed")):
+                    
+                    # Should raise RuntimeError instead of returning empty list
+                    with pytest.raises(RuntimeError) as exc_info:
+                        await rag_service.search_documents("test query")
+                    
+                    assert "Document search failed" in str(exc_info.value)
+                    assert "Database connection failed" in str(exc_info.value)
+
+    @pytest.mark.asyncio
+    async def test_integration_error_flow_rag_to_api(self):
+        """Test complete error flow from RAG service through API endpoint."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock both RAGService and get_supabase_client to avoid real connections
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=RuntimeError("Document search failed: Database connection failed")
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with generic error (not OpenAI specific)
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error details
+            assert exc_info.value.status_code == 500
+            assert "RAG query failed" in exc_info.value.detail["error"]
+            assert "Database connection failed" in exc_info.value.detail["error"]
+
+    @pytest.mark.asyncio
+    async def test_api_error_sanitization_in_endpoint(self):
+        """Test that API errors are sanitized in the RAG endpoint response."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to raise API error with sensitive information
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(
+                side_effect=EmbeddingAPIError("Request failed to https://api.openai.com/v1/embeddings with key sk-1234567890abcdef1234567890abcdef1234567890abcdef")
+            )
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should raise HTTPException with sanitized error message
+            with pytest.raises(HTTPException) as exc_info:
+                await perform_rag_query(request)
+            
+            # Verify error is sanitized
+            error_message = exc_info.value.detail["message"]
+            assert "sk-1234567890abcdef1234567890abcdef1234567890abcdef" not in error_message
+            assert "https://api.openai.com" not in error_message
+            assert "[REDACTED_KEY]" in error_message
+            assert "[REDACTED_URL]" in error_message
+
+    @pytest.mark.asyncio
+    async def test_successful_rag_query_flow(self):
+        """Test that successful RAG queries work correctly."""
+        
+        request = RagQueryRequest(query="test query", match_count=5)
+        
+        # Mock RAGService to return successful result
+        mock_result = {
+            "results": [{"content": "test result"}],
+            "query": "test query",
+            "total_found": 1,
+        }
+        
+        with patch("src.server.api_routes.knowledge_api.RAGService") as mock_rag_service_class, \
+             patch("src.server.api_routes.knowledge_api.get_supabase_client"):
+            
+            mock_service = Mock()
+            mock_service.perform_rag_query = AsyncMock(return_value=(True, mock_result))
+            mock_rag_service_class.return_value = mock_service
+            
+            # Should return successful result
+            result = await perform_rag_query(request)
+            
+            # Verify successful response
+            assert result["success"] is True
+            assert result["results"] == [{"content": "test result"}]
+            assert result["total_found"] == 1
\ No newline at end of file