Archon Database Migration Guide

Problem: Supabase SQL Editor Timeout

The full migration script can time out in the Supabase SQL editor because creating vector indexes on large datasets is memory-intensive.
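The expensive part is building pgvector indexes over large tables. As a rough illustration (index name and parameters here are made up, not taken from the script), the statements in question look like this:

CREATE INDEX idx_archon_crawled_pages_embedding_1536
ON archon_crawled_pages
USING ivfflat (embedding_1536 vector_cosine_ops)
WITH (lists = 100);

A build like this loads large amounts of vector data into memory at once, which is what exhausts the SQL editor's limits.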

Method 1: Run the Migration in Steps

Run these scripts in order in the Supabase SQL editor:

  1. Step 1: step1_add_columns.sql - Adds new columns (fast, ~5 seconds)
  2. Step 2: step2_migrate_data.sql - Migrates existing data (fast, ~10 seconds)
  3. Step 3: step3_create_functions.sql - Creates search functions (fast, ~5 seconds)
  4. Step 4: step4_create_indexes_optional.sql - Creates indexes (may timeout - OPTIONAL)

Note: If Step 4 times out, the system will still work using brute-force search (see Method 4). You can create the indexes later.
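For orientation, Step 1 is a metadata-only change along these lines (a sketch assuming the 1536-dimension columns that the verification query below checks for; the actual script is authoritative):

ALTER TABLE archon_crawled_pages ADD COLUMN IF NOT EXISTS embedding_1536 vector(1536);
ALTER TABLE archon_code_examples ADD COLUMN IF NOT EXISTS embedding_1536 vector(1536);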

Method 2: Direct Database Connection

Connect directly to your Supabase database using psql or a database client:

Get Connection String

  1. Go to Supabase Dashboard → Settings → Database
  2. Copy the connection string (use "Session pooler" for migrations)
  3. It looks like: postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:5432/postgres (port 5432 is the session pooler; port 6543 is the transaction pooler, which is not suited to long-running migrations)

Using psql

# Connect to database
psql "postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:6543/postgres"

# Run the full migration
\i migration/upgrade_database_with_memory_fix.sql

# Or run individual steps
\i migration/step1_add_columns.sql
\i migration/step2_migrate_data.sql
\i migration/step3_create_functions.sql
\i migration/step4_create_indexes_optional.sql
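A direct connection also lets you raise the session limits before the heavy steps. Both parameters below are standard Postgres session settings; the values are illustrative:

-- Run inside the psql session before Step 4
SET maintenance_work_mem = '512MB';  -- more memory for index builds
SET statement_timeout = 0;           -- disable the per-statement timeout for this session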

Using TablePlus/DBeaver/pgAdmin

  1. Create new connection with your connection string
  2. Open and run each SQL file in order
  3. Monitor execution time and memory usage

Method 3: Use Supabase CLI

# Install the Supabase CLI (the npm package does not support global installs)
npm install supabase --save-dev

# Login and link project
npx supabase login
npx supabase link --project-ref [your-project-ref]

# supabase db push applies files from supabase/migrations/, so copy the
# migration there under a timestamped name (filename is illustrative), then push
cp migration/upgrade_database_with_memory_fix.sql supabase/migrations/20250917000000_upgrade_database.sql
npx supabase db push

Method 4: Skip Vector Indexes Entirely

If you have a small dataset (<10,000 documents), you can skip Step 4 entirely. The system will use brute-force search which is fast enough for small datasets.
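Without indexes, pgvector answers similarity queries with an exact sequential scan, so results stay correct; each query is just O(n). A sketch of the query shape (table and column names are assumptions based on the verification query below, and the vector literal is truncated):

SELECT id
FROM archon_crawled_pages
ORDER BY embedding_1536 <=> '[0.1, 0.2, ...]'::vector  -- <=> is pgvector's cosine distance
LIMIT 10;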

Verification

After migration, run this query to verify:

SELECT
    EXISTS(SELECT 1 FROM information_schema.columns
           WHERE table_name = 'archon_crawled_pages'
           AND column_name = 'embedding_1536') as has_1536_column,
    EXISTS(SELECT 1 FROM information_schema.routines
           WHERE routine_name = 'match_archon_crawled_pages_multi') as has_multi_function,
    COUNT(*) as index_count
FROM pg_indexes
WHERE tablename IN ('archon_crawled_pages', 'archon_code_examples')
AND indexname LIKE '%embedding%';

Expected result:

  • has_1536_column: true
  • has_multi_function: true
  • index_count: 8+ (or 0 if you skipped Step 4)
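To see which embedding indexes exist by name, rather than just the count:

SELECT tablename, indexname
FROM pg_indexes
WHERE tablename IN ('archon_crawled_pages', 'archon_code_examples')
AND indexname LIKE '%embedding%';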

Troubleshooting

"Memory required" error

  • Increase maintenance_work_mem in the script
  • Use direct database connection instead of SQL editor
  • Create indexes one at a time (see the sketch below)
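A sketch of building a single index over a direct connection (index name and parameters are illustrative):

SET maintenance_work_mem = '512MB';
-- CONCURRENTLY avoids locking the table during the build, at the cost of extra time;
-- note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block
CREATE INDEX CONCURRENTLY idx_archon_crawled_pages_embedding_1536
ON archon_crawled_pages
USING ivfflat (embedding_1536 vector_cosine_ops)
WITH (lists = 100);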

"Statement timeout" error

  • Run scripts in smaller steps
  • Use direct database connection
  • Increase the statement_timeout setting (see below)
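For a single session, SET statement_timeout (shown under Method 2) is enough. To raise the limit persistently, standard Postgres lets you set it per role; the role name and value here are illustrative, so use whichever role runs your migrations:

ALTER ROLE postgres SET statement_timeout = '30min';  -- applies to new sessions for this role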

"Permission denied" error

  • Ensure you're using the service role key
  • Check database permissions in Supabase dashboard

Post-Migration

After successful migration:

  1. Restart services:

    docker compose restart
    
  2. Test the system:

    • Check if RAG search works in the UI
    • Try crawling a new website
    • Verify embeddings are being created
  3. Monitor performance:

    • If searches are slow without indexes, create them via direct connection
    • Consider using smaller embedding dimensions (384 or 768) for faster performance

Need Help?

If you encounter issues:

  1. Check Supabase logs: Dashboard → Logs → Postgres
  2. Verify your Supabase plan has sufficient resources
  3. Contact Supabase support for memory limit increases (paid plans only)