Reduced the sentence-transformers install size by using CPU-only PyTorch; reranking is now included by default

2025-12-24 02:39:17 -05:00 · 2025-08-30 11:52:40 -05:00
parent 9f22659f4c
commit 9bb35566a5
3 changed files with 11 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -70,8 +70,6 @@ This new vision for Archon replaces the old one (the agenteer). Archon used to b
   - For cloud Supabase: they recently introduced a new type of service role key but use the legacy one (the longer one).
   - For local Supabase: set SUPABASE_URL to http://host.docker.internal:8000 (unless you have an IP address set up).

-   OPTIONAL: If you want to enable the reranking RAG strategy, add " --group server-reranking" to the end of the uv install on line 18 of `python/server/Dockerfile.server`. This will significantly increase the size of the Archon Server container which is why it's off by default.
-
 3. **Database Setup**: In your [Supabase project](https://supabase.com/dashboard) SQL Editor, copy, paste, and execute the contents of `migration/complete_setup.sql`

 4. **Start Services** (choose one):
--- a/python/Dockerfile.server
+++ b/python/Dockerfile.server
@@ -15,7 +15,7 @@ COPY pyproject.toml .
 # Install server dependencies to a virtual environment using uv
 RUN uv venv /venv && \
    . /venv/bin/activate && \
-    uv pip install --group server
+    uv pip install --group server --group server-reranking

 # Runtime stage
 FROM python:3.12-slim
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -7,6 +7,16 @@ requires-python = ">=3.12"
 # Base dependencies - empty since we're using dependency groups
 dependencies = []

+# PyTorch CPU-only index configuration
+[[tool.uv.index]]
+name = "pytorch-cpu"
+url = "https://download.pytorch.org/whl/cpu"
+explicit = true
+
+# Sources configuration to use CPU-only PyTorch
+[tool.uv.sources]
+torch = [{ index = "pytorch-cpu" }]
+
 [dependency-groups]
 # Development dependencies for linting and testing
 dev = [