Feature: Incremental Step History Tracking for Real-Time Workflow Observability

Feature Description

Enable real-time progress visibility for Agent Work Orders by saving step history incrementally after each workflow step completes, rather than waiting until the end. This critical observability fix allows users to monitor workflow execution in real-time via the /agent-work-orders/{id}/steps API endpoint, providing immediate feedback on which steps have completed, which are in progress, and which have failed.

Currently, step history is only saved at two points: when the entire workflow completes successfully (line 260 in the orchestrator) or when the workflow fails with an exception (line 269). This means users polling the steps endpoint see zero progress information until the workflow reaches one of these terminal states, creating a black-box execution experience that can last several minutes.

User Story

As a developer using the Agent Work Orders system, I want to see real-time progress as each workflow step completes, so that I can monitor execution, debug failures quickly, and understand what the system is doing without waiting for the entire workflow to finish.

Problem Statement

The current implementation has a critical observability gap that prevents real-time progress tracking:

Root Cause:

  • Step history is initialized at workflow start: step_history = StepHistory(agent_work_order_id=agent_work_order_id) (line 82)
  • After each step executes, results are appended: step_history.steps.append(result) (lines 130, 150, 166, 186, 205, 224, 241)
  • BUT step history is only saved to state at:
    • Line 260: await self.state_repository.save_step_history(...) - After ALL 7 steps complete successfully
    • Line 269: await self.state_repository.save_step_history(...) - In exception handler when workflow fails

Impact:

  1. Zero Real-Time Visibility: Users polling /agent-work-orders/{id}/steps see an empty array until workflow completes or fails
  2. Poor Debugging Experience: Cannot see which step failed until the entire workflow terminates
  3. Uncertain Progress: Long-running workflows (3-5 minutes) appear frozen with no progress indication
  4. Wasted API Calls: Clients poll repeatedly but get no new information until terminal state
  5. Bad User Experience: Cannot show meaningful progress bars, step indicators, or real-time status updates in UI

Example Scenario:

User creates work order → Polls /steps endpoint every 3 seconds
  0s: [] (empty)
  3s: [] (empty)
  6s: [] (empty)
  ... workflow running ...
  120s: [] (empty)
  123s: [] (empty)
  ... workflow running ...
  180s: [all 7 steps] (suddenly all appear at once)

This creates a frustrating experience where users have no insight into what's happening for minutes at a time.

Solution Statement

Implement incremental step history persistence by adding an await self.state_repository.save_step_history() call immediately after each step result is appended to the history. This simple change enables real-time progress tracking with minimal code modification and negligible performance impact.

Implementation:

  • After each step_history.steps.append(result) call, immediately save: await self.state_repository.save_step_history(agent_work_order_id, step_history)
  • Apply this pattern consistently across all 7 workflow steps
  • Preserve existing end-of-workflow and error-handler saves for robustness
  • No changes needed to the API, models, or state repository (all already support incremental saves); a sketch of the pattern follows this list
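
Concretely, the pattern looks like the minimal, self-contained sketch below. The names (StepResult, save_step_history, the in-memory _saved dict) are simplified stand-ins that follow this spec's terminology, not the real orchestrator or repository code:

    # Self-contained sketch of the append-then-save pattern (hypothetical names and
    # simplified types; the real orchestrator and repository interfaces may differ).
    import asyncio
    from dataclasses import dataclass, field


    @dataclass
    class StepResult:
        step: str
        success: bool


    @dataclass
    class StepHistory:
        agent_work_order_id: str
        steps: list[StepResult] = field(default_factory=list)


    # Stand-in for the state repository's save_step_history()
    _saved: dict[str, StepHistory] = {}


    async def save_step_history(work_order_id: str, history: StepHistory) -> None:
        _saved[work_order_id] = StepHistory(history.agent_work_order_id, list(history.steps))


    async def run_workflow(work_order_id: str) -> None:
        step_history = StepHistory(agent_work_order_id=work_order_id)
        step_names = ["classify", "plan", "find_plan", "generate_branch",
                      "implement", "commit", "create_pr"]
        for step_name in step_names:
            result = StepResult(step=step_name, success=True)  # stand-in for agent execution
            step_history.steps.append(result)
            # The fix: persist immediately after each append so pollers see progress
            await save_step_history(work_order_id, step_history)
            print(f"saved snapshot with {len(_saved[work_order_id].steps)} step(s)")


    asyncio.run(run_workflow("wo-test"))

Running this prints snapshot sizes 1 through 7, which is exactly the progression the /steps endpoint should expose.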

Result:

User creates work order → Polls /steps endpoint every 3 seconds
  0s: [] (empty - workflow starting)
  3s: [{classify step}] (classification complete!)
  10s: [{classify}, {plan}] (planning complete!)
  20s: [{classify}, {plan}, {find_plan}] (plan file found!)
  ... progress visible at each step ...
  180s: [all 7 steps] (complete with full history)

This provides immediate feedback, enables meaningful progress UIs, and dramatically improves the developer experience.

Relevant Files

Use these files to implement the feature:

Core Implementation:

  • python/src/agent_work_orders/workflow_engine/workflow_orchestrator.py (lines 122-269)
    • Main orchestration logic where step history is managed
    • Currently appends to step_history but doesn't save incrementally
    • Need to add save_step_history() calls after each step completion (7 locations)
    • Lines to modify: 130, 150, 166, 186, 205, 224, 241 (add save call after each append)

State Management (No Changes Needed):

  • python/src/agent_work_orders/state_manager/work_order_repository.py (lines 147-163)
    • Already implements save_step_history() method with proper locking
    • Thread-safe with asyncio.Lock for concurrent access
    • Logs each save operation for observability
    • Works as-is for incremental saves - no modifications required (an illustrative sketch of the locking pattern follows this list)
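
No changes are required here; purely for illustration, an in-memory store with the same locking and snapshot-overwrite behavior might look like this sketch (hypothetical class name and fields; the actual WorkOrderRepository may differ):

    # Hypothetical stand-in that mirrors the described behavior: writes and reads
    # are serialized with asyncio.Lock, and each save overwrites the stored snapshot.
    import asyncio
    import logging
    from typing import Any

    logger = logging.getLogger(__name__)


    class InMemoryStepHistoryStore:
        def __init__(self) -> None:
            self._histories: dict[str, Any] = {}
            self._lock = asyncio.Lock()

        async def save_step_history(self, work_order_id: str, step_history: Any) -> None:
            async with self._lock:
                self._histories[work_order_id] = step_history  # replace previous snapshot
                logger.info("Saved step history for %s (%d steps)",
                            work_order_id, len(step_history.steps))

        async def get_step_history(self, work_order_id: str) -> Any:
            async with self._lock:
                return self._histories.get(work_order_id)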

API Layer (No Changes Needed):

  • python/src/agent_work_orders/api/routes.py (lines 220-240)
    • Already implements GET /agent-work-orders/{id}/steps endpoint
    • Returns step history from state repository
    • Will automatically return incremental results once the orchestrator saves them (a simplified route sketch follows this list)
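
To make the data flow concrete, a heavily simplified sketch of such a read-only route is shown below; the path and the steps response key follow the jq queries in this spec, while the in-memory store and wiring are hypothetical:

    # Hypothetical, simplified steps route: it returns whatever snapshot currently
    # exists, so partial histories appear automatically once the orchestrator
    # starts saving incrementally, and an empty list before the first save.
    from fastapi import APIRouter

    router = APIRouter()
    _step_histories: dict[str, list[dict]] = {}  # stand-in for the state repository


    @router.get("/agent-work-orders/{agent_work_order_id}/steps")
    async def get_steps(agent_work_order_id: str) -> dict:
        return {"steps": _step_histories.get(agent_work_order_id, [])}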

Models (No Changes Needed):

  • python/src/agent_work_orders/models.py (lines 213-246)
    • StepHistory model is immutable-friendly (each save creates a full snapshot)
    • StepExecutionResult captures all step details
    • Models already support incremental history updates (a hedged sketch of the implied shapes follows below)
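
For orientation only, the fields constructed by the tests later in this spec imply shapes roughly like the sketch below. The Pydantic base class, enum string values, and optional defaults are assumptions; models.py remains the authoritative definition:

    # Rough shapes implied by the fields this spec's tests construct; hypothetical,
    # including the enum values and the optionality of output/duration_seconds.
    from enum import Enum

    from pydantic import BaseModel, Field


    class WorkflowStep(str, Enum):
        CLASSIFY = "classify"
        PLAN = "plan"
        FIND_PLAN = "find_plan"
        GENERATE_BRANCH = "generate_branch"
        IMPLEMENT = "implement"
        COMMIT = "commit"
        CREATE_PR = "create_pr"


    class StepExecutionResult(BaseModel):
        step: WorkflowStep
        agent_name: str
        success: bool
        output: str | None = None
        duration_seconds: float | None = None


    class StepHistory(BaseModel):
        agent_work_order_id: str
        steps: list[StepExecutionResult] = Field(default_factory=list)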

New Files

No new files are needed - this is a simple enhancement to the existing workflow orchestrator.

Implementation Plan

Phase 1: Foundation - Add Incremental Saves After Each Step

Add save_step_history() calls immediately after each step result is appended to enable real-time progress tracking. This is the core fix.

Phase 2: Testing - Verify Real-Time Updates

Create comprehensive tests to verify step history is saved incrementally and accessible via API throughout workflow execution.

Phase 3: Validation - End-to-End Testing

Validate with real workflow execution that step history appears incrementally when polling the steps endpoint.

Step by Step Tasks

IMPORTANT: Execute every step in order, top to bottom.

Read Current Implementation

  • Open python/src/agent_work_orders/workflow_engine/workflow_orchestrator.py
  • Review the workflow execution flow from lines 122-269
  • Identify all 7 locations where step_history.steps.append() is called
  • Understand the pattern: append result → log completion → (currently missing: save history)
  • Note that save_step_history() already exists in state_repository and is thread-safe

Add Incremental Save After Classify Step

  • Locate line 130: step_history.steps.append(classify_result)
  • Immediately after line 130, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of classification result in real-time
  • Save the file

Add Incremental Save After Plan Step

  • Locate line 150: step_history.steps.append(plan_result)
  • Immediately after line 150, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of planning result in real-time
  • Save the file

Add Incremental Save After Find Plan Step

  • Locate line 166: step_history.steps.append(plan_finder_result)
  • Immediately after line 166, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of plan file discovery in real-time
  • Save the file

Add Incremental Save After Branch Generation Step

  • Locate line 186: step_history.steps.append(branch_result)
  • Immediately after line 186, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of branch creation in real-time
  • Save the file

Add Incremental Save After Implementation Step

  • Locate line 205: step_history.steps.append(implement_result)
  • Immediately after line 205, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of implementation progress in real-time
  • This is especially important as implementation can take 1-2 minutes
  • Save the file

Add Incremental Save After Commit Step

  • Locate line 224: step_history.steps.append(commit_result)
  • Immediately after line 224, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of commit creation in real-time
  • Save the file

Add Incremental Save After PR Creation Step

  • Locate line 241: step_history.steps.append(pr_result)
  • Immediately after line 241, add:
    await self.state_repository.save_step_history(agent_work_order_id, step_history)
    
  • This enables visibility of PR creation result in real-time
  • Save the file
  • Verify all 7 locations now have incremental saves

Add Comprehensive Unit Test for Incremental Saves

  • Open python/tests/agent_work_orders/test_workflow_engine.py
  • Add new test function at the end of file:
    @pytest.mark.asyncio
    async def test_orchestrator_saves_step_history_incrementally():
        """Test that step history is saved after each step, not just at the end"""
        from src.agent_work_orders.models import StepExecutionResult, WorkflowStep
        from src.agent_work_orders.workflow_engine.agent_names import CLASSIFIER
    
        # Create mocks
        mock_executor = MagicMock()
        mock_sandbox_factory = MagicMock()
        mock_github_client = MagicMock()
        mock_phase_tracker = MagicMock()
        mock_command_loader = MagicMock()
        mock_state_repository = MagicMock()
    
        # Track save_step_history calls
        save_calls = []
        async def track_save(wo_id, history):
            save_calls.append(len(history.steps))
    
        mock_state_repository.save_step_history = AsyncMock(side_effect=track_save)
        mock_state_repository.update_status = AsyncMock()
        mock_state_repository.update_git_branch = AsyncMock()
    
        # Mock sandbox
        mock_sandbox = MagicMock()
        mock_sandbox.working_dir = "/tmp/test"
        mock_sandbox.setup = AsyncMock()
        mock_sandbox.cleanup = AsyncMock()
        mock_sandbox_factory.create_sandbox = MagicMock(return_value=mock_sandbox)
    
        # Mock GitHub client
        mock_github_client.get_issue = AsyncMock(return_value={
            "title": "Test Issue",
            "body": "Test body"
        })
    
        # Create orchestrator
        orchestrator = WorkflowOrchestrator(
            agent_executor=mock_executor,
            sandbox_factory=mock_sandbox_factory,
            github_client=mock_github_client,
            phase_tracker=mock_phase_tracker,
            command_loader=mock_command_loader,
            state_repository=mock_state_repository,
        )
    
        # Mock workflow operations to return success for all steps
        with patch("src.agent_work_orders.workflow_engine.workflow_operations.classify_issue") as mock_classify:
            with patch("src.agent_work_orders.workflow_engine.workflow_operations.build_plan") as mock_plan:
                with patch("src.agent_work_orders.workflow_engine.workflow_operations.find_plan_file") as mock_find:
                    with patch("src.agent_work_orders.workflow_engine.workflow_operations.generate_branch") as mock_branch:
                        with patch("src.agent_work_orders.workflow_engine.workflow_operations.implement_plan") as mock_implement:
                            with patch("src.agent_work_orders.workflow_engine.workflow_operations.create_commit") as mock_commit:
                                with patch("src.agent_work_orders.workflow_engine.workflow_operations.create_pull_request") as mock_pr:
    
                                    # Mock successful results for each step
                                    mock_classify.return_value = StepExecutionResult(
                                        step=WorkflowStep.CLASSIFY,
                                        agent_name=CLASSIFIER,
                                        success=True,
                                        output="/feature",
                                        duration_seconds=1.0,
                                    )
    
                                    mock_plan.return_value = StepExecutionResult(
                                        step=WorkflowStep.PLAN,
                                        agent_name="planner",
                                        success=True,
                                        output="Plan created",
                                        duration_seconds=2.0,
                                    )
    
                                    mock_find.return_value = StepExecutionResult(
                                        step=WorkflowStep.FIND_PLAN,
                                        agent_name="plan_finder",
                                        success=True,
                                        output="specs/plan.md",
                                        duration_seconds=0.5,
                                    )
    
                                    mock_branch.return_value = StepExecutionResult(
                                        step=WorkflowStep.GENERATE_BRANCH,
                                        agent_name="branch_generator",
                                        success=True,
                                        output="feat-issue-1-wo-test",
                                        duration_seconds=1.0,
                                    )
    
                                    mock_implement.return_value = StepExecutionResult(
                                        step=WorkflowStep.IMPLEMENT,
                                        agent_name="implementor",
                                        success=True,
                                        output="Implementation complete",
                                        duration_seconds=5.0,
                                    )
    
                                    mock_commit.return_value = StepExecutionResult(
                                        step=WorkflowStep.COMMIT,
                                        agent_name="committer",
                                        success=True,
                                        output="Commit created",
                                        duration_seconds=1.0,
                                    )
    
                                    mock_pr.return_value = StepExecutionResult(
                                        step=WorkflowStep.CREATE_PR,
                                        agent_name="pr_creator",
                                        success=True,
                                        output="https://github.com/owner/repo/pull/1",
                                        duration_seconds=1.0,
                                    )
    
                                    # Execute workflow
                                    await orchestrator.execute_workflow(
                                        agent_work_order_id="wo-test",
                                        workflow_type=AgentWorkflowType.PLAN,
                                        repository_url="https://github.com/owner/repo",
                                        sandbox_type=SandboxType.GIT_BRANCH,
                                        user_request="Test feature request",
                                    )
    
        # Verify save_step_history was called after EACH step (7 times) + final save (8 total)
        # OR at minimum, verify it was called MORE than just once at the end
        assert len(save_calls) >= 7, f"Expected at least 7 incremental saves, got {len(save_calls)}"
    
        # Verify the progression: 1 step, 2 steps, 3 steps, etc.
        assert save_calls[0] == 1, "First save should have 1 step"
        assert save_calls[1] == 2, "Second save should have 2 steps"
        assert save_calls[2] == 3, "Third save should have 3 steps"
        assert save_calls[3] == 4, "Fourth save should have 4 steps"
        assert save_calls[4] == 5, "Fifth save should have 5 steps"
        assert save_calls[5] == 6, "Sixth save should have 6 steps"
        assert save_calls[6] == 7, "Seventh save should have 7 steps"
    
  • Save the file

Add Integration Test for Real-Time Step Visibility

  • Still in python/tests/agent_work_orders/test_workflow_engine.py
  • Add another test function:
    @pytest.mark.asyncio
    async def test_step_history_visible_during_execution():
        """Test that step history can be retrieved during workflow execution"""
        from src.agent_work_orders.models import StepHistory
    
        # Create real state repository (in-memory)
        from src.agent_work_orders.state_manager.work_order_repository import WorkOrderRepository
        state_repo = WorkOrderRepository()
    
        # Create empty step history
        step_history = StepHistory(agent_work_order_id="wo-test")
    
        # Simulate incremental saves during workflow
        from src.agent_work_orders.models import StepExecutionResult, WorkflowStep
    
        # Step 1: Classify
        step_history.steps.append(StepExecutionResult(
            step=WorkflowStep.CLASSIFY,
            agent_name="classifier",
            success=True,
            output="/feature",
            duration_seconds=1.0,
        ))
        await state_repo.save_step_history("wo-test", step_history)
    
        # Retrieve and verify
        retrieved = await state_repo.get_step_history("wo-test")
        assert retrieved is not None
        assert len(retrieved.steps) == 1
        assert retrieved.steps[0].step == WorkflowStep.CLASSIFY
    
        # Step 2: Plan
        step_history.steps.append(StepExecutionResult(
            step=WorkflowStep.PLAN,
            agent_name="planner",
            success=True,
            output="Plan created",
            duration_seconds=2.0,
        ))
        await state_repo.save_step_history("wo-test", step_history)
    
        # Retrieve and verify progression
        retrieved = await state_repo.get_step_history("wo-test")
        assert len(retrieved.steps) == 2
        assert retrieved.steps[1].step == WorkflowStep.PLAN
    
        # Verify both steps are present
        assert retrieved.steps[0].step == WorkflowStep.CLASSIFY
        assert retrieved.steps[1].step == WorkflowStep.PLAN
    
  • Save the file

Run Unit Tests for Workflow Engine

  • Execute: cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py::test_orchestrator_saves_step_history_incrementally -v
  • Verify the test passes and confirms incremental saves occur
  • Execute: cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py::test_step_history_visible_during_execution -v
  • Verify the test passes
  • Fix any failures before proceeding

Run All Workflow Engine Tests

  • Execute: cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py -v
  • Ensure all existing tests still pass (zero regressions)
  • Verify new tests are included in the run
  • Fix any failures

Run Complete Agent Work Orders Test Suite

  • Execute: cd python && uv run pytest tests/agent_work_orders/ -v
  • Ensure all tests across all modules pass
  • This validates no regressions were introduced
  • Pay special attention to state manager and API tests
  • Fix any failures

Run Type Checking

  • Execute: cd python && uv run mypy src/agent_work_orders/workflow_engine/workflow_orchestrator.py
  • Verify no type errors in the orchestrator
  • Execute: cd python && uv run mypy src/agent_work_orders/
  • Verify no type errors in the entire module
  • Fix any type issues

Run Linting

  • Execute: cd python && uv run ruff check src/agent_work_orders/workflow_engine/workflow_orchestrator.py
  • Verify no linting issues in orchestrator
  • Execute: cd python && uv run ruff check src/agent_work_orders/
  • Verify no linting issues in entire module
  • Fix any issues found

Perform Manual End-to-End Validation

  • Start the Agent Work Orders server:
    cd python && uv run uvicorn src.agent_work_orders.main:app --port 8888 &
    
  • Wait for startup: sleep 5
  • Verify health: curl http://localhost:8888/health | jq
  • Create a test work order:
    WORK_ORDER_ID=$(curl -s -X POST http://localhost:8888/agent-work-orders \
      -H "Content-Type: application/json" \
      -d '{
        "repository_url": "https://github.com/Wirasm/dylan.git",
        "sandbox_type": "git_branch",
        "workflow_type": "agent_workflow_plan",
        "user_request": "Add a test feature for real-time step tracking validation"
      }' | jq -r '.agent_work_order_id')
    echo "Created work order: $WORK_ORDER_ID"
    
  • Immediately start polling for steps (in a loop or manually):
    # Poll every 3 seconds to observe real-time progress
    for i in {1..60}; do
      echo "=== Poll $i ($(date +%H:%M:%S)) ==="
      curl -s http://localhost:8888/agent-work-orders/$WORK_ORDER_ID/steps | jq '.steps | length'
      curl -s http://localhost:8888/agent-work-orders/$WORK_ORDER_ID/steps | jq '.steps[-1] | {step: .step, agent: .agent_name, success: .success}'
      sleep 3
    done
    
  • Observe that step count increases incrementally: 0 → 1 → 2 → 3 → 4 → 5 → 6 → 7
  • Verify each step appears immediately after completion (not all at once at the end)
  • Verify you can see progress in real-time
  • Check final status: curl http://localhost:8888/agent-work-orders/$WORK_ORDER_ID | jq '{status: .status, steps_completed: (.git_commit_count // 0)}'
  • Stop the server: pkill -f "uvicorn.*8888"

Document the Improvement

  • Open PRPs/specs/agent-work-orders-mvp-v2.md (or relevant spec file)
  • Add a note in the Observability or Implementation Notes section:
    ### Real-Time Progress Tracking
    
    Step history is saved incrementally after each workflow step completes, enabling
    real-time progress visibility via the `/agent-work-orders/{id}/steps` endpoint.
    This allows users to monitor execution as it happens rather than waiting for the
    entire workflow to complete.
    
    Implementation: `save_step_history()` is called after each `steps.append()` in
    the workflow orchestrator, providing immediate feedback to polling clients.
    
  • Save the file

Run Final Validation Commands

  • Execute all validation commands listed in the Validation Commands section below
  • Ensure every command executes successfully
  • Verify zero regressions across the entire codebase
  • Confirm real-time progress tracking works end-to-end

Testing Strategy

Unit Tests

Workflow Orchestrator Tests:

  • Test that save_step_history() is called after each workflow step
  • Test that step history is saved 7+ times during successful execution (once per step + final save)
  • Test that step count increases incrementally (1, 2, 3, 4, 5, 6, 7)
  • Test that step history is saved even when workflow fails mid-execution
  • Test that each save contains all steps completed up to that point

State Repository Tests:

  • Test that save_step_history() handles concurrent calls safely (already implemented with asyncio.Lock)
  • Test that retrieving step history returns the most recently saved version
  • Test that step history can be saved and retrieved multiple times for same work order
  • Test that step history overwrites the previous version rather than appending (see the sketch after this list)
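
A sketch of that overwrite check, written in the same style as the tests above and assuming the module-level pytest imports already present in the test file:

    @pytest.mark.asyncio
    async def test_save_step_history_overwrites_previous_snapshot():
        """Each save should replace the stored snapshot, not accumulate duplicates"""
        from src.agent_work_orders.models import StepExecutionResult, StepHistory, WorkflowStep
        from src.agent_work_orders.state_manager.work_order_repository import WorkOrderRepository

        repo = WorkOrderRepository()
        history = StepHistory(agent_work_order_id="wo-overwrite")

        history.steps.append(StepExecutionResult(
            step=WorkflowStep.CLASSIFY,
            agent_name="classifier",
            success=True,
            output="/feature",
            duration_seconds=1.0,
        ))
        await repo.save_step_history("wo-overwrite", history)

        history.steps.append(StepExecutionResult(
            step=WorkflowStep.PLAN,
            agent_name="planner",
            success=True,
            output="Plan created",
            duration_seconds=2.0,
        ))
        await repo.save_step_history("wo-overwrite", history)

        retrieved = await repo.get_step_history("wo-overwrite")
        # Two steps total, not three: the second save replaced the first snapshot
        assert len(retrieved.steps) == 2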

Integration Tests

End-to-End Workflow Tests:

  • Test that step history can be retrieved via API during workflow execution
  • Test that polling /agent-work-orders/{id}/steps shows progressive updates
  • Test that step history contains correct number of steps at each save point
  • Test that step history is accessible immediately after each step completes
  • Test that failed steps are visible in step history before workflow terminates

API Integration Tests:

  • Test GET /agent-work-orders/{id}/steps returns empty array before first step
  • Test GET /agent-work-orders/{id}/steps returns 1 step after classification
  • Test GET /agent-work-orders/{id}/steps returns N steps after N steps complete
  • Test GET /agent-work-orders/{id}/steps returns complete history after workflow finishes

Edge Cases

Concurrent Access:

  • Multiple clients polling /agent-work-orders/{id}/steps simultaneously
  • Step history being saved while another request reads it (handled by asyncio.Lock)
  • Workflow fails while client is retrieving step history

Performance:

  • Large step history (7 steps * 100+ lines each) saved multiple times
  • Multiple work orders executing simultaneously with incremental saves
  • High polling frequency (1 second intervals) during workflow execution

Failure Scenarios:

  • Step history save fails (network/disk error) - the workflow should continue (a hedged guard sketch follows this list)
  • Step history is saved but retrieval fails - should return appropriate error
  • Workflow interrupted mid-execution - partial step history should be preserved
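
For the first scenario, a sketch of a best-effort guard that keeps the workflow alive when an incremental save fails (hypothetical helper name; the real orchestrator's error-handling conventions may differ):

    # Hypothetical best-effort wrapper: a failed incremental save is logged and
    # swallowed so an observability hiccup never aborts the workflow itself.
    import logging

    logger = logging.getLogger(__name__)


    async def save_step_history_safely(state_repository, agent_work_order_id, step_history) -> None:
        try:
            await state_repository.save_step_history(agent_work_order_id, step_history)
        except Exception:
            # The end-of-workflow and error-handler saves still provide a fallback
            logger.exception("Incremental step history save failed for %s", agent_work_order_id)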

Acceptance Criteria

Core Functionality:

  • Step history is saved after each workflow step completes
  • Step history is saved at least 7 times during successful workflow execution (once per step, plus the preserved end-of-workflow save)
  • Each incremental save contains all steps completed up to that point
  • Step history is accessible via API immediately after each step
  • Real-time progress visible when polling /agent-work-orders/{id}/steps

Backward Compatibility:

  • All existing tests pass without modification
  • API behavior unchanged (same endpoints, same response format)
  • No breaking changes to models or state repository
  • Performance impact negligible (save operations are fast)

Testing:

  • New unit test verifies incremental saves occur
  • New integration test verifies step history visibility during execution
  • All existing workflow engine tests pass
  • All agent work orders tests pass
  • Manual end-to-end test confirms real-time progress tracking

Code Quality:

  • Type checking passes (mypy)
  • Linting passes (ruff)
  • Code follows existing patterns and conventions
  • Structured logging used for save operations

Documentation:

  • Implementation documented in spec file
  • Acceptance criteria met and verified
  • Validation commands executed successfully

Validation Commands

Execute every command to validate the feature works correctly with zero regressions.

# Unit Tests - Verify incremental saves
cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py::test_orchestrator_saves_step_history_incrementally -v
cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py::test_step_history_visible_during_execution -v

# Workflow Engine Tests - Ensure no regressions
cd python && uv run pytest tests/agent_work_orders/test_workflow_engine.py -v

# State Manager Tests - Verify save_step_history works correctly
cd python && uv run pytest tests/agent_work_orders/test_state_manager.py -v

# API Tests - Ensure steps endpoint still works
cd python && uv run pytest tests/agent_work_orders/test_api.py -v

# Complete Agent Work Orders Test Suite
cd python && uv run pytest tests/agent_work_orders/ -v --tb=short

# Type Checking
cd python && uv run mypy src/agent_work_orders/workflow_engine/workflow_orchestrator.py
cd python && uv run mypy src/agent_work_orders/

# Linting
cd python && uv run ruff check src/agent_work_orders/workflow_engine/workflow_orchestrator.py
cd python && uv run ruff check src/agent_work_orders/

# Full Backend Test Suite (zero regressions)
cd python && uv run pytest

# Manual End-to-End Validation
cd python && uv run uvicorn src.agent_work_orders.main:app --port 8888 &
sleep 5
curl http://localhost:8888/health | jq

# Create work order
WORK_ORDER_ID=$(curl -s -X POST http://localhost:8888/agent-work-orders \
  -H "Content-Type: application/json" \
  -d '{"repository_url":"https://github.com/Wirasm/dylan.git","sandbox_type":"git_branch","workflow_type":"agent_workflow_plan","user_request":"Test real-time progress"}' \
  | jq -r '.agent_work_order_id')

echo "Work Order: $WORK_ORDER_ID"

# Poll for real-time progress (observe step count increase: 0->1->2->3->4->5->6->7)
for i in {1..30}; do
  STEP_COUNT=$(curl -s http://localhost:8888/agent-work-orders/$WORK_ORDER_ID/steps | jq '.steps | length')
  LAST_STEP=$(curl -s http://localhost:8888/agent-work-orders/$WORK_ORDER_ID/steps | jq -r '.steps[-1].step // "none"')
  echo "Poll $i: $STEP_COUNT steps completed, last: $LAST_STEP"
  sleep 3
done

# Verify final state
curl http://localhost:8888/agent-work-orders/$WORK_ORDER_ID | jq '{status: .status}'
curl http://localhost:8888/agent-work-orders/$WORK_ORDER_ID/steps | jq '.steps | length'

# Cleanup
pkill -f "uvicorn.*8888"

Notes

Performance Considerations

Save Operation Performance:

  • save_step_history() is a fast in-memory operation (Phase 1 MVP)
  • Uses asyncio.Lock to prevent race conditions
  • No network I/O or disk writes in current implementation
  • Future Supabase migration (Phase 2) will add network latency but async execution prevents blocking

Impact Analysis:

  • Adding 7 incremental saves adds ~7ms total overhead (1ms per save in-memory)
  • This is negligible compared to agent execution time (30-60 seconds per step)
  • Total workflow time increase: <0.01% (unmeasurable)
  • Trade-off: Tiny performance cost for massive observability improvement

Why This Fix is Critical

User Experience Impact:

  • Before: Black-box execution with 3-5 minute wait, zero feedback
  • After: Real-time progress updates every 30-60 seconds as steps complete

Debugging Benefits:

  • Immediately see which step failed without waiting for entire workflow
  • Monitor long-running implementation steps for progress
  • Identify bottlenecks in workflow execution

API Efficiency:

  • Clients still poll every 3 seconds, but now get meaningful updates
  • Reduces frustrated users refreshing pages or restarting work orders
  • Enables progress bars, step indicators, and real-time status UIs

Implementation Simplicity

This is one of the simplest high-value features to implement:

  • 7 lines of code (one await save_step_history() call per step)
  • Zero API changes (existing endpoint already works)
  • Zero model changes (StepHistory already supports this pattern)
  • Zero state repository changes (save_step_history() already thread-safe)
  • High impact (transforms user experience from frustrating to delightful)

Future Enhancements

Phase 2 - Supabase Persistence:

  • When migrating to Supabase, the same incremental save pattern works
  • May want to batch saves (every 2-3 steps) to reduce DB writes (a sketch of this idea follows below)
  • Consider write-through cache for high-frequency polling
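
If batching is adopted, the idea might look like the hypothetical wrapper below; the batch size and flush rules would need tuning against real polling behavior:

    # Hypothetical batching wrapper for Phase 2: persist every N appended steps,
    # with an explicit flush at workflow end, to reduce database writes.
    class BatchedStepHistorySaver:
        def __init__(self, state_repository, batch_size: int = 2) -> None:
            self._repo = state_repository
            self._batch_size = batch_size
            self._pending = 0

        async def on_step_appended(self, work_order_id: str, step_history) -> None:
            self._pending += 1
            if self._pending >= self._batch_size:
                await self._repo.save_step_history(work_order_id, step_history)
                self._pending = 0

        async def flush(self, work_order_id: str, step_history) -> None:
            # Always called from the final save and the error handler
            if self._pending:
                await self._repo.save_step_history(work_order_id, step_history)
                self._pending = 0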

Phase 3 - WebSocket Support:

  • Instead of polling, push step updates via WebSocket
  • Even better real-time experience with lower latency
  • Incremental saves still required as source of truth

Advanced Observability:

  • Add step timing metrics (time between saves = step duration)
  • Track which steps consistently take longest
  • Alert on unusually slow step execution
  • Historical analysis of workflow performance

Testing Philosophy

Focus on Real-Time Visibility:

  • Primary test: verify saves occur after each step (not just at end)
  • Secondary test: verify step count progression (1, 2, 3, 4, 5, 6, 7)
  • Integration test: confirm API returns incremental results during execution
  • Manual test: observe real progress while workflow runs

Regression Prevention:

  • All existing tests must pass unchanged
  • No API contract changes
  • No model changes
  • Performance impact negligible and measured

Relevant Documentation

  • Agent Work Orders MVP v2 Spec: PRPs/specs/agent-work-orders-mvp-v2.md
  • Atomic Workflow Execution: PRPs/specs/atomic-workflow-execution-refactor.md
  • PRD: PRPs/PRD.md