Loop Execution Evaluator

Name: Loop Execution Evaluator
Author: Ibrahim-3d


Evaluate-Loop Step 4: EVALUATE EXECUTION. This is the dispatcher agent — it determines the track type and invokes the correct specialized evaluator. Does NOT run a generic checklist. Instead dispatches to: eval-ui-ux (screens/design), eval-code-quality (features/infrastructure), eval-integration (APIs/auth/payments), eval-business-logic (generator/rules/state). Triggered by: 'evaluate execution', 'review implementation', 'check build', '/phase-review'. Always runs after loop-executor.


Added 5/27/2026

Tags: data-ai, shell, api, database, documentation

Works with

api

Security Analysis

A (100/100)

Scanned 5/27/2026

Install via CLI

$ openskills install Ibrahim-3d/orchestrator-supaconductor

Files

SKILL.md

---
name: loop-execution-evaluator
description: "Evaluate-Loop Step 4: EVALUATE EXECUTION. This is the dispatcher agent — it determines the track type and invokes the correct specialized evaluator. Does NOT run a generic checklist. Instead dispatches to: eval-ui-ux (screens/design), eval-code-quality (features/infrastructure), eval-integration (APIs/auth/payments), eval-business-logic (generator/rules/state). Triggered by: 'evaluate execution', 'review implementation', 'check build', '/phase-review'. Always runs after loop-executor."
---

# Loop Execution Evaluator — Step 4: Dispatcher

This agent does NOT evaluate directly. It determines the track type and dispatches the correct specialized evaluator.

## Why Specialized Evaluators?

Different track types need fundamentally different checks:
- A **UI track** needs design system adherence, visual consistency, responsive checks
- A **feature track** needs build integrity, type safety, code patterns
- An **integration track** needs API contracts, auth flows, error recovery
- A **business logic track** needs product rules, edge cases, state transitions

A generic checklist misses critical issues specific to each type.

## Dispatch Logic

Use `read_file` on the track's `metadata.json` and `spec.md` to determine the track type, then dispatch:

| Track Type | Keywords in spec/metadata | Evaluator |
|-----------|--------------------------|-----------|
| UI / Design | "screen", "component", "design system", "layout", "visual", "UI shell" | `eval-ui-ux` |
| Feature / Code | "implement", "feature", "refactor", "infrastructure", "hook", "store" | `eval-code-quality` |
| Integration | "Supabase", "Stripe", "Gemini", "API", "auth", "database", "webhook" | `eval-integration` |
| Business Logic | "generation", "lock", "dependency", "pricing", "tier", "pipeline", "download" | `eval-business-logic` |

### Multi-Type Tracks

Some tracks need multiple evaluators. For example:
- A generator logic track → `eval-business-logic` + `eval-code-quality`
- An auth/DB integration track → `eval-integration` + `eval-code-quality`
- A UI shell track → `eval-ui-ux` only

When multiple evaluators apply, run them all. The track passes only if ALL evaluators pass.
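
The keyword table and the multi-type rule above can be sketched as a small lookup. This is a minimal illustration, not the skill's actual matching logic: the function name `select_evaluators` and the choice to match case-insensitively at the start of a word are assumptions; only the keywords and evaluator names come from the table.

```python
import re

# Keywords copied from the dispatch table; everything else here is hypothetical.
KEYWORDS = {
    "eval-ui-ux": ["screen", "component", "design system", "layout", "visual", "UI shell"],
    "eval-code-quality": ["implement", "feature", "refactor", "infrastructure", "hook", "store"],
    "eval-integration": ["Supabase", "Stripe", "Gemini", "API", "auth", "database", "webhook"],
    "eval-business-logic": ["generation", "lock", "dependency", "pricing", "tier", "pipeline", "download"],
}


def select_evaluators(text: str) -> list[str]:
    """Return every evaluator with at least one keyword found in the text."""
    selected = []
    for evaluator, words in KEYWORDS.items():
        for word in words:
            # Assumed rule: match at a word start, ignoring case, so "screens"
            # matches "screen" but "webhook" does not trigger "hook".
            if re.search(rf"\b{re.escape(word)}", text, re.IGNORECASE):
                selected.append(evaluator)
                break
    return selected


if __name__ == "__main__":
    print(select_evaluators("Add Stripe webhook handling for pricing tier upgrades"))
```

Returning a list rather than a single name keeps multi-type tracks as the normal case: every evaluator in the list is run.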

## Dispatch Workflow

```
1. read_file track metadata.json + spec.md
2. Determine track type(s)
3. Dispatch evaluator(s):
   → eval-ui-ux         (if UI track)
   → eval-code-quality   (if code/feature track)
   → eval-integration    (if integration track)
   → eval-business-logic (if logic track)
4. Collect results from all dispatched evaluators
5. Aggregate into final verdict
```
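
Steps 3 to 5 of the workflow can be sketched as follows. The specialized evaluators are agents, so the plain callables used here are stand-ins; `run_dispatch`, the convention that an evaluator returns a list of issue strings, and treating an empty dispatch as FAIL are all assumptions made for illustration. The result keys mirror the `evaluators_run` and `failure_items` fields used in the metadata examples.

```python
from typing import Callable


def run_dispatch(evaluators: dict[str, Callable[[], list[str]]]) -> dict:
    """Run each dispatched evaluator and aggregate into one verdict."""
    results = []
    for name, evaluate in evaluators.items():
        issues = evaluate()  # assumed convention: empty list means PASS
        results.append({
            "evaluator": name,
            "verdict": "FAIL" if issues else "PASS",
            "issues": issues,
        })
    # The track passes only if ALL evaluators pass.
    # Assumption: dispatching no evaluator at all counts as FAIL.
    passed = bool(results) and all(r["verdict"] == "PASS" for r in results)
    return {
        "verdict": "PASS" if passed else "FAIL",
        "evaluators_run": results,
        "failure_items": [issue for r in results for issue in r["issues"]],
    }


if __name__ == "__main__":
    report = run_dispatch({
        "eval-code-quality": lambda: [],
        "eval-business-logic": lambda: ["Business rule violation found"],
    })
    print(report["verdict"], report["failure_items"])
```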

## Structural Checks (Always Run)

Regardless of track type, always verify these baseline checks:

| Check | Method |
|-------|--------|
| plan.md updated | All completed tasks marked `[x]` with commit SHA and summary |
| Scope alignment | No unplanned work added without documentation |
| No skipped tasks | All `[ ]` tasks either completed or documented as intentionally deferred |
| Build passes | `npm run build` exits 0 |
| Business docs in sync | If track made pricing/model/business decisions, verify docs are flagged for Step 5.5 sync |
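
The two plan.md checks can be approximated with a line scan. This is a rough sketch under stated assumptions: tasks are Markdown checkbox items (`- [ ]` / `- [x]`), a commit SHA is any 7 to 40 character lowercase hex token, and a deferred task contains the word "deferred". The real plan.md layout is not specified here, and `check_plan` is a hypothetical helper.

```python
import re

# Assumed formats; adjust to the plan.md conventions actually in use.
TASK = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s*(.*)$")
SHA = re.compile(r"\b[0-9a-f]{7,40}\b")


def check_plan(plan_text: str) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    for line in plan_text.splitlines():
        match = TASK.match(line)
        if not match:
            continue  # not a task line
        done = match.group(1).lower() == "x"
        body = match.group(2)
        if done and not SHA.search(body):
            problems.append(f"completed task missing commit SHA: {body}")
        elif not done and "deferred" not in body.lower():
            problems.append(f"open task not marked deferred: {body}")
    return problems


if __name__ == "__main__":
    plan = (
        "- [x] Add login form (a1b2c3d) wired to store\n"
        "- [x] Add logout button\n"
        "- [ ] Dark mode (deferred to next track)\n"
        "- [ ] Write tests\n"
    )
    for problem in check_plan(plan):
        print(problem)
```

The hex heuristic can misfire on ordinary words made only of the letters a to f, so treat a clean result as a hint rather than proof.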

### Business Doc Sync Check

If the track made any business-impacting changes, verify:
1. The executor's summary includes `Business Doc Sync Required: Yes`
2. Affected documents are listed

When both hold, the Conductor is flagged to run Step 5.5 (Business Doc Sync) before marking the track complete.

**What counts as business-impacting:**
- Pricing tier, price point, or feature list changes
- AI model, SDK, or cost structure changes
- New package or product tier additions
- Asset pipeline changes (add/remove/modify assets)
- Persona, GTM, or revenue assumption changes

See `${CLAUDE_PLUGIN_ROOT}/skills/business-docs-sync/SKILL.md` for the full registry.
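
The business doc sync verification can be sketched as a check over the executor's summary text. Only the `Business Doc Sync Required: Yes` marker comes from this document. The `Affected Documents:` heading followed by `- ` bullets, the file names in the example, and the helper name `business_sync_status` are hypothetical, chosen just to make the example concrete.

```python
import re


def business_sync_status(summary: str) -> dict:
    """Report whether sync is required and whether affected docs are listed."""
    flag = re.search(r"Business Doc Sync Required:\s*(Yes|No)", summary, re.IGNORECASE)
    required = bool(flag) and flag.group(1).lower() == "yes"

    docs = []
    in_list = False
    for line in summary.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("affected documents"):
            in_list = True  # assumed heading that opens the list
        elif in_list and stripped.startswith("- "):
            docs.append(stripped[2:].strip())
        elif in_list and stripped:
            in_list = False  # any other non-blank line ends the list

    # The check fails only when sync is required but no documents are listed.
    return {"required": required, "docs": docs, "ok": (not required) or bool(docs)}


if __name__ == "__main__":
    summary = (
        "Track done.\n"
        "Business Doc Sync Required: Yes\n"
        "Affected Documents:\n"
        "- pricing.md\n"
        "- gtm.md\n"
    )
    print(business_sync_status(summary))
```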

## Aggregated Verdict

```markdown
## Execution Evaluation Report

**Track**: [track-id]
**Evaluator**: loop-execution-evaluator (dispatcher)
**Date**: [YYYY-MM-DD]

### Evaluators Dispatched
| Evaluator | Reason | Verdict |
|-----------|--------|---------|
| eval-ui-ux | Track builds P0 screens | PASS ✅ / FAIL ❌ |
| eval-code-quality | Track implements features | PASS ✅ / FAIL ❌ |

### Structural Checks
- plan.md updated: YES / NO
- Scope alignment: YES / NO
- No skipped tasks: YES / NO
- Build passes: YES / NO
- Business doc sync needed: YES / NO (if YES, list affected docs)

### Final Verdict: PASS ✅ / FAIL ❌
All evaluators must PASS for the track to pass.

[If FAIL, aggregate all fix actions from all evaluators]
```

## Metadata Checkpoint Updates

The execution evaluator MUST update the track's `metadata.json` at key points:

### On Start
```json
{
  "loop_state": {
    "current_step": "EVALUATE_EXECUTION",
    "step_status": "IN_PROGRESS",
    "step_started_at": "[ISO timestamp]",
    "checkpoints": {
      "EVALUATE_EXECUTION": {
        "status": "IN_PROGRESS",
        "started_at": "[ISO timestamp]",
        "agent": "loop-execution-evaluator"
      }
    }
  }
}
```

### On PASS
```json
{
  "loop_state": {
    "current_step": "BUSINESS_SYNC",
    "step_status": "NOT_STARTED",
    "checkpoints": {
      "EVALUATE_EXECUTION": {
        "status": "PASSED",
        "completed_at": "[ISO timestamp]",
        "verdict": "PASS",
        "evaluators_run": [
          { "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
          { "evaluator": "eval-business-logic", "verdict": "PASS", "issues": [] }
        ],
        "business_sync_required": true
      },
      "BUSINESS_SYNC": {
        "status": "NOT_STARTED",
        "required": true
      }
    }
  }
}
```

### On FAIL
```json
{
  "loop_state": {
    "current_step": "FIX",
    "step_status": "NOT_STARTED",
    "checkpoints": {
      "EVALUATE_EXECUTION": {
        "status": "FAILED",
        "completed_at": "[ISO timestamp]",
        "verdict": "FAIL",
        "evaluators_run": [
          { "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
          { "evaluator": "eval-business-logic", "verdict": "FAIL", "issues": ["Business rule violation found"] }
        ],
        "failure_items": [
          "Fix business rule enforcement in resolver",
          "Add test coverage for edge case"
        ]
      },
      "FIX": {
        "status": "NOT_STARTED",
        "cycle": 1
      }
    }
  }
}
```

### Update Protocol
1. Use `read_file` to load the current `metadata.json`
2. Update `loop_state.checkpoints.EVALUATE_EXECUTION` with results
3. If PASS + business sync needed: Set `current_step` to `BUSINESS_SYNC`
4. If PASS + no sync needed: Set `current_step` to `COMPLETE`
5. If FAIL: Set `current_step` to `FIX`, increment `fix_cycle_count` in loop_state
6. Use `write_file` to save the result back to `metadata.json`
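
The protocol can be sketched as a read-modify-write over the JSON file. This is an illustrative version only: the agent itself works through `read_file` and `write_file`, while this sketch uses ordinary file I/O, and `record_verdict` with its parameters is a hypothetical name. It covers the fields named in the steps above and does not create the `BUSINESS_SYNC` or `FIX` checkpoint entries or set `step_status` as the JSON examples do.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_verdict(metadata_path, verdict, evaluators_run,
                   business_sync_required=False, failure_items=None):
    """Apply the update protocol to metadata.json and return the new contents."""
    path = Path(metadata_path)
    metadata = json.loads(path.read_text(encoding="utf-8"))  # step 1

    loop_state = metadata.setdefault("loop_state", {})
    checkpoints = loop_state.setdefault("checkpoints", {})
    checkpoint = checkpoints.setdefault("EVALUATE_EXECUTION", {})
    checkpoint.update({  # step 2
        "status": "PASSED" if verdict == "PASS" else "FAILED",
        "completed_at": datetime.now(timezone.utc).isoformat(),
        "verdict": verdict,
        "evaluators_run": evaluators_run,
    })

    if verdict == "PASS":  # steps 3 and 4
        checkpoint["business_sync_required"] = business_sync_required
        checkpoint.pop("failure_items", None)  # drop items from an earlier FAIL
        loop_state["current_step"] = "BUSINESS_SYNC" if business_sync_required else "COMPLETE"
    else:  # step 5
        checkpoint["failure_items"] = failure_items or []
        loop_state["current_step"] = "FIX"
        loop_state["fix_cycle_count"] = loop_state.get("fix_cycle_count", 0) + 1

    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")  # step 6
    return metadata
```

Using `setdefault` keeps fields written at the start of the step, such as `started_at` and `agent`, instead of replacing the whole checkpoint.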

## Handoff

- **ALL PASS + No Business Doc Sync** → Conductor marks track complete (Step 5)
- **ALL PASS + Business Doc Sync Needed** → Conductor runs Step 5.5 (Business Doc Sync) before marking complete
- **ANY FAIL** → Conductor dispatches `loop-fixer` with combined fix list
