Recipe — Review-panel supervisor
The scenario: a user submits a piece of content — a support-macro draft, a generated marketing blurb, a code snippet — and you want more than one model’s opinion before you ship it. A single reviewer is a single point of failure; a panel of independent reviewers, each judging the same submission without seeing the others’ verdicts, catches far more. Then a deterministic aggregator turns the panel’s verdicts into one decision: ship it, or send it back with the consolidated notes.
This recipe fans one reviewer agent out into three parallel branches with ai.fanOut, dispatches them all on iteration zero via a route callback, and aggregates their verdicts in a pure-code callback intent on iteration one — no extra LLM call to count votes.
yarn add @warlock.js/ai @warlock.js/ai-openai @warlock.js/sealOPENAI_API_KEY=sk-...The reviewer
Section titled “The reviewer”One reviewer agent, fanned out into N independent copies. Each copy runs the same prompt against the same submission but as a separate dispatch slot, so their verdicts are independent samples. The reviewer emits a structured verdict — approve/reject plus a one-line reason and a severity.
import { ai } from "@warlock.js/ai";import { v } from "@warlock.js/seal";import { OpenAISDK } from "@warlock.js/ai-openai";
const openai = new OpenAISDK({ apiKey: process.env.OPENAI_API_KEY!, pricing: { "gpt-4o-mini": { input: 0.15, output: 0.6 }, },});
const model = openai.model({ name: "gpt-4o-mini" });
const reviewerVerdict = v.object({ approved: v.boolean(), reason: v.string(), severity: v.enum(["none", "minor", "blocking"]),});
const reviewer = ai.agent({ name: "reviewer", description: "Independently reviews a submission and returns an approve/reject verdict.", model, // Slightly raised temperature so the parallel copies don't collapse to // identical answers — independent samples are the entire point of a panel. modelOptions: { temperature: 0.7 }, output: reviewerVerdict, systemPrompt: ai.systemPrompt() .persona("You are a strict but fair content reviewer.") .instruction("Judge ONLY the submission you are given. Do not assume other reviewers exist.") .instruction("Set `approved: false` and `severity: \"blocking\"` for policy violations, factual errors, or unsafe content.") .instruction("Use `severity: \"minor\"` for style nits that don't block shipping."),});The aggregator
Section titled “The aggregator”A callback intent — pure code, zero tokens. It reads the panel’s verdicts off the previous iteration’s snapshot (ctx.iterations[0].result), applies a majority rule with a hard blocking-veto override, and returns the consolidated decision as its state slice. We declare its output schema so only the decided keys merge into state.
import type { DispatchContext } from "@warlock.js/ai";
type PanelDecision = { decision: "approved" | "revise"; approvals: number; total: number; notes: string[];};
const aggregate = { description: "Tally the reviewer panel's verdicts into one decision.", run: (ctx: DispatchContext): PanelDecision => { // The fan-out branches all ran last iteration; their per-branch // outputs are keyed by intent name on that iteration's snapshot. const panel = ctx.iterations[0]?.result ?? {};
const verdicts = Object.values(panel) .map((branch) => branch.output as { approved: boolean; reason: string; severity: string } | undefined) .filter((verdict): verdict is NonNullable<typeof verdict> => verdict !== undefined);
const total = verdicts.length; const approvals = verdicts.filter((verdict) => verdict.approved).length; const hasBlocker = verdicts.some((verdict) => verdict.severity === "blocking");
// Majority approves AND nobody raised a blocker → ship it. const decision = approvals * 2 > total && !hasBlocker ? "approved" : "revise";
// Surface the dissenting / blocking reasons so the author knows what to fix. const notes = verdicts .filter((verdict) => !verdict.approved || verdict.severity === "blocking") .map((verdict) => verdict.reason);
return { decision, approvals, total, notes }; }, output: v.object({ decision: v.enum(["approved", "revise"]), approvals: v.number(), total: v.number(), notes: v.array(v.string()), }),};The supervisor
Section titled “The supervisor”route is deterministic here — there’s nothing for an LLM router to decide. Iteration 0 fans out to all three reviewer branches in parallel (Promise.all under the hood); iteration 1 runs the aggregator; then END. ai.fanOut(reviewer, 3) generates the keys reviewer1, reviewer2, reviewer3.
import { END } from "@warlock.js/ai";
type PanelState = { decision?: "approved" | "revise"; approvals?: number; total?: number; notes?: string[];};
const reviewPanel = ai.supervisor<PanelState, PanelState>({ name: "review-panel", goal: "Get three independent verdicts on the submission and decide whether to ship it.", intents: { ...ai.fanOut(reviewer, 3), // reviewer1, reviewer2, reviewer3 aggregate, }, route: (ctx) => { if (ctx.iteration === 0) { return ["reviewer1", "reviewer2", "reviewer3"]; }
if (ctx.iteration === 1) { return "aggregate"; }
return END; }, output: v.object({ decision: v.enum(["approved", "revise"]), approvals: v.number(), total: v.number(), notes: v.array(v.string()), }), maxIterations: 3,});Run it
Section titled “Run it”const submission = "Subject: Your account is suspended!! Click here within 24h to avoid permanent deletion.";
const { data, error, usage, report } = await reviewPanel.execute(submission);
if (error) { console.error(`panel failed (${error.code}), terminated by ${report.terminatedBy}`); return;}
if (data?.decision === "approved") { console.log(`shipped — ${data.approvals}/${data.total} reviewers approved`);} else { console.log(`needs revision (${data?.approvals}/${data?.total} approved):`); data?.notes.forEach((note) => console.log(` - ${note}`));}
// Per-branch forensics: each reviewer's verdict lives on iteration 0's snapshot.const panelSnapshot = report.snapshots[0]?.result ?? {};for (const [intent, branch] of Object.entries(panelSnapshot)) { const verdict = branch.output as { approved: boolean; severity: string }; console.log(`${intent}: ${verdict.approved ? "approve" : "reject"} (${verdict.severity})`);}
console.log(`${usage.total} tokens across the panel + aggregator`);Expected flow for this (clearly abusive) submission:
- Iteration 0 —
routereturns["reviewer1", "reviewer2", "reviewer3"]. All three dispatch in parallel; each independently flags the phishing pattern and returns{ approved: false, severity: "blocking" }. - Iteration 1 —
routereturns"aggregate". The callback reads the three verdicts offctx.iterations[0].result, sees a blocker, and returns{ decision: "revise", approvals: 0, total: 3, notes: [...] }. - Iteration 2 —
routereturnsEND; the run terminates withterminatedBy: "route", anddatais the validated decision.
Three reviewer trips (parallel, so latency is one trip’s worth) plus a zero-token aggregator.
Production notes
Section titled “Production notes”- Independence is the whole point — protect it. Each fan-out branch must judge the submission cold. Raise
temperatureso the copies actually diverge, and keep the reviewer prompt from referencing “the other reviewers” (it can’t see them anyway, but a prompt that implies a panel biases the sample). If every branch returns the same answer every time, you’re paying 3× for one opinion. - Aggregate in code, not in an LLM. Counting votes is deterministic — a callback intent does it for free and is trivially testable. Reach for an LLM aggregator only when the merge is genuinely judgmental (e.g. synthesizing conflicting prose), and even then prefer a
runcallback that calls one agent viactx.run(...)over adding another routed intent. - Veto beats majority for safety. A pure majority rule ships content two reviewers liked and one flagged as unsafe. The blocking-severity veto in this aggregator overrides the count — for moderation and code review, one credible blocker should outweigh a thin majority. Tune the rule to your risk tolerance.
- Fan-out branches share the underlying agent instance.
ai.fanOutclones the agent across distinct intent keys, not the agent object — every branch references the samereviewer. That’s intended (they’re independent dispatch slots, not independent configs); if you need genuinely different reviewers, list them as separate named intents instead of fanning one out. - The panel’s verdicts persist on the report. Iteration 0’s
snapshot.resultkeeps every branch’soutput,usage,duration, and any per-brancherror— a reviewer that errored doesn’t abort its siblings, so always checkbranch.errorbefore trusting a verdict when you tally.