Recipe — Classifier fast-path triage
Most support messages don’t need a multi-turn conversation between specialists — they need exactly one. “What’s your refund policy?” is a billing question, full stop. Running a full router loop for that is wasteful: every iteration is an extra LLM trip just to re-decide what was obvious from word one.
The supervisor’s classifier mode is the fast path. It runs once, on iteration zero, as a prelude: it classifies the input, the supervisor dispatches the single chosen intent, and — when there’s no router or route configured alongside it — the run terminates right after that intent settles. One classify call, one specialist call, done. This recipe builds a cheap triage that classifies an inbound message into billing, tech, or smalltalk and answers it in a single pass.
```bash
yarn add @warlock.js/ai @warlock.js/ai-openai @warlock.js/seal
```

```
OPENAI_API_KEY=sk-...
```

The classifier agent
A classifier agent must emit the framework’s locked classifier shape, `{ intent, reasoning?, confidence? }`, where `intent` is one of the supervisor’s intent keys. We declare that with a `v.enum` over the exact keys so the model can’t invent an off-list intent, and add `reasoning`/`confidence` as telemetry.
```ts
import { ai } from "@warlock.js/ai";
import { v } from "@warlock.js/seal";
import { OpenAISDK } from "@warlock.js/ai-openai";

const openai = new OpenAISDK({
  apiKey: process.env.OPENAI_API_KEY!,
  pricing: {
    "gpt-4o-mini": { input: 0.15, output: 0.6 },
  },
});

const model = openai.model({ name: "gpt-4o-mini" });

const classifyAgent = ai.agent({
  name: "classify",
  description: "Single-pass triage classifier.",
  model,
  output: v.object({
    intent: v.enum(["billing", "tech", "smalltalk"]),
    reasoning: v.string().optional(),
    confidence: v.number().optional(),
  }),
  systemPrompt: ai.systemPrompt()
    .persona("You triage inbound support messages.")
    .instruction("Pick the single best intent: `billing`, `tech`, or `smalltalk`.")
    .instruction("Set `confidence` between 0 and 1 for how sure you are."),
});
```

The specialists
Each intent produces its own `reply` slice. Because only one runs per fast-path execution, there’s no merge contention to design around.
```ts
const billingAgent = ai.agent({
  name: "billing",
  description: "Answers charges, refunds, invoices, and plan questions.",
  model,
  output: v.object({ reply: v.string() }),
  systemPrompt: ai.systemPrompt()
    .persona("You are a billing specialist.")
    .instruction("Answer the customer's billing question directly in one short paragraph."),
});

const techAgent = ai.agent({
  name: "tech",
  description: "Answers bugs, errors, outages, and login failures.",
  model,
  output: v.object({ reply: v.string() }),
  systemPrompt: ai.systemPrompt()
    .persona("You are a technical support engineer.")
    .instruction("Give the most likely fix or next diagnostic step."),
});

const smalltalkAgent = ai.agent({
  name: "smalltalk",
  description: "Handles greetings, thanks, and non-support chit-chat.",
  model,
  output: v.object({ reply: v.string() }),
  systemPrompt: ai.systemPrompt()
    .persona("You are a friendly support concierge.")
    .instruction("Reply warmly and briefly. Invite a real question if there isn't one."),
});
```

The supervisor
No router, no route, just `classifier`. That’s the whole fast-path contract: the classifier picks iteration zero’s intent, the supervisor dispatches it, and with nothing configured for iteration 1+, the run ends. We add a `refine` hook to demonstrate a deterministic safety net: when the model’s self-reported confidence is weak, fall back to `tech` (a human-reviewable path) rather than guessing.
```ts
const fastTriage = ai.supervisor<{ reply: string }, { reply?: string }>({
  name: "classifier-fast-path",
  goal: "Answer the message in a single pass by routing it to exactly one specialist.",
  intents: {
    billing: billingAgent,
    tech: techAgent,
    smalltalk: smalltalkAgent,
  },
  classifier: {
    agent: classifyAgent,
    refine: (ctx) => {
      const { intent, confidence } = ctx.result.data;

      // Low-confidence guard: send ambiguous messages to the tech path
      // (the one that escalates to a human) instead of trusting a coin flip.
      if ((confidence ?? 1) < 0.6) {
        return { intent: "tech" };
      }

      return undefined; // accept the classifier's own pick
    },
  },
  output: v.object({ reply: v.string() }),
});
```

Run it
```ts
// Wrapped in a function so the early `return` below is valid.
async function main() {
  const { data, error, usage, report } = await fastTriage.execute(
    "Hey, what's your refund window if I cancel mid-cycle?",
  );

  if (error) {
    console.error(`fast-path failed (${error.code}), terminated by ${report.terminatedBy}`);
    return;
  }

  console.log(data?.reply);

  // The classifier's forensic trail is on report.classifier.
  console.log(
    `classified as "${report.classifier?.intent}"`,
    report.classifier?.refined ? "(refined)" : "",
    `confidence ${report.classifier?.confidence ?? "n/a"}`,
  );

  console.log(
    `${report.iterations} iteration,`,
    `${usage.total} tokens, terminated by ${report.terminatedBy}`,
  );
}

main();
```

Expected flow for this input:
- Iteration 0, classify prelude: `classifyAgent` returns `{ intent: "billing", confidence: 0.9 }`. The `refine` hook sees confidence ≥ 0.6 and returns `undefined`, accepting the pick.
- Iteration 0, dispatch: the supervisor dispatches `billing`. `billingAgent` writes `{ reply }`.
- Terminate: no router/route is configured, so the run ends with `report.terminatedBy: "classifier"` and `report.iterations === 1`.
Two LLM trips total (classify + billing) — exactly the floor for a triage that still uses a model to decide.
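To make the trip count concrete, here is a minimal sketch of the same control flow in plain TypeScript. It does not use `@warlock.js/ai`: `runFastPath`, `stubClassify`, and `stubSpecialists` are hypothetical names invented for illustration, and the stubs stand in for real model calls so the sketch runs offline.

```typescript
// Hypothetical stand-ins: none of these names come from @warlock.js/ai.
type Intent = "billing" | "tech" | "smalltalk";

type Classification = { intent: Intent; confidence?: number };

type FastPathResult = {
  reply: string;
  intent: Intent;
  iterations: number;
  llmTrips: number;
};

async function runFastPath(
  input: string,
  classify: (input: string) => Promise<Classification>,
  specialists: Record<Intent, (input: string) => Promise<string>>,
): Promise<FastPathResult> {
  let llmTrips = 0;

  // Iteration 0, classify prelude: one call decides the intent.
  const { intent } = await classify(input);
  llmTrips += 1;

  // Iteration 0, dispatch: exactly one specialist runs.
  const reply = await specialists[intent](input);
  llmTrips += 1;

  // No router is modeled here, so there is no iteration 1: the run ends.
  return { reply, intent, iterations: 1, llmTrips };
}

// Stubbed "models" so the sketch runs without network access.
const stubClassify = async (): Promise<Classification> => ({
  intent: "billing",
  confidence: 0.9,
});

const stubSpecialists: Record<Intent, (input: string) => Promise<string>> = {
  billing: async () => "billing reply",
  tech: async () => "tech reply",
  smalltalk: async () => "smalltalk reply",
};
```

Calling `runFastPath(message, stubClassify, stubSpecialists)` resolves with `llmTrips` equal to 2 and `iterations` equal to 1, whichever intent the classifier picks.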
When the classifier is deterministic
If your triage signal is a keyword or a metadata flag, skip the LLM classifier entirely with the callback form. A pure-code classifier costs zero tokens:

```ts
classifier: {
  run: (ctx) => {
    const text = String(ctx.input).toLowerCase();

    if (text.includes("refund") || text.includes("charge") || text.includes("invoice")) {
      return { intent: "billing", confidence: 1 };
    }

    if (text.includes("error") || text.includes("won't load") || text.includes("login")) {
      return { intent: "tech", confidence: 1 };
    }

    return { intent: "smalltalk", confidence: 0.5 };
  },
},
```

Now the only LLM call in the whole run is the one specialist that actually answers.
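Because the callback is pure code, one option is to pull the keyword logic into a standalone function that can be unit-tested without a supervisor. The sketch below is one way to do that, under assumptions: `classifyByKeyword` and `KEYWORD_RULES` are hypothetical names, and the keyword lists simply mirror the ones in the callback above.

```typescript
type Intent = "billing" | "tech" | "smalltalk";
type Classification = { intent: Intent; confidence: number };

// Ordered rules: the first rule with a matching keyword wins.
const KEYWORD_RULES: ReadonlyArray<{ intent: Intent; keywords: readonly string[] }> = [
  { intent: "billing", keywords: ["refund", "charge", "invoice"] },
  { intent: "tech", keywords: ["error", "won't load", "login"] },
];

function classifyByKeyword(input: unknown): Classification {
  const text = String(input).toLowerCase();

  for (const rule of KEYWORD_RULES) {
    if (rule.keywords.some((keyword) => text.includes(keyword))) {
      return { intent: rule.intent, confidence: 1 };
    }
  }

  // Nothing matched: fall back to smalltalk with a deliberately low confidence.
  return { intent: "smalltalk", confidence: 0.5 };
}
```

A `run` callback could then delegate with `run: (ctx) => classifyByKeyword(ctx.input)`. Two properties of substring matching are worth keeping in mind: rule order decides ties (a message mentioning both an invoice and a login lands in `billing`), and a typographic apostrophe (`won’t load`) does not match the straight-apostrophe keyword.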
Production notes
- Fast-path vs. loop is a config switch, not a rewrite. The same `intents` map works with `classifier` alone (single pass), with `router`/`route` alone (multi-turn loop), or with both: when you configure a classifier and a router, the classifier drives iteration 0 and the router picks up from iteration 1+. Start with the classifier; add a router later only if you find messages that genuinely need more than one specialist. See the support-triage recipe for the full-loop version.
- Don’t gate control flow on raw `confidence`. LLM-reported confidence is poorly calibrated: a model that says `0.95` is wrong about as often as one that says `0.85`. Use it as a soft signal inside `refine` (combined with a deterministic check or a fallback intent), never as the sole branch condition.
- `refine` can also halt. Returning the bare `END` sentinel from `refine` terminates the run before any intent dispatches, which is useful for a policy gate (“if the message is abusive, stop here”). Returning `{ intent: END, ...slice }` halts after merging a slice into state, so a rejection reason can surface on `data`.
- `intent` must be a real key. The classifier’s (or `refine`’s) `intent` must match a key in `intents`, or the supervisor throws `SUPERVISOR_INVALID_ROUTE` at runtime. The `v.enum` over the intent keys catches this at the schema boundary before it ever reaches dispatch.
- Forensics live on `report.classifier`. `intent`, `reasoning`, `confidence`, the `refined`/`halted` flags, and the original pre-refine `raw` output are all captured there. Log them to spot drift between what the classifier picked and what actually resolved the ticket.
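As one possible reading of the advice on `confidence`, the sketch below pairs the soft signal with a deterministic keyword check and a fallback intent, so a low score alone never picks the destination. `refineIntent` and `mentionsBilling` are hypothetical helpers, not part of the framework; the `0.6` threshold and the `{ intent }`-or-`undefined` return shape are borrowed from the `refine` hook earlier in this recipe.

```typescript
type Intent = "billing" | "tech" | "smalltalk";
type Classification = { intent: Intent; confidence?: number };

// Hypothetical deterministic signal: billing vocabulary in the message.
function mentionsBilling(text: string): boolean {
  return /refund|charge|invoice/i.test(text);
}

// Returns an override, or undefined to accept the classifier's own pick.
function refineIntent(
  result: Classification,
  text: string,
): { intent: Intent } | undefined {
  // A missing confidence is treated as fully confident, as in the recipe's hook.
  if ((result.confidence ?? 1) >= 0.6) {
    return undefined;
  }

  // Low confidence only opens the question; the keyword check answers it,
  // with `tech` (the human-reviewable path) as the fallback.
  return { intent: mentionsBilling(text) ? "billing" : "tech" };
}
```

Keeping the decision in a pure function like this also makes the guard easy to test against logged `report.classifier` records before changing the threshold.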