Recipe — Multi-tool research agent
A user asks your product analytics assistant: “How many days of runway do we have if our balance is the number I gave finance last quarter, and we’re burning the rate shown on our latest investor update?” Answering it takes three different tools — a database lookup for the recorded balance, a web search for the public investor-update figure, and a calculator to divide one by the other. The interesting part isn’t any single tool; it’s watching the agent decide which to call, in what order, and surfacing that decision stream to the UI as it happens.
This recipe builds that agent and wires up the agent.tool.* event stream so you can render “Searching the web…”, “Looking up the ledger…”, “Calculating…” in real time.
yarn add @warlock.js/ai @warlock.js/ai-openai @warlock.js/sealThe three tools
Section titled “The three tools”Each tool declares an action string (or a function returning one) — that’s the human-facing “what is it doing right now” label the framework resolves and puts on the tool-lifecycle events for your UI. The description is what the model reads; action is what your user reads.
import { ai } from "@warlock.js/ai";import { v } from "@warlock.js/seal";import { searchClient, ledgerRepo } from "./services";
const webSearchTool = ai.tool({ name: "web_search", description: "Search the public web for a fact and return the top snippets.", action: ({ query }) => `Searching the web for "${query}"`, input: v.object({ query: v.string() }), execute: async ({ query }) => { const hits = await searchClient.search(query, { limit: 3 }); return { results: hits.map((h) => ({ title: h.title, snippet: h.snippet, url: h.url })) }; },});
const dbLookupTool = ai.tool({ name: "db_lookup", description: "Look up a recorded financial metric for a given quarter. Metrics: balance, burn_rate.", action: ({ metric, quarter }) => `Looking up ${metric} for ${quarter}`, input: v.object({ metric: v.enum(["balance", "burn_rate"]), quarter: v.string(), }), execute: async ({ metric, quarter }) => { const row = await ledgerRepo.metric(metric, quarter); if (!row) return { found: false }; return { found: true, value: row.value, unit: row.unit }; },});
const calculatorTool = ai.tool({ name: "calculator", description: "Evaluate a basic arithmetic expression and return the numeric result.", action: ({ expression }) => `Calculating ${expression}`, input: v.object({ expression: v.string() }), execute: async ({ expression }) => { // A real implementation uses a safe expression evaluator, not eval(). const result = safeEval(expression); return { result }; },});The agent
Section titled “The agent”import { OpenAISDK } from "@warlock.js/ai-openai";
const openai = new OpenAISDK({ apiKey: process.env.OPENAI_API_KEY! });
const researchAgent = ai.agent({ name: "research-assistant", model: openai.model({ name: "gpt-4o" }), systemPrompt: ai.systemPrompt() .persona("You are a precise financial research assistant.") .instruction("Use db_lookup for any figure we recorded internally, and web_search for any public figure.") .instruction("Never do arithmetic in your head — always call calculator for the final number.") .instruction("State which source each number came from."), tools: [webSearchTool, dbLookupTool, calculatorTool], maxTrips: 8,});maxTrips: 8 gives the agent room for a few tool round-trips (lookup → search → calculate → answer) without letting a confused loop run forever.
Surface the tool-call event stream
Section titled “Surface the tool-call event stream”Subscribe to the agent.tool.* events to drive a live UI. agent.tool.calling fires when the model requests a tool (carries the resolved action label); agent.tool.called fires when it finishes (carries the output, duration, and status); agent.tool.failed fires when it errors.
const question = "How many days of runway do we have if our balance is the Q2-2026 figure on file, " + "and burn is the monthly rate from our latest investor update?";
const { text, report, error } = await researchAgent.execute(question, { on: { "agent.tool.calling": ({ tool, tripIndex }) => { // tool.action is the pre-resolved human label, e.g. 'Searching the web for "..."' console.log(`[trip ${tripIndex}] ${tool.action ?? tool.name}…`); }, "agent.tool.called": (call) => { // `call` is the full ToolCall record + a `tool` meta object. console.log(` ↳ ${call.name} ${call.status} in ${call.duration}ms`); }, "agent.tool.failed": ({ tool, error }) => { console.log(` ↳ ${tool.name} failed: ${error.message}`); }, },});
if (error) { console.error(`research failed: ${error.message}`);} else { console.log("\nAnswer:", text);}A representative run prints something like:
[trip 0] Looking up balance for Q2-2026… ↳ db_lookup completed in 12ms[trip 1] Searching the web for "Acme investor update monthly burn rate 2026"… ↳ web_search completed in 430ms[trip 2] Calculating 1840000 / 95000… ↳ calculator completed in 1ms
Answer: Your balance on file for Q2-2026 is $1,840,000 (source: internal ledger).The latest investor update lists a monthly burn of $95,000 (source: web).That's about 19.4 days... — i.e. roughly 19 months of runway.Walk the tool calls after the fact
Section titled “Walk the tool calls after the fact”The same data is on the report tree once the run finishes — useful for audit logs, not just live UI. Tool dispatches are child nodes on report.children; filter by type === "tool" to isolate the leaf tool calls.
import type { ToolCall } from "@warlock.js/ai";
const toolCalls = report.children.filter((c): c is ToolCall => c.type === "tool");
for (const call of toolCalls) { console.log( `${call.name} (trip ${call.tripIndex}) ` + `input=${JSON.stringify(call.input)} → ${JSON.stringify(call.output)} ` + `[${call.status}, ${call.duration}ms]`, );}When a tool fails mid-research
Section titled “When a tool fails mid-research”If web_search throws (network blip, rate limit), the agent doesn’t crash. It records the error on that ToolCall, appends a tool-result message telling the model the call failed, and keeps looping up to maxTrips. The model can retry with a different query or fall back to stating what it couldn’t verify.
execute: async ({ query }) => { const hits = await searchClient.search(query, { limit: 3 }); if (hits.length === 0) { // Returning an empty-but-valid result is gentler than throwing — // the model reads "no results" and adapts instead of seeing an error. return { results: [], note: "no results found" }; } return { results: hits.map((h) => ({ title: h.title, snippet: h.snippet, url: h.url })) };},Inspect any failure post-hoc on the failed tool node’s error — find it via report.children.filter((c) => c.type === "tool").
Production notes
Section titled “Production notes”actionis for humans,descriptionis for the model. Keepdescriptionprecise enough that the model picks the right tool; keepactionshort and present-tense for the UI. The framework resolves a function-formactionagainst the validated input at emit time, so consumers receive a plain string.agent.tool.calledpayload is theToolCallitself plus atoolmeta. It carriesstatus,duration,input, andoutput— everything you need for both a live spinner and a persisted audit row, in one event.- Bound the loop with
maxTrips. A research agent that can call three tools repeatedly is exactly the shape that benefits from a cap. If the model exhausts it while still requesting tools, the run ends with anAgentMaxTripsErroronresult.error— distinguishable from a real answer. - Prefer graceful tool results over throws for recoverable cases. “No results” or “not found” as data lets the model adapt within the same run; reserve throwing for genuine failures the model can’t reason around.
- The live events and the final report tree are the same data, different timing. Drive the UI from
agent.tool.*events during the run; reconcile and persist fromreport.children(filtered totype === "tool") after it. You don’t need to buffer events yourself to get the final picture.