Agentic Design Pattern: Tool Use

LLMs don't know today's weather. They don't know your inventory count, the current exchange rate, or what's in your database. Tool Use is the pattern that fixes that gap: you describe a function to the model, the model decides when it needs that function, calls it with structured arguments, and weaves the result back into its answer. The model stays the brain; the tools are its hands.

The hand-off

01
Describe
Tell the model what tools exist — name, description, argument schema.
02
Decide
The model reads the user request and either answers directly or asks to call a tool.
03
Execute
Your code runs the tool, returns the result back to the model.
04
Answer
The model writes the final response using the tool's output.

01
Describe
Tell the model what tools exist — name, description, argument schema.
02
Decide
The model reads the user request and either answers directly or asks to call a tool.
03
Execute
Your code runs the tool, returns the result back to the model.
04
Answer
The model writes the final response using the tool's output.

The model can chain tool calls until it has enough to answer

The trick isn't the calling — modern model SDKs hand you that. The trick is schema discipline: a tool with a fuzzy description or a loose argument schema is a tool the model will misuse.

When to reach for it

Use it

The answer depends on data the model can't have memorized (live, private, or fresh)
You can express the capability as a function with crisp inputs and outputs
You'd rather the model decide *when* to fetch than hard-code an "always fetch" pipeline

Skip it

The model already knows the answer reliably — don't add latency for no gain
The work isn't expressible as a function call (huge unstructured outputs, long-running jobs)
Latency or determinism matters more than flexibility — sometimes a hard-coded fetch is just better

The three roles

Tool author

Writes the function and its schema

System Prompt

You are designing a tool the model will call. Define:

- A clear name in snake_case (the model sees this).
- A one-sentence description of what the tool does and when to use it.
- An input schema (Zod, JSON Schema, or your SDK's equivalent) with tight types.
- A return shape the model can actually use — small, named fields, not raw blobs.

Rules:
- Description is for the model, not your teammates. Say "Returns current weather for a US city." not "Wraps wttr.in."
- Validate inputs at the boundary. Never trust the model's args.
- Return structured data, not prose. The model will write the prose.

Model

Decides whether to call, and with what arguments

System Prompt

You have access to tools. For each turn:

- If a tool is the right answer, call it with valid arguments matching the schema.
- If the tool result is enough, answer the user using it.
- If you need another tool call, make it.
- Stop when you have enough — do not narrate your reasoning to the user.

Hard rule: only call tools the system has told you about. Never invent a tool name.

Runtime

Executes calls, returns results, caps the loop

System Prompt

You are the orchestrator that sits between the model and the tools.

Responsibilities:
- Receive a tool call from the model — validate its arguments against the schema.
- Execute the underlying function with a timeout and an error budget.
- Return the result (or a typed error) back to the model in the next turn.
- Cap total steps (e.g., 3–5) so a confused model can't loop forever.

The model is not "calling your code" in the literal sense. It's emitting a structured request that says "please run getWeather(city='Austin', state='TX') and tell me what you got." Your runtime is what actually runs the function. That separation is the whole safety story — you decide what the model can do, with what arguments, and what happens if it goes wrong.

Failure modes

Vague tool description.
If the model can't tell *when* to use the tool from its description, it'll either skip it or call it for the wrong thing. Write the description like a one-line spec.
Loose argument schema.
Accepting `args: any` means the model can hand you garbage. Use a tight schema with enums where you have them — Zod, Pydantic, JSON Schema, pick one.
Raw blob returns.
Returning the full API response makes the model paraphrase pages of JSON. Return only the fields the user-facing answer needs.
No step cap.
A confused model can chain tool calls indefinitely. Set a max step count (3 is usually enough for a one-shot tool).
Trusting the model's args.
The model is a probabilistic text generator, not a validator. Re-validate every argument at the function boundary, every time.
Tool errors as model errors.
If wttr.in is down, don't show the user a stack trace — return a typed error to the model so it can apologize gracefully.

When to reach for it

The three roles

Failure modes

Dan’s AI Assistant

Hi! I’m Dan’s AI Assistant