Agentic Design Pattern: Reflection
Make your agent stop, critique itself, and rewrite — before it ships a sloppy answer.
Most LLM answers fail in a boring way: they're almost right. A missing constraint, a hand-wavy claim, a tone that's just off. Reflection is the fix. You make the model produce a draft, grade its own work against a rubric, and rewrite. That's the whole pattern.
- 01 Draft: Answer the question. Don't overthink it.
- 02 Critique: Grade the draft against a rubric. Be strict.
- 03 Revise: Fix the issues. Don't invent new ones.
You're not asking the model to be perfect on the first try. You're asking it to edit itself like a human would.
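The three steps can be sketched as a single function. This is a minimal illustration, not a reference implementation: the `LLM` callable signature, the short system prompts, and the `fake_llm` stand-in are all assumptions made so the example runs without a model provider.

```python
from typing import Callable

# Hypothetical signature: (system_prompt, user_message) -> model reply.
LLM = Callable[[str, str], str]


def reflect_once(llm: LLM, request: str, rubric: str) -> str:
    """Run one draft -> critique -> revise pass and return the revision."""
    draft = llm("Answer the request. Produce a complete draft.", request)
    critique = llm(
        f"Grade the draft against this rubric: {rubric}. Be strict.",
        draft,
    )
    revised = llm(
        "Fix only the issues named in the critique. Do not add new claims.",
        f"DRAFT:\n{draft}\n\nCRITIQUE:\n{critique}",
    )
    return revised


def fake_llm(system: str, user: str) -> str:
    """Stand-in model (assumption) that returns canned replies per step."""
    if system.startswith("Answer"):
        return "Refund takes 5 days."
    if system.startswith("Grade"):
        return "tone: FAIL, too curt for a customer email."
    return "Thanks for reaching out! Your refund should arrive within 5 business days."


result = reflect_once(fake_llm, "Reply to a refund request.", "clarity, tone")
print(result)
```

Swapping `fake_llm` for a real client call is the only change needed to try this against an actual model; the three-call shape stays the same.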
When to reach for it
Use it when:
- Output has hard constraints (format, policy, tone)
- "Mostly right" is still a failure: customer emails, requirements docs, code
- You can write a rubric for what "good" looks like
Skip it when:
- The work is low-stakes, such as brainstorming or casual writing
- Latency or cost matters more than polish
- You don't have a rubric (reflection without one is just vibes)
The three roles
You only need three. The same model can play all of them; what changes is the system prompt.
Drafter:
You are a writer. Produce a single, complete draft of the user's request.
Rules:
- No clarifying questions, no hedging, no apologies.
- Write as if this is your final answer.
- Stay within the format the user asked for.

Critic:
You are a strict editor. Score the draft against this rubric:
clarity, accuracy, completeness, tone.
For each criterion, return PASS or FAIL with one sentence of evidence.
"Mostly right" is FAIL. Be specific: name the line or claim that fails.
Return JSON only:
{
  "verdict": "PASS" | "FAIL",
  "issues": [{ "criterion": "...", "evidence": "..." }]
}

Reviser:
You are a rewriter. Apply the critic's issues to the draft.
Rules:
- Fix only what the critique names. Do not add new claims.
- Do not change sections the critique did not flag.
- Return the revised draft only, no commentary.
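Because the Critic is asked for JSON only, the calling code has to decide what to do when the reply is not valid JSON. One possible approach, sketched below, is to fail closed: anything that cannot be parsed into the expected shape is treated as a FAIL. The `Critique` dataclass and `parse_critique` helper are hypothetical names introduced for illustration, and the "format" criterion is an assumption, not part of the rubric above.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Critique:
    verdict: str
    issues: List[Dict[str, str]] = field(default_factory=list)


def _format_failure(evidence: str) -> Critique:
    # Assumption: malformed Critic output is reported as a synthetic issue.
    return Critique("FAIL", [{"criterion": "format", "evidence": evidence}])


def parse_critique(raw: str) -> Critique:
    """Parse the Critic's reply; treat anything malformed as FAIL."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _format_failure("Critic did not return valid JSON.")
    if not isinstance(data, dict) or data.get("verdict") not in ("PASS", "FAIL"):
        return _format_failure("Missing or invalid verdict.")
    issues = data.get("issues")
    if not isinstance(issues, list):
        issues = []
    return Critique(data["verdict"], issues)


ok = parse_critique('{"verdict": "PASS", "issues": []}')
bad = parse_critique("Looks good to me!")
print(ok.verdict, bad.verdict)  # PASS FAIL
```

Failing closed costs an extra revision when the Critic misbehaves, but it avoids shipping a draft that was never actually graded.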
Want the revised draft critiqued too? Loop it: feed the Reviser's output back to the Critic, stop when the verdict is PASS, and cap the loop at 2–3 iterations so a stubborn Critic can't spin forever.
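A capped loop along these lines might look like the sketch below. The role constants, the `LLM` callable signature, and the scripted `make_fake_llm` stand-in are assumptions for illustration; in practice the constants would hold the full system prompts and the Critic's reply would need more defensive parsing.

```python
import json
from typing import Callable, Tuple

# Hypothetical signature: (system_prompt, user_message) -> model reply.
LLM = Callable[[str, str], str]

# Placeholders standing in for the three full system prompts.
DRAFTER, CRITIC, REVISER = "drafter", "critic", "reviser"


def reflect_loop(llm: LLM, request: str, max_iters: int = 3) -> Tuple[str, bool]:
    """Critique and revise until the Critic passes or the cap is hit.

    Returns (final_draft, passed).
    """
    draft = llm(DRAFTER, request)
    for _ in range(max_iters):
        critique = json.loads(llm(CRITIC, draft))
        if critique["verdict"] == "PASS":
            return draft, True
        payload = json.dumps({"draft": draft, "issues": critique["issues"]})
        draft = llm(REVISER, payload)
    # Cap reached: return the last revision, flagged as not passed.
    return draft, False


def make_fake_llm() -> LLM:
    """Scripted stand-in (assumption): the Critic fails once, then passes."""
    critic_calls = 0

    def fake(system: str, user: str) -> str:
        nonlocal critic_calls
        if system == CRITIC:
            critic_calls += 1
            if critic_calls == 1:
                issue = {"criterion": "completeness", "evidence": "No deadline given."}
                return json.dumps({"verdict": "FAIL", "issues": [issue]})
            return json.dumps({"verdict": "PASS", "issues": []})
        if system == REVISER:
            return "Ship by Friday. Deadline: 5 pm."
        return "Ship by Friday."

    return fake


final, passed = reflect_loop(make_fake_llm(), "Write a one-line status update.")
print(final, passed)
```

Returning the `passed` flag alongside the draft lets the caller decide what to do with output that hit the cap, for example routing it to a human instead of shipping it.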
Failure modes
- No rubric. The Critic ships vibes instead of issues. Define "good" before you start the loop.
- No max iterations. A stubborn Critic spins forever. Cap it at 2–3; that's almost always enough.
- Critic too soft. Everything passes on iteration 1? Your rubric is too easy. Tighten it.
- Critic too harsh. Nothing ever passes, so you burn tokens and ship the last draft anyway. Add a "good enough" threshold.
- Reviser hallucinates. Tell it explicitly: fix the critique, don't add new claims.
- Cost blowup. Each pass is 2–3 LLM calls. Use this for write-once content, not real-time chat.
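Two of these guards are easy to express in code. The sketch below shows one possible "good enough" rule (accept a draft with at most `max_issues` open issues) and a worst-case call count for a loop that revises after every failed critique. Both helper names, the default threshold, and the counting model are assumptions for illustration; a real threshold would likely weight criteria rather than count them.

```python
from typing import Any, Dict


def good_enough(critique: Dict[str, Any], max_issues: int = 1) -> bool:
    """Accept on PASS, or on FAIL with few enough issues left."""
    if critique["verdict"] == "PASS":
        return True
    return len(critique["issues"]) <= max_issues


def worst_case_calls(max_iters: int) -> int:
    """One draft call, then one Critic and one Reviser call per iteration."""
    return 1 + 2 * max_iters


one_issue = {"verdict": "FAIL", "issues": [{"criterion": "tone", "evidence": "Curt."}]}
three_issues = {"verdict": "FAIL", "issues": [{}, {}, {}]}

print(good_enough(one_issue))     # True
print(good_enough(three_issues))  # False
print(worst_case_calls(3))        # 7
```

Under this counting model a cap of 3 means up to seven calls per request, which is the number to compare against a latency or cost budget.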