A\
Lessons/Domain 1
1.112 min

Design and implement agentic loops for autonomous task execution

An agentic loop is the control flow that lets Claude use tools to complete a task without a human driving each step. In production you send a request, inspect the structured stop_reason, run any tools Claude asked for, feed the results back, and repeat until Claude signals it is done. Getting this loop right (driven by stop_reason, not by parsing text) is the difference between a reliable autonomous agent and one that stops early, loops forever, or misroutes work.

The agentic loop (one iteration = one API call) 1. POST messages[] + tools[] to Claude (messages.create) 2. Read the response stop_reason stop_reason == 'tool_use' ? end_turn: exit loop tool_use 3. YOUR code runs the requested tool(s) 4. Append assistant (tool_use) + user (tool_result) blocks to messages[], preserving tool_use_id repeat Claude picks the next tool from context. The loop never parses text to decide when to stop. An iteration cap is only a runaway-safety backstop, not the primary termination signal.
The agentic loop: each iteration is one API call. Continue while stop_reason is 'tool_use', executing tools and feeding results back into messages[]; exit when stop_reason is 'end_turn'.

The agentic loop is four repeating steps

An agentic loop is a plain while loop in your application code. Each pass is exactly one call to the Messages API. The four steps are: (1) send the current messages[] plus your tools[] definitions to Claude; (2) read the response and its stop_reason; (3) if Claude requested tools, execute them in your own code; (4) append both Claude's tool request and your tool results to messages[] and go back to step 1.

The critical mental model: Claude never runs your tools. It only asks for a tool by emitting a tool_use block containing an id, a name, and an input object. Your code is responsible for actually dispatching that call, capturing the result, and returning it. The model then reasons about that new information on the next iteration and decides what to do next.

messages = [{"role": "user", "content": user_request}]
while True:
    resp = client.messages.create(
        model="claude-sonnet-4-5", max_tokens=1024,
        tools=TOOLS, messages=messages,
    )
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason != "tool_use":
        break            # end_turn (or other terminal reason): done
    results = []
    for block in resp.content:
        if block.type == "tool_use":
            out = dispatch(block.name, block.input)
            results.append({"type": "tool_result",
                            "tool_use_id": block.id, "content": out})
    messages.append({"role": "user", "content": results})

stop_reason is the control signal, not the text

Every Messages API response carries a structured stop_reason field. This field, not the natural-language content, tells your loop what to do. Continue the loop while stop_reason == "tool_use"; terminate on any other value.

The two you must know cold are "tool_use" (Claude wants one or more tools executed before it can continue) and "end_turn" (Claude finished its turn and has nothing more to ask for, which is the authoritative "done" signal). Other terminal values you may encounter include "max_tokens" (the response hit the token cap mid-generation), "stop_sequence" (a configured stop string was produced), "pause_turn" (a long-running server tool turn was paused and can be continued), and "refusal". Your control flow should treat anything that is not "tool_use" as a reason to stop looping and inspect the result.

Because stop_reason is a discrete enum returned by the API, it is deterministic. That is exactly why you branch on it rather than on the prose Claude wrote.

How tool results re-enter the conversation

The loop only works if each iteration's tool results are appended to messages[] so the model can reason over them next time. The wire format is strict and testable.

After a tool_use response you append two messages in order. First, the assistant message whose content includes the tool_use block(s) Claude produced (simply pass back resp.content). Second, a user message whose content is a list of tool_result blocks. Each tool_result must reference the matching request via tool_use_id and carry the result in content. If a tool failed, set is_error: true on that block so Claude can react to the failure rather than treating the error text as data.

{"role": "user", "content": [
  {"type": "tool_result",
   "tool_use_id": "toolu_01A...",
   "content": "{\"customer_id\": \"C-4471\", \"verified\": true}"}
]}

Ordering and pairing matter: the assistant tool_use block must appear in history before its tool_result, and every tool_use_id must be answered. Omitting a result or mismatching an id produces a 400 error. When Claude emits several tool_use blocks in one response (parallel tool use), return all of their tool_result blocks in a single following user message.

Model-driven decisions vs pre-configured sequences

What makes the loop agentic is that Claude chooses the next tool based on everything accumulated in context, not because your code walked a fixed script. Given a customer request, the model might call get_customer, then decide from that result whether it next needs lookup_order or should escalate_to_human. You did not encode that branching; the model reasoned to it.

Contrast this with a pre-configured decision tree or a hardcoded tool sequence, where your application always calls tool A, then B, then C regardless of intermediate findings. That approach throws away the model's ability to adapt and breaks the moment a case does not fit the fixed path. The agentic loop deliberately delegates "which tool next" to Claude.

This does not mean you never enforce ordering. When a business rule requires a deterministic step (for example, verify identity before issuing a refund), you enforce it with programmatic gates or hooks (covered in task 1.4/1.5), not by abandoning the model-driven loop. The loop stays model-driven; guardrails constrain it where correctness demands certainty.

The Agent SDK harness vs rolling your own loop

At the raw Messages API level you write the loop yourself, as shown above. The Claude Agent SDK and Claude Code implement this same loop for you (often called the harness): they call the model, execute tools, append results, manage context, and repeat, all under the hood. Both layers use the identical stop_reason -> execute -> feed-back mechanics.

Knowing which layer you are on matters for the exam. If a question hands you low-level control (a messages.create call, raw stop_reason handling), you are building the loop directly. If it describes agents, subagents, or Claude Code slash commands, the harness is running the loop and you are configuring behavior around it. The underlying lifecycle is the same either way.

Either way, the model is the decision-maker inside the loop and your code (or the SDK) is the execution environment that keeps the loop turning until a terminal stop_reason.

Terminating correctly

The correct termination signal is stop_reason == "end_turn" (or another non-tool_use terminal value). When Claude has fully answered and needs no more tools, it ends its turn; your loop breaks and returns the final assistant content to the caller.

An iteration cap (for example, max_iterations = 25) belongs in production code, but only as a safety backstop against a runaway or oscillating loop. It is not how a healthy task ends. If your agent is regularly hitting the cap, that is a signal of a design problem (bad tool descriptions, missing information, or a task the model cannot complete), not a normal stop condition.

Do not infer completion from the prose. Claude can and does emit natural-language text alongside a tool_use block in the same response (for example, explaining what it is about to look up). Treating the presence of assistant text as "done" would terminate the loop while a tool call is still pending.

Anti-patterns to avoid

avoid
Parsing Claude's natural-language output for phrases like "task complete" or "I'm done" to decide when to stop looping.

Why it fails: Prose is non-deterministic. Claude may say "done" while still needing a tool, or finish without ever using that phrase, producing both premature termination and infinite loops. It also breaks across languages, phrasings, and model versions.

instead Branch on the structured `stop_reason` field: continue while it is `"tool_use"`, terminate on `"end_turn"` (or any other terminal value).

avoid
Using an arbitrary max-iterations counter as the primary mechanism for ending the loop.

Why it fails: A fixed cap either cuts off legitimately long tasks before they finish or lets a broken agent burn tokens right up to the limit. It masks design problems instead of ending the task at the right point.

instead Let `end_turn` be the primary terminator. Keep an iteration cap only as a runaway-safety backstop, and treat hitting it as an alarm to investigate, not a normal completion.

avoid
Treating the presence of any assistant text block as a completion indicator.

Why it fails: Claude routinely emits explanatory text in the same response as a `tool_use` block. If text presence means "done," the loop stops with a tool call still outstanding, so the tool never runs and the task stalls.

instead Ignore text content for control flow. Only `stop_reason` decides continuation; execute every `tool_use` block regardless of accompanying text.

avoid
Hardcoding a fixed tool-call sequence (a decision tree) in application code instead of letting the model choose the next tool.

Why it fails: A rigid A-then-B-then-C pipeline discards Claude's ability to adapt to intermediate findings and breaks on any case that does not match the scripted path, defeating the point of an agentic loop.

instead Let the model choose each next tool from accumulated context. Where a rule demands deterministic ordering, enforce that single step with a programmatic gate or hook, keeping the rest of the loop model-driven.

Worked example: A support agent resolving a damaged-item refund (Scenario 1)

You are building the Customer Support Resolution Agent from Scenario 1 on the Agent SDK, with MCP tools get_customer, lookup_order, process_refund, and escalate_to_human. A customer writes: "My order arrived smashed, I want a refund, my name is Dana Ruiz." Here is how the agentic loop drives the resolution.

Iteration 1. You send the customer message plus the four tool definitions. Claude reasons that it must verify identity first and responds with a tool_use block for get_customer and stop_reason: "tool_use". Because the reason is tool_use, the loop continues. Your code runs get_customer, which returns a single verified customer_id: "C-4471". You append Claude's assistant message (the tool_use) and a user message with the matching tool_result (carrying tool_use_id).

Iteration 2. You call the API again with the grown history. Now aware of the verified id, Claude emits a tool_use for lookup_order, still stop_reason: "tool_use". Your code returns the order record (status: delivered, photo evidence attached). You append the request and result.

Iteration 3. Claude decides the refund is warranted and issues a tool_use for process_refund. You execute it; the backend confirms the refund. You feed the result back.

Iteration 4. With the refund confirmed, Claude has nothing left to ask for. It returns a final message to the customer and stop_reason: "end_turn". Your loop breaks and returns that text.

while True:
    resp = client.messages.create(model=MODEL, max_tokens=1024,
                                  tools=TOOLS, messages=messages)
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason != "tool_use":
        return resp            # end_turn: hand final answer to caller
    tool_results = run_tools(resp.content)   # dispatch each tool_use
    messages.append({"role": "user", "content": tool_results})

Notice what the loop did not do: it never scanned Claude's wording for "refund processed" to decide it was finished, and it never followed a hardcoded get_customer -> lookup_order -> process_refund script. The model chose each tool from context, and end_turn ended the task. If identity had needed to be guaranteed before the refund, you would add a programmatic gate (task 1.4), not rewrite the loop to hardcode the order.

Exam tips

  • Continue the loop while `stop_reason == "tool_use"`; terminate on `"end_turn"` (and other non-tool_use terminal reasons). This single rule is the heart of task 1.1.
  • Claude requests tools; it never executes them. Your application code (or the Agent SDK harness) runs the tool and returns a `tool_result`.
  • Append order is strict: the assistant `tool_use` block must be in history before its `tool_result`, and every `tool_use_id` must be answered, or the API returns a 400.
  • Parallel tool use returns multiple `tool_use` blocks in one response; put all corresponding `tool_result` blocks in a single following user message.
  • An iteration cap is only a runaway-safety backstop, never the primary stop signal. Regularly hitting it means the design is broken.
  • Never parse assistant prose to detect completion, and never treat text-block presence as done: text can accompany a pending `tool_use`.
Official exam objectives for 1.1
Knowledge of
  • The agentic loop lifecycle: sending requests to Claude, inspecting stop_reason ("tool_use" vs "end_turn"), executing requested tools, and returning results for the next iteration
  • How tool results are appended to conversation history so the model can reason about the next action
  • The distinction between model-driven decision-making (Claude reasons about which tool to call next based on context) and pre-configured decision trees or tool sequences
Skills in
  • Implementing agentic loop control flow that continues when stop_reason is "tool_use" and terminates when stop_reason is "end_turn"
  • Adding tool results to conversation context between iterations so the model can incorporate new information into its reasoning
  • Avoiding anti-patterns such as parsing natural language signals to determine loop termination, setting arbitrary iteration caps as the primary stopping mechanism, or checking for assistant text content as a completion indicator

Flashcards from this lesson

After executing a tool, what two things must you append to messages[] and in what order?

First the assistant message containing the tool_use block, then a user message containing tool_result block(s) that reference the matching tool_use_id.

Claude returns three `tool_use` blocks in one response. How do you reply?

Execute all three and return all three tool_result blocks in a single following user message, each keyed to its tool_use_id.

Why is a max-iterations counter not a correct primary termination mechanism?

It cuts off long legitimate tasks or lets broken loops burn tokens to the limit. end_turn is the real done signal; the cap is only a runaway-safety backstop.

Why can't you treat assistant text as a 'done' indicator?

Claude can emit natural-language text alongside a tool_use block in the same response, so text presence would stop the loop while a tool call is still pending.

What distinguishes an agentic loop from a hardcoded tool pipeline?

In an agentic loop the model chooses each next tool from accumulated context (model-driven), rather than your code walking a fixed A-then-B-then-C decision tree.

In the loop, who actually executes a tool that Claude requests?

Your application code (or the Agent SDK harness). Claude only emits a tool_use block with an id, name, and input; it never runs the tool itself.

Which field decides whether the agentic loop continues or stops?

The response's stop_reason. Continue while it equals "tool_use"; terminate on "end_turn" or any other non-tool_use terminal value.

Study all flashcards with spaced repetition
Mark this lesson complete when you are confident.