stop_reason: tool_use vs end_turn
stop_reason is the single most important field for driving an agent loop. It tells you why the model stopped generating on its last turn — and therefore what your code should do next. Reading it correctly is the difference between a loop that progresses and one that hangs or terminates early.
The values you must know
stop_reason | Meaning | What your loop does |
|---|---|---|
tool_use | The model emitted one or more tool_use blocks and is waiting for results | Execute the tools, append tool_result, call the API again |
end_turn | The model finished naturally; the response is the final answer | Stop the loop; return the text to the user |
max_tokens | Output hit max_tokens mid-generation | Continue (ask it to keep going) or raise the cap — the answer is incomplete |
stop_sequence | A custom stop sequence was hit | Handle per your protocol |
refusal | The model declined for safety reasons | Surface to the user; do not retry blindly |
pause_turn | A long-running server-side turn was paused (e.g. some server tools) | Resume by sending the response back |
The core distinction
The exam centers on tool_use vs end_turn:
A tool_use stop means continue the loop; end_turn means the loop is over. A common bug is treating any response containing text as final — but a tool_use turn can include text and a tool call. Always branch on stop_reason, never on "did it produce text."
Multiple tools in one stop
A single tool_use turn can contain several tool_use blocks. When the requested tools are independent, run them in parallel and return all results in one user message. This is the foundation of parallel tool execution (and, at the agent level, parallel subagents).
Why not just look at content?
Because the protocol is explicit by design. stop_reason is the control signal; the content is the payload. Hard-coding your loop's branch on stop_reason keeps it correct across model versions and tool counts. If you ever see max_tokens, the output is truncated — JSON or tool input may be incomplete, so handle it before parsing.
Exam focus: stop_reason: "tool_use" ⇒ execute tools and continue; "end_turn" ⇒ final answer, stop. max_tokens means truncated output (not done). One tool_use turn may carry multiple tool calls; run independent ones in parallel. Branch on stop_reason, not on the presence of text.