Agentic Architecture & Orchestration

stop_reason: tool_use vs end_turn

8 min read

stop_reason is the single most important field for driving an agent loop. It tells you why the model stopped generating on its last turn — and therefore what your code should do next. Reading it correctly is the difference between a loop that progresses and one that hangs or terminates early.

The values you must know

stop_reasonMeaningWhat your loop does
tool_useThe model emitted one or more tool_use blocks and is waiting for resultsExecute the tools, append tool_result, call the API again
end_turnThe model finished naturally; the response is the final answerStop the loop; return the text to the user
max_tokensOutput hit max_tokens mid-generationContinue (ask it to keep going) or raise the cap — the answer is incomplete
stop_sequenceA custom stop sequence was hitHandle per your protocol
refusalThe model declined for safety reasonsSurface to the user; do not retry blindly
pause_turnA long-running server-side turn was paused (e.g. some server tools)Resume by sending the response back

The core distinction

The exam centers on tool_use vs end_turn:

json

A tool_use stop means continue the loop; end_turn means the loop is over. A common bug is treating any response containing text as final — but a tool_use turn can include text and a tool call. Always branch on stop_reason, never on "did it produce text."

Multiple tools in one stop

A single tool_use turn can contain several tool_use blocks. When the requested tools are independent, run them in parallel and return all results in one user message. This is the foundation of parallel tool execution (and, at the agent level, parallel subagents).

Why not just look at content?

Because the protocol is explicit by design. stop_reason is the control signal; the content is the payload. Hard-coding your loop's branch on stop_reason keeps it correct across model versions and tool counts. If you ever see max_tokens, the output is truncated — JSON or tool input may be incomplete, so handle it before parsing.

Exam focus: stop_reason: "tool_use" ⇒ execute tools and continue; "end_turn" ⇒ final answer, stop. max_tokens means truncated output (not done). One tool_use turn may carry multiple tool calls; run independent ones in parallel. Branch on stop_reason, not on the presence of text.