Domínio 1 · 27% do exame

Domain 1 Study Guide: Agentic Architecture & Orchestration

Domain 1: Agentic Architecture & Orchestration (27%)

Domain 1 is the largest single slice of the CCA-Foundations exam — just over a quarter of every question. It tests whether you can design reliable agentic systems on Claude, not just call the Messages API. Expect scenario questions that hand you a goal, a set of tools, and a reliability constraint, then ask you to pick the right control structure. The wrong answers almost always confuse "the model decides" with "my code decides."

The agentic loop is the foundation

An agent is not a single prompt — it is a loop. You give Claude a goal and a set of tools, and it repeats a cycle: gather context → take action → verify work → repeat until the task is done. This is the exact loop that powers Claude Code and the Claude Agent SDK.

Each iteration is one round-trip to the Messages API:

You send the conversation (system prompt, user goal, prior tool_use and tool_result blocks).
Claude responds. If it wants to act, it emits one or more tool_use blocks and stops with stop_reason: "tool_use".
Your application code executes the tool and appends the output as a tool_result block inside a new user message.
You call the API again. Claude either calls another tool or produces a final answer with stop_reason: "end_turn".

The single most-tested fact in this domain: the model never executes tools — it only requests them. The loop, the tool execution, and the termination check are your responsibility.

python

stop_reason drives control flow

You must branch on stop_reason every turn. Know these values:

tool_use — Claude requested a tool. Execute it, append tool_result, loop again.
end_turn — Claude finished naturally. The loop is done.
max_tokens — the response was truncated by the token cap. Continue or raise the limit; do not treat it as a complete answer.
stop_sequence — a custom stop sequence was hit.
pause_turn — a long-running server tool turn was paused; pass the response back to continue.
refusal — the model declined for safety reasons (with a stop_details policy category on newer models).

A classic trap: assuming the agent will stop on its own. It won't — always bound the loop with a turn cap (max_turns) or a token/cost budget.

Model-driven vs. hard-coded control

This is the heart of the domain. The exam wants you to choose the least powerful structure that solves the problem:

Hard-coded / workflow (deterministic): you write the control flow — prompt chaining, routing, fixed sequences. Use when steps are known in advance, latency and cost must be predictable, and you want auditability. A single LLM call with retrieval is not an agent; it is a workflow.
Model-driven (agentic): Claude decides which tools to call and in what order. Use only when the path is open-ended and cannot be predetermined — exploration, debugging, research.

Agentic flexibility costs tokens, latency, and predictability. If a fixed pipeline works, use it. "Make everything an agent" is almost always a wrong answer.

Multi-agent topologies

When one agent's context window or responsibilities get overloaded, decompose into multiple agents:

Hub-and-spoke (orchestrator–worker): a lead/coordinator agent breaks the task down and delegates to specialized subagents, then synthesizes their results. This is the dominant pattern.
Parallel subagents: independent subtasks (e.g., researching several sources) run concurrently to cut wall-clock latency. Use when subtasks don't depend on each other.
Context isolation: the decisive reason to use subagents. Each subagent runs in its own clean context window and returns only a condensed summary to the coordinator. This keeps the orchestrator's context lean and prevents one task's noise from polluting another. Anthropic's multi-agent research system used this to handle work no single context could hold — at the cost of many more tokens.

Trade-off the exam tests: multi-agent systems multiply token usage (often 4x+ a chat). Reach for them when the task is broad and parallelizable enough to justify the cost — not for narrow, sequential work.

The Claude Agent SDK

The Agent SDK runs the production-grade version of the loop above (the same engine as Claude Code). Key building blocks:

AgentDefinition — declares a subagent: its name, description, system prompt, model, and the specific tools it may use.
Per-agent tool scoping — grant each agent the minimum tools it needs. A read-only research agent should never hold write/delete tools. This is least-privilege as a reliability and safety control.
Hooks vs. prompts — hooks are deterministic code that fire at lifecycle points (before a tool runs, after a response) to enforce rules you cannot trust a prompt to follow — input validation, permission gates, logging. Use a hook for hard guarantees; use the prompt for guidance and preferences.

Sessions, decomposition, and human-in-the-loop

Session management — persist conversation state so long or resumable tasks keep their context across turns; compact or summarize when the window fills.
Prompt chaining vs. task decomposition — chaining feeds one call's output into the next in a fixed sequence; decomposition lets an agent dynamically split a goal into subtasks. Chaining = deterministic, decomposition = adaptive.
Human-in-the-loop (HITL) escalation — for irreversible or high-stakes actions (sending money, deleting data, emailing customers), the agent should pause and request approval rather than act autonomously. Design explicit checkpoints; never let an unattended loop perform unbounded irreversible actions.

Master the loop, stop_reason, the model-vs-hard-coded decision, context isolation, and least-privilege tool scoping, and you have the spine of 27% of the exam.

Dicas para o exame

✓When a question asks "who executes the tool," the answer is always your application code — Claude only emits tool_use requests. The model returning stop_reason: "tool_use" is a request, not an execution.
✓Memorize the stop_reason values and what each requires: tool_use (run tool + loop), end_turn (done), max_tokens (truncated — continue, do not treat as final), stop_sequence, pause_turn (resume server tool), refusal (safety decline).
✓For "which control structure" questions, pick the LEAST powerful option that works. A fixed workflow (prompt chaining / routing) beats an agent whenever the steps are known in advance and predictability matters.
✓The defining benefit of subagents is context isolation: each runs in its own clean window and returns a condensed summary. If a question stresses a bloated or polluted context window, the answer usually involves isolating work into subagents.
✓Use a hook (deterministic code) for anything that must be guaranteed — permission gates, validation, logging. Use the system prompt for guidance and preferences. Never rely on a prompt alone to enforce a hard safety rule.
✓Always bound the agentic loop with a turn cap or cost/token budget. Any answer that assumes the model will reliably terminate on its own is wrong.
✓Apply least-privilege tool scoping per agent: a read-only or research agent should never be granted write/delete tools. Scoping is both a reliability and a safety control.
✓For irreversible or high-stakes actions (payments, deletions, outbound email), the correct design is a human-in-the-loop approval checkpoint, not fully autonomous execution.

Antipadrões

✗Assuming the model runs the tool itself. Claude returns a tool_use block and stops; the application must execute the tool and feed back a tool_result. Treating the API as if it calls tools server-side is wrong for client-defined tools.
✗Making everything an agent. Wrapping a deterministic, known-step task in a model-driven loop adds cost, latency, and unpredictability for no benefit — a workflow is the correct choice there.
✗Treating a single retrieval-augmented LLM call as an agent. One call with context is a workflow; an agent requires the gather-act-verify loop with tool use and a termination condition.
✗Running an unbounded loop with no max_turns or budget cap. Without a guardrail the agent can spin indefinitely or burn unbounded tokens; termination must be enforced by your code.
✗Granting every agent the full tool set. Broad tool access violates least privilege and lets a research/read-only agent perform destructive writes — scope tools per agent instead.
✗Using a multi-agent system for a narrow, sequential task. Multi-agent topologies multiply token usage several-fold; they are only justified when the work is broad, parallelizable, or too large for one context window.
✗Enforcing a hard rule (such as a destructive-action block) through the system prompt instead of a hook. Prompts are guidance and can be ignored; deterministic guarantees require code-level hooks.
✗Letting an autonomous agent perform irreversible high-stakes actions with no human approval step. Lacking a HITL checkpoint for payments, deletions, or external communications is a reliability and safety failure.

Lições relacionadas

agentic loop stop reason model vs hardcoded hub and spoke subagent isolation parallel execution agent definition tool scoping hooks vs prompts session management prompt chaining vs decomposition hitl escalation