Lesson — Claude Certified Architect Prep

Everything you build on Claude goes through one endpoint: POST /v1/messages. Tools, structured output, vision, caching, and thinking are all parameters of this one request — not separate APIs. Internalize that and the platform stops feeling like a pile of features.

The request shape

A minimal request needs three things: model, max_tokens, and messages.

json

max_tokens is the ceiling on the output — it does not include the input. Set it too low and the response truncates mid-sentence with stop_reason: "max_tokens".

Roles

The conversation is an ordered array of messages, each with a role:

user — what you (or your application) send to Claude. The first message must be user.
assistant — Claude's replies. When you continue a conversation, you append the prior assistant turns back into the array.
system — not an entry in messages. The system prompt is a separate top-level system field that sets persona, rules, and context. (A newer beta lets you place role: "system" messages inside messages for mid-conversation operator instructions, but the canonical system prompt is the top-level field.)

json

The API is stateless

Claude has no memory between calls. Each request must carry the entire conversation history. To continue a chat you resend every prior turn:

json

Statelessness is why context-window management (Domain 5) matters: you pay for and re-send the whole transcript on every turn.

Roles must alternate (mostly)

The first message is user, and turns generally alternate user / assistant. Consecutive same-role messages are merged into one turn rather than rejected, but a first assistant message returns a 400. content can be a plain string or an array of typed blocks (text, image, tool_use, tool_result, document).

Reading the response

The response is a message object. The two fields you check first:

content — an array of blocks. For plain text it's one {"type": "text", "text": "..."} block; with tools it can also contain tool_use blocks.
stop_reason — why generation stopped: end_turn (done), max_tokens (hit the cap), tool_use (Claude wants a tool), stop_sequence, pause_turn, or refusal.

Always branch on stop_reason rather than assuming the text is complete.

Exam focus

Know that all Claude capabilities flow through /v1/messages. Memorize the three roles and that system is a top-level field, not a message entry. Remember the API is stateless — you resend full history each call. Know that max_tokens caps output only, and that stop_reason: "max_tokens" means truncation (raise the cap or stream), distinct from end_turn.

The Messages API & Roles

The request shape

Roles

The API is stateless

Roles must alternate (mostly)

Reading the response

Exam focus