The Messages API & Roles
Everything you build on Claude goes through one endpoint: POST /v1/messages. Tools, structured output, vision, caching, and thinking are all parameters of this one request — not separate APIs. Internalize that and the platform stops feeling like a pile of features.
The request shape
A minimal request needs three things: model, max_tokens, and messages.
max_tokens is the ceiling on the output — it does not include the input. Set it too low and the response truncates mid-sentence with stop_reason: "max_tokens".
Roles
The conversation is an ordered array of messages, each with a role:
user— what you (or your application) send to Claude. The first message must beuser.assistant— Claude's replies. When you continue a conversation, you append the prior assistant turns back into the array.system— not an entry inmessages. The system prompt is a separate top-levelsystemfield that sets persona, rules, and context. (A newer beta lets you placerole: "system"messages insidemessagesfor mid-conversation operator instructions, but the canonical system prompt is the top-level field.)
The API is stateless
Claude has no memory between calls. Each request must carry the entire conversation history. To continue a chat you resend every prior turn:
Statelessness is why context-window management (Domain 5) matters: you pay for and re-send the whole transcript on every turn.
Roles must alternate (mostly)
The first message is user, and turns generally alternate user / assistant. Consecutive same-role messages are merged into one turn rather than rejected, but a first assistant message returns a 400. content can be a plain string or an array of typed blocks (text, image, tool_use, tool_result, document).
Reading the response
The response is a message object. The two fields you check first:
content— an array of blocks. For plain text it's one{"type": "text", "text": "..."}block; with tools it can also containtool_useblocks.stop_reason— why generation stopped:end_turn(done),max_tokens(hit the cap),tool_use(Claude wants a tool),stop_sequence,pause_turn, orrefusal.
Always branch on stop_reason rather than assuming the text is complete.
Exam focus
Know that all Claude capabilities flow through /v1/messages. Memorize the three roles and that system is a top-level field, not a message entry. Remember the API is stateless — you resend full history each call. Know that max_tokens caps output only, and that stop_reason: "max_tokens" means truncation (raise the cap or stream), distinct from end_turn.