Chain-of-Thought, Roles & System Prompts

This lesson covers two related levers: getting Claude to reason before answering (chain-of-thought), and setting up who Claude is and how it should behave via role / persona / system prompts.

Chain-of-thought (CoT)

For multi-step problems — math, logic, multi-criteria decisions, debugging — letting the model "think out loud" before committing to an answer measurably improves accuracy. Reasoning that would otherwise be compressed into a single token jump gets space to unfold.

Two ways to do it:

text

Wrapping the reasoning in tags lets you strip it from the user-facing output while still getting the accuracy benefit, and makes the reasoning easy to log or evaluate. Anthropic also offers extended thinking as a first-class API feature on capable models: you set a thinking budget and the model returns thinking content blocks separate from the final text — use that when you want native, auditable reasoning rather than prompt-engineered tags.

CoT is not free: it adds output tokens and latency. Reserve it for tasks that genuinely need reasoning. For simple classification, CoT can even hurt by over-thinking — measure, don't assume.

CoT before structured output

A useful pattern: let the model reason in prose first, then emit the structured result. If you force JSON immediately, you lose the reasoning lift. Order matters — reason, then commit. With tool_use (Module 2) you can prompt the model to think in text, then call the output tool last.

Roles, personas & system prompts

The system prompt sets persistent context and behavior that applies across the whole conversation, separate from the user turn. In the Messages API it is the top-level system parameter, not a message:

json

Assigning a role ("You are a senior security engineer…") primes domain vocabulary, raises the bar for output quality, and constrains tone. It is one of the cheapest accuracy and consistency wins available.

Guidance:

Put stable instructions (role, rules, format, tone) in system; put the task-specific content in the user message.
A role should match the actual task — "senior security engineer" for code review, not a generic "helpful assistant."
Keep the persona consistent; contradictory instructions across system and user turns degrade reliability.

Combining the levers

These compose: a sharp system/role prompt + explicit criteria + 2–4 examples + CoT for the hard part is the standard recipe for a reliable task prompt. Add structured output on top to make it machine-checkable.

Exam focus

Know when CoT helps (multi-step reasoning) and its cost (tokens/latency, can hurt trivial tasks). Remember you can hide reasoning in <thinking> tags or use the native extended thinking feature. Know that the system prompt is a distinct top-level parameter for persistent role/behavior, while the user message carries the task — and that assigning a fitting role is a cheap, high-impact quality lever.