Code Generation with Claude Code

The situation

A platform team maintains a TypeScript monorepo (web app, shared libraries, a few services). They want Claude Code to ship small features and bug fixes with minimal hand-holding while staying inside the team's conventions: strict linting, a house test runner, no edits to generated files, and pull requests that pass CI before a human reviews them. The architect's job is to configure the environment so the model produces correct, in-style code repeatably — not to write any single prompt that happens to work once.

This scenario lives at the intersection of Domain 3 (Claude Code Configuration & Workflows) and Domain 4 (Prompt Engineering & Structured Output).

The right approach: encode taste as configuration

The first decision is where knowledge lives. Repository conventions belong in a versioned CLAUDE.md at the project root, not in a one-off chat message. Claude Code merges memory by precedence: enterprise policy and user memory (~/.claude/CLAUDE.md), then project (./CLAUDE.md), then the current directory. Put cross-cutting rules ("run pnpm test, never edit *.gen.ts") at the project level and let nested directories add local detail. Use @path imports to pull in a style guide or schema doc instead of pasting it.
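As an illustration, a project-level CLAUDE.md for this monorepo might look like the sketch below. The pnpm test command and the *.gen.ts rule come from the text above; the headings and the two imported paths (docs/style-guide.md, docs/api-schema.md) are hypothetical placeholders, not files the scenario defines.

```markdown
# Project conventions (hypothetical example)

## Commands
- Run tests with `pnpm test`.

## Hard rules
- Never edit generated files (`*.gen.ts`).

## References (imported, not pasted)
@docs/style-guide.md
@docs/api-schema.md
```

Keeping the file this short is deliberate: anything that applies only to one package can live in a nested CLAUDE.md closer to that code.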

Per-path rules go in .claude/rules/ with YAML frontmatter and glob matching, so a rule like "all API handlers validate input with Zod" only loads when handler files are in context. This keeps the prompt small while still steering generation — the heart of Domain 3.
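A path-scoped rule for the Zod example might be sketched as follows. The file name, the glob, and the `paths` frontmatter key are assumptions made for illustration; confirm the exact frontmatter fields against the Claude Code documentation for your version.

```markdown
---
# .claude/rules/api-handlers.md (hypothetical file name)
paths:
  - "services/**/handlers/**/*.ts"   # hypothetical glob for handler files
---

# API handler rules
- Validate every request body with a Zod schema before using it.
```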

Workflow: plan mode, then execute, then verify

For anything beyond a trivial edit, start in plan mode. Claude explores the codebase and proposes a change set without writing files, which catches wrong-file or wrong-architecture errors before they cost a round-trip. Approve the plan, then let it execute. Manage the context window deliberately: use the Explore subagent for read-heavy investigation and /compact to summarize long sessions so the working set stays focused.

Configure a permission model that matches risk. Pre-approve safe, idempotent tools and require confirmation for destructive ones. In the terminal this is the allow/ask/deny model; in automation you pass the same policy explicitly with --allowedTools (see Headless / CI usage below).
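One way to express that policy is a checked-in settings file. The fragment below is a sketch that assumes a .claude/settings.json with a permissions block; the specific rules are hypothetical choices for this monorepo (only pnpm test and the *.gen.ts restriction come from the scenario), so adjust the patterns to your own tools.

```json
{
  "permissions": {
    "allow": ["Read", "Bash(pnpm test:*)"],
    "ask": ["Bash(git push:*)"],
    "deny": ["Edit(**/*.gen.ts)"]
  }
}
```

The design choice is to keep the allow list narrow and idempotent, route anything that leaves the machine through ask, and make the generated-file rule a hard deny rather than a convention the model is merely asked to follow.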

Mapping to Domain 4: structured, verifiable output

Generation quality comes from the same prompt-engineering levers the exam tests: explicit acceptance criteria, 2–4 few-shot examples drawn from existing code, and a request for a brief plan (chain-of-thought) before the diff. But code is special because correctness is checkable. The architect's leverage is the validation-retry loop: generate, run the linter and tests, feed failures back, and let the model self-correct. Distinguish syntax errors (won't compile — always retry) from semantic errors (compiles but wrong — needs a sharper criterion or a human), exactly as Domain 4 frames it. For machine-consumed results, request structured output (e.g. a JSON summary of files changed and tests run) rather than scraping prose.
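The validation-retry loop can be sketched in a few lines of TypeScript. Every name here is hypothetical: `generate` stands in for "ask the model for a change, optionally with feedback" and `validate` for "run the linter and tests". This version retries syntax failures and escalates semantic ones at once; a real loop might first allow one retry with the test output.

```typescript
// Sketch of a validation-retry loop. All names are hypothetical placeholders.
type Failure = { kind: "syntax" | "semantic"; message: string };

type Outcome =
  | { status: "passed"; attempts: number }
  | { status: "needs_human"; attempts: number; failure: Failure }
  | { status: "gave_up"; attempts: number; failure: Failure };

function validationRetryLoop(
  generate: (feedback: string | null) => string,
  validate: (candidate: string) => Failure | null,
  maxAttempts = 3,
): Outcome {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");

  let feedback: string | null = null;
  let lastFailure: Failure | null = null;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const candidate = generate(feedback);
    const failure = validate(candidate);
    if (failure === null) return { status: "passed", attempts: attempt };

    // Semantic failure: the code builds but is wrong, so escalate instead of retrying blindly.
    if (failure.kind === "semantic") {
      return { status: "needs_human", attempts: attempt, failure };
    }

    // Syntax failure: always retry, feeding the error text back to the generator.
    feedback = failure.message;
    lastFailure = failure;
  }

  // The loop ran at least once and only syntax failures reach this point.
  return { status: "gave_up", attempts: maxAttempts, failure: lastFailure! };
}
```

Capping attempts matters as much as retrying: without a budget, a model that keeps producing the same syntax error would loop until it runs out of turns.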

Headless / CI usage

When this runs unattended, use headless mode: claude -p "<task>" --output-format json. The -p flag is the single entry point to non-interactive execution; --output-format json returns a parseable payload (including total_cost_usd) so the pipeline can branch on the result and track spend. Pre-approve tools with --allowedTools "Read,Edit,Bash(pnpm test:*)" so the run completes without a permission prompt, and keep the agent on a tight turn budget. The gate stays the same as for humans: the generated branch must pass CI before merge.
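A pipeline step might wrap that invocation roughly as sketched below. The flags and total_cost_usd are the ones named above; the function names, the cost ceiling, and the is_error field are assumptions for illustration, so check the payload shape your CLI version actually emits. Spawning the process is left out: the first function only builds the argument list for the claude binary.

```typescript
// Sketch: build the headless arguments and gate on the JSON result.
// Equivalent command: claude -p "<task>" --output-format json --allowedTools "Read,Edit,Bash(pnpm test:*)"

function buildHeadlessArgs(task: string, allowedTools: string[]): string[] {
  return ["-p", task, "--output-format", "json", "--allowedTools", allowedTools.join(",")];
}

interface GateDecision {
  ok: boolean;
  reason: string;
}

// maxCostUsd is a hypothetical per-run spend ceiling chosen by the pipeline owner.
function gateOnResult(stdout: string, maxCostUsd: number): GateDecision {
  let raw: unknown;
  try {
    raw = JSON.parse(stdout);
  } catch {
    return { ok: false, reason: "output was not valid JSON" };
  }
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "output was not a JSON object" };
  }

  const payload = raw as { total_cost_usd?: unknown; is_error?: unknown };

  // total_cost_usd is named in the text above; treat its absence as a failure.
  if (typeof payload.total_cost_usd !== "number") {
    return { ok: false, reason: "missing total_cost_usd" };
  }
  // is_error is an assumed field; verify it against your CLI version.
  if (payload.is_error === true) {
    return { ok: false, reason: "run reported an error" };
  }
  if (payload.total_cost_usd > maxCostUsd) {
    return { ok: false, reason: "run exceeded the cost ceiling" };
  }
  return { ok: true, reason: "ok" };
}
```

A passing gate here only means the run finished within budget; the generated branch still has to pass CI before merge.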

Common traps the exam probes

  • Per-chat instructions instead of CLAUDE.md. Knowledge that isn't versioned doesn't persist and can't be reviewed. Encode conventions as files and rules.
  • Skipping plan mode on multi-file work, then discovering the model touched the wrong layer.
  • Trusting "it compiles" as "it's correct." Always close the loop with tests; treat semantic failures differently from syntax failures.
  • Scraping prose output in CI instead of --output-format json, making the pipeline brittle.
  • Over-broad permissions in automation — granting unrestricted Bash where a scoped --allowedTools would do.

Exam focus

Domain 3 wants you to choose where to place configuration (CLAUDE.md hierarchy, .claude/rules/, permission model) and to run headless mode correctly (-p, --output-format json, --allowedTools). Domain 4 wants reliable, checkable output: criteria + few-shot, a validation-retry loop, and the syntax-vs-semantic error distinction. The unifying idea: make quality a property of the environment and the loop, not of any single prompt.