Scenario 5: Claude Code in CI/CD

The situation

A platform team wants Claude Code to do real work inside their delivery pipeline, not just answer questions in a terminal. They have two distinct needs. The first is an event-driven flow: when a developer writes "@claude implement this" in a GitHub issue or PR comment, Claude should create a branch, make the change, and open a PR. The second is a scheduled or scripted flow: a nightly job that scans the latest diff for typos, missing tests, or security issues and posts a report. The architect must choose the right integration for each, keep secrets safe, and make runs reproducible, so that the same input and configuration produce the same behavior on every machine.

This is a Domain 3 problem at its core: configuring Claude Code so an agent behaves predictably across an individual workstation, a shared team repo, and a non-interactive pipeline where no human is available to approve a tool call.

The right architecture

There are two supported building blocks, and the exam expects you to match them to the trigger.

1. The GitHub Action (anthropics/claude-code-action@v1). This is the answer for the event-driven flow. Install it with /install-github-app from the Claude Code CLI (you must be a repo admin), which wires up the GitHub App and stores ANTHROPIC_API_KEY as a repository secret. The action auto-detects its mode: with no prompt, it responds to @claude mentions; with a prompt, it runs automation immediately. CLI options pass through claude_args:
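
As a hedged illustration of the event-driven setup, a minimal workflow might look like the fragment below. The file path, job name, timeout, and the --max-turns value are assumptions chosen for the example; the action name, the secret reference, and claude_args are the pieces described above.

```yaml
# Hypothetical .github/workflows/claude.yml; names and limits are illustrative.
name: Claude Code
on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]

permissions:
  contents: write
  issues: write
  pull-requests: write
  id-token: write

jobs:
  claude:
    # Skip comments that do not mention the trigger phrase.
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          # Read the key from the repository secret; never inline it.
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          # No prompt is set, so the action waits for @claude mentions.
          claude_args: "--max-turns 10"
```

Adding a prompt input under with: would switch the same action from mention-driven mode to immediate automation.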

2. Headless mode (claude -p). This is the answer for the scripted/nightly flow and for non-GitHub runners (GitLab, Jenkins). -p/--print runs once, non-interactively, reads stdin, and exits. Pipe data in and redirect out like any Unix tool:
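
As a rough illustration of the pipe-in, redirect-out pattern, the sketch below wraps a nightly diff review in a shell function. The function name nightly_review, the prompt wording, the HEAD~1 default, and the review-report.md path are assumptions made for the example, not part of Claude Code.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: review the diff since a base revision (default HEAD~1)
# and write Claude's reply to a report file (default review-report.md).
nightly_review() {
  base="${1:-HEAD~1}"
  report="${2:-review-report.md}"

  # Capture the diff first so that a git failure aborts the job (set -e)
  # instead of being hidden behind the pipe.
  diff_text=$(git diff "$base" HEAD)

  # -p runs once, reads the diff from stdin, prints the reply, and exits.
  printf '%s\n' "$diff_text" \
    | claude -p "Review this diff for typos, missing tests, and security issues. Reply with a short report." \
    > "$report"
}
```

A nightly job could then call nightly_review HEAD~1 nightly-report.md and publish the file wherever the team collects reports.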

Key decisions

  • Permissions, not prompts. A pipeline has no human to click "approve." Pre-authorize exactly the tools the job needs with --allowedTools (e.g. "Read,Edit,Bash(git diff *)"), or set a baseline with --permission-mode: dontAsk denies anything that is not pre-approved, while acceptEdits auto-approves file edits. Grant the minimum set; least privilege beats convenience.
  • Determinism with --bare. A plain claude -p still auto-discovers hooks, Skills, MCP servers, and CLAUDE.md from the working directory and ~/.claude, so a hook that exists on one machine but not on the runner makes the same command behave differently. --bare skips all auto-discovery; only the flags you pass explicitly take effect, which gives the same configuration on every machine. Pass context back in deliberately with --mcp-config, --append-system-prompt, or --settings.
  • Structured output for gating. --output-format json returns a payload with the text in .result, plus session_id and total_cost_usd, so a script can parse the verdict, track spend, and gate the build. stream-json instead emits newline-delimited JSON events while the run is in progress, which suits live monitoring rather than reading a final verdict.
  • Cost controls. Set --max-turns to cap iterations, add a workflow timeout, and use concurrency limits to prevent runaway parallel jobs.
  • Enterprise auth. For data-residency or billing control, use Amazon Bedrock (use_bedrock: "true") or Google Vertex AI (use_vertex: "true") with GitHub OIDC / Workload Identity Federation instead of static keys. Note the Bedrock model-id region prefix, e.g. a model ID beginning with us.anthropic.claude-sonnet-4-6.
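
To make the decisions above concrete, here is a hedged sketch of a locked-down headless run followed by a gate on its JSON payload. The function names, the 5-turn cap, the budget argument, the exit codes, and the convention that the reply begins with PASS or FAIL are assumptions for illustration. The flags and the .result, .session_id, and .total_cost_usd fields are the ones described above, and jq is assumed to be installed on the runner.

```shell
#!/bin/sh

# Hypothetical helper: one reproducible, least-privilege review run.
# $1 is the task description; the PASS/FAIL convention is our own assumption.
run_review() {
  claude --bare -p "$1 Begin your reply with PASS or FAIL." \
    --allowedTools "Read,Bash(git diff *)" \
    --max-turns 5 \
    --output-format json
}

# Hypothetical helper: read the JSON payload on stdin and gate the build.
# $1 is the maximum acceptable cost in USD.
# Returns 0 on PASS within budget, 1 on a non-PASS verdict, 2 if over budget
# or if the payload cannot be parsed.
gate_on_result() {
  max_cost="$1"
  payload=$(cat)

  # Log session and spend to stderr for later auditing.
  printf '%s' "$payload" \
    | jq -r '"session=\(.session_id) cost_usd=\(.total_cost_usd)"' >&2

  # jq -e exits non-zero when the expression evaluates to false or null.
  printf '%s' "$payload" \
    | jq -e --argjson max "$max_cost" '.total_cost_usd <= $max' >/dev/null \
    || return 2

  printf '%s' "$payload" \
    | jq -e '.result | startswith("PASS")' >/dev/null \
    || return 1
}
```

In a pipeline step this could be used as run_review "Review the latest diff." | gate_on_result 0.50, with the step failing whenever the gate returns non-zero.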

Common traps

  • Hardcoding the API key in the workflow YAML. Always reference ${{ secrets.ANTHROPIC_API_KEY }}; never commit the key.
  • Expecting /code-review or other slash commands to work in -p. User-invoked Skills and built-in commands are interactive-only. In headless mode, describe the task in the prompt instead (or pass a Skill name through the action's prompt after actions/checkout).
  • Over-broad permissions. --allowedTools "Bash" lets the agent run any shell command in CI, which is a prompt-injection and supply-chain risk. Scope it: Bash(git diff *). Keep the space before the *: Bash(git diff *) matches git diff followed by arguments, whereas Bash(git diff*) would also match git diff-index.
  • Forgetting least-privilege on the GitHub side. The action needs Contents, Issues, and Pull-requests read/write, plus id-token: write for OIDC — grant nothing more.
  • Ignoring @ vs /. Triggers are @claude, not /claude; the wrong sigil is a frequent cause of "why isn't it responding" reports.
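
Two of these traps can be caught mechanically before a workflow is merged. The sketch below is a heuristic only: lint_workflow is a hypothetical helper, the sk-ant- check assumes that a committed key would carry the usual Anthropic key prefix, and the second pattern only catches an unscoped Bash entry that sits on the same line as allowedTools.

```shell
#!/bin/sh

# Hypothetical helper: return non-zero if a workflow file shows a known trap.
lint_workflow() {
  file="$1"
  status=0

  # Trap: a literal key committed instead of a secrets reference.
  if grep -q 'sk-ant-' "$file"; then
    echo "$file: possible hardcoded Anthropic API key" >&2
    status=1
  fi

  # Trap: Bash allowed with no command scope, e.g. "Bash" or "Read,Bash".
  # A scoped entry such as Bash(git diff *) is followed by "(" and is skipped.
  if grep -Eq 'allowedTools.*[ ",=]Bash([ ",]|$)' "$file"; then
    echo "$file: unscoped Bash in allowedTools" >&2
    status=1
  fi

  return "$status"
}
```

Running lint_workflow .github/workflows/claude.yml as an early CI step would fail the build on either finding; it does not replace a review of the GitHub-side permissions block.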

How it maps to the exam

Everything here is Domain 3 — Claude Code Configuration & Workflows. Expect questions that ask you to (a) pick the Action for @claude events versus claude -p for scripts and non-GitHub CI; (b) recognize that headless runs must pre-grant tools because no human can approve; (c) explain why --bare makes runs reproducible; (d) read a JSON result to gate a build and track cost; and (e) spot insecure configs (hardcoded keys, unscoped Bash, excess GitHub permissions). The throughline is the same as all of Domain 3: encode behavior as explicit, verifiable configuration so the agent acts predictably without a human in the loop.