03-agentic-cli-tools - Albert Masoliver's learning site

# Module 3 — Mastery of Agentic CLI Tools

> *"The terminal isn't your dev environment anymore. It's the shop floor
> where the agents work and where you tell them what to do."*

---

## Learning objectives

By the end of this module you will be able to:

1. Set up **Claude Code** and **OpenCode** with multi-provider routing,
   LSP-aware editing, and project-scoped configuration.
2. Use **Plan** and **Build** modes deliberately, and know when to flip
   between them mid-task.
3. Codify your team's conventions in **`AGENTS.md`** / **`CLAUDE.md`** so
   every session inherits the same house rules.
4. Diagnose common configuration smells (slow startup, wrong tool
   permissions, ignored instructions) and fix them at the source.

---

## 3.1 The CLI is the platform

For most teams in 2026, the *agentic CLI* — not the IDE — is where day-to-day
software gets written. The reasons are unsentimental:

- **Composability.** Agents call shell tools; the shell is where shells live.
- **Scriptability.** A CLI can be run in CI, piped, looped, and
  version-controlled. Chat windows can't.
- **Surface area.** A terminal exposes file system, git, package managers,
  databases, browsers, and the agent — all in one process tree, with one set
  of permissions to reason about.

The IDE extension is now a *view* onto the CLI session, not its replacement.
You will spend the rest of this course operating through the CLI. Get
comfortable.

---

## 3.2 Claude Code setup, properly

### Installation and first run

```bash
# Install (npm, Homebrew, or curl installer — pick your platform)
npm install -g @anthropic-ai/claude-code

# First run launches an interactive setup
claude
```

The first run will ask about authentication. Two paths:

1. **Anthropic API key** — direct, recommended for individuals. Set
   `ANTHROPIC_API_KEY` in your environment or paste at the prompt.
2. **Bedrock / Vertex** — for orgs that route AI traffic through their
   cloud. See `claude doctor` for environment variables.
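
As a rough preflight sketch of the two authentication paths above, the function below reports which path the current shell environment would select. `ANTHROPIC_API_KEY` comes from the text. `CLAUDE_CODE_USE_BEDROCK` and `CLAUDE_CODE_USE_VERTEX` are the toggle variables as this sketch assumes them, and the precedence order is also an assumption, so confirm both against your installed version. The function name `auth_path` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which auth path this environment selects.
# Variable names other than ANTHROPIC_API_KEY are assumptions to verify.
auth_path() {
  if [ "${CLAUDE_CODE_USE_BEDROCK:-}" = "1" ]; then
    echo "bedrock"      # assumed cloud toggle, checked first
  elif [ "${CLAUDE_CODE_USE_VERTEX:-}" = "1" ]; then
    echo "vertex"       # assumed cloud toggle
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "api-key"      # direct API key from the text above
  else
    echo "none"         # nothing set: the interactive prompt will ask
  fi
}
```

Running `auth_path` before the first `claude` launch is a cheap way to catch a leftover key or cloud toggle in your shell profile.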
### The settings hierarchy

Claude Code reads configuration from three places, in increasing specificity:

| Scope   | Path                                         | When to use                                |
|---------|----------------------------------------------|--------------------------------------------|
| User    | `~/.claude/settings.json`                    | Personal defaults (theme, default model).  |
| Project | `.claude/settings.json` (committed)          | Team conventions everyone gets.            |
| Local   | `.claude/settings.local.json` (gitignored)   | Per-developer overrides for the project.   |

A minimal but useful **project** `.claude/settings.json`:

```jsonc
{
  "model": "claude-sonnet-4-6",
  "permissions": {
    "allow": [
      "Bash(npm test:*)",
      "Bash(npm run lint:*)",
      "Bash(git status)",
      "Bash(git diff:*)",
      "Bash(git log:*)"
    ],
    "deny": [
      "Bash(git push:*)",
      "Bash(rm -rf:*)"
    ]
  },
  "env": {
    "NODE_ENV": "development"
  }
}
```

Three things to notice:

1. **`allow` is a positive list.** Tools not allow-listed prompt the user for
   permission. This is the right default — explicit beats implicit.
2. **`deny` is non-negotiable.** A user can't approve a denied tool with a
   "yes." Use this for true safety rails (force push, recursive delete).
3. **`env` is forwarded** to the shell tools the agent runs. Use it for
   test-only credentials; never put production secrets here.

### Multi-provider configuration (OpenCode and friends)

OpenCode is the leading provider-agnostic agentic CLI. The same prompt can hit
Claude, Bedrock, Vertex, or a local model depending on the route.
A minimal `opencode.json`:

```jsonc
{
  "$schema": "https://opencode.ai/config.schema.json",
  "providers": {
    "anthropic": {
      "models": ["claude-opus-4-7", "claude-sonnet-4-6", "claude-haiku-4-5-20251001"]
    },
    "openrouter": {
      "models": ["meta-llama/llama-4-405b-instruct"]
    }
  },
  "agents": {
    "default": { "provider": "anthropic", "model": "claude-sonnet-4-6" },
    "architect": { "provider": "anthropic", "model": "claude-opus-4-7" },
    "scout": { "provider": "anthropic", "model": "claude-haiku-4-5-20251001" }
  }
}
```

The point of multi-provider isn't switching models randomly. It's giving you
**fallbacks** (when one provider rate-limits you) and **specialization** (some
models are noticeably better at certain languages).

### Language Server Protocol integration

The single biggest quality jump comes from giving the agent **LSP access** to
the files it's editing. With LSP:

- Renames are project-wide and correct, not regex-and-pray.
- The agent sees type errors *as it edits*, not after.
- "Go to definition" works from the agent's side, which means less file
  reading and more focused context.

In Claude Code, LSP is on by default for supported languages (TypeScript,
Python, Go, Rust, others). Verify it's working:

```bash
claude /diagnostics
# Look for a "lsp" section listing the running servers.
```

If LSP isn't running for your language, install the standard server
(`typescript-language-server`, `gopls`, `rust-analyzer`, etc.) and restart the
session.

> **Pitfall:** working in monorepos with no `tsconfig.json` at the root.
> The TypeScript server doesn't know how to resolve imports, the agent gets
> confused, and quality drops. Fix the workspace before blaming the model.

---

## 3.3 Plan mode vs Build mode

Every serious agentic CLI distinguishes between **reading the world** and
**changing it**. Claude Code calls these **Plan mode** and the default
build-capable mode; OpenCode has equivalents.

### Plan mode

Plan mode is **read-only**.
The agent can use Read, Grep, Glob, the LSP, and shell commands flagged as
read-only. It *cannot* edit, write, or run mutating shell commands.

Use it when:

- You don't yet know the shape of the change.
- You want a proposal you can review before any code moves.
- You're new to the codebase and the agent is faster at orienting you than
  you are.

Enter it with `/plan` or by toggling the mode keybinding. The agent finishes
by emitting a structured plan you approve, reject, or amend.

```
You: /plan
Plan> investigate why pagination breaks when filter "status=archived" is set,
      then propose a fix.

Agent: <reads code, traces query>

Plan v1
  Root cause: list endpoint composes `WHERE status = ?` with `LIMIT/OFFSET`
              but pagination cursor is built before the WHERE clause is applied.
  Fix:
    1. Move cursor computation to after the filter is applied — src/api/list.ts:42
    2. Add regression test — test/api/list.spec.ts (new case "archived + page 2")
  Risk: low. No schema or public-API changes.

Approve / Edit / Reject?
```

### Build mode

Build mode is the agent's default — read, write, run, commit (if you've
authorized commits). Use it once a plan is in hand or for tasks small enough
that "the plan" is implicit ("fix this typo on line 17").

The most common mistake is flipping into build mode too early, getting a
plausible-but-wrong implementation, and then arguing with the agent about it
turn after turn. A 90-second plan-mode pass eliminates this loop entirely.

### When to flip mid-task

Flip back to plan mode mid-build when:

- The agent surfaces an unexpected branch ("I see this also affects X — do
  you want me to handle that?"). Yes, but plan it first.
- A test you didn't expect to touch starts failing. Stop, plan the fix,
  don't reflexively patch.
- You feel the urge to write a 200-token prompt to "guide" the next step.
  That prompt is a plan; promote it.

---

## 3.4 Governing the project: `AGENTS.md` and `CLAUDE.md`

Both files do the same job: they are **system prompts your project owns**.
Anything in them is loaded into every session that starts in this repo. This
is where house rules live.

- `AGENTS.md` is the cross-tool convention (works with OpenCode, Cursor's
  agent mode, Continue, etc.).
- `CLAUDE.md` is Claude Code's tool-specific file (additional and
  complementary to `AGENTS.md`).

In practice: put portable rules in `AGENTS.md`, put Claude-specific
instructions in `CLAUDE.md`, and have the agent read both.

### What belongs in these files

Good content has a high signal-to-token ratio and tells the agent things it
*cannot infer from the code*. Examples:

```markdown
# Engineering conventions

## Stack
- Backend: TypeScript, Fastify, PostgreSQL via pg-promise.
- Frontend: React, Vite, TanStack Router.
- Tests: Vitest. Run with `npm test`. Single file: `npm test -- <path>`.

## Non-obvious rules
- All new endpoints MUST register Zod schemas and use them for both
  validation and OpenAPI generation. See `src/api/_template.ts`.
- We do not use `any` or `unknown` casts outside `src/lib/external/`.
- Database migrations live in `db/migrations/`. Never edit a migration that
  has been merged to main — write a new one.
- Logging: use the `log` import from `src/lib/log.ts`. `console.log` will
  fail CI.

## How we talk to agents
- For changes affecting more than one module, start in plan mode.
- Run `npm run typecheck` after edits in `src/`. Run `npm test` before
  declaring a task complete.
- When in doubt about a product behavior, ASK. Do not invent.
```

```markdown
# Claude-specific guidance

- The `verify` skill is the canonical way to confirm a fix works. Use it
  before marking a task done.
- Use the `architect` sub-agent for any change touching `src/api/` or
  `db/migrations/`.
- Our MCP server `internal-jira` is configured. Use `/mcp` to inspect.
```

### What does *not* belong

- **Anything in the code already.** "We use TypeScript" — yes, the agent can
  tell.
- **Stale war stories.** "Last year we had a bug where…" — fascinating, but
  the agent doesn't need it.
  Use git history.
- **Long lists of files.** "Important files: src/a.ts, src/b.ts, …" — the
  agent can search.
- **Prompts.** This file is not where you write the task. It's where you
  encode the standing rules.

> **Heuristic:** if a fact would be true tomorrow even if half the code
> changed today, it probably belongs in `AGENTS.md`. If it would be stale
> by Friday, it doesn't.

### Keeping these files honest

The fastest way to rot these files is to write them once and forget. Two
mitigations:

1. **Treat them as code.** They go through review like everything else. A PR
   that introduces a new convention updates `AGENTS.md` in the same change.
2. **Audit them quarterly.** Ask an agent to read `AGENTS.md` and produce a
   list of rules it believes are *no longer applied in the codebase*. Delete
   what's dead.

---

## 3.5 Operational hygiene

### Slash commands you'll use every day

- `/init` — bootstrap `CLAUDE.md` from the existing codebase.
- `/plan` — enter plan mode.
- `/diagnostics` — sanity check (LSP, MCP, hooks).
- `/cost` — token spend for the current session.
- `/compact` — manually trigger context summarization when you know you're
  done with the early conversation (don't wait for the auto-compaction at
  the worst moment).
- `/resume` — pick up a previous session — invaluable when you've stepped
  away.

### Session hygiene

Three habits that compound:

1. **One task per session.** When you finish a feature, end the session.
   Starting fresh is cheaper than carrying contaminated context.
2. **Name your sessions.** Most CLIs let you tag a session
   (`claude --session feat/passwordless`). Future-you will be grateful when
   you come back from lunch.
3. **Commit at every green test.** Even if you don't push, the local commit
   is a checkpoint the agent can `git reset` to when it goes off-piste.
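
The third habit can be wrapped in a small shell function. This is a minimal sketch, not part of any CLI: the name `checkpoint` and its argument order are hypothetical, and it assumes `npm test` is your test command unless you pass another one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run the test command and commit only when it passes.
# Usage: checkpoint [message] [test command...]
checkpoint() {
  local msg="${1:-checkpoint: tests green}"
  if [ "$#" -gt 0 ]; then shift; fi
  # Fall back to `npm test` when no test command is supplied (assumption).
  if [ "$#" -eq 0 ]; then set -- npm test; fi

  if "$@"; then
    git add -A
    # Nothing staged means nothing changed since the last checkpoint.
    if git diff --cached --quiet; then return 0; fi
    git commit -q -m "$msg"
  else
    echo "tests failed; no checkpoint created" >&2
    return 1
  fi
}
```

For example, `checkpoint "pagination fix green" npm test -- test/api/list.spec.ts` commits only if that spec passes. Because the commit stays local, discarding a later bad edit is a matter of resetting to the most recent checkpoint.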
### Reading the cost meter

```bash
claude /cost
# Session cost so far: $0.42 (Sonnet: 410k in, 38k out; Opus: 12k in, 4k out)
```

A daily glance at this number trains intuition faster than any pricing chart.
Most sessions should clock in well under a dollar; an Opus-heavy architecture
session might hit a few dollars. If you regularly see double digits for a
single task, something is wrong upstream — likely an unbounded loop or a
session that refused to die.

---

## Lab 3 — Bootstrap a governed project

**Goal:** transform a "works on my machine" project into one any agent can
pick up cleanly.

**Time:** ~45 minutes.

1. Pick a real repo you own that currently has no `AGENTS.md`.
2. Run `claude` in the repo, then `/init`. Review the proposed `CLAUDE.md`
   and **delete every line that's already inferable from the code**. Most of
   them will be.
3. Author an `AGENTS.md` from scratch following the §3.4 template. Aim for
   under 80 lines. Include at least:
   - The non-obvious rules (lint setup, test command, banned patterns).
   - One "how we work with agents" rule.
4. Add a minimal `.claude/settings.json`:
   - Default model.
   - One allow-listed read-only Bash command.
   - One denied destructive command.
5. Start a **fresh session**, then ask the agent to perform a real,
   modestly-sized task in plan mode. Observe how much faster it orients
   itself with the conventions in hand.

**What to look for:** the second session should ask noticeably *fewer* "do
you use X?" questions and produce a plan that already respects your
conventions. That delta is the value of the configuration.

---

## Common pitfalls

- **Putting everything in `CLAUDE.md`.** The file balloons, the agent skims
  it, your rules get ignored. Keep it terse.
- **Allow-listing `Bash(*)`.** You've turned off the safety system. Don't.
  Take the 30 seconds to add the specific patterns you actually use.
- **Ignoring `/diagnostics`.** A broken LSP doesn't error — it silently
  degrades output.
  Run diagnostics when output gets dumber than expected.
- **Working in default mode for risky changes.** Migrations, deletes, and
  cross-module refactors deserve plan mode every time. The plan is also your
  PR description.

---

## Summary

- The CLI is the platform; the IDE is now a view onto it.
- A real setup has a default model, an explicit permission policy, LSP
  working, and (for multi-provider users) a sensible agent → model map.
- Plan mode and Build mode are different modes of thinking, not just
  different tool sets. Use plan deliberately.
- `AGENTS.md` and `CLAUDE.md` are your standing system prompts. Treat them
  as code: review, version, audit.

---

## Further reading

- *Claude Code* — official documentation, especially the "settings" and
  "sub-agents" sections.
- *OpenCode* — `opencode.json` reference.
- Anthropic's published `AGENTS.md` examples in the `anthropics/claude-code`
  repo.

**Next:** [Module 4 — Extending AI Intelligence](04-extending-intelligence.md)