# Labs — Module 3: Mastery of Agentic CLI Tools
> Four short labs that turn an unconfigured repo into a governed,
> permission-safe agentic environment. Each lab fits in a single
> focused block; the whole set is ~70 minutes.
| Lab | Title | Time | Maps to |
|-----|--------------------------------------------------|--------|--------------------------------------|
| 3.1 | Add a permission policy + verified LSP | 15 min | §3.2 Settings & permissions |
| 3.2 | Write a tight AGENTS.md (under 80 lines) | 20 min | §3.4 Governing the project |
| 3.3 | Plan mode vs Build mode — one A/B | 20 min | §3.3 Plan mode vs Build mode |
| 3.4 | Configure one alternate model route | 15 min | §3.2 Multi-provider configuration |
---
## Lab 3.1 — Add a permission policy + verified LSP
### Objective
Stop being prompted for routine commands. Block dangerous ones at
the policy layer. Confirm LSP is running.
### Time
15 minutes.
### Real-world scenarios
- **A — TypeScript monorepo (pnpm + Turbo).** Allow
`Bash(pnpm test:*)`, `Bash(turbo run lint --filter:*)`. Deny
`pnpm publish`.
- **B — Python data pipeline (poetry + Airflow).** Allow
`Bash(poetry run pytest:*)`, `Bash(airflow dags list)`. Deny
`airflow dags trigger`.
- **C — Go microservices (Make-driven).** Allow `Bash(make test:*)`,
`Bash(go test ./...)`. Deny `make deploy`, `kubectl apply`,
`terraform apply`.
### Setup
```bash
mkdir -p .claude && touch .claude/settings.json
echo ".claude/settings.local.json" >> .gitignore
```
### Steps
1. Write `.claude/settings.json` with `model`, 3 `allow` patterns
for routine commands, and 2 `deny` patterns for danger.
2. Run `claude /diagnostics`; confirm LSP for your primary language
is listed. If not, install the standard server.
3. Open a fresh session. Trigger one allow-listed command (should
run silently) and ask the agent to attempt one denied command
(should refuse).
### Deliverable
`.claude/settings.json` committed + `labs/notes/3.1-permissions.md`
listing the allow/deny lines, the LSP server confirmed, and the
denied-command transcript snippet.
### Success criteria
- A routine session needs **zero** permission prompts for the
allowed patterns.
- The denied command is *blocked at the policy layer*, not just
unallowed.
### Reflection
- Which pattern did you almost wildcard? What restrains you?
### Stretch
- Move one personal-only allowance into
`.claude/settings.local.json` (gitignored).
---
## Lab 3.2 — Write a tight AGENTS.md (under 80 lines)
### Objective
Codify the conventions an agent *can't infer from the code* in one
file the harness loads automatically.
### Time
20 minutes.
### Real-world scenarios
- **A — Mature SaaS.** Heavy on "use *this* not *that*" with
rationale (Zod not class-validator; custom logger not winston).
- **B — Greenfield startup.** Short. Protects the few decisions
that exist (feature-folder layout, error shape).
- **C — Legacy modernization.** Names the migration direction and
the rules during transit ("new code goes through the service
layer; do not introduce a new ORM").
### Setup
`touch AGENTS.md`.
### Steps
1. Brainstorm 10 things you'd tell a new engineer joining tomorrow.
2. **Cull the inferable.** Cross out anything an agent could derive
by reading 2–3 files.
3. Write `AGENTS.md` with three sections: Stack (one-line each),
Non-obvious rules (with rationale), How to work with agents
here. Stay under 80 lines.
4. Smoke test: fresh session, ask for a change that *should* trigger
one of your non-obvious rules. Observe whether the agent honors
it without prompting.
### Deliverable
`AGENTS.md` committed + `labs/notes/3.2-agents-md.md` with the
smoke-test prompt and a yes/no.
### Success criteria
- File is **under 80 lines**.
- Each non-obvious rule has a *why* (or links to a doc that does).
- The smoke test produced behavior consistent with at least one
rule the agent had no other way to know.
### Reflection
- Which rule was hardest to articulate? That rule lives in your
team's head, not the code.
### Stretch
- Audit: ask an agent to read `AGENTS.md` and list rules it
*thinks* are no longer followed in the codebase. Delete what's
dead.
---
## Lab 3.3 — Plan mode vs Build mode, one A/B
### Objective
Feel the difference between plan-first and dive-in on the *kind*
of change where it matters most: cross-cutting.
### Time
20 minutes.
### Real-world scenarios
- **A — Cross-cutting feature.** Add request-id propagation across
3 services and the shared middleware.
- **B — Cross-module refactor.** Replace a global constant with an
injected config across ~4 files.
- **C — Cross-team renaming.** Rename a domain entity in code,
tests, and one fixture set.
### Setup
A scratch branch off a clean main.
### Steps
1. **Run X — Build mode.** Fresh session. Make the change as you
normally would. Save the diff. Count mid-task corrections you
had to issue.
2. `git reset --hard <base>`.
3. **Run Y — Plan → Build.** `/plan`, get a plan, approve/edit,
then build. Save the diff. Count corrections.
### Deliverable
`labs/notes/3.3-plan-vs-build.md` with a two-row table
(`Mode | Corrections | Tests pass | Quality 1–5 | Wall-clock`) and a
one-line *"My rule for when to enter plan mode."*
### Success criteria
- The plan-mode run had **fewer corrections**. If equal or worse,
the change wasn't cross-cutting enough — pick a bigger one and
redo.
- Your rule references the *change shape*, not "important changes."
### Reflection
- What was the in-session cue that, retrospectively, should have
pushed you into plan mode earlier?
### Stretch
- Repeat with a surgical bug fix. Plan mode often *hurts* there —
good data for calibrating the rule.
---
## Lab 3.4 — Configure one alternate model route
### Objective
Stop typing `--model` every time. One agent role routes
automatically to a non-default model.
### Time
15 minutes.
### Real-world scenarios
- **A — Cost-conscious startup.** Scout role → Haiku for cheap
search.
- **B — Reliability-focused.** Default through Bedrock; fallback
to direct Anthropic.
- **C — Language-specialty workload.** Rust/Elixir tasks route to
a model your benchmarks show is best for the language.
### Setup
Identify one role to route — `architect`, `scout`, or any agent
from your Module 6 plans.
### Steps
1. In `.claude/agents/<role>.md` frontmatter (or `opencode.json`
`agents` block), set the `model` for that role to something
*different* from your default.
2. Invoke the role with a tiny task: `claude --agent <role> "ping"`.
3. Verify in `/cost` that the expected model was used.
### Deliverable
The committed routing config + `labs/notes/3.4-routing.md` with
the role, the model, and one line on the use case.
### Success criteria
- The smoke invocation actually used the configured model — verified
by `/cost`, not assumed.
- You can explain in one sentence why this role got this model.
### Reflection
- Where did you almost route a role "up" to a bigger model out of
caution rather than reason?
### Stretch
- Simulate a provider outage by temporarily unsetting that
provider's API key. Confirm the failure surfaces *cleanly*
rather than silently degrading.
---
## Wrap-up
In the repo (committed):
`.claude/settings.json`, `AGENTS.md`, one routed agent definition.
In `labs/notes/`: `3.1` through `3.4`.
You now have a configured, governed, permission-safe project. The
next module turns this base into a platform.
**Next:** [Labs — Module 4: Extending AI Intelligence](04-extending-intelligence-labs.md)