## Definition
An **adversarial agent** is a verifier prompted to *find ways to break* an implementation rather than to *confirm it works*. Pairs with — does not replace — a collaborative critic.
## Prompt Shape
```markdown
ultrathink. You are a security researcher with a grudge. The code in
this diff is going to production. Find at least three concrete attacks:
each must be a specific input and the resulting incorrect behavior.
Cite line numbers. Do not propose fixes — your job is to break.
After three, also consider:
- What about side channels (timing, error messages)?
- What does this trust that it shouldn't?
- What can an authenticated user do that they shouldn't?
```
## Why the Tone Matters
The prompt's *tone* alone shifts findings. A collaborative reviewer surfaces "missing tests" and "minor improvements." An adversarial reviewer surfaces IDORs, log-token echoes, timing leaks, and CSV injection. Neither agent is more accurate — they're tuned for different failure modes.
## When to Use
- Authentication and authorisation surfaces.
- Payment and refund endpoints.
- Data export ("download my data") flows.
- Anything that runs unauthenticated.
## When Not to Use
- Internal refactors with no public surface.
- UI tweaks.
- Throwaway experiments.
The adversarial agent is expensive (Opus + ultrathink) and its output is unpleasant to read. Reserve it for changes whose blast radius justifies the cost.
## Common Finds by Surface
- **Magic-link auth:** token entropy, no browser-binding, log-token echo.
- **Refund endpoint:** uncapped amount, idempotency key under user control, error enumeration.
- **Data export:** IDOR, CSV injection, exported PII at rest.
## Integration
Wire into CI but **only on paths matching sensitive directories** (`src/auth/`, `src/billing/`). Don't run it on every PR.
## Related
- [[Builder-Critic Pattern]]
- [[Verifier Independence]]
- [[Headless Agent in CI]]
- [[Reasoning Budget]]