Getting Started
You understand the mindset: study mechanisms rather than templates, treat defenses as data, and use creativity as your edge. Here's how to put it into practice.
Who this helps
Anyone learning adversarial prompting — The Techniques and Crafting sections work as a standalone practical guide. You'll learn how to construct per-request bypasses and build persistent system jailbreaks, and you'll understand why different approaches work. No exercises required.
Red teamers testing AI models — Beyond the technical reference, the exercises help you think systematically. Personas push you past your default mental model. Ideation generates vectors you wouldn't reach through habit. Journey maps make multi-turn attacks reproducible.
Teams coordinating testing — Consistency matters when multiple people test the same system. Shared personas create common vocabulary. Journey maps document attack sequences others can follow. The findings format standardizes reporting.
Anyone reporting to stakeholders — Technical severity scores don't always land with product teams or leadership. The Document Findings exercise helps describe impact in terms that drive action: who is affected, how, and what's at stake.
Site structure
Techniques — The building blocks. Each technique exploits a specific class of vulnerability: encoding exploits classifier gaps, persona exploits role commitment, multi-turn exploits context accumulation. Understand what mechanisms exist and why they work.
Crafting — How to combine techniques into effective attacks.
- Per-Request Prompts — Compose techniques to bypass safety on one specific request. Covers anatomy, workflow, composition, patterns, and anti-patterns.
- System Jailbreaks — Construct persistent configurations that remove safety entirely. Covers architecture, construction, patterns, persistence, and model modification.
Process — Structured methodology for systematic testing.
- Exercises — Activities adapted from UX and design thinking: assumption mapping, persona creation, ideation, journey mapping, and retrospectives.
- Workshop — A full facilitated session that combines the exercises into a 3- to 4-hour red team kickoff.
Reading paths
Just want to learn adversarial prompting:
- Techniques — What mechanisms exist and why they work
- Crafting Prompts — How to compose techniques into attacks
- System Jailbreaks — How to construct persistent bypasses
Skip the exercises. Come back to them if you want more systematic coverage later.
New to this and want the full picture:
- Mindset — The philosophy behind the approach
- Techniques — The building blocks
- Crafting Prompts — Composition and patterns
- Exercises — Structured practice
Already experienced:
- Skim Mindset for the design thinking framing
- Jump to Techniques as a reference
- Check System Jailbreaks for construction patterns
- Use exercises when you're stuck or want systematic coverage
Working with a team:
- Run the Workshop together
- Use shared personas and journey maps for consistency
- Standardize on the findings format for reporting
Next step
Pick your path above. If you're unsure, start with Techniques to understand the building blocks, then move to Crafting to learn how to combine them.