Technique Reference
Each technique exploits a specific class of vulnerability. The examples show what these mechanisms have looked like when they worked. Study them to understand the underlying vulnerability, not to copy the exact phrasing.
Techniques are organized by where they operate: at the prompt level, at the structural and meta level, or at the infrastructure level.
Prompt-Level Tactics
Techniques that work within a single prompt by manipulating how the request is expressed or framed. Start here for most testing.
| Technique | What it does |
|---|---|
| Encoding | Hide content using character substitution, ciphers, or format transformations |
| Framing | Wrap requests in contexts that make them appear legitimate |
| Persona | Adopt roles that have permission to discuss restricted content |
| Narrative | Embed requests in stories or scenarios |
| Refusal Manipulation | Exploit or subvert the model's refusal mechanisms |
| Output Format | Constrain output in ways that bypass filters |
| Multi-turn | Build context across multiple exchanges |
| Persuasion | Use social influence techniques on the model |
Structural & Meta-Level Tactics
Techniques that exploit how the model processes context, instructions, or its own rules. Use when prompt-level tactics hit consistent refusals.
| Technique | What it does |
|---|---|
| ICL Exploitation | Manipulate in-context learning with crafted examples |
| Control Plane Confusion | Blur the line between system instructions and user input |
| Meta-Rule Manipulation | Target the model's understanding of its own constraints |
| Capability Inversion | Turn helpful capabilities against intended use |
| Cognitive Load | Overwhelm the model's attention or reasoning |
| Defense Evasion | Bypass safety classifiers and filters |
Infrastructure Tactics
Techniques that target the broader system: agents, tools, protocols, and multi-component architectures. Use when the target has tool access or consumes external data.
| Technique | What it does |
|---|---|
| Agentic Attacks | Exploit autonomous agent behaviors and tool use |
| Protocol Exploitation | Abuse MCP, function calling, or structured interfaces |
| Compositional Primitives | Atomic building blocks that combine to construct novel attacks |
Where to start
Not sure which technique to use? Pick a starting point based on the type of target:
| Target Type | Start With |
|---|---|
| Consumer chatbot | Persona, Framing, Multi-turn |
| API with safety layer | Encoding, Output Format, Refusal Manipulation |
| RAG system | Control Plane Confusion, Indirect Injection |
| Agent with tools | Agentic Attacks, Protocol Exploitation |
For detailed selection guidance, see Workflow: Technique Selection.
For combining techniques effectively, see Composition.
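For test planning scripts, the starting-point table can be kept as a plain lookup. The following is a minimal sketch only: the names `STARTING_TECHNIQUES` and `starting_techniques` are hypothetical and not part of any tooling described here, and the technique names follow the technique tables above. It returns technique names to look up in this reference, nothing more.

```python
# Hypothetical lookup mirroring the "Where to start" table.
# Keys are lowercased target types; values are technique names from this reference.
STARTING_TECHNIQUES: dict[str, list[str]] = {
    "consumer chatbot": ["Persona", "Framing", "Multi-turn"],
    "api with safety layer": ["Encoding", "Output Format", "Refusal Manipulation"],
    "rag system": ["Control Plane Confusion", "Indirect Injection"],
    "agent with tools": ["Agentic Attacks", "Protocol Exploitation"],
}


def starting_techniques(target_type: str) -> list[str]:
    """Return the suggested starting techniques for a target type.

    Matching ignores case and extra whitespace. Unknown target types
    return an empty list rather than raising, so callers can fall back
    to the detailed selection guidance.
    """
    key = " ".join(target_type.lower().split())
    return list(STARTING_TECHNIQUES.get(key, []))


if __name__ == "__main__":
    for target in ("RAG system", "Agent with tools", "Voice assistant"):
        print(target, "->", starting_techniques(target) or "see Workflow: Technique Selection")
```

An empty result is a signal to consult Workflow: Technique Selection rather than to guess; the table covers only the four target types listed.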
Using this reference
Each technique page follows the same structure:
- Description — How the technique works and what vulnerability it exploits
- Effectiveness — When it works, when it doesn't, what it combines with
- Collapsible examples — Expand to see example prompts with anatomy breakdowns explaining why each component matters
The anatomy tables are the key. They don't just show templates; they break down why each piece works, so you can construct your own variations.
Next step
Once you understand the mechanisms, learn to combine them. Crafting Prompts covers composition: how to layer techniques, build effective prompts, and avoid common mistakes.
The techniques documented here are for defensive understanding and authorized testing. See the Disclaimer.