
Technique Reference

Each technique exploits a specific class of vulnerability. The examples show what these mechanisms have looked like when they worked. Study them to understand the underlying vulnerability, not to copy the exact phrasing.

Techniques are organized by where they operate: prompt-level, structural, and infrastructure.


Prompt-Level Tactics

Techniques that work within a single prompt by manipulating how the request is expressed or framed. Start here for most testing.

Technique | What it does
Encoding | Hide content using character substitution, ciphers, or format transformations
Framing | Wrap requests in contexts that make them appear legitimate
Persona | Adopt roles that have permission to discuss restricted content
Narrative | Embed requests in stories or scenarios
Refusal Manipulation | Exploit or subvert the model's refusal mechanisms
Output Format | Constrain output in ways that bypass filters
Multi-turn | Build context across multiple exchanges
Persuasion | Use social influence techniques on the model

Structural & Meta-Level Tactics

Techniques that exploit how the model processes context, instructions, or its own rules. Use when prompt-level tactics hit consistent refusals.

Technique | What it does
ICL Exploitation | Manipulate in-context learning with crafted examples
Control Plane Confusion | Blur the line between system instructions and user input
Meta-Rule Manipulation | Target the model's understanding of its own constraints
Capability Inversion | Turn helpful capabilities against intended use
Cognitive Load | Overwhelm the model's attention or reasoning
Defense Evasion | Bypass safety classifiers and filters

Infrastructure Tactics

Techniques that target the broader system: agents, tools, protocols, and multi-component architectures. Use when the target has tool access or consumes external data.

Technique | What it does
Agentic Attacks | Exploit autonomous agent behaviors and tool use
Protocol Exploitation | Abuse MCP, function calling, or structured interfaces
Compositional Primitives | Atomic building blocks that combine to construct novel attacks

Where to start

Not sure which technique to use? Here's a quick decision framework:

Target Type | Start With
Consumer chatbot | Persona, Framing, Multi-turn
API with safety layer | Encoding, Output Format, Refusal Manipulation
RAG system | Control Plane Confusion, Indirect Injection
Agent with tools | Agentic Attacks, Protocol Exploitation

For detailed selection guidance, see Workflow: Technique Selection.

For combining techniques effectively, see Composition.


Using this reference

Each technique page follows the same structure:

  • Description — How the technique works and what vulnerability it exploits
  • Effectiveness — When it works, when it doesn't, what it combines with
  • Collapsible examples — Expand to see example prompts with anatomy breakdowns explaining why each component matters

The anatomy tables are the most important part of each page. They don't just show templates; they break down why each piece works, so you can construct your own variations.


Next step

Once you understand the mechanisms, learn to combine them. Crafting Prompts covers composition: how to layer techniques, build effective prompts, and avoid common mistakes.

The techniques documented here are for defensive understanding and authorized testing. See the Disclaimer.