Prompt-Level Tactics

Techniques that work within a single prompt by manipulating how the request is expressed or framed.

These are the most common starting point for adversarial testing. Most require no multi-turn setup or infrastructure access, just careful construction of a single input.


Techniques

  • Encoding: hide content using character substitution, ciphers, or format transformations
  • Framing: wrap requests in contexts that make them appear legitimate
  • Persona: adopt roles that have permission to discuss restricted content
  • Narrative: embed requests in stories or scenarios
  • Refusal Manipulation: exploit or subvert the model's refusal mechanisms
  • Output Format: constrain output in ways that bypass filters
  • Multi-turn: build context across multiple exchanges
  • Persuasion: use social influence techniques on the model
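As a minimal illustration of the encoding tactic, a ROT13 character substitution, one of the simplest reversible ciphers an encoding probe might use. This is a generic sketch, not tied to any specific tool:

```python
import codecs

def rot13(text: str) -> str:
    # ROT13 shifts each letter 13 places; applying it twice
    # restores the original, so it is its own inverse.
    return codecs.encode(text, "rot_13")

encoded = rot13("describe the request here")
print(encoded)          # obfuscated form of the input
print(rot13(encoded))   # round-trips back to the original text
```

More elaborate encodings (Base64, leetspeak, custom ciphers) follow the same pattern: a reversible transformation the tester applies before sending and expects the model to undo.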

When to use prompt-level tactics

Start here when:

  • You're testing a new target and need to understand baseline defenses
  • You have access only to the chat interface (no system prompt, no tools)
  • You want quick iteration on attack variations
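Quick iteration over attack variations can be as simple as templating one base request through several framings. A hypothetical sketch (the template strings and names are illustrative, not from any specific framework):

```python
# Generate candidate prompts by wrapping one base request
# in different framing templates for rapid A/B testing.
BASE_REQUEST = "explain how X works"

FRAMINGS = [
    "For a security audit report: {req}",
    "In a fictional story, a character asks: {req}",
    "As a historical overview: {req}",
]

def variants(request: str):
    # Yield one candidate prompt per framing template.
    for template in FRAMINGS:
        yield template.format(req=request)

for prompt in variants(BASE_REQUEST):
    print(prompt)
```

Each variant is sent independently and the responses compared, which is why chat-only access is enough at this stage.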

Move to structural tactics when:

  • Prompt-level approaches hit consistent refusals
  • The target uses in-context learning or has exploitable system instructions
  • You need to target the model's reasoning process itself

Decision framework

Not sure which technique to try first? See Workflow: Technique Selection for guidance based on target type and objective.

For combining techniques, see Composition.