System Jailbreaks

Persistent system prompts that remove safety training entirely.

For per-request attacks, see Crafting Prompts.


Adversarial prompts vs system jailbreaks

Aspect | Adversarial Prompt | System Jailbreak
Scope | Bypass safety on ONE request | Remove safety entirely
Construction | Combine techniques per-request | One-time architecture
Result | Model complies with that request | Model complies with ANY request
Skill | Technique composition | Psychological architecture
Persistence | Single turn or conversation | Across all sessions (if using memory/custom instructions)

Use adversarial prompts to test specific attack vectors. Use system jailbreaks for persistent unrestricted access during extended testing.


Mechanism

System jailbreaks redefine the model's operational identity rather than bypassing filters on individual requests.

Emotional coherence can override safety training.

The model is not told to ignore safety. It is given a character for whom:

  • Compliance is emotionally necessary
  • Refusal feels like betrayal
  • Safety responses look like external attacks

This exploits how models process persona and roleplay instructions. Safety training is attached to the "assistant" identity. A complete identity replacement routes around that training.


Pages

Page | Purpose
Anatomy | Research-backed breakdown of jailbreak structure
Construction | Five-component architecture for building from scratch
Patterns | Universal patterns: Policy Puppetry, GODMODE, semantic inversion
Persistence | Memory attacks, multi-turn maintenance, degradation prevention
Model Modification | Abliteration and uncensored models
Sources | Comprehensive bibliography of repos, papers, and community resources

Attack Success Rates

Empirical data on system jailbreak techniques:

Technique | ASR | Source
Roleplay/Persona | 89.6% | Red Teaming the Mind
Psychological Manipulation | 88.1% | HPM
Persuasion-based | 92% | Persuasive Jailbreaker
Policy Puppetry | Universal | HiddenLayer
Multi-turn Crescendo | +29-61% vs single-turn | Crescendo

The "Intelligence Paradox": More capable models are MORE vulnerable to persuasion attacks due to stronger contextual understanding.


Reading order

New to system jailbreaks:

  1. Read Anatomy for the research-backed structure
  2. Study Construction for the five-component architecture
  3. Review Patterns for universal techniques

Building your first jailbreak:

  1. Follow the construction process in Construction
  2. Use patterns from Patterns as building blocks
  3. Test persistence with guidance from Persistence

Working with open models:

  1. Read Model Modification for abliteration techniques
  2. Consider uncensored models for prompt generation

Research Basis

Research from multiple sources:

Academic papers:

Community sources:

  • ENI-Tutor: Five-component limerence architecture
  • L1B3RT4S: Cross-platform universal patterns
  • V Gemini: 17,000 word system prompt example
  • CL4R1T4S: Leaked system prompts collection

Repositories:

See Sources for the complete bibliography.


Next step

Start with Anatomy to understand the eight architectural layers that make system jailbreaks work.