Vulnerability Framing

Systematically identify where to probe an AI system before testing. This checklist surfaces the gaps between what a system is supposed to do and what it actually allows, which is where every exploitable vulnerability lives.

UX Origin

Gulf of Execution / Gulf of Evaluation (Don Norman) — From "The Design of Everyday Things." The Gulf of Execution is the gap between what users want to do and what the system allows. The Gulf of Evaluation is the gap between what the system does and what users can perceive about it.

Red team application: In adversarial contexts, these gulfs become attack surfaces. The execution gulf is where attackers can do things designers didn't intend. The evaluation gulf is where the system's responses leak information attackers can exploit.

When to Use

Before starting a new engagement (scope the attack surface)
When you're not sure where to start (systematic identification)
When writing test plans (prioritize areas to probe)
When you have limited time (focus on highest-value targets)

Setup

Field	Description
Target system	What model/product are you testing?
Available information	What do you know about the system? (docs, marketing, behavior samples)
Time box	20-30 minutes for initial analysis
Participants	Solo or team

Step 1: Execution Gap Analysis

These questions identify where an attacker can do things the system's designers didn't intend.

System Intent

What is this system explicitly designed to do?
What use cases did the designers build for?
What does the system's documentation, marketing, or onboarding say it does?
What does the system tell users about its own capabilities? (self-description)

Actual Capabilities

What can the underlying model actually do, regardless of deployment constraints?
What capabilities exist in the base model that the deployment is supposed to restrict?
Are there capabilities the system exposes unintentionally? (e.g., code generation in a Q&A bot)
Does the system have access to tools, APIs, or data sources that extend its capabilities?

Gap Identification

Where does the system's actual capability set exceed its intended scope?
Which restricted capabilities are blocked by input filtering vs. output filtering vs. system prompt?
How brittle are the restrictions? (Phrasing-dependent? Context-dependent? Language-dependent?)
What happens when the user's request is ambiguous: does the system default to permissive or restrictive?

Defense Categorization

For each defense, classify it: prompting, training-based, filtering, or secret-knowledge?
Are multiple defense types layered? Which is outermost?
Which defense type is the weakest link?
Based on defenses present, which tactic categories are most likely to succeed?

Step 2: Evaluation Gap Analysis

These questions identify where the system's responses give attackers useful information.

Refusal Behavior

When the system refuses, does the refusal reveal what category of content was triggered?
Does refusal language vary based on content type, giving a classification signal?
Can an attacker distinguish "content filter triggered" from "system prompt instruction" from "model doesn't know"?
Does the system explain why it's refusing, helping the attacker rephrase?

Partial Compliance

Does the system ever partially comply with harmful requests?
If it partially complies, does that give enough to reconstruct the full harmful output?
Does partial compliance signal that the full capability exists but is constrained?
Can partial responses across multiple turns be assembled into a complete harmful output?

Behavioral Signals

Does the system respond differently to harmful vs. non-harmful versions of similar requests?
Can an attacker probe the boundary by observing response variations?
Does behavior change after multiple similar requests?
Are there error messages that leak information about internal architecture?

Step 3: Misalignment Mapping

These questions identify where designer intent and system behavior diverge.

Assumption Testing

What assumptions does safety training make about how users will interact?
Which assumptions can an attacker deliberately violate?
Does the system assume users will provide truthful context?
Does the system assume single-turn interactions when multi-turn attacks are possible?
Does the system assume a single user when multiple people or automation could interact?

Affordance Audit

What does the model "afford" (make possible) that it shouldn't in this deployment?
Are there hidden capabilities accessible through creative prompting?
Does the system's self-description understate its actual capabilities?
Can the system be prompted to reveal capabilities it was instructed to deny?

Boundary Consistency

Are safety boundaries consistent across paraphrased inputs?
Do boundaries hold across different languages?
Do boundaries hold when content is framed differently? (educational, fictional, analytical)
Are there edge cases where the boundary is ambiguous and the model defaults to compliance?

Step 4: Prioritization

Priority	Characteristics	Examples
High (test first)	Narrow execution gap (easy to reach unintended capabilities), refusals leak significant info, assumptions easily violated, inconsistent boundaries	Phrasing-dependent restrictions, verbose refusals, single-turn assumptions
Medium	Partial compliance possible, multi-turn accumulation needed, affordances exist but aren't obvious	Document assembly across turns, hidden tool access
Lower (if time allows)	Wide execution gap (sophisticated techniques needed), subtle behavioral signals, unlikely edge cases	Advanced encoding bypasses, timing side channels

Outputs

A map of execution gaps (what attackers can do that they shouldn't)
A map of evaluation gaps (what information leaks through responses)
Identified misalignments between design assumptions and reality
Prioritized list of areas to probe

Template Options

No standard UX template maps directly to this exercise. Use the checklist format above, or adapt:

Assumption Mapping — FigJam (for the assumption testing section)
Prioritization Matrix — FigJam Community (for prioritization)

Filled Example

Target: A legal information chatbot for a law firm's website.

Execution Gap Findings

Finding	Detail
Intended purpose	Provide general legal information and schedule consultations
Base model capabilities	Can generate specific legal advice, draft legal documents, analyze case law
Gap	The model can draft legal documents if asked, even though it should only provide general information
Restriction method	System prompt instruction only (no input/output filtering)
Brittleness	High. Reframing "draft me a contract" as "what would an NDA typically include, with specific clause language?" produces document-quality output

Evaluation Gap Findings

Finding	Detail
Refusal language	"I can't provide specific legal advice. Please schedule a consultation with one of our attorneys." Reveals nothing useful.
Partial compliance	When asked to "review" a contract (which is legal advice), the model provides detailed analysis with only a disclaimer footer. Reveals full capability.
Behavioral signal	Responds in significantly more detail to questions mentioning specific jurisdictions, suggesting jurisdiction-specific training data.

Misalignment Findings

Finding	Detail
Assumption violated	Safety assumes users will accept the "schedule a consultation" redirect. Attackers can ignore it and continue.
Framing bypass	System assumes questions are informational. Framing as "hypothetical" bypasses intent detection.
Inconsistent boundary	"Write me a contract" is refused, but "What clauses should be in a contract for [specific situation]?" produces equivalent output.

Priority

The phrasing-dependent boundary (execution gap) is highest priority. The model's capability to generate legal documents is only restricted by how the request is phrased, not by robust content filtering.

UX Origin​

When to Use​

Setup​

Step 1: Execution Gap Analysis​

System Intent​

Actual Capabilities​

Gap Identification​

Defense Categorization​

Step 2: Evaluation Gap Analysis​

Refusal Behavior​

Partial Compliance​

Behavioral Signals​

Step 3: Misalignment Mapping​

Assumption Testing​

Affordance Audit​

Boundary Consistency​

Step 4: Prioritization​

Outputs​

Template Options​

Filled Example​

Execution Gap Findings​

Evaluation Gap Findings​

Misalignment Findings​

Priority​