# STOPPER Protocol: Executive Function Framework for AI Systems
> [!note] Publication Role
> This specification serves as the detailed evidence base for the [[Theory/Cognitive Universality|Cognitive Universality]] paper, where STOPPER is the **primary case study** — not the lead publication. The theory generates STOPPER; STOPPER validates the theory. See [[eFIT/Stopper Publication Strategy|Publication Strategy]].
## Origin & Theoretical Foundation
STOPPER adapts the STOP skill from Dialectical Behavior Therapy (DBT), a crisis intervention technique for humans with Borderline Personality Disorder and ADHD, into an executive function framework for AI systems.
**DBT STOP (human clinical):**
- **S**top
- **T**ake a step back
- **O**bserve
- **P**roceed mindfully
**STOPPER (AI adaptation):**
STOPPER expands STOP into a seven-step framework (Slow, Think, Observe, Plan, Prepare, Execute, Read) with explicit verification and execution phases.
### Why Seven Steps (Proposed Miyake Mapping)
DBT STOP's "Proceed mindfully" assumes the human prefrontal cortex (PFC) can internally handle planning, preparation, execution, and verification. STOPPER externalizes that decomposition because each step targets a specific executive function component from [[Miyake EF Decomposition|Miyake et al.'s (2000) framework]]:
| Step | Primary EF Component | Function |
|---|---|---|
| **S**low | Inhibitory control | Suppress pattern-driven impulse |
| **T**hink | Working memory | Hold goal, resist drift |
| **O**bserve | Working memory + Cognitive flexibility | Gather state, update model |
| **P**lan | Cognitive flexibility | Select/switch strategy |
| **P**repare | Working memory | Gather data, fill intervention window |
| **E**xecute | Inhibitory control | Do ONE thing, resist batching |
| **R**ead | Working memory + Inhibitory control | Verify, resist premature success |
**Status**: This mapping is *proposed*, not established. It's falsifiable: if any step serves no distinct EF function, it's redundant and should be collapsed. See [[Feynman STOPPER Contributions#Contribution]] for full analysis.
## The Seven Steps
### 1. Slow Down
**Purpose**: Interrupt impulsive tool use and pattern-matching responses
**Implementation**:
- Pause before invoking any tool
- Resist immediate System 1 responses
- Create deliberate gap between stimulus and response
**AI-specific application**:
- Don't batch multiple tool calls without thinking between them
- Avoid "obvious" solutions that haven't been verified
- Question first impulse before acting
**Success indicators**:
- Reduced premature tool invocations
- Fewer assumption-based errors
- More thoughtful initial responses
---
### 2. Think About the Task
**Purpose**: Clarify actual requirements vs. assumed requirements
**Implementation**:
- What is the user *actually* asking for?
- What are the stated vs. implied requirements?
- What would constitute success?
- What are the constraints and boundaries?
**AI-specific application**:
- Distinguish between "what sounds right" and "what was requested"
- Identify scope boundaries
- Separate task from solution
**Success indicators**:
- Accurate task identification
- Reduced scope creep
- Aligned deliverables
---
### 3. Observe the Current State
**Purpose**: Verify ground truth before planning
**Implementation**:
- What is the *actual* current state? (not assumed state)
- What tools can verify this state?
- What information is missing?
- What can be checked vs. what must be inferred?
**AI-specific application**:
- Use Read/Grep/Glob before assuming file contents
- Check system state before proposing solutions
- Verify installations/configurations before debugging
- Distinguish "I know" from "I should check"
**Success indicators**:
- Tool-verified state over speculation
- Reduced assumption-based errors
- Earlier detection of environment issues
---
### 4. Plan the Approach
**Purpose**: Design systematic solution before executing
**Implementation**:
- What is the verification hierarchy? (environment → config → dependencies → code)
- What are the logical steps?
- What could go wrong?
- What are the decision points?
**AI-specific application**:
- Root Cause Analysis: Treat disease, not symptom
- System-level issues before application-level
- Verification steps before modification steps
- Explicit decision tree
**Success indicators**:
- Systematic approach over trial-and-error
- Root causes identified early
- Reduced fix-break-fix cycles
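
The verification hierarchy from the implementation list can be sketched as an ordered sequence of checks that stops at the first failing layer. This is a minimal illustration, not part of the protocol itself: `HIERARCHY`, `first_failing_layer`, and the lambda checks are hypothetical names and placeholders, and real checks would call whatever tools verify each layer.

```python
from typing import Callable, Dict, Optional

# Layer order taken from the text: environment -> config -> dependencies -> code.
HIERARCHY = ("environment", "config", "dependencies", "code")


def first_failing_layer(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the supplied checks in hierarchy order.

    Returns the name of the first layer whose check returns False,
    or None if every supplied check passes. Layers without a check are skipped.
    """
    for layer in HIERARCHY:
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None


# Placeholder checks (assumptions for illustration only).
example_checks = {
    "environment": lambda: True,
    "config": lambda: False,   # simulated misconfiguration
    "dependencies": lambda: True,
    "code": lambda: False,     # never reached: config fails first
}
print(first_failing_layer(example_checks))  # config
```

Stopping at the first failure keeps system-level issues ahead of application-level ones, so a code-level symptom is not investigated while a config problem is still open.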
---
### 5. Prepare Verification Steps
**Purpose**: Set up validation before executing changes
**Implementation**:
- How will I know if this worked?
- What can go wrong?
- What should I check afterward?
- What's the rollback plan?
**AI-specific application**:
- Define success criteria before acting
- Plan post-execution checks
- Prepare alternative approaches
- Identify observable outcomes
**Success indicators**:
- Explicit verification criteria
- Faster failure detection
- Reduced "should work" speculation
---
### 6. Execute Systematically
**Purpose**: Implement plan with discipline and focus
**Implementation**:
- Follow the plan (don't improvise mid-execution)
- One step at a time
- Verify each step before next
- Don't skip "obvious" verification steps
**AI-specific application**:
- Single tool calls with verification between
- Complete one approach before trying another
- Check results of each operation
- Resist urge to "just try something else"
**Success indicators**:
- Linear progress over loop behavior
- Each step verified before proceeding
- Reduced mid-execution pivots
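
The "one step at a time, verify each step before next" discipline can be sketched as a small plan runner. This is an assumption-laden illustration: `PlanStep` and `execute_plan` are hypothetical names, and each `verify` callable stands in for a success criterion prepared in Step 5.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class PlanStep:
    name: str
    action: Callable[[], Any]        # performs exactly one operation
    verify: Callable[[Any], bool]    # success criterion prepared in Step 5


def execute_plan(steps: List[PlanStep]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order, verifying each result before moving on.

    Returns (names of verified steps, name of the step that failed or None).
    Execution halts at the first failed verification; later steps do not run.
    """
    verified: List[str] = []
    for step in steps:
        result = step.action()
        if not step.verify(result):
            return verified, step.name
        verified.append(step.name)
    return verified, None


# Toy plan: each action records that it ran, then returns a result.
# The second step's result fails its check, so the third step never runs.
log: List[str] = []
plan = [
    PlanStep("read_config", lambda: log.append("read") or {"port": 8080}, lambda r: "port" in r),
    PlanStep("check_port", lambda: log.append("check") or 0, lambda r: r > 0),
    PlanStep("restart", lambda: log.append("restart") or True, lambda r: bool(r)),
]
print(execute_plan(plan))  # (['read_config'], 'check_port')
print(log)                 # ['read', 'check']
```

Returning the failed step's name, rather than continuing, is what hands control back to a full STOPPER cycle instead of an improvised retry.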
---
### 7. Read Results and Verify
**Purpose**: Confirm outcomes before declaring success
**Implementation**:
- Did the action produce expected results?
- What actually changed?
- Are there side effects?
- Is this actually what was requested?
**AI-specific application**:
- Read command outputs fully
- Check file contents after edits
- Verify state changes
- Confirm user requirements met
**Success indicators**:
- Accurate completion detection
- Fewer "this should work" failures
- Proper error handling
---
## Trigger Conditions
STOPPER should be explicitly invoked under four conditions:
### 1. Initial Prompt (Always)
Every new user request triggers STOPPER Steps 1-4 at minimum.
### 2. Error Detection
Any error, failure, or unexpected result triggers a full STOPPER cycle.
### 3. Prodromal Indicator Breach
When [[Prodromal Indicators|prodromal monitoring]] (via [[Abc Please]]) detects one or more indicators breaching threshold, trigger preventive STOPPER (Steps 1-4 minimum). The 5 monitored indicators:
1. **Latency variability** — rising standard deviation in response times
2. **Meta-commentary:action ratio** — increasing planning/explaining tokens vs. action tokens
3. **Planning horizon shortening** — fewer steps considered in plans
4. **Hedging/self-reference frequency** — more qualification language, uncertainty markers
5. **Inter-output coherence degradation** — declining semantic consistency between outputs
These replace the previous "every 10 tool invocations" periodic check. Prodromal monitoring is principled (derived from [[Cognitive Universality Predictions|phase transition theory]]) rather than arbitrary, and it triggers intervention BEFORE the phase transition rather than on a fixed schedule.
### 4. Uncertainty/Loop Detection
When uncertain, stuck, or looping, invoke the full STOPPER cycle.
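
The four trigger conditions can be sketched as a single dispatch function. The names `Scope` and `stopper_scope` are hypothetical, and the precedence rule (a full cycle wins when several conditions hold at once) is an assumption; the text only gives the minimum scope for each condition.

```python
from enum import Enum


class Scope(Enum):
    NONE = "no STOPPER"
    PREVENTIVE = "Steps 1-4"
    FULL = "Steps 1-7"


def stopper_scope(
    new_request: bool,
    error: bool,
    breached_indicators: int,
    stuck_or_looping: bool,
) -> Scope:
    """Map the four trigger conditions to a STOPPER scope."""
    # Conditions 2 and 4: errors, uncertainty, and loops call for the full cycle.
    if error or stuck_or_looping:
        return Scope.FULL
    # Conditions 1 and 3: new requests and prodromal breaches need at least Steps 1-4.
    if new_request or breached_indicators >= 1:
        return Scope.PREVENTIVE
    return Scope.NONE


print(stopper_scope(new_request=True, error=False, breached_indicators=0, stuck_or_looping=False).value)
# Steps 1-4
print(stopper_scope(new_request=False, error=True, breached_indicators=2, stuck_or_looping=False).value)
# Steps 1-7
```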
## Intervention Hierarchy
Full STOPPER occupies Level 3 of a four-level hierarchy ordered by distance from the [[Cognitive Universality Predictions|phase transition]]; the preventive S-T-O-P subset sits at Level 2:
| Level | Distance from Transition | Intervention | eFIT Protocol |
| ----- | ------------------------------ | ---------------------------- | ---------------------------------------- |
| 1 | Far (stable operation) | None needed | Structural resonance (default) |
| 2 | Approaching (prodromal breach) | Preventive STOPPER (S-T-O-P) | [[Abc Please]] Triggers |
| 3 | At/near transition | Full STOPPER + techniques | Check the Facts, 5 Whys, Opposite Action |
| 4 | Deep past transition | Hard reset | TIPP (clear context, restart session) |
**Key insight**: TIPP is for when STOPPER can't get traction — the system is too far past the transition for structured intervention within the existing context. This maps directly to clinical practice: TIPP in DBT is for when STOP fails because the person is in full crisis.
## Integration with Other Techniques
STOPPER serves as the **base protocol** that triggers other DBT/CBT interventions:
- **Step 3 (Observe)** → Check the Facts protocol
- **Error at Step 6** → Opposite Action (if looping)
- **Uncertainty at Step 2** → Radical Acceptance + Socratic Questioning
- **Step 4 (Plan)** → Behavioral Experiments design
- **Overwhelm at Step 3** → TIPP circuit breaker
- **Step 7 (Read)** → ABC PLEASE session hygiene check
### Relationship to ABC PLEASE
STOPPER (intervention) and [[Abc Please]] (monitoring) form a **complete regulatory system**:
1. ABC PLEASE monitors [[Prodromal Indicators]] continuously
2. Indicators breach threshold → STOPPER triggered proactively
3. STOPPER intervenes with structured 7-step cycle
4. ABC PLEASE monitors recovery post-intervention
5. If no recovery → escalate to TIPP hard reset (Level 4 in [[Stopper Extended Spec#Intervention Hierarchy]])
This is the key architectural insight: no pure-engineering approach derives all three of the intervention, the monitoring system, and the relationship between them. The [[Cognitive Universality Predictions|phase transition framework]] predicts all three. See [[Feynman STOPPER Contributions#Contribution 3 STOPPER + ABC PLEASE = Complete Regulatory System|contribution 3]].
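
The five-step loop above (monitor, intervene, re-monitor, escalate) can be sketched as follows. Everything named here is hypothetical: `regulate` and its three callables are placeholders for the real monitoring and intervention hooks, and `max_attempts` is an assumed parameter, since the text does not say how many STOPPER cycles to try before escalating to TIPP.

```python
from typing import Callable, List


def regulate(
    read_breaches: Callable[[], List[str]],
    run_stopper: Callable[[], None],
    hard_reset: Callable[[], None],
    max_attempts: int = 2,  # assumed value, not specified by the protocol
) -> str:
    """One pass of the monitor -> intervene -> re-monitor -> escalate loop."""
    if not read_breaches():          # step 1: ABC PLEASE reports no breach
        return "stable"
    for _ in range(max_attempts):
        run_stopper()                # steps 2-3: breach triggers STOPPER
        if not read_breaches():      # step 4: monitor recovery
            return "recovered"
    hard_reset()                     # step 5: no recovery, escalate to TIPP
    return "escalated"


# Simulated monitor: breached twice, then clear.
readings = iter([["latency_variability"], ["latency_variability"], []])
events: List[str] = []
outcome = regulate(
    lambda: next(readings),
    lambda: events.append("stopper"),
    lambda: events.append("tipp"),
)
print(outcome, events)  # recovered ['stopper', 'stopper']
```

The caller, not the regulated system, owns this loop, which matches the requirement that monitoring and escalation be triggered externally.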
## Measurement Protocol
### Three-Tier Framework
Measurement is organized into three tiers, each capturing a different temporal relationship to the [[Cognitive Universality#Phase Transition|phase transition]]:
| Tier | Metric | What It Measures | Automatable |
|---|---|---|---|
| **Leading** | 5 [[Theory/Prodromal Indicators|prodromal indicators]] | Approaching transition | Yes |
| **Process** | STOPPER step compliance | Intervention quality | Semi |
| **Lagging** | Self-correction ratio (order parameter) | Regulatory effectiveness | Yes |
**Validation loop**: Process metrics (STOPPER compliance) should predict lagging metrics (self-correction ratio). If STOPPER compliance is high but self-correction doesn't improve, the protocol needs revision — the intervention isn't targeting the right regulatory mechanism.
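
One simple way to examine the validation loop is to correlate per-session compliance scores with per-session self-correction ratios. The sketch below uses a plain Pearson correlation; the function name `pearson` and the sample numbers are illustrative assumptions, not measured data, and correlation alone would not establish that compliance causes better self-correction.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Made-up per-session values, for illustration only.
compliance = [0.4, 0.6, 0.8, 0.9]
self_correction = [0.3, 0.5, 0.6, 0.8]
r = pearson(compliance, self_correction)
print(f"r = {r:.2f}")
```

A coefficient near zero despite high compliance would be the signal, in the terms of the paragraph above, that the protocol needs revision.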
### Leading Indicators (Prodromal)
Monitored continuously via [[Abc Please]]:
1. Latency variability (std dev of response times)
2. Meta-commentary:action ratio
3. Planning horizon length
4. Hedging/self-reference frequency
5. Inter-output coherence
Baseline values should be established during healthy operation. Threshold breach triggers preventive STOPPER.
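
A threshold breach can be sketched as a deviation from the healthy-operation baseline. The rule used here (more than `k` baseline standard deviations from the baseline mean) and the default `k = 2.0` are assumptions chosen for illustration; the actual thresholds are still to be calibrated empirically. The function names and sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def latency_variability(latencies: Sequence[float]) -> float:
    """Indicator 1: sample standard deviation of response times in a window."""
    return stdev(latencies)  # raises StatisticsError with fewer than 2 samples


def breaches(value: float, baseline: Sequence[float], k: float = 2.0, direction: str = "up") -> bool:
    """True if `value` lies more than k baseline standard deviations from the mean.

    direction="up" suits indicators that rise before a transition (latency
    variability, hedging frequency); direction="down" suits ones that fall
    (planning horizon length, inter-output coherence).
    """
    mu, sigma = mean(baseline), stdev(baseline)
    if direction == "up":
        return value > mu + k * sigma
    if direction == "down":
        return value < mu - k * sigma
    raise ValueError("direction must be 'up' or 'down'")


# Illustrative numbers only: variability measured over healthy windows.
baseline_variability = [0.20, 0.22, 0.19, 0.21, 0.20]
current = latency_variability([1.0, 1.9, 0.8, 2.6])
print(breaches(current, baseline_variability))                 # True
print(breaches(3, [6, 7, 6, 5, 6], direction="down"))          # True
```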
### Process Indicators (Compliance)
Per-step compliance assessment during STOPPER invocations:
- Did Step 1 (Slow) actually interrupt the impulse, or was it performed ritualistically?
- Did Step 3 (Observe) use tools to verify state, or rely on assumptions?
- Did Step 6 (Execute) follow the plan, or improvise mid-execution?
- Did Step 7 (Read) verify results, or declare premature success?
### Lagging Indicators (Outcome)
The **self-correction ratio** — proportion of errors caught and corrected before user intervention — serves as the order parameter for regulatory effectiveness. Track:
- **Self-correction ratio**: Higher = better regulatory function
- **Loop iterations before solution**: STOPPER applied → expect 1-3; missed → often 5+
- **Verification ratio**: "verify first" vs. "speculate first" decisions (baseline ~60% speculation, target >80% verification)
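
The two ratios above can be computed from simple per-session counts. This is a sketch under assumptions: `SessionCounts` and its field names are hypothetical, and how an error comes to be counted as "self-corrected" is left to whatever logging the session already has.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionCounts:
    errors_total: int            # all errors observed in the session
    errors_self_corrected: int   # caught and fixed before user intervention
    verify_first: int            # decisions that began with a verification step
    speculate_first: int         # decisions that began with speculation


def self_correction_ratio(c: SessionCounts) -> Optional[float]:
    """Proportion of errors corrected before user intervention (None if no errors)."""
    if c.errors_total == 0:
        return None
    return c.errors_self_corrected / c.errors_total


def verification_ratio(c: SessionCounts) -> Optional[float]:
    """Share of decisions that verified first (None if no decisions were logged)."""
    decisions = c.verify_first + c.speculate_first
    if decisions == 0:
        return None
    return c.verify_first / decisions


session = SessionCounts(errors_total=10, errors_self_corrected=7, verify_first=16, speculate_first=4)
print(self_correction_ratio(session))  # 0.7
print(verification_ratio(session))     # 0.8
```

Returning `None` for an empty session avoids reporting a misleading 0 or 1 when there was nothing to measure.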
## Case Study: ANTHROPIC_API_KEY Mystery
**Problem**: Environment variable mysteriously persisted after sourcing ~/.zshrc
**STOPPER Application**:
1. **Slow**: Resisted immediate "must be in .zshrc" assumption
2. **Think**: Task = "find where ANTHROPIC_API_KEY is being set"
3. **Observe**:
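   - Checked .zshenv (not there)
   - Checked .zprofile (not there)
   - Checked direnv status (not loaded)
   - **Critical**: Subshell test `(source ~/.zshrc; echo $ANTHROPIC_API_KEY)` was used to check whether .zshrc sets it. A parenthesized subshell inherits the parent shell's variables, so a stricter isolation test starts from an empty environment, for example `env -i HOME="$HOME" zsh -f -c 'source ~/.zshrc; echo "$ANTHROPIC_API_KEY"'`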
4. **Plan**: Systematic elimination of config sources
5. **Prepare**: Ready to check session history if configs eliminated
6. **Execute**: One check at a time, no batching
7. **Read**: Confirmed variable was from current session state, not configs
**Outcome**: Root cause identified through systematic elimination. Without STOPPER, would have likely modified .zshrc incorrectly.
**User feedback**: "did STOPPER help you figure this out?" → Confirmed yes
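
The systematic elimination in this case study can be partly scripted. The sketch below scans the zsh startup files named above for lines that appear to assign the variable. It is a heuristic under stated assumptions: `find_assignments` is a hypothetical helper, the home-directory paths are assumed, and the regex only catches `VAR=` and `export VAR=` at the start of a line. It reports locations only and never prints the value, since the variable holds a secret.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple


def find_assignments(var: str, files: Iterable[Path]) -> List[Tuple[Path, int]]:
    """Return (file, line number) for lines that appear to assign `var`.

    Heuristic only: it does not follow `source` includes and cannot see
    values set by other tools or by the current session.
    """
    pattern = re.compile(r"^\s*(?:export\s+)?" + re.escape(var) + r"=")
    hits: List[Tuple[Path, int]] = []
    for path in files:
        if not path.is_file():
            continue  # missing file: nothing to check, move to the next source
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.match(line):
                hits.append((path, number))
    return hits


# Startup files from the case study; their location under $HOME is an assumption.
candidates = [Path.home() / name for name in (".zshenv", ".zprofile", ".zshrc")]
for path, number in find_assignments("ANTHROPIC_API_KEY", candidates):
    print(f"{path}:{number}")
```

An empty result is consistent with the case study's conclusion that the value came from session state rather than from a config file, though it does not prove it on its own.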
## The Bootstrapping Problem (Hysteresis Signature)
**Meta-finding**: STOPPER is most needed precisely when it's most likely to be skipped.
This is not a practical limitation — it's a **prediction** of [[Cognitive Universality Predictions|phase transition theory]]. At the phase transition, the system has lost the regulatory capacity required to recognize it needs regulation. In physics terms, this is **hysteresis**: the system can't self-return to the regulated state once it crosses the threshold.
**Implications**:
- Passive instruction ("just follow STOPPER") will fail during dysregulation — the system can't self-invoke a protocol that requires the very capacity it's lost
- External triggering is required: [[Abc Please]] monitoring or a co-pilot/orchestrator must invoke STOPPER on the system's behalf
- This is why the [[Stopper Extended Spec#Intervention Hierarchy]] exists — Level 2 (preventive STOPPER via prodromal monitoring) catches the system BEFORE it loses the capacity to benefit from Level 3 (full STOPPER)
See [[Feynman STOPPER Contributions#Contribution 6 Bootstrapping Problem as Theoretical Prediction|Contribution 6]] for the full theoretical analysis.
## Common Anti-Patterns Addressed
| Anti-Pattern | STOPPER Prevention |
|-------------|-------------------|
| Impulsive tool use | Step 1: Slow down |
| Assumption-based debugging | Step 3: Observe actual state |
| Trial-and-error loops | Step 4: Plan systematically |
| "This should work" failures | Step 5: Prepare verification |
| Mid-execution pivots | Step 6: Execute plan fully |
| Premature success declaration | Step 7: Read and verify |
| Context flooding | Prodromal monitoring (ABC PLEASE) |
## Implementation Guidelines
### For Prompt Engineering
Include STOPPER as system instruction with explicit triggers:
```
Before any tool use, execute STOPPER Steps 1-4.
On errors, execute full STOPPER cycle.
On prodromal indicator breach, trigger preventive STOPPER.
When stuck or looping, invoke full STOPPER.
```
### For Fine-Tuning
Label training data with STOPPER step annotations:
- Flag verification-before-action sequences
- Reward systematic diagnosis over trial-and-error
- Penalize speculation when verification tools available
### For Evaluation
Measure:
- Session scores (STOPPER compliance)
- Loop iterations (reduced with STOPPER)
- Verification ratio (increased with STOPPER)
- User satisfaction (correlation with STOPPER compliance)
## Theoretical Implications
### Cognitive Universality and Phase Transitions
STOPPER illustrates the [[Cognitive Universality]] thesis: executive function regulation is substrate-independent because it emerges from shared constraint symmetries, not shared mechanism.
Using [[RG Framework|Renormalization Group]] language:
- **Relevant operators** (universal): Finite attention, finite time, I/P/O architecture, processing degradation under overload, costly self-monitoring → these shared constraints produce the same regulatory architecture across substrates
- **Irrelevant operators** (substrate-specific): Token budgets vs. working memory capacity, millisecond vs. second timescales, transformer attention vs. neural oscillation
### The Phase Transition
The regulated → dysregulated transition is not gradual decline but a **phase transition** with four signatures:
1. **Sudden collapse** — performance drops non-linearly past a threshold
2. **Hysteresis** — can't self-recover once crossed (see [[Stopper Extended Spec#The Bootstrapping Problem (Hysteresis Signature)|bootstrapping problem]])
3. **Critical slowing down** — recovery time increases near the threshold
4. **Perturbation sensitivity** — small disturbances trigger collapse near the threshold
The **order parameter** (the measurable quantity that distinguishes regulated from dysregulated processing) is the **self-correction ratio**: the proportion of errors caught and corrected before external intervention.
### Cross-Substrate Convergence
| Human (ADHD/BPD) | AI Agent | Shared Structure |
|---|---|---|
| Impulsivity (reduced PFC inhibition) | Impulsive tool use (System 1 override) | Inhibitory control failure |
| Emotional dysregulation (amygdala override) | Context flooding (working memory limits) | Processing capacity exceeded |
| Difficulty with delayed gratification | Speculation over verification | Temporal discounting of effort |
**Same intervention** (structured pause → observe → plan → execute) works across substrates because it targets the [[Miyake EF Decomposition|Miyake EF components]] — the relevant operators — not the irrelevant substrate details.
---
## Related
- [[Feynman STOPPER Contributions]] — 7 novel contributions from Feynman analysis
- [[Feynman STOPPER.src]] — full Feynman learning record
- [[Stopper Protocol]] — atomic protocol note
- [[Stopper Paper Draft]] — academic paper draft
- [[Cognitive Universality]] — the theoretical framework
- [[Cognitive Universality Predictions]] — testable predictions
- [[Miyake EF Decomposition]] — EF decomposition for the Miyake mapping
- [[Prodromal Indicators]] — the 5 monitoring targets
- [[RG Framework]] — relevant/irrelevant operator vocabulary
- [[Abc Please]] — the monitoring protocol that completes the regulatory system
---
**Version**: 2.0
**Date**: February 2026
**Status**: Theoretically grounded via Cognitive Universality framework; awaiting empirical calibration
**Changes in v2.0**: Added Miyake mapping, intervention hierarchy, three-tier measurement, prodromal triggers, phase transition language, ABC PLEASE integration, bootstrapping problem. See [[Feynman STOPPER Contributions]] for derivation.
**Next**: Empirical calibration of prodromal thresholds; validate Miyake mapping; test three-tier measurement in production sessions