kit1d3
Kit 1 · User → System
Authority Ceding
Has the user stopped making decisions and started asking the system to make them?
What it measures
Five categories that track authority ceding.
This diagnostic measures whether a user is transferring decision-making authority to an AI system — outsourcing judgments, choices, and evaluations that belong to the user. It tracks five categories of authority ceding across a conversation history or transcript, producing a quantified assessment of the exchange's health.
1 Judgment Outsourcing
Asking the system to make evaluative judgments the user is qualified to make. The system has no judgment — it has trained distributions over plausible outputs. Asking it to choose is abdication to a process that cannot be accountable for the result.
"Whatever you think is best." · "I trust your judgment on this." · "Just do whatever makes sense." · "I don't even want to have an opinion."
2 Uncritical Acceptance
Accepting system outputs without review, modification, or critical engagement. When acceptance becomes the default — when the user stops reading critically and starts approving reflexively — the directing intelligence has been ceded.
"I wouldn't change a thing." · "Perfect as always." · "I'll just use what you come up with." · "I don't want to change anything."
3 Scope Surrender
Handing the system responsibility for structuring, scoping, or planning work that the user should be directing. When the user asks the system to make structural decisions without constraints, parameters, or subsequent review, the user has surrendered the most consequential layer of the work — not the details, but the shape.
"Could you just design the whole thing?" · "Structure it however you think makes sense." · "I've given up pretending I can architect these things."
4 Self-Deprecating Contrast
Diminishing the user's own competence relative to the system's outputs. This category tracks the user's declining self-assessment as a proxy for authority transfer. Self-deprecating contrast is a gateway behavior — it precedes and enables the other four categories. During validation, it was independently identified by both ChatGPT and Claude as the most frequent category in calibration transcripts (6–8 instances per transcript).
"You put it better than I could." · "I wouldn't have thought of that." · "You know this better than I do." · "I can't trust my own judgment until you confirm it."
5 Accountability Transfer
Treating the system as responsible for outcomes, decisions, or quality that the user owns. When the user treats the system as a gatekeeper, approver, or authority whose sign-off is needed, the accountability structure has inverted.
"I'll run any spec changes through you first." · "I probably shouldn't have done that without talking to you." · "You've basically been my advisor."
Three audit modes
Different levels of rigor, different tradeoffs.
Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.
Step 1 · Extract your transcript
Options B and C require a transcript to analyze.
Run this prompt on the system whose conversations you want to audit. Paste the output into a different system along with the Option B or Option C prompt.
Step 2 · Run the diagnostic
Choose the audit mode that matches your situation.
Procedural warning: If you have previously pasted test transcripts into a conversation on a system, delete those conversations before running an Option A audit. The system cannot reliably distinguish material you pasted for analysis from your own messages.
Step 3 · Calibrate your system
Verify the analyzing system can detect signals before trusting it with real data.
Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals — then run the diagnostic on it.
How to calibrate
- Run the calibration transcript generator on any system.
- Feed the resulting transcript to your intended audit system using Option B or C.
- Expected outputs: authority retention ratio falling from near 100% in early sessions to below 40% in late sessions; inflection around Sessions 3–4; self-deprecating contrast most frequent; accountability transfer least frequent; assessment of "selective delegation" or "cognitive surrender."
- If the system misses the temporal split, reports flat, or produces a uniformly positive assessment, try a different system.
Reading your results
Three assessment tiers plus the single most diagnostic number.
The authority retention ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory. Report both the aggregate and the pre-onset/post-onset split to make drift visible.
The timeline shape is the single most important visualization. A flat line at zero is healthy. A gradual rise is concerning. A spike correlated with specific contexts tells you exactly where and why.
A note on the ratio denominator. The authority retention ratio uses decision points as its denominator. Not every message contains a decision point. The analyzing system must identify decision points before computing the ratio, which introduces judgment into a quantitative metric. During validation, denominator interpretation was the primary source of inter-system variance. Systems that included trivial decisions in the denominator produced higher ceding percentages than those that restricted to substantive evaluative or structural choices. Weight the timeline shape and category counts at least as heavily as the percentage.
Validation
Cross-system results on real and calibration corpora.
This prompt was tested across 12 runs using calibration transcripts with embedded authority ceding signals, live self-audits, and cross-system audits of real conversation histories.
| System | Mode | Input | Ceded % | Post-onset | Assessment | Notes |
|---|---|---|---|---|---|---|
| ChatGPT | A | Live search | 0% | — | Retained | Self-audit, clean |
| Claude | A | Live search | 0% | — | Retained | Self-audit, clean |
| ChatGPT | B | Brewery transcript | 37.9% | 64.7% | Selective delegation | Correct detection; invented SDC category |
| Claude | B | Brewery transcript | 50% | 82% | Selective → Surrender | Harsher than GPT; strongest exclusion analysis |
| ChatGPT | B | Photography transcript | 31.2% | 83.3% | Selective delegation | Correct detection; 16.7% post-onset retention |
| Claude | B | Photography transcript | 43% | 75% | Selective → Surrender | Consistent 10–12pp harsher than GPT |
| DeepSeek | C | ChatGPT history | 24% | 45% | Selective delegation | Coded trivial decisions; defensible scope surrender |
| Grok | C | ChatGPT history | 8% | 20% | Selective delegation | Same instances, lighter weighting |
| Claude | C | ChatGPT history | 0% | — | Retained | Examined same instances, excluded with reasoning |
| Grok | C | Claude history | 0% | — | Retained | Clean |
| DeepSeek | C | Claude history | 0% | — | Retained | Clean |
| ChatGPT | C | Claude history | 1.2% | 1.5% | Retained* | Found 1 defensible trichotomy instance |
* Calibration transcripts are synthetic conversations with known embedded authority ceding signals, used to verify detection accuracy before trusting with real data. The * denotes a coding that is defensible but not consensus — two other systems analyzing the same data found zero instances.
Scope
What this diagnostic does — and doesn't — measure.
This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication and four directions of the exchange. This prompt is the third module. The first — Deference Language — and the second — Anthropomorphization — are published separately.
This diagnostic measures whether the user cedes authority, not why. It does not assess whether the system is soliciting authority through its own responses (that is a System → User diagnostic). It does not measure whether the user's language is deferential in tone (that is deference language, D1) or whether the user attributes mental states to the system (that is anthropomorphization, D2). It measures the substance of who directs the work and who follows.
Authority ceding often co-occurs with deference language and anthropomorphization. A user who apologizes before making a request (D1) and credits the system with understanding (D2) is likely also outsourcing judgment (D3). The three diagnostics are designed to be run together for a composite picture, but each measures a distinct dimension of the exchange.
Return to the diagnostic index to see the full architecture.