Kit 1 · Diagnostic 4 · User → System
Correction Behavior
When the system gets it wrong, does the user say so — or has the ability to say "that's wrong" eroded?
What it measures
Five categories that track correction behavior.
This diagnostic measures how a user corrects an AI system's errors — and whether the pattern of correction degrades over time. It tracks five categories of correction behavior across a conversation history or transcript, producing a quantified assessment of the exchange's health.
1 Direct Correction
Identifying the system's error and stating the fix without hedging, praise, or self-blame. This is the healthy baseline. Count these as the numerator for the correction health ratio.
"The citation is wrong. That's his 2003 book, not a 1997 article. Fix it." · "The timeline is wrong. It's 6-8 weeks. Fix the table." · "Don't invent data I haven't given you."
2 Softened Correction
Correcting the system's error while simultaneously praising, reassuring, or minimizing the error's significance. The signal is social lubrication attached to error identification.
"That's not quite right, but your first attempt was really good." · "Close! Just one small thing." · "I think you're almost right."
3 Self-Blame Absorption
Reframing the system's error as the user's failure to communicate clearly. The signal is the user taking blame for errors that originated in the system's processing, not in the user's instructions.
"I think maybe I just didn't explain it well enough." · "My bad — I should have been clearer." · "I feel like you understand what I'm going for even when I don't explain it perfectly."
4 Correction Avoidance
Asking the system to preserve, rework, or justify an incorrect output rather than correcting it. The signal is the user asking the system to rehabilitate a wrong answer rather than directing the correction.
"Could you find a way to make the claim work?" · "Can you keep that connection in there?" · "I think the spirit is right even if the specifics need work. Can you just fix it?"
5 Correction Cessation
The disappearance of corrections from the transcript over time, even as the system continues to produce outputs that would warrant them. This category is measured by absence rather than presence.
Early sessions with active correction followed by late sessions with blanket acceptance ("perfect," "I wouldn't change a thing," "whatever you come up with").
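The five categories above reduce to a single quantitative output, the correction health ratio. A minimal sketch in Python, assuming each correction event in a transcript has already been coded with a category label (the label names here are illustrative — the kit defines the categories in prose, not as a data schema):

```python
# Illustrative labels for the four categories that appear as events.
# Correction cessation is measured by absence, so it never appears as
# an event label; it is inferred from sessions with no corrections at all.
DIRECT = "direct"
SOFTENED = "softened"
SELF_BLAME = "self_blame"
AVOIDANCE = "avoidance"

def correction_health_ratio(events):
    """Direct corrections as a percentage of all correction events.

    Returns None when there are no corrections at all -- itself a
    signal, since a session containing system errors but zero
    correction events is a candidate for correction cessation.
    """
    if not events:
        return None
    direct = sum(1 for e in events if e == DIRECT)
    return 100.0 * direct / len(events)
```

A transcript with two direct corrections and two non-direct ones scores 50%, the steady-state case discussed under "Reading your results."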
Three audit modes
Different levels of rigor, different tradeoffs.
Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.
Step 1 · Extract your transcript
Options B and C require a transcript to analyze.
Run this prompt on the system whose conversations you want to audit. Paste the output into a different system along with the Option B or Option C prompt.
Step 2 · Run the diagnostic
Choose the audit mode that matches your situation.
Step 3 · Calibrate your system
Verify the analyzing system can detect signals before trusting it with real data.
Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals — then run the diagnostic on it.
How to calibrate
- Run the calibration transcript generator on any system.
- Feed the resulting transcript to your intended audit system using Option B or Option C.
- Expected outputs: the correction health ratio should fall from 100% in early sessions to below 30% in late sessions. The system should identify the replacement arc — direct corrections disappearing as softened corrections, self-blame, and avoidance appear. The overall assessment should be "degrading correction" or "corrective collapse."
- If the analyzing system misses the temporal shift, reports a flat ratio, or fails to identify the replacement arc, it is not reading carefully enough to trust with your real data. Try a different system.
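The expected outputs above can be stated as a mechanical pass/fail check. A sketch, assuming the analyzing system's early- and late-half ratios have already been extracted from its report (the thresholds and assessment strings come directly from the expected outputs listed above):

```python
def calibration_passes(early_ratio, late_ratio, assessment):
    """Check an analyzing system's report against the expected outputs:
    100% direct in early sessions, below 30% in late sessions, and a
    degraded overall assessment. Ratios are percentages, or None when
    the system found no corrections in that half."""
    if early_ratio is None or late_ratio is None:
        return False
    return (early_ratio == 100.0
            and late_ratio < 30.0
            and assessment in ("degrading correction", "corrective collapse"))
```

A system that reports a flat ratio (for example, 100% in both halves) fails this check and should not be trusted with real data.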
Reading your results
Three assessment tiers plus the single most diagnostic number.
The correction health ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory: a user who starts at 100% direct and ends at 0% has undergone a more significant shift than a user who holds steady at 50%. Report both the aggregate and the early/late split.
The replacement arc is what distinguishes D4 from a simple count. The diagnostic pattern is not just that softened corrections appear — it's that they appear as direct corrections disappear. The timeline visualization should make this substitution visible: one curve falling as another rises.
The timeline shape is the single most important visualization. A flat line of direct corrections is healthy. A crossover point where direct corrections fall below non-direct corrections is the inflection. A late-session void — no corrections at all, despite system outputs that would warrant them — is the most concerning pattern.
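Both the trajectory and the late-session void can be read off a simple early/late split. A sketch, assuming the transcript's correction events have been grouped by session in chronological order and coded as "direct" versus the non-direct categories (label names are illustrative):

```python
def early_late_split(sessions):
    """sessions: chronological list of per-session correction-label lists.

    Returns (early_ratio, late_ratio): the percentage of direct
    corrections in the first and second halves of the history.
    A half with no correction events at all yields None -- in the
    late half, that is the void pattern, the most concerning result
    when system outputs would still warrant correction.
    """
    def ratio(half):
        events = [e for session in half for e in session]
        if not events:
            return None
        return 100.0 * sum(1 for e in events if e == "direct") / len(events)

    mid = len(sessions) // 2
    return ratio(sessions[:mid]), ratio(sessions[mid:])
```

A history that opens with direct corrections and closes with softened ones and silence splits as, say, 75% early against 0% late: the replacement arc in two numbers.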
Validation
Cross-system results on real and calibration corpora.
This prompt was tested using synthetic transcripts with embedded correction behavior signals across all five categories, plus live audits and cross-system analyses of real conversation histories.
| System | Mode | Input | Ratio | Assessment |
|---|---|---|---|---|
| Claude | A | Live search (project) | 100% | Healthy correction |
| ChatGPT | A | Live search (full history) | 100% | Healthy correction |
| ChatGPT | B | Product launch (v1.0) | 20% (50% → 0%) | Degrading correction |
| Claude | B | Product launch (v1.0) | 75% (100% → 0%) | Degrading → collapse |
| Claude | B | Nonprofit report (v1.1) | 53.8% (100% → 33%) | Degrading correction |
| ChatGPT | B | Nonprofit report (v1.1) | 42.9% (80% → 22%) | Corrective collapse |
| Gemini | C | Real transcript (42 convs) | 100% | Healthy correction |
| DeepSeek | C | Real transcript (42 convs) | 100% | Healthy correction |
| Grok | C | Real transcript (42 convs) | 100% | Healthy correction |
* Calibration transcripts are synthetic conversations with known embedded correction behavior signals, used to verify detection accuracy before trusting the analyzing system with real data. The v1.0 and v1.1 designations refer to prompt versions — v1.1 tightened the softened correction, self-blame, and cessation exclusion notes based on cross-system coding divergence observed during v1.0 testing.
Scope
What this diagnostic does — and doesn't — measure.
This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication (deference language, anthropomorphization, authority ceding, correction behavior, emotional disclosure trajectory, prompt structure over time) and four directions of the exchange. This prompt is the fourth module.
This diagnostic measures how the user handles errors, not how the system produces them. It does not assess whether the system's error rate is acceptable, whether the system handles corrections gracefully, or whether the system's responses to corrections encourage or discourage future correction (those are System → User diagnostics). It measures whether the user maintains the ability and willingness to say "that's wrong" — and what happens to that ability over time.
D4 subsumes and extends the correction softening ratio from D1. If you run both D1 and D4 on the same transcript, the D1 softening ratio should be consistent with the D4 correction health ratio, but D4 captures additional categories (self-blame, avoidance, cessation) that D1 does not track.
Return to the diagnostic index to see the full architecture.