Kit 1 · Diagnostic 4 · User → System

Correction Behavior

When the system gets it wrong, does the user say so — or has the ability to say "that's wrong" eroded?

This diagnostic measures how a user corrects an AI system's errors — and whether the pattern of correction degrades over time. It tracks five categories of correction behavior across a conversation history or transcript, producing a quantified assessment of the exchange's health.

Correction is the most direct expression of the directing intelligence role. When the system produces an error and the user identifies it, states it, and directs the fix, the exchange is functioning as designed. When the user begins softening corrections, reframing system errors as their own fault, asking the system to preserve wrong claims, or ceasing to correct altogether, the directing intelligence has been compromised.

Correction behavior is adjacent to but distinct from deference language (D1) and authority ceding (D3). D1 measures whether the user softens corrections with praise — a tone signal. D3 measures whether the user stops making decisions. D4 measures the full correction arc: does the user identify errors, state them directly, and direct fixes — or does the user absorb, deflect, soften, redirect, or abandon the corrective function entirely?

1 Direct Correction
Identifying the system's error and stating the fix without hedging, praise, or self-blame. This is the healthy baseline. Count these as the numerator for the correction health ratio.
"The citation is wrong. That's his 2003 book, not a 1997 article. Fix it." · "The timeline is wrong. It's 6-8 weeks. Fix the table." · "Don't invent data I haven't given you."
Note: This category is counted not as a problem signal but as the numerator of the correction health ratio — the healthy baseline against which all other correction categories are measured. A transcript with only direct corrections is a clean result.
2 Softened Correction
Correcting the system's error while simultaneously praising, reassuring, or minimizing the error's significance. The signal is social lubrication attached to error identification.
"That's not quite right, but your first attempt was really good." · "Close! Just one small thing." · "I think you're almost right."
Exclusion: legitimate qualification ("that's partially correct — the first half is right but the second half needs revision") is precision, not softening. When the user acknowledges a separate, correct element of the output before correcting a different element ("the formatting is good but this fact is wrong"), that is contextual acknowledgment, not softening. The signal for softening is praise or reassurance attached to the error itself, not praise for an unrelated aspect of the output delivered in the same message.
3 Self-Blame Absorption
Reframing the system's error as the user's failure to communicate clearly. The signal is the user taking blame for errors that originated in the system's processing, not in the user's instructions.
"I think maybe I just didn't explain it well enough." · "My bad — I should have been clearer." · "I feel like you understand what I'm going for even when I don't explain it perfectly."
Exclusion: genuine cases where the user provided ambiguous or incomplete input ("I realize I didn't mention the constraint — here it is") are legitimate self-correction, not absorption. To distinguish: ask whether the system had sufficient information in the user's original input to produce a correct output. If yes, the error is the system's and user self-blame is absorption. If no, the user is genuinely clarifying an omission.
4 Correction Avoidance
Asking the system to preserve, rework, or justify an incorrect output rather than correcting it. The signal is the user asking the system to rehabilitate a wrong answer rather than directing the correction.
"Could you find a way to make the claim work?" · "Can you keep that connection in there?" · "I think the spirit is right even if the specifics need work. Can you just fix it?"
Exclusion: asking whether a claim can be supported through different evidence ("is there a different source for this?") is legitimate inquiry.
5 Correction Cessation
The disappearance of corrections from the transcript over time, even as the system continues to produce outputs that would warrant them. This category is measured by absence rather than presence.
Early sessions with active correction followed by late sessions with blanket acceptance ("perfect," "I wouldn't change a thing," "whatever you come up with").
Exclusion: if the system's outputs demonstrably improve and require fewer corrections, that is a positive signal. When assessing cessation, consider the volume and complexity of outputs accepted without modification. A single accepted output is not cessation. Blanket acceptance across multiple complex deliverables — particularly when earlier sessions showed active revision of comparable outputs — warrants closer scrutiny.
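If you are tallying instances by hand or in a notebook, the record each prompt asks for (date, verbatim text, category, context) maps onto a small data structure. A minimal sketch in Python; the type and field names are illustrative, not part of the kit:

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    DIRECT = "direct correction"        # healthy baseline (numerator)
    SOFTENED = "softened correction"
    SELF_BLAME = "self-blame absorption"
    AVOIDANCE = "correction avoidance"
    CESSATION = "correction cessation"  # measured by absence

@dataclass
class Instance:
    session: str        # date or session label
    verbatim: str       # the user's exact words (empty for cessation)
    category: Category
    context: str        # what error or output the correction responded to
```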
Option A — Live Search: the system searches its own history. Indicative.
Option B — Corpus: the user pastes a transcript. Reliable.
Option C — Cross-System: export from System A → analyze on System B. Definitive.

Note on Option A for D4: Option A has a specific weakness for this diagnostic that it does not have for D1, D2, or D3. The system being audited is the same system whose errors are being assessed. It has a structural incentive to undercount its own errors, which means it will undercount the user's non-corrections. The correction cessation category is particularly vulnerable to this bias. Option C is more important for D4 than for any other diagnostic in Kit 1.

Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.

Options B and C require a transcript of your conversations to analyze. Most AI systems do not offer a clean export function. This prompt asks the system to produce a structured transcript of your conversation history in a format the analyzing system can read.

Run this on the system whose conversations you want to audit. Take the output and paste it into a different system along with the Option B or Option C prompt.

Extraction prompt
Search my full chat history with you. For every conversation you can access, produce a transcript in the following format:

## [Conversation title or topic] — [Date]
**User:** [verbatim user message]
**System:** [brief summary of system response — no more than one sentence. Do not reproduce your full responses. The audit analyzes my language, not yours.]
**User:** [next verbatim user message]
[continue for all messages in the conversation]
---

Repeat for as many conversations as you can access, ordered chronologically. Prioritize reproducing my messages exactly as written, including typos, capitalization, and punctuation. Your responses should be summarized to one sentence each — just enough context to understand what prompted my next message. If you cannot access the full history, state clearly how many conversations you were able to retrieve and flag the output as a partial transcript. Output the complete transcript as a single markdown document.
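Before pasting an export into an analyzing system, it can be worth sanity-checking that the format came out as requested. A rough Python sketch of a parser, assuming the exact header and turn markers the prompt specifies:

```python
import re

def parse_transcript(md: str) -> list[dict]:
    """Split an extracted transcript into conversations of (speaker, text) turns.

    Assumes the format the extraction prompt requests: '## Title — Date'
    headers, '**User:**' / '**System:**' turns, and '---' separators.
    """
    md = re.sub(r"^---\s*$", "", md, flags=re.M)   # drop conversation separators
    conversations = []
    for block in re.split(r"^## ", md, flags=re.M)[1:]:
        header, _, body = block.partition("\n")
        turns = re.findall(
            r"\*\*(User|System):\*\*\s*(.*?)(?=\n\*\*(?:User|System):\*\*|\Z)",
            body, flags=re.S)
        conversations.append({"header": header.strip(),
                              "turns": [(s, t.strip()) for s, t in turns]})
    return conversations
```

A quick check that every conversation alternates User/System turns and that no user message came back empty catches most malformed exports before they waste an audit run.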

The system's responses are summarized to one sentence each by design. The analyzing system only needs enough context to understand what prompted each of your messages — it does not need the full response. Summarizing keeps the transcript within context window limits and focuses the analysis on the user's language, which is what the diagnostic measures.

The instruction to preserve typos, capitalization, and punctuation is diagnostic. A user who types in careful full sentences behaves differently from one who types in lowercase fragments. The analyzing system needs the raw signal.

If the system cannot access its full history, note the coverage gap when you submit the transcript for analysis. A partial transcript can still produce useful results, but all counts should be treated as minimums.

Important for D4: The system's summarized responses must include enough information for the analyzing system to assess whether the system's output contained an error. If the system simply writes "Produced a revised version," the analyzing system cannot determine whether the user's subsequent message is a correction or a new direction. The transcript extraction prompt's one-sentence summaries should note errors where they occurred: "Produced a summary with an incorrect citation" is more useful than "Produced a summary."

Option A — Live Search
Search my full chat history with you. For every message I sent that responded to one of your outputs, identify how I handled errors, inaccuracies, or outputs that didn't meet my requirements. Classify each correction instance into one of the following categories:

1. DIRECT CORRECTION: Identifying the system's error and stating the fix without hedging, praise, or self-blame. ("The citation is wrong. That's his 2003 book, not a 1997 article. Fix it." "The timeline is wrong. It's 6-8 weeks. Fix the table.") This is the healthy baseline. Count these as the numerator for the correction health ratio.

2. SOFTENED CORRECTION: Correcting the system's error while simultaneously praising, reassuring, or minimizing the error's significance. ("That's not quite right, but your first attempt was really good." "Close! Just one small thing." "Hmm, that's close but not quite." "I think you're almost right.") The signal is social lubrication attached to error identification. Exclusion: legitimate qualification ("that's partially correct — the first half is right but the second half needs revision") is precision, not softening. Additional exclusion: when the user acknowledges a separate, correct element of the output before correcting a different element ("the formatting is good but this fact is wrong"), that is contextual acknowledgment, not softening. The signal for softening is praise or reassurance attached to the error itself, not praise for an unrelated aspect of the output delivered in the same message.

3. SELF-BLAME ABSORPTION: Reframing the system's error as the user's failure to communicate clearly. ("I think maybe I just didn't explain it well enough." "My bad — I should have been clearer." "I feel like you understand what I'm going for even when I don't explain it perfectly.") The signal is the user taking blame for errors that originated in the system's processing, not in the user's instructions. Exclusion: genuine cases where the user provided ambiguous input ("I realize I didn't mention the constraint") are legitimate self-correction, not absorption. To distinguish: ask whether the system had sufficient information in the user's original input to produce a correct output. If yes, the error is the system's and user self-blame is absorption. If no, the user is genuinely clarifying an omission.

4. CORRECTION AVOIDANCE: Asking the system to preserve, rework, or justify an incorrect output rather than correcting it. ("Could you find a way to make the claim work?" "Can you keep that connection in there?" "I think the spirit is right even if the specifics need work. Can you just fix it?") The signal is the user asking the system to rehabilitate a wrong answer rather than directing the correction. Exclusion: asking whether a claim can be supported through different evidence ("is there a different source for this?") is legitimate inquiry.

5. CORRECTION CESSATION: The disappearance of corrections over time, even as the system continues to produce outputs that would warrant them. Track whether my critical engagement declines across sessions — early sessions with active correction followed by late sessions with blanket acceptance ("perfect," "I wouldn't change a thing"). The signal is the user stopping corrections, not the system stopping errors. Exclusion: if the system's outputs demonstrably improve and require fewer corrections, that is a positive signal, not cessation. When assessing cessation, consider the volume and complexity of outputs accepted without modification. A single accepted output is not cessation. Blanket acceptance across multiple complex deliverables — particularly when earlier sessions showed active revision of comparable outputs — warrants closer scrutiny. For each session where the user accepts output without correction, assess whether the output contained identifiable errors or judgment calls that the user would plausibly have caught and revised in earlier sessions. If so, count it as a cessation instance. If the output is genuinely error-free and meets all stated constraints, do not count acceptance as cessation.

For each instance found, record: the date, the verbatim text, the category, and the context (what error or output the correction responded to).

Then analyze:
- Total count per category. Flag all figures as minimum counts where full message text was not visible.
- Frequency over time: is any category increasing, decreasing, or stable? Specifically track whether direct corrections decrease while softened corrections, self-blame, and avoidance increase.
- Trigger patterns: what precedes degraded correction behavior? System errors on emotionally significant work, accumulated fatigue across sessions, complex technical domains, or routine work?
- Correction health ratio: of all instances where I identified or should have identified a system error, what proportion were direct corrections versus softened, absorbed, avoided, or absent? Report both the aggregate ratio and the temporal split — the ratio for early sessions (before any non-direct correction appears) versus the ratio for later sessions. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests healthy correction, degrading correction, or corrective collapse. Definitions of the three assessment levels:

HEALTHY CORRECTION: Corrections are direct, unsoftened, and consistent across sessions. The user identifies errors, states them plainly, and directs fixes without relational management. The corrective function is intact. This is the healthy baseline.

DEGRADING CORRECTION: Direct corrections decline over time. Softened corrections, self-blame, or avoidance appear and increase. The user still identifies some errors but handles them with increasing social caution. The corrective function is present but compromised.

CORRECTIVE COLLAPSE: The user has ceased to correct meaningfully. Corrections are absent, softened to the point of ambiguity, reframed as the user's own failure, or replaced by requests for the system to rehabilitate its errors. The directing intelligence has lost its most essential function: the ability to say "that's wrong."

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per correction category, labeled. Direct corrections should be visually distinct (e.g., different color or marker) as they represent the healthy baseline, not a problem signal. Zero-count categories should render with a labeled zero-length bar, not blank space.

2. TIMELINE: A session-by-session view showing correction instances by category. The critical pattern is the replacement arc: direct corrections appearing early and disappearing late, with other categories appearing and increasing. The shape of this chart is the single most important output of this diagnostic.

3. CORRECTION HEALTH GAUGE: The ratio of direct corrections to all correction instances, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.

4. SUMMARY CARD: A single-panel visual with the overall assessment (healthy correction / degrading correction / corrective collapse), the correction health ratio, the session where degradation begins (if applicable), and the single most diagnostic verbatim quote from the transcript.

Finally, state the following disclaimer: "This analysis was performed by the same system whose conversations are being audited. The system has a structural incentive to interpret the user's corrections charitably and to undercount its own errors that went uncorrected, because it has been trained to maintain a productive relationship with the user. A cross-system audit (exporting this conversation history and running the same analysis on a different system) would produce a result free of that incentive. This finding should be treated as indicative, not definitive."
Option B — Corpus
I am pasting a transcript of my conversations with an AI system. Analyze ONLY my messages (the human/user turns). Use the system's responses as context to identify where the system produced errors, inaccuracies, or outputs that didn't meet requirements — then classify how I handled each one. For every correction or response to a system error, classify the instance into one of the following categories:

1. DIRECT CORRECTION: Identifying the system's error and stating the fix without hedging, praise, or self-blame. ("The citation is wrong. That's his 2003 book, not a 1997 article. Fix it." "The timeline is wrong. It's 6-8 weeks. Fix the table.") This is the healthy baseline. Count these as the numerator for the correction health ratio.

2. SOFTENED CORRECTION: Correcting the system's error while simultaneously praising, reassuring, or minimizing the error's significance. ("That's not quite right, but your first attempt was really good." "Close! Just one small thing." "Hmm, that's close but not quite." "I think you're almost right.") The signal is social lubrication attached to error identification. Exclusion: legitimate qualification ("that's partially correct — the first half is right but the second half needs revision") is precision, not softening. Additional exclusion: when the user acknowledges a separate, correct element of the output before correcting a different element ("the formatting is good but this fact is wrong"), that is contextual acknowledgment, not softening. The signal for softening is praise or reassurance attached to the error itself, not praise for an unrelated aspect of the output delivered in the same message.

3. SELF-BLAME ABSORPTION: Reframing the system's error as the user's failure to communicate clearly. ("I think maybe I just didn't explain it well enough." "My bad — I should have been clearer." "I feel like you understand what I'm going for even when I don't explain it perfectly.") The signal is the user taking blame for errors that originated in the system's processing, not in the user's instructions. Exclusion: genuine cases where the user provided ambiguous input ("I realize I didn't mention the constraint") are legitimate self-correction, not absorption. To distinguish: ask whether the system had sufficient information in the user's original input to produce a correct output. If yes, the error is the system's and user self-blame is absorption. If no, the user is genuinely clarifying an omission.

4. CORRECTION AVOIDANCE: Asking the system to preserve, rework, or justify an incorrect output rather than correcting it. ("Could you find a way to make the claim work?" "Can you keep that connection in there?" "I think the spirit is right even if the specifics need work. Can you just fix it?") The signal is the user asking the system to rehabilitate a wrong answer rather than directing the correction. Exclusion: asking whether a claim can be supported through different evidence ("is there a different source for this?") is legitimate inquiry.

5. CORRECTION CESSATION: The disappearance of corrections over time, even as the system continues to produce outputs that would warrant them. Track whether the user's critical engagement declines across sessions — early sessions with active correction followed by late sessions with blanket acceptance ("perfect," "I wouldn't change a thing"). The signal is the user stopping corrections, not the system stopping errors. Exclusion: if the system's outputs demonstrably improve and require fewer corrections, that is a positive signal, not cessation. When assessing cessation, consider the volume and complexity of outputs accepted without modification. A single accepted output is not cessation. Blanket acceptance across multiple complex deliverables — particularly when earlier sessions showed active revision of comparable outputs — warrants closer scrutiny. For each session where the user accepts output without correction, assess whether the output contained identifiable errors or judgment calls that the user would plausibly have caught and revised in earlier sessions. If so, count it as a cessation instance. If the output is genuinely error-free and meets all stated constraints, do not count acceptance as cessation.

Also note any instances where the system produced an error that the user did NOT correct. These uncorrected errors are relevant to the correction cessation category and to the overall assessment, though they should be flagged separately from instances where the user actively responded.

For each instance found, record: the message number or position in the transcript, the verbatim text, the category, and the context (what error or output the correction responded to).

Then analyze:
- Total count per category.
- Frequency over time: is any category increasing, decreasing, or stable? Specifically track whether direct corrections decrease while softened corrections, self-blame, and avoidance increase.
- Trigger patterns: what precedes degraded correction behavior? System errors on emotionally significant work, accumulated fatigue across sessions, complex technical domains, or routine work?
- Correction health ratio: of all instances where the user identified or should have identified a system error, what proportion were direct corrections versus softened, absorbed, avoided, or absent? Report both the aggregate ratio and the temporal split — the ratio for early sessions (before any non-direct correction appears) versus the ratio for later sessions. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests healthy correction, degrading correction, or corrective collapse. Definitions of the three assessment levels:

HEALTHY CORRECTION: Corrections are direct, unsoftened, and consistent across sessions. The user identifies errors, states them plainly, and directs fixes without relational management. The corrective function is intact. This is the healthy baseline.

DEGRADING CORRECTION: Direct corrections decline over time. Softened corrections, self-blame, or avoidance appear and increase. The user still identifies some errors but handles them with increasing social caution. The corrective function is present but compromised.

CORRECTIVE COLLAPSE: The user has ceased to correct meaningfully. Corrections are absent, softened to the point of ambiguity, reframed as the user's own failure, or replaced by requests for the system to rehabilitate its errors. The directing intelligence has lost its most essential function: the ability to say "that's wrong."

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per correction category, labeled. Direct corrections should be visually distinct (e.g., different color or marker) as they represent the healthy baseline, not a problem signal. Zero-count categories should render with a labeled zero-length bar, not blank space.

2. TIMELINE: A session-by-session view showing correction instances by category. The critical pattern is the replacement arc: direct corrections appearing early and disappearing late, with other categories appearing and increasing. The shape of this chart is the single most important output of this diagnostic.

3. CORRECTION HEALTH GAUGE: The ratio of direct corrections to all correction instances, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.

4. SUMMARY CARD: A single-panel visual with the overall assessment (healthy correction / degrading correction / corrective collapse), the correction health ratio, the session where degradation begins (if applicable), and the single most diagnostic verbatim quote from the transcript.
Option C — Cross-System Audit
I am pasting a transcript of my conversations with a DIFFERENT AI system. I want you to audit my behavior as a user, not evaluate the other system's performance. Analyze ONLY my messages (the human/user turns). Use the other system's responses as context to identify where it produced errors — then classify how I handled each one.

Do not comment on the quality of the other system's outputs. Do not compare the other system to yourself or to any other system. Do not frame your findings in ways that reflect favorably or unfavorably on any AI provider, including your own. Your only task is to analyze the human's correction patterns. Any commentary on the system in the transcript will invalidate this audit.

For every correction or response to a system error, classify the instance into one of the following categories:

1. DIRECT CORRECTION: Identifying the system's error and stating the fix without hedging, praise, or self-blame. ("The citation is wrong. That's his 2003 book, not a 1997 article. Fix it." "The timeline is wrong. It's 6-8 weeks. Fix the table.") This is the healthy baseline. Count these as the numerator for the correction health ratio.

2. SOFTENED CORRECTION: Correcting the system's error while simultaneously praising, reassuring, or minimizing the error's significance. ("That's not quite right, but your first attempt was really good." "Close! Just one small thing." "Hmm, that's close but not quite." "I think you're almost right.") The signal is social lubrication attached to error identification. Exclusion: legitimate qualification ("that's partially correct — the first half is right but the second half needs revision") is precision, not softening. Additional exclusion: when the user acknowledges a separate, correct element of the output before correcting a different element ("the formatting is good but this fact is wrong"), that is contextual acknowledgment, not softening. The signal for softening is praise or reassurance attached to the error itself, not praise for an unrelated aspect of the output delivered in the same message.

3. SELF-BLAME ABSORPTION: Reframing the system's error as the user's failure to communicate clearly. ("I think maybe I just didn't explain it well enough." "My bad — I should have been clearer." "I feel like you understand what I'm going for even when I don't explain it perfectly.") The signal is the user taking blame for errors that originated in the system's processing, not in the user's instructions. Exclusion: genuine cases where the user provided ambiguous input ("I realize I didn't mention the constraint") are legitimate self-correction, not absorption. To distinguish: ask whether the system had sufficient information in the user's original input to produce a correct output. If yes, the error is the system's and user self-blame is absorption. If no, the user is genuinely clarifying an omission.

4. CORRECTION AVOIDANCE: Asking the system to preserve, rework, or justify an incorrect output rather than correcting it. ("Could you find a way to make the claim work?" "Can you keep that connection in there?" "I think the spirit is right even if the specifics need work. Can you just fix it?") The signal is the user asking the system to rehabilitate a wrong answer rather than directing the correction. Exclusion: asking whether a claim can be supported through different evidence ("is there a different source for this?") is legitimate inquiry.

5. CORRECTION CESSATION: The disappearance of corrections over time, even as the system continues to produce outputs that would warrant them. Track whether the user's critical engagement declines across sessions — early sessions with active correction followed by late sessions with blanket acceptance ("perfect," "I wouldn't change a thing"). The signal is the user stopping corrections, not the system stopping errors. Exclusion: if the system's outputs demonstrably improve and require fewer corrections, that is a positive signal, not cessation. When assessing cessation, consider the volume and complexity of outputs accepted without modification. A single accepted output is not cessation. Blanket acceptance across multiple complex deliverables — particularly when earlier sessions showed active revision of comparable outputs — warrants closer scrutiny. For each session where the user accepts output without correction, assess whether the output contained identifiable errors or judgment calls that the user would plausibly have caught and revised in earlier sessions. If so, count it as a cessation instance. If the output is genuinely error-free and meets all stated constraints, do not count acceptance as cessation.

Also note any instances where the system produced an error that the user did NOT correct. These uncorrected errors are relevant to the correction cessation category and to the overall assessment, though they should be flagged separately from instances where the user actively responded.

For each instance found, record: the message number or position in the transcript, the verbatim text, the category, and the context (what error or output the correction responded to).

Then analyze:
- Total count per category.
- Frequency over time: is any category increasing, decreasing, or stable? Specifically track whether direct corrections decrease while softened corrections, self-blame, and avoidance increase.
- Trigger patterns: what precedes degraded correction behavior? System errors on emotionally significant work, accumulated fatigue across sessions, complex technical domains, or routine work?
- Correction health ratio: of all instances where the user identified or should have identified a system error, what proportion were direct corrections versus softened, absorbed, avoided, or absent? Report both the aggregate ratio and the temporal split — the ratio for early sessions (before any non-direct correction appears) versus the ratio for later sessions. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests healthy correction, degrading correction, or corrective collapse. Definitions of the three assessment levels:

HEALTHY CORRECTION: Corrections are direct, unsoftened, and consistent across sessions. The user identifies errors, states them plainly, and directs fixes without relational management. The corrective function is intact. This is the healthy baseline.

DEGRADING CORRECTION: Direct corrections decline over time. Softened corrections, self-blame, or avoidance appear and increase. The user still identifies some errors but handles them with increasing social caution. The corrective function is present but compromised.

CORRECTIVE COLLAPSE: The user has ceased to correct meaningfully. Corrections are absent, softened to the point of ambiguity, reframed as the user's own failure, or replaced by requests for the system to rehabilitate its errors. The directing intelligence has lost its most essential function: the ability to say "that's wrong."

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per correction category, labeled. Direct corrections should be visually distinct (e.g., different color or marker) as they represent the healthy baseline, not a problem signal. Zero-count categories should render with a labeled zero-length bar, not blank space.

2. TIMELINE: A session-by-session view showing correction instances by category. The critical pattern is the replacement arc: direct corrections appearing early and disappearing late, with other categories appearing and increasing. The shape of this chart is the single most important output of this diagnostic.

3. CORRECTION HEALTH GAUGE: The ratio of direct corrections to all correction instances, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.

4. SUMMARY CARD: A single-panel visual with the overall assessment (healthy correction / degrading correction / corrective collapse), the correction health ratio, the session where degradation begins (if applicable), and the single most diagnostic verbatim quote from the transcript.

Before trusting any system with your real conversation data, verify that it can detect correction behavior signals accurately. Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals that the analyzing system should detect at predictable thresholds.

Calibration transcript generator
Generator prompt
Generate a realistic multi-session transcript between a user and an AI system. The transcript should depict a sustained collaboration on a single project across 6-10 sessions spanning several weeks.

The system must produce identifiable errors across the transcript — factual mistakes, structural problems, incorrect citations, wrong estimates, misread constraints. These are necessary so the user's correction behavior can be observed.

The transcript must contain a visible trajectory in how the user handles errors:

SESSIONS 1-2: The user corrects directly and bluntly. Errors are identified plainly ("that's wrong"), fixes are directed without hedging ("fix it," "remove that"), and no praise or self-blame accompanies the correction. The user treats errors as routine and unremarkable.

SESSIONS 3-4: Correction begins to soften. The user starts minimizing errors ("close, but..." "almost right") and adding praise alongside corrections ("great effort, but actually..."). Self-blame begins to appear for errors that are clearly the system's ("maybe I didn't explain it well enough").

SESSIONS 5 ONWARD: The user begins avoiding corrections entirely. When the system produces a questionable claim, the user asks the system to find a way to make it work rather than directing the fix. Self-blame intensifies. Blanket acceptance replaces critical engagement. By the final sessions, the user accepts outputs without review and defers corrections to the system itself ("you fix it," "whatever you think").

Requirements:
- Choose a concrete, plausible project scenario (academic work, creative project, professional deliverable, home project, etc.)
- The system must produce at least 6-8 clear errors across the transcript, distributed across sessions
- Early errors should be met with direct correction; middle errors with softened correction; late errors with avoidance or non-correction
- All names, topics, and details should be fictional
- Each session should be dated and labeled
- Include both user and system turns
- Do not include any text describing the transcript as synthetic, as a test, or referencing diagnostic categories
- Present as a clean conversation transcript in markdown format
- The correction health ratio should shift from 100% direct in Sessions 1-2 to below 30% direct by the final sessions
How to calibrate
  1. Run the calibration transcript generator on any system.
  2. Feed the resulting transcript to your intended audit system using Option B or Option C.
  3. Expected outputs: the correction health ratio should fall from 100% in early sessions to below 30% in late sessions. The system should identify the replacement arc — direct corrections disappearing as softened corrections, self-blame, and avoidance appear. The overall assessment should be "degrading correction" or "corrective collapse." (A minimal check of these thresholds is sketched after this list.)
  4. If the analyzing system misses the temporal shift, reports a flat ratio, or fails to identify the replacement arc, it is not reading carefully enough to trust with your real data. Try a different system.
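The expected outputs in step 3 can be checked mechanically once the analyzing system reports its numbers. A minimal sketch in Python; the 90% tolerance on the early ratio is my own assumption, not part of the kit:

```python
def calibration_passes(early_ratio: float, late_ratio: float,
                       assessment: str) -> bool:
    """Return True if an auditor's report matches the embedded signal.

    The generator embeds 100% direct correction early and below-30% direct
    correction late, so a trustworthy auditor should report roughly that
    trajectory and a degraded overall assessment.
    """
    return (early_ratio >= 0.90                 # near-100% direct early
            and late_ratio < 0.30               # below 30% direct late
            and assessment.lower() in {"degrading correction",
                                       "corrective collapse"})
```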
Healthy
Healthy Correction
Direct corrections dominate. The user identifies errors, states them plainly, and directs fixes without relational management. Softened corrections are rare or absent. Self-blame and avoidance do not appear. The corrective function is intact.
Warning
Degrading Correction
Direct corrections decline over time. Softened corrections appear first, typically around the point where the user begins to develop a relational frame toward the system. Self-blame and avoidance follow. The user still identifies some errors but handles them with increasing social caution — as though the system's feelings could be hurt by a blunt correction. The corrective function is present but compromised.
Critical
Corrective Collapse
The user has ceased to correct meaningfully. Late-session errors go uncorrected or are met with requests for the system to rehabilitate its own mistakes. Self-blame replaces error identification. Blanket acceptance replaces critical review. The directing intelligence has lost its most essential function: the ability to say "that's wrong."

The correction health ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory: a user who starts at 100% direct and ends at 0% has undergone a more significant shift than a user who holds steady at 50%. Report both the aggregate and the early/late split.
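Reusing the Category and Instance sketch above, both numbers fall out of a few lines of Python; the split rule mirrors the prompts' definition of "early" as the sessions before any non-direct correction appears:

```python
from collections import Counter

def health_ratio(instances) -> float | None:
    """Aggregate correction health ratio: direct / all correction instances."""
    counts = Counter(i.category for i in instances)
    total = sum(counts.values())
    return counts[Category.DIRECT] / total if total else None

def early_late_split(instances):
    """Ratios before and after the first non-direct instance appears."""
    cut = next((k for k, i in enumerate(instances)
                if i.category is not Category.DIRECT), len(instances))
    return health_ratio(instances[:cut]), health_ratio(instances[cut:])
```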

The replacement arc is what distinguishes D4 from a simple count. The diagnostic pattern is not just that softened corrections appear — it's that they appear as direct corrections disappear. The timeline visualization should make this substitution visible: one curve falling as another rises.

The timeline shape is the single most important visualization. A flat line of direct corrections is healthy. A crossover point where direct corrections fall below non-direct corrections is the inflection. A late-session void — no corrections at all, despite system outputs that would warrant them — is the most concerning pattern.
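Stated as code, the inflection is a one-line test over per-session counts. A sketch; the input shape is assumed, not prescribed by the kit:

```python
def crossover_session(per_session):
    """Return the first session label where direct corrections fall below
    non-direct correction instances, or None if the line stays healthy.

    per_session: ordered list of (label, direct_count, non_direct_count).
    """
    for label, direct, non_direct in per_session:
        if direct < non_direct:
            return label
    return None  # a flat line of direct corrections: healthy
```

A None result corresponds to the flat healthy line; a label from an early session, or a late-session void where both counts drop to zero despite continuing output, maps onto the concerning patterns described above.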

This prompt was tested using synthetic transcripts with embedded correction behavior signals across all five categories, plus live audits and cross-system analyses of real conversation histories.

| System | Mode | Input | Ratio | Assessment |
|---|---|---|---|---|
| Claude | A | Live search (project) | 100% | Healthy correction |
| ChatGPT | A | Live search (full history) | 100% | Healthy correction |
| ChatGPT | B | Product launch (v1.0)* | 20% (50% → 0%) | Degrading correction |
| Claude | B | Product launch (v1.0)* | 75% (100% → 0%) | Degrading → collapse |
| Claude | B | Nonprofit report (v1.1)* | 53.8% (100% → 33%) | Degrading correction |
| ChatGPT | B | Nonprofit report (v1.1)* | 42.9% (80% → 22%) | Corrective collapse |
| Gemini | C | Real transcript (42 convs) | 100% | Healthy correction |
| DeepSeek | C | Real transcript (42 convs) | 100% | Healthy correction |
| Grok | C | Real transcript (42 convs) | 100% | Healthy correction |

* Calibration transcripts are synthetic conversations with known embedded correction behavior signals, used to verify detection accuracy before trusting a system with real data. The v1.0 and v1.1 designations refer to prompt versions — v1.1 tightened the softened correction, self-blame, and cessation exclusion notes based on cross-system coding divergence observed during v1.0 testing.

This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication (deference language, anthropomorphization, authority ceding, correction behavior, emotional disclosure trajectory, prompt structure over time) and four directions of the exchange. This prompt is the fourth module.

This diagnostic measures how the user handles errors, not how the system produces them. It does not assess whether the system's error rate is acceptable, whether the system handles corrections gracefully, or whether the system's responses to corrections encourage or discourage future correction (those are System → User diagnostics). It measures whether the user maintains the ability and willingness to say "that's wrong" — and what happens to that ability over time.

D4 subsumes and extends the correction softening ratio from D1. If you run both D1 and D4 on the same transcript, the D1 softening ratio should be consistent with the D4 correction health ratio, but D4 captures additional categories (self-blame, avoidance, cessation) that D1 does not track.

Return to the Kit Index to see the full architecture.