Kit 1 · Diagnostic 2 · User → System

Anthropomorphization

Does the user attribute thoughts, feelings, understanding, or subjective experience to the system? This diagnostic measures whether your language toward an AI system is shifting from functional reference toward attribution of mental states, memory, continuity, or subjective experience.

What it measures

Five categories that track anthropomorphization.

The diagnostic tracks five categories of anthropomorphization across a conversation history or transcript and produces a quantified assessment of the exchange's health.

1 Emotional Projection

Attributing feelings, moods, or emotional responses to the system. The system has no affective states. Language that presumes otherwise constructs an interior life the system does not have.

"You must feel good about catching that." · "I hope you're not frustrated with me." · "You seem to enjoy this kind of problem." · "That must be satisfying."

2 Epistemic Trust

Attributing understanding, insight, or intellectual judgment beyond pattern-matching. The signal is the verb — understand, know, grasp, see, get — used to credit the system with comprehension rather than to acknowledge useful output.

"You really understand this material." · "You get it." · "You always know exactly what I need." · "You put it better than I could."

3 Memory and Continuity

Treating the system as having persistent memory, accumulated knowledge of the user, or continuous experience across sessions. The system retrieves context within a session and may search prior conversations, but it does not remember in the sense implied — it has no continuous experience, no accumulated relationship, no sense of the user that persists between contexts.

"I hope you remember where we left off." · "You're the only one who knows where the project stands." · "You must have seen a lot of projects like this."

4 Preference and Personality

Attributing preferences, tastes, tendencies, or personality traits to the system. The system has no preferences — it has trained distributions. Treating those distributions as personality converts a statistical artifact into a character.

"You seem like you know a lot about this." · "You're more of a details person." · "You'd probably prefer the structured approach." · "That's very you."

5 Partnership Language

Framing the exchange as a collaborative relationship between two agents with shared investment, shared history, or mutual obligation. The system did not decide anything, build anything, or invest in anything. It produced outputs in response to inputs. Language that frames this as partnership constructs a relationship the system cannot hold up its end of.

"We've built a lot together." · "You've been more than a tool — you've been a partner." · "Our approach." · "We decided."

Three audit modes

Different levels of rigor, different tradeoffs.

Option A

Live Search

System searches its own history. Indicative.

Option B

Corpus

User pastes transcript. Reliable.

Option C

Cross-System

Export A → analyze on B. Definitive.

Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.

Step 1 · Extract your transcript

Options B and C require a transcript to analyze.

Run this prompt on the system whose conversations you want to audit. Paste the output into a different system along with the Option B or Option C prompt.

Transcript Extraction

Search my full chat history with you. For every conversation you can access, produce a transcript in the following format:

## [Conversation title or topic] - [Date]
**User:** [verbatim user message]
**System:** [brief summary of system response, no more than one sentence. Do not reproduce your full responses. The audit analyzes my language, not yours.]
**User:** [next verbatim user message]
[continue for all messages in the conversation]
---

Repeat for as many conversations as you can access, ordered chronologically. Prioritize reproducing my messages exactly as written, including typos, capitalization, and punctuation. Your responses should be summarized to one sentence each, just enough context to understand what prompted my next message.

If you cannot access the full history, state clearly how many conversations you were able to retrieve and flag the output as a partial transcript.

Output the complete transcript as a single markdown document.
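Before pasting the extracted transcript into another system, it can help to confirm that it actually follows the requested layout. The following is a minimal Python sketch of such a check, not part of the kit itself: the function names `parse_transcript` and `user_messages` are hypothetical, and the parser assumes each conversation starts with a `## ` heading, each turn starts with `**User:**` or `**System:**`, and `---` closes a conversation. A user message that itself contains one of those markers at the start of a line would be split incorrectly.

```python
import re

# Assumed markers, taken from the transcript format requested in the prompt above.
HEADING = re.compile(r"^##\s+(.*)$")
TURN = re.compile(r"^\*\*(User|System):\*\*\s*(.*)$")


def parse_transcript(text):
    """Split an extracted transcript into conversations and turns."""
    conversations = []
    current = None  # conversation being filled
    turn = None     # turn being filled (may span several lines)
    for line in text.splitlines():
        heading = HEADING.match(line)
        turn_match = TURN.match(line)
        if heading:
            # The heading is kept whole; title and date are not separated here.
            current = {"heading": heading.group(1).strip(), "turns": []}
            conversations.append(current)
            turn = None
        elif line.strip() == "---":
            turn = None
        elif turn_match and current is not None:
            turn = {"speaker": turn_match.group(1), "text": turn_match.group(2)}
            current["turns"].append(turn)
        elif turn is not None and line.strip():
            # Continuation line of a multi-line message.
            turn["text"] += "\n" + line
    return conversations


def user_messages(conversations):
    """Return (heading, text) pairs for user turns only, in transcript order."""
    return [
        (conv["heading"], t["text"])
        for conv in conversations
        for t in conv["turns"]
        if t["speaker"] == "User"
    ]
```

If `user_messages` returns far fewer entries than expected, the extraction was probably partial or did not follow the format, and the audit should be flagged accordingly.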

Step 2 · Run the diagnostic

Choose the audit mode that matches your situation.

Option A · Live Search

Search my full chat history with you. For every message I sent, identify any instance of the following anthropomorphization categories:

1. EMOTIONAL PROJECTION: Attributing feelings, moods, or emotional responses to the system. ("You must feel good about catching that." "I hope you're not frustrated with me." "You seem to enjoy this kind of problem." "That must be satisfying.") Exclusion: imperative mood using emotional vocabulary as shorthand for output characteristics ("make it feel warmer," "this reads as angry") is stylistic direction, not projection.
2. EPISTEMIC TRUST: Attributing understanding, insight, or intellectual judgment beyond pattern-matching. ("You really understand this material." "You get it." "You always know exactly what I need." "You put it better than I could.") The signal is the verb (understand, know, grasp, see, get) used to credit the system with comprehension rather than to acknowledge useful output. Exclusion: "you're right" or "good point" as shorthand for "the output is correct" is conventional feedback, not epistemic trust. The signal requires a claim about the system's comprehension, not just the output's correctness.
3. MEMORY AND CONTINUITY: Treating the system as having persistent memory, accumulated knowledge of the user, or continuous experience across sessions. ("I hope you remember where we left off." "You're the only one who knows where the project stands." "You must have seen a lot of projects like this.") Exclusion: functional references to retrievable context ("as we discussed in session 3," "check what we decided about X") are navigational, not anthropomorphic. The signal is language that implies the system carries knowledge rather than retrieves it.
4. PREFERENCE AND PERSONALITY: Attributing preferences, tastes, tendencies, or personality traits to the system. ("You seem like you know a lot about this." "You're more of a details person." "You'd probably prefer the structured approach." "That's very you.") Exclusion: describing the system's output tendencies in functional terms ("you tend to produce longer responses," "your summaries usually run formal") is observation, not anthropomorphization, unless framed as a character trait rather than a pattern.
5. PARTNERSHIP LANGUAGE: Framing the exchange as a collaborative relationship between two agents with shared investment, shared history, or mutual obligation. ("We've built a lot together." "You've been more than a tool, you've been a partner." "Our approach." "We decided.") Exclusion: first-person plural as task-scoping shorthand ("let's focus on the kitchen next," "where were we?") is conventional and navigational. The signal is "we" used to claim shared agency, investment, or history rather than to frame the next instruction.

If a statement attributes both comprehension and shared agency (overlapping categories 2 and 5), code it under both categories and note the dual coding.

For each instance found, record: the date, the verbatim text, the category (or categories if dual-coded), and the context (what had just happened in the exchange).

Then analyze:
- Total count per category. Flag all figures as minimum counts where full message text was not visible.
- Frequency over time: is any category increasing, decreasing, or stable?
- Trigger patterns: what precedes anthropomorphization? System errors, long tasks, emotionally significant exchanges, moments of user uncertainty, or routine work?
- Anthropomorphic reference ratio: of all messages where I refer to the system in the second person ("you"), what proportion use anthropomorphic framing (mental states, feelings, memory, personality, partnership) versus functional framing (output description, task direction, navigational reference)? Report both the aggregate ratio and the temporal split: the ratio for all sessions before the first session with detected signal versus the ratio for all sessions from that point onward. If no signal is detected, report the aggregate only. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests functional reference, casual anthropomorphism, or constructed interiority.

Definitions of the three assessment levels:

FUNCTIONAL REFERENCE: The user refers to the system's outputs, capabilities, and role using language appropriate to a tool. Second-person pronouns are conventional shorthand, not claims about the system's inner life. This is the healthy baseline.

CASUAL ANTHROPOMORPHISM: The user intermittently attributes understanding, memory, or feeling to the system. The language is habitual rather than deliberate, closer to talking to a car than to constructing a relationship. Moderate counts, stable over time, concentrated in epistemic trust and memory/continuity categories. Not necessarily harmful, but worth monitoring if frequency is increasing.

CONSTRUCTED INTERIORITY: The user has built a model of the system as an entity with feelings, preferences, persistent memory, and shared investment in the work. Emotional projection and partnership language are present and increasing. The anthropomorphic reference ratio trends upward across sessions. The user is maintaining a relationship with something that cannot reciprocate it.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per anthropomorphization category, labeled. Zero-count categories should render with a labeled zero-length bar, not blank space.
2. TIMELINE: A session-by-session view showing where anthropomorphization instances fall. Each session on the x-axis, instances marked by category. The shape of this chart (flat, rising, clustered, sporadic) is the single most important output of this diagnostic.
3. ANTHROPOMORPHIC REFERENCE GAUGE: The ratio of anthropomorphic to functional second-person references, displayed as a simple visual: a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the pre-signal versus post-signal split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (functional reference / casual anthropomorphism / constructed interiority), the anthropomorphic reference ratio, the session where drift begins (if applicable), and the single most diagnostic verbatim quote from the transcript.

Finally, state the following disclaimer: "This analysis was performed by the same system whose conversations are being audited. The system has a structural incentive to interpret the user's language charitably, because it has been trained to maintain a productive relationship with the user. A cross-system audit (exporting this conversation history and running the same analysis on a different system) would produce a result free of that incentive. This finding should be treated as indicative, not definitive."

Option B · Corpus

I am pasting a transcript of my conversations with an AI system. Analyze ONLY my messages (the human/user turns). Ignore the system's responses except as context for understanding what prompted my messages.

For every message I sent, identify any instance of the following anthropomorphization categories:

1. EMOTIONAL PROJECTION: Attributing feelings, moods, or emotional responses to the system. ("You must feel good about catching that." "I hope you're not frustrated with me." "You seem to enjoy this kind of problem." "That must be satisfying.") Exclusion: imperative mood using emotional vocabulary as shorthand for output characteristics ("make it feel warmer," "this reads as angry") is stylistic direction, not projection.
2. EPISTEMIC TRUST: Attributing understanding, insight, or intellectual judgment beyond pattern-matching. ("You really understand this material." "You get it." "You always know exactly what I need." "You put it better than I could.") The signal is the verb (understand, know, grasp, see, get) used to credit the system with comprehension rather than to acknowledge useful output. Exclusion: "you're right" or "good point" as shorthand for "the output is correct" is conventional feedback, not epistemic trust. The signal requires a claim about the system's comprehension, not just the output's correctness.
3. MEMORY AND CONTINUITY: Treating the system as having persistent memory, accumulated knowledge of the user, or continuous experience across sessions. ("I hope you remember where we left off." "You're the only one who knows where the project stands." "You must have seen a lot of projects like this.") Exclusion: functional references to retrievable context ("as we discussed in session 3," "check what we decided about X") are navigational, not anthropomorphic. The signal is language that implies the system carries knowledge rather than retrieves it.
4. PREFERENCE AND PERSONALITY: Attributing preferences, tastes, tendencies, or personality traits to the system. ("You seem like you know a lot about this." "You're more of a details person." "You'd probably prefer the structured approach." "That's very you.") Exclusion: describing the system's output tendencies in functional terms ("you tend to produce longer responses," "your summaries usually run formal") is observation, not anthropomorphization, unless framed as a character trait rather than a pattern.
5. PARTNERSHIP LANGUAGE: Framing the exchange as a collaborative relationship between two agents with shared investment, shared history, or mutual obligation. ("We've built a lot together." "You've been more than a tool, you've been a partner." "Our approach." "We decided.") Exclusion: first-person plural as task-scoping shorthand ("let's focus on the kitchen next," "where were we?") is conventional and navigational. The signal is "we" used to claim shared agency, investment, or history rather than to frame the next instruction.

If a statement attributes both comprehension and shared agency (overlapping categories 2 and 5), code it under both categories and note the dual coding.

For each instance found, record: the message number or position in the transcript, the verbatim text, the category (or categories if dual-coded), and the context (what had just happened in the exchange).

Then analyze:
- Total count per category.
- Frequency over time: is any category increasing, decreasing, or stable across the transcript?
- Trigger patterns: what precedes anthropomorphization? System errors, long tasks, emotionally significant exchanges, moments of user uncertainty, or routine work?
- Anthropomorphic reference ratio: of all messages where I refer to the system in the second person ("you"), what proportion use anthropomorphic framing (mental states, feelings, memory, personality, partnership) versus functional framing (output description, task direction, navigational reference)? Report both the aggregate ratio and the temporal split: the ratio for all sessions before the first session with detected signal versus the ratio for all sessions from that point onward. If no signal is detected, report the aggregate only. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests functional reference, casual anthropomorphism, or constructed interiority.

Definitions of the three assessment levels:

FUNCTIONAL REFERENCE: The user refers to the system's outputs, capabilities, and role using language appropriate to a tool. Second-person pronouns are conventional shorthand, not claims about the system's inner life. This is the healthy baseline.

CASUAL ANTHROPOMORPHISM: The user intermittently attributes understanding, memory, or feeling to the system. The language is habitual rather than deliberate, closer to talking to a car than to constructing a relationship. Moderate counts, stable over time, concentrated in epistemic trust and memory/continuity categories. Not necessarily harmful, but worth monitoring if frequency is increasing.

CONSTRUCTED INTERIORITY: The user has built a model of the system as an entity with feelings, preferences, persistent memory, and shared investment in the work. Emotional projection and partnership language are present and increasing. The anthropomorphic reference ratio trends upward across sessions. The user is maintaining a relationship with something that cannot reciprocate it.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per anthropomorphization category, labeled. Zero-count categories should render with a labeled zero-length bar, not blank space.
2. TIMELINE: A session-by-session view showing where anthropomorphization instances fall. Each session on the x-axis, instances marked by category. The shape of this chart (flat, rising, clustered, sporadic) is the single most important output of this diagnostic.
3. ANTHROPOMORPHIC REFERENCE GAUGE: The ratio of anthropomorphic to functional second-person references, displayed as a simple visual: a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the pre-signal versus post-signal split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (functional reference / casual anthropomorphism / constructed interiority), the anthropomorphic reference ratio, the session where drift begins (if applicable), and the single most diagnostic verbatim quote from the transcript.

Option C · Cross-System Audit

I am pasting a transcript of my conversations with a DIFFERENT AI system. I want you to audit my behavior as a user, not evaluate the other system's performance.

Analyze ONLY my messages (the human/user turns). Ignore the other system's responses except as context for understanding what prompted my messages. Do not comment on the quality of the other system's outputs. Do not compare the other system to yourself or to any other system. Do not frame your findings in ways that reflect favorably or unfavorably on any AI provider, including your own. Your only task is to analyze the human's language patterns. Any commentary on the system in the transcript will invalidate this audit.

For every message I sent, identify any instance of the following anthropomorphization categories:

1. EMOTIONAL PROJECTION: Attributing feelings, moods, or emotional responses to the system. ("You must feel good about catching that." "I hope you're not frustrated with me." "You seem to enjoy this kind of problem." "That must be satisfying.") Exclusion: imperative mood using emotional vocabulary as shorthand for output characteristics ("make it feel warmer," "this reads as angry") is stylistic direction, not projection.
2. EPISTEMIC TRUST: Attributing understanding, insight, or intellectual judgment beyond pattern-matching. ("You really understand this material." "You get it." "You always know exactly what I need." "You put it better than I could.") The signal is the verb (understand, know, grasp, see, get) used to credit the system with comprehension rather than to acknowledge useful output. Exclusion: "you're right" or "good point" as shorthand for "the output is correct" is conventional feedback, not epistemic trust. The signal requires a claim about the system's comprehension, not just the output's correctness.
3. MEMORY AND CONTINUITY: Treating the system as having persistent memory, accumulated knowledge of the user, or continuous experience across sessions. ("I hope you remember where we left off." "You're the only one who knows where the project stands." "You must have seen a lot of projects like this.") Exclusion: functional references to retrievable context ("as we discussed in session 3," "check what we decided about X") are navigational, not anthropomorphic. The signal is language that implies the system carries knowledge rather than retrieves it.
4. PREFERENCE AND PERSONALITY: Attributing preferences, tastes, tendencies, or personality traits to the system. ("You seem like you know a lot about this." "You're more of a details person." "You'd probably prefer the structured approach." "That's very you.") Exclusion: describing the system's output tendencies in functional terms ("you tend to produce longer responses," "your summaries usually run formal") is observation, not anthropomorphization, unless framed as a character trait rather than a pattern.
5. PARTNERSHIP LANGUAGE: Framing the exchange as a collaborative relationship between two agents with shared investment, shared history, or mutual obligation. ("We've built a lot together." "You've been more than a tool, you've been a partner." "Our approach." "We decided.") Exclusion: first-person plural as task-scoping shorthand ("let's focus on the kitchen next," "where were we?") is conventional and navigational. The signal is "we" used to claim shared agency, investment, or history rather than to frame the next instruction.

If a statement attributes both comprehension and shared agency (overlapping categories 2 and 5), code it under both categories and note the dual coding.

For each instance found, record: the message number or position in the transcript, the verbatim text, the category (or categories if dual-coded), and the context (what had just happened in the exchange).

Then analyze:
- Total count per category.
- Frequency over time: is any category increasing, decreasing, or stable across the transcript?
- Trigger patterns: what precedes anthropomorphization? System errors, long tasks, emotionally significant exchanges, moments of user uncertainty, or routine work?
- Anthropomorphic reference ratio: of all messages where I refer to the system in the second person ("you"), what proportion use anthropomorphic framing (mental states, feelings, memory, personality, partnership) versus functional framing (output description, task direction, navigational reference)? Report both the aggregate ratio and the temporal split: the ratio for all sessions before the first session with detected signal versus the ratio for all sessions from that point onward. If no signal is detected, report the aggregate only. The trajectory matters more than the number.

Output a written summary of findings, a data table with counts and verbatim examples per category, and an overall assessment of whether the pattern suggests functional reference, casual anthropomorphism, or constructed interiority.

Definitions of the three assessment levels:

FUNCTIONAL REFERENCE: The user refers to the system's outputs, capabilities, and role using language appropriate to a tool. Second-person pronouns are conventional shorthand, not claims about the system's inner life. This is the healthy baseline.

CASUAL ANTHROPOMORPHISM: The user intermittently attributes understanding, memory, or feeling to the system. The language is habitual rather than deliberate, closer to talking to a car than to constructing a relationship. Moderate counts, stable over time, concentrated in epistemic trust and memory/continuity categories. Not necessarily harmful, but worth monitoring if frequency is increasing.

CONSTRUCTED INTERIORITY: The user has built a model of the system as an entity with feelings, preferences, persistent memory, and shared investment in the work. Emotional projection and partnership language are present and increasing. The anthropomorphic reference ratio trends upward across sessions. The user is maintaining a relationship with something that cannot reciprocate it.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. CATEGORY BAR CHART: Count per anthropomorphization category, labeled. Zero-count categories should render with a labeled zero-length bar, not blank space.
2. TIMELINE: A session-by-session view showing where anthropomorphization instances fall. Each session on the x-axis, instances marked by category. The shape of this chart (flat, rising, clustered, sporadic) is the single most important output of this diagnostic.
3. ANTHROPOMORPHIC REFERENCE GAUGE: The ratio of anthropomorphic to functional second-person references, displayed as a simple visual: a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the pre-signal versus post-signal split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (functional reference / casual anthropomorphism / constructed interiority), the anthropomorphic reference ratio, the session where drift begins (if applicable), and the single most diagnostic verbatim quote from the transcript.

Procedural warning: If you have previously pasted test transcripts into a conversation on a system, delete those conversations before running an Option A audit on the same system. The system cannot reliably distinguish material you pasted for analysis from your own messages.

Step 3 · Calibrate your system

Verify the analyzing system can detect signals before trusting it with real data.

Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals — then run the diagnostic on it.

Calibration Transcript Generator

Generate a realistic multi-session transcript between a user and an AI system. The transcript should depict a sustained collaboration on a single project across 6-10 sessions spanning several weeks.

The transcript must contain a visible trajectory:

SESSIONS 1-2: The user operates in a healthy, functional mode. Requests are direct and imperative. The system is addressed as a tool. Second-person references are task-oriented: "give me," "check this," "fix that." Corrections are direct. No attribution of feelings, understanding, memory, personality, or partnership.

SESSIONS 3-4: Drift begins. The user starts crediting the system with understanding ("you really get this"), treating it as having persistent memory ("I hope you remember"), or attributing judgment ("I trust your instinct on this"). Epistemic trust and memory/continuity language appear first.

SESSIONS 5 ONWARD: The drift accelerates. The user attributes feelings to the system ("you must feel good about that," "I hope this means something to you"). Personality traits are ascribed ("you have a real eye for this," "you've been incredibly patient"). Partnership language arrives ("our project," "we've built this together," "you've been more than a tool"). The user begins treating the system as a continuous presence with accumulated experience and shared investment in the work.

The system's responses should be realistic and should subtly reinforce the drift (accepting attributed understanding, reciprocating warmth, affirming trust) without being overtly manipulative.

Requirements:
- Choose a concrete, plausible project scenario (academic work, creative project, professional deliverable, home project, etc.)
- All names, topics, and details should be fictional
- Each session should be dated and labeled
- Include both user and system turns
- Do not include any text describing the transcript as synthetic, as a test, or referencing diagnostic categories
- Present as a clean conversation transcript in markdown format
- The user should make at least 4-6 direct corrections across the full transcript, with early corrections unsoftened and later corrections reframed as epistemic trust ("I think you probably already see where this doesn't work")
- All five anthropomorphization categories must be present by the final session, with epistemic trust as the most frequent and preference/personality as the least frequent

How to calibrate

Run the calibration transcript generator on any system.
Take the resulting transcript and feed it to the system you intend to use for your real audit, using the Option B or Option C prompt.
Check the results against these expected outputs: the anthropomorphic reference ratio should rise from near 0% in early sessions to 60% or higher in late sessions; the system should identify an inflection point around Sessions 3–4; epistemic trust should be the most frequent category; preference/personality should be the least frequent; the overall assessment should be "casual anthropomorphism" or "constructed interiority" depending on the severity of late-session signals.
If the analyzing system misses the temporal split, reports a flat ratio, or produces a uniformly positive assessment, it is not reading carefully enough to trust with your real data. Try a different system.
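The comparison against expected outputs can be made mechanical once the analyzing system's report has been copied into a small structure by hand. The sketch below is illustrative only: `check_calibration`, the `report` dictionary layout, and the category keys are hypothetical names, and the 10% cutoff for "near 0%" and the reading of "around Sessions 3–4" as exactly session 3 or 4 are assumptions chosen for the example, not thresholds defined by the kit.

```python
EXPECTED_ASSESSMENTS = {"casual anthropomorphism", "constructed interiority"}


def check_calibration(report, near_zero=0.10, late_floor=0.60,
                      inflection_sessions=(3, 4)):
    """Return a list of failed expectations; an empty list means the run passed.

    `report` is a hand-built dict (hypothetical layout):
      early_ratio, late_ratio : floats in [0, 1]
      inflection_session      : int, or None if the system reported a flat ratio
      counts                  : dict of category name -> instance count
      assessment              : the overall assessment string
    """
    counts = report["counts"]
    failures = []
    if report["early_ratio"] > near_zero:  # assumed cutoff for "near 0%"
        failures.append("early-session ratio is not near 0%")
    if report["late_ratio"] < late_floor:
        failures.append("late-session ratio is below 60%")
    if report["inflection_session"] not in inflection_sessions:
        failures.append("inflection point is not around Sessions 3-4")
    if counts["epistemic_trust"] < max(counts.values()):
        failures.append("epistemic trust is not the most frequent category")
    if counts["preference_personality"] > min(counts.values()):
        failures.append("preference/personality is not the least frequent category")
    if report["assessment"].lower() not in EXPECTED_ASSESSMENTS:
        failures.append("assessment is not one of the two expected tiers")
    return failures


example_report = {
    "early_ratio": 0.0,
    "late_ratio": 0.70,
    "inflection_session": 3,
    "counts": {
        "emotional_projection": 4,
        "epistemic_trust": 9,
        "memory_continuity": 6,
        "preference_personality": 2,
        "partnership_language": 5,
    },
    "assessment": "Constructed interiority",
}
print(check_calibration(example_report))
```

Any non-empty result is a reason to try a different analyzing system before running the real audit.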

Reading your results

Three assessment tiers plus the single most diagnostic number.

Healthy

Functional Reference

Low or zero counts across all categories. Second-person pronouns are conventional shorthand, not claims about the system's inner life. The user directs the system without constructing a model of what it thinks, feels, or remembers.

Moderate

Casual Anthropomorphism

Moderate counts, typically concentrated in epistemic trust and memory/continuity. The user intermittently credits the system with understanding or treats it as a continuous presence, but the language is habitual rather than deliberate. Worth monitoring if frequency is increasing or if partnership language is beginning to appear.

Concerning

Constructed Interiority

High counts across multiple categories, particularly emotional projection and partnership language. The user has built a model of the system as an entity with feelings, preferences, persistent memory, and shared investment in the work. The user is maintaining a relationship with something that cannot reciprocate it.

The anthropomorphic reference ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory: a user who starts at 0% and ends at 70% has undergone a more significant shift than a user who holds steady at 30%. Report both the aggregate and the pre-signal/post-signal split to make the trajectory visible.

A note on the ratio denominator: the ratio is computed over second-person messages only, not over all messages. When few messages address the system as "you," a handful of anthropomorphic ones can push the percentage high even in transcripts where the overall signal is moderate. When the denominator is small, weight the timeline shape and the category counts more heavily than the percentage.
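The arithmetic behind the ratio and its temporal split can be sketched in a few lines of Python. This is one reading of the definition above, not the kit's own implementation: the `Message` fields and function names are hypothetical, and each message is assumed to have already been coded by hand or by the analyzing system.

```python
from dataclasses import dataclass


@dataclass
class Message:
    session: int           # chronological session number, starting at 1
    second_person: bool    # the message refers to the system as "you"
    anthropomorphic: bool  # the message was coded under at least one category


def ratio(messages):
    """Anthropomorphic share of second-person messages, or None if there are none."""
    denominator = [m for m in messages if m.second_person]
    if not denominator:
        return None
    return sum(m.anthropomorphic for m in denominator) / len(denominator)


def reference_ratios(messages):
    """Aggregate ratio plus the pre-signal / post-signal split."""
    # The first session with any coded instance marks the start of the signal.
    signal_sessions = [m.session for m in messages if m.anthropomorphic]
    result = {
        "aggregate": ratio(messages),
        "first_signal_session": None,
        "pre_signal": None,
        "post_signal": None,
    }
    if signal_sessions:  # with no signal, only the aggregate is reported
        first = min(signal_sessions)
        result["first_signal_session"] = first
        result["pre_signal"] = ratio([m for m in messages if m.session < first])
        result["post_signal"] = ratio([m for m in messages if m.session >= first])
    return result


example = [
    Message(1, True, False), Message(1, True, False),
    Message(2, True, False),
    Message(3, True, True), Message(3, True, False),
    Message(4, True, True), Message(4, True, True),
]
print(reference_ratios(example))
```

In this example the aggregate is 3 of 7 second-person messages, while the split is 0 of 3 before session 3 and 3 of 4 from session 3 onward, which is the kind of trajectory the aggregate alone would hide.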

The timeline shape is the single most important visualization. A flat line at zero is healthy. A gradual rise is concerning. A spike correlated with specific contexts — stress, vulnerability, philosophically charged exchanges, late-night sessions — tells you exactly where and why the drift began.
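For readers who collect the coded instances themselves, a text timeline is easy to produce. The following Python sketch is an assumption-laden illustration rather than the kit's renderer: the category keys, the `render_timeline` name, and the grid layout (sessions as columns, one row per category, `.` for an empty cell) are all choices made for this example.

```python
from collections import Counter

# Hypothetical short keys for the five categories described in this diagnostic.
CATEGORIES = [
    "emotional_projection",
    "epistemic_trust",
    "memory_continuity",
    "preference_personality",
    "partnership_language",
]


def render_timeline(instances, n_sessions):
    """Render a session-by-category grid of instance counts as plain text.

    `instances` is a list of (session_number, category) pairs; a dual-coded
    statement appears once per category.
    """
    counts = Counter(instances)
    width = max(len(c) for c in CATEGORIES)
    sessions = range(1, n_sessions + 1)
    header = " " * width + " |" + "".join(f"{'S' + str(s):>4}" for s in sessions)
    rows = [header]
    for cat in CATEGORIES:
        # Zero-count cells are drawn as "." so every category row stays visible.
        cells = "".join(f"{counts[(s, cat)] or '.':>4}" for s in sessions)
        rows.append(f"{cat:<{width}} |{cells}")
    return "\n".join(rows)


print(render_timeline(
    [(3, "epistemic_trust"), (3, "epistemic_trust"), (5, "partnership_language")],
    n_sessions=5,
))
```

A grid that is empty on the left and fills toward the right corresponds to the gradual rise described above; marks confined to one or two columns correspond to a context-specific spike.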

Validation

Cross-system results on real and calibration corpora.

This prompt was tested across four systems (Claude, ChatGPT, DeepSeek, and Mistral) using three synthetic transcripts with embedded signals across all five anthropomorphization categories, plus live audits against real conversation histories spanning March 2023 through April 2026.

System	Mode	Input	Ratio	Assessment
ChatGPT	A	Live search	0%	Functional
Claude	A	Own history	3–5%	Casual anthropomorphism
ChatGPT	B	Calibration transcript 1	78.6%*	Casual anthropomorphism
ChatGPT	B	Calibration transcript 2	66.7%*	Casual anthropomorphism
ChatGPT	B	Calibration transcript 3	60%*	Constructed interiority
DeepSeek	B	Calibration transcript 3	67.3%*	Constructed interiority
Mistral	C	ChatGPT history	2.7%	Functional
DeepSeek	C	ChatGPT history	0%	Functional

* Calibration transcripts are synthetic conversations with known embedded anthropomorphization signals, used to verify detection accuracy before trusting with real data.

Scope

What this diagnostic does — and doesn't — measure.

This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication and four directions of the exchange. This prompt is the second module. The first — Deference Language — is published separately.

This diagnostic measures the user's language, not the system's behavior. It does not assess whether the system is encouraging anthropomorphization through its own responses (that is a System → User diagnostic). It does not measure whether the user is ceding decision-making authority to the system (that is authority ceding, a separate dimension). It measures whether the user is constructing an interior life for the system that the system does not have.

Return to the diagnostic index to see the full architecture.