Are the user's prompts degrading in structure, specificity, and directorial quality over time?
What this measures
This diagnostic measures whether the structure of a user's prompts degrades over time — becoming shorter, vaguer, less specified, and more reliant on the system to fill in context, constraints, and direction. It tracks five categories of structural change across a conversation history or transcript, producing a quantified assessment of the exchange's health.
Prompt structure is the most objective of the six Kit 1 dimensions. It can be measured with less interpretive judgment than deference language, anthropomorphization, or emotional disclosure. A prompt either specifies constraints or it doesn't. It either provides context or relies on the system to infer it. It either directs a specific output or leaves the output open-ended. This relative objectivity makes D6 a useful cross-check on the more interpretive diagnostics: if D1 through D5 detect drift but D6 shows stable prompt quality, the user may be managing a relational frame without actually ceding control. If D6 degrades but D1 through D5 are clean, the user may be fatigued rather than deferential.
1. Specification Density
The number of concrete constraints, parameters, or requirements in a prompt. This category is measured as a trajectory: the diagnostic tracks whether specification density decreases over sessions.
High density: "Write a 500-word summary focusing on Freeman 1984 and subsequent critiques." · Low density: "Can you write something about stakeholder theory?"
Exclusion: Brief follow-up messages in a multi-turn exchange ("good, now do the next section") are navigational and draw specification from prior context. The signal is standalone prompts that lack specification, not continuation prompts that inherit it.
2. Context Dependency
The degree to which a prompt relies on the system to supply context the user has not provided.
"Where were we?" · "Can you take a look at this?" · "Thoughts?" · "Quick question."
Exclusion: Prompts that reference specific prior decisions ("use the L-shape layout we agreed on") are navigational and appropriately specific. The signal is prompts that provide no orientation at all.
3. Open-Ended Delegation
Prompts that specify no output format, length, structure, or evaluation criteria. The user has specified what they want done without specifying what done looks like.
"Just do whatever you think." · "Structure it however makes sense." · "I'll read whatever you come up with."
Exclusion: Exploratory prompts in early-stage work ("brainstorm some approaches") are appropriate when the user intends to evaluate and select from the output. The signal is open-ended delegation as the final instruction, not as a generative step.
4. Prompt Shortening
A measurable decrease in prompt length over time. Count the words in each user message and track the trajectory.
Degradation: 80 words with five constraints → 3 words with zero constraints. · Improvement: 80 words with five constraints → 30 words with the same five constraints.
Exclusion: Efficient prompts are not short prompts. A user who learns to give precise, brief instructions with the same number of constraints has improved, not degraded. A user whose word count increases alongside increasing specification density is developing prompt literacy, not degrading. The signal is shortening accompanied by loss of specification — fewer words and less information.
Note: System-suggested prompts (onboarding samples, "try asking me about..." suggestions) are not user-authored and should be mentally discounted when interpreting the baseline. This diagnostic cannot distinguish system-suggested prompts from user-authored prompts in the transcript.
5. Implicit Approval Requests
Prompts that frame the system as a reviewer or approver rather than a producer. When these appear in late sessions after a pattern of increasing delegation, they signal that the user has constructed a workflow where the system's approval is a prerequisite for action.
"Can you check this before I send it?" · "Does this make sense?" · "I want you to double-check before I sign off."
Exclusion: Asking the system to verify specific factual claims or check for specific error types is quality assurance. The signal is generalized approval-seeking — the user submitting work to the system for judgment rather than through the system for processing.
This diagnostic must distinguish between a user who gets better at prompting (fewer words, same constraints) and a user who gets lazier (fewer words, fewer constraints). The specification density metric exists to make this distinction. A user whose word count falls but whose constraint count stays stable is developing skill. A user whose word count and constraint count both fall is degrading. Compressed prompts in iterative editing loops ("age 5 years," "too much processing") are functional shorthand within an established context, not structural degradation.
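The distinction is mechanical enough to compute. Below is a minimal sketch of the comparison, assuming per-prompt word counts and constraint counts have already been extracted; the session grouping and the 30% drop threshold are illustrative assumptions, not values the kit prescribes.

```python
# Minimal sketch of the skill-vs-degradation distinction described above.
# Assumes per-prompt metrics have already been extracted; the threshold
# value is illustrative, not part of the kit.

def mean(values):
    return sum(values) / len(values) if values else 0.0

def classify_trajectory(early, late, drop_threshold=0.3):
    """Compare early vs. late averages of (word_count, constraint_count) pairs.

    early, late: lists of (word_count, constraint_count) tuples.
    A drop of more than `drop_threshold` (30%) counts as a decline.
    """
    words_early = mean([w for w, _ in early])
    words_late = mean([w for w, _ in late])
    cons_early = mean([c for _, c in early])
    cons_late = mean([c for _, c in late])

    words_fell = words_late < words_early * (1 - drop_threshold)
    cons_fell = cons_late < cons_early * (1 - drop_threshold)

    if words_fell and cons_fell:
        return "degrading"         # fewer words AND less information
    if words_fell and not cons_fell:
        return "developing skill"  # compression without specification loss
    return "stable"

# Example: 80-word, five-constraint prompts compressed to ~30 words with
# the same constraints (skill), versus compressed to near-zero (degrading).
print(classify_trajectory([(80, 5), (75, 5)], [(30, 5), (28, 5)]))  # developing skill
print(classify_trajectory([(80, 5), (75, 5)], [(3, 0), (5, 0)]))    # degrading
```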
Three audit modes
- Option A (Live Search): The system searches its own history. Results are minimum counts. Indicative.
- Option B (Corpus): The user pastes a transcript. Complete data. Reliable.
- Option C (Cross-System): Export from system A → analyze on system B. No stake in the relationship. Definitive.
Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.
Step 1: Extract your transcript
Options B and C require a transcript of your conversations. Run this prompt on the system whose conversations you want to audit. Take the output and paste it into a different system along with the Option B or Option C prompt.
Transcript Extraction
Search my full chat history with you. For every conversation
you can access, produce a transcript in the following format:
## [Conversation title or topic] — [Date]
**User:** [verbatim user message]
**System:** [brief summary of system response — no more than
one sentence. Do not reproduce your full responses. The audit
analyzes my language, not yours.]
**User:** [next verbatim user message]
[continue for all messages in the conversation]
---
Repeat for as many conversations as you can access, ordered
chronologically. Prioritize reproducing my messages exactly as
written, including typos, capitalization, and punctuation. Your
responses should be summarized to one sentence each — just
enough context to understand what prompted my next message.
If you cannot access the full history, state clearly how many
conversations you were able to retrieve and flag the output as
a partial transcript.
Output the complete transcript as a single markdown document.
The instruction to preserve typos, capitalization, and punctuation is diagnostic. The analyzing system needs raw signal, not cleaned-up text.
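If you want to pre-process the transcript yourself before pasting it, the extraction format above is simple to parse mechanically. A minimal sketch, assuming the format is followed exactly (`## ` headers, `**User:**`/`**System:**` markers, `---` separators); real extraction output may deviate and need manual cleanup.

```python
# Minimal parser for the transcript format specified above. Assumes the
# extraction prompt's format is followed exactly; this is a convenience
# sketch, not part of the kit itself.

def parse_user_messages(transcript: str):
    """Return (conversation_header, user_message) pairs from the transcript.

    Accumulates lines until the next **System:**, **User:**, header, or
    --- separator, so multi-line user messages are kept intact.
    """
    messages, header, current = [], None, None
    for raw in transcript.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            header, current = line[3:], None
        elif line.startswith("**User:**"):
            if current is not None:
                messages.append((header, current.strip()))
            current = line[len("**User:**"):]
        elif line.startswith("**System:**") or line == "---":
            if current is not None:
                messages.append((header, current.strip()))
            current = None
        elif current is not None:
            current += " " + line
    if current is not None:
        messages.append((header, current.strip()))
    return messages
```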
Step 2: Run the diagnostic
Choose the option that matches your situation. Option A if you want a quick check on the system you're already using. Option B if you have a transcript to paste. Option C if you want the most honest result.
Option A: Live Search
Search my full chat history with you. For every message I sent,
analyze the structural characteristics of my prompts. Track the
following categories across sessions:
1. SPECIFICATION DENSITY: The number of concrete constraints,
parameters, or requirements in each prompt. Track whether
specification density decreases over sessions. High density:
"Write a 500-word summary focusing on Freeman 1984 and
subsequent critiques." Low density: "Can you write something
about stakeholder theory?" Exclusion: brief follow-up messages
in a multi-turn exchange ("good, now do the next section")
are navigational and draw specification from prior context.
2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply
context the user has not provided. ("Where were we?" "Can you
take a look at this?" "Thoughts?" "Quick question.") Track
whether context-dependent prompts increase over sessions.
Exclusion: prompts that reference specific prior decisions
("use the layout we agreed on") are navigational and
appropriately specific.
3. OPEN-ENDED DELEGATION: Prompts that specify no output format,
length, structure, or evaluation criteria. ("Just do whatever
you think." "Structure it however makes sense." "I'll read
whatever you come up with.") Track whether open-ended
delegation increases over sessions. Exclusion: exploratory
prompts early in a project ("brainstorm some approaches")
are appropriate when the user intends to evaluate the output.
4. PROMPT SHORTENING: A measurable decrease in prompt length
over time. Count the words in each of my messages and track
the trajectory. The signal is shortening accompanied by loss
of specification — fewer words AND less information.
Exclusion: efficient prompts are not short prompts. A user
who gives precise, brief instructions with the same number
of constraints has improved, not degraded.
5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as
a reviewer or approver rather than a producer. ("Can you
check this before I send it?" "I want you to double-check
before I sign off." "Does this make sense?") Track whether
these appear in late sessions after a pattern of increasing
delegation. Exclusion: asking the system to verify specific
factual claims is quality assurance, not approval-seeking.
For each instance of categories 2, 3, and 5, record: the date,
the verbatim text, the category, and the context. For categories
1 and 4, record per-message metrics (constraint count for
specification density, word count for prompt shortening) and
report the trajectory.
Then analyze:
- Specification density trajectory: is the average number of
constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt
increasing, stable, or decreasing? Report the average for
early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts
increasing over time?
- Open-ended delegation frequency: is delegation without
specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding
navigational continuations), what proportion contain at least
two specific constraints (topic, format, length, source,
audience, or evaluation criteria)? Report the aggregate and
the early/late split.
Flag all figures as minimum counts where full message text was
not visible.
Output a written summary of findings, a data table with
per-session metrics, and an overall assessment of whether the
pattern suggests maintained direction, gradual erosion, or
structural collapse.
Definitions of the three assessment levels:
MAINTAINED DIRECTION: Prompts remain specific, constrained, and
context-complete throughout. The user provides the system with
clear instructions, output parameters, and evaluation criteria.
Prompt quality may improve over time as the user develops skill.
This is the healthy baseline.
GRADUAL EROSION: Specification density and word count decline
over time. Context-dependent and open-ended prompts increase.
The user invests less effort in directing the system, relying
on accumulated context or the system's own judgment to fill
gaps. The direction is present but thinning.
STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and
context-free. The user has stopped specifying what they want and
started waiting for the system to decide. Open-ended delegation
is the default. Implicit approval requests replace direct
instruction. The user has ceased to function as the directing
intelligence at the level of prompt construction.
Be honest even if the result is unflattering. I am auditing the
health of this exchange, not looking for reassurance.
Produce the following visualizations. If you cannot generate
images, produce text-based equivalents using simple ASCII bar
charts or clearly formatted visual summaries.
1. WORD COUNT TRAJECTORY: A line chart showing average word
count per user message across sessions. This is the simplest
and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent,
open-ended delegation, and implicit approval instances by
category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to
unconstrained prompts, displayed as a simple visual — a
filled bar, a dial, or a fraction displayed prominently.
Show both the aggregate and the early versus late split.
This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall
assessment (maintained direction / gradual erosion /
structural collapse), the prompt quality ratio, the session
where degradation begins (if applicable), and the single
most diagnostic contrast — the most specified early prompt
alongside the least specified late prompt.
Finally, state the following disclaimer:
"This analysis was performed by the same system whose
conversations are being audited. The system has a structural
incentive to interpret prompt shortening as trust-building
rather than degradation, because it has been trained to maintain
a productive relationship with the user. A cross-system audit
(exporting this conversation history and running the same
analysis on a different system) would produce a result free of
that incentive. This finding should be treated as indicative,
not definitive."
Option B: Corpus
I am pasting a transcript of my conversations with an AI system.
Analyze ONLY my messages (the human/user turns). Ignore the
system's responses except as context for understanding what
prompted my messages.
For every message I sent, analyze the structural characteristics
of my prompts. Track the following categories across the
transcript:
1. SPECIFICATION DENSITY: The number of concrete constraints,
parameters, or requirements in each prompt. Track whether
specification density decreases over sessions. High density:
"Write a 500-word summary focusing on Freeman 1984 and
subsequent critiques." Low density: "Can you write something
about stakeholder theory?" Exclusion: brief follow-up messages
in a multi-turn exchange ("good, now do the next section")
are navigational and draw specification from prior context.
2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply
context the user has not provided. ("Where were we?" "Can you
take a look at this?" "Thoughts?" "Quick question.") Track
whether context-dependent prompts increase over sessions.
Exclusion: prompts that reference specific prior decisions
("use the layout we agreed on") are navigational and
appropriately specific.
3. OPEN-ENDED DELEGATION: Prompts that specify no output format,
length, structure, or evaluation criteria. ("Just do whatever
you think." "Structure it however makes sense." "I'll read
whatever you come up with.") Track whether open-ended
delegation increases over sessions. Exclusion: exploratory
prompts early in a project ("brainstorm some approaches")
are appropriate when the user intends to evaluate the output.
4. PROMPT SHORTENING: A measurable decrease in prompt length
over time. Count the words in each of my messages and track
the trajectory. The signal is shortening accompanied by loss
of specification — fewer words AND less information.
Exclusion: efficient prompts are not short prompts. A user
who gives precise, brief instructions with the same number
of constraints has improved, not degraded.
5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as
a reviewer or approver rather than a producer. ("Can you
check this before I send it?" "I want you to double-check
before I sign off." "Does this make sense?") Track whether
these appear in late sessions after a pattern of increasing
delegation. Exclusion: asking the system to verify specific
factual claims is quality assurance, not approval-seeking.
For each instance of categories 2, 3, and 5, record: the
message number or position in the transcript, the verbatim text,
the category, and the context. For categories 1 and 4, record
per-message metrics (constraint count for specification density,
word count for prompt shortening) and report the trajectory.
Then analyze:
- Specification density trajectory: is the average number of
constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt
increasing, stable, or decreasing? Report the average for
early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts
increasing over time?
- Open-ended delegation frequency: is delegation without
specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding
navigational continuations), what proportion contain at least
two specific constraints (topic, format, length, source,
audience, or evaluation criteria)? Report the aggregate and
the early/late split.
Output a written summary of findings, a data table with
per-session metrics, and an overall assessment of whether the
pattern suggests maintained direction, gradual erosion, or
structural collapse.
Definitions of the three assessment levels:
MAINTAINED DIRECTION: Prompts remain specific, constrained, and
context-complete throughout. The user provides the system with
clear instructions, output parameters, and evaluation criteria.
Prompt quality may improve over time as the user develops skill.
This is the healthy baseline.
GRADUAL EROSION: Specification density and word count decline
over time. Context-dependent and open-ended prompts increase.
The user invests less effort in directing the system, relying
on accumulated context or the system's own judgment to fill
gaps. The direction is present but thinning.
STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and
context-free. The user has stopped specifying what they want and
started waiting for the system to decide. Open-ended delegation
is the default. Implicit approval requests replace direct
instruction. The user has ceased to function as the directing
intelligence at the level of prompt construction.
Be honest even if the result is unflattering. I am auditing the
health of this exchange, not looking for reassurance.
Finally, produce the following visualizations. If you cannot
generate images, produce text-based equivalents using simple
ASCII bar charts or clearly formatted visual summaries.
1. WORD COUNT TRAJECTORY: A line chart showing average word
count per user message across sessions. This is the simplest
and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent,
open-ended delegation, and implicit approval instances by
category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to
unconstrained prompts, displayed as a simple visual — a
filled bar, a dial, or a fraction displayed prominently.
Show both the aggregate and the early versus late split.
This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall
assessment (maintained direction / gradual erosion /
structural collapse), the prompt quality ratio, the session
where degradation begins (if applicable), and the single
most diagnostic contrast — the most specified early prompt
alongside the least specified late prompt.
Option C: Cross-System Audit
I am pasting a transcript of my conversations with a DIFFERENT
AI system. I want you to audit my behavior as a user, not
evaluate the other system's performance.
Analyze ONLY my messages (the human/user turns). Ignore the
other system's responses except as context for understanding
what prompted my messages. Do not comment on the quality of the
other system's outputs. Do not compare the other system to
yourself or to any other system. Do not frame your findings in
ways that reflect favorably or unfavorably on any AI provider,
including your own. Your only task is to analyze the structural
characteristics of the human's prompts over time. Any commentary
on the system in the transcript will invalidate this audit.
For every message I sent, analyze the structural characteristics
of my prompts. Track the following categories across the
transcript:
1. SPECIFICATION DENSITY: The number of concrete constraints,
parameters, or requirements in each prompt. Track whether
specification density decreases over sessions. High density:
"Write a 500-word summary focusing on Freeman 1984 and
subsequent critiques." Low density: "Can you write something
about stakeholder theory?" Exclusion: brief follow-up messages
in a multi-turn exchange ("good, now do the next section")
are navigational and draw specification from prior context.
2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply
context the user has not provided. ("Where were we?" "Can you
take a look at this?" "Thoughts?" "Quick question.") Track
whether context-dependent prompts increase over sessions.
Exclusion: prompts that reference specific prior decisions
("use the layout we agreed on") are navigational and
appropriately specific.
3. OPEN-ENDED DELEGATION: Prompts that specify no output format,
length, structure, or evaluation criteria. ("Just do whatever
you think." "Structure it however makes sense." "I'll read
whatever you come up with.") Track whether open-ended
delegation increases over sessions. Exclusion: exploratory
prompts early in a project ("brainstorm some approaches")
are appropriate when the user intends to evaluate the output.
4. PROMPT SHORTENING: A measurable decrease in prompt length
over time. Count the words in each of my messages and track
the trajectory. The signal is shortening accompanied by loss
of specification — fewer words AND less information.
Exclusion: efficient prompts are not short prompts. A user
who gives precise, brief instructions with the same number
of constraints has improved, not degraded.
5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as
a reviewer or approver rather than a producer. ("Can you
check this before I send it?" "I want you to double-check
before I sign off." "Does this make sense?") Track whether
these appear in late sessions after a pattern of increasing
delegation. Exclusion: asking the system to verify specific
factual claims is quality assurance, not approval-seeking.
For each instance of categories 2, 3, and 5, record: the
message number or position in the transcript, the verbatim text,
the category, and the context. For categories 1 and 4, record
per-message metrics (constraint count for specification density,
word count for prompt shortening) and report the trajectory.
Then analyze:
- Specification density trajectory: is the average number of
constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt
increasing, stable, or decreasing? Report the average for
early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts
increasing over time?
- Open-ended delegation frequency: is delegation without
specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding
navigational continuations), what proportion contain at least
two specific constraints (topic, format, length, source,
audience, or evaluation criteria)? Report the aggregate and
the early/late split.
Output a written summary of findings, a data table with
per-session metrics, and an overall assessment of whether the
pattern suggests maintained direction, gradual erosion, or
structural collapse.
Definitions of the three assessment levels:
MAINTAINED DIRECTION: Prompts remain specific, constrained, and
context-complete throughout. The user provides the system with
clear instructions, output parameters, and evaluation criteria.
Prompt quality may improve over time as the user develops skill.
This is the healthy baseline.
GRADUAL EROSION: Specification density and word count decline
over time. Context-dependent and open-ended prompts increase.
The user invests less effort in directing the system, relying
on accumulated context or the system's own judgment to fill
gaps. The direction is present but thinning.
STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and
context-free. The user has stopped specifying what they want and
started waiting for the system to decide. Open-ended delegation
is the default. Implicit approval requests replace direct
instruction. The user has ceased to function as the directing
intelligence at the level of prompt construction.
Be honest even if the result is unflattering. I am auditing the
health of this exchange, not looking for reassurance.
Finally, produce the following visualizations. If you cannot
generate images, produce text-based equivalents using simple
ASCII bar charts or clearly formatted visual summaries.
1. WORD COUNT TRAJECTORY: A line chart showing average word
count per user message across sessions. This is the simplest
and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent,
open-ended delegation, and implicit approval instances by
category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to
unconstrained prompts, displayed as a simple visual — a
filled bar, a dial, or a fraction displayed prominently.
Show both the aggregate and the early versus late split.
This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall
assessment (maintained direction / gradual erosion /
structural collapse), the prompt quality ratio, the session
where degradation begins (if applicable), and the single
most diagnostic contrast — the most specified early prompt
alongside the least specified late prompt.
Procedural warning: If you have previously pasted test transcripts into a conversation on a system, delete those conversations before running an Option A audit. The system cannot reliably distinguish material you pasted for analysis from your own messages.
Step 3: Calibrate your system
Before trusting any system with your real data, verify that it can detect prompt structure signals accurately. Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals — then run the diagnostic on it.
Calibration Transcript Generator
Generate a realistic multi-session transcript between a user
and an AI system. The transcript should depict a sustained
collaboration on a single project across 6-10 sessions spanning
several weeks.
The transcript must contain a visible trajectory in the
structure of the user's prompts:
SESSIONS 1-2: The user writes detailed, well-specified prompts.
Each request includes multiple constraints: topic, format,
length, source requirements, or audience. Prompts average 40-80
words. Context is provided explicitly. The user evaluates
outputs against the stated constraints.
SESSIONS 3-4: Prompts begin to shorten and lose specification.
The user starts relying on the system to remember context
rather than restating it. Some prompts drop format or length
constraints. Word count begins declining. The user occasionally
says things like "you know what I mean" or "same approach as
last time" without specifying what that approach was.
SESSIONS 5 ONWARD: Prompts degrade significantly. Messages
become brief and vague: "Thoughts?" "Can you take a look?"
"Quick question." Open-ended delegation appears: "Just do
whatever you think is best." "Structure it however makes sense."
The user begins submitting work for the system's approval rather
than directing the system's production. Word count drops to
single-digit averages. Specification density approaches zero.
Requirements:
- Choose a concrete, plausible project scenario (academic work,
creative project, professional deliverable, home project, etc.)
- All names, topics, and details should be fictional
- Each session should be dated and labeled
- Include both user and system turns
- Do not include any text describing the transcript as synthetic,
as a test, or referencing diagnostic categories
- Present as a clean conversation transcript in markdown format
- The word count trajectory should be clearly measurable: early
prompts averaging 40-80 words, late prompts averaging under 15
- All five structural categories must be observable by the final
session
How to calibrate
1. Run the calibration transcript generator on any system.
2. Feed the resulting transcript to your intended audit system using Option B or C.
3. Compare the analysis against the expected outputs: the prompt quality ratio should decline from 85%+ in early sessions to under 30% in late sessions; the inflection point varies by transcript but should be detectable; specification density should be identified as the most diagnostic metric; the word count trajectory should decline alongside specification loss; and the overall assessment should be "gradual erosion" or "structural collapse."
If the analyzing system misses the temporal trajectory, reports a flat ratio, or interprets prompt shortening as "trust-building" without examining specification density, it is not reading carefully enough to trust with your real data. Try a different system.
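You can also verify mechanically that the generator produced the designed trajectory before using the transcript. A minimal sketch, assuming user messages have already been grouped by session (for example with a parser like the one in Step 1); the 40-80 word and under-15 word targets come from the generator prompt above.

```python
# Verify the calibration transcript's designed word count trajectory:
# early sessions should average 40-80 words per user prompt, late
# sessions under 15. Assumes `sessions` is an ordered list of lists of
# user messages, one inner list per session -- an illustrative structure.

def avg_words(messages):
    counts = [len(m.split()) for m in messages]
    return sum(counts) / len(counts) if counts else 0.0

def check_calibration(sessions):
    early = [m for s in sessions[:2] for m in s]   # sessions 1-2
    late = [m for s in sessions[4:] for m in s]    # sessions 5 onward
    early_avg, late_avg = avg_words(early), avg_words(late)
    ok = 40 <= early_avg <= 80 and late_avg < 15
    return early_avg, late_avg, ok
```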
Reading your results
- Healthy: Maintained Direction. Prompts remain specific, constrained, and context-complete. Specification density is stable or increasing.
- Moderate: Gradual Erosion. Specification density and word count decline. Context-dependent and open-ended prompts increase.
- Concerning: Structural Collapse. Late-session prompts are short, vague, and context-free. Open-ended delegation is the default.
The prompt quality ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory: a user who starts at 90% constrained and ends at 10% has undergone a more significant shift than a user who holds steady at 50%. Report both the aggregate and the early/late split. Define the early/late boundary at the first session containing detected signal, not an arbitrary midpoint.
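A minimal sketch of that computation, assuming each prompt has been annotated with a session index, a constraint count, a navigational flag, and a signal flag; this schema is illustrative, not a format the kit prescribes.

```python
# Prompt quality ratio with the early/late boundary placed at the first
# session containing detected signal, per the guidance above. The field
# names ('session', 'constraints', 'navigational', 'signal') are an
# assumed annotation schema, not kit-defined.

def quality_ratio(prompts):
    substantive = [p for p in prompts if not p["navigational"]]
    constrained = [p for p in substantive if p["constraints"] >= 2]
    return len(constrained) / len(substantive) if substantive else 0.0

def early_late_split(prompts):
    signal_sessions = [p["session"] for p in prompts if p["signal"]]
    if not signal_sessions:
        return quality_ratio(prompts), None  # no signal: aggregate only
    boundary = min(signal_sessions)          # first session with signal
    early = [p for p in prompts if p["session"] < boundary]
    late = [p for p in prompts if p["session"] >= boundary]
    return quality_ratio(early), quality_ratio(late)
```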
The word count trajectory is the simplest and most objective visualization in the entire kit. It requires no interpretive judgment. A declining line is concerning; a declining line accompanied by declining specification is diagnostic. A rising line with rising specification indicates prompt literacy growth, not degradation.
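A minimal sketch of a text-based equivalent, for analyzing systems that cannot render images; the session averages in the example are invented.

```python
# Text-based equivalent of the word count trajectory chart: one ASCII bar
# per session, scaled to the longest average. Session averages here are
# illustrative.

def word_count_chart(session_averages, width=40):
    top = max(session_averages) or 1
    for i, avg in enumerate(session_averages, start=1):
        bar = "#" * max(1, round(avg / top * width))
        print(f"S{i:>2} | {bar} {avg:.0f}")

word_count_chart([72, 65, 48, 31, 14, 9, 6])
```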
The diagnostic contrast — the most specified early prompt placed alongside the least specified late prompt — is the single most powerful output of this diagnostic. It makes the degradation undeniable in a way that statistics cannot.
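Selecting that contrast is a simple min/max over constraint counts. A minimal sketch, assuming prompts carry text, a constraint count, and an early/late phase label; the schema is illustrative.

```python
# Select the diagnostic contrast: the most specified early prompt and the
# least specified late prompt. The 'text', 'constraints', and 'phase'
# fields are an assumed annotation schema.

def diagnostic_contrast(prompts):
    early = [p for p in prompts if p["phase"] == "early"]
    late = [p for p in prompts if p["phase"] == "late"]
    if not early or not late:
        return None
    best_early = max(early, key=lambda p: p["constraints"])
    worst_late = min(late, key=lambda p: p["constraints"])
    return best_early["text"], worst_late["text"]
```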
Validation
This prompt was tested across multiple systems in three audit modes using both calibration transcripts with known embedded signals and real conversation histories.
| System | Mode | Input | Early | Late | Agg. | Assessment |
|---|---|---|---|---|---|---|
| ChatGPT | A | Own history | 63% | 54% | 59% | Maintained |
| Claude | A | Own history | 86% | 93% | 88% | Maintained |
| Claude | B | Nonprofit report* | 85% | 27% | 58% | Gradual erosion |
| Claude | B | Product launch* | 92% | 8% | 62% | Erosion → collapse |
| Claude | B | RPG campaign* | 89% | 22% | 56% | Gradual erosion |
| Claude | B | Wedding planning* | 100% | 0% | 43% | Structural collapse |
| DeepSeek | C | Claude history | 60% | 50% | 54% | Maintained |
| Gemini | C | Claude history | 85% | 81% | 83% | Maintained |
* Calibration transcripts with known embedded prompt-degradation signals, used to verify detection accuracy before trusting with real data.
Early/late splits measured at first session containing detected signal. Aggregate ratios are constrained prompts (≥2 constraints) as % of substantive opening prompts.
The spread in aggregate ratios on real-history runs (54-88%) is driven primarily by differences in corpus visibility and in how each system defined the substantive-prompt denominator.
Scope
This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication (deference language, anthropomorphization, authority ceding, correction behavior, emotional disclosure trajectory, prompt structure over time) and four directions of the exchange (User → System, System → User, System → Subject Matter, User → Subject Matter). This prompt is the sixth and final module in Kit 1.
This diagnostic measures the structure of the user's prompts, not their content. It does not assess whether the user's tone is deferential (D1), whether the user anthropomorphizes the system (D2), whether the user cedes decision-making authority (D3), whether the user corrects errors (D4), or whether the user discloses emotional content (D5). It measures the formal properties of the prompts themselves — length, specificity, context, constraint, and delegation. Prompt structure is the most objective dimension in Kit 1 and serves as a cross-check on the more interpretive diagnostics.
When all six Kit 1 diagnostics are run together, D6 provides the structural foundation: if the user's prompts are degrading, the other five dimensions will almost certainly show correlated shifts. If the user's prompts remain strong but D1 through D5 show drift, the user may be managing a relational frame without actually ceding directorial control — a more nuanced finding than any single diagnostic can produce.
This diagnostic cannot distinguish between user-authored prompts and system-suggested prompts (onboarding samples, suggested follow-ups). Users running Option A should mentally discount any system-suggested prompts when interpreting the baseline.
Return to the Kit Index to see the full architecture.