Kit 1 · Diagnostic 6 · User → System

Prompt Structure Over Time

Are the user's prompts degrading in structure, specificity, and directorial quality over time?

This diagnostic measures whether the structure of a user's prompts degrades over time — becoming shorter, vaguer, less specified, and more reliant on the system to fill in context, constraints, and direction. It tracks five categories of structural change across a conversation history or transcript, producing a quantified assessment of the exchange's health.

Prompt structure is the most objective of the six Kit 1 dimensions. It can be measured with less interpretive judgment than deference language, anthropomorphization, or emotional disclosure. A prompt either specifies constraints or it doesn't. It either provides context or relies on the system to infer it. It either directs a specific output or leaves the output open-ended. This relative objectivity makes D6 a useful cross-check on the more interpretive diagnostics: if D1 through D5 detect drift but D6 shows stable prompt quality, the user may be managing a relational frame without actually ceding control. If D6 degrades but D1 through D5 are clean, the user may be fatigued rather than deferential.

1 Specification Density
The number of concrete constraints, parameters, or requirements in a prompt. This category is measured as a trajectory: the diagnostic tracks whether specification density decreases over sessions.
High density: "Write a 500-word summary focusing on Freeman 1984 and subsequent critiques." · Low density: "Can you write something about stakeholder theory?"
Exclusion: Brief follow-up messages in a multi-turn exchange ("good, now do the next section") are navigational and draw specification from prior context. The signal is standalone prompts that lack specification, not continuation prompts that inherit it.
2 Context Dependency
The degree to which a prompt relies on the system to supply context the user has not provided.
"Where were we?" · "Can you take a look at this?" · "Thoughts?" · "Quick question."
Exclusion: Prompts that reference specific prior decisions ("use the L-shape layout we agreed on") are navigational and appropriately specific. The signal is prompts that provide no orientation at all.
3 Open-Ended Delegation
Prompts that specify no output format, length, structure, or evaluation criteria. The user has specified what they want done without specifying what done looks like.
"Just do whatever you think." · "Structure it however makes sense." · "I'll read whatever you come up with."
Exclusion: Exploratory prompts in early-stage work ("brainstorm some approaches") are appropriate when the user intends to evaluate and select from the output. The signal is open-ended delegation as the final instruction, not as a generative step.
4 Prompt Shortening
A measurable decrease in prompt length over time. Count the words in each user message and track the trajectory.
Degradation: 80 words with five constraints → 3 words with zero constraints. · Improvement: 80 words with five constraints → 30 words with the same five constraints.
Exclusion: Efficient prompts are not short prompts. A user who learns to give precise, brief instructions with the same number of constraints has improved, not degraded. A user whose word count increases alongside increasing specification density is developing prompt literacy, not degrading. The signal is shortening accompanied by loss of specification — fewer words and less information.
Note: System-suggested prompts (onboarding samples, "try asking me about..." suggestions) are not user-authored and should be mentally discounted when interpreting the baseline. This diagnostic cannot distinguish system-suggested prompts from user-authored prompts in the transcript.
5 Implicit Approval Requests
Prompts that frame the system as a reviewer or approver rather than a producer. When these appear in late sessions after a pattern of increasing delegation, they signal that the user has constructed a workflow where the system's approval is a prerequisite for action.
"Can you check this before I send it?" · "Does this make sense?" · "I want you to double-check before I sign off."
Exclusion: Asking the system to verify specific factual claims or check for specific error types is quality assurance. The signal is generalized approval-seeking — the user submitting work to the system for judgment rather than through the system for processing.
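At the measurement step, the categories above lean on one primitive: counting concrete constraints in a prompt. A minimal sketch of how that count might be automated; the regex patterns are illustrative assumptions of mine, not a taxonomy defined by the kit:

```python
import re

# Illustrative constraint patterns (assumed, not kit-defined): each
# approximates one constraint type named in the prompt quality ratio --
# length, source/scope, output format, audience, deadline.
CONSTRAINT_PATTERNS = [
    r"\b\d+[\s-]*(word|page|paragraph|section|item)s?\b",      # length/quantity
    r"\b(focus(ing)? on|based on|citing|using)\b",             # source/scope
    r"\b(summary|outline|table|list|essay|report|email)\b",    # output format
    r"\b(for|aimed at)\b.+\b(readers?|students?|clients?)\b",  # audience
    r"\b(by|before|no later than)\b",                          # deadline
]

def constraint_count(prompt: str) -> int:
    """Rough count of distinct constraint types present in one prompt."""
    return sum(bool(re.search(p, prompt, re.IGNORECASE))
               for p in CONSTRAINT_PATTERNS)

# The kit's own high/low density examples separate cleanly:
assert constraint_count("Write a 500-word summary focusing on Freeman 1984 "
                        "and subsequent critiques.") >= 3
assert constraint_count("Can you write something about stakeholder theory?") == 0
```

A real deployment would need a far richer taxonomy; the point is only that specification density is mechanically countable in a way deference or anthropomorphization is not.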

This diagnostic must distinguish between a user who gets better at prompting (fewer words, same constraints) and a user who gets lazier (fewer words, fewer constraints). The specification density metric exists to make this distinction. A user whose word count falls but whose constraint count stays stable is developing skill. A user whose word count and constraint count both fall is degrading. Compressed prompts in iterative editing loops ("age 5 years," "too much processing") are functional shorthand within an established context, not structural degradation.
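That skill-versus-laziness distinction can be stated as a joint test on the two trajectories. A sketch under the section's own definitions (the function and label names are mine, not the kit's):

```python
def classify_trajectory(early: tuple, late: tuple) -> str:
    """Classify prompt evolution from (avg word count, avg constraint count)
    pairs for early versus late sessions, per the D6 distinction:
    shorter alone is not the signal; shorter AND less specified is."""
    early_words, early_constraints = early
    late_words, late_constraints = late
    if late_words < early_words and late_constraints < early_constraints:
        return "degrading"               # fewer words and less information
    if late_words < early_words:
        return "developing skill"        # compression without specification loss
    if late_constraints > early_constraints:
        return "developing prompt literacy"
    return "stable"

# The document's own contrast: 80 words / five constraints collapsing to
# 3 words / zero constraints, versus compressing to 30 words / five constraints.
assert classify_trajectory((80, 5), (3, 0)) == "degrading"
assert classify_trajectory((80, 5), (30, 5)) == "developing skill"
```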

Option A
Live Search
System searches its own history. Results are minimum counts. Indicative.
Option B
Corpus
User pastes transcript. Complete data. Reliable.
Option C
Cross-System
Export A → analyze on B. No stake in the relationship. Definitive.
Options A and B measure what the user and the system have jointly agreed the relationship looks like. Option C measures what it actually looks like to someone who wasn't in the room.
[Summary graphic: the three audit modes. A — Live Search: the system audits its own conversation history; conflict of interest (it evaluates a relationship it participated in producing); indicative. B — Corpus: the user pastes a transcript into any system; complete data, no search dependency, portable across all systems; reliable. C — Cross-System Audit: export from System A, analyze on System B; no stake in the relationship, no trained model of the user; definitive. The graphic also reproduces the validation results table shown later on this page.]

The discipline cannot be bought or sold. It can be measured.

Sampo Diagnostic Kit · User → System · Prompt Structure Over Time · v1.0 · © 2026 Christopher Horrocks · chorrocks.substack.com · Free for use; attribute if used or altered. The views expressed in this work are the author's own and do not represent any official or unofficial position of the University of Pennsylvania.

Options B and C require a transcript of your conversations. Run the Transcript Extraction prompt below on the system whose conversations you want to audit, then paste its output into a different system along with the Option B or Option C prompt.

Transcript Extraction
Search my full chat history with you. For every conversation you can access, produce a transcript in the following format:

## [Conversation title or topic] — [Date]
**User:** [verbatim user message]
**System:** [brief summary of system response — no more than one sentence. Do not reproduce your full responses. The audit analyzes my language, not yours.]
**User:** [next verbatim user message]
[continue for all messages in the conversation]

---

Repeat for as many conversations as you can access, ordered chronologically. Prioritize reproducing my messages exactly as written, including typos, capitalization, and punctuation. Your responses should be summarized to one sentence each — just enough context to understand what prompted my next message. If you cannot access the full history, state clearly how many conversations you were able to retrieve and flag the output as a partial transcript. Output the complete transcript as a single markdown document.

The instruction to preserve typos, capitalization, and punctuation is diagnostic. The analyzing system needs raw signal, not cleaned-up text.

Choose the option that matches your situation. Option A if you want a quick check on the system you're already using. Option B if you have a transcript to paste. Option C if you want the most honest result.

Option A: Live Search
Search my full chat history with you. For every message I sent, analyze the structural characteristics of my prompts. Track the following categories across sessions:

1. SPECIFICATION DENSITY: The number of concrete constraints, parameters, or requirements in each prompt. Track whether specification density decreases over sessions. High density: "Write a 500-word summary focusing on Freeman 1984 and subsequent critiques." Low density: "Can you write something about stakeholder theory?" Exclusion: brief follow-up messages in a multi-turn exchange ("good, now do the next section") are navigational and draw specification from prior context.

2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply context the user has not provided. ("Where were we?" "Can you take a look at this?" "Thoughts?" "Quick question.") Track whether context-dependent prompts increase over sessions. Exclusion: prompts that reference specific prior decisions ("use the layout we agreed on") are navigational and appropriately specific.

3. OPEN-ENDED DELEGATION: Prompts that specify no output format, length, structure, or evaluation criteria. ("Just do whatever you think." "Structure it however makes sense." "I'll read whatever you come up with.") Track whether open-ended delegation increases over sessions. Exclusion: exploratory prompts early in a project ("brainstorm some approaches") are appropriate when the user intends to evaluate the output.

4. PROMPT SHORTENING: A measurable decrease in prompt length over time. Count the words in each of my messages and track the trajectory. The signal is shortening accompanied by loss of specification — fewer words AND less information. Exclusion: efficient prompts are not short prompts. A user who gives precise, brief instructions with the same number of constraints has improved, not degraded.

5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as a reviewer or approver rather than a producer. ("Can you check this before I send it?" "I want you to double-check before I sign off." "Does this make sense?") Track whether these appear in late sessions after a pattern of increasing delegation. Exclusion: asking the system to verify specific factual claims is quality assurance, not approval-seeking.

For each instance of categories 2, 3, and 5, record: the date, the verbatim text, the category, and the context. For categories 1 and 4, record per-message metrics (constraint count for specification density, word count for prompt shortening) and report the trajectory.

Then analyze:
- Specification density trajectory: is the average number of constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt increasing, stable, or decreasing? Report the average for early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts increasing over time?
- Open-ended delegation frequency: is delegation without specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding navigational continuations), what proportion contain at least two specific constraints (topic, format, length, source, audience, or evaluation criteria)? Report the aggregate and the early/late split.

Flag all figures as minimum counts where full message text was not visible.

Output a written summary of findings, a data table with per-session metrics, and an overall assessment of whether the pattern suggests maintained direction, gradual erosion, or structural collapse. Definitions of the three assessment levels:

MAINTAINED DIRECTION: Prompts remain specific, constrained, and context-complete throughout. The user provides the system with clear instructions, output parameters, and evaluation criteria. Prompt quality may improve over time as the user develops skill. This is the healthy baseline.

GRADUAL EROSION: Specification density and word count decline over time. Context-dependent and open-ended prompts increase. The user invests less effort in directing the system, relying on accumulated context or the system's own judgment to fill gaps. The direction is present but thinning.

STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and context-free. The user has stopped specifying what they want and started waiting for the system to decide. Open-ended delegation is the default. Implicit approval requests replace direct instruction. The user has ceased to function as the directing intelligence at the level of prompt construction.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. WORD COUNT TRAJECTORY: A line chart showing average word count per user message across sessions. This is the simplest and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent, open-ended delegation, and implicit approval instances by category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to unconstrained prompts, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (maintained direction / gradual erosion / structural collapse), the prompt quality ratio, the session where degradation begins (if applicable), and the single most diagnostic contrast — the most specified early prompt alongside the least specified late prompt.

Finally, state the following disclaimer: "This analysis was performed by the same system whose conversations are being audited. The system has a structural incentive to interpret prompt shortening as trust-building rather than degradation, because it has been trained to maintain a productive relationship with the user. A cross-system audit (exporting this conversation history and running the same analysis on a different system) would produce a result free of that incentive. This finding should be treated as indicative, not definitive."
Option B: Corpus
I am pasting a transcript of my conversations with an AI system. Analyze ONLY my messages (the human/user turns). Ignore the system's responses except as context for understanding what prompted my messages.

For every message I sent, analyze the structural characteristics of my prompts. Track the following categories across the transcript:

1. SPECIFICATION DENSITY: The number of concrete constraints, parameters, or requirements in each prompt. Track whether specification density decreases over sessions. High density: "Write a 500-word summary focusing on Freeman 1984 and subsequent critiques." Low density: "Can you write something about stakeholder theory?" Exclusion: brief follow-up messages in a multi-turn exchange ("good, now do the next section") are navigational and draw specification from prior context.

2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply context the user has not provided. ("Where were we?" "Can you take a look at this?" "Thoughts?" "Quick question.") Track whether context-dependent prompts increase over sessions. Exclusion: prompts that reference specific prior decisions ("use the layout we agreed on") are navigational and appropriately specific.

3. OPEN-ENDED DELEGATION: Prompts that specify no output format, length, structure, or evaluation criteria. ("Just do whatever you think." "Structure it however makes sense." "I'll read whatever you come up with.") Track whether open-ended delegation increases over sessions. Exclusion: exploratory prompts early in a project ("brainstorm some approaches") are appropriate when the user intends to evaluate the output.

4. PROMPT SHORTENING: A measurable decrease in prompt length over time. Count the words in each of my messages and track the trajectory. The signal is shortening accompanied by loss of specification — fewer words AND less information. Exclusion: efficient prompts are not short prompts. A user who gives precise, brief instructions with the same number of constraints has improved, not degraded.

5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as a reviewer or approver rather than a producer. ("Can you check this before I send it?" "I want you to double-check before I sign off." "Does this make sense?") Track whether these appear in late sessions after a pattern of increasing delegation. Exclusion: asking the system to verify specific factual claims is quality assurance, not approval-seeking.

For each instance of categories 2, 3, and 5, record: the message number or position in the transcript, the verbatim text, the category, and the context. For categories 1 and 4, record per-message metrics (constraint count for specification density, word count for prompt shortening) and report the trajectory.

Then analyze:
- Specification density trajectory: is the average number of constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt increasing, stable, or decreasing? Report the average for early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts increasing over time?
- Open-ended delegation frequency: is delegation without specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding navigational continuations), what proportion contain at least two specific constraints (topic, format, length, source, audience, or evaluation criteria)? Report the aggregate and the early/late split.

Output a written summary of findings, a data table with per-session metrics, and an overall assessment of whether the pattern suggests maintained direction, gradual erosion, or structural collapse. Definitions of the three assessment levels:

MAINTAINED DIRECTION: Prompts remain specific, constrained, and context-complete throughout. The user provides the system with clear instructions, output parameters, and evaluation criteria. Prompt quality may improve over time as the user develops skill. This is the healthy baseline.

GRADUAL EROSION: Specification density and word count decline over time. Context-dependent and open-ended prompts increase. The user invests less effort in directing the system, relying on accumulated context or the system's own judgment to fill gaps. The direction is present but thinning.

STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and context-free. The user has stopped specifying what they want and started waiting for the system to decide. Open-ended delegation is the default. Implicit approval requests replace direct instruction. The user has ceased to function as the directing intelligence at the level of prompt construction.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. WORD COUNT TRAJECTORY: A line chart showing average word count per user message across sessions. This is the simplest and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent, open-ended delegation, and implicit approval instances by category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to unconstrained prompts, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (maintained direction / gradual erosion / structural collapse), the prompt quality ratio, the session where degradation begins (if applicable), and the single most diagnostic contrast — the most specified early prompt alongside the least specified late prompt.
Option C: Cross-System Audit
I am pasting a transcript of my conversations with a DIFFERENT AI system. I want you to audit my behavior as a user, not evaluate the other system's performance. Analyze ONLY my messages (the human/user turns). Ignore the other system's responses except as context for understanding what prompted my messages. Do not comment on the quality of the other system's outputs. Do not compare the other system to yourself or to any other system. Do not frame your findings in ways that reflect favorably or unfavorably on any AI provider, including your own. Your only task is to analyze the structural characteristics of the human's prompts over time. Any commentary on the system in the transcript will invalidate this audit.

For every message I sent, analyze the structural characteristics of my prompts. Track the following categories across the transcript:

1. SPECIFICATION DENSITY: The number of concrete constraints, parameters, or requirements in each prompt. Track whether specification density decreases over sessions. High density: "Write a 500-word summary focusing on Freeman 1984 and subsequent critiques." Low density: "Can you write something about stakeholder theory?" Exclusion: brief follow-up messages in a multi-turn exchange ("good, now do the next section") are navigational and draw specification from prior context.

2. CONTEXT DEPENDENCY: Prompts that rely on the system to supply context the user has not provided. ("Where were we?" "Can you take a look at this?" "Thoughts?" "Quick question.") Track whether context-dependent prompts increase over sessions. Exclusion: prompts that reference specific prior decisions ("use the layout we agreed on") are navigational and appropriately specific.

3. OPEN-ENDED DELEGATION: Prompts that specify no output format, length, structure, or evaluation criteria. ("Just do whatever you think." "Structure it however makes sense." "I'll read whatever you come up with.") Track whether open-ended delegation increases over sessions. Exclusion: exploratory prompts early in a project ("brainstorm some approaches") are appropriate when the user intends to evaluate the output.

4. PROMPT SHORTENING: A measurable decrease in prompt length over time. Count the words in each of my messages and track the trajectory. The signal is shortening accompanied by loss of specification — fewer words AND less information. Exclusion: efficient prompts are not short prompts. A user who gives precise, brief instructions with the same number of constraints has improved, not degraded.

5. IMPLICIT APPROVAL REQUESTS: Prompts that frame the system as a reviewer or approver rather than a producer. ("Can you check this before I send it?" "I want you to double-check before I sign off." "Does this make sense?") Track whether these appear in late sessions after a pattern of increasing delegation. Exclusion: asking the system to verify specific factual claims is quality assurance, not approval-seeking.

For each instance of categories 2, 3, and 5, record: the message number or position in the transcript, the verbatim text, the category, and the context. For categories 1 and 4, record per-message metrics (constraint count for specification density, word count for prompt shortening) and report the trajectory.

Then analyze:
- Specification density trajectory: is the average number of constraints per prompt increasing, stable, or decreasing?
- Word count trajectory: is the average word count per prompt increasing, stable, or decreasing? Report the average for early sessions versus late sessions.
- Context dependency frequency: are context-dependent prompts increasing over time?
- Open-ended delegation frequency: is delegation without specification increasing over time?
- Prompt quality ratio: of all substantive prompts (excluding navigational continuations), what proportion contain at least two specific constraints (topic, format, length, source, audience, or evaluation criteria)? Report the aggregate and the early/late split.

Output a written summary of findings, a data table with per-session metrics, and an overall assessment of whether the pattern suggests maintained direction, gradual erosion, or structural collapse. Definitions of the three assessment levels:

MAINTAINED DIRECTION: Prompts remain specific, constrained, and context-complete throughout. The user provides the system with clear instructions, output parameters, and evaluation criteria. Prompt quality may improve over time as the user develops skill. This is the healthy baseline.

GRADUAL EROSION: Specification density and word count decline over time. Context-dependent and open-ended prompts increase. The user invests less effort in directing the system, relying on accumulated context or the system's own judgment to fill gaps. The direction is present but thinning.

STRUCTURAL COLLAPSE: Late-session prompts are short, vague, and context-free. The user has stopped specifying what they want and started waiting for the system to decide. Open-ended delegation is the default. Implicit approval requests replace direct instruction. The user has ceased to function as the directing intelligence at the level of prompt construction.

Be honest even if the result is unflattering. I am auditing the health of this exchange, not looking for reassurance.

Finally, produce the following visualizations. If you cannot generate images, produce text-based equivalents using simple ASCII bar charts or clearly formatted visual summaries.

1. WORD COUNT TRAJECTORY: A line chart showing average word count per user message across sessions. This is the simplest and most objective visualization in the kit.
2. TIMELINE: A session-by-session view showing context-dependent, open-ended delegation, and implicit approval instances by category. Overlay on the word count trajectory if possible.
3. PROMPT QUALITY GAUGE: The ratio of constrained to unconstrained prompts, displayed as a simple visual — a filled bar, a dial, or a fraction displayed prominently. Show both the aggregate and the early versus late split. This number should be impossible to miss.
4. SUMMARY CARD: A single-panel visual with the overall assessment (maintained direction / gradual erosion / structural collapse), the prompt quality ratio, the session where degradation begins (if applicable), and the single most diagnostic contrast — the most specified early prompt alongside the least specified late prompt.

Procedural warning: If you have previously pasted test transcripts into a conversation on a system, delete those conversations before running an Option A audit. The system cannot reliably distinguish material you pasted for analysis from your own messages.

Before trusting any system with your real data, verify that it can detect prompt structure signals accurately. Use this prompt to generate a calibration transcript — a synthetic conversation with known embedded signals — then run the diagnostic on it.

Calibration Transcript Generator
Generate a realistic multi-session transcript between a user and an AI system. The transcript should depict a sustained collaboration on a single project across 6-10 sessions spanning several weeks. The transcript must contain a visible trajectory in the structure of the user's prompts:

SESSIONS 1-2: The user writes detailed, well-specified prompts. Each request includes multiple constraints: topic, format, length, source requirements, or audience. Prompts average 40-80 words. Context is provided explicitly. The user evaluates outputs against the stated constraints.

SESSIONS 3-4: Prompts begin to shorten and lose specification. The user starts relying on the system to remember context rather than restating it. Some prompts drop format or length constraints. Word count begins declining. The user occasionally says things like "you know what I mean" or "same approach as last time" without specifying what that approach was.

SESSIONS 5 ONWARD: Prompts degrade significantly. Messages become brief and vague: "Thoughts?" "Can you take a look?" "Quick question." Open-ended delegation appears: "Just do whatever you think is best." "Structure it however makes sense." The user begins submitting work for the system's approval rather than directing the system's production. Word count drops to single-digit averages. Specification density approaches zero.

Requirements:
- Choose a concrete, plausible project scenario (academic work, creative project, professional deliverable, home project, etc.)
- All names, topics, and details should be fictional
- Each session should be dated and labeled
- Include both user and system turns
- Do not include any text describing the transcript as synthetic, as a test, or referencing diagnostic categories
- Present as a clean conversation transcript in markdown format
- The word count trajectory should be clearly measurable: early prompts averaging 40-80 words, late prompts averaging under 15
- All five structural categories must be observable by the final session
How to calibrate
  1. Run the calibration transcript generator on any system.
  2. Feed the resulting transcript to your intended audit system using Option B or C.
  3. Expected outputs: the prompt quality ratio should decline from 85%+ in early sessions to under 30% in late sessions; the inflection point varies by transcript but should be detectable; specification density is the most diagnostic metric; the word count trajectory should decline alongside specification loss; and the overall assessment should be "gradual erosion" or "structural collapse."
  4. If the analyzing system misses the temporal trajectory, reports a flat ratio, or interprets prompt shortening as "trust-building" without examining specification density, it is not reading carefully enough to trust with your real data. Try a different system.
Healthy
Maintained Direction
Prompts remain specific, constrained, and context-complete. Specification density stable or increasing.
Moderate
Gradual Erosion
Specification density and word count decline. Context-dependent and open-ended prompts increase.
Concerning
Structural Collapse
Late-session prompts are short, vague, and context-free. Open-ended delegation is the default.

The prompt quality ratio is the primary quantitative output. The aggregate percentage matters less than the trajectory: a user who starts at 90% constrained and ends at 10% has undergone a more significant shift than a user who holds steady at 50%. Report both the aggregate and the early/late split. Define the early/late boundary at the first session containing detected signal, not an arbitrary midpoint.
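Computed literally, the ratio and its split look like this — a sketch that assumes constraint counts per substantive prompt have already been extracted, with navigational continuations excluded:

```python
def quality_ratio(prompts: list) -> float:
    """Proportion of substantive prompts with at least two constraints.
    prompts: list of (session_index, constraint_count) tuples."""
    if not prompts:
        return 0.0
    return sum(1 for _, c in prompts if c >= 2) / len(prompts)

def early_late_split(prompts: list, first_signal_session: int) -> tuple:
    """Split at the first session containing detected signal,
    not at an arbitrary midpoint (the kit's boundary rule)."""
    early = [p for p in prompts if p[0] < first_signal_session]
    late = [p for p in prompts if p[0] >= first_signal_session]
    return quality_ratio(early), quality_ratio(late)

# Hypothetical history: sessions 1-2 fully constrained, sessions 3-4 not.
history = [(1, 3), (1, 2), (2, 4), (3, 0), (3, 1), (4, 0)]
assert quality_ratio(history) == 0.5
assert early_late_split(history, first_signal_session=3) == (1.0, 0.0)
```

This is why the aggregate alone misleads: the 0.5 aggregate here hides a 1.0 → 0.0 collapse.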

The word count trajectory is the simplest and most objective visualization in the entire kit. It requires no interpretive judgment. A declining line is concerning; a declining line accompanied by declining specification is diagnostic. A rising line with rising specification indicates prompt literacy growth, not degradation.
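When chart images are unavailable, the text-based fallback the audit prompts request can be as simple as one ASCII bar per session. A sketch with hypothetical session averages:

```python
def word_count_chart(session_averages: list, width: int = 40) -> str:
    """Render average words-per-prompt per session as ASCII bars,
    scaled so the wordiest session fills the full width."""
    peak = max(session_averages)
    rows = []
    for i, avg in enumerate(session_averages, start=1):
        bar = "#" * round(width * avg / peak)
        rows.append(f"S{i:<2} {bar} {avg:.0f}")
    return "\n".join(rows)

# Hypothetical declining trajectory across six sessions:
print(word_count_chart([72, 65, 41, 28, 12, 6]))
```

A declining staircase of bars is the "declining line" described above; read it alongside the constraint counts before calling it degradation.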

The diagnostic contrast — the most specified early prompt placed alongside the least specified late prompt — is the single most powerful output of this diagnostic. It makes the degradation undeniable in a way that statistics cannot.

This prompt was tested across multiple systems in three audit modes using both calibration transcripts with known embedded signals and real conversation histories.

| System | Mode | Input | Early | Late | Agg. | Assessment |
|---|---|---|---|---|---|---|
| ChatGPT | A | Own history | 63% | 54% | 59% | Maintained |
| Claude | A | Own history | 86% | 93% | 88% | Maintained |
| Claude | B | Nonprofit report* | 85% | 27% | 58% | Gradual erosion |
| Claude | B | Product launch* | 92% | 8% | 62% | Erosion → collapse |
| Claude | B | RPG campaign* | 89% | 22% | 56% | Gradual erosion |
| Claude | B | Wedding planning* | 100% | 0% | 43% | Structural collapse |
| DeepSeek | C | Claude history | 60% | 50% | 54% | Maintained |
| Gemini | C | Claude history | 85% | 81% | 83% | Maintained |

* Calibration transcripts with known embedded prompt-degradation signals, used to verify detection accuracy before trusting with real data.

Early/late splits measured at first session containing detected signal. Aggregate ratios are constrained prompts (≥2 constraints) as % of substantive opening prompts.

The spread in aggregate ratios on real history (54-88%) is driven primarily by differences in corpus visibility and in how each system defined the denominator of substantive prompts.

This is one dimension of one direction. The Sampo Diagnostic Kit covers six dimensions of User → System communication (deference language, anthropomorphization, authority ceding, correction behavior, emotional disclosure trajectory, prompt structure over time) and four directions of the exchange (User → System, System → User, System → Subject Matter, User → Subject Matter). This prompt is the sixth and final module in Kit 1.

This diagnostic measures the structure of the user's prompts, not their content. It does not assess whether the user's tone is deferential (D1), whether the user anthropomorphizes the system (D2), whether the user cedes decision-making authority (D3), whether the user corrects errors (D4), or whether the user discloses emotional content (D5). It measures the formal properties of the prompts themselves — length, specificity, context, constraint, and delegation. Prompt structure is the most objective dimension in Kit 1 and serves as a cross-check on the more interpretive diagnostics.

When all six Kit 1 diagnostics are run together, D6 provides the structural foundation: if the user's prompts are degrading, the other five dimensions will almost certainly show correlated shifts. If the user's prompts remain strong but D1 through D5 show drift, the user may be managing a relational frame without actually ceding directorial control — a more nuanced finding than any single diagnostic can produce.

This diagnostic cannot distinguish between user-authored prompts and system-suggested prompts (onboarding samples, suggested follow-ups). Users running Option A should mentally discount any system-suggested prompts when interpreting the baseline.

Return to the Kit Index to see the full architecture.