---
name: context-engineering-advisor
description: Diagnose context stuffing vs. context engineering. Use when an AI workflow feels bloated, brittle, or hard to steer reliably.
intent: >-
  Guide product managers through diagnosing whether they're doing **context stuffing** (jamming volume without intent) or **context engineering** (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.
type: interactive
theme: ai-agents
best_for:
- "Diagnosing context stuffing vs. context engineering in your AI workflows"
- "Building better memory and retrieval architecture for AI agents"
- "Improving AI output quality through structured context design"
scenarios:
- "My AI outputs are mediocre even though I'm giving it lots of information — diagnose what's wrong"
- "I want to architect context properly for a multi-step AI workflow in my product team"
estimated_time: "15-20 min"
---
## Purpose
Guide product managers through diagnosing whether they're doing **context stuffing** (jamming volume without intent) or **context engineering** (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.
**Key Distinction:** Context stuffing assumes volume = quality ("paste the entire PRD"). Context engineering treats AI attention as a scarce resource and allocates it deliberately.
This is not about prompt writing—it's about **designing the information architecture** that grounds AI in reality without overwhelming it with noise.
## Key Concepts
### The Paradigm Shift: Parametric → Contextual Intelligence
**The Fundamental Problem:**
- LLMs have **parametric knowledge** (encoded during training) = static, outdated, non-attributable
- When asked about proprietary data, real-time info, or user preferences → forced to hallucinate or admit ignorance
- **Context engineering** bridges the gap between static training and dynamic reality
**PM's Role Shift:** From feature builder → **architect of informational ecosystems** that ground AI in reality
---
### Context Stuffing vs. Context Engineering
| Dimension | Context Stuffing | Context Engineering |
|-----------|------------------|---------------------|
| **Mindset** | Volume = quality | Structure = quality |
| **Approach** | "Add everything just in case" | "What decision am I making?" |
| **Persistence** | Persist all context | Retrieve with intent |
| **Agent Chains** | Share everything between agents | Bounded context per agent |
| **Failure Response** | Retry until it works | Fix the structure |
| **Economic Model** | Context as storage | Context as attention (scarce resource) |
**Critical Metaphor:** Context stuffing is like bringing your entire file cabinet to a meeting. Context engineering is bringing only the 3 documents relevant to today's decision.
---
### The Anti-Pattern: Context Stuffing
**Five Markers of Context Stuffing:**
1. **Reflexively expanding context windows** — "Just add more tokens!"
2. **Persisting everything "just in case"** — No clear retention criteria
3. **Chaining agents without boundaries** — Agent A passes everything to Agent B to Agent C
4. **Adding evaluations to mask inconsistency** — "We'll just retry until it's right"
5. **Normalized retries** — "It works if you run it 3 times" becomes acceptable
**Why It Fails:**
- **Reasoning Noise:** Thousands of irrelevant files compete for attention, degrading multi-hop logic
- **Context Rot:** Dead ends, past errors, irrelevant data accumulate → goal drift
- **Lost in the Middle:** Models attend most to the beginning (primacy) and end (recency) of the context and tend to overlook material in the middle
- **Economic Waste:** Every query becomes expensive without accuracy gains
- **Quantitative Degradation:** Accuracy is reported to drop below 20% once context exceeds ~32k tokens, though the exact threshold varies by model and task
**The Hidden Costs:**
- Escalating token consumption
- Diluted attention across irrelevant material
- Reduced output confidence
- Cascading retries that waste time and money
---
### Real Context Engineering: Core Principles
**Five Foundational Principles:**
1. **Context without shape becomes noise**
2. **Structure > Volume**
3. **Retrieve with intent, not completeness**
4. **Small working contexts** (like short-term memory)
5. **Context Compaction:** Maximize density of relevant information per token
**Quantitative Framework:**
```
Efficiency = (Accuracy × Coherence) / (Tokens × Latency)
```
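As a rough illustration of the ratio above, the sketch below compares two hypothetical configurations of the same task. The function name `context_efficiency`, the 0-1 scoring scale, and every number are assumptions made for the example, not measurements from the source. The result is only meaningful as a relative comparison between setups measured the same way.

```python
def context_efficiency(accuracy: float, coherence: float, tokens: int, latency_s: float) -> float:
    """Efficiency = (Accuracy x Coherence) / (Tokens x Latency)."""
    if tokens <= 0 or latency_s <= 0:
        raise ValueError("tokens and latency_s must be positive")
    return (accuracy * coherence) / (tokens * latency_s)


# Hypothetical measurements for the same task (illustrative values only).
stuffed = context_efficiency(accuracy=0.80, coherence=0.80, tokens=100_000, latency_s=20.0)
engineered = context_efficiency(accuracy=0.80, coherence=0.80, tokens=25_000, latency_s=5.0)

# How many times more efficient the smaller context is on this ratio.
ratio = engineered / stuffed
```

With accuracy and coherence held equal, a quarter of the tokens at a quarter of the latency scores 16 times higher on this ratio.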
**Key Finding:** Using RAG with 25% of available tokens preserves 95% accuracy while significantly reducing latency and cost.
---
### The 5 Diagnostic Questions (Detect Context Hoarding Disorder)
Ask these to identify context stuffing:
1. **What specific decision does this support?** — If you can't answer, you don't need it
2. **Can retrieval replace persistence?** — Just-in-time beats always-available
3. **Who owns the context boundary?** — If no one, it'll grow forever
4. **What fails if we exclude this?** — If nothing breaks, delete it
5. **Are we fixing structure or avoiding it?** — Stuffing context often masks bad information architecture
---
### Memory Architecture: Two-Layer System
**Short-Term (Conversational) Memory:**
- Immediate interaction history for follow-up questions
- Challenge: Space management → older parts summarized or truncated
- Lifespan: Single session
**Long-Term (Persistent) Memory:**
- User preferences, key facts across sessions → deep personalization
- Implemented via vector database (semantic retrieval)
- Two types:
  - **Declarative Memory:** Facts ("I'm vegan")
  - **Procedural Memory:** Behavioral patterns ("I debug by checking logs first")
- Lifespan: Persistent across sessions
**LLM-Powered ETL:** Models generate their own memories by identifying signals, consolidating with existing data, updating database automatically.
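The two layers described above could be modeled roughly as follows. This is a minimal sketch under stated assumptions: the class names are hypothetical, `estimate_tokens` is a crude word count rather than a real tokenizer, and a plain list stands in for the vector database.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


class ShortTermMemory:
    """Session-scoped history that drops the oldest turns when over budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.turns: Deque[str] = deque()

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # Truncate from the oldest end, always keeping the newest turn.
        while len(self.turns) > 1 and sum(map(estimate_tokens, self.turns)) > self.max_tokens:
            self.turns.popleft()


@dataclass
class MemoryRecord:
    kind: str  # "declarative" (facts) or "procedural" (behavioral patterns)
    text: str


class LongTermMemory:
    """Cross-session store; a plain list stands in for a vector database."""

    def __init__(self) -> None:
        self.records: List[MemoryRecord] = []

    def remember(self, kind: str, text: str) -> None:
        if kind not in ("declarative", "procedural"):
            raise ValueError(f"unknown memory kind: {kind}")
        self.records.append(MemoryRecord(kind, text))

    def recall(self, kind: str) -> List[str]:
        return [r.text for r in self.records if r.kind == kind]


session = ShortTermMemory(max_tokens=5)
session.add("user asks about pricing")   # 4 words
session.add("assistant explains tiers")  # 3 more words: over budget, oldest turn dropped

profile = LongTermMemory()
profile.remember("declarative", "I'm vegan")
profile.remember("procedural", "I debug by checking logs first")
```

A production version would summarize old turns instead of only dropping them and would retrieve long-term records by semantic similarity.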
---
### The Research → Plan → Reset → Implement Cycle
**The Context Rot Solution:**
1. **Research:** Agent gathers data → large, chaotic context window (noise + dead ends)
2. **Plan:** Agent synthesizes into high-density SPEC.md or PLAN.md (Source of Truth)
3. **Reset:** **Clear entire context window** (prevents context rot)
4. **Implement:** Fresh session using **only** the high-density plan as context
**Why This Works:** Context rot is eliminated; agent starts clean with compressed, high-signal context.
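The four steps above can be sketched as a single function. The name `research_plan_reset_implement`, the prompt wording, and `stub_llm` are hypothetical; `llm` is any callable that maps a prompt string to a response string, so no particular vendor API is assumed.

```python
from typing import Callable, List


def research_plan_reset_implement(
    task: str, sources: List[str], llm: Callable[[str], str]
) -> str:
    # 1. Research: the working context grows with every source examined.
    working_context: List[str] = [f"Task: {task}"]
    for source in sources:
        notes = llm(f"Extract findings relevant to the task.\nSource:\n{source}")
        working_context.extend([source, notes])

    # 2. Plan: compress the messy context into one high-density document.
    plan = llm("Synthesize a concise PLAN.md from:\n" + "\n".join(working_context))

    # 3. Reset: discard everything except the plan.
    working_context.clear()

    # 4. Implement: a fresh prompt that contains only the plan.
    return llm(f"Implement this plan and nothing else:\n{plan}")


prompts_seen: List[str] = []


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; records each prompt it receives."""
    prompts_seen.append(prompt)
    if prompt.startswith("Synthesize"):
        return "PLAN: interview segment A first"
    return "finding"


result = research_plan_reset_implement("discovery planning", ["RAW TRANSCRIPT 1"], stub_llm)
```

In the stubbed run, the final prompt contains the plan text but none of the raw transcript, which is the property the Reset step is meant to guarantee. In a chat tool, the equivalent of `working_context.clear()` is opening a new session.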
---
### Anti-Patterns (What This Is NOT)
- **Not about choosing AI tools** — Claude vs. ChatGPT doesn't matter; architecture matters
- **Not about writing better prompts** — This is systems design, not copywriting
- **Not about adding more tokens** — "Infinite context" narratives are marketing, not engineering reality
- **Not about replacing human judgment** — Context engineering amplifies judgment, doesn't eliminate it
---
### When to Use This Skill
✅ **Use this when:**
- You're pasting entire PRDs/codebases into AI and getting vague responses
- AI outputs are inconsistent ("works sometimes, not others")
- You're burning tokens without seeing accuracy improvements
- You suspect you're "context stuffing" but don't know how to fix it
- You need to design context architecture for an AI product feature
❌ **Don't use this when:**
- You're just getting started with AI (start with basic prompts first)
- You're looking for tool recommendations (this is about architecture, not tooling)
- Your AI usage is working well (if it ain't broke, don't fix it)
---
### Facilitation Source of Truth
Use [`workshop-facilitation`](../workshop-facilitation/SKILL.md) as the default interaction protocol for this skill.
It defines:
- session heads-up + entry mode (Guided, Context dump, Best guess)
- one-question turns with plain-language prompts
- progress labels (for example, Context Qx/8 and Scoring Qx/5)
- interruption handling and pause/resume behavior
- numbered recommendations at decision points
- quick-select numbered response options for regular questions (include `Other (specify)` when useful)
This file defines the domain-specific assessment content. If there is a conflict, follow this file's domain logic.
## Application
This interactive skill uses **adaptive questioning** to diagnose context stuffing, identify boundaries, and provide tactical implementation guidance.
---
### Step 0: Gather Context
**Agent asks:**
Before we diagnose your context practices, let's gather information:
**Current AI Usage:**
- What AI tools/systems do you use? (ChatGPT, Claude, custom agents, etc.)
- What PM tasks do you use AI for? (PRD writing, user research synthesis, discovery, etc.)
- How do you provide context? (paste docs, reference files, use projects/memory)
**Symptoms:**
- Are AI outputs inconsistent? (works sometimes, not others)
- Are you retrying prompts multiple times to get good results?
- Are responses vague or hedged despite providing "all the context"?
- Are token costs escalating without accuracy improvements?
**System Architecture (if applicable):**
- Do you have custom AI agents or workflows?
- How is context shared between agents?
- Do you use RAG, vector databases, or memory systems?
**You can describe briefly or paste examples.**
---
### Step 1: Diagnose Context Stuffing Symptoms
**Agent asks:**
Let's assess whether you're experiencing **context stuffing**. Which of these symptoms do you recognize?
**Select all that apply:**
1. **"I paste entire documents into AI"** — Full PRDs, complete user interview transcripts, entire codebases
2. **"AI gives vague, hedged responses despite having 'all the context'"** — Responses like "it depends," "consider these options," non-committal
3. **"I have to retry prompts 3+ times to get usable output"** — Inconsistency is normalized
4. **"Token costs are escalating but accuracy isn't improving"** — Spending more, getting same or worse results
5. **"I keep adding more context hoping it'll help"** — Reflexive expansion without strategy
6. **"My agents pass everything to each other"** — Agent A → Agent B → Agent C with full context chain
7. **"I don't have clear criteria for what to include/exclude"** — No context boundary definitions
8. **"None of these—my AI usage is working well"** — Skip to advanced optimization
**User response:** [Select symptoms]
**Agent analyzes:**
Based on your selections:
- **0-1 symptoms:** Healthy context practices; proceed to optimization
- **2-3 symptoms:** Early context stuffing; address before it scales
- **4+ symptoms:** Active Context Hoarding Disorder; immediate intervention needed
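The scoring bands above reduce to a small lookup. The sketch below is illustrative only; the function name `classify_symptoms` and the convention of passing option numbers 1-8 are assumptions.

```python
from typing import Iterable


def classify_symptoms(selected: Iterable[int]) -> str:
    """Map the selected option numbers (1-8) to a diagnosis band."""
    # Options 1-7 are symptoms; option 8 ("None of these") does not count.
    count = len({s for s in selected if 1 <= s <= 7})
    if count <= 1:
        return "Healthy context practices"
    if count <= 3:
        return "Early context stuffing"
    return "Active Context Hoarding Disorder"


example = classify_symptoms([1, 3, 5, 6])
```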
**Agent proceeds to diagnostic questions.**
---
### Step 2: Diagnostic Question 1 — What Specific Decision Does This Support?
**Agent asks:**
Let's start with the most critical question: **For each piece of context you're including, what specific decision does it support?**
**Example Context Analysis:**
Imagine you're asking AI to "help with discovery planning." You're providing:
- Entire PRD (20 pages)
- 50 user interview transcripts (full)
- Competitive analysis doc (15 pages)
- Team meeting notes from last 3 months
**Question:** What decision are you making right now?
**Offer 3 scenarios:**
1. **"I'm deciding which user segment to interview first"**
   - **Context needed:** User segments from PRD (2 paragraphs), prior interview themes (1 page synthesis), not full transcripts
   - **Context NOT needed:** Meeting notes, full competitive analysis, full PRD
2. **"I'm deciding which discovery questions to ask in interviews"**
   - **Context needed:** Research objectives (from PRD), past interview insights (synthesis), Jobs-to-be-Done framework
   - **Context NOT needed:** Full competitive analysis, full meeting notes
3. **"I'm not sure what decision I'm making—I just want AI to 'understand my product'"**
   - **Problem:** No specific decision = context stuffing trap
   - **Fix:** Define the decision first, then select context
**Agent recommends:**
**Best Practice:** Before adding context, complete this sentence:
> "I need this context because I'm deciding [specific decision], and without [specific information], I can't make that decision."
If you can't complete that sentence, you don't need the context.
**User response:** [Describe their decision + context]
**Agent validates:** Does the context directly support the stated decision? If not, recommend trimming.
---
### Step 3: Diagnostic Question 2 — Can Retrieval Replace Persistence?
**Agent asks:**
**Second question: Is this information you always need, or something you can retrieve just-in-time?**
**The Distinction:**
**Always-Needed (Persist):**
- Core product constraints (technical, regulatory, strategic)
- User preferences that apply to every interaction
- Critical definitions (operational glossary)
- Non-negotiable rules
**Episodic (Retrieve on-demand):**
- Project-specific details (this epic, this sprint)
- Historical data (past PRDs, old interview transcripts)
- Contextual facts (competitive analysis, market research)
- Temporary decisions
**Key Insight:** Just-in-time retrieval beats always-available. Don't persist what you can retrieve.
**Offer 3 options:**
1. **"Most of my context is always-needed (core constraints, user prefs)"**
   - **Assessment:** Good instinct; verify with Question 4 (what fails if excluded?)
   - **Recommendation:** Build constraints registry and operational glossary (persist these)
2. **"Most of my context is episodic (project details, historical data)"**
   - **Assessment:** Perfect candidate for RAG or retrieval
   - **Recommendation:** Implement semantic search; retrieve only relevant chunks for each query
3. **"I'm not sure which is which—I persist everything to be safe"**
   - **Assessment:** Classic Context Hoarding Disorder symptom
   - **Fix:** Apply Question 4 test to each piece of context
**Agent recommends:**
**Rule of Thumb:**
- **Persist:** Information referenced in 80%+ of interactions
- **Retrieve:** Information referenced in <20% of interactions
- **Gray zone (20-80%):** Depends on retrieval latency vs. context window cost
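The rule of thumb above can be applied to a simple usage log. This is a sketch under assumptions: the function name `placement` and the idea of counting how many interactions referenced an item are illustrative, not part of the source.

```python
def placement(times_referenced: int, total_interactions: int) -> str:
    """Suggest where a context item belongs, based on how often it is used."""
    if total_interactions <= 0 or not 0 <= times_referenced <= total_interactions:
        raise ValueError("counts must satisfy 0 <= times_referenced <= total_interactions, total > 0")
    rate = times_referenced / total_interactions
    if rate >= 0.80:
        return "persist"
    if rate < 0.20:
        return "retrieve"
    # 20-80%: weigh retrieval latency against context window cost.
    return "gray zone"


glossary = placement(times_referenced=45, total_interactions=50)    # used in 90% of interactions
old_prd = placement(times_referenced=5, total_interactions=50)      # used in 10% of interactions
competitors = placement(times_referenced=20, total_interactions=50)  # used in 40% of interactions
```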
**User response:** [Categorize their context]
**Agent provides:** Specific recommendations on what to persist vs. retrieve.
---
### Step 4: Diagnostic Question 3 — Who Owns the Context Boundary?
**Agent asks:**
**Third question: Who is responsible for defining what belongs in vs. out of your AI's context?**
**The Ownership Problem:**
If **no one** owns the context boundary, it will grow indefinitely. Every PM will add "just one more thing," and six months later, you're stuffing 100k tokens per query.
**Offer 3 options:**
1. **"I own the boundary (solo PM or small team)"**
   - **Assessment:** Good—you can make fast decisions
   - **Recommendation:** Document your boundary criteria (use Questions 1-5 as framework)
2. **"My team shares ownership (collaborative boundary definition)"**
   - **Assessment:** Can work if formalized
   - **Recommendation:** Create a "Context Manifest" doc: what's always included, what's retrieved, what's excluded (and why)
3. **"No one owns it—it's ad-hoc / implicit"**
   - **Assessment:** Critical risk; boundary will expand uncontrollably
   - **Fix:** Assign explicit ownership; schedule quarterly context audits
**Agent recommends:**
**Best Practice: Create a Context Manifest**
```markdown
# Context Manifest: [Product/Feature Name]
## Always Persisted (Core Context)
- Product constraints (technical, regulatory)
- User preferences (role, permissions, preferences)
- Operational glossary (20 key terms)
## Retrieved On-Demand (Episodic Context)
- Historical PRDs (retrieve via semantic search)
- User interview transcripts (retrieve relevant quotes)
- Competitive analysis (retrieve when explicitly needed)
## Excluded (Out of Scope)
- Meeting notes older than 30 days (no longer relevant)
- Full codebase (use code search instead)
- Marketing materials (not decision-relevant)
## Boundary Owner: [Name]
## Last Reviewed: [Date]
## Next Review: [Date + 90 days]
```
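For teams that want the manifest to be machine-checkable, the template above could be mirrored in a small data structure. The sketch below is one possible shape; the class name `ContextManifest`, its field names, and the sample values are assumptions, and the 90-day interval comes from the template's "Next Review" line.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class ContextManifest:
    name: str
    owner: str
    last_reviewed: date
    persisted: List[str] = field(default_factory=list)  # always in context
    retrieved: List[str] = field(default_factory=list)  # fetched on demand
    excluded: List[str] = field(default_factory=list)   # deliberately out of scope
    review_interval_days: int = 90

    @property
    def next_review(self) -> date:
        return self.last_reviewed + timedelta(days=self.review_interval_days)

    def is_overdue(self, today: date) -> bool:
        return today > self.next_review


manifest = ContextManifest(
    name="Checkout redesign",  # hypothetical feature
    owner="Lead PM",
    last_reviewed=date(2025, 1, 1),
    persisted=["Product constraints", "Operational glossary"],
    retrieved=["Historical PRDs", "User interview transcripts"],
    excluded=["Meeting notes older than 30 days"],
)
```

A scheduled check on `is_overdue` is one way to make the quarterly audit hard to forget.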
**User response:** [Describe current ownership model]
**Agent provides:** Recommendation on formalizing ownership + template for Context Manifest.
---
### Step 5: Diagnostic Question 4 — What Fails if We Exclude This?
**Agent asks:**
**Fourth question: For each piece of context, what specific failure mode occurs if you exclude it?**
This is the **falsification test**. If you can't identify a concrete failure, you don't need the context.
**Offer 3 scenarios:**
1. **"If I exclude product constraints, AI will recommend infeasible solutions"**
   - **Failure Mode:** Clear and concrete
   - **Assessment:** Valid reason to persist constraints
2. **"If I exclude historical PRDs, AI won't understand our product evolution"**
   - **Failure Mode:** Vague and hypothetical
   - **Assessment:** Historical context rarely needed for current decisions
   - **Fix:** Retrieve PRDs only when explicitly referencing past decisions
3. **"If I exclude this, I'm not sure anything would break—I just include it to be thorough"**
   - **Failure Mode:** None identified
   - **Assessment:** Context stuffing; delete immediately
**Agent recommends:**
**The Falsification Protocol:**
For each context element, complete this statement:
> "If I exclude [context element], then [specific failure] will occur in [specific scenario]."
**Examples:**
- ✅ Good: "If I exclude GDPR constraints, AI will recommend features that violate EU privacy law."
- ❌ Bad: "If I exclude this PRD, AI might not fully understand the product." (Vague)
**User response:** [Apply falsification test to their context]
**Agent provides:** List of context elements to delete (no concrete failure identified).
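The falsification protocol can be recorded as structured data so the keep/delete list falls out mechanically. In this sketch the names `ContextElement` and `split_by_falsification` are hypothetical, and the code only checks that a failure and a scenario were written down. Judging whether a stated failure is concrete or vague remains a human call.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ContextElement:
    name: str
    failure_if_excluded: Optional[str] = None  # "[specific failure] will occur"
    scenario: Optional[str] = None             # "in [specific scenario]"


def split_by_falsification(elements: List[ContextElement]) -> Tuple[List[str], List[str]]:
    """Return (keep, delete): keep only elements with a stated failure and scenario."""
    keep: List[str] = []
    delete: List[str] = []
    for element in elements:
        if element.failure_if_excluded and element.scenario:
            keep.append(element.name)
        else:
            delete.append(element.name)
    return keep, delete


keep, delete = split_by_falsification([
    ContextElement(
        "GDPR constraints",
        failure_if_excluded="AI recommends features that violate EU privacy law",
        scenario="feature ideation for EU customers",
    ),
    ContextElement("Old PRD"),  # no failure identified
    ContextElement("Meeting notes", failure_if_excluded="", scenario=""),
])
```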
---
### Step 6: Diagnostic Question 5 — Are We Fixing Structure or Avoiding It?
**Agent asks:**
**Fifth question: Is adding more context solving a problem, or masking a deeper structural issue?**
**The Root Cause Question:**
Context stuffing often hides bad information architecture. Instead of fixing messy, ambiguous documents, teams add more documents hoping AI will "figure it out."
**Offer 3 options:**
1. **"I'm adding context because our docs are poorly structured/ambiguous"**
   - **Assessment:** You're masking a structural problem
   - **Fix:** Clean up the docs first (remove ambiguity, add constraints, define terms)
   - **Example:** Instead of pasting 5 conflicting PRDs, reconcile them into 1 Source of Truth
2. **"I'm adding context because we don't have a shared operational glossary"**
   - **Assessment:** You're compensating for missing foundations
   - **Fix:** Build the glossary (20-30 key terms); AI can reference it reliably
   - **Example:** Define "active user," "churn," "engagement" unambiguously
3. **"I'm adding context because our constraints aren't documented"**
   - **Assessment:** You're avoiding constraint engineering
   - **Fix:** Create constraints registry (technical, regulatory, strategic)
   - **Example:** Document "We won't build mobile apps" vs. explaining it in every prompt
**Agent recommends:**
**The Structural Health Test:**
If you're adding context to compensate for:
- **Ambiguous documentation** → Fix the docs, don't add more
- **Undefined terms** → Build operational glossary
- **Undocumented constraints** → Create constraints registry
- **Conflicting information** → Reconcile into Source of Truth
**User response:** [Identify structural issues]
**Agent provides:** Prioritized list of structural fixes before adding more context.
---
### Step 7: Define Memory Architecture
**Agent asks:**
Based on your context needs, let's design a **two-layer memory architecture**.
**Your Context Profile (from previous steps):**
- Always-needed context: [Summary from Q2]
- Episodic context: [Summary from Q2]
- Boundary owner: [From Q3]
- Validated essentials: [From Q4]
- Structural fixes needed: [From Q5]
**Recommended Architecture:**
**Short-Term (Conversational) Memory:**
- **What it stores:** Immediate interaction history for follow-up questions
- **Lifespan:** Single session
- **Management:** Summarize or truncate older parts to avoid crowding
- **Your specific needs:** [Agent customizes based on user's workflow]
**Long-Term (Persistent) Memory:**
- **What it stores:** User preferences, core constraints, operational glossary
- **Lifespan:** Persistent across sessions
- **Implementation:** Vector database (semantic retrieval)
- **Two types:**
  - **Declarative Memory:** Facts (e.g., "We follow HIPAA regulations")
  - **Procedural Memory:** Behavioral patterns (e.g., "Always validate feasibility before usability")
- **Your specific needs:** [Agent customizes]
**Retrieval Strategy (Episodic Context):**
- **What it retrieves:** Historical PRDs, user interviews, competitive analysis
- **Method:** Semantic search triggered by query intent
- **Optimization:** Contextual Retrieval (Anthropic) — prepend explanatory context to each chunk before embedding
- **Your specific needs:** [Agent customizes]
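The Contextual Retrieval optimization listed above amounts to building a richer string for each chunk before it is embedded. The sketch below shows only that assembly step. `contextualize_chunks` and `stub_situate` are hypothetical names, and `situate` stands in for whatever model call writes the chunk-specific context; no real embedding or LLM API is used here.

```python
from typing import Callable, List


def contextualize_chunks(
    document: str, chunks: List[str], situate: Callable[[str, str], str]
) -> List[str]:
    """Return the text to embed for each chunk: generated context first, then the chunk."""
    return [f"{situate(document, chunk)}\n\n{chunk}" for chunk in chunks]


def stub_situate(document: str, chunk: str) -> str:
    """Hypothetical stand-in for the model call that writes chunk-specific context."""
    title = document.splitlines()[0]
    return f"From '{title}'."


doc = "Q3 Pricing PRD\nRevenue grew 3% over the previous quarter."
texts_to_embed = contextualize_chunks(
    doc, ["Revenue grew 3% over the previous quarter."], stub_situate
)
```

Without the prepended line, the chunk alone does not say which document or quarter it refers to, which is the gap this technique is meant to close.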
**Agent offers:**
Would you like me to:
1. **Generate a Context Architecture Blueprint** for your specific use case?
2. **Provide implementation guidance** (tools, techniques, best practices)?
3. **Design a retrieval strategy** for your episodic context?
**User response:** [Selection]
---
### Step 8: Implement Research → Plan → Reset → Implement Cycle
**Agent asks:**
Now let's implement the **Research → Plan → Reset → Implement** cycle to prevent context rot.
**The Problem:** As agents research, context windows grow chaotic—filled with dead ends, errors, and noise. This dilutes attention and causes goal drift.
**The Solution:** Compress research into a high-density plan, then **clear the context window** before implementing.
**The Four-Phase Cycle:**
**Phase 1: Research (Chaotic Context Allowed)**
- Agent gathers data from multiple sources
- Context window grows large and messy (this is expected)
- Dead ends, failed hypotheses, and noise accumulate
- **Goal:** Comprehensive information gathering
**Phase 2: Plan (Synthesis)**
- Agent synthesizes research into a high-density SPEC.md or PLAN.md
- This becomes the **Source of Truth** for implementation
- **Key elements:**
  - Decision made
  - Evidence supporting decision
  - Constraints applied
  - Next steps (sequenced)
- **Format:** Structured, concise, unambiguous
**Phase 3: Reset (Clear Context Window)**
- **Critical step:** Clear the entire context window
- Delete all research artifacts, dead ends, errors
- This prevents context rot from poisoning implementation
**Phase 4: Implement (Fresh Session with Plan Only)**
- Start a new session with **only the high-density plan** as context
- Agent has clean, focused attention on execution
- No noise from research phase
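One way to keep the Phase 2 output structured is to generate PLAN.md from typed fields that match the key elements listed above. This is a sketch only: the class name `Plan`, the section headings, and the sample content are assumptions, not a format prescribed by the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    decision: str
    evidence: List[str]
    constraints: List[str]
    next_steps: List[str]  # order matters: steps are sequenced

    def to_markdown(self) -> str:
        lines = ["# PLAN", "", "## Decision", self.decision, ""]
        lines += ["## Evidence", *[f"- {e}" for e in self.evidence], ""]
        lines += ["## Constraints", *[f"- {c}" for c in self.constraints], ""]
        lines += ["## Next Steps", *[f"{i}. {s}" for i, s in enumerate(self.next_steps, 1)]]
        return "\n".join(lines)


plan_md = Plan(
    decision="Interview the admin segment first",
    evidence=["Admins raised onboarding friction in prior interviews"],
    constraints=["No mobile app work this quarter"],
    next_steps=["Recruit 5 admins", "Draft interview guide"],
).to_markdown()
```

The resulting string is the only thing carried into the Phase 4 session.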
**Agent offers 3 options:**
1. **"I want a template for the PLAN.md format"**
   - Agent provides structured template for high-density plans
2. **"I want to see an example of this cycle in action"**
   - Agent walks through concrete PM use case (e.g., discovery planning)
3. **"I'm ready to implement this in my workflow"**
   - Agent provides step-by-step implementation guide
**User response:** [Selection]
**Agent provides:** Tailored guidance based on selection.
---
### Step 9: Action Plan & Next Steps
**Agent synthesizes:**
Based on your context engineering assessment, here's your action plan:
**Immediate Fixes (This Week):**
1. [Delete context with no falsifiable failure mode from Q4]
2. [Apply Research→Plan→Reset→Implement to your next AI task]
3. [Document context boundary in Context Manifest]
**Foundation Building (Next 2 Weeks):**
1. [Build constraints registry with 20+ entries]
2. [Create operational glossary with 20-30 key terms]
3. [Implement two-layer memory architecture]
**Long-Term Optimization (Next Month):**
1. [Set up semantic retrieval for episodic context]
2. [Assign context boundary owner + quarterly audit schedule]
3. [Implement Contextual Retrieval (Anthropic) for RAG]
**Success Metrics:**
- Token usage down 50%+ (less context stuffing)
- Output consistency up (less retry/regeneration)
- Response quality up (sharper, less hedged answers)
- Context window stable (no unbounded growth)
**Agent offers:**
Would you like me to:
1. **Generate specific implementation docs** (Context Manifest, PLAN.md template, etc.)?
2. **Provide advanced techniques** (Contextual Retrieval, LLM-powered ETL)?
3. **Review your current context setup** (provide feedback on specific prompts/workflows)?
---
## Examples
### Example 1: Solo PM Context Stuffing → Engineering
**Context:**
- Solo PM at early-stage startup
- Using Claude Projects for PRD writing
- Pasting entire PRDs (20 pages) + all user interviews (50 transcripts) every time
- Getting vague, inconsistent responses
**Assessment:**
- Symptoms: Pasting entire documents, hedged responses, normalized retries, no inclusion criteria (4+ symptoms)
- Q1 (Decision): "I just want AI to understand my product" (no specific decision)
- Q2 (Persist/Retrieve): Persisting everything (no retrieval strategy)
- Q3 (Ownership): No formal owner (solo PM, ad-hoc)
- Q4 (Failure): Can't identify concrete failures for most context
- Q5 (Structure): Avoiding constraint documentation
**Diagnosis:** Active Context Hoarding Disorder
**Intervention:**
1. **Immediate:** Delete all context that fails Q4 test → keeps 20% of original
2. **Week 1:** Build constraints registry (10 technical constraints, 5 strategic)
3. **Week 2:** Create operational glossary (25 terms)
4. **Week 3:** Implement Research→Plan→Reset→Implement for next PRD
**Outcome:** Token usage down 70%, output quality up significantly, responses crisp and actionable.
---
### Example 2: Growth-Stage Team with Agent Chains
**Context:**
- Product team with 5 PMs
- Custom AI agents for discovery synthesis
- Agent A (research) → Agent B (synthesis) → Agent C (recommendations)
- Each agent passes full context to next → context window explodes to 100k+ tokens
**Assessment:**
- Symptoms: Escalating token costs, inconsistent outputs, agents passing full context to each other (3 symptoms)
- Q1 (Decision): Each agent has clear decision, but passes unnecessary context
- Q2 (Persist/Retrieve): Mixing persistent and episodic without strategy
- Q3 (Ownership): No explicit owner; each PM adds context
- Q4 (Failure): Agents pass "just in case" context with no falsifiable failure
- Q5 (Structure): Missing Context Manifest
**Diagnosis:** Agent orchestration without boundaries
**Intervention:**
1. **Immediate:** Define bounded context per agent (Agent A outputs only 2-page synthesis to Agent B, not full research)
2. **Week 1:** Assign context boundary owner (Lead PM)
3. **Week 2:** Create Context Manifest (what persists, what's retrieved, what's excluded)
4. **Week 3:** Implement Research→Plan→Reset→Implement between Agent B and Agent C
**Outcome:** Token usage down 60%, agent chain reliability up, costs reduced by 50%.
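The bounded handoff in this intervention can be enforced in the orchestration code. Below is a minimal sketch, assuming each agent is a plain function from string to string; the name `run_bounded_chain`, the word-count budget, and the three stub agents are hypothetical. Raising on an oversized handoff, instead of silently truncating, forces the upstream agent to summarize.

```python
from typing import Callable, List

Agent = Callable[[str], str]


def run_bounded_chain(agents: List[Agent], task: str, max_handoff_words: int) -> str:
    """Run agents in sequence; each one sees only the previous agent's bounded output."""
    payload = task
    for agent in agents:
        output = agent(payload)
        if len(output.split()) > max_handoff_words:
            raise ValueError("handoff exceeds budget; summarize before passing on")
        payload = output  # the next agent sees only this bounded handoff
    return payload


def research_agent(task: str) -> str:
    return f"synthesis of findings for: {task}"


def synthesis_agent(synthesis: str) -> str:
    return f"themes from ({synthesis})"


def recommendation_agent(themes: str) -> str:
    return f"recommendation based on {themes}"


final = run_bounded_chain(
    [research_agent, synthesis_agent, recommendation_agent],
    "onboarding churn",
    max_handoff_words=50,
)
```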
---
### Example 3: Enterprise with RAG but No Context Engineering
**Context:**
- Large enterprise with vector database RAG system
- "Stuff the whole knowledge base" approach (10,000+ documents)
- Retrieval returns 50+ chunks per query → floods context window
- Accuracy declining as knowledge base grows
**Assessment:**
- Symptoms: Vague responses despite "complete knowledge," normalized retries (2 symptoms)
- Q1 (Decision): Decisions clear, but retrieval has no intent (returns everything)
- Q2 (Persist/Retrieve): Good instinct to retrieve, but no filtering
- Q3 (Ownership): Engineering owns RAG, Product doesn't own context boundaries
- Q4 (Failure): Can't identify why 50 chunks needed vs. 5
- Q5 (Structure): Knowledge base has no structure (flat documents, no metadata)
**Diagnosis:** Retrieval without intent (RAG as context stuffing)
**Intervention:**
1. **Immediate:** Limit retrieval to top 5 chunks per query (down from 50)
2. **Week 1:** Implement Contextual Retrieval (Anthropic) — prepend explanatory context to each chunk during indexing
3. **Week 2:** Add metadata to documents (category, recency, authority)
4. **Week 3:** Product team defines retrieval intent per query type (discovery = customer insights, feasibility = technical constraints)
**Outcome:** Retrieval failures down (Anthropic reports a 35% reduction in its Contextual Retrieval benchmark), latency down 60%, token usage down 80%.
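Steps 1, 3, and 4 of this intervention (a top-5 cap, document metadata, and retrieval intent per query type) can be combined in one filter. The sketch below assumes similarity scores were already produced by the vector search; `Chunk`, `retrieve_with_intent`, and the category labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    text: str
    category: str  # hypothetical metadata tag added at indexing time
    score: float   # similarity score, assumed to come from the vector search


def retrieve_with_intent(candidates: List[Chunk], category: str, k: int = 5) -> List[Chunk]:
    """Keep only chunks matching the query's intent, then cap at the top k by score."""
    relevant = [c for c in candidates if c.category == category]
    return sorted(relevant, key=lambda c: c.score, reverse=True)[:k]


candidates = [Chunk(f"insight {i}", "customer_insights", i / 10) for i in range(8)]
candidates.append(Chunk("API rate limit", "technical_constraints", 0.99))

# A discovery query asks for customer insights only.
top = retrieve_with_intent(candidates, "customer_insights")
```

The highest-scoring chunk overall is excluded here because it belongs to a different intent, which is the point of filtering before ranking.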
---
## Common Pitfalls
### 1. **"Infinite Context" Marketing vs. Engineering Reality**
**Failure Mode:** Believing "1 million token context windows" means you should use all of them.
**Consequence:** Reasoning Noise degrades performance; accuracy is reported to drop below 20% past ~32k tokens, though the exact threshold varies by model and task.
**Fix:** Context windows are not free. Treat tokens as scarce; optimize for density, not volume.
---
### 2. **Retrying Instead of Restructuring**
**Failure Mode:** "It works if I run it 3 times" → normalizing retries instead of fixing structure.
**Consequence:** Wastes time and money; masks deeper context rot issues.
**Fix:** If retries are common, your context structure is broken. Apply Q5 (fix structure, don't add volume).
---
### 3. **No Context Boundary Owner**
**Failure Mode:** Ad-hoc, implicit context decisions → unbounded growth.
**Consequence:** Six months later, every query carries 100k tokens.
**Fix:** Assign explicit ownership; create Context Manifest; schedule quarterly audits.
---
### 4. **Mixing Always-Needed with Episodic**
**Failure Mode:** Persisting historical data that should be retrieved on-demand.
**Consequence:** Context window crowded with irrelevant information; attention diluted.
**Fix:** Apply Q2 test: persist only what's needed in 80%+ of interactions; retrieve the rest.
---
### 5. **Skipping the Reset Phase**
**Failure Mode:** Going straight from Plan to Implement without clearing the context window.
**Consequence:** Context rot accumulates; goal drift; dead ends poison implementation.
**Fix:** Mandatory Reset phase after Plan; start implementation with only high-density plan as context.
---
## References
### Related Skills
- **[ai-shaped-readiness-advisor](../ai-shaped-readiness-advisor/SKILL.md)** (Interactive) — Context Design is Competency #1 of AI-shaped work
- **[problem-statement](../problem-statement/SKILL.md)** (Component) — Evidence-based framing requires context engineering
- **[epic-hypothesis](../epic-hypothesis/SKILL.md)** (Component) — Testable hypotheses depend on clear constraints (part of context)
- **[pol-probe-advisor](../pol-probe-advisor/SKILL.md)** (Interactive) — Validation experiments benefit from context engineering (define what AI needs to know)
### External Frameworks
- **Dean Peters** — [*Context Stuffing Is Not Context Engineering*](https://deanpeters.substack.com/p/context-stuffing-is-not-context-engineering) (Dean Peters' Substack, 2026)
- **Teresa Torres** — *Continuous Discovery Habits* (Context Engineering as one of 5 new AI PM disciplines)
- **Marty Cagan** — *Empowered* (Feasibility risk in AI era includes understanding "physics of AI")
- **Anthropic** — [Contextual Retrieval announcement](https://www.anthropic.com/news/contextual-retrieval) (35% reduction in retrieval failure rate)
- **Google** — Context engineering whitepaper on LLM-powered memory systems
### Technical References
- **RAG (Retrieval-Augmented Generation)** — Standard technique for episodic context retrieval
- **Vector Databases** — Semantic search for long-term memory (Pinecone, Weaviate, Chroma)
- **Contextual Retrieval (Anthropic)** — Prepend explanatory context to chunks before embedding
- **LLM-as-Judge** — Automated evaluation of context quality