Why does Claude Code keep saying the session expired?

This happens when your authentication token becomes invalid, expired, or corrupted. It can also be caused by macOS keychain locks, VPN interference, or OAuth callback timeouts.

How do I reconnect Claude Code?

The fastest way to reconnect is to sign out, clear your cached credentials using 'rm -rf ~/.claude/cache', and then complete the full login flow again.

What is an OAuth callback timeout?

It occurs when the browser takes longer than 60 seconds to pass the authentication token back to the CLI, often due to too many open tabs or a slow network connection.

Why does Claude Code say 'installation failed' but then 'installation complete'?

This contradictory message almost always means a stale lock file exists from a previous interrupted install. The installer aborts due to the lock but then partially completes a different step, giving you two contradictory status messages. Remove the lock file at ~/.local/state/claude/locks/ and reinstall.

I get claude: command not found even after a successful install. What's wrong?

Your terminal session hasn't loaded the updated PATH that the installer added to your shell config. Open a new terminal window, or run source ~/.zshrc (or ~/.bashrc on Linux). If that doesn't work, confirm that npm's global bin directory is actually in your PATH.

Do I need sudo to install Claude Code?

No — and using sudo will likely cause more problems. Set your npm global directory to a user-owned path with npm config set prefix ~/.npm-global, add that to your PATH, and then install without sudo.

Claude Code installation fails with a TLS/SSL error. How do I fix it?

This happens in corporate environments that use SSL inspection. Export your company's CA certificate path: export NODE_EXTRA_CA_CERTS=/path/to/ca.pem. Your IT team can provide the certificate file.

The Claude Code installer just hangs and never completes. What's happening?

A hanging installer almost always means a network block. The installer downloads from storage.googleapis.com. Set your proxy with export HTTPS_PROXY=http://your-proxy:port and retry, or ask IT to whitelist that domain.

I'm on Windows. Should I use npm or the curl installer?

Use PowerShell with npm install -g @anthropic-ai/claude-code. Don't use Git Bash; it lacks TTY support and will cause raw mode errors. If you're in WSL, use the curl installer within your Linux environment.

How do I know if Claude Code installed correctly?

Run claude --version to confirm the binary is accessible, then run claude doctor for a full diagnostic report. Both should return clean results with no errors before you start using the tool in a real project.

Claude Code Context Window Full: How to Fix Token Limit Errors Fast

June 10, 2026 by aliakram

What "Context Window Full" Actually Means

If you're seeing a context window full error in Claude Code, your session has accumulated more tokens than the model can hold in active memory. The fix is simple, but understanding why it happens will stop it from repeatedly interrupting your workflow.

Claude Code runs on Claude Sonnet or Opus models with a context window of up to 200,000 tokens. That sounds enormous, but during an active coding session it fills faster than you'd expect: every message you send, every response Claude gives, every file Claude reads, and every tool call result all count toward that total. When the limit is hit, Claude cannot process new input until the context is reduced.

The good news: No files are modified. Nothing is deleted. The error is purely about in-memory conversation state, not your codebase.

The Hidden Token Overhead Nobody Talks About

This is the section most guides skip, and it's the reason many developers think their context window has shrunk when it hasn't. Your effective usable window is smaller than the raw 200K number because of system-level overhead you never see.

System Prompt Overhead

Claude Code injects a system prompt into every session automatically. This system prompt contains tool definitions, safety instructions, and context about the environment. Depending on your Claude Code version, this alone can consume 10,000–25,000 tokens before you type a single character.

A community investigation on Reddit (r/ClaudeAI) found that users experiencing "shrinking limits" weren't facing a policy change — they were running newer Claude Code versions with larger built-in system prompts. The model's raw limit stayed the same; the usable space got smaller.

Tool Call Overhead

Every tool Claude Code uses (read file, run bash, search codebase) appends its entire result to context. A tool call that reads a 500-line file costs ~5,000 tokens even if Claude only needed 10 lines. This is a major source of unexpected token drain.

Practical impact: In a typical debugging session where Claude reads 5–8 files, you can burn 30,000–50,000 tokens just in tool call results, before Claude even starts writing a fix.
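
To put rough numbers on this, here is a minimal Python sketch that applies the ~10 tokens per line ratio implied above (500 lines ≈ 5,000 tokens). The function name and the file lengths are illustrative assumptions, not measurements taken from Claude Code.

```python
def estimate_read_cost(line_counts, tokens_per_line=10):
    """Estimate tokens added to context by reading files of the given lengths.

    tokens_per_line=10 is the article's rule of thumb, not a tokenizer result.
    """
    return sum(lines * tokens_per_line for lines in line_counts)


# Six hypothetical files read during one debugging session (line counts).
files_read = [500, 800, 300, 650, 900, 450]
print(estimate_read_cost(files_read))  # 36000
```

Six mid-sized reads already land inside the 30,000–50,000 token range quoted above, before any fix has been written.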

Multi-Turn Accumulation

Each conversation turn re-sends the entire prior history. Turn 1 sends 500 tokens. Turn 2 sends 500 + 800 = 1,300. By turn 20, you're sending 30,000+ tokens just to maintain context. This compounds fast and is invisible unless you're watching API-level logs.
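
The compounding can be sketched in a few lines of Python. This is only an arithmetic illustration under the assumption that every turn re-sends all earlier turns unchanged; the 1,500-tokens-per-turn session is a made-up example.

```python
from itertools import accumulate


def context_per_turn(new_tokens_per_turn):
    """Return the context size sent at each turn: all prior turns plus the new one."""
    return list(accumulate(new_tokens_per_turn))


# The two-turn example from the text.
print(context_per_turn([500, 800]))  # [500, 1300]

# Hypothetical session: 20 turns that each add 1,500 tokens.
totals = context_per_turn([1_500] * 20)
print(totals[-1])   # 30000 tokens of context sent on turn 20
print(sum(totals))  # 315000 input tokens processed across the whole session
```

The last line is the part that stays invisible without logs: the session as a whole processes far more input than the context holds at any one moment.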

The Real Usable Window

Model	Raw Limit	System Prompt	Avg Session Overhead	Effective Usable
Claude Sonnet (latest)	200,000	~15,000–25,000	~10,000–30,000	~145,000–175,000
Claude Opus (latest)	200,000	~15,000–25,000	~10,000–30,000	~145,000–175,000

These aren't official Anthropic numbers — they're community-observed estimates. Your actual usable window depends on your Claude Code version, which tools you use, and your CLAUDE.md size.
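
The "Effective Usable" column is just subtraction over the two overhead ranges. A small Python sketch, using the community estimates from the table as inputs (the function name is hypothetical):

```python
def effective_window(raw_limit, system_prompt, session_overhead):
    """Return (low, high) usable tokens given (min, max) overhead ranges."""
    low = raw_limit - system_prompt[1] - session_overhead[1]
    high = raw_limit - system_prompt[0] - session_overhead[0]
    return low, high


print(effective_window(200_000, (15_000, 25_000), (10_000, 30_000)))
# (145000, 175000)
```

Swap in your own first-turn token count as the system prompt figure to get an estimate for your setup.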

Error Messages You'll See

Error Message	Meaning	Severity	Recommended Fix
Context window limit exceeded	Total tokens in session hit the model's max	High	/clear or /compact
Token limit exceeded	Same as above, different phrasing	High	Start new session
Conversation too long	Session history is too large to continue	High	/compact then continue
Context too large	A single attached file or paste exceeds limits	Medium	Reduce file size or split input
Input too long	A single message (not session) is too large	Medium	Shorten the message or split task
Request exceeds context window	API-level message, often seen in logs	Medium	Clear session or reduce file scope
Context overflow	Older Claude Code versions use this term	High	Update CLI, then /clear
Unable to process: history truncated	Claude silently truncated past context	Low	Review conversation; start fresh if inconsistent
rate_limit_error / 429	Too many API requests (NOT a context issue)	High	Wait and retry; see Rate Limit section below
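
If you script around Claude Code, a small helper can sort these messages into the two families that matter: context problems and rate limits. This is a naive Python sketch. The marker strings come from the table above, while the function name and the substring matching are assumptions and not part of any Claude Code API.

```python
# Substrings taken from the error table; matching is case-insensitive.
RATE_LIMIT_MARKERS = ("rate_limit", "429", "too many requests")
CONTEXT_MARKERS = (
    "context window",
    "token limit exceeded",
    "conversation too long",
    "context too large",
    "input too long",
    "context overflow",
    "history truncated",
)


def classify_error(message):
    """Return 'rate_limit', 'context', or 'unknown' for an error message."""
    text = message.lower()
    # Check rate limits first: they need waiting, not /clear or /compact.
    if any(marker in text for marker in RATE_LIMIT_MARKERS):
        return "rate_limit"
    if any(marker in text for marker in CONTEXT_MARKERS):
        return "context"
    return "unknown"


print(classify_error("Context window limit exceeded"))  # context
print(classify_error("rate_limit_error / 429"))         # rate_limit
```

Plain substring matching is deliberately simple; a message that merely contains the digits 429 would be misread as a rate limit, so treat the result as a hint.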

Quick Fix Checklist

Use this when you need to get back to work immediately:

Type /clear in the Claude Code terminal to reset conversation history
Alternatively, type /compact to compress history and continue
If in VS Code, close and reopen the Claude Code panel
If the error recurs immediately, break your task into smaller steps
Avoid pasting entire large files — reference filenames instead
For large codebases, use CLAUDE.md to give context without pasting code
Update Claude Code to the latest version: npm update -g @anthropic-ai/claude-code
Check if this is actually a rate limit (429 error), not a context error — they look similar

Root Causes (Including Non-Obvious Ones)

1. Long Conversation History (Compounding Problem)

Every turn in a session is retained in context. A two-hour debugging session can easily accumulate 80,000–120,000 tokens before you've pasted a single file. The compounding nature means each new message is more expensive than the last.

2. Large File Pastes

Pasting a 1,000-line file directly into chat adds ~10,000 tokens instantly. Many developers do this repeatedly, stacking files in one session without realizing the cost.

3. Tool Call Result Bloat

This is the most underestimated cause. When Claude Code runs a bash command, reads a file, or does a codebase search, the entire result gets appended to context — not just what Claude needs. A find . -name "*.ts" on a large repo, or a cat on a 300-line config file, silently burns thousands of tokens.

4. Error Loops

If Claude repeatedly encounters an error and keeps trying to fix it with new approaches, each attempt adds to context. Unresolved loops can burn through the window in minutes. This is one of the most common patterns reported by developers in community discussions.

5. .cursorrules / System Config Files

In Cursor, large .cursorrules files are injected at the start of every session. A 500-line .cursorrules file can add 5,000–8,000 tokens to your base overhead. The same applies to Windsurf's config files.

6. Multiple Document Loads

Loading several READMEs, API docs, or config files at once multiplies consumption. Each document that gets read via tool-use is fully retained in context even after Claude has processed it.

7. Verbose Code Generation

Asking for boilerplate, full test suites, or complete file rewrites generates large outputs that fill context from Claude's side. Even if you don't paste anything, lengthy Claude responses consume tokens.

8. Outdated Claude Code Version

Older Claude Code versions handled context less efficiently, so running an outdated version can add overhead that newer versions avoid. Keep Claude Code updated.

Step-by-Step Solutions

Fix 1: Clear the Context with /clear

What it fixes: Resets the entire conversation history, freeing up the full context window.
When to use it: When you've completed a task phase and want a clean slate for the next one.
Difficulty: Easy

In the Claude Code terminal prompt, type /clear and press Enter.
Claude will confirm the history has been cleared.
You can now start fresh within the same session.
Re-provide any essential context (the goal of the task, the file you're working on).

Why it works: /clear discards all prior messages from in-memory context. Project files are untouched.

Expected outcome: Immediate resolution. Context resets to system prompt overhead only (~15,000–25,000 tokens).

Fix 2: Compact the Conversation with /compact

What it fixes: Summarizes and compresses conversation history to free up tokens while preserving continuity.
When to use it: When you're mid-task and don't want to lose progress or re-explain context.
Difficulty: Easy

Type /compact at the Claude Code prompt.
Claude summarizes earlier turns into a condensed memory block.
The summary replaces the full history, reducing token count significantly.
Continue your session normally.

Pro tip: You can pass a custom instruction: /compact Focus on the authentication bug we were fixing and the decisions made about the token refresh logic. This gives you a targeted summary rather than a generic one.

Why it works: Rather than deleting history, /compact distills it. Claude retains key decisions, code changes, and goals without holding every word.

Expected outcome: Context freed by 50–80%, session continues uninterrupted.
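
To see what a 50–80% reduction means in practice, here is a rough Python calculation. The 120,000-token history is a hypothetical figure, and the function is only an arithmetic sketch of the range quoted above.

```python
def after_compact(history_tokens, freed_percent):
    """Tokens left in history after /compact frees the given percentage."""
    return history_tokens * (100 - freed_percent) // 100


history = 120_000  # hypothetical session history before compacting
best_case = after_compact(history, 80)
worst_case = after_compact(history, 50)
print(best_case, worst_case)  # 24000 60000
```

Even in the worst case of that range, half of the history budget comes back without losing the thread of the task.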

Fix 3: Start a New Session

What it fixes: Completely resets all state and starts a fresh conversation.
When to use it: When you've finished a major task block, or when the context is so full that even compacting leaves little room.
Difficulty: Easy

Exit the current Claude Code session (Ctrl+C or close the terminal tab).
Navigate to your project directory.
Run claude to start a new session.
Provide a concise summary of where you left off (use a CLAUDE.md file for this).

Expected outcome: Full effective window available again.

Fix 4: Reduce File Size in Prompts

What it fixes: Prevents context overflow caused by pasting large files.
When to use it: Before pasting any file larger than ~150 lines.
Difficulty: Easy

Instead of pasting the full file, reference the filename: "Please read src/api/handler.ts and find the bug."
Claude Code can read files directly from your filesystem without you pasting them.
If you must paste, extract only the relevant section (the function or class in question).
Use line number ranges: "Focus on lines 45–90 of handler.ts."

Why it works: File-by-reference lets Claude's tool-use load only what's needed. Even then, be specific — "read the validateToken function in auth.service.ts" triggers a targeted read, not a full file read.

Fix 5: Split Large Tasks into Phases

What it fixes: Prevents context overflow from happening at all on complex, multi-hour tasks.
When to use it: For any task that spans multiple files, components, or logical stages.
Difficulty: Medium

Before starting, outline the task in phases (e.g., Phase 1: schema; Phase 2: API layer; Phase 3: tests).
Use one Claude Code session per phase.
At the end of each phase, write a brief summary to CLAUDE.md.
Begin the next phase with a new session, referencing the summary.

Expected outcome: No context overflow errors; cleaner, more focused Claude output per phase.

How to Cut Token Overhead by 40%+

Developers experimenting with Claude Code's token consumption have found practical techniques that meaningfully reduce overhead. Here are the most effective ones:

1. Write a Tight, Structured CLAUDE.md

The biggest single lever. A bloated CLAUDE.md (5,000+ characters) costs thousands of tokens on every session start. A tight one (under 2,000 characters) covering only critical conventions, tech stack, and must-know rules can cut base overhead by 20–30%.

Template structure that works:

# Project: [name]
## Stack: [language, framework, key deps]
## Conventions: [2-3 bullet points max]
## Avoid: [specific patterns Claude should not use]
## Key files: [list only what Claude will touch regularly]

2. Use --add-dir Instead of Pasting Context

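Claude Code's --add-dir flag adds a directory to the set Claude can access during a session, so files outside the launch directory can be read by reference instead of pasted. Note that it widens access rather than restricting it; to keep reads narrowly scoped, launch Claude from the module's own directory and add only what the task needs.

cd src/auth
claude --add-dir ../shared

Here ../shared is a placeholder for whatever extra directory the task requires. Starting in src/auth makes it less likely that Claude reads your entire repo when you only needed the auth module.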

3. Disable Auto-Tools for Simple Tasks

When you're asking a question that doesn't require file access, tell Claude explicitly:

"Answer this without reading any files: what's the standard pattern for JWT refresh token rotation in Node.js?"

This prevents Claude from reflexively triggering tool calls that add thousands of tokens to context.

4. Pipe Output Through head in Bash Commands

When Claude runs bash commands that might return large output, you can instruct it to limit output:

"Run npm test but only show the first 50 lines of output."

Or in your own bash calls:

npm test 2>&1 | head -50

Large test suite output is one of the worst context killers — a full Jest run can return 10,000+ tokens of output.
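
If you wrap test runs in a script rather than a shell pipeline, the same cap can be applied in Python. The sketch below is an assumption-laden stand-in: `head` and `run_and_trim` are hypothetical helper names, and a child Python process that prints 100 lines plays the role of a noisy test run.

```python
import subprocess
import sys


def head(text, max_lines=50):
    """Keep only the first max_lines lines of text, like `head -50`."""
    return "\n".join(text.splitlines()[:max_lines])


def run_and_trim(cmd, max_lines=50):
    """Run cmd, merge stderr into stdout (like 2>&1), and keep the first lines."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return head(result.stdout, max_lines)


# Stand-in for a noisy test run: a child process that prints 100 lines.
noisy = [sys.executable, "-c", "for i in range(100): print('line', i)"]
print(run_and_trim(noisy, max_lines=3))
```

Replacing `noisy` with `["npm", "test"]` would mirror the shell command above, assuming npm is installed and on your PATH.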

5. Use git diff Instead of Full File Reads

When reviewing changes, asking Claude to run git diff HEAD~1 instead of reading the modified files gives it exactly the changes it needs without loading full file contents.

# In your Claude Code prompt:
"Review the changes from git diff HEAD~1 and check for bugs."

6. Break Requests into Atomic Tasks

Instead of: "Refactor the entire auth module, add tests, and update the README."

Use: "Refactor only the validateToken function in auth.service.ts."

Atomic tasks use less context, produce better results, and make it easier to /compact or /clear between steps.

7. Avoid "Explain Your Reasoning" on Routine Tasks

Asking Claude to explain every decision it makes (common in learning workflows) doubles or triples output token consumption. Reserve detailed explanations for genuinely unclear decisions; let Claude work quietly for routine tasks.

8. Keep Error Messages Short When Pasting

When sharing an error, paste only the relevant error lines — not the full stack trace with 200 lines of node_modules frames. A focused error paste:

TypeError: Cannot read properties of undefined (reading 'user')
  at validateRequest (auth/middleware.ts:47:23)
  at Layer.handle [as handle_request] (express/router/layer.js:95:5)

A focused paste like this is far more token-efficient than 80 lines of full stack trace.
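
Trimming can be automated before you paste. The following Python sketch keeps the error message plus the first few frames from your own code and drops anything under node_modules. The function name, the sample trace, and the two-frame default are all illustrative assumptions.

```python
def trim_stack_trace(trace, max_frames=2, skip_marker="node_modules"):
    """Keep the error message and the first few frames outside skip_marker."""
    lines = trace.strip().splitlines()
    if not lines:
        return ""
    message, frames = lines[0], lines[1:]
    own_frames = [frame for frame in frames if skip_marker not in frame]
    return "\n".join([message] + own_frames[:max_frames])


# Hypothetical trace mixing application frames with dependency frames.
full_trace = """TypeError: Cannot read properties of undefined (reading 'user')
  at validateRequest (auth/middleware.ts:47:23)
  at Layer.handle (node_modules/express/lib/router/layer.js:95:5)
  at next (node_modules/express/lib/router/route.js:144:13)
  at handleLogin (auth/routes.ts:12:5)"""

print(trim_stack_trace(full_trace))
```

The result is the error line followed by the validateRequest and handleLogin frames, which is usually all Claude needs to locate the bug.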

9. Use Targeted grep Instead of Broad Searches

Instead of asking Claude to search the codebase for "anything related to authentication," give it a specific grep:

"Run grep -r 'refreshToken' src/ --include='*.ts' -l and list just the file names."

The -l flag returns only filenames, not file contents — a huge token saver.

10. Summarize at Natural Breakpoints

Don't wait for the context overflow error. After completing each logical chunk of work, run /compact proactively. Think of it as hitting Save — it's much less disruptive before you hit the limit than after.

IDE-Specific Fixes

VS Code

Open the Command Palette (Cmd/Ctrl+Shift+P) and search for Claude Code: New Session to restart without leaving the editor.
If the Claude Code sidebar panel freezes after an overflow error, close and reopen it via the sidebar icon.
Check the VS Code output panel (View > Output > Claude Code) for additional error details.
Update the Claude Code VS Code extension via the Extensions panel if you're on an older version.
Long open files in VS Code can be auto-included in context by some extensions. Check which Claude Code features are active under the extension settings.

Cursor

Cursor embeds Claude via its own context management layer. If you hit a context limit, use Cursor's New Chat button — not just a new message in the same chat.
Cursor's "Codebase Indexing" feature adds significant tokens. Toggle it off for focused, narrow tasks: Settings > Features > Codebase Indexing.
Audit your .cursorrules file. Anything over 200 lines is likely adding 3,000–5,000 tokens to every session. Trim aggressively.
Cursor sometimes re-sends file context silently. If context fills faster than expected, open Cursor's debug panel to check what's being injected.

Windsurf

Windsurf's Cascade panel has a visible context usage meter — watch it proactively and run Reset Conversation before hitting the limit.
Windsurf's "deep context" mode reads more files automatically. Disable it for simple edits: useful for exploration, expensive for targeted fixes.
Check Windsurf's workspace configuration for any auto-loaded files — these add to base overhead just like CLAUDE.md.

JetBrains IDEs (IntelliJ, PyCharm, WebStorm)

Use the Claude Code CLI in the embedded terminal rather than any plugin-based chat.
Run /compact or start a new CLI session from the terminal panel.
JetBrains plugin state does not auto-clear between sessions; restart the plugin if the UI becomes unresponsive after a context overflow.

LibreChat and Other Self-Hosted UIs

Self-hosted Claude frontends like LibreChat accumulate conversation history differently than Claude Code CLI. Multi-turn conversations in LibreChat send the full prior conversation on every request, which can cause context overflow even in short sessions if each turn produces large outputs.

Fix for LibreChat: Use conversation branching or start new threads for new topics. Check LibreChat's context window settings in the admin panel — you may be able to set a max history window to prevent accumulation.

Advanced Troubleshooting

Check Current Token Usage

Claude Code doesn't natively display a running token counter in all versions. To estimate:

# Rough estimate: count words in current session log
wc -w ~/.claude/sessions/current.log
# Multiply result by ~1.3 for approximate token count
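
The same estimate can be done in Python on any text you have at hand. This sketch uses the ~1.3 tokens per word rule of thumb from the comment above; it is an approximation, not a real tokenizer, and the function name is hypothetical.

```python
def estimate_tokens(text, tokens_per_word=1.3):
    """Approximate the token count of text from its whitespace-separated word count."""
    return round(len(text.split()) * tokens_per_word)


sample = "Refactor only the validateToken function in the auth service today"
print(len(sample.split()))      # 10 words
print(estimate_tokens(sample))  # 13
```

Code and stack traces tend to tokenize more densely than prose, so treat the figure as a lower bound for technical sessions.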

For precise tracking, run in verbose mode:

claude --verbose

Look for x-anthropic-input-tokens and x-anthropic-output-tokens in debug output.

Review Session Logs

# List recent Claude Code session logs
ls -lt ~/.claude/sessions/

# View the most recent session
cat "$HOME/.claude/sessions/$(ls -t "$HOME/.claude/sessions/" | head -1)"

Log analysis reveals which part of a session consumed the most tokens — usually large tool call results or file reads.

Validate CLAUDE.md Size

# Check CLAUDE.md size
wc -c CLAUDE.md

# Over 5,000 characters warrants trimming
# Over 10,000 characters is actively hurting your sessions
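
For a repeatable check, for example in a pre-commit hook, the thresholds can be encoded in a short Python script. This is a sketch: the function names and status wording are assumptions, and the cut-offs are the ones this article uses (2,000 target, 5,000 and 10,000 warning levels).

```python
from pathlib import Path


def claude_md_status(size):
    """Map a CLAUDE.md size to the article's size bands."""
    if size > 10_000:
        return "actively hurting sessions"
    if size > 5_000:
        return "warrants trimming"
    if size > 2_000:
        return "above the 2,000-character target"
    return "on target"


def check_claude_md(path="CLAUDE.md"):
    """Report the file size in bytes (what wc -c counts) and its status."""
    file = Path(path)
    if not file.exists():
        return f"{path}: not found"
    size = file.stat().st_size
    return f"{path}: {size} bytes, {claude_md_status(size)}"


print(check_claude_md())
```

Run it from the project root; it reports "not found" rather than failing when no CLAUDE.md exists.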

Environment Variable Checks

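# Confirm API key is set (without printing the secret to the terminal)
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API_KEY is set" || echo "ANTHROPIC_API_KEY is not set"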

# Check Claude Code version
claude --version

# Update to latest
npm update -g @anthropic-ai/claude-code

Authentication Issues That Mimic Context Errors

An expired or invalid API key sometimes produces error messages that look like context errors. Verify:

# Re-authenticate Claude Code
claude auth login

# Check current auth status
claude auth status

Network Diagnostics

# Test connectivity to Anthropic API
curl -I https://api.anthropic.com

# Check for proxy or firewall issues
curl -v https://api.anthropic.com/v1/messages --max-time 10

VPN and corporate proxies can interrupt long streaming responses, producing errors that look like context overflow but are actually network timeouts.

Permission Troubleshooting

If Claude Code's file-reading tools fail due to permissions, Claude may try to work around it by requesting you paste content — which consumes more context:

# Check read permissions on project directory
ls -la /your/project/directory

# Fix if needed (capital X also grants traverse permission on directories)
chmod -R u+rX /your/project/directory

Real-World Scenarios

Scenario 1: Claude Code worked fine yesterday but context fills almost immediately today

Cause: A recent Claude Code update increased the system prompt size, or a large CLAUDE.md was added to the project.
Investigation: Run claude --verbose and check x-anthropic-input-tokens on the first message. If it's already 20,000+ tokens before you say anything, the base overhead is the problem.
Fix: Trim CLAUDE.md. Check the Claude Code changelog for recent updates. This is the same pattern community members flagged as "limits shrinking": the limits didn't change, the overhead grew.

Scenario 2: Refactoring a large codebase causes overflow halfway through

Cause: Claude reads multiple files per step across dozens of files, and every tool call result accumulates in context.
Fix: Use /compact at natural breakpoints (after completing each module). Split refactoring into per-module sessions and track progress in CLAUDE.md. Launch Claude from the directory of the module being worked on to keep reads focused there.

Scenario 3: Context fills during a debugging session with error loops

Cause: Claude tries multiple approaches to fix a bug, each producing verbose output. 10 iterations × 2,000 tokens each = 20,000 tokens burned on failed attempts.
Fix: When Claude hasn't resolved a problem in 3 attempts, intervene. Run /compact and restate the problem with explicit constraints: "The previous three approaches all failed because X. Focus only on Y."
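
The saving from stepping in early is easy to quantify. A small Python sketch using the figures from this scenario (2,000 tokens per attempt, intervention after 3 failures); the function names are hypothetical.

```python
def loop_cost(attempts, tokens_per_attempt=2_000):
    """Tokens consumed by a run of failed fix attempts."""
    return attempts * tokens_per_attempt


def should_intervene(failed_attempts, max_attempts=3):
    """True once the loop has failed often enough to warrant stepping in."""
    return failed_attempts >= max_attempts


unmanaged = loop_cost(10)   # loop left to run for 10 iterations
managed = loop_cost(3)      # loop stopped after the third failure
print(unmanaged, managed, unmanaged - managed)  # 20000 6000 14000
```

Stopping at the third failure leaves 14,000 tokens for a reframed attempt instead of spending them on more variations.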

Scenario 4: Pasting API documentation causes immediate overflow

Cause: A large OpenAPI spec, README, or documentation file was pasted directly into chat.
Fix: Save it as a file in the project (docs/api.md) and ask Claude to read a specific section: "Read docs/api.md and find the endpoint for user authentication." File reading is far more token-efficient than pasting.

Scenario 5: Context overflow in a CI/CD pipeline using Claude Code

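Cause: Automated scripts run long Claude Code sessions without clearing context between tasks.
Fix: Run each pipeline step as its own short, scoped Claude Code invocation so that no history carries over between tasks. If you plan to use a flag such as --no-history, confirm it appears in claude --help for your installed version first.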

Scenario 6: Rate limit error mistaken for context overflow

Cause: Sustained heavy Claude Code usage triggers API rate limiting (HTTP 429), which produces error output that can be confused with context overflow.
Symptoms: Error happens consistently after a period of heavy use; subsequent requests also fail even after /clear; errors resolve after waiting 60+ seconds.
Fix: See the Rate Limit vs Context Limit section below.

Rate Limit vs Context Limit — Know the Difference

These two errors are frequently confused. They have different causes and different fixes.

Comparison	Context Window Full	Rate Limit (429)
Cause	Too many tokens in the conversation	Too many API requests in a time window
Error text	"context window limit exceeded", "conversation too long"	"rate_limit_error", "429", "Too Many Requests"
Fix	/clear, /compact, new session	Wait 60–120 seconds, then retry
Persists after /clear?	No, /clear resolves it	Yes, /clear doesn't help
Affected by session length?	Yes — longer sessions hit it	Not directly — depends on request frequency
Affected by file size?	Yes — large files accelerate it	No
Resolves with time alone?	No	Yes — rate limits reset automatically

If you run /clear and the error immediately returns on your first new message, you're hitting a rate limit, not a context limit.
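
For scripted use, the wait-then-retry fix from the table can be sketched in Python. Everything here is an assumption for illustration: `RateLimitError` is a hypothetical exception standing in for whatever your client raises on HTTP 429, and the 60 and 120 second waits mirror the range given above.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client's HTTP 429 error."""


def retry_after_wait(call, waits=(60, 120), sleep=time.sleep):
    """Call `call`, pausing for each wait in turn after a rate limit error."""
    for wait in waits:
        try:
            return call()
        except RateLimitError:
            sleep(wait)
    return call()  # final attempt; a further rate limit error propagates


# Demo with a fake request that is rate limited twice, then succeeds.
attempts = {"count": 0}


def fake_request():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


recorded_waits = []
print(retry_after_wait(fake_request, sleep=recorded_waits.append))  # ok
print(recorded_waits)  # [60, 120]
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` performs the actual waiting.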

Rate limits on the Claude API vary by plan tier. Claude Pro and Max plans have higher rate limits, but even these can be hit during intensive coding sessions with rapid back-and-forth. The solution is to slow down your request frequency or upgrade your plan tier.

Common Mistakes

Mistake	Consequence	Better Approach
Pasting entire files repeatedly	Context fills within minutes	Reference files by name; let Claude read them
Ignoring /compact until hitting the limit	Disruptive restart mid-task	Run /compact proactively after each logical chunk
Asking for exhaustive explanations on routine tasks	Claude's verbose responses burn tokens too	Use "briefly explain" or skip explanations on routine work
Running all tasks in one session	Overflow mid-task, context coherence degrades	Phase your work; use separate sessions per module
Bloated CLAUDE.md	Every session starts heavy; 10–20K tokens pre-burned	Keep CLAUDE.md under 2,000 characters
Large .cursorrules or Windsurf config files	Same as above, IDE-side	Audit and trim config files regularly
Working with minified or bundled files	Even worse token density than source	Always work with source files, never bundles
Confusing rate limits with context errors	Wrong fix wastes time	Check if /clear resolves it; if not, it's rate limiting
Not updating Claude Code CLI	Older versions have worse context handling	npm update -g @anthropic-ai/claude-code regularly
Pasting full stack traces	80-line stack traces = thousands of tokens for no gain	Paste only the relevant error line + immediate caller

Prevention & Best Practices

Design Sessions for Token Efficiency

Treat each Claude Code session as a focused work block. Define a clear goal at the start, complete it, then close the session. This keeps context usage predictable and prevents the gradual drift that causes unexpected overflow.

Use CLAUDE.md Strategically

CLAUDE.md is loaded at the start of every session. Use it for: project coding conventions, key architectural decisions, tech stack summary, and recurring task patterns.

Avoid using it for: full documentation, extensive code examples, anything that changes task to task, or long setup instructions that only apply occasionally.

Target size: Under 2,000 characters. Test the impact: run a session with your current CLAUDE.md in verbose mode and check the input token count on the very first turn.

Set a Compaction Habit

Run /compact after completing any significant task within a session — before starting the next one. This is equivalent to saving your work: it preserves continuity without letting history grow unbounded.

Scope File Access Deliberately

Be specific about what you ask Claude to read:

✅ "Read the validateToken function in auth.service.ts and identify the bug."
❌ "Read all my auth files and look for problems."

Specific reads return less data, trigger fewer tool calls, and produce more targeted responses.

Treat Bash Output as a Token Resource

Every command Claude runs has output. Before asking Claude to run broad commands (tests, linters, builds), consider whether you need all the output. Use head, tail, grep, or wc -l to constrain output length.

Monitor Session Duration

Long sessions (60+ minutes of active back-and-forth) almost always approach context limits. Start fresh sessions for new task phases regardless of whether you've hit the limit — don't wait for the error.

Comparison: Context Management Strategies

Strategy	Token Savings	Continuity Preserved	Difficulty	Best For
/clear	Maximum (100%)	No	Easy	Starting a new task phase
/compact	High (50–80%)	Yes (summary)	Easy	Mid-task recovery
/compact with custom instruction	High (50–80%)	Yes (focused)	Easy	Mid-task with specific continuity needs
New session	Maximum (100%)	Partial (via CLAUDE.md)	Easy	End of day / new feature
Reduce file pastes	Preventive	Full	Easy	All sessions
Phase-based sessions	Preventive	Partial (via notes)	Medium	Large projects
--add-dir scoping	Preventive	Full	Easy	Large codebases
Trim CLAUDE.md	Preventive (base overhead)	Full	Easy	Initial setup
--no-history flag	Maximum	No	Easy	CI/CD pipelines

Troubleshooting Decision Tree

Seeing a context or token error?
│
├── Does the error say "rate_limit" or "429"?
│   └── YES → Wait 60–120 seconds → Retry (not a context issue)
│
├── Are you mid-task and need to continue?
│   └── YES → Run /compact → Continue session
│
├── Are you at a natural stopping point?
│   └── YES → Run /clear or start new session
│
├── Does it overflow immediately on a NEW session?
│   └── YES → Check CLAUDE.md size → Run in --verbose mode
│             → If base tokens > 20K, trim CLAUDE.md or update Claude Code
│
├── Are you pasting large files?
│   └── YES → Stop pasting → Reference files by name instead
│
├── Is it a CI/CD pipeline?
│   └── YES → Add session clearing between pipeline steps → Use --no-history
│
└── Did /clear not fix it?
    └── YES → Check auth (`claude auth status`) and network connectivity
              → Likely a rate limit or auth issue, not context
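
The tree can also be written as a first-match function, which is handy if you want a wrapper script to print a suggestion. This Python sketch is one possible encoding: the parameter names are hypothetical, and the order in which branches are checked is an assumption, since the tree itself does not rank them.

```python
def next_step(
    error_text,
    *,
    clear_failed=False,
    new_session=False,
    ci_pipeline=False,
    pasting_large_files=False,
    mid_task=False,
):
    """Return the suggested action for a context or token error."""
    text = error_text.lower()
    if "rate_limit" in text or "429" in text:
        return "Wait 60-120 seconds, then retry"
    if clear_failed:
        return "Check auth status and network connectivity"
    if new_session:
        return "Check CLAUDE.md size and run in verbose mode"
    if ci_pipeline:
        return "Clear session state between pipeline steps"
    if pasting_large_files:
        return "Reference files by name instead of pasting"
    if mid_task:
        return "Run /compact and continue"
    return "Run /clear or start a new session"


print(next_step("rate_limit_error"))
print(next_step("Conversation too long", mid_task=True))
```

Rate limits are checked first because none of the context fixes help with them.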

Expert Insights

The overhead nobody budgets for:

System prompts, tool definitions, and multi-turn accumulation can consume 30,000–50,000 tokens before you've pasted anything. The developers who manage context best budget for this overhead from the start — they mentally allocate 25% of the window to infrastructure and plan their actual work within the remaining 75%.

On CLAUDE.md: This file is the highest-leverage configuration in Claude Code. A well-designed one (under 2,000 characters, focused) eliminates 80% of re-explanation overhead across sessions. A bloated one (over 10,000 characters) spends thousands of tokens of your context window on every session before you type your first message; at roughly 4 characters per token, 10,000 characters is about 2,500 tokens.

On error loops: The most expensive pattern in real-world Claude Code use is the unmanaged error loop — Claude tries fix A, fails, fix B, fails, and so on for 10 iterations. Each attempt adds thousands of tokens. If Claude hasn't resolved a problem in 3 attempts, intervene. Use /compact and reframe the problem with explicit constraints rather than letting Claude keep trying variations.

On rate limits vs context limits: These are routinely confused, even by experienced developers. The key test: if /clear resolves the issue, it was context. If the error persists immediately after /clear, it's rate limiting. Treating a rate limit like a context error wastes time; treating a context error like a rate limit (just waiting) also wastes time.

FAQ

1. What is the Claude Code context window size?

Claude Code uses Claude Sonnet and Opus models with a raw context window of up to 200,000 tokens. However, the effective usable window is smaller — system prompts, tool definitions, and multi-turn history overhead can consume 15,000–50,000 tokens before you paste anything. Practically, plan for roughly 150,000–170,000 tokens of usable space per session.

2. Does clearing the context window delete my files?

No. The context window is an in-memory conversation state only. Running /clear, /compact, or starting a new session has no effect on your project files, local filesystem, or code Claude has already written. Files modified by Claude during a session remain modified — only the conversation history is reset.

3. What is the difference between /clear and /compact?

/clear deletes all conversation history entirely. /compact summarizes and compresses it, preserving a condensed record. Use /clear when starting a new task; use /compact when you need to continue the current task but are running low on context. You can pass a custom focus to /compact: /compact Summarize only the auth bug investigation and what we decided.

4. Why does my context fill so fast? My limit seems to have shrunk.

It almost certainly didn't shrink. What changed is likely the base overhead — newer Claude Code versions have larger system prompts, your CLAUDE.md may have grown, or your .cursorrules / IDE config added more tokens. Run claude --verbose and check input tokens on your very first message. If it's 20,000+ before you've said anything, overhead is the issue.

5. Can the "context window full" error corrupt my code?

No. If a context overflow error occurs mid-generation, Claude stops generating but doesn't corrupt files already written. Any partial output should be reviewed, but underlying files are not damaged by the error.

6. Is there a way to see how many tokens I've used?

Not every Claude Code version shows a visible token meter, but claude --verbose shows API-level token counts in debug output. Look for x-anthropic-input-tokens on each turn. You can also estimate from session log size: run wc -w ~/.claude/sessions/current.log and multiply the word count by about 1.3 to approximate the token count.
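The word-count estimate can also be scripted. The sketch below is a rough approximation only: the 1.3 tokens-per-word ratio is the rule of thumb from the answer above, not a real tokenizer, and the function names are hypothetical. The log path is passed in by the caller, since its location may differ between versions.

```python
from pathlib import Path

TOKENS_PER_WORD = 1.3  # rule-of-thumb ratio from the answer above


def estimate_tokens(text: str, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Approximate token count from whitespace-separated words."""
    words = len(text.split())
    return round(words * tokens_per_word)


def estimate_file_tokens(path: str) -> int:
    """Approximate token count for a text file such as a session log."""
    # errors="replace" keeps the estimate going if the log has stray bytes.
    text = Path(path).expanduser().read_text(encoding="utf-8", errors="replace")
    return estimate_tokens(text)
```

For example, estimate_file_tokens("~/.claude/sessions/current.log") applies the same calculation to the log mentioned above, assuming that file exists on your machine.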

7. What's the difference between a context limit error and a rate limit error?

Context limit: too many tokens in the conversation (fixed by /clear or /compact). Rate limit: too many API requests in a time window, indicated by a 429 HTTP status or rate_limit_error (fixed by waiting 60–120 seconds). If /clear doesn't solve your error, you're likely hitting a rate limit, not a context limit.
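The test above can be written down as a small decision function. This is a minimal sketch of the article's rule of thumb, not a Claude Code API: the function name, its parameters, and the returned labels are assumptions chosen for illustration.

```python
from typing import Optional


def classify_limit_error(
    status_code: Optional[int] = None,
    error_type: Optional[str] = None,
    persists_after_clear: Optional[bool] = None,
) -> str:
    """Return 'rate_limit', 'context_limit', or 'unknown'.

    A 429 status or rate_limit_error is treated as decisive. Otherwise the
    /clear test decides: an error that survives /clear points to rate
    limiting, and one that disappears points to context.
    """
    if status_code == 429 or error_type == "rate_limit_error":
        return "rate_limit"
    if persists_after_clear is True:
        return "rate_limit"
    if persists_after_clear is False:
        return "context_limit"
    return "unknown"


# Suggested next step for each label, following the answer above.
NEXT_STEP = {
    "rate_limit": "wait 60-120 seconds, then retry",
    "context_limit": "run /clear or /compact",
    "unknown": "run /clear and see whether the error persists",
}
```

Looking up the result in NEXT_STEP gives the matching remedy, which keeps the two failure modes from being treated with each other's fix.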

8. Why does Claude seem to "forget" earlier decisions mid-session?

As context approaches the limit, the model silently truncates the oldest parts of the conversation to stay within the window. If Claude seems inconsistent or forgetful about decisions made early in a session, silent truncation is the likely cause. Run /compact to get an explicit summary rather than silent dropping of old context.

9. Does the IDE I use affect how fast context fills?

Yes, indirectly. Cursor's codebase indexing, Windsurf's deep context mode, and large IDE config files (.cursorrules) all add to base context overhead. Claude Code CLI alone has the lowest overhead. If you're hitting limits faster in an IDE than in the terminal, audit what the IDE is injecting automatically.

10. Can a VPN cause context window errors?

Not directly. A VPN doesn't cause context overflow. But it can cause network timeouts that interrupt long streaming responses, producing errors that superficially resemble context errors. If you're on a VPN and consistently seeing failures on longer responses (but not short ones), test with the VPN disabled.

11. What happens if I ignore an early context warning and keep going?

Claude's response quality degrades significantly near the token limit. Responses become truncated, earlier context gets silently dropped, and Claude may start producing output that contradicts decisions made earlier. Run /compact at the first warning rather than waiting for the error.

12. Is there a way to increase the context window?

Not via user configuration. The window size is a property of the deployed model. Claude Code automatically uses the highest-context model available. What you can control is how much of that window gets consumed by overhead — trimming CLAUDE.md, scoping file reads, and compacting regularly all maximize your effective usable space.

13. Does context reset between different projects?

Yes. Starting Claude Code in a new project directory starts a new session with empty conversation history. However, both the global ~/.claude/CLAUDE.md and the project-level CLAUDE.md are loaded at session start, so global settings carry over. If your global CLAUDE.md is large, it affects all projects.

14. Can I save and restore a session?

Not natively as a restore-and-continue feature. The most reliable approach is a well-maintained CLAUDE.md that captures persistent context, plus brief notes you add at session end. Some developers keep a notes/session-log.md in their project that they update manually at natural breakpoints.

15. How do I handle this on large refactoring projects without losing track?

Use phase-based sessions: one session per module or logical unit. After each session, update CLAUDE.md with what was completed and key decisions. Use scoped file access (--add-dir) per session. Run /compact at internal breakpoints. This workflow scales to codebases of any size, because what matters is not the size of the project but how tightly you scope each session's context requirements.
