Why does Claude Code keep saying the session expired?

This happens when your authentication token becomes invalid, expired, or corrupted. It can also be caused by macOS keychain locks, VPN interference, or OAuth callback timeouts.

How do I reconnect Claude Code?

The fastest way to reconnect is to sign out, clear your cached credentials using 'rm -rf ~/.claude/cache', and then complete the full login flow again.

What is an OAuth callback timeout?

It occurs when the browser takes longer than 60 seconds to pass the authentication token back to the CLI, often due to too many open tabs or a slow network connection.

Why does Claude Code say 'installation failed' but then 'installation complete'?

This contradictory message almost always means a stale lock file exists from a previous interrupted install. The installer aborts due to the lock but then partially completes a different step, giving you two contradictory status messages. Remove the lock file at ~/.local/state/claude/locks/ and reinstall.

I get claude: command not found even after a successful install. What's wrong?

Your terminal session hasn't loaded the updated PATH that the installer added to your shell config. Open a new terminal window, or run source ~/.zshrc (or ~/.bashrc on Linux). If that doesn't work, confirm that npm's global bin directory is actually in your PATH.

Do I need sudo to install Claude Code?

No — and using sudo will likely cause more problems. Set your npm global directory to a user-owned path with npm config set prefix ~/.npm-global, add that to your PATH, and then install without sudo.

Claude Code installation fails with a TLS/SSL error. How do I fix it?

This happens in corporate environments that use SSL inspection. Export your company's CA certificate path: export NODE_EXTRA_CA_CERTS=/path/to/ca.pem. Your IT team can provide the certificate file.

The Claude Code installer just hangs and never completes. What's happening?

A hanging installer almost always means a network block. The installer downloads from storage.googleapis.com. Set your proxy with export HTTPS_PROXY=http://your-proxy:port and retry, or ask IT to whitelist that domain.

I'm on Windows. Should I use npm or the curl installer?

Use PowerShell with npm install -g @anthropic-ai/claude-code. Don't use Git Bash; it lacks TTY support and will cause raw mode errors. If you're in WSL, use the curl installer within your Linux environment.

How do I know if Claude Code installed correctly?

Run claude --version to confirm the binary is accessible, then run claude doctor for a full diagnostic report. Both should return clean results with no errors before you start using the tool in a real project.

How to Fix Claude Code 429 Too Many Requests Error (2026)

May 28, 2026 by

aliakram

Introduction

If you use Claude Code regularly, you have almost certainly seen the dreaded Claude code rate limit error kill your session at the worst possible moment. One second Claude is helping you debug a tricky async race condition the next you are staring at this:

429 Too Many Requests

RateLimitError: Rate limit exceeded. Please retry after 60 seconds.

The good news is that the vast majority of these errors are temporary and completely fixable. This guide covers everything what the error means under the hood, all the types of API rate limits, step-by-step fixes you can apply right now, and proven strategies to prevent cloud throttling from ever breaking your workflow again.

What Is the Claude Code Rate Limit Error?

The claude code rate limit error is triggered when Anthropic's API detects that your usage has exceeded a defined threshold within a short time window. It is not a bug — it is a deliberate protection mechanism to maintain service stability across all users worldwide.

Common error messages you may see:

429 Too Many Requests

RateLimitError: Rate limit exceeded

usage_limit_exceeded: Monthly usage cap reached

Claude is unable to respond right now due to high usage

Error 529: Overloaded (Anthropic-side, not your fault)

PRO TIP: Error 529 is different; it is an Anthropic server overload, not your rate limit.

It will resolve on its own. Check https://status.claude.com first before debugging locally.

Run /doctor in Claude Code to rule out local config issues within 30 seconds.

Types of Claude API Rate Limits

Anthropic's rate limit system is layered. Understanding which ceiling you have hit is essential to picking the right fix.

Limit Type	What It Measures
Requests Per Minute (RPM)	Number of API calls made per minute
Input Tokens Per Minute (ITPM)	Volume of text sent to Claude per minute
Output Tokens Per Minute (OTPM)	Volume of text returned by Claude per minute
Daily Token Budget	Total tokens consumed in a 24-hour period
Monthly Usage Cap	Hard spending cap tied to your billing plan

Each limit type has a different resolution path. The retry-after value in the error response header tells you exactly how long to wait to read it before doing anything else.

Why Claude Code Rate Limit Errors Happen

1. Massive Context Windows

Every request you send includes your full conversation history. Uploading a 4,000-line file early in a session means every subsequent message pays the token cost of that file repeatedly. Large repositories and long logs are the number-one cause of burning through ITPM limits fast.

2. Multiple Concurrent Sessions Sharing One API Key

Anthropic enforces limits per API key, not per terminal window. Running Claude Code in three VS Code windows, a CI script, and a background automation all on the same key means their usage is pooled. One heavy job can starve all the others.

3. Agentic Workflows Firing Hidden API Calls

Modern Claude Code workflows — multi-step debugging, file editing chains, automated test-and-fix loops — can trigger several API calls behind the scenes for what looks like one user action. A complex refactor task may actually be 15 API calls.

4. Low Usage Tier (New Accounts)

Anthropic uses a tier system (Tier 1 through Tier 4). New accounts start at Tier 1 with the most restrictive limits. Tiers increase automatically as you reach spend thresholds but until then, heavy use will constantly brush the ceiling. You can check your current tier at console.anthropic.com > Settings > Limits.

5. Peak Usage Hours (Shared Infrastructure)

During peak global usage periods, even paid accounts may experience tighter throttling due to shared infrastructure load. If you consistently hit limits at the same time each day, try shifting intensive sessions to off-peak hours.

Step-by-Step Fix Guide

Step 1: Triage First — Check Status and Run /doctor

Before touching any configuration, rule out a platform-wide incident:

Go to https://status.claude.com — Anthropic publishes a 90-day uptime history.
Inside Claude Code, run /doctor — it checks installation health, malformed settings JSON, MCP config errors, and keybinding issues in about 30 seconds.
Check the error message for error code 529 (server overload, not your limit) vs 429 (your rate limit).

Step 2: Read the retry-after Header

Do NOT retry immediately. Every immediate retry can extend your cooldown window. The retry-after header in the 429 response tells you the exact wait time — honor it.

# Read the retry-after value from the error response

HTTP/1.1 429 Too Many Requests

retry-after: 60

anthropic-ratelimit-requests-remaining: 0

anthropic-ratelimit-tokens-remaining: 0

# Respect it. Wait, then retry once.

Step 3: Use /clear Between Tasks

The most impactful zero-cost fix. Every exchange in your session adds to the context payload sent on the next request. Clearing session context between unrelated tasks directly reduces your per-request token cost.

Step 4: Reference Specific Code — Not Entire Files

Surgical file references massively reduce token consumption. Compare these two approaches:

Bad Approach	Better Approach
Upload entire 4,200-line models.py	@models.py#120-180 (specific function only)
Paste full repo README into every prompt	Create CLAUDE.md with persistent project context
Ask Claude to review the whole codebase	Ask Claude to review one file or function at a time

Step 5: Implement Exponential Backoff in Scripts

Any programmatic integration should have retry logic built in from day one. Here is a production-ready exponential backoff implementation:

import anthropic

import time

client = anthropic.Anthropic()

def call_claude_with_backoff(prompt, max_retries=5):

for attempt in range(max_retries):

try:

return client.messages.create(

model='claude-opus-4-5',

max_tokens=1024,

messages=[{'role': 'user', 'content': prompt}]

)

except anthropic.RateLimitError as e:

if attempt == max_retries - 1:

raise

# Read retry-after if available, else use exponential backoff

wait = int(e.response.headers.get('retry-after', 2 ** attempt + 1))

print(f'Rate limited. Waiting {wait}s (attempt {attempt+1}/{max_retries})...')

time.sleep(wait)

Step 6: Separate API Keys by Workload

One API key for everything is the most common rate limit mistake. Create separate keys for different purposes:

Interactive Claude Code sessions (your daily dev work)
CI/CD pipeline integrations
Automated batch processing scripts
VS Code or IDE extensions
Any shared team tooling

Step 7: Use the Message Batches API for Non-Urgent Work

Anthropic's Message Batches API processes requests asynchronously and uses a separate, more permissive quota that does not count against your synchronous RPM limits. Use it for:

Bulk code documentation generation
Offline code review runs
Large-scale test generation
Any task where real-time response is not required

Step 8: Check and Upgrade Your Tier

If you have applied all the above and still hit limits regularly, your usage has grown beyond your plan tier. Visit console.anthropic.com > Settings > Limits to see your current RPM, ITPM, OTPM, and daily caps. Higher tiers unlock significantly larger budgets automatically as your spend history grows — or you can contact Anthropic Sales to request a custom limit increase.

Best Solutions Ranked by Impact

Solution	Impact vs Effort
Exponential backoff in scripts	Highest impact / Lowest effort — do this first
Use /clear between tasks	High impact / Zero effort — use always
Surgical file references (@file#L1-L80)	High impact / Low effort — habit to build
Separate API keys per workload	High impact / Medium effort — one-time setup
Create CLAUDE.md context file	Medium-high impact / Low effort — one-time setup
Use Message Batches API for bulk work	High impact / Medium effort — for pipelines
Shift heavy work off peak hours	Medium impact / Zero effort — quick win
Upgrade to higher tier / contact Sales	Highest long-term impact / Higher effort

Common Mistakes Developers Make

WARNING: Repeatedly retrying after a 429 can extend your cooldown — always honor retry-after.

Sharing one API key across interactive and automated workloads is the single biggest source of

constant throttling for teams. Separate them.

Ignoring the Error Type

Error 429 and error 529 look similar but need completely different responses. 429 means you exceeded a limit — fix your usage or wait. 529 means Anthropic's servers are under load — just wait, no code change needed.

Never Clearing Session Context

Many developers run an entire 8-hour work session inside a single Claude Code context. By hour 3, every request is carrying hours of conversation history. /clear is free, instant, and dramatically extends how long you can work before hitting TPM limits.

Dumping Entire Codebases Into Prompts

Feeding Claude a 3,000-line file when you need it to review a 40-line function burns 75x more tokens than necessary. Always target the smallest relevant context for each task.

Not Monitoring Usage in the Console

Anthropic's Console shows live token and request usage. Developers who never look at it are often surprised to find they burn their daily token budget by 11am. Check it weekly to spot patterns early.

Pro Tips for Heavy Claude Code Users

TIP 1: Write a CLAUDE.md project context file. Store your tech stack, conventions,

and current goals there. Claude reads it each session — saving hundreds of tokens

in repeated context-setting across every prompt.

TIP 2: Use streaming for long API responses. Streamed responses have better timeout

behavior under load and return partial results even if a request is interrupted.

TIP 3: For pipelines, pre-flight check your usage before big batch jobs:

usage = client.beta.usage.list()

This lets you catch approaching limits before they break a multi-hour job.

TIP 4: In VS Code with the Claude extension, each file you @-reference adds

to the token payload. Build the habit of being selective — reference functions,

not files. Reference files, not directories.

Real Developer Use Case

The Problem

A backend developer used Claude Code daily to work on a large Django REST API — 40+ files, complex ORM relationships, and custom middleware. Within three back-and-forth exchanges they reliably hit the 429 error, every single session.

Root Cause Analysis

models.py (4,200 lines) and serializers.py (1,800 lines) loaded into context at session start
CI/CD pipeline sharing the same API key as their interactive dev sessions
Never using /clear — a single session ran all day
No retry logic in their automated test-generation scripts

The Fix

Created CLAUDE.md with high-level architecture notes — eliminated repetitive context-setting
Used /clear after each logical task boundary (auth, ORM, views, serializers)
Switched to @models.py#L120-180 targeted references instead of full file uploads
Separated the CI pipeline onto its own API key with exponential backoff
Moved batch test generation to Message Batches API

Result

Zero rate limit errors across two full weeks of heavy development. Token usage dropped by roughly 60%. Response quality actually improved because Claude had tighter, more targeted context on each request. The entire fix took about 45 minutes to implement.

Frequently Asked Questions (7 Questions)

Q1: How long does a Claude Code rate limit error actually last?

For RPM-based limits, the retry-after header tells you the exact wait — usually 60 seconds. For daily or monthly usage caps, the reset happens at midnight UTC or your billing cycle date. The error message will usually indicate whether it is a short cooldown or a hard cap.

Q2: Is Claude Code rate limiting the same as the direct Anthropic API?

Yes. Claude Code uses the Anthropic API internally and is subject to the same tier-based rate limits tied to your API key. The interface you use does not change your limits — your usage tier does.

Q3: Will upgrading my Claude.ai subscription fix rate limit errors in Claude Code?

It depends on your setup. If you access Claude Code via claude.ai, a Pro or Team plan improves priority. If you use a direct API key from console.anthropic.com, your limits are governed by your Console tier, which is separate from your claude.ai subscription. Check both.

Q4: What is the difference between error 429 and error 529?

Error 429 (Too Many Requests) means you specifically exceeded your API usage limits — the fix is to reduce consumption or wait. Error 529 (Overloaded) is an Anthropic server-side overload affecting all users — it resolves on its own and checking status.claude.com confirms it.

Q5: Does the /clear command in Claude Code actually help with rate limits?

Yes, significantly. Every conversation turn you keep adds to the token payload sent with every subsequent request. Clearing session context resets this overhead. Heavy Claude Code users who use /clear regularly report dramatically fewer TPM-related throttle errors.

Q6: Can I increase my rate limits without upgrading my plan?

Your tier increases automatically as your API spend history grows (Tier 1 through Tier 4). You can also contact Anthropic Sales directly to request a custom rate limit increase for high-volume legitimate workloads. The request page is accessible through console.anthropic.com.

Q7: Does the Message Batches API bypass rate limits?

It uses a separate, more permissive async quota. Batch requests do not count against your real-time RPM limits, making it ideal for large-volume work like bulk code review, documentation generation, or test creation that does not require immediate responses.

Final Verdict

The claude code rate limit error is genuinely frustrating — but it is almost always solvable without spending more money or waiting indefinitely. The majority of 429 Too Many Requests errors are temporary RPM throttles that clear within 60 seconds once you stop hammering the endpoint.

The three things that matter most for any developer using Claude Code seriously:

Implement exponential backoff in every script that calls the API — this eliminates the majority of pipeline failures automatically.
Use /clear religiously between tasks and reference only the specific code you need — this directly extends how long you can work before hitting token limits.
Separate interactive dev sessions and automated pipelines onto different API keys — this prevents shared rate limit contention from starving your interactive work.

If you apply all three and still hit limits, check your tier in the Anthropic Console and consider whether your actual usage has simply outgrown your current plan. The usage data is right there — use it.SpaceX Buys Cursor AI for $60B: What It Means for Developers

FEATURED SNIPPET ANSWER

How do I fix the Claude Code rate limit error?

Wait 60 seconds (honor the retry-after header). Use /clear to reset session context.

Reference only specific files/functions, not entire codebases. Add exponential backoff

to any script using the API. Separate interactive and automated workloads onto

different API keys. Check your usage tier at console.anthropic.com > Settings > Limits.

For batch work, use the Message Batches API which has a separate, higher quota.

Schema FAQ — JSON-LD Structured Data

{

"@context": "https://schema.org",

"@type": "FAQPage",

"mainEntity": [

{

"@type": "Question",

"name": "What causes a Claude Code rate limit error?",

"acceptedAnswer": {

"@type": "Answer",

"text": "Exceeding RPM, ITPM, or OTPM limits — caused by large context windows,

multiple sessions sharing one API key, or agentic workflows firing

many hidden API calls."

}

{