
Claude Code Error 429: Complete Fix Guide (Rate Limit Exceeded)

June 9, 2026 by
aliakram

Table of Contents

  1. What Is Claude Code Error 429?

  2. Error Message Reference Table

  3. Quick Fix Checklist

  4. Root Causes

  5. How Token Counting Actually Works

  6. API Usage Tiers Explained

  7. Claude Subscription Limits — Free vs Pro vs Max

  8. Fix 1 — Wait for the Rate Limit Window to Reset

  9. Fix 2 — Reduce Context Size with /compact

  10. Fix 3 — Re-Authenticate Claude Code

  11. Fix 4 — Create a .claudeignore File

  12. Fix 5 — Upgrade Your API Usage Tier

  13. Fix 6 — Implement Exponential Backoff (API Users)

  14. Fix 7 — Enable Prompt Caching

  15. Advanced Troubleshooting

  16. Windows-Specific 429 Bug — Plan Usage Drain

  17. Real-World Scenarios

  18. Error 429 vs Error 529 — Key Differences

  19. Common Mistakes

  20. Prevention and Best Practices

  21. Troubleshooting Decision Tree

  22. FAQ

  23. Related Keywords and Entities

What Is Claude Code Error 429?

If you are working in Claude Code and see HTTP 429: Rate Limit Reached or "This request would exceed your account's rate limit", you have hit one of Anthropic's usage limits. The 429 status code is a standard HTTP response meaning "Too Many Requests."

The fastest fix in most cases: run /compact in your Claude Code session to shrink context size, then wait 60 seconds and retry. If that does not work, re-authenticate with claude logout && claude login.

This guide covers every variation of the error, why it happens (including the confusing case where your dashboard shows available quota), and every fix from beginner to advanced.

Error Message Reference Table

| Error Message | Meaning | Severity | Recommended Fix |
| --- | --- | --- | --- |
| HTTP 429: rate_limit_error | Generic rate limit hit | Medium | Wait, then retry |
| This request would exceed your account's rate limit | Organization-level RPM or ITPM exceeded | Medium | Reduce context; wait for reset |
| This request would exceed your organization's rate limit of 30,000 input tokens per minute | Tier 1 ITPM limit exceeded (most common) | Medium | /compact; upgrade to Tier 2 |
| Usage limit reached | Monthly or daily spend cap hit | High | Check Limits dashboard; upgrade tier |
| Quota exceeded | API tier quota exhausted | High | Upgrade tier or contact sales |
| Too many requests | RPM limit exceeded | Low–Medium | Wait 60 seconds; reduce request frequency |
| Rate limit reached for model | Per-model RPM limit hit | Medium | Switch to a lighter model; wait |
| API Error: Rate limit reached (CLI) | Claude Code CLI session blocked | Medium | Re-authenticate; clear cache; wait |
| API Error: Server is temporarily limiting requests (not your usage limit) | Server-side throttle (Windows bug) | Medium–High | Close all sessions; kill processes; wait |
| Your token may be exhausted from repeated retries | OAuth token flagged due to retry loops | High | claude logout && claude login |
| User has exceeded quota | Subscription message limit reached | High | Start fresh conversation; wait for reset |

Quick Fix Checklist

Follow these steps in order. Most users are unblocked by step 3.

  1. Check your context size. Run /stats in Claude Code. If active context exceeds 20,000 tokens, proceed to step 2.

  2. Run /compact. This compresses conversation history by 40–50% and often resolves 429s caused by token bloat.

  3. Wait 60 seconds and retry. Per-minute RPM and ITPM limits reset within one minute.

  4. Re-authenticate. Run claude logout && claude login in your terminal.

  5. Kill background Claude processes. Run pkill claude on macOS/Linux to stop any lingering sessions consuming your quota.

  6. Delete the local cache. Remove the .claude directory in your project folder.

  7. Check your Limits dashboard. Go to platform.claude.com/settings/limits to inspect your tier and usage.

  8. Disable VPN if active. VPNs can cause IP-level throttling separate from account limits.

  9. Check Anthropic status. Visit status.anthropic.com to rule out a 529 server-side incident.

  10. Create a .claudeignore file to stop Claude Code from indexing build artifacts and node_modules.

Root Causes

Claude Code error 429 has three distinct causes. Knowing which one you are facing determines the right fix.

1. Tokens Per Minute (TPM) Limit Exceeded

Claude enforces both Input Tokens Per Minute (ITPM) and Output Tokens Per Minute (OTPM) limits. Agentic coding sessions are the most common trigger because Claude Code reads files, generates plans, writes code, and verifies output, often in a single loop that consumes far more tokens than a simple chat prompt.

Running a large refactor on an unscoped repository is a common culprit. If Claude indexes your entire project including node_modules, dist, or build artifacts, a single session can burn through tens of thousands of tokens before you type a second prompt.

2. Requests Per Minute (RPM) Limit Exceeded

If your setup runs multiple concurrent Claude Code sessions, uses Claude Code inside a CI/CD pipeline, or calls the API through an automation tool (like an MCP server or agent framework), you may hit the RPM ceiling. All API keys in an organization share the same rate limit pool, a detail that surprises many teams when they scale from one developer to several.

3. Authentication Token Issues

A 429 can persist even after your per-minute limits have reset if your OAuth token has entered a degraded state, often caused by repeated retry loops that flag the token on Anthropic's backend. This is the "it was working yesterday" scenario. The dashboard may show available quota, yet every API call returns 429 immediately. Re-authentication resolves this.

How Token Counting Actually Works

This is the most misunderstood part of 429 errors, and understanding it explains why so many users hit limits unexpectedly.

When you send a message to Claude, whether through the web interface, Claude Code, or the API, Claude does not process only the text you just typed. The model receives your entire conversation context in every single request. That includes:

  • Every previous message in the current conversation

  • The system prompt (if any)

  • Tool call results and outputs

  • Any uploaded files or document contents

This means if your conversation is 50 messages deep, each new message sends all 50 as input tokens. As a conversation grows, each subsequent message becomes progressively heavier.

A practical example: if your conversation context is 25,000 tokens, and your tier's ITPM limit is 30,000 (Tier 1), a single new message will consume most of your per-minute token budget. A second message sent within the same minute triggers a 429 immediately even though you have technically only sent two messages.

This is why starting a fresh conversation, running /compact, or deleting old messages are the fastest fixes. They cut the accumulated context weight that is making every request expensive.
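To make the arithmetic concrete, here is a minimal sketch of how input tokens pile up when every request resends the full history. The per-message token counts and the 2,000-token system prompt are made-up illustrative numbers, the 30,000 ITPM figure is the Tier 1 value from the example above, and the model deliberately ignores output tokens and caching.

```python
def input_tokens_per_request(message_tokens, system_tokens=0):
    """Return the input tokens sent on each turn when full history is resent.

    message_tokens: tokens added to the history on each turn, in order.
    These are illustrative numbers, not measurements.
    """
    totals = []
    history = system_tokens
    for tokens in message_tokens:
        history += tokens        # the new turn joins the context
        totals.append(history)   # the whole context is sent as input
    return totals


def fits_in_minute(request_sizes, itpm_limit):
    """True if all requests sent within one minute stay within the ITPM limit."""
    return sum(request_sizes) <= itpm_limit


# Ten turns of 500 tokens each on top of a 2,000-token system prompt.
per_request = input_tokens_per_request([500] * 10, system_tokens=2000)
print(per_request[0], per_request[-1])  # 2500 7000
print(sum(per_request))                 # 47500

# Two requests against a 25,000-token context do not fit in 30,000 ITPM.
print(fits_in_minute([25_000, 25_000], itpm_limit=30_000))  # False
```

Each turn adds only 500 tokens, yet the ten requests together send 47,500 input tokens, which is why trimming the history pays off so quickly.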

API Usage Tiers Explained

Your Claude API organization's rate limit depends on your usage tier, which increases automatically from Tier 1 to Tier 4 as you reach certain spending thresholds. You can view your current tier at platform.claude.com/settings/limits.

Here are the current tier limits for Claude Sonnet 4.x (limits for other models differ — always check the official Rate Limits page):

| Tier | Deposit Required | RPM | ITPM | OTPM |
| --- | --- | --- | --- | --- |
| Tier 1 | $5 | 50 | 30,000 | 8,000 |
| Tier 2 | $40 | 1,000 | 450,000 | 90,000 |
| Tier 3 | $200 | 2,000 | 800,000 | 160,000 |
| Tier 4 | $400 | 4,000 | 2,000,000 | 400,000 |

Key point: All API keys under one organization share the same rate limit pool. Moving from Tier 1 to Tier 2 takes only $35 in additional prepaid credit (a deposit, not extra monthly spending) and, per the table above, raises RPM by 20x and ITPM by 15x. If you are regularly hitting 429s on Tier 1, this is almost always the best investment.

For needs beyond Tier 4, contact Anthropic Sales for custom rate limits.

Claude Subscription Limits: Free vs Pro vs Max

If you use Claude via the web interface or Claude.ai app rather than the API, your limits work differently; they are measured in messages and conversation sessions rather than tokens per minute.

| Plan | Monthly Cost | Messages / Period | Weekly Limits | Notes |
| --- | --- | --- | --- | --- |
| Free | $0 | ~40 short messages/day | N/A | Drops with longer conversations |
| Pro | $20 | ~45 messages / 5 hours | 40–80 hrs Sonnet 4 | 5-hour rolling window |
| Max 5x | $100 | ~225 messages / 5 hours | 140–280 hrs Sonnet, 15–35 hrs Opus | Auto-switches to Sonnet at 20% Opus usage |
| Max 20x | $200 | ~450 messages / 5 hours | 240–480 hrs Sonnet, 24–40 hrs Opus | Auto-switches to Sonnet at 50% Opus usage |
| Team | $25/user | Higher than Pro | Workspace limits | Shared org pool |

Important nuances:

  • The advertised message counts assume short, simple exchanges. Conversations with large file attachments, long history, or complex tool outputs consume capacity far faster.

  • Weekly limits were introduced in 2025 and create a hard ceiling on total weekly usage — you cannot bypass them by repeatedly waiting for 5-hour resets.

  • Max plans include automatic model switching (Opus → Sonnet) as you approach usage thresholds, rather than a hard cutoff.

Fix 1: Wait for the Rate Limit Window to Reset

What it fixes: RPM and ITPM limits exceeded in the current minute window.

When to use it: You get a 429 after sending several large requests in quick succession.

Difficulty: Beginner.

Steps:

  1. Stop sending requests.

  2. Wait 60 seconds. Capacity is replenished continuously, so a minute without requests restores your per-minute budget.

  3. Check the response headers if you have API access — anthropic-ratelimit-requests-reset and anthropic-ratelimit-tokens-reset show the exact reset timestamp.

  4. Retry a single small request to confirm you are unblocked.

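Why it works: Anthropic's API rate limits use a token bucket algorithm. Capacity is replenished continuously up to your maximum rather than reset at fixed intervals, so after about a minute of inactivity your per-minute quota is fully restored.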

Expected outcome: Normal Claude Code operation resumes immediately after the reset.
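The effect of waiting can be illustrated with a simplified token bucket, the algorithm Anthropic's API documentation names for its rate limiting. This is only a sketch of the general technique, not Anthropic's implementation: the class name, the explicit `now` argument (used to keep the example deterministic), and the assumption that the bucket starts full are all choices made for illustration.

```python
class TokenBucket:
    """Minimal token bucket: capacity refills continuously at a fixed rate."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.available = capacity   # assumption: the bucket starts full
        self.last_time = 0.0

    def _refill(self, now):
        elapsed = now - self.last_time
        self.available = min(
            self.capacity, self.available + elapsed * self.refill_per_second
        )
        self.last_time = now

    def try_consume(self, amount, now):
        """Spend `amount` units at time `now` (seconds); False models a 429."""
        self._refill(now)
        if amount > self.available:
            return False
        self.available -= amount
        return True


# 30,000 input tokens per minute, i.e. 500 tokens refilled each second.
bucket = TokenBucket(capacity=30_000, refill_per_second=500)
print(bucket.try_consume(25_000, now=0))   # True: the bucket starts full
print(bucket.try_consume(25_000, now=10))  # False: only 10,000 available
print(bucket.try_consume(25_000, now=40))  # True: refilled to 25,000
```

In this model a rejected request costs nothing, but it also does not speed up the refill, which is why pausing helps more than retrying immediately.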

Fix 2: Reduce Context Size with /compact

What it fixes: ITPM limits triggered by oversized conversation context.

When to use it: You are mid-session and /stats shows a large active context, or you are hitting 429s repeatedly during an agentic task.

Difficulty: Beginner.

Steps:

  1. In your Claude Code session, type /compact and press Enter.

  2. Claude compresses your entire conversation history in place, reducing it by approximately 40–50% while retaining key context.

  3. Wait a moment for the compression to complete.

  4. Continue your session.

Why it works: Oversized context means each request sends a huge number of input tokens. By compressing history, you dramatically reduce ITPM consumption per request.

Expected outcome: Immediate relief for the current session. For long-running sessions, run /compact approximately every 50 messages as a preventive measure.

Pro tip: Always run /stats before starting a major refactor. If active context is already above 20,000 tokens, run /compact proactively.

Fix 3: Re-Authenticate Claude Code

What it fixes: Persistent 429 errors that continue even after the rate limit window has reset; degraded or flagged OAuth tokens.

When to use it: Claude Code was working previously, your dashboard shows available quota, but every request still returns 429.

Difficulty: Beginner.

Steps:

  1. Open your terminal.

  2. Run: claude logout

  3. Once logged out, run: claude login

  4. Follow the browser-based authentication flow to generate a fresh token.

  5. Return to your project and resume your session.

Why it works: Repeated retry loops can flag an OAuth token on the backend, causing it to return 429 even when your account has quota available. A fresh login generates a clean token.

Expected outcome: Normal operation restored within a few minutes of re-authentication.

Fix 4: Create a .claudeignore File

What it fixes: Excessive ITPM consumption caused by Claude indexing irrelevant project files.

When to use it: You work on large repositories with build artifacts, dependency folders, or generated files. Best used as a permanent prevention measure.

Difficulty: Beginner.

Steps:

  1. In your project's root directory, create a file named .claudeignore.

  2. Add patterns for directories and files Claude should not index. A solid starter configuration:

node_modules/
dist/
build/
.next/
.nuxt/
coverage/
*.log
*.lock
*.min.js
*.min.css
.env
.env.*

  3. Save the file and restart your Claude Code session.

  4. Run /stats to confirm the active context is smaller.

Why it works: Without a .claudeignore, Claude Code may attempt to read dependency folders, compiled outputs, and log files — all of which add tokens to your context without helping the task at hand.

Expected outcome: Significantly reduced per-request token consumption. One engineering team reported a 90% decrease in rate limit interruptions and a 30% reduction in monthly API costs after implementing .claudeignore alongside a regular /compact habit.
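Before relying on the patterns, it can help to preview which paths they would exclude. The sketch below assumes gitignore-like semantics (a trailing slash matches a directory name anywhere in the path, other patterns match the file name); this guide does not specify the exact matching rules Claude Code applies, so treat the helper `is_ignored` as a hypothetical approximation rather than the tool's own behavior.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The starter patterns from the configuration above.
PATTERNS = [
    "node_modules/", "dist/", "build/", ".next/", ".nuxt/", "coverage/",
    "*.log", "*.lock", "*.min.js", "*.min.css", ".env", ".env.*",
]


def is_ignored(path, patterns=PATTERNS):
    """Approximate gitignore-style matching for a relative POSIX path."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory by name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatchcase(parts[-1], pattern):
            # File pattern: match against the file name only.
            return True
    return False


for candidate in ["src/app.ts", "node_modules/react/index.js", ".env.local"]:
    print(candidate, "->", "ignored" if is_ignored(candidate) else "kept")
```

Running this over a file listing of your repository gives a quick sanity check that source files are kept and generated files are dropped.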

Fix 5: Upgrade Your API Usage Tier

What it fixes: Hard quota ceilings that cannot be resolved with optimization alone.

When to use it: You are hitting 429s regularly despite efficient prompting and context management; your use case requires higher throughput.

Difficulty: Intermediate.

Steps:

  1. Go to platform.claude.com/settings/limits to see your current tier and usage.

  2. Usage tiers advance automatically (Tier 1 through Tier 4) as you reach spend thresholds. Review the Rate Limits page in Anthropic's docs for current tier thresholds and limits.

  3. To advance faster, increase your account's prepaid credit or usage spend to cross the next tier threshold.

  4. For limits beyond Tier 4, contact Anthropic Sales at anthropic.com/contact-sales to discuss custom rate limits.

Why it works: Each tier grants progressively higher RPM, ITPM, and OTPM ceilings. Moving from Tier 1 to Tier 4 provides substantially higher throughput across all limit types.

Expected outcome: Higher sustained throughput without 429 interruptions for legitimate heavy usage.

Fix 6: Implement Exponential Backoff (API Users)

What it fixes: 429 errors in applications and scripts built on the Claude API.

When to use it: You are building a product or automation on the Claude API and need resilient retry logic.

Difficulty: Intermediate.

Steps — Python example:

import random
import time

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def call_with_backoff(prompt, max_retries=5):
    """Send one message, retrying on 429 with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
        except anthropic.RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter
            wait = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)

Steps — using the retry-after header: When you receive a 429, check the retry-after response header. This tells you the exact number of seconds to wait before retrying. Using this value is more accurate than a fixed wait time.

Why it works: Exponential backoff with jitter prevents thundering herd problems — where multiple retries hit the API at the same moment and trigger another 429 immediately.

Expected outcome: Robust, self-healing API integrations that recover automatically from transient rate limits.
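The two approaches can be combined: prefer the server's retry-after value when one is present and fall back to exponential backoff with jitter otherwise. The helper below is a sketch under stated assumptions: `backoff_delay` and its parameters are hypothetical names, and it only handles the delay-in-seconds form of the header.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    retry_after: raw retry-after header value, if the response had one.
    A valid non-negative number of seconds is honored as-is; anything
    else falls back to capped exponential backoff plus up to 1s of jitter.
    """
    try:
        seconds = float(retry_after) if retry_after is not None else None
    except (TypeError, ValueError):
        seconds = None  # not a number; use the computed backoff instead
    if seconds is not None and 0 <= seconds < float("inf"):
        return seconds
    exponential = min(cap, base * (2 ** attempt))
    return exponential + rng.uniform(0, 1)


print(backoff_delay(0, retry_after="12"))  # 12.0: the server's hint wins
print(round(backoff_delay(2), 1))          # between 4.0 and 5.0
```

With the anthropic Python SDK, the header should be reachable inside the except block as e.response.headers.get("retry-after"), since the SDK's status errors carry the underlying HTTP response; verify this against the SDK version you use.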

Fix 7: Enable Prompt Caching

What it fixes: High ITPM consumption for applications that repeatedly send the same system prompts, reference documents, or shared context.

When to use it: You are building on the Claude API and your prompts include large static sections (system instructions, code files, documentation) that stay the same across many requests.

Difficulty: Intermediate.

Why this matters: For most Claude models, cached input tokens do not count toward your ITPM rate limit — only uncached tokens and cache creation tokens count. With an effective cache hit rate of 80%, you can process up to 5x more total tokens per minute than your ITPM limit appears to allow. This is one of the highest-leverage optimizations available.
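The 5x figure follows from simple arithmetic: if only the uncached share of input counts against the limit, total throughput is the limit divided by that share. The sketch below assumes cache reads are entirely free against ITPM and ignores cache creation tokens, so it gives an upper bound; the function name is illustrative.

```python
def effective_input_throughput(itpm_limit, cache_hit_rate):
    """Total input tokens per minute when cached reads are not rate limited.

    Assumes cache reads are free against ITPM and ignores cache creation
    tokens, so the result is an upper bound.
    """
    if not 0 <= cache_hit_rate < 1:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    return itpm_limit / (1 - cache_hit_rate)


# Tier 1 limit of 30,000 ITPM with an 80% cache hit rate.
print(round(effective_input_throughput(30_000, 0.8)))  # 150000
```

At a 50% hit rate the same formula gives only 2x, so the benefit depends heavily on how much of each prompt is truly static.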

How it works: Add cache_control breakpoints to stable sections of your prompt. Once cached, subsequent requests reuse the stored version at a fraction of the cost and without consuming ITPM.

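import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a coding assistant. Here is our full codebase context: ...",
            # Cache this large block; very short blocks fall below the
            # minimum cacheable length and are not cached
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Fix the authentication bug"}],
)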

Expected outcome: Dramatically reduced ITPM consumption for repeat sessions, lower API costs, and far fewer 429 errors during high-volume usage.

See Anthropic's prompt caching documentation for full implementation details.

Advanced Troubleshooting

Kill Lingering Background Processes

If Claude Code was interrupted mid-session, background processes may continue consuming your quota.

# macOS / Linux
pkill claude

# Verify no processes remain
ps aux | grep claude

Clear the Local Project Cache

# In your project root
rm -rf .claude

# Optionally clear global Claude config (this also removes your user-level settings)
rm -rf ~/.claude

After clearing, run claude login to re-authenticate.

Inspect Rate Limit Response Headers

When calling the API directly, examine these headers on any response — not just 429 responses — to monitor your quota proactively:

anthropic-ratelimit-requests-limit
anthropic-ratelimit-requests-remaining
anthropic-ratelimit-requests-reset
anthropic-ratelimit-tokens-limit
anthropic-ratelimit-tokens-remaining
anthropic-ratelimit-tokens-reset
retry-after

The *-reset headers contain ISO 8601 timestamps. The retry-after header gives you the exact wait time in seconds.
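As a rough illustration of reading these headers, the sketch below turns the remaining and reset values into a small summary. It takes a plain dict with lowercase keys and an explicit `now` so that it runs without network access; the header values shown are invented, and `summarize_rate_limits` is a hypothetical helper, not part of any SDK.

```python
from datetime import datetime, timezone

PREFIX = "anthropic-ratelimit-"


def parse_reset(value):
    """Parse a timestamp such as 2026-06-09T12:00:30Z into an aware datetime."""
    # fromisoformat() accepts a trailing "Z" only on newer Python versions,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_rate_limits(headers, now):
    """Return {kind: (remaining, seconds_until_reset)} for requests and tokens."""
    summary = {}
    for kind in ("requests", "tokens"):
        remaining = headers.get(f"{PREFIX}{kind}-remaining")
        reset = headers.get(f"{PREFIX}{kind}-reset")
        if remaining is None or reset is None:
            continue  # header pair missing; skip this limit type
        wait = max(0.0, (parse_reset(reset) - now).total_seconds())
        summary[kind] = (int(remaining), wait)
    return summary


# Invented header values, shaped like the list above.
example = {
    "anthropic-ratelimit-requests-remaining": "49",
    "anthropic-ratelimit-requests-reset": "2026-06-09T12:00:01Z",
    "anthropic-ratelimit-tokens-remaining": "0",
    "anthropic-ratelimit-tokens-reset": "2026-06-09T12:00:30Z",
}
now = datetime(2026, 6, 9, 12, 0, 0, tzinfo=timezone.utc)
for kind, (remaining, wait) in summarize_rate_limits(example, now).items():
    print(f"{kind}: {remaining} remaining, resets in {wait:.0f}s")
```

In this invented example the token limit is the one that has run out, which points at context size rather than request frequency.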

VS Code / Cursor / Windsurf Fixes

If you are using Claude Code through an IDE extension (Cursor, Windsurf, or a VS Code integration):

  1. Close all IDE windows to stop any background indexing.

  2. Restart the IDE.

  3. Check if the extension has a setting for "index on startup" or "auto-scan" and disable it for large repositories.

  4. Verify the extension is using your account credentials, not a shared or stale API key.

  5. Run claude logout && claude login from a terminal, then reload the IDE.

Check for Multiple Active Sessions

If you or your team has multiple Claude Code sessions open simultaneously, all of them share the same organization-level rate limit. Close sessions you are not actively using.

Environment Variable Checks

# Confirm ANTHROPIC_API_KEY is set correctly (API key users only)
echo $ANTHROPIC_API_KEY

# Should return your key starting with sk-ant-
# If empty, set it:
export ANTHROPIC_API_KEY="your-key-here"

Network Diagnostics

# Test connectivity to the Anthropic API
curl -I https://api.anthropic.com

# Show your public IP address to see whether traffic exits through a VPN or proxy
curl ifconfig.me

A VPN can cause IP-based throttling that compounds account-level rate limits. Disable it and retry.

Check Anthropic's Status Page

Before spending time debugging locally, confirm there is no ongoing incident:

https://status.anthropic.com

A 529 error (not 429) indicates a server-side capacity issue and requires no action on your part beyond waiting.

Windows-Specific 429 Bug: Plan Usage Drain

A specific bug has been reported on Claude Code for Windows (confirmed in issue #51291 on the official GitHub repository, filed April 2026). Users on the Max plan noticed that when the "API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited" banner appeared, a large chunk of their weekly plan usage was consumed even though they were not actively coding and nothing was running in the background.

Symptom: After seeing a rate limited banner, Settings → Usage shows a jump of ~26% session usage despite minimal interaction. The rate limit message is supposed to be separate from plan quota consumption.

Affected version: Claude Code 1.35 on Windows, Claude Max 20x plan, Opus model.

Workaround while the bug is being investigated:

  1. As soon as you see the rate limited banner, close all Claude Code windows immediately.

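  2. Terminate any remaining Claude Code processes. pkill claude works only in WSL or Git Bash; in a native Windows shell, end the processes from Task Manager instead.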

  3. Wait at least 5 minutes before reopening.

  4. Avoid leaving Claude Code open in an idle state if you are near usage limits.

Status: Filed as a bug with Anthropic. If you are on Windows and experiencing unexpected usage drain alongside 429 errors, report it at github.com/anthropics/claude-code/issues with your Claude Code version and Windows version.

Real-World Scenarios

Scenario 1: Claude Code worked yesterday, fails today

Symptom: Immediate 429 on every request. Dashboard shows available quota.

Cause: OAuth token entered a degraded or flagged state, possibly due to a crash or repeated retries during yesterday's session.

Fix: claude logout && claude login. If the problem persists, clear the .claude cache directory and re-authenticate.

Scenario 2: 429 mid-refactor on a large codebase

Symptom: Claude Code completes several steps of a refactor, then hits a 429 partway through.

Cause: The conversation context grew large enough that each subsequent request consumed a burst of ITPM, exceeding the per-minute limit.

Fix: Run /compact immediately. Add a .claudeignore file to prevent re-indexing of non-source files. Resume the task.

Scenario 3: Whole team hitting 429s after a new developer joins

Symptom: Individual usage looks fine, but 429s appear across the organization.

Cause: Multiple developers sharing the same organization's rate limit pool. Adding a new active user pushed total consumption over the tier ceiling.

Fix: Upgrade to the next tier. Establish team conventions around context management (regular /compact, scoped sessions).

Scenario 4: CI/CD pipeline fails with 429

Symptom: Automated tests or code generation jobs return 429 in CI.

Cause: The CI pipeline runs multiple parallel Claude Code jobs that collectively exceed RPM limits.

Fix: Add exponential backoff retry logic to the CI scripts. Stagger parallel jobs. Consider using a lighter model for automated tasks to preserve quota for interactive developer sessions.
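One way to stagger parallel jobs is to cap how many run at once. The sketch below uses a thread pool with a fixed worker count; `run_capped` and `fake_job` are hypothetical names, and the sleep stands in for whatever API call a real CI job would make.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_capped(jobs, max_parallel=2):
    """Run zero-argument callables with at most `max_parallel` in flight.

    Results come back in the same order as `jobs`.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


# Stand-in for a real CI job; it records how many jobs overlap.
lock = threading.Lock()
state = {"in_flight": 0, "peak": 0}


def fake_job(job_id):
    with lock:
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
    time.sleep(0.05)  # placeholder for the actual API call
    with lock:
        state["in_flight"] -= 1
    return job_id


results = run_capped([lambda i=i: fake_job(i) for i in range(6)], max_parallel=2)
print(results, "peak concurrency:", state["peak"])
```

A cap like this limits concurrency but not request rate, so it works best alongside backoff on 429 responses rather than as a replacement for it.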

Scenario 5: VPN user hits 429 intermittently

Symptom: 429 errors appear randomly and do not correlate with heavy usage.

Cause: VPN exit nodes can share IPs across many users, triggering IP-based rate constraints.

Fix: Disable the VPN, or configure a split tunnel that routes Anthropic API traffic directly.

Error 429 vs Error 529 — Key Differences

| Factor | 429 Rate Limit | 529 Overloaded |
| --- | --- | --- |
| Cause | You exceeded your account's limits | Anthropic's servers are at capacity |
| Responsibility | Client-side (your usage) | Server-side (Anthropic infrastructure) |
| Dashboard shows quota | Usually yes | Yes, irrelevant |
| Fix | Reduce usage, wait, optimize, upgrade | Wait and retry; nothing you can do |
| Count against the backoff timer? | Yes | No |
| Retry strategy | Exponential backoff with jitter | Simple wait-and-retry; do not hammer |
| Appears on status page | No | Yes (typically) |

Developers frequently confuse 529 for 429 because both stop your request. The correct response to a 529 is to wait — aggressive retries worsen server load and reduce your success rate.

Common Mistakes

Mistake 1: Blaming the server when the problem is context size

The majority of 2026 Claude Code 429 errors are caused by "prompt bloat": a session context that has grown far larger than necessary. Most developers reach for account settings or blame Anthropic's infrastructure before checking /stats.

Better approach: Run /stats first. If the context is large, run /compact before anything else.

Mistake 2: Repeated immediate retries after a 429

Retrying instantly after a 429 does not help and can degrade your token's status further. It also wastes RPM quota on requests that will fail.

Better approach: Wait at least 60 seconds. Use the retry-after header to determine the exact wait time.

Mistake 3: Assuming dashboard quota = current availability

The dashboard reflects cumulative usage against monthly or daily totals. It does not show per-minute window consumption. You can be at 6% monthly usage and still hit a per-minute ITPM limit.

Better approach: Monitor the anthropic-ratelimit-* response headers for real-time window data.

Mistake 4: Running Claude Code without a .claudeignore on a large repo

Without scope control, Claude may try to read your entire repository, including node_modules, compiled assets, and log files, inflating token consumption dramatically.

Better approach: Set up .claudeignore before your first session on any non-trivial codebase.

Mistake 5: Multiple concurrent sessions without awareness of shared limits

Each team member using Claude Code simultaneously draws from the same organizational rate limit pool. One heavy session can starve others.

Better approach: Coordinate heavy refactors, establish team norms around context management, and consider a tier upgrade if the team is growing.

Prevention and Best Practices

Context hygiene:

  • Run /stats before starting any large task.

  • If context exceeds 20,000 tokens, run /compact before proceeding.

  • Establish a rule: run /compact every 50 messages or at the start of each new task.

Scoping your sessions:

  • Use .claudeignore to exclude build artifacts, dependencies, and generated files.

  • Use the --include flag to target only the relevant source directory for a given task.

  • Break large refactors into smaller, focused sessions rather than one sprawling conversation.

For API integrations:

  • Always implement exponential backoff with jitter in production code.

  • Read anthropic-ratelimit-* headers on every response to detect approaching limits before hitting them.

  • Enable prompt caching for repeated system prompts and shared context — cached tokens do not count toward ITPM limits, effectively multiplying your throughput.

Authentication maintenance:

  • Refresh credentials periodically, especially after long idle periods or crashes.

  • Do not share API keys across many automated systems without understanding the shared rate limit impact.

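Monitoring:

  • Review your tier and usage at platform.claude.com/settings/limits.

  • Check status.anthropic.com before debugging locally to rule out a server-side incident.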

Troubleshooting Decision Tree

Start: Error 429 in Claude Code
├─ Dashboard shows 0% or low usage?
│   ├─ YES → Is Claude still returning 429 immediately?
│   │         ├─ YES → Authentication issue → Fix 3 (re-authenticate)
│   │         └─ NO  → Per-minute limit hit → Fix 1 (wait 60s)
│   └─ NO  → Approaching monthly/tier limit → Fix 5 (upgrade tier)
├─ Mid-session during large task?
│   ├─ YES → Run /stats
│   │         ├─ Context > 20k tokens → Fix 2 (/compact)
│   │         └─ Context normal → Fix 1 (wait 60s), then Fix 3
│   └─ NO  → Single request failing → Check Fix 3, then Fix 1
├─ Multiple developers affected simultaneously?
│   └─ YES → Shared org limit exceeded → Fix 5 (upgrade tier)
├─ CI/CD pipeline failing?
│   └─ YES → Fix 6 (exponential backoff in scripts)
├─ Error is 529 not 429?
│   └─ YES → Server-side issue; wait and retry; check status.anthropic.com
└─ VPN active?
    └─ YES → Disable VPN and retry

Frequently asked questions


Why am I getting a 429 error when my dashboard shows available quota?

The Anthropic usage dashboard shows cumulative consumption against monthly or daily totals. Rate limits, however, also operate on per-minute windows for requests (RPM) and tokens (ITPM/OTPM). You can be well within your monthly quota but still exceed the number of tokens or requests allowed in a single minute. Check the anthropic-ratelimit-tokens-remaining header in API responses for the real-time per-minute window status. Running /compact to reduce context size is the fastest fix.

How long does a 429 rate limit last?

Per-minute rate limits reset within 60 seconds. If you are hitting a 429 that does not resolve after waiting, the issue is likely a degraded authentication token rather than a rate limit. Run claude logout && claude login to generate a fresh session. In rare cases involving backend account state issues, resolution may require contacting Anthropic support.

Does running /compact actually help with 429 errors?

Yes, especially for sessions involving agentic tasks. Each request to the Claude API includes your entire conversation history as input tokens. As context grows, each message consumes more ITPM. /compact compresses conversation history by approximately 40–50%, significantly reducing per-request token consumption and making it much less likely you will hit the per-minute token limit.

Can a VPN cause 429 errors?

Yes. Some VPN providers route traffic through shared IP addresses used by many users simultaneously. Anthropic may apply IP-level throttling on top of account-level rate limits. If you are seeing intermittent 429 errors that do not correspond to heavy usage, try disabling your VPN and retrying. You can also configure a split tunnel so that API traffic bypasses the VPN.

Is a rate limit error the same as a usage limit error?

They are related but distinct. A rate limit 429 means you exceeded a per-minute requests or tokens limit. A usage limit error typically means you have hit a monthly spend cap or your API tier's overall quota. Both return HTTP 429, but the error message and the appropriate fix differ. Check platform.claude.com/settings/limits to determine which applies to your account.

Why do I still get 429 errors hours after my limits should have reset?

This is a known backend account state issue. When a Claude Code session crashes or retries aggressively, the account can be left in a rate-limited state on Anthropic's backend even after local state is cleared and the per-minute window has reset. The fix is re-authentication (claude logout && claude login). If the issue persists after 12+ hours and fresh login, contact Anthropic support, as a manual backend reset may be required.

Can I get around rate limits by running multiple sessions or using multiple API keys?

No. All API keys and sessions under a single Anthropic organization share the same rate limit pool. Running multiple sessions in parallel does not increase your limits; it divides them. If your team is running multiple concurrent sessions and hitting 429s, consider upgrading to the next usage tier.

What is the difference between error 429 and error 529?

HTTP 429 means you have exceeded your account's rate limits. This is a client-side issue you can fix by reducing usage, waiting, or upgrading your tier. HTTP 529 is Anthropic-specific and means the server is temporarily overloaded. It is a server-side issue outside your control. For 529 errors, wait and retry with simple backoff, and do not count those waits against your rate limit backoff strategy.

What if I need limits higher than Tier 4?

For needs beyond Tier 4, you can contact Anthropic Sales to discuss custom rate limits. Enterprise arrangements typically include higher RPM and TPM ceilings tailored to your specific usage patterns. See anthropic.com/contact-sales.

Will reinstalling Claude Code fix a 429 error?

Reinstalling Claude Code itself will not resolve a rate limit error, since the limit is enforced at the account level, not the client. However, a clean reinstall combined with fresh authentication (claude login) can resolve persistent 429 errors caused by corrupted local state or a degraded OAuth token. It is faster and less disruptive to try claude logout && claude login and cache deletion first.

How can I tell whether I hit a request limit or a token limit?

Inspect the anthropic-ratelimit-* response headers. The header that shows 0 in its *-remaining field indicates which limit you hit. In a Claude Code CLI session without direct header access, context size is the most reliable indicator: oversized context almost always means an ITPM limit was hit. A high-frequency agentic loop with smaller prompts is more likely an RPM limit.

Does prompt caching help avoid 429 errors?

Yes, significantly. Cached input tokens do not count toward your ITPM limits. If your Claude Code sessions or API calls include repeated system prompts, shared context, or static reference documents, enabling prompt caching can multiply your effective throughput without requiring a tier upgrade.

Why do I run out of messages faster than the advertised 45 per 5 hours?

The "45 messages per 5 hours" figure assumes short, simple exchanges starting from fresh conversations. Actual capacity drops quickly when conversations are long (Claude processes your entire history with each message), when large files are attached (file contents are converted to tokens and included in every subsequent request), or when using Opus instead of Sonnet. To stay closer to the advertised limits: start new conversations more frequently, avoid re-uploading files that Claude already has in context, and break complex tasks into focused sessions.

Can I create multiple accounts to get around the limits?

No. This violates Anthropic's Terms of Service and risks permanent suspension of all associated accounts. Anthropic's systems are designed to detect multi-account patterns. The correct approach is to optimize your usage patterns (context management, .claudeignore, /compact) or upgrade to a higher plan or API tier.

How do Claude.ai subscription limits differ from API rate limits?

Claude.ai (web and app) uses session-based quotas measured in messages, with 5-hour rolling windows and weekly caps. The Claude API uses rate limits measured in requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM), which reset continuously using a token bucket algorithm. API limits are generally more predictable but require payment from the first request; there is no free API tier.