
GitHub Error: 403_rate_limit — REST API Rate Limit Exceeded

response.txt text
HTTP/1.1 403 Forbidden
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1714150800
X-RateLimit-Resource: core

{
  "message": "API rate limit exceeded for 203.0.113.42. (But here's the good news: Authenticated requests get a higher rate limit.)",
  "documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"
}
Unauthenticated requests share a 60/hour bucket per IPv4 — easy to exhaust from CI runners on shared NAT.

GitHub’s 403 with X-RateLimit-Remaining: 0 is the most common production blocker for any code that talks to the GitHub API. The fix is rarely “wait longer” — it’s almost always one of: authenticate, use conditional requests, switch to GraphQL, or upgrade from PAT to GitHub App. Each move multiplies your effective quota by 5-100x.

The single biggest win is conditional requests with ETags. Most GitHub data (repo metadata, branch SHAs, label lists) changes rarely. A naive integration that re-reads the same resource every minute spends 60 calls/hour per resource; the same code with If-None-Match spends maybe 3-5 because most reads return 304. That alone takes you from constant throttling to a comfortable margin.

Why this happens

  • Unauthenticated requests on a CI runner. GitHub's unauthenticated quota is 60/hour per source IP. CI runners (especially shared GitHub Actions runners on shared NAT) consume this collectively. A single noisy job can lock out everyone else on the same exit IP for 59 minutes.
  • PAT used for what should be a GitHub App. Personal access tokens get 5,000/hour. GitHub Apps get 15,000/hour per installation, scaling with your installation count. Heavy automation that runs through a PAT will throttle long before an equivalent GitHub App.
  • Polling instead of webhooks. Many GitHub integrations poll `GET /repos/.../events` or `GET /notifications` every minute. Polling once a minute burns 60 requests/hour by itself; combined with other reads, you'll hit the cap. Switch to webhooks or the events stream for change-driven updates.
  • The REST search API has its own, tighter limit. `GET /search/...` has a separate, much lower budget (30/min authenticated, 10/min unauthenticated). Hitting search rate limits returns the same 403 with `X-RateLimit-Resource: search`.
  • Secondary (abuse) rate limit. If you hammer a single endpoint with high concurrency, GitHub may return a 403 with `Retry-After` and a different message — the secondary rate limit. It's separate from the per-hour budget and triggers on burst patterns.
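These causes are distinguishable from the response alone. As a rough sketch (the header names are GitHub's; `classifyRateLimit` is a hypothetical helper, and real clients should read headers case-insensitively), you can branch on them before choosing a backoff strategy:

classify.js javascript
```javascript
// Classify a GitHub rate-limit response by its headers.
// Returns "secondary", "search", "primary", or "not-rate-limit".
function classifyRateLimit(status, headers) {
  if (status !== 403 && status !== 429) return "not-rate-limit";
  // Secondary (abuse) limits carry Retry-After instead of an exhausted budget
  if (headers["retry-after"] !== undefined) return "secondary";
  if (headers["x-ratelimit-remaining"] === "0") {
    // Which bucket ran out? core, search, graphql, ...
    return headers["x-ratelimit-resource"] === "search" ? "search" : "primary";
  }
  return "not-rate-limit"; // e.g. a permissions 403
}
```

A `"secondary"` result means honour `Retry-After` and reduce concurrency; a `"primary"` result means sleep until `X-RateLimit-Reset`.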

How to fix it

Fixes are ordered by likelihood. Start with the first one that matches your context.

1. Authenticate every request

Unauthenticated 60/hour is a development convenience, not a production budget. Set the `Authorization: Bearer <token>` header on every call. Classic and fine-grained PATs get 5,000/hour; GitHub Apps get more.

github.js javascript
import { Octokit } from "@octokit/rest";
import { throttling } from "@octokit/plugin-throttling";

// Rate-limit handling lives in the throttling plugin, not in bare Octokit
const ThrottledOctokit = Octokit.plugin(throttling);

const octokit = new ThrottledOctokit({
  auth: process.env.GITHUB_TOKEN,
  throttle: {
    onRateLimit: (retryAfter, options, octokit, retryCount) => {
      console.warn(`Rate limited on ${options.method} ${options.url}, retrying in ${retryAfter}s`);
      return retryCount < 3; // retry up to three times
    },
    onSecondaryRateLimit: (retryAfter, options, octokit, retryCount) => {
      console.warn(`Secondary rate limit on ${options.method} ${options.url}`);
      return retryCount < 1; // retry once, then give up
    },
  },
});

2. Honour X-RateLimit-Reset before retrying

Don't retry blindly — `X-RateLimit-Reset` is a unix timestamp telling you exactly when the budget refills. Sleep until then (plus a small jitter) before the next call.

github_retry.py python
import random
import time

import requests

def get(url, token, max_attempts=3):
    for _ in range(max_attempts):
        r = requests.get(url, headers={"Authorization": f"Bearer {token}"})
        # A rate-limit 403 always carries an exhausted X-RateLimit-Remaining
        if r.status_code != 403 or r.headers.get("X-RateLimit-Remaining") != "0":
            return r
        if "Retry-After" in r.headers:  # secondary (abuse) limit
            time.sleep(int(r.headers["Retry-After"]) + random.uniform(0, 2))
            continue
        reset_at = int(r.headers["X-RateLimit-Reset"])  # unix timestamp
        # sleep until the budget refills, plus a small jitter
        time.sleep(max(0, reset_at - int(time.time())) + random.uniform(1, 5))
    r.raise_for_status()
    return r

3. Use conditional requests to avoid spending quota

Every GitHub REST response has an `ETag`. Send `If-None-Match: <etag>` on subsequent reads — if the data hasn't changed, GitHub returns 304 Not Modified and the call doesn't count against your rate limit.

conditional.js javascript
const cache = new Map();

async function getRepo(owner, repo) {
  const url = `https://api.github.com/repos/${owner}/${repo}`;
  const headers = {
    Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
  };
  const cached = cache.get(url);
  if (cached) headers['If-None-Match'] = cached.etag;
  const r = await fetch(url, { headers });
  if (r.status === 304) return cached.body;  // free read
  const body = await r.json();
  cache.set(url, { etag: r.headers.get('etag'), body });
  return body;
}

4. Move to GraphQL for batched reads

The GraphQL API uses a points-based budget — fewer round trips for the same data. A single query that fetches a repo, its open PRs, their reviewers, and their checks would be 4-5 REST calls but one GraphQL call costing maybe 20 points out of 5,000.
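A sketch of such a batched read (the field selection is illustrative, but `repository`, `pullRequests`, and `reviews` are real schema fields; the query is posted to `https://api.github.com/graphql`):

batched_query.js javascript
```javascript
// One GraphQL call replacing several REST reads: repo + open PRs + reviews.
const QUERY = `
  query($owner: String!, $name: String!) {
    repository(owner: $owner, name: $name) {
      stargazerCount
      pullRequests(states: OPEN, first: 20) {
        nodes {
          number
          title
          reviews(first: 5) { nodes { state } }
        }
      }
    }
  }`;

// Build the POST body for the GraphQL endpoint
function graphqlBody(owner, name) {
  return JSON.stringify({ query: QUERY, variables: { owner, name } });
}

// Usage (network call, requires a token):
// await fetch("https://api.github.com/graphql", {
//   method: "POST",
//   headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}` },
//   body: graphqlBody("octocat", "hello-world"),
// });
```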

5. Switch from PAT to GitHub App for sustained automation

A GitHub App installation gets up to 15,000/hour and scales with the number of installations. Bots, syncers, and CI integrations that hit PAT limits regularly should be GitHub Apps. The migration is mostly authentication plumbing — endpoints stay the same.
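The core of that plumbing is a short-lived RS256 JWT signed with the App's private key, which you exchange for an installation token via `POST /app/installations/{installation_id}/access_tokens`. In practice you'd use `@octokit/auth-app`, but as a hedged sketch of what it does under the hood (`appJwt` is a hypothetical helper):

app_jwt.js javascript
```javascript
import crypto from "node:crypto";

// Build the short-lived JWT a GitHub App presents to mint installation tokens.
function appJwt(appId, privateKeyPem, now = Math.floor(Date.now() / 1000)) {
  const b64 = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
  const header = b64({ alg: "RS256", typ: "JWT" });
  // iat backdated 60s for clock drift; GitHub caps the lifetime at 10 minutes
  const payload = b64({ iat: now - 60, exp: now + 9 * 60, iss: appId });
  const signature = crypto
    .sign("RSA-SHA256", Buffer.from(`${header}.${payload}`), privateKeyPem)
    .toString("base64url");
  return `${header}.${payload}.${signature}`;
}
```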

Detection and monitoring in production

Log `X-RateLimit-Remaining` on every response and emit it as a metric. Alarm when remaining drops below 10% of the limit for more than 5 minutes — that's an incoming 403. For GitHub Apps, expose the metric per installation so you can see which one is the heavy user.
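That header-to-metric step can be sketched like this (the function and field names are illustrative; the headers are GitHub's):

rate_limit_gauge.js javascript
```javascript
// Turn rate-limit headers into a gauge value and an alarm signal.
function rateLimitGauge(headers) {
  const limit = Number(headers["x-ratelimit-limit"]);
  const remaining = Number(headers["x-ratelimit-remaining"]);
  if (!Number.isFinite(limit) || limit === 0) return null;
  const ratio = remaining / limit;
  return {
    resource: headers["x-ratelimit-resource"] ?? "core", // tag the metric per bucket
    remaining,
    ratio,
    alarm: ratio < 0.1, // below 10% of the budget: a 403 is coming
  };
}
```

Emit `remaining` and `ratio` per response; sustain the `alarm` condition for 5 minutes before paging.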

Frequently asked questions

Why does GitHub return 403 instead of 429 for rate limiting?
Historical reasons — GitHub's API predates widespread 429 adoption. The 403 includes `X-RateLimit-Remaining: 0` and a body message explaining the limit, so it's structurally distinct from auth-failure 403s. GitHub's GraphQL API does return 403 too; some newer endpoints return 429.
What's the difference between primary and secondary rate limits?
Primary limit is the per-hour budget (60/5000/15000 depending on auth). Secondary limit (also called abuse rate limit) triggers on burst patterns even when you have primary budget remaining — high concurrency on a single endpoint, repeated content-creation calls, or tight loops. Secondary limit responses include `Retry-After`.
Does the search API share the same 5,000/hour budget?
No. Search has its own bucket — 30 requests/minute for authenticated calls, 10/minute for unauthenticated. The response will tell you which resource you hit via `X-RateLimit-Resource: search`. Plan search-heavy workloads accordingly; consider caching search results or using webhook-driven discovery instead.
Are GraphQL and REST limits separate?
Yes. GraphQL uses a points-based 5,000/hour budget, separate from REST's call-based 5,000/hour. Heavy clients can use both APIs concurrently to extend overall throughput, though you'll still hit secondary limits if you're abusive on either.
Why am I rate-limited on a GitHub Actions workflow?
GitHub Actions runners share an exit IP pool with other Actions runners. If your workflow uses unauthenticated `curl` or `gh` calls, you're sharing 60/hour with everyone else on that IP. Use `${{ secrets.GITHUB_TOKEN }}` for authenticated calls — Actions injects a token automatically scoped to the workflow.
How do I know my current rate-limit budget without making a call?
`GET /rate_limit` returns your current usage and is itself free — it doesn't count against any budget. Poll this when you suspect throttling but don't want to risk a real call.
What's the right retry strategy for secondary rate limits?
Honour `Retry-After` (seconds), then add jitter and reduce concurrency on the offending endpoint. Unlike primary limits, secondary limits punish repeated bursts — so retrying with the same concurrency will trigger them again. Drop to single-threaded for the affected endpoint, then ramp back up.
Does authenticated traffic from one PAT count against another user's PAT?
No. Rate limits are per-token. Two PATs from the same user have independent budgets (each 5,000/hour). You can technically split work across multiple tokens, but for sustained automation, GitHub Apps are the supported path.

When to escalate to GitHub support

Open a GitHub support ticket if (a) your GitHub App installation is consistently hitting 15,000/hour and you have a legitimate use case requiring more, (b) secondary rate limits are firing on calls that aren't bursty, or (c) the rate-limit headers contradict each other (e.g., `Remaining: 100` but you still get 403). For routine quota hits, support can't grant manual exceptions — switch to GraphQL, conditional requests, or App auth instead.

Read more: /guide/handling-rate-limits/