Honor rate-limit Retry-After across Google/Anthropic/GitHub SDKs #2565
Merged
hiroshinishio merged 1 commit into main on Apr 21, 2026
get_rate_limit_retry_after extracts a retry-after delay from any SDK's rate-limit error — Gemini's "Please retry in N.NNNs" message body, GitHub's X-RateLimit-Reset/Retry-After headers, Anthropic's retry-after header. handle_exceptions calls it in both the HTTPError and generic Exception paths, sleeps the honored delay, and retries under the existing TRANSIENT_MAX_ATTEMPTS budget. Per-SDK parsers live in their own files. The old handle_github_rate_limit / handle_http_error github-special-case is gone — replaced by the generic path. Test fixture is the verbatim CloudWatch log from the gitautoai/website incident on 2026-04-20 16:23 UTC.
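The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `call_with_rate_limit_retry`, its parameters, and the value of `TRANSIENT_MAX_ATTEMPTS` are assumptions made for the example, and the extractor is passed in as a plain callable that returns a delay in seconds or `None`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Hypothetical value for illustration; the real constant lives in the repo.
TRANSIENT_MAX_ATTEMPTS = 3


def call_with_rate_limit_retry(
    func: Callable[[], T],
    get_retry_after: Callable[[Exception], Optional[float]],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = TRANSIENT_MAX_ATTEMPTS,
) -> T:
    """Call func, sleeping for the server-suggested delay on rate-limit errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as err:
            # The extractor returns seconds to wait, or None if the error
            # is not a rate-limit error it recognizes.
            delay = get_retry_after(err)
            # Not a rate limit, or the attempt budget is spent: re-raise.
            if delay is None or attempt == max_attempts:
                raise
            sleep(delay)
    # Every iteration above either returns or raises.
    raise RuntimeError("unreachable")
```

Taking `sleep` as a parameter keeps the loop testable without real waits. The sketch deliberately applies no upper cap to the delay.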
Summary
When the Gemini free-tier quota trips (429 RESOURCE_EXHAUSTED), the API returns a "Please retry in N.NNNs" hint in the error message body. Previously chat_with_google just raised, the error cascaded through chat_with_model → chat_with_agent → handle_webhook_event, and every Lambda in the incident window died and was reported to Sentry (AGENT-3K5/3K6/3K7/3K8/36M/36Q, the gitautoai/website run on 2026-04-20 16:23 UTC).

This PR generalizes rate-limit handling across every SDK we talk to.
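As a rough illustration of how the "Please retry in N.NNNs" hint can be pulled out of an error message, here is a minimal sketch. The function name `parse_retry_in_seconds` and the regular expression are assumptions for this example, not the PR's actual `parse_google_retry_in_message`, and it assumes the hint is always expressed in plain seconds.

```python
import re
from typing import Optional

# Matches e.g. "Please retry in 59.739387544s" and captures the number.
# Assumption: the delay is always given in seconds with an "s" suffix.
_RETRY_IN_PATTERN = re.compile(r"Please retry in (\d+(?:\.\d+)?)s")


def parse_retry_in_seconds(message: str) -> Optional[float]:
    """Return the suggested delay in seconds, or None if no hint is present."""
    match = _RETRY_IN_PATTERN.search(message)
    if match is None:
        return None
    return float(match.group(1))
```

Returning `None` when the hint is missing lets a caller fall through to its normal error handling instead of guessing a delay.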
- utils/error/get_rate_limit_retry_after.py dispatches per error type:
  - requests.HTTPError → GitHub headers (X-RateLimit-Remaining=0 + X-RateLimit-Reset primary, Retry-After secondary) via parse_github_rate_limit_headers, or generic Retry-After via parse_retry_after_header
  - APIStatusError (status_code=429) → retry-after header
  - ClientError (code=429) → "Please retry in N.NNNs" in message body via parse_google_retry_in_message
- handle_exceptions calls it from both the requests.HTTPError and generic Exception branches. If a delay comes back AND the existing TRANSIENT_MAX_ATTEMPTS budget allows, it sleeps the honored delay and continues the retry loop. No upper cap: should_bail at the handler layer already enforces the Lambda timeout.
- handle_github_rate_limit and the api_type=="github" special case in handle_http_error are gone. Both are now just input shapes for the same generic extractor.
- Test fixture (fixtures/real_google_429.txt) is the verbatim CloudWatch log line from the gitautoai/website incident (Please retry in 59.739387544s), preserved with full details[] payload.

Social Media Post (GitAuto)
Rate-limit 429s from Google/Anthropic/GitHub now retry cleanly instead of crashing the Lambda
Social Media Post (Wes)
The Gemini free-tier 429 takes a different shape than GitHub's 429, which takes a different shape than Anthropic's 429, and I had a per-SDK handler for exactly zero of them. Wrote one extractor that dispatches on error type, hooked it into the existing retry loop, honored whatever delay the server suggested. Three error shapes, one path, no more 429s in Sentry.