Add per-file download retry with exponential backoff #63
Open
Defines per-file retry with exponential backoff for transient network errors, while preserving existing .failed marker semantics for permanent HTTP errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
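As a rough illustration of the retry behavior this commit describes, here is a minimal sketch. It is not the code from this PR: `download_with_retry`, its `fetch` callable, `base_delay`, and the injectable `sleep` are hypothetical names chosen for the example. It assumes a base delay of 1 second that doubles after each failed attempt, and that `HTTPError` is treated as permanent.

```python
import http.client
import time
import urllib.error


class TransientDownloadError(Exception):
    """Raised when every attempt failed with a transient error."""


def download_with_retry(fetch, retry_count=3, base_delay=1.0, sleep=time.sleep):
    """Calls fetch() up to retry_count times, backing off between attempts.

    Hypothetical helper for illustration; fetch is any zero-argument callable.
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    for attempt in range(1, retry_count + 1):
        try:
            return fetch()
        except urllib.error.HTTPError:
            # HTTPError is itself an OSError subclass, so it is caught first
            # and re-raised: permanent errors are never retried.
            raise
        except (OSError, http.client.HTTPException) as e:
            if attempt == retry_count:
                raise TransientDownloadError(str(e)) from e
            # Doubles the delay after each failed attempt: 1s, then 2s, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With three attempts there are only two waits (1s, then 2s), because no sleep follows the final failure.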
Seven tasks covering CLI option, retry loop, unit tests, mock server transient errors, integration tests, docs, and final verification. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Also catch http.client.HTTPException in the retry loop so that IncompleteRead (partial response / connection drop) is treated as a transient error and retried. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers http.client.IncompleteRead as a retryable transient error and verifies retry_count < 1 is rejected by main(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
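For readers unfamiliar with the validation being tested, the following sketch shows one way a `--retry-count` value below 1 could be rejected. It assumes an `argparse`-based CLI; `parse_args` is a hypothetical stand-in, and the way the real `main()` reports the error may differ.

```python
import argparse


def parse_args(argv):
    """Parses a hypothetical subset of the CLI; only --retry-count is shown."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--retry-count",
        type=int,
        default=3,
        help="download attempts per file (must be at least 1)",
    )
    args = parser.parse_args(argv)
    if args.retry_count < 1:
        # parser.error() prints a usage message and raises SystemExit(2).
        parser.error("--retry-count must be at least 1")
    return args
```

Rejecting the value at parse time means the retry loop never has to handle a zero or negative attempt count.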
- Add TransientDownloadError raised when download retries exhaust
- Circuit breaker in sync() aborts after 3 consecutive transient recording failures (raises UserWarning for clean exit)
- Widen transient error catch to include OSError (covers ConnectionResetError, BrokenPipeError, ConnectionAbortedError)
- Clean up partial temp file after exhausted retries
- Replace unreachable return with AssertionError
- Fix comment style to third-person per project guidelines
- Fix backoff sequence in design doc (1s, 2s; not 1s, 2s, 4s)
- Add unit tests for OSError retry, circuit breaker, temp cleanup, backoff values
- Add integration test for circuit breaker abort scenario

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
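The circuit breaker from the list above can be sketched roughly as follows. This is an illustrative assumption, not the PR's `sync()`: the signature, the `download` callable, and `MAX_CONSECUTIVE_FAILURES` are hypothetical, and the sketch assumes that a successful download resets the failure counter.

```python
class TransientDownloadError(Exception):
    """Raised when download retries for one recording are exhausted."""


# Hypothetical constant; the commit message states a threshold of 3.
MAX_CONSECUTIVE_FAILURES = 3


def sync(recordings, download):
    """Downloads each recording, aborting after repeated transient failures."""
    consecutive_failures = 0
    downloaded = []
    for recording in recordings:
        try:
            download(recording)
        except TransientDownloadError:
            consecutive_failures += 1
            if consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
                # UserWarning is an Exception subclass, so it can be raised
                # to abort the run instead of trying every remaining file.
                raise UserWarning("too many consecutive transient failures")
            continue
        # Any success resets the breaker.
        consecutive_failures = 0
        downloaded.append(recording)
    return downloaded
```

Counting only consecutive failures lets a run survive isolated glitches while still stopping quickly when the source has become unreachable.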
Remove redundant URLError and socket.timeout from transient exception catch (both are OSError subclasses). Extract _fetch_file() helper to reduce download_file cognitive complexity below 15. Fix entrypoint.sh to propagate blackvuesync.sh exit code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
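The redundancy mentioned here follows from the standard library's exception hierarchy, which the sketch below makes explicit. `classify` is a hypothetical helper written for this example, not a function from the PR. Note that `HTTPError` is also an `OSError` subclass (via `URLError`), so it has to be checked before the broader catch, while `http.client.HTTPException` is not an `OSError` and still needs to be listed separately.

```python
import http.client
import socket
import urllib.error


def classify(exc):
    """Labels an exception as 'permanent', 'transient', or 'other'.

    Hypothetical helper; the ordering mirrors a try/except chain.
    """
    if isinstance(exc, urllib.error.HTTPError):
        # Checked first because HTTPError also inherits from OSError.
        return "permanent"
    if isinstance(exc, (OSError, http.client.HTTPException)):
        return "transient"
    return "other"


# URLError and socket.timeout are already covered by OSError.
assert issubclass(urllib.error.URLError, OSError)
assert issubclass(socket.timeout, OSError)
# HTTPException (and IncompleteRead) are outside the OSError hierarchy.
assert not issubclass(http.client.HTTPException, OSError)
assert issubclass(http.client.IncompleteRead, http.client.HTTPException)
```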
The Dockerfile defaults CRON=1, so Docker integration tests were always running with --cron even when not intended. This caused the circuit breaker test to expect exit code 1 but get 0, since UserWarning returns 0 in cron mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add venv binary usage guidance to CLAUDE.md from Codex template. Move pytest permission to shared settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>



Summary
- Adds a `--retry-count` CLI option (default 3) controlling download attempts per file
- Retries transient network errors (`URLError`, `socket.timeout`, `http.client.HTTPException`) with exponential backoff
- Permanent HTTP errors (`HTTPError`) still immediately create `.failed` markers with no retry, preserving existing `--retry-failed-after` semantics
- A `socket.timeout` during file download no longer kills the entire sync run

Test plan
- `RETRY_COUNT` env var works

🤖 Generated with Claude Code