fix: improve model read retry logic for eventual consistency by shin-bot-litellm · Pull Request #22 · BerriAI/terraform-provider-litellm

shin-bot-litellm · 2026-02-09T22:48:41Z

Summary

Fixes #16 - litellm_model resource intermittently fails with "Root object was present, but now absent" on create

Problem

When creating multiple litellm_model resources in parallel, some randomly fail because:

The Create API call succeeds
The immediate Read to verify returns empty/nil due to eventual consistency
Terraform interprets this as "resource disappeared"

Root Cause

Database eventual consistency - the model is written but not immediately readable, especially under concurrent writes.

Solution

Improved the retry mechanism in retryModelRead():

| Before | After |
| --- | --- |
| Fixed error string matching | Flexible pattern matching via `isRetryableModelError()` |
| No initial delay | 200ms initial delay for database sync |
| 1s starting retry delay | 500ms starting delay (faster first retry) |
| 5 retries max | 8 retries max |
| ~31s max wait | ~45s max wait |

Error Patterns Now Handled

"not found" messages
"Model id = X" patterns
"model_not_found" sentinel values
Cleared resource IDs

Retry Timeline

200ms initial delay
500ms → 1s → 2s → 4s → 8s → 10s → 10s → 10s
(exponential backoff with 10s cap)

Testing

The fix addresses the intermittent failure by:

Allowing more time for database replication
Using pattern-based error detection that is less likely to break if the API's error wording changes
Providing better debug logging

Fixes BerriAI#16 - Model resource intermittently fails with 'Root object was present, but now absent'

The issue occurs when creating multiple models in parallel due to eventual consistency in the LiteLLM database - the model is created successfully but the immediate read-back verification returns empty/nil.

Changes:
- Add isRetryableModelError() helper for flexible error pattern matching
- Add 200ms initial delay before first read to allow database sync
- Start with 500ms delay (was 1s) for faster first retry
- Increase retry count from 5 to 8 (total max wait ~45s)
- More robust error detection using pattern matching instead of exact string
- Better logging for debugging eventual consistency issues

The retry pattern now handles:
- 'not found' error messages
- 'Model id = X' patterns
- 'model_not_found' sentinel values
- Cleared resource IDs

shin-bot-litellm wants to merge 1 commit into BerriAI:main from shin-bot-litellm:litellm_fix_model_create_race_condition
