Description
Gap
patch_litellm() — the function used by auto_instrument() — only replaces 4 of the 6 functions that LiteLLMWrapper instruments. Embedding and moderation calls made via litellm.embedding() or litellm.moderation() are silently untraced when using auto-instrumentation.
Additionally, LiteLLMWrapper itself is missing aembedding() (async embeddings), creating an asymmetry with the rest of the wrapper which provides both sync and async variants for completion and responses.
`patch_litellm()` (lines 630–667) patches:

| Function | Patched? |
|---|---|
| `litellm.completion` | Yes |
| `litellm.acompletion` | Yes |
| `litellm.responses` | Yes |
| `litellm.aresponses` | Yes |
| `litellm.embedding` | No |
| `litellm.aembedding` | No (also missing from `LiteLLMWrapper`) |
| `litellm.moderation` | No |
| `litellm.amoderation` | No (also missing from `LiteLLMWrapper`) |
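The two missing sync patches would follow the same attribute-replacement pattern the existing four use. The sketch below is illustrative only: `litellm` here is a stand-in namespace (the real module is not imported), and `traced()` stands in for routing through the existing `EmbeddingWrapper` / `ModerationWrapper` classes.

```python
from types import SimpleNamespace

# Stand-in for the real litellm module (hypothetical return values).
litellm = SimpleNamespace(
    embedding=lambda **kw: {"data": [[0.1, 0.2]]},
    moderation=lambda **kw: {"flagged": False},
)

def traced(fn, label):
    """Minimal stand-in for the wrapper classes: records that the call
    went through tracing, then delegates to the original function."""
    def wrapper(**kwargs):
        wrapper.calls.append(label)
        return fn(**kwargs)
    wrapper.calls = []
    return wrapper

def patch_embedding_and_moderation(mod):
    # The two sync assignments patch_litellm() is missing today.
    mod.embedding = traced(mod.embedding, "embedding")
    mod.moderation = traced(mod.moderation, "moderation")

patch_embedding_and_moderation(litellm)
result = litellm.embedding(model="text-embedding-3-small", input="hi")
```

After patching, callers still invoke `litellm.embedding(...)` exactly as before; the call simply flows through the tracing wrapper first, which is the whole point of auto-instrumentation.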
`LiteLLMWrapper` (lines 484–518) exposes:

- `completion()` / `acompletion()` — sync + async
- `responses()` / `aresponses()` — sync + async
- `embedding()` — sync only, no async
- `moderation()` — sync only, no async
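The async parity being asked for looks like the following sketch: a wrapper class exposing both a sync `embedding()` and an async `aembedding()`. The class body is hypothetical; the real `LiteLLMWrapper` would delegate to `litellm.embedding` / `litellm.aembedding` through `EmbeddingWrapper`, whereas here the delegates are dummy functions so the example runs standalone.

```python
import asyncio

class WrapperSketch:
    """Illustrative stand-in for LiteLLMWrapper's embedding surface."""

    def __init__(self, sync_fn, async_fn):
        self._sync_fn = sync_fn
        self._async_fn = async_fn

    def embedding(self, **kwargs):
        # Sync path -- this is what exists today.
        return self._sync_fn(**kwargs)

    async def aembedding(self, **kwargs):
        # Async path -- the missing counterpart; same delegation,
        # but the underlying coroutine must be awaited.
        return await self._async_fn(**kwargs)

async def fake_aembedding(**kwargs):
    return {"model": kwargs["model"], "async": True}

w = WrapperSketch(
    lambda **kw: {"model": kw["model"], "async": False},
    fake_aembedding,
)
sync_out = w.embedding(model="m")
async_out = asyncio.run(w.aembedding(model="m"))
```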
What is missing

- `patch_litellm()` should also patch `litellm.embedding` and `litellm.moderation` so auto-instrument users get tracing on these calls.
- `LiteLLMWrapper` should add `aembedding()` and `amoderation()` for async parity, and `patch_litellm()` should patch those as well.
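Patching the async module functions (`litellm.aembedding`, `litellm.amoderation`) differs slightly from the sync case: the replacement must itself be a coroutine function so the result stays awaitable. A minimal sketch of that pattern, again against a stand-in namespace rather than the real `litellm`:

```python
import asyncio
from types import SimpleNamespace

async def _aembedding(**kwargs):
    # Dummy async embedding call standing in for litellm.aembedding.
    return {"data": [[0.0]], "model": kwargs["model"]}

litellm = SimpleNamespace(aembedding=_aembedding)

def patch_async(mod, name):
    original = getattr(mod, name)

    async def wrapper(**kwargs):
        # A tracing span would open here in the real implementation;
        # we just record that the wrapper ran, then await the original.
        wrapper.called = True
        return await original(**kwargs)

    wrapper.called = False
    setattr(mod, name, wrapper)

patch_async(litellm, "aembedding")
out = asyncio.run(litellm.aembedding(model="text-embedding-3-small"))
```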
Embeddings are a core AI API surface used in RAG pipelines and semantic search. The EmbeddingWrapper class (line 423) already exists and works correctly — it's just not wired into patch_litellm().
Braintrust docs status
not_found — no LiteLLM-specific integration page was found at braintrust.dev/docs/integrations/ai-providers/litellm (404).
Upstream sources
- LiteLLM embedding docs: https://docs.litellm.ai/docs/embedding/supported_embedding
- LiteLLM `aembedding()` function: https://docs.litellm.ai/docs/embedding/async_embedding
- LiteLLM moderation: https://docs.litellm.ai/docs/moderation
Local files inspected

- `py/src/braintrust/wrappers/litellm.py`:
  - `LiteLLMWrapper.__init__` (lines 484–494) — creates `_embedding_wrapper` and `_moderation_wrapper` but no async variants
  - `LiteLLMWrapper` class (lines 496–518) — exposes `embedding()` and `moderation()` sync only
  - `patch_litellm()` (lines 630–667) — only patches `completion`, `acompletion`, `responses`, `aresponses`
  - `EmbeddingWrapper` class (line 423) — fully implemented with metrics extraction
  - TODO at line 446: "Add a flag to control whether to log the full embedding vector"
- `py/src/braintrust/wrappers/test_litellm.py` — has embedding tests but only via `wrap_litellm()`, not via `patch_litellm()`
- Grep for `aembedding` in litellm.py: zero results