feat: add native Responses API support for hosted_vllm provider (#22298)
Conversation
Register HostedVLLMResponsesAPIConfig so that litellm.responses(model="hosted_vllm/...") routes directly to vLLM's /v1/responses endpoint instead of falling back to the chat completions → responses conversion pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary

Registers a new `HostedVLLMResponsesAPIConfig` so that `litellm.responses(model="hosted_vllm/...")` routes directly to vLLM's `/v1/responses` endpoint instead of falling back to the chat completions → responses conversion pipeline.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/llms/hosted_vllm/responses/transformation.py | New HostedVLLMResponsesAPIConfig class extending OpenAIResponsesAPIConfig. Follows existing patterns from chat/embedding configs correctly — uses HOSTED_VLLM_API_BASE/KEY env vars, defaults to "fake-api-key", handles /v1 suffix in URL construction. Clean and well-structured. |
| litellm/__init__.py | Adds TYPE_CHECKING import for HostedVLLMResponsesAPIConfig, placed logically next to existing hosted_vllm imports. Correct pattern. |
| litellm/_lazy_imports_registry.py | Registers HostedVLLMResponsesAPIConfig in both the LLM_CONFIG_NAMES tuple and the _LLM_CONFIGS_IMPORT_MAP dict. Follows the established lazy import pattern correctly. |
| litellm/utils.py | Adds HOSTED_VLLM case to get_provider_responses_api_config() dispatch, following the same pattern as all other providers. Placed correctly before the final return None. |
| tests/test_litellm/llms/hosted_vllm/responses/test_hosted_vllm_responses.py | Good test coverage with 7 tests total. Two tests (default API key and missing api_base) are fragile — they don't mock or clear env vars, so they may fail if HOSTED_VLLM_API_KEY or HOSTED_VLLM_API_BASE are set in the test environment. |
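The `utils.py` dispatch described in the table can be sketched roughly as follows. This is a simplified stand-in, not litellm's actual code: the enum and config class are pared down, and the other provider cases are elided.

```python
from enum import Enum
from typing import Optional


class LlmProviders(str, Enum):
    # Simplified stand-in for litellm's provider enum.
    HOSTED_VLLM = "hosted_vllm"
    OPENAI = "openai"


class HostedVLLMResponsesAPIConfig:
    # Stand-in for the real config class in transformation.py.
    pass


def get_provider_responses_api_config(
    provider: LlmProviders,
) -> Optional[HostedVLLMResponsesAPIConfig]:
    # Other provider cases are elided in this sketch. The new HOSTED_VLLM
    # case returns a native config; None signals the caller to fall back
    # to the chat completions -> responses conversion pipeline.
    if provider == LlmProviders.HOSTED_VLLM:
        return HostedVLLMResponsesAPIConfig()
    return None
```

Returning `None` rather than raising keeps the fallback path as the default for providers without a native Responses API config.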
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["litellm.responses(model='hosted_vllm/...')"] --> B["ProviderConfigManager.get_provider_responses_api_config()"]
    B --> C{"provider == HOSTED_VLLM?"}
    C -->|Yes| D["HostedVLLMResponsesAPIConfig()"]
    C -->|No / None returned| E["Fallback: chat completions → responses conversion"]
    D --> F["validate_environment()"]
    F --> G["api_key from params / HOSTED_VLLM_API_KEY / 'fake-api-key'"]
    D --> H["get_complete_url()"]
    H --> I["api_base from params / HOSTED_VLLM_API_BASE"]
    I --> J["Append /v1/responses or /responses"]
    G --> K["Direct POST to vLLM /v1/responses"]
    J --> K
    K --> L["OpenAI-compatible Response parsed by base class"]
```
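The URL construction shown in the flowchart can be illustrated with a small sketch. This is an approximation of the described `get_complete_url()` behavior, not litellm's actual implementation; `build_responses_url` is a hypothetical helper name.

```python
import os
from typing import Optional


def build_responses_url(api_base: Optional[str]) -> str:
    # Resolve api_base from the explicit param, then the env var.
    api_base = api_base or os.environ.get("HOSTED_VLLM_API_BASE")
    if api_base is None:
        raise ValueError("api_base not set")
    base = api_base.rstrip("/")
    # Append /responses if the base already ends in /v1, else /v1/responses.
    if base.endswith("/v1"):
        return f"{base}/responses"
    return f"{base}/v1/responses"
```

This mirrors the "with/without `/v1` suffix" handling called out in the file overview, so both `http://host:8000` and `http://host:8000/v1` resolve to the same endpoint.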
Last reviewed commit: 7240266
```python
def test_hosted_vllm_responses_api_url_requires_api_base():
    """Test get_complete_url() raises ValueError when api_base is not set."""
    config = HostedVLLMResponsesAPIConfig()

    with pytest.raises(ValueError, match="api_base not set"):
        config.get_complete_url(
            api_base=None,
            litellm_params={},
        )
```
Test is fragile when env var is set

`test_hosted_vllm_responses_api_url_requires_api_base` will not raise `ValueError` if the `HOSTED_VLLM_API_BASE` environment variable is set in the test runner's environment (e.g., CI), because `get_secret_str("HOSTED_VLLM_API_BASE")` will return a value before the `None` check. Consider patching `get_secret_str` to return `None`, or using `monkeypatch.delenv` to ensure the env var is unset:
```diff
-def test_hosted_vllm_responses_api_url_requires_api_base():
+def test_hosted_vllm_responses_api_url_requires_api_base(monkeypatch):
     """Test get_complete_url() raises ValueError when api_base is not set."""
+    monkeypatch.delenv("HOSTED_VLLM_API_BASE", raising=False)
     config = HostedVLLMResponsesAPIConfig()
     with pytest.raises(ValueError, match="api_base not set"):
         config.get_complete_url(
             api_base=None,
             litellm_params={},
         )
```
```python
def test_hosted_vllm_validate_environment_default_api_key():
    """Test validate_environment() defaults to 'fake-api-key' when no key is provided."""
    config = HostedVLLMResponsesAPIConfig()

    headers = config.validate_environment(
        headers={},
        model="Qwen/Qwen3-8B",
        litellm_params=GenericLiteLLMParams(),
    )

    assert headers.get("Authorization") == "Bearer fake-api-key"
```
Test is fragile when env var is set

`test_hosted_vllm_validate_environment_default_api_key` will fail if `HOSTED_VLLM_API_KEY` is set in the test environment, because `get_secret_str("HOSTED_VLLM_API_KEY")` will return a real value instead of falling through to `"fake-api-key"`. Consider clearing the env var:
```diff
-def test_hosted_vllm_validate_environment_default_api_key():
+def test_hosted_vllm_validate_environment_default_api_key(monkeypatch):
     """Test validate_environment() defaults to 'fake-api-key' when no key is provided."""
+    monkeypatch.delenv("HOSTED_VLLM_API_KEY", raising=False)
     config = HostedVLLMResponsesAPIConfig()
     headers = config.validate_environment(
         headers={},
         model="Qwen/Qwen3-8B",
         litellm_params=GenericLiteLLMParams(),
     )
     assert headers.get("Authorization") == "Bearer fake-api-key"
```
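Both suggestions above clear one variable per test. A module-level alternative (a sketch — the helper name is hypothetical) is a single helper that clears both hosted_vllm variables, which could be wired into an autouse pytest fixture via `monkeypatch.delenv`:

```python
import os
from typing import MutableMapping


def clear_hosted_vllm_env(environ: MutableMapping[str, str] = os.environ) -> None:
    # Drop ambient hosted_vllm settings so tests exercise the documented
    # defaults ("fake-api-key" and the api_base-not-set error path).
    for var in ("HOSTED_VLLM_API_BASE", "HOSTED_VLLM_API_KEY"):
        environ.pop(var, None)
```

Accepting the mapping as a parameter keeps the helper testable without mutating the real process environment.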
Merged commit 1b4cfc2 into BerriAI:litellm_oss_staging_02_28_2026
Relevant issues
Relates to #19733 (stalled since 2025-01-27; this PR incorporates maintainer feedback on API key defaults and generic approach)
Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

- Added testing in the `tests/litellm/` directory (adding at least 1 test is a hard requirement - see details)
- Ran `make test-unit`
- Asked `@greptileai` for a review and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
Changes
- `litellm/llms/hosted_vllm/responses/transformation.py` — `HostedVLLMResponsesAPIConfig` extending `OpenAIResponsesAPIConfig` with:
  - `custom_llm_provider` → `LlmProviders.HOSTED_VLLM`
  - `validate_environment()` — defaults to `"fake-api-key"` when no key is provided (matching the existing `HostedVLLMChatConfig` pattern)
  - `get_complete_url()` — resolves the `HOSTED_VLLM_API_BASE` env var and handles `api_base` with/without the `/v1` suffix
- `litellm/__init__.py` — add TYPE_CHECKING export
- `litellm/_lazy_imports_registry.py` — register in the lazy import system
- `litellm/utils.py` — add `HOSTED_VLLM` case to `get_provider_responses_api_config()`
- `tests/test_litellm/llms/hosted_vllm/responses/test_hosted_vllm_responses.py` — add 5 new tests:
  - `test_hosted_vllm_provider_config_registration`
  - `test_hosted_vllm_responses_api_url`
  - `test_hosted_vllm_responses_api_url_requires_api_base`
  - `test_hosted_vllm_validate_environment_default_api_key`
  - `test_hosted_vllm_validate_environment_custom_api_key`
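The `validate_environment()` default-key behavior listed above can be illustrated with a small sketch (hypothetical helper names, not litellm internals): an explicit key wins, then `HOSTED_VLLM_API_KEY`, then the `"fake-api-key"` placeholder that vLLM servers without auth will accept.

```python
import os
from typing import Dict, Optional


def resolve_api_key(api_key: Optional[str]) -> str:
    # Precedence: explicit param, then env var, then the placeholder default.
    return api_key or os.environ.get("HOSTED_VLLM_API_KEY") or "fake-api-key"


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    # Produce the Authorization header the config is expected to emit.
    return {"Authorization": f"Bearer {resolve_api_key(api_key)}"}
```

This is the same fallback chain the review's second test exercises: with no param and no env var set, the header becomes `Bearer fake-api-key`.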