
feat: add native Responses API support for hosted_vllm provider#22298

Merged
krrishdholakia merged 1 commit into BerriAI:litellm_oss_staging_02_28_2026 from anencore94:feat/hosted-vllm-responses-api
Feb 28, 2026

Conversation

@anencore94

Register HostedVLLMResponsesAPIConfig so that litellm.responses(model="hosted_vllm/...") routes directly to vLLM's /v1/responses endpoint instead of falling back to the chat completions → responses conversion pipeline.

Relevant issues

Relates to #19733 (stalled since 2025-01-27; this PR incorporates maintainer feedback on API key defaults and generic approach)

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature

Changes

  • New file litellm/llms/hosted_vllm/responses/transformation.py: HostedVLLMResponsesAPIConfig extending OpenAIResponsesAPIConfig with:
    • custom_llm_provider = LlmProviders.HOSTED_VLLM
    • validate_environment() — defaults to "fake-api-key" when no key is provided (matching existing HostedVLLMChatConfig pattern)
    • get_complete_url() — resolves HOSTED_VLLM_API_BASE env var and handles api_base with/without /v1 suffix
  • litellm/__init__.py — add TYPE_CHECKING export
  • litellm/_lazy_imports_registry.py — register in lazy import system
  • litellm/utils.py — add HOSTED_VLLM case to get_provider_responses_api_config()
  • tests/test_litellm/llms/hosted_vllm/responses/test_hosted_vllm_responses.py — add 5 new tests:
    • test_hosted_vllm_provider_config_registration
    • test_hosted_vllm_responses_api_url
    • test_hosted_vllm_responses_api_url_requires_api_base
    • test_hosted_vllm_validate_environment_default_api_key
    • test_hosted_vllm_validate_environment_custom_api_key
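The behavior described above (default API key, env-var fallback, /v1 suffix handling) can be sketched roughly as follows. This is a simplified, hypothetical stand-in, not the actual class in litellm/llms/hosted_vllm/responses/transformation.py:

```python
import os
from typing import Optional

FAKE_API_KEY = "fake-api-key"  # default noted in the PR, matching HostedVLLMChatConfig


class HostedVLLMResponsesAPIConfigSketch:
    """Illustrative stand-in for HostedVLLMResponsesAPIConfig (not the real class)."""

    def validate_environment(self, headers: dict, api_key: Optional[str] = None) -> dict:
        # Fall back to the env var, then to the fake key, per the PR description.
        key = api_key or os.environ.get("HOSTED_VLLM_API_KEY") or FAKE_API_KEY
        headers["Authorization"] = f"Bearer {key}"
        return headers

    def get_complete_url(self, api_base: Optional[str]) -> str:
        # Resolve HOSTED_VLLM_API_BASE, and accept api_base with or without /v1.
        base = api_base or os.environ.get("HOSTED_VLLM_API_BASE")
        if base is None:
            raise ValueError("api_base not set")
        base = base.rstrip("/")
        return f"{base}/responses" if base.endswith("/v1") else f"{base}/v1/responses"
```

Both `http://host:8000` and `http://host:8000/v1` resolve to the same `/v1/responses` endpoint, which is what the URL tests above exercise.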

Register HostedVLLMResponsesAPIConfig so that litellm.responses(model="hosted_vllm/...")
routes directly to vLLM's /v1/responses endpoint instead of falling back to the
chat completions → responses conversion pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Feb 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Status: Ready · Actions: Preview, Comment · Updated (UTC): Feb 27, 2026 5:05pm


@CLAassistant

CLAassistant commented Feb 27, 2026

CLA assistant check
All committers have signed the CLA.

@greptile-apps
Contributor

greptile-apps bot commented Feb 27, 2026

Greptile Summary

Registers a new HostedVLLMResponsesAPIConfig so that litellm.responses(model="hosted_vllm/...") routes directly to vLLM's native /v1/responses endpoint, bypassing the chat-completions-to-responses conversion pipeline. The implementation closely follows the existing patterns used by other providers (GitHub Copilot, XAI, etc.) and mirrors the HostedVLLMChatConfig conventions for env var names and the "fake-api-key" default.

  • New config class in litellm/llms/hosted_vllm/responses/transformation.py extending OpenAIResponsesAPIConfig with vLLM-specific URL construction and API key defaults
  • Registration plumbing across __init__.py, _lazy_imports_registry.py, and utils.py — all follow established patterns
  • Test coverage with 7 tests (config registration, URL construction, env defaults, end-to-end mock). Two tests are fragile if HOSTED_VLLM_API_KEY or HOSTED_VLLM_API_BASE env vars are set in the runner environment

Confidence Score: 4/5

  • This PR is safe to merge after addressing the minor test fragility issues — the core implementation is clean and follows established patterns.
  • The implementation is minimal, well-structured, and follows existing patterns exactly. The only concerns are two tests that could be flaky in environments where HOSTED_VLLM_API_KEY or HOSTED_VLLM_API_BASE are set. The core transformation class and registration code are correct.
  • Pay attention to tests/test_litellm/llms/hosted_vllm/responses/test_hosted_vllm_responses.py — two tests need environment variable isolation to be robust in CI.

Important Files Changed

  • litellm/llms/hosted_vllm/responses/transformation.py: New HostedVLLMResponsesAPIConfig class extending OpenAIResponsesAPIConfig. Follows existing patterns from chat/embedding configs correctly: uses HOSTED_VLLM_API_BASE/KEY env vars, defaults to "fake-api-key", handles /v1 suffix in URL construction. Clean and well-structured.
  • litellm/__init__.py: Adds TYPE_CHECKING import for HostedVLLMResponsesAPIConfig, placed logically next to existing hosted_vllm imports. Correct pattern.
  • litellm/_lazy_imports_registry.py: Registers HostedVLLMResponsesAPIConfig in both the LLM_CONFIG_NAMES tuple and the _LLM_CONFIGS_IMPORT_MAP dict. Follows the established lazy import pattern correctly.
  • litellm/utils.py: Adds HOSTED_VLLM case to get_provider_responses_api_config() dispatch, following the same pattern as all other providers. Placed correctly before the final return None.
  • tests/test_litellm/llms/hosted_vllm/responses/test_hosted_vllm_responses.py: Good test coverage with 7 tests total. Two tests (default API key and missing api_base) are fragile: they don't mock or clear env vars, so they may fail if HOSTED_VLLM_API_KEY or HOSTED_VLLM_API_BASE are set in the test environment.
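The utils.py change is a single new case in a provider-to-config dispatch. A simplified sketch of that dispatch pattern (hypothetical; the real function returns config instances and covers many more providers):

```python
from enum import Enum
from typing import Optional


class LlmProviders(str, Enum):
    OPENAI = "openai"
    HOSTED_VLLM = "hosted_vllm"


def get_provider_responses_api_config(provider: LlmProviders) -> Optional[str]:
    # Returning config names for illustration; LiteLLM returns config objects.
    if provider == LlmProviders.OPENAI:
        return "OpenAIResponsesAPIConfig"
    if provider == LlmProviders.HOSTED_VLLM:
        return "HostedVLLMResponsesAPIConfig"  # the case this PR adds
    return None  # callers fall back to chat-completions -> responses conversion
```

Returning None for unregistered providers is what previously sent hosted_vllm down the conversion pipeline; adding the HOSTED_VLLM case is the core routing change.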

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["litellm.responses(model='hosted_vllm/...')"] --> B["ProviderConfigManager.get_provider_responses_api_config()"]
    B --> C{"provider == HOSTED_VLLM?"}
    C -->|Yes| D["HostedVLLMResponsesAPIConfig()"]
    C -->|No / None returned| E["Fallback: chat completions → responses conversion"]
    D --> F["validate_environment()"]
    F --> G["api_key from params / HOSTED_VLLM_API_KEY / 'fake-api-key'"]
    D --> H["get_complete_url()"]
    H --> I["api_base from params / HOSTED_VLLM_API_BASE"]
    I --> J["Append /v1/responses or /responses"]
    G --> K["Direct POST to vLLM /v1/responses"]
    J --> K
    K --> L["OpenAI-compatible Response parsed by base class"]
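The direct path in the flowchart can be sketched end to end as a small helper that combines URL resolution, header construction, and provider-prefix stripping. Hypothetical names throughout; this is not LiteLLM's actual handler code:

```python
import os
from typing import Optional


def build_responses_request(model: str, api_base: Optional[str], api_key: Optional[str]) -> dict:
    """Sketch of the direct hosted_vllm path from the flowchart (not LiteLLM code)."""
    base = api_base or os.environ.get("HOSTED_VLLM_API_BASE")
    if base is None:
        raise ValueError("api_base not set")
    base = base.rstrip("/")
    # Append /responses or /v1/responses depending on whether base already has /v1.
    url = f"{base}/responses" if base.endswith("/v1") else f"{base}/v1/responses"
    key = api_key or os.environ.get("HOSTED_VLLM_API_KEY") or "fake-api-key"
    return {
        "url": url,
        "headers": {"Authorization": f"Bearer {key}"},
        # Strip the "hosted_vllm/" routing prefix before sending the model name to vLLM.
        "json": {"model": model.removeprefix("hosted_vllm/")},
    }
```

The resulting dict is the shape of the direct POST to vLLM's /v1/responses; the OpenAI-compatible response is then parsed by the base class, as the final flowchart node indicates.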

Last reviewed commit: 7240266


@greptile-apps greptile-apps bot left a comment


5 files reviewed, 2 comments


Comment on lines +158 to +166
def test_hosted_vllm_responses_api_url_requires_api_base():
    """Test get_complete_url() raises ValueError when api_base is not set."""
    config = HostedVLLMResponsesAPIConfig()

    with pytest.raises(ValueError, match="api_base not set"):
        config.get_complete_url(
            api_base=None,
            litellm_params={},
        )

Test is fragile when env var is set

test_hosted_vllm_responses_api_url_requires_api_base will not raise ValueError if the HOSTED_VLLM_API_BASE environment variable is set in the test runner's environment (e.g., CI), because get_secret_str("HOSTED_VLLM_API_BASE") will return a value before the None check. Consider patching get_secret_str to return None, or using monkeypatch.delenv to ensure the env var is unset:

Suggested change

Before:

def test_hosted_vllm_responses_api_url_requires_api_base():
    """Test get_complete_url() raises ValueError when api_base is not set."""
    config = HostedVLLMResponsesAPIConfig()
    with pytest.raises(ValueError, match="api_base not set"):
        config.get_complete_url(
            api_base=None,
            litellm_params={},
        )

After:

def test_hosted_vllm_responses_api_url_requires_api_base(monkeypatch):
    """Test get_complete_url() raises ValueError when api_base is not set."""
    monkeypatch.delenv("HOSTED_VLLM_API_BASE", raising=False)
    config = HostedVLLMResponsesAPIConfig()
    with pytest.raises(ValueError, match="api_base not set"):
        config.get_complete_url(
            api_base=None,
            litellm_params={},
        )

Comment on lines +169 to +179
def test_hosted_vllm_validate_environment_default_api_key():
    """Test validate_environment() defaults to 'fake-api-key' when no key is provided."""
    config = HostedVLLMResponsesAPIConfig()

    headers = config.validate_environment(
        headers={},
        model="Qwen/Qwen3-8B",
        litellm_params=GenericLiteLLMParams(),
    )

    assert headers.get("Authorization") == "Bearer fake-api-key"

Test is fragile when env var is set

test_hosted_vllm_validate_environment_default_api_key will fail if HOSTED_VLLM_API_KEY is set in the test environment, because get_secret_str("HOSTED_VLLM_API_KEY") will return a real value instead of falling through to "fake-api-key". Consider clearing the env var:

Suggested change

Before:

def test_hosted_vllm_validate_environment_default_api_key():
    """Test validate_environment() defaults to 'fake-api-key' when no key is provided."""
    config = HostedVLLMResponsesAPIConfig()
    headers = config.validate_environment(
        headers={},
        model="Qwen/Qwen3-8B",
        litellm_params=GenericLiteLLMParams(),
    )
    assert headers.get("Authorization") == "Bearer fake-api-key"

After:

def test_hosted_vllm_validate_environment_default_api_key(monkeypatch):
    """Test validate_environment() defaults to 'fake-api-key' when no key is provided."""
    monkeypatch.delenv("HOSTED_VLLM_API_KEY", raising=False)
    config = HostedVLLMResponsesAPIConfig()
    headers = config.validate_environment(
        headers={},
        model="Qwen/Qwen3-8B",
        litellm_params=GenericLiteLLMParams(),
    )
    assert headers.get("Authorization") == "Bearer fake-api-key"

@krrishdholakia krrishdholakia changed the base branch from main to litellm_oss_staging_02_28_2026 February 28, 2026 03:39
@krrishdholakia krrishdholakia merged commit 1b4cfc2 into BerriAI:litellm_oss_staging_02_28_2026 Feb 28, 2026
26 of 30 checks passed


3 participants