Gemini provider: model names not prefixed for LiteLLM Gemini API routing #5245

@Jos-Jerus

Description

When using the remote::gemini provider with a Gemini API key (GEMINI_API_KEY), LiteLLM auto-detects model names like gemini-2.5-flash as Vertex AI models and attempts to use Application Default Credentials instead of the API key.

Expected Behavior

When GEMINI_API_KEY is set, the Gemini provider should route requests to the Gemini API (Google AI Studio), not Vertex AI.

Actual Behavior

LiteLLM receives gemini-2.5-flash and auto-detects it as a Vertex AI model, failing with:

litellm.APIConnectionError: Your default credentials were not found. 
To set up Application Default Credentials, see https://cloud.google.com/docs/authentication/external/set-up-adc

Root Cause

LiteLLM uses model name prefixes to determine routing:

  • gemini/gemini-2.5-flash → Gemini API (uses GEMINI_API_KEY)
  • gemini-2.5-flash (no prefix) → auto-detected as Vertex AI (uses ADC)

The Gemini provider passes model names without the gemini/ prefix to LiteLLM.

Suggested Fix

When the Gemini provider is configured with an API key (rather than for Vertex AI), it should prefix model names with gemini/ before passing them to LiteLLM:

# In the Gemini provider's inference call
if self.api_key:
    litellm_model = f"gemini/{model_id}"
else:
    litellm_model = model_id  # Vertex AI path
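
The snippet above can be fleshed out into a standalone helper. This is only a sketch: the function name to_litellm_model and the constant GEMINI_PREFIX are hypothetical and do not exist in llama-stack or LiteLLM. It also adds one assumption beyond the suggestion above, namely that names already carrying the gemini/ prefix should be left unchanged so they are not prefixed twice.

```python
from typing import Optional

# Routing prefix for the Gemini API, as described under Root Cause.
GEMINI_PREFIX = "gemini/"


def to_litellm_model(model_id: str, api_key: Optional[str]) -> str:
    """Return the model name to hand to LiteLLM (hypothetical helper)."""
    if not api_key:
        # No API key configured: leave the name untouched (Vertex AI / ADC path).
        return model_id
    if model_id.startswith(GEMINI_PREFIX):
        # Already prefixed: avoid producing "gemini/gemini/...".
        return model_id
    return f"{GEMINI_PREFIX}{model_id}"
```

With an API key set, to_litellm_model("gemini-2.5-flash", "some-key") yields "gemini/gemini-2.5-flash"; without one, the name is returned as-is.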

Workaround

I am currently working around this by injecting a sitecustomize.py that sets litellm.model_alias_map:

import litellm
litellm.model_alias_map = {
    'gemini-2.5-flash': 'gemini/gemini-2.5-flash',
    'gemini-2.5-pro': 'gemini/gemini-2.5-pro',
    # ... other models
}
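
To avoid maintaining that dictionary by hand, the alias map could be generated from a list of model names. The helper build_gemini_alias_map below is hypothetical and the model list is only the two names from the workaround above; treat it as a sketch, not a tested integration with LiteLLM.

```python
from typing import Dict, Iterable


def build_gemini_alias_map(model_names: Iterable[str]) -> Dict[str, str]:
    """Map each bare model name to its gemini/-prefixed form (hypothetical helper)."""
    return {
        name: f"gemini/{name}"
        for name in model_names
        if not name.startswith("gemini/")  # skip names that are already prefixed
    }


alias_map = build_gemini_alias_map(["gemini-2.5-flash", "gemini-2.5-pro"])

# In sitecustomize.py this would then be assigned as in the workaround:
#   import litellm
#   litellm.model_alias_map = alias_map
```

The litellm import is left as a comment so the sketch runs without LiteLLM installed.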

Environment

  • llama-stack version: 0.3.x (via lightspeed-stack 0.3.1)
  • LiteLLM version: latest bundled
  • Python: 3.12
