Context
Providers like HuggingFace serve 127+ models across 15 inference providers, and the catalog changes frequently. Currently, we use `MODELS = []` (same as Ollama) and users must pass the model ID directly with an explicit `provider=`. This works but disables parameter validation and prevents `list_models(provider=Provider.HUGGINGFACE)`.
Several providers already define `LIST_MODELS` endpoints in their config files (Groq, Anthropic, Mistral, OpenAI, etc.), but none are actually called.
Proposal
Add a dynamic model discovery mechanism that can query a provider's `/v1/models` (or equivalent) endpoint and populate the model registry at runtime.
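A minimal sketch of what such a mechanism could look like, assuming an OpenAI-style `/v1/models` response (all names here — `ModelInfo`, `parse_model_list`, `discover_models` — are illustrative, not existing library API):

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class ModelInfo:
    """Illustrative registry entry; the real registry schema may differ."""
    id: str
    provider: str


def parse_model_list(payload: dict, provider: str) -> list[ModelInfo]:
    """Turn an OpenAI-style /v1/models response into registry entries."""
    return [ModelInfo(id=m["id"], provider=provider) for m in payload.get("data", [])]


def discover_models(base_url: str, provider: str, timeout: float = 5.0) -> list[ModelInfo]:
    """Query a provider's /v1/models endpoint and return discovered models."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return parse_model_list(payload, provider)


# Example with a canned OpenAI-style response (no network call; model IDs invented):
sample = {"data": [{"id": "meta-llama/Llama-3.1-8B-Instruct"},
                   {"id": "google/gemma-2-9b-it"}]}
models = parse_model_list(sample, provider="huggingface")
```

Keeping the HTTP fetch and the parsing separate makes the parser testable without network access, which matters if discovery results are cached or vendored as fixtures.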
HuggingFace example
`GET https://router.huggingface.co/v1/models` returns rich metadata per model:
- `id` — model ID (`org/name` format)
- `architecture.input_modalities` — `["text"]`, or `["text", "image"]` for vision models
- `architecture.output_modalities` — `["text"]`
- `providers[]` — inference backends with `context_length`, `pricing`, `supports_tools`, `supports_structured_output`
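A sketch of how one such entry could be mapped onto registry capability flags. The field names follow the list above; the concrete model and values are invented for illustration:

```python
# One HuggingFace /v1/models entry, shaped like the fields described above
# (values are invented):
entry = {
    "id": "Qwen/Qwen2.5-VL-7B-Instruct",
    "architecture": {
        "input_modalities": ["text", "image"],
        "output_modalities": ["text"],
    },
    "providers": [
        {"provider": "together", "context_length": 32768,
         "supports_tools": True, "supports_structured_output": True},
        {"provider": "other-backend", "context_length": 8192,
         "supports_tools": False, "supports_structured_output": False},
    ],
}


def capabilities(entry: dict) -> dict:
    """Derive coarse capability flags from a model entry (illustrative helper)."""
    arch = entry.get("architecture", {})
    providers = entry.get("providers", [])
    return {
        "vision": "image" in arch.get("input_modalities", []),
        # Treat a capability as available if at least one backend supports it.
        "tools": any(p.get("supports_tools") for p in providers),
        "structured_output": any(p.get("supports_structured_output") for p in providers),
        "max_context": max((p.get("context_length", 0) for p in providers), default=0),
    }


caps = capabilities(entry)
```

One open design question: whether capabilities should be the union across backends (as sketched here) or resolved per backend once a provider is selected.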
Considerations
- Caching strategy (avoid hitting the API on every import)
- Lazy vs eager loading
- Graceful fallback when API is unavailable
- Which providers to support initially (HuggingFace, Ollama, others?)
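The first three considerations could be combined in a lazy, TTL-cached fetch with fallback. A sketch under those assumptions (class and parameter names are illustrative):

```python
import time


class LazyModelCache:
    """Fetch the model list on first use, cache it with a TTL, and fall back
    to the last good (possibly empty) list when the API is unreachable."""

    def __init__(self, fetch, ttl_seconds: float = 3600.0):
        self._fetch = fetch          # callable returning a list of model IDs
        self._ttl = ttl_seconds
        self._models: list[str] = []
        self._fetched_at: float | None = None

    def models(self) -> list[str]:
        stale = (self._fetched_at is None
                 or time.monotonic() - self._fetched_at > self._ttl)
        if stale:
            try:
                self._models = self._fetch()
                self._fetched_at = time.monotonic()
            except OSError:
                # Graceful fallback: keep the previous list, retry next call.
                pass
        return self._models


# Usage: nothing is fetched at import time, only on first access
# (model IDs invented).
cache = LazyModelCache(fetch=lambda: ["org/model-a", "org/model-b"])
models = cache.models()
```

Constructing the cache does no I/O, so importing the library stays fast and offline-safe; the network is only touched when someone actually asks for the model list.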
Related: #148, #165