[Feature]: New config flag: Enforce token usage for streamed responses #22280

@darebfh

Description

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

I propose an additional config flag `ENFORCE_STREAMED_USAGE` that, for requests with `"stream": true`, adds the optional attribute

```json
"stream_options": {
  "include_usage": true
}
```

Currently, this can only be achieved with a custom pre_call_hook.
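The pre_call_hook workaround boils down to mutating the incoming request body before it is forwarded. A minimal sketch of that core logic (the function name and standalone form are illustrative, not LiteLLM's actual hook signature):

```python
def enforce_streamed_usage(data: dict) -> dict:
    """Inject stream_options.include_usage for streaming requests.

    This is the behavior the proposed ENFORCE_STREAMED_USAGE flag
    would apply automatically; today it must live in a custom
    pre_call_hook on the proxy.
    """
    if data.get("stream"):
        # Don't clobber a stream_options block the caller already sent.
        data.setdefault("stream_options", {})
        data["stream_options"].setdefault("include_usage", True)
    return data


# Example: a streaming request without stream_options gets it injected.
body = {"model": "some-model", "stream": True,
        "messages": [{"role": "user", "content": "Hi"}]}
enforce_streamed_usage(body)
```

Non-streaming requests pass through untouched, and an explicit `include_usage` value set by the caller is preserved.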

Motivation, pitch

Users might avoid token limits by streaming outputs only: streamed responses currently require the optional attribute

```json
"stream_options": {
  "include_usage": true
}
```

so that LiteLLM can track the input and output tokens of the request. Without that attribute, the reported token count is zero.
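For reference, a full request body carrying the attribute looks like this (a minimal sketch; the model name is only an example, not taken from the issue):

```python
import json

# Chat-completions request body with streaming enabled and usage
# reporting requested. With include_usage set, the stream's final
# chunk carries a usage object (prompt/completion token counts)
# that the proxy can log; without it, usage is absent.
payload = {
    "model": "gpt-4o",  # example model name
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
    "stream_options": {"include_usage": True},
}

print(json.dumps(payload, indent=2))
```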

What part of LiteLLM is this about?

Proxy

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

No

Twitter / LinkedIn details

No response
