Add device context and tensor identification params to custom tensor buffer handlers#6975
devathul wants to merge 2 commits into google-ai-edge:main
Conversation
…or buffer implementation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View the failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
This patch introduces three parameters to the custom buffer handler signatures to enable deterministic hardware memory mapping:
LiteRtDispatchDeviceContext: Provides the hardware-specific context directly to the handler.
tensor_index: Identifies the specific position within the tensor array.
is_input: Disambiguates between input and output tensor arrays.
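A minimal sketch of what the enhanced handler signature could look like with the three parameters above. The real LiteRT types are replaced with opaque stand-ins here, and the parameter order and the remaining parameters are assumptions, not the actual header contents:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins for the real LiteRT handle types (assumption: the
 * actual definitions live in the LiteRT headers). */
typedef struct LiteRtDispatchDeviceContextT* LiteRtDispatchDeviceContext;
typedef struct LiteRtTensorBufferT* LiteRtTensorBuffer;

/* Hypothetical shape of the enhanced create handler; only the three
 * added parameters are taken from the proposal above. */
typedef LiteRtTensorBuffer (*EnhancedCreateFunc)(
    LiteRtDispatchDeviceContext device_context, /* hardware context */
    size_t tensor_index,                        /* position in the array */
    bool is_input,                              /* input vs output array */
    size_t bytes);

/* With tensor_index and is_input, a vendor can compute a deterministic
 * slot into its table of pre-allocated hardware payloads. */
static size_t payload_slot(bool is_input, size_t tensor_index,
                           size_t num_inputs) {
  return is_input ? tensor_index : num_inputs + tensor_index;
}
```

The `payload_slot` helper is only an illustration of why the two identification parameters are sufficient for a deterministic mapping.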
Changes:
API Update: Enhanced CreateCustomTensorBuffer and ImportCustomTensorBuffer in litert_custom_tensor_buffer.h with new context and identification parameters.
New Public API: Added LiteRtCompiledModelCreateBufferForIoTensor to the C API and runtime to allow explicit buffer creation for named I/O tensors.
Runtime Integration: Updated DispatchDelegateKernel to track tensor IDs and ExternalLiteRtBufferContext to propagate metadata during the allocation flow.
Added CreateManagedTensorBufferWithContext and updated CustomBuffer::Alloc to support context-aware allocation.
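As a rough illustration of the allocation flow described above (all names and shapes here are assumptions, not the actual runtime code), the runtime can forward the tracked tensor identity to the registered handler as it allocates inputs and then outputs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the real device context type. */
typedef struct DeviceContextT { int id; } DeviceContextT;

/* Hypothetical handler type carrying the new identification params. */
typedef int (*CreateBufferFn)(DeviceContextT* ctx, size_t tensor_index,
                              bool is_input);

/* Simulated runtime loop: allocate buffers for all inputs, then all
 * outputs, passing the metadata so the handler can disambiguate. */
static void allocate_io_buffers(DeviceContextT* ctx, size_t num_inputs,
                                size_t num_outputs, CreateBufferFn create,
                                int* out_handles) {
  size_t n = 0;
  for (size_t i = 0; i < num_inputs; ++i)
    out_handles[n++] = create(ctx, i, true);
  for (size_t i = 0; i < num_outputs; ++i)
    out_handles[n++] = create(ctx, i, false);
}

/* Sample vendor handler: encodes the identity into the handle value,
 * so each call maps to a distinct, predictable payload. */
static int sample_create(DeviceContextT* ctx, size_t tensor_index,
                         bool is_input) {
  return ctx->id * 100 + (is_input ? 0 : 50) + (int)tensor_index;
}

/* Demo: with 2 inputs and 2 outputs, the fourth handle is output #1. */
static int demo_output_handle(void) {
  DeviceContextT ctx = { .id = 7 };
  int handles[4];
  allocate_io_buffers(&ctx, 2, 2, sample_create, handles);
  return handles[3];
}
```

Without the `tensor_index`/`is_input` parameters, `sample_create` would see four indistinguishable calls and could not produce this mapping.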
The current custom tensor buffer handler signatures (LiteRtTensorBufferCreateFunc) provide no device context, model information, or tensor index parameters, making it impossible to assign hardware-specific memory addresses during buffer creation.
LiteRT's LiteRtRegisterTensorBufferHandlers API provides no way to determine which tensor a given handler invocation corresponds to, or whether it belongs to the input or output array.
This PR proposes an enhanced LiteRtTensorBufferCreateFunc API signature that includes LiteRtDispatchDeviceContext, tensor_index, and is_input parameters, enabling vendors to achieve zero-copy I/O tensors.
Since LiteRT invokes CreateBstmTensorBuffer sequentially for every input and output, there is no deterministic way to map a specific call to its corresponding pre-allocated hardware payload. Without identifying metadata, assigning the correct hardware memory addresses to the appropriate tensor buffers becomes an impossible mapping problem.
This change also removes the need to use global variables to pass LiteRtDispatchDeviceContext into the handlers.
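A sketch of that difference, with hypothetical names: without a context parameter, the only way for a handler to reach the device context is a global set before inference; with the new parameter, the context flows through the call itself:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real device context type. */
typedef struct DeviceCtx { int device_id; } DeviceCtx;

/* Before (workaround): the handler signature had no context parameter,
 * so the vendor had to stash the context in a global variable. */
static DeviceCtx* g_device_ctx = NULL;
static int create_buffer_old(size_t bytes) {
  (void)bytes;
  return g_device_ctx ? g_device_ctx->device_id : -1;
}

/* After: the context arrives as an explicit parameter; no global state,
 * and multiple device contexts can coexist safely. */
static int create_buffer_new(DeviceCtx* ctx, size_t bytes) {
  (void)bytes;
  return ctx->device_id;
}
```

The parameterized form is also the only one that works when two compiled models with different device contexts allocate buffers concurrently.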