Feat: revamp build.py CLI to improve usability and maintainability#8437
Feat: revamp build.py CLI to improve usability and maintainability#8437kpedro88 wants to merge 14 commits intotriton-inference-server:mainfrom
Conversation
|
this does produce, in my opinion, a more helpful |
whoisj
left a comment
There was a problem hiding this comment.
Not sure, but I think there might an error in the examples in the docs.
* show defaults in help * remove redundant statements of default values * change default None to [] for lists * improve argparse usage: use nargs instead of manual parsing, restrict choices where possible * unify handling of all similar elements: components, backends, repoagents, caches, filesystems, endpoints, features
a3d469d to
b564d3b
Compare
|
Now rebased |
|
This change looks like a preset instructions implementation. I can't accept this change right away, as I see it can be achieved natively through CMake, and I would like to delegate or reuse it native functionality instead of extend Will try to schedule work on it, next week to see what we can achieve. Sure would be nice to have veriety of examples against suggested changes which you are looking for. I believe there a few cases and have them handy can help to come out faster with full review. @kpedro88 is it possible to share with us build instructions you are using? build.py --enable-all
# or
./build.py \
-v \
--build-type=Release \
--build-id=290398589 \
--version=2.68.0dev \
--target-platform=rhel \
--container-version=26.03 \
--upstream-container-version=26.03 \
--github-organization=https://github.com/triton-inference-server \
--no-container-interactive \
--enable-logging \
--enable-stats \
--enable-tracing \
--enable-metrics \
--enable-gpu-metrics \
--enable-cpu-metrics \
--enable-gpu \
--filesystem=gcs \
--filesystem=s3 \
--filesystem=azure_storage \
--endpoint=http \
--endpoint=grpc \
--repo-tag=common:main \
--repo-tag=core:main \
--repo-tag=backend:main \
--repo-tag=thirdparty:main \
--backend=ensemble \
--backend=identity:main \
--backend=repeat:main \
--backend=square:main \
--repoagent=checksum:main \
--cache=local:main \
--cache=redis:main \
--extra-core-cmake-arg=TRITON_IGPU_BUILD=0 \
--backend=tensorrt:main \
--backend=python:main \
--backend=pytorch:main \
--backend=onnxruntime:main |
|
@mc-nv I was starting to look at this today based on our discussion, but I am starting leave early (right now). I will address this when I am back in a few weeks. |
What does the PR do?
This pull request updates the command-line interface for build.py with two main ideas in mind.
A. Use native features of
argparserather than (potentially brittle) manual parsing of multi-valued inputs:choicesto enforce limited sets of values:--target-platform {linux,rhel,windows,igpu}--build-type {Release,Debug,RelWithDebInfo,MinSizeRel}nargsinstead of separate postprocessing to split by separating characters:--image IMAGE Use specified Docker image in build as <image-name>,<full-image-name>. <image-name> can be "base", "gpu-base", or "pytorch".->
--image <image-name> <full-image-name> Use specified Docker image in build. <image-name> can be "base", "gpu-base", or "pytorch". (default: [])B. Unify the handling of different elements of the build to reduce code duplication and make it easier to add new features to control the build more precisely and flexibly.
The different elements are: components, backends, repoagents, caches, filesystems, endpoints, features.
Each type of element can be (internally) assigned different properties, which determine the available command-line arguments for that element type.
The properties include:
required: elements without this property get--enableand--disableflags.strict: elements with this property are limited to a predefined set of choices (e.g.local,redisfor caches).tag: elements with this property get--<element>-tag <element> <tag>flags to specify a tag from the central repository.org: elements with this property get--<element>-org <element> <org>flags to specify a different repository. (new feature)cmake: elements with this property get--extra-<element>-cmake-arg <element> <name> <value>and--override-<element>-cmake-arg <element> <name> <value>flags to modify their build.The resulting arguments become:
--enable-logging,--enable-stats,--enable-metrics,--enable-gpu-metrics,--enable-cpu-metrics,--enable-tracing,--enable-nvtx,--enable-gpu,--enable-mali-gpu->
--enable-feature [<feature> [<feature> ...] ...]--disable-feature [<feature> [<feature> ...] ...](in case starting from--enable-all)--endpoint ENDPOINT->
--enable-endpoint [<endpoint> [<endpoint> ...] ...]--disable-endpoint [<endpoint> [<endpoint> ...] ...]--filesystem FILESYSTEM->
--enable-filesystem [<filesystem> [<filesystem> ...] ...]--disable-filesystem [<filesystem> [<filesystem> ...] ...]--backend BACKEND Include specified backend in build as <backend-name>[:<repo-tag>]...--extra-backend-cmake-arg EXTRA_BACKEND_CMAKE_ARG Extra CMake argument for a backend build as <backend>:<name>=<value>.--override-backend-cmake-arg OVERRIDE_BACKEND_CMAKE_ARG Override specified backend CMake argument in the build as <backend>:<name>=<value>.->
--enable-backend [<backend> [<backend> ...] ...]--disable-backend [<backend> [<backend> ...] ...]--backend-tag <backend> <tag>--backend-org <backend> <org>--extra-backend-cmake-arg <backend> <name> <value>--override-backend-cmake-arg <backend> <name> <value>--repo-tag REPO_TAG The version of a component to use in the build as <component-name>:<repo-tag>.--extra-core-cmake-arg EXTRA_CORE_CMAKE_ARG Extra CMake argument as <name>=<value>.--override-core-cmake-arg OVERRIDE_CORE_CMAKE_ARG Override specified CMake argument in the build as <name>=<value>.->
--component-tag <component> <tag>--extra-core-cmake-arg <name> <value>--override-core-cmake-arg <name> <value>--repoagent REPOAGENT Include specified repo agent in build as <repoagent-name>[:<repo-tag>]->
--enable-repoagent [<repoagent> [<repoagent> ...] ...]--disable-repoagent [<repoagent> [<repoagent> ...] ...]--repoagent-tag <repoagent> <tag>--repoagent-org <repoagent> <org>--extra-repoagent-cmake-arg <repoagent> <name> <value>--override-repoagent-cmake-arg <repoagent> <name> <value>--cache CACHE Include specified cache in build as <cache-name>[:<repo-tag>].->
--enable-cache [<cache> [<cache> ...] ...]--disable-cache [<cache> [<cache> ...] ...]--cache-tag <cache> <tag>--cache-org <cache> <org>--extra-cache-cmake-arg <cache> <name> <value>--override-cache-cmake-arg <cache> <name> <value>Several minor features are also added:
--no-container-cache, which propagates todocker --no-cache.--default-repo-tag <tag>to override the calculated default value, which is not always appropriate. For example, when trying to build a dev version (currently 25.08), the upstream container version is set to the previous version (25.07), which takes precedence in the calculated default value, but using the corresponding dev versions of component and backend repositories may be intended. Rather than having to override it for each repo, it is useful to be able to override it globally.--use-buildbaseto use the temporary "buildbase" image as the "base" image for backends that need it (e.g. onnxruntime).--library-pathsis found not to be used anywhere in the script, and is therefore removed (after demonstrating how it could be incorporated in the new scheme).Checklist
Agreement
<commit_type>: <Title>pre-commit install, pre-commit run --all)Commit Type:
Check the conventional commit type
box here and add the label to the github PR.
Related PRs:
This PR includes the changes from #8362.
Where should the reviewer start?
Changes only affect build.py and associated documentation.
Here is the help message before any of these changes:
Details
Here is the help message after all of the changes:
Details
Test plan:
I have used the script to build containers in several configurations and found that everything works as desired.
There are changes to some of the CLI flags, listed above.
My understanding from the discussion in #8362 is that there are internal
build.pycommands that need to be checked. I am happy to convert and check these if they can be provided, but I currently have no insight into the exact internal commands that need to be maintained.Caveats:
More features could be added for the "component" elements: using forked repositories, modifying CMake arguments (beyond
core). However, these changes would require modifying theCMakeLists.txtfiles in the various component repositories, and therefore are left for future PRs (so that this PR only involves modifications tobuild.py).Background
In the process of adding minor features in #8362, I noticed some friction: duplication of code, multiple parsing steps, etc. The partial refactor, implemented here, allowed new options for different elements of the build process to be added much more easily (see 304b1a7 for an example, which resolves an existing todo item noted in the code).
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
N/A