[Windows] Support CPU shared memory (Client/Frontend) #7048
fpetrini15 wants to merge 3302 commits into main from
Conversation
* Update README and versions for 2.36.0 / 23.07; update Dockerfile.win10.min; fix formatting and whitespace issues
* Reduce instance count to 1 for the Python BLS model loading test; add a comment when calling unload
* Fix queue test to expect an exact number of failures; increase the execution time to more accurately capture requests
* …6063) Add tests for BLS; refactor so that the OTel tracer provider is initialized only once; add a resource cmd option and docs; add an ostream exporter to OpenTelemetry for testing purposes; set up an OpenTelemetry trace collector in tests and refactor the OTel exporter tests to use the collector instead of netcat; add a comment regarding 'parent_span_id' (Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>)
* Add tests for Python 3.8-3.11 for L0_python_backends
* Improve L0_backend_python debugging; use a utils function for artifact collection
* Update docs with NVAIE messaging
* …#6140) Remove the test checking for the --shape option, then remove the entire test
* …same time (#6150) Add a load/unload stress test for when unload/load requests for the same model are received at the same time (test_same_model_overlapping_load_unload); record the number of occurrences of each exception
* Add a debugging guide; run pre-commit (Co-authored-by: David Yastremsky <dyastremsky@nvidia.com>)
* Add utility functions for outlier removal; add a newline to the end of the file
* Testing: add a gc collect to make sure the GPU tensor is deallocated
* Initial commit; clean up using new standard formatting; QA test restructuring; change the HTTP/GRPC protocol to pivot on ready status and error status; change the log file name in the QA test; fix the unhandled-error and index-function memory leaks
* Enable autodocs for the Python client library; fix formatting and spelling
* Deprecate dynamic log file; update error message
* Add an async execute decoupled test and a decoupled BLS async exec test; enhance the test with different durations for concurrent executes
* Add trace_mode and trace_config to the getTraceSettings API (Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>)
```python
triton_client = httpclient.InferenceServerClient(_url, verbose=True)

# Custom setup method to allow passing of parameters
def _setUp(self, protocol, log_file_path):
    self._tritonserver_ipaddr = os.environ.get("TRITONSERVER_IPADDR", "localhost")
```
Does this need to be configurable in practice? Do we expect to use shared memory for anything other than a co-located server on localhost?
TBD: currently, on the Windows testing side of things, it's passed in as a variable and differs from "localhost". Still trying to get a CI pipeline up to see the new behavior for this test in particular. Will remove if there's no issue.
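As a sketch of the pattern under discussion (port 8000 here is only Triton's default HTTP port, not something this thread specifies):

```python
import os

# Default to localhost for a co-located server, but let CI override the
# address through the TRITONSERVER_IPADDR environment variable.
ipaddr = os.environ.get("TRITONSERVER_IPADDR", "localhost")
url = f"{ipaddr}:8000"  # 8000 is Triton's default HTTP port
print(url)
```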
```python
self._build_model_repo()
self._build_server_args()
self._shared_memory_test_server_log = open(log_file_path, "w")
self._server_process = util.run_server(
```
How does util.run_server interact with test.sh also starting server? Is there conflict or issue there?
I don't believe they should overlap. For this test my ultimate goal is to remove test.sh entirely.
Isn't this also getting run in the Linux case that runs test.sh? Or are there changes on the GitLab side to not run test.sh at all?
Ah, I see your point. There are changes on the GitLab side such that test.sh will not run at all for Windows. I will attempt to change the Linux test case so that it also does not run test.sh.
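For readers following the thread, a minimal sketch of what a `util.run_server` helper might look like (the real helper lives in the PR's util module; this name and signature are assumptions for illustration):

```python
import subprocess

def run_server(executable, args, log_file):
    # Hypothetical stand-in for util.run_server: launch the server as a
    # subprocess so the Python test owns its lifecycle instead of test.sh.
    # Stdout and stderr are merged into the provided log file handle.
    return subprocess.Popen(
        [executable] + list(args), stdout=log_file, stderr=subprocess.STDOUT
    )
```

The test would hold onto the returned `Popen` handle and terminate/wait on it during teardown.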
```python
backend_dir = "C:\\opt\\tritonserver\\backends"
model_dir = "C:\\opt\\tritonserver\\qa\\L0_shared_memory\\models"
self._server_executable = "C:\\opt\\tritonserver\\bin\\tritonserver.exe"
```
Probably more of a random note or follow-up, but I was under the impression that something like `pathlib.Path("/opt/tritonserver/backends")` would translate to `"C:\\opt\\tritonserver\\backends"` for free when run on Windows. If so, you could probably condense the cases to work for both.
Did you see otherwise?
No, I believe you are right. ATM they are set to my local path and were hardcoded for convenience. They need to be modified and will once I determine the CI environment.
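One hedged caveat on the pathlib point above: pathlib does normalize separators, but a bare "/opt/..." path does not gain a drive letter on Windows, so the "C:" prefix still has to be supplied explicitly. `PureWindowsPath` lets this be checked from any platform:

```python
from pathlib import PureWindowsPath

# Separators are normalized, but no drive letter is added for free.
no_drive = PureWindowsPath("/opt/tritonserver/backends")
with_drive = PureWindowsPath("C:/opt/tritonserver/backends")
print(str(no_drive))    # \opt\tritonserver\backends
print(str(with_drive))  # C:\opt\tritonserver\backends
```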
```python
# Constant members
shared_memory_test_client_log = Path(os.getcwd()) / "client.log"
model_dir_path = Path(os.getcwd()) / "models"
model_source_path = Path(os.getcwd()).parents[0] / "python_models/add_sub/model.py"
```
Future follow-up as we expand python utilities for CI testing, but might be nice to have some kind of utils.relative_path([path, to, thing]).
For example, maybe something like this:

```python
model_dir_path = utils.relative_path("models")
model_source_path = utils.relative_path("..", "python_models", "add_sub", "model.py")
```
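A hedged sketch of what such a helper could look like (`utils.relative_path` is the reviewer's hypothetical name, not an existing function):

```python
import os
from pathlib import Path

def relative_path(*parts: str) -> Path:
    # Hypothetical utils.relative_path: join the parts onto the test's
    # working directory and normalize any ".." components.
    return Path(os.getcwd()).joinpath(*parts).resolve()
```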
nv-kmcgill53 left a comment:
LGTM. Great work on this!
GuanLuo left a comment:
Left some comments; they can be addressed in the future PR that adds the cleanup logic.
@fpetrini15
@rmccorm4 @GuanLuo @nv-kmcgill53
Goal: Support CPU shared memory between the server and client for Windows.
Sub-goal: Modify L0_shared_memory to run on bare-metal Windows using only Python.
Client changes: triton-inference-server/client#551
Some things to note:
test.sh
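To illustrate the mechanism the PR builds on (this uses Python's stdlib, not the Triton client API; the actual client changes live in triton-inference-server/client#551): a named CPU shared-memory region created by one process can be attached by name from another, which is how the client and a co-located server exchange tensor data without copies over the wire.

```python
from multiprocessing import shared_memory

# "Server" side: create a named CPU shared-memory region and write into it.
shm = shared_memory.SharedMemory(create=True, size=16)
try:
    shm.buf[:5] = b"hello"

    # "Client" side: attach to the same region by name and read it back.
    peer = shared_memory.SharedMemory(name=shm.name)
    data = bytes(peer.buf[:5])
    peer.close()
finally:
    shm.close()
    shm.unlink()

print(data)  # b'hello'
```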