27 changes: 20 additions & 7 deletions src/pb_stub.cc
@@ -1059,13 +1059,7 @@ Stub::~Stub()
   }
 #endif

-  // Ensure the interpreter is active before trying to clean up.
-  if (Py_IsInitialized()) {
-    py::gil_scoped_acquire acquire;
-    py::object async_event_loop_local(std::move(async_event_loop_));
-    py::object background_futures_local(std::move(background_futures_));
-    py::object model_instance_local(std::move(model_instance_));
-  }
+  DestroyPythonObjects();

   stub_message_queue_.reset();
   parent_message_queue_.reset();
@@ -1088,6 +1082,11 @@ Stub::GetOrCreateInstance()
 void
 Stub::DestroyInstance()
 {

Copilot AI Apr 13, 2026

Stub::DestroyInstance() unconditionally dereferences stub_instance. If DestroyInstance() is called before GetOrCreateInstance() (or called twice), this will crash. Add a null check (e.g., early-return if !stub_instance) before calling DestroyPythonObjects() / reset().

Suggested change
- {
+ {
+   if (!stub_instance) {
+     return;
+   }

Author

Done ✅

+  if (!stub_instance) {
+    return;
+  }
+
+  stub_instance->DestroyPythonObjects();
   stub_instance.reset();
 }
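
The null check discussed in this thread can be illustrated in isolation. The following is a minimal sketch, not the python_backend code: `DemoStub`, `ReleaseResources()`, `HasInstance()`, and the `bool` return value are hypothetical stand-ins (the real `Stub::DestroyInstance()` returns `void`). It shows how an early return makes the destroy call safe before creation and safe to repeat.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for Stub; not the real python_backend class.
class DemoStub {
 public:
  static std::unique_ptr<DemoStub>& GetOrCreateInstance() {
    if (!instance_) {
      instance_.reset(new DemoStub());
    }
    return instance_;
  }

  // Returns true if an instance was destroyed, false if there was none.
  // The bool is an illustrative addition for this sketch only.
  static bool DestroyInstance() {
    if (!instance_) {
      return false;  // Never created, or already destroyed: nothing to do.
    }
    instance_->ReleaseResources();  // Safe: instance_ is known non-null.
    instance_.reset();
    return true;
  }

  static bool HasInstance() { return instance_ != nullptr; }

 private:
  DemoStub() = default;
  void ReleaseResources() {}  // Placeholder for DestroyPythonObjects().

  static std::unique_ptr<DemoStub> instance_;
};

std::unique_ptr<DemoStub> DemoStub::instance_;
```

Calling `DemoStub::DestroyInstance()` before any `GetOrCreateInstance()` returns `false` instead of dereferencing a null pointer, and a second call after a successful destroy behaves the same way.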

@@ -1503,6 +1502,20 @@ Stub::GetCUDAMemoryPoolAddress(std::unique_ptr<IPCMessage>& ipc_message)
 #endif
 }

+void
+Stub::DestroyPythonObjects()
Comment thread
whoisj marked this conversation as resolved.
+{
+  // Ensure the interpreter is active before trying to clean up.
+  if (Py_IsInitialized()) {
+    py::gil_scoped_acquire acquire;
+    py::object async_event_loop_local(std::move(async_event_loop_));
+    py::object background_futures_local(std::move(background_futures_));
+    py::object model_instance_local(std::move(model_instance_));

Copilot AI Apr 13, 2026

DestroyPythonObjects() only clears async_event_loop_, background_futures_, and model_instance_, but Stub also owns other py::object members (deserialize_bytes_, serialize_bytes_). If those remain non-empty, they will be decref'd later during Stub destruction (potentially after py::scoped_interpreter teardown / without the GIL), which can still segfault. Consider moving/clearing all py::object members here (and ideally reuse this helper from ~Stub() to keep the cleanup logic in one place).

Suggested change
- py::object model_instance_local(std::move(model_instance_));
+ py::object model_instance_local(std::move(model_instance_));
+ py::object deserialize_bytes_local(std::move(deserialize_bytes_));
+ py::object serialize_bytes_local(std::move(serialize_bytes_));

Author

@aleksn7 aleksn7 Apr 13, 2026

@whoisj What do you think about this? Should we follow Copilot's suggestion here?

Contributor

If the other fields are not a problem, then it doesn't matter, honestly.

+    py::object deserialize_bytes_local(std::move(deserialize_bytes_));
+    py::object serialize_bytes_local(std::move(serialize_bytes_));
+  }
+}
+
 void
 Stub::ProcessBLSResponseDecoupled(std::unique_ptr<IPCMessage>& ipc_message)
 {
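
The move-into-locals idiom in `DestroyPythonObjects()` relies on C++ destroying local variables in reverse order of construction, so a guard declared first outlives every object moved in after it. Below is a minimal sketch without pybind11, under stated assumptions: `ScopedLock` stands in for `py::gil_scoped_acquire`, `Handle` for `py::object`, and the global flag and counters are hypothetical instrumentation added only to make the ordering observable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumentation: lock_held plays the role of the GIL.
static bool lock_held = false;
static int released_with_lock = 0;
static int released_without_lock = 0;

// Stand-in for py::gil_scoped_acquire.
struct ScopedLock {
  ScopedLock() { lock_held = true; }
  ~ScopedLock() { lock_held = false; }
};

// Stand-in for py::object: move-only, and moving leaves the source empty.
class Handle {
 public:
  Handle() = default;
  explicit Handle(bool owns) : owns_(owns) {}
  Handle(Handle&& other) noexcept : owns_(other.owns_) { other.owns_ = false; }
  Handle(const Handle&) = delete;
  Handle& operator=(const Handle&) = delete;

  ~Handle() {
    if (!owns_) {
      return;  // Empty handle: nothing to release.
    }
    if (lock_held) {
      ++released_with_lock;
    } else {
      ++released_without_lock;
    }
  }

  bool owns() const { return owns_; }

 private:
  bool owns_ = false;
};

struct Owner {
  Handle a{true};
  Handle b{true};

  void DestroyHandles() {
    ScopedLock lock;               // Constructed first, destroyed last.
    Handle a_local(std::move(a));  // Member a is now empty.
    Handle b_local(std::move(b));  // Member b is now empty.
  }  // b_local and a_local are released here, then lock is dropped.
};
```

After `DestroyHandles()` returns, the members are empty, so the later destruction of `Owner` releases nothing outside the lock. Any member that is not moved out this way would still be released by `Owner`'s destructor, which is the concern raised in the review comment above.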
5 changes: 5 additions & 0 deletions src/pb_stub.h
@@ -267,6 +267,11 @@ class Stub {
   /// Get the CUDA memory pool address from the parent process.
   void GetCUDAMemoryPoolAddress(std::unique_ptr<IPCMessage>& ipc_message);

+  /// Cleans up Python objects and must be called before the destructor.
+  /// This prevents problems that occur when Python object destructors
+  /// call Stub::GetOrCreateInstance.
+  void DestroyPythonObjects();
+
   /// Calls the user's is_ready() Python method and returns its response
   /// when handling model readiness check requests.
   void ProcessUserModelReadinessRequest(
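
The doc comment on `DestroyPythonObjects()` describes a two-phase teardown: release the objects whose destructors may call back into the singleton accessor first, then free the singleton. The sketch below illustrates that ordering under an assumption the PR does not spell out, namely that the trouble arises when a callback reaches `GetOrCreateInstance` while the owning pointer is being reset. `HookedStub`, `SetCleanupHook()`, and `CreatedCount()` are hypothetical names; the hook plays the role of a Python object's destructor.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for Stub; not the real python_backend class.
class HookedStub {
 public:
  static std::unique_ptr<HookedStub>& GetOrCreateInstance() {
    if (!instance_) {
      instance_.reset(new HookedStub());
      ++created_;
    }
    return instance_;
  }

  static void DestroyInstance() {
    if (!instance_) {
      return;
    }
    // Phase 1: run callbacks while instance_ still points at this object.
    instance_->RunCleanupHook();
    // Phase 2: free the object; no callback is left to run.
    instance_.reset();
  }

  static bool HasInstance() { return instance_ != nullptr; }
  static int CreatedCount() { return created_; }

  void SetCleanupHook(std::function<void()> hook) { hook_ = std::move(hook); }

 private:
  HookedStub() = default;

  void RunCleanupHook() {
    // Move the hook out first so it runs at most once.
    std::function<void()> hook = std::move(hook_);
    hook_ = nullptr;
    if (hook) {
      hook();
    }
  }

  static std::unique_ptr<HookedStub> instance_;
  static int created_;
  std::function<void()> hook_;
};

std::unique_ptr<HookedStub> HookedStub::instance_;
int HookedStub::created_ = 0;
```

Because the hook runs in phase 1, a call to `GetOrCreateInstance()` from inside it finds the existing instance and does not construct a second one during teardown.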