Add changes for compatibility with WASM components and collocated UDF servers#121

Open
kesmit13 wants to merge 19 commits into main from wasm-compat

kesmit13 commented Apr 1, 2026

This PR makes several changes to allow singlestoredb to work in WASM environments. Many of these changes, such as lazy loading of numpy, pandas, polars, and pyarrow, benefit standard installations as well. Others relocate imports that are needed only in certain environments and not within WASM.

A new collocated UDF server implementation is also included. It uses a high-performance loop in the C extension to parse each row and call the Python function on it. This function is used by both standard collocated servers and WASM-based UDF handlers.


Note

High Risk
Adds a large new collocated UDF server and expands the C extension with new mmap/socket I/O and a combined parse/call/serialize loop, which are performance- and memory-safety-sensitive changes. Also refactors optional dependency loading and result-shaping logic, which can subtly affect runtime behavior across pandas/polars/arrow/numpy users and WASM builds.

Overview
Adds a new collocated Python UDF server (singlestoredb.functions.ext.collocated) with a CLI (python-udf-server) that speaks the same Unix-socket protocol as the Rust wasm-udf-server, including @@control signals (health/functions/register), threaded or pre-fork process execution, and mmap-based request/response I/O.

Extends the _singlestoredb_accel C module with call_function_accel (single-pass rowdat_1 decode → Python callable invocation → rowdat_1 encode) plus low-level helpers mmap_read, mmap_write, and recv_exact (poll-based timeout, GIL released), with WASI stubs where unsupported; also hardens several PyObject_Length calls by checking for negative error returns.

Improves WASM/optional-dependency compatibility by lazy-importing numpy/pandas/polars/pyarrow and converting dtype maps to cached getters, moving jwt imports into call sites, broadening a few platform error catches (e.g., getpass, IPython), and updating _iquery to normalize results from pandas/polars/arrow/numpy into a consistent list-of-dicts output with optional camelCase key conversion.

Written by Cursor Bugbot for commit 548edcc. This will update automatically on new commits.

kesmit13 and others added 12 commits March 19, 2026 15:38
Defer top-level `import jwt` to function scope in auth.py,
management/manager.py, and management/utils.py (jwt unavailable in WASM).
Catch OSError in mysql/connection.py getpass handling (pwd module
unavailable in WASM). Broaden except clause for IPython import in
utils/events.py.

Add singlestoredb/functions/ext/wasm/ package with udf_handler.py and
numpy_stub.py so componentize-py components can `pip install` this
branch and import directly from singlestoredb.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
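
The deferred-import pattern described in this commit can be illustrated with a minimal sketch. The function names below (`decode_claims`, `default_user`) are hypothetical and are not taken from the PR; the sketch only assumes that PyJWT and the `pwd`-backed `getpass` lookup may be missing at import time.

```python
def decode_claims(token: str) -> dict:
    """Decode a JWT payload without verifying its signature."""
    # Importing inside the function means that merely importing this
    # module does not require PyJWT (assumed unavailable in WASM builds).
    import jwt

    return jwt.decode(token, options={"verify_signature": False})


def default_user() -> str:
    """Best-effort lookup of the current user name."""
    try:
        import getpass

        return getpass.getuser()
    except (ImportError, KeyError, OSError):
        # OSError covers platforms where the user database is missing.
        return ""
```

With this shape, a missing dependency only surfaces when the specific function that needs it is called.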
Required by componentize-py to build function-handler components.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Build complete @udf-decorated Python functions from signature metadata
and raw function body instead of requiring full source code. This adds
dtype-to-Python type mapping and constructs properly annotated functions
at registration time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Heavy optional dependencies (numpy, pandas, polars, pyarrow) were
imported at module load time, causing failures in WASM environments
where these packages may not be available. This adds a lazy import
utility module and converts all eager try/except import patterns to
use cached lazy accessors. Type maps in dtypes.py are also converted
from module-level dicts to lru_cached factory functions. The pandas
DataFrame isinstance check in connection.py is replaced with a
duck-type hasattr check to avoid importing pandas at module scope.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
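
A rough sketch of the cached lazy accessor idea follows. The names `get_module` and `get_numpy_type_map`, and the two map entries, are illustrative assumptions and do not reproduce the contents of `_lazy_import.py` or `dtypes.py`.

```python
import functools
import importlib
from types import ModuleType
from typing import Any, Dict, Optional


@functools.lru_cache(maxsize=None)
def get_module(name: str) -> Optional[ModuleType]:
    """Import `name` on first use; return None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


@functools.lru_cache(maxsize=None)
def get_numpy_type_map() -> Dict[str, Any]:
    """Build the type map on first call instead of at module import."""
    np = get_module("numpy")
    if np is None:
        return {}
    # Hypothetical entries, for illustration only.
    return {"DOUBLE": np.float64, "BIGINT": np.int64}
```

One consequence of caching the accessor is that a failed import is also remembered, so a package installed later in the same process would not be picked up without clearing the cache.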
Replace `str | None` with `Optional[str]` to maintain compatibility
with Python 3.9 and earlier.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add the call_function_accel function directly to accel.c, implementing
a combined load/call/dump operation for UDF function calls. This function
handles rowdat_1 deserialization, Python UDF invocation, and result
serialization in a single optimized C implementation.

Previously this function was injected at build time via a patch script
in the wasm-udf-server repository. Moving it into the source tree is a
prerequisite for cleaning up the custom componentize-py builder and
simplifying the WASM component build process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add resources/build_wasm.sh that cross-compiles the package as a WASM
wheel targeting wasm32-wasip2. The script sets up a host venv, configures
the WASI SDK toolchain (clang, ar, linker flags), and uses `python -m
build` to produce the wheel, then unpacks it into build/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
numpy is lazy-loaded throughout the codebase via the _lazy_import
helpers, so the WASM numpy_stub that patched sys.modules['numpy']
is no longer needed. Delete the stub module and remove its
references from udf_handler.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a standalone collocated UDF server package that can run as a
drop-in replacement for the Rust wasm-udf-server. Uses pre-fork
worker processes (default) for true CPU parallelism, avoiding GIL
contention in the C-accelerated call path. Thread pool mode is
available via --process-mode thread.

Collapse the wasm subpackage into a single wasm.py module since it
only contained one class re-exported through __init__.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each forked worker previously created its own independent SharedRegistry
and FunctionRegistry. When @@register arrived at a worker, only that
worker's local registry was updated — the main process and sibling
workers never learned about the new function.

Add Unix pipe-based IPC (matching the R UDF server fix): each worker
gets a pipe back to the main process. When a worker handles @@register,
it writes the registration payload to its pipe. The main process reads
it via select.poll(), applies the registration to its own SharedRegistry,
then kills and re-forks all workers so they inherit the updated state.

Thread mode is unaffected — pipe_write_fd is None and the pipe write
is a no-op.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
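
The worker-to-main notification can be sketched as below. The length-prefixed JSON framing and the function names are assumptions made for this example; the commit message does not specify the payload format. The sketch also omits the kill-and-re-fork step and assumes a platform where `select.poll()` is available.

```python
import json
import os
import select
import struct
from typing import List, Optional


def send_registration(pipe_write_fd: Optional[int], payload: dict) -> None:
    """Worker side: forward a registration to the main process."""
    if pipe_write_fd is None:  # thread mode: nothing to forward
        return
    body = json.dumps(payload).encode("utf-8")
    # Small writes to a pipe are atomic; large payloads would need a loop.
    os.write(pipe_write_fd, struct.pack("<I", len(body)) + body)


def _read_exact(fd: int, n: int) -> bytes:
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def poll_registrations(read_fds: List[int], timeout_ms: int = 100) -> list:
    """Main side: collect pending registrations from the worker pipes."""
    poller = select.poll()
    for fd in read_fds:
        poller.register(fd, select.POLLIN)
    received = []
    for fd, events in poller.poll(timeout_ms):
        if not events & select.POLLIN:
            continue  # hang-up or error without readable data
        (length,) = struct.unpack("<I", _read_exact(fd, 4))
        received.append(json.loads(_read_exact(fd, length)))
    return received
```

In a pre-fork server, the main process would apply each received payload to its own registry before restarting the workers.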
Add poll()-based timeout to C recv_exact to avoid the interaction between
Python's settimeout() (which sets O_NONBLOCK on the fd) and direct
fd-level recv() in the C code. When the fd was non-blocking, recv()
returned EAGAIN immediately when no data was available, which the C code
treated as an error, closing the connection and causing EPIPE on the
client side.

- accel.c: Add optional timeout_ms parameter to recv_exact that uses
  poll(POLLIN) before each recv() call, raising TimeoutError on timeout.
  Also add mmap_read and mmap_write C helpers for fd-level I/O.
- connection.py: Only call settimeout() for the Python fallback path;
  keep fd blocking for C accel path. Pass 100ms timeout to C recv_exact.
  Catch TimeoutError instead of socket.timeout. Replace select() loop
  with timeout-based recv. Add C accel paths for mmap read/write.
  Add optional per-request profiling via SINGLESTOREDB_UDF_PROFILE=1.
- registry.py: Consolidate accel imports (mmap_read, mmap_write,
  recv_exact) under single _has_accel flag.
- wasm.py: Update to use renamed _has_accel flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
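
For readers unfamiliar with the technique, here is a Python analogue of a poll-based exact read. It is a sketch rather than a translation of the C `recv_exact`: the socket stays in blocking mode, and the wait is done with `poll()` instead of `settimeout()`. It also applies the timeout only before the first byte arrives, in line with the partial-read guard discussed later in this thread.

```python
import select
import socket


def recv_exact(sock: socket.socket, n: int, timeout_ms: int = 100) -> bytes:
    """Read exactly n bytes, waiting with poll() rather than settimeout()."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    buf = bytearray(n)
    view = memoryview(buf)
    pos = 0
    while pos < n:
        # Only time out while idle; once a message has started, block
        # until it is complete so the stream stays in sync.
        if pos == 0 and not poller.poll(timeout_ms):
            raise TimeoutError("no data within %d ms" % timeout_ms)
        got = sock.recv_into(view[pos:])
        if got == 0:
            raise ConnectionError("peer closed the connection")
        pos += got
    return bytes(buf)
```

Because the socket's own timeout is never touched, the file descriptor is not switched to non-blocking mode as a side effect.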
Copilot AI left a comment

Pull request overview

This PR introduces WASM-compatibility improvements (primarily by lazy-loading heavyweight optional dependencies and moving environment-specific imports into call sites) and adds a new collocated Python UDF server implementation, including a new C-extension hot path to accelerate rowdat_1 decode → Python call → rowdat_1 encode.

Changes:

  • Added a WIT interface definition and a WASM build helper script for external UDF component workflows.
  • Refactored optional dependency handling (numpy/pandas/polars/pyarrow, IPython, JWT) to be more robust in constrained/WASM-like environments.
  • Added a new collocated UDF server (socket + mmap protocol, thread/process modes, dynamic registration) and a C-extension accelerator entry point (call_function_accel).

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 10 comments.

Show a summary per file
| File | Description |
| --- | --- |
| wit/udf.wit | Defines the external UDF WIT interface and exported world. |
| singlestoredb/utils/_lazy_import.py | Adds cached lazy imports for heavy optional deps. |
| singlestoredb/utils/dtypes.py | Converts dtype maps to lazily-evaluated, cached getters. |
| singlestoredb/utils/results.py | Switches result formatting to lazy imports + cached type maps. |
| singlestoredb/utils/events.py | Broadens IPython import failure handling. |
| singlestoredb/converters.py | Uses lazy numpy import in vector converters. |
| singlestoredb/connection.py | Adjusts internal result-to-dict conversion to avoid importing pandas. |
| singlestoredb/mysql/connection.py | Adds WASM-friendly DEFAULT_USER detection (handles OSError). |
| singlestoredb/auth.py | Moves jwt import into call site. |
| singlestoredb/management/utils.py | Moves jwt import into call sites for WASM-friendliness. |
| singlestoredb/management/manager.py | Moves jwt import into is_jwt call site. |
| singlestoredb/functions/dtypes.py | Updates exports to use dtype-map getter functions. |
| singlestoredb/functions/ext/rowdat_1.py | Replaces eager dtype maps with lazy getter functions. |
| singlestoredb/functions/ext/json.py | Replaces eager dtype maps with lazy getter functions. |
| singlestoredb/functions/ext/collocated/* | Adds collocated server, protocol handling, registry, control signals, and WASM adapter. |
| singlestoredb/tests/test_connection.py | Makes pandas string dtype assertions version-tolerant. |
| resources/build_wasm.sh | Adds a build helper for wasm32-wasip2 wheels. |
| pyproject.toml | Adds python-udf-server CLI entry point. |
| accel.c | Adds call_function_accel C hot path and exports it from the extension module. |

When numpy is not available (e.g., WASM), the `np` name is undefined.
The has_numpy flag was already used elsewhere but this check was missed
when the numpy_stub was removed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The mmap_read, mmap_write, and recv_exact functions use poll.h,
sys/mman.h, and sys/socket.h which are unavailable in WASI. Wrap
these includes, function bodies, and PyMethodDef entries with
#ifndef __wasi__ guards so the C extension compiles for wasm32-wasip2.
The core call_function_accel optimization remains available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without this, the accel status log messages ("Using accelerated C
call_function_accel loop" / "Using pure Python call_function loop")
are silently dropped because no logging handler is configured in the
WASM handler path. setup_logging() was only called from __main__.py
(collocated server CLI).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The _singlestoredb_accel C extension ifdef'd out the mmap and socket
functions for __wasi__ builds, but registry.py imports all four symbols
(call_function_accel, mmap_read, mmap_write, recv_exact) in a single
try block. The missing exports caused the entire import to fail,
silently falling back to the pure Python call_function loop.

Add #else stubs that raise NotImplementedError if called, so the
symbols are importable and call_function_accel works in WASM. Also
capture the accel import error and log it in initialize() for future
diagnostics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 11 comments.


accel.c:
- Replace empty TODO type stubs with NotImplementedError raises
- Add CHECK_REMAINING macro for bounds checking on buffer reads
- Replace unaligned pointer-cast reads with memcpy for WASM/ARM safety
- Fix double-decref in output error paths (set to NULL before goto)
- Fix Py_None reference leak by removing pre-switch INCREF
- Fix MYSQL_TYPE_NULL consuming an extra byte from next column
- Add PyErr_Format in default switch cases
- Add PyErr_Occurred() checks after PyLong/PyFloat conversions

Python:
- Align list/tuple multi-return handling in registry.py with C path
- Add _write_all_fd helper for partial os.write() handling
- Harden handshake recvmsg: name length bound, ancdata validation,
  MSG_CTRUNC check, FD cleanup on error
- Wrap get_context('fork') with platform safety error
- Narrow events.py exception catch to (ImportError, OSError)
- Fix _iquery DataFrame check ordering (check before list())
- Expand setblocking(False) warning comment
- Update WIT and wasm.py docstrings for code parameter

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
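
The partial-write handling mentioned for `_write_all_fd` follows a common pattern, sketched here under the assumption that the helper simply loops until every byte has been accepted. The function below is a hypothetical stand-in, not the PR's implementation.

```python
import os


def write_all_fd(fd: int, data: bytes) -> None:
    """Write all of `data` to `fd`, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        # os.write may accept fewer bytes than requested.
        written = os.write(fd, view)
        view = view[written:]
```

Slicing a `memoryview` avoids copying the remaining bytes on each retry.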
@singlestore-labs singlestore-labs deleted a comment from bri-tong Apr 2, 2026
Guard against protocol desynchronization when poll() times out after
partial data has been consumed from the socket. In the C path
(accel_recv_exact), switch to blocking mode when pos > 0 so the
message is always completed. Apply the same fix to the Python fallback
(_recv_exact_py) by catching TimeoutError mid-read and removing the
socket timeout.

Add error checking at all PyObject_Length call sites that cast the
result to unsigned. PyObject_Length returns -1 on error, which when
cast to unsigned long long produces ULLONG_MAX, leading to massive
malloc allocations or out-of-bounds access. Each site now checks for
< 0 and gotos error before casting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
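
The wraparound described above can be reproduced from Python with `ctypes`, which performs the same unchecked conversion as a C cast. The `checked_length` helper is a hypothetical illustration of the "check for < 0 before casting" guard, not code from accel.c.

```python
import ctypes

# A -1 error return reinterpreted as unsigned long long wraps around.
wrapped = ctypes.c_ulonglong(-1).value


def checked_length(raw_length: int) -> int:
    """Reject negative (error) lengths before converting to unsigned."""
    if raw_length < 0:
        raise ValueError("length query failed")
    return ctypes.c_ulonglong(raw_length).value
```

Here `wrapped` equals `2**64 - 1`, which is why an unchecked length could lead to an enormous allocation request.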
_iquery must always return List[Dict[str, Any]], but when the connection
uses a non-tuple results_type (polars, pandas, numpy, arrow), the
specialized cursor's fetchall() returns a DataFrame/ndarray instead of
tuples. The previous code had two bugs:

1. list() on a DataFrame iterates by columns, producing Series objects
   instead of row dicts.
2. to_dict(orient='records') is pandas-specific and fails on polars.

Dispatch on the raw fetchall() result type before converting to dicts:
- pandas DataFrame: to_dict(orient='records')
- polars DataFrame: to_dicts()
- Arrow Table: to_pydict() with column-to-row transposition
- numpy ndarray: tolist() with cursor.description column names
- tuples/dicts: existing logic preserved

Centralize fix_names camelCase conversion as a single post-processing
step applied uniformly to all result types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
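
The dispatch described in this commit might look roughly like the sketch below. It uses duck typing on method names as an assumption, so that the example runs without pandas, polars, pyarrow, or numpy installed; the actual `_iquery` code may dispatch differently. The `to_camel` helper is likewise a simplified stand-in for the `fix_names` conversion.

```python
from typing import Any, Dict, List, Sequence


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase (simplified)."""
    head, *rest = name.split("_")
    return head + "".join(p[:1].upper() + p[1:] for p in rest)


def rows_as_dicts(
    result: Any, columns: Sequence[str], fix_names: bool = False
) -> List[Dict[str, Any]]:
    """Normalize a fetchall() result into a list of row dicts."""
    if hasattr(result, "to_dicts"):  # polars-style DataFrame
        rows = result.to_dicts()
    elif hasattr(result, "to_pydict"):  # Arrow-style Table, column-major
        cols = result.to_pydict()
        names = list(cols)
        rows = [dict(zip(names, vals)) for vals in zip(*cols.values())]
    elif hasattr(result, "to_dict"):  # pandas-style DataFrame
        rows = result.to_dict(orient="records")
    elif hasattr(result, "tolist"):  # numpy-style ndarray
        rows = [dict(zip(columns, r)) for r in result.tolist()]
    else:  # sequence of tuples or dicts
        rows = [
            dict(r) if isinstance(r, dict) else dict(zip(columns, r))
            for r in result
        ]
    if fix_names:  # single post-processing step for every result type
        rows = [{to_camel(k): v for k, v in r.items()} for r in rows]
    return rows
```

The `to_dicts` check comes first because a polars DataFrame also exposes `to_dict`, which would otherwise match the pandas branch.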
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

# Partial message already consumed — must finish it.
# Remove timeout to avoid protocol desync.
sock.settimeout(None)
continue

Socket timeout permanently lost after partial read recovery

Medium Severity

_recv_exact_py calls sock.settimeout(None) when recovering from a partial-read timeout, but the caller in _handle_udf_loop never restores the 0.1s timeout afterward. Once this rare path executes, all subsequent recv_into calls block indefinitely, meaning the shutdown_event.is_set() check in the while loop is never reached. This prevents graceful shutdown of the Python-fallback worker. The C accel path avoids this by using internal poll() that doesn't mutate the socket state.
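
One possible way to address this, sketched under the assumption that the fix belongs inside the receive helper itself, is to save the socket's timeout on entry and restore it in a `finally` block. The function name is hypothetical and this is not the change adopted in the PR.

```python
import socket


def recv_exact_restoring_timeout(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; always restore the caller's timeout."""
    original = sock.gettimeout()
    buf = bytearray(n)
    view = memoryview(buf)
    pos = 0
    try:
        while pos < n:
            try:
                got = sock.recv_into(view[pos:])
            except socket.timeout:
                if pos == 0:
                    raise  # idle timeout: let the caller check for shutdown
                # Partial message already consumed: finish it without a
                # timeout to avoid protocol desync.
                sock.settimeout(None)
                continue
            if got == 0:
                raise ConnectionError("peer closed the connection")
            pos += got
    finally:
        # Put the original timeout back, even after the recovery path.
        sock.settimeout(original)
    return bytes(buf)
```

With the timeout restored on every exit, the caller's loop continues to wake up periodically and can observe a shutdown request.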
