Pull request overview
Updates Ruff tooling versions and adds documentation + a new benchmark script to demonstrate parallel highlighting scaling on free-threaded Python (3.14t).
Changes:
- Bump Ruff versions in `pyproject.toml` and `.pre-commit-config.yaml`.
- Add `benchmarks/benchmark_parallel.py` to compare sequential `highlight()` vs parallel `highlight_many()`.
- Refresh README messaging around free-threading and add instructions + sample output for the new benchmark.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `pyproject.toml` | Updates Ruff dev dependency and reformats Ruff per-file ignores. |
| `benchmarks/benchmark_parallel.py` | New benchmark script for parallel highlighting/free-threading visibility. |
| `README.md` | Updates free-threading bullet + adds benchmark instructions and example table. |
| `.pre-commit-config.yaml` | Bumps ruff-pre-commit hook revision to match the newer Ruff. |
README.md:
- **O(n) guaranteed** — Hand-written state machines, no regex backtracking
- **Zero ReDoS** — No exploitable patterns, safe for untrusted input
- **Thread-safe** — Immutable state, optimized for Python 3.14t free-threading
- **Free-threading native** — All lexer state is local variables, keyword tables are `frozenset`, tokens are immutable. Highlight from any number of threads with zero contention.
The README claim of “zero contention” is stronger than what the implementation guarantees. Even if lexer state is local/immutable, the library still uses shared machinery (e.g., registry caching via functools.cache, ThreadPoolExecutor internals), so there can be contention outside the lexer state. Consider rephrasing to something precise like “no shared mutable lexer state” / “safe to call concurrently” rather than “zero contention.”
Suggested change:
Current:
- **Free-threading native** — All lexer state is local variables, keyword tables are `frozenset`, tokens are immutable. Highlight from any number of threads with zero contention.
Proposed:
- **Free-threading native** — All lexer state is local variables, keyword tables are `frozenset`, tokens are immutable. Safe to call from many threads concurrently with no shared mutable lexer state.
benchmarks/benchmark_parallel.py (module docstring):
    Demonstrates thread scaling when highlighting many code blocks in parallel
    under Python 3.14t free-threading. Uses highlight_many() for parallel path
    and sequential highlight() loop for baseline. Stdlib only.
The module docstring says “Stdlib only”, but this script imports project modules (rosettes, benchmarks.sample_code). If the intent is “no third-party deps”, consider updating the wording to avoid being misleading.
Suggested change:
Current:
    and sequential highlight() loop for baseline. Stdlib only.
Proposed:
    and sequential highlight() loop for baseline. No third-party dependencies.
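The comparison the docstring describes (a sequential loop as baseline against a thread-pool path) can be sketched roughly as follows. This is only an illustration: `fake_highlight`, `run_sequential`, `run_parallel`, and `timed` are hypothetical stand-ins, and the sketch does not call the real `highlight()` or `highlight_many()`, whose signatures are not shown in this review.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_highlight(code: str) -> str:
    """Hypothetical stand-in for a highlighter: a pure, CPU-bound function."""
    return "".join(ch.upper() if ch.isalpha() else ch for ch in code)


def run_sequential(blocks: list[str]) -> list[str]:
    """Baseline: process each block one after another."""
    return [fake_highlight(block) for block in blocks]


def run_parallel(blocks: list[str], max_workers: int = 4) -> list[str]:
    """Parallel path: map the same function over a thread pool.

    Executor.map preserves input order, so results line up with the baseline.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_highlight, blocks))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    blocks = ["def f(x): return x + 1\n" * 50] * 32
    seq, t_seq = timed(run_sequential, blocks)
    par, t_par = timed(run_parallel, blocks)
    assert seq == par  # both paths must produce identical output
    print(f"sequential: {t_seq:.4f}s  parallel: {t_par:.4f}s")
```

On a build with the GIL enabled, the parallel path is not expected to be faster for CPU-bound work like this; any speedup should only appear on a free-threaded interpreter.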
benchmarks/benchmark_parallel.py:
    def _load_sample_code() -> None:
        """Add repo root to path and load sample_code (for script execution)."""
_load_sample_code()’s docstring says it “load[s] sample_code”, but the function only prepends the repo root to sys.path. Consider either importing benchmarks.sample_code inside the function (so it actually “loads” it) or updating the docstring to describe what it really does.
Suggested change:
Current:
        """Add repo root to path and load sample_code (for script execution)."""
Proposed:
        """Ensure repo root is on sys.path so benchmarks.sample_code can be imported."""
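A minimal sketch of a function that matches the proposed docstring, assuming the script sits one directory below the repo root (in `benchmarks/`). The name `ensure_repo_root_on_path` and its `script_path` parameter are hypothetical, chosen for illustration; the actual body of `_load_sample_code()` is not shown in this review.

```python
import sys
from pathlib import Path


def ensure_repo_root_on_path(script_path: Path) -> Path:
    """Ensure the repo root is on sys.path so sibling packages can be imported.

    Assumes the script lives at <repo_root>/benchmarks/<script>.py.
    Only modifies sys.path; it does not import anything itself.
    """
    repo_root = script_path.resolve().parent.parent
    root_str = str(repo_root)
    if root_str not in sys.path:
        # Prepend so the checkout wins over any installed copy.
        sys.path.insert(0, root_str)
    return repo_root
```

In a script this would typically be called as `ensure_repo_root_on_path(Path(__file__))` before the `benchmarks.sample_code` import. Keeping the import outside the helper is what makes the "ensure ... can be imported" wording accurate.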
benchmarks/benchmark_parallel.py:
    # GIL detection
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    if gil_enabled:
        print("Free-threaded build: No (GIL enabled)")
        print("\nRun with Python 3.14t (free-threading) to see parallel scaling.")
        print("Example: python3.14t benchmarks/benchmark_parallel.py")
This benchmark duplicates free-threading/GIL detection via sys._is_gil_enabled. Since the project already has rosettes._parallel.is_free_threaded() (with fallback behavior), consider reusing that helper to keep detection logic consistent and avoid depending on a CPython private attribute in two places.
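As an illustration of keeping the detection in one place, a single helper could wrap the private attribute and be imported wherever the check is needed. The sketch below is a hypothetical standalone version, not the actual implementation of `rosettes._parallel.is_free_threaded()`, whose fallback behavior is not shown in this review. It assumes that interpreters lacking `sys._is_gil_enabled` (added in CPython 3.13) should be treated as having the GIL enabled.

```python
import sys


def is_free_threaded() -> bool:
    """Return True when the running interpreter has the GIL disabled.

    Hypothetical sketch: falls back to False (GIL assumed enabled) when the
    private sys._is_gil_enabled attribute is unavailable.
    """
    gil_check = getattr(sys, "_is_gil_enabled", None)
    if gil_check is None:
        return False
    return not gil_check()


if __name__ == "__main__":
    if is_free_threaded():
        print("Free-threaded build: Yes (GIL disabled)")
    else:
        print("Free-threaded build: No (GIL enabled)")
```

With a helper like this, the benchmark would branch on `is_free_threaded()` instead of calling `getattr(sys, "_is_gil_enabled", ...)` directly, so the private-attribute dependency lives in one module.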