Skip to content

Move parser execution to a Web Worker with timeout handling#69

Draft
omer-biz wants to merge 4 commits intopest-parser:masterfrom
omer-biz:master
Draft

Move parser execution to a Web Worker with timeout handling#69
omer-biz wants to merge 4 commits intopest-parser:masterfrom
omer-biz:master

Conversation

@omer-biz
Copy link
Contributor

@omer-biz omer-biz commented Mar 9, 2026

This PR resolves the issue #66, recursive grammar would halt the entire site. The solution, as suggested in the issue, was to move the parsing logic to a web worker and attaching a timeout to it.

Changes:

  • src/lib.rs:

    • Substituted direct parse_input() calls with a custom trigger_worker event dispatch.
    • Exposed a new pub fn parse(rule: String, input: String) -> String to be explicitly called by the worker, completely avoiding unnecessary DOM manipulation.
  • static/scripts/parser_worker.ts:

    • Implemented a lean Web Worker that loads the Wasm module, caches grammar compilations via lint, and responds to main thread parse requests.
  • static/scripts/editor.ts:

    • Wired up parserWorker spawning and termination. Added a parseId mechanism to ensure responses map correctly to the active input session.
    • Implemented the window.setTimeout(..., 1000) wrapper to manage worker termination and timeout UI updates.
  • Cargo.toml:

    • Added the CustomEvent feature to web-sys dependencies to support the new event-driven refactor.

Summary by CodeRabbit

  • New Features
    • Parsing now runs in a background web worker to keep the UI responsive.
    • Added a 1000ms parser timeout with automatic worker restart on hung parses.
    • Implemented grammar caching to avoid unnecessary recompilation and speed repeated checks.
    • Parsing can now be triggered via dispatched browser events, enabling decoupled integrations.

@coderabbitai
Copy link

coderabbitai bot commented Mar 9, 2026

📝 Walkthrough

Walkthrough

Refactors parsing to an event-driven flow: adds CustomEvent feature in Cargo.toml, exports a new wasm parse(rule, input) -> String, introduces a dedicated parser web worker with timeout/restart logic, and dispatches parsing via a trigger_worker CustomEvent instead of direct function calls.

Changes

Cohort / File(s) Summary
Feature Enablement
Cargo.toml
Added CustomEvent to the features array for the web-sys dependency.
WASM API & Control Flow
src/lib.rs
Replaced inline parse invocations with event-based triggering; added #[wasm_bindgen] pub fn parse(rule: String, input: String) -> String; removed direct DOM textarea usage and call sites now dispatch trigger_worker.
Main Thread Orchestration
static/scripts/editor.ts
New worker lifecycle (spawnWorker()), trigger_worker event listener that posts {id, grammar, rule, input} to worker, parse-id tracking, 1000ms parse timeout and worker respawn on timeout, updates output DOM on responses.
Worker Implementation
static/scripts/parser_worker.ts
New worker script that lazy-loads the WASM module (pest_site.js), caches last grammar for conditional linting, handles messages, calls exported parse(), and posts success/error responses back to main thread.

Sequence Diagram

sequenceDiagram
    actor User
    participant Main as Main Thread (editor.ts)
    participant Window as Browser Window
    participant Worker as Parser Worker
    participant WASM as WASM Module (lib.rs)

    User->>Main: trigger compile/input
    Main->>Window: dispatchEvent(trigger_worker)
    Window->>Main: trigger_worker listener runs
    Main->>Main: validate state, increment parseId, set 1000ms timeout
    Main->>Worker: postMessage({id, grammar, rule, input})
    Worker->>Worker: ensure WASM loaded, lint grammar if changed
    Worker->>WASM: call parse(rule, input)
    alt Parse Success
        WASM-->>Worker: Ok(result)
        Worker-->>Main: postMessage({id, result})
        Main->>Main: clear timeout, update output DOM
    else Parse Error
        WASM-->>Worker: Err(error)
        Worker-->>Main: postMessage({id, error})
        Main->>Main: clear timeout, update output DOM
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 a hop and a sniff
I sent a tiny event on the breeze,
A worker hummed softly, parsing with ease,
Timeouts kept watch while grammars danced,
Results hopped back — oh what a pranced! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: moving parser execution to a Web Worker with timeout handling, which is the core objective of the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib.rs (1)

14-15: ⚠️ Potential issue | 🔴 Critical

Remove the unused HtmlTextAreaElement import.

HtmlTextAreaElement is imported on line 15 but never used in the codebase. This will cause cargo clippy -- -D warnings to fail.

Proposed fix
-use web_sys::{HtmlOptionElement, HtmlSelectElement, HtmlTextAreaElement};
+use web_sys::{HtmlOptionElement, HtmlSelectElement};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib.rs` around lines 14 - 15, Remove the unused import
HtmlTextAreaElement from the web_sys import list in src/lib.rs: locate the use
statement that currently reads use web_sys::{HtmlOptionElement,
HtmlSelectElement, HtmlTextAreaElement}; and delete HtmlTextAreaElement from
that brace-enclosed list so only the actually used symbols (HtmlOptionElement,
HtmlSelectElement) remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@static/scripts/editor.ts`:
- Around line 23-37: The handler parserWorker.onmessage currently clears
parseTimeout for every message which lets stale responses cancel the active
timeout; update the logic to first destructure const { id, type, result, error }
= e.data and check if id === parseId before touching parseTimeout or updating
outputDom, so only the matching (non-stale) response clears and nulls
parseTimeout and writes to outputDom (handle type === "success" vs error as
before); this ensures stale worker messages do not cancel the timeout for the
current parse.

In `@static/scripts/parser_worker.ts`:
- Around line 10-12: The onmessage handler currently drops messages when loaded
is false; instead add a pending buffer (e.g., pendingMessage or messageQueue)
and update onmessage to store the incoming event (replace with the latest or
push to a small queue) when loaded is false; after init() resolves, drain that
buffer by invoking the same message-processing function used for live messages
(reference onmessage and init()) so the first request is not lost and the
caller's timeout can proceed; also ensure you clear pendingMessage on processing
and preserve the original event data shape when calling the processor.
- Around line 17-18: The worker currently calls lint(), which triggers
compile_grammar() and add_rules_to_select() that use element() and
window.document (DOM) and thus will throw in a Web Worker; extract the
VM-compilation portions of lint/compile_grammar (the code that compiles the
grammar to the VM / wasm-friendly representation) into a new DOM-free wasm
export (e.g., compile_grammar_vm or compileGrammarForVM) that does not call
add_rules_to_select or element(), update the worker to call this new export
instead of lint(), and keep any UI-only functions (add_rules_to_select, DOM
mutations) in the main thread code that still uses lint() where appropriate;
ensure references to lint(), compile_grammar(), add_rules_to_select(),
element(), and parse() are updated so the worker only invokes the new DOM-free
compile function and then parse().
- Around line 24-25: Replace the unsafe catch typing and toString usage in the
try/catch that posts errors (the catch block that currently reads "catch (err:
any)" and calls postMessage({ id, type: "error", error: err.toString() })).
Change the catch to "catch (err: unknown)" and normalize the thrown value before
posting: if err is an instance of Error, extract err.message (and optionally
err.name/stack), otherwise convert the value with String(err) to produce a safe
error string; then call postMessage({ id, type: "error", error:
normalizedMessage }). This preserves type-safety while ensuring non-Error throws
are correctly handled.

---

Outside diff comments:
In `@src/lib.rs`:
- Around line 14-15: Remove the unused import HtmlTextAreaElement from the
web_sys import list in src/lib.rs: locate the use statement that currently reads
use web_sys::{HtmlOptionElement, HtmlSelectElement, HtmlTextAreaElement}; and
delete HtmlTextAreaElement from that brace-enclosed list so only the actually
used symbols (HtmlOptionElement, HtmlSelectElement) remain.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0ff075e1-09c4-4bf8-8bb8-fdc5b83a95a4

📥 Commits

Reviewing files that changed from the base of the PR and between d11a037 and 29a5dff.

📒 Files selected for processing (4)
  • Cargo.toml
  • src/lib.rs
  • static/scripts/editor.ts
  • static/scripts/parser_worker.ts

Comment on lines +23 to +37
parserWorker.onmessage = (e) => {
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}

const { id, type, result, error } = e.data;
if (id === parseId) {
if (type === "success") {
outputDom.value = result;
} else {
outputDom.value = error;
}
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Ignore stale worker responses before clearing the active timeout.

Lines 24-27 clear parseTimeout for every worker message, even when id !== parseId. If parse N-1 responds after parse N starts, the stale message cancels parse N's timeout and the current request can hang indefinitely.

Proposed fix
   parserWorker.onmessage = (e) => {
-    if (parseTimeout !== null) {
-      clearTimeout(parseTimeout);
-      parseTimeout = null;
-    }
-
     const { id, type, result, error } = e.data;
-    if (id === parseId) {
-      if (type === "success") {
-        outputDom.value = result;
-      } else {
-        outputDom.value = error;
-      }
+    if (id !== parseId) {
+      return;
+    }
+
+    if (parseTimeout !== null) {
+      clearTimeout(parseTimeout);
+      parseTimeout = null;
+    }
+
+    if (type === "success") {
+      outputDom.value = result;
+    } else {
+      outputDom.value = error;
     }
   };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
parserWorker.onmessage = (e) => {
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}
const { id, type, result, error } = e.data;
if (id === parseId) {
if (type === "success") {
outputDom.value = result;
} else {
outputDom.value = error;
}
}
};
parserWorker.onmessage = (e) => {
const { id, type, result, error } = e.data;
if (id !== parseId) {
return;
}
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}
if (type === "success") {
outputDom.value = result;
} else {
outputDom.value = error;
}
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/editor.ts` around lines 23 - 37, The handler
parserWorker.onmessage currently clears parseTimeout for every message which
lets stale responses cancel the active timeout; update the logic to first
destructure const { id, type, result, error } = e.data and check if id ===
parseId before touching parseTimeout or updating outputDom, so only the matching
(non-stale) response clears and nulls parseTimeout and writes to outputDom
(handle type === "success" vs error as before); this ensures stale worker
messages do not cancel the timeout for the current parse.

Comment on lines +10 to +12
onmessage = (e) => {
if (!loaded) return;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Queue startup messages instead of dropping them.

Line 11 silently discards requests that arrive before init() resolves. The caller still starts its timeout, so the first parse can degrade into a spurious "Parser timeout" with no retry. Buffer the latest request or add a ready handshake before accepting work.

🧰 Tools
🪛 Biome (2.4.4)

[error] 10-10: A global variable should not be reassigned.

(lint/suspicious/noGlobalAssign)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` around lines 10 - 12, The onmessage handler
currently drops messages when loaded is false; instead add a pending buffer
(e.g., pendingMessage or messageQueue) and update onmessage to store the
incoming event (replace with the latest or push to a small queue) when loaded is
false; after init() resolves, drain that buffer by invoking the same
message-processing function used for live messages (reference onmessage and
init()) so the first request is not lost and the caller's timeout can proceed;
also ensure you clear pendingMessage on processing and preserve the original
event data shape when calling the processor.

Comment on lines +17 to +18
if (grammar !== lastGrammar) {
lint(grammar);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, locate and read parser_worker.ts to see lines 17-18
fd parser_worker.ts --type f

Repository: pest-parser/site

Length of output: 91


🏁 Script executed:

# Find lib.rs and examine the functions mentioned
fd lib.rs --type f

Repository: pest-parser/site

Length of output: 70


🏁 Script executed:

# Search for the lint function definition
rg "fn lint|export.*lint" --type ts --type rust -A 5

Repository: pest-parser/site

Length of output: 337


🏁 Script executed:

# Search for compile_grammar function
rg "fn compile_grammar|compile_grammar\s*\(" --type rust -A 10

Repository: pest-parser/site

Length of output: 1052


🏁 Script executed:

# Search for add_rules_to_select function
rg "fn add_rules_to_select|add_rules_to_select\s*\(" --type rust -A 15

Repository: pest-parser/site

Length of output: 2379


🏁 Script executed:

# Read parser_worker.ts to see the context around lines 17-18
cat -n static/scripts/parser_worker.ts

Repository: pest-parser/site

Length of output: 912


🏁 Script executed:

# Check what element() and document() do in lib.rs
rg "fn element|fn document" src/lib.rs -A 5

Repository: pest-parser/site

Length of output: 388


Don't call lint() from the worker.

lint() routes through compile_grammar() and add_rules_to_select() in src/lib.rs, which accesses window.document via the element() helper. Web Workers have no Window object, so this will throw before parse() runs, preventing successful grammar compilation in the worker. Extract the VM-compilation logic into a DOM-free wasm export and call that instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` around lines 17 - 18, The worker currently
calls lint(), which triggers compile_grammar() and add_rules_to_select() that
use element() and window.document (DOM) and thus will throw in a Web Worker;
extract the VM-compilation portions of lint/compile_grammar (the code that
compiles the grammar to the VM / wasm-friendly representation) into a new
DOM-free wasm export (e.g., compile_grammar_vm or compileGrammarForVM) that does
not call add_rules_to_select or element(), update the worker to call this new
export instead of lint(), and keep any UI-only functions (add_rules_to_select,
DOM mutations) in the main thread code that still uses lint() where appropriate;
ensure references to lint(), compile_grammar(), add_rules_to_select(),
element(), and parse() are updated so the worker only invokes the new DOM-free
compile function and then parse().

Comment on lines +24 to +25
} catch (err: any) {
postMessage({ id, type: "error", error: err.toString() });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and read the relevant lines
if [ -f "static/scripts/parser_worker.ts" ]; then
  echo "=== File content at lines 20-30 ==="
  sed -n '20,30p' static/scripts/parser_worker.ts
  echo ""
  echo "=== Full file info ==="
  wc -l static/scripts/parser_worker.ts
else
  echo "File not found at static/scripts/parser_worker.ts"
  # Try to find it
  find . -name "parser_worker.ts" -type f
fi

Repository: pest-parser/site

Length of output: 362


🏁 Script executed:

# Check for ESLint configuration files
echo "=== Looking for ESLint config ==="
find . -maxdepth 2 -name ".eslintrc*" -o -name "eslint.config.*" | head -20

Repository: pest-parser/site

Length of output: 109


🏁 Script executed:

# Search for error handling patterns with 'any' in catch blocks
echo "=== Checking catch blocks with 'any' ==="
rg "catch\s*\(\s*err:\s*any\s*\)" -A 2

Repository: pest-parser/site

Length of output: 294


🏁 Script executed:

# Read the entire file to understand the context
echo "=== Full file content ==="
cat -n static/scripts/parser_worker.ts

Repository: pest-parser/site

Length of output: 938


🏁 Script executed:

# Check the ESLint configuration
echo "=== ESLint config ==="
cat .eslintrc.cjs

Repository: pest-parser/site

Length of output: 289


🏁 Script executed:

# Check if there are any TypeScript configuration rules that might relate to 'any'
echo "=== TypeScript config (if exists) ==="
cat tsconfig.json 2>/dev/null || echo "No tsconfig.json found"

Repository: pest-parser/site

Length of output: 120


Use unknown in the catch path and properly handle error values.

The @typescript-eslint/no-explicit-any rule from your recommended ESLint configuration flags this catch clause. Switching to unknown type maintains type safety while allowing proper error normalization:

Proposed fix
-    } catch (err: any) {
-        postMessage({ id, type: "error", error: err.toString() });
+    } catch (err: unknown) {
+        postMessage({
+            id,
+            type: "error",
+            error: err instanceof Error ? err.message : String(err),
+        });
     }

This approach safely handles both Error instances (extracting the message property) and non-Error thrown values (converting to string), avoiding the unsafe assumption that err has a toString() method.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} catch (err: any) {
postMessage({ id, type: "error", error: err.toString() });
} catch (err: unknown) {
postMessage({
id,
type: "error",
error: err instanceof Error ? err.message : String(err),
});
}
🧰 Tools
🪛 ESLint

[error] 24-24: Unexpected any. Specify a different type.

(@typescript-eslint/no-explicit-any)

🪛 GitHub Actions: publish

[error] 24-24: pnpm run lint failed. ESLint error: Unexpected any. Specify a different type. @typescript-eslint/no-explicit-any (in static/scripts/parser_worker.ts:24:19).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` around lines 24 - 25, Replace the unsafe
catch typing and toString usage in the try/catch that posts errors (the catch
block that currently reads "catch (err: any)" and calls postMessage({ id, type:
"error", error: err.toString() })). Change the catch to "catch (err: unknown)"
and normalize the thrown value before posting: if err is an instance of Error,
extract err.message (and optionally err.name/stack), otherwise convert the value
with String(err) to produce a safe error string; then call postMessage({ id,
type: "error", error: normalizedMessage }). This preserves type-safety while
ensuring non-Error throws are correctly handled.

@omer-biz omer-biz marked this pull request as draft March 9, 2026 19:52
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
static/scripts/parser_worker.ts (2)

6-12: ⚠️ Potential issue | 🟠 Major

Don't drop the first parse request during worker startup.

If the main thread posts before init() resolves, Line 11 returns immediately while static/scripts/editor.ts has already started its 1-second timeout. That turns cold-start parses into false "Parser timeout" failures with no worker-side retry. Buffer the latest request and replay it once loading completes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` around lines 6 - 12, The current onmessage
handler drops the first parse request if it arrives before init() resolves
because it returns when loaded is false; add a single-request buffer (e.g.,
pendingRequest) to capture and overwrite the latest incoming event when loaded
is false, then after init() completes (in the init().then(...) continuation
where loaded = true) check if pendingRequest exists, replay it into the same
message-processing path used by onmessage (e.g., call the same handler or
dispatch logic used for incoming events), and clear pendingRequest; update
onmessage to store e (or e.data) into pendingRequest and return when !loaded so
the worker does not lose the first parse request and will process the latest
buffered request once ready.

17-19: ⚠️ Potential issue | 🔴 Critical

Don't route worker compilation through lint().

lint() still goes through compile_grammar() in src/lib.rs, and that path calls add_rules_to_select(), which resolves DOM nodes via window.document. In a Web Worker this throws before the worker-side VM is rebuilt, so parses for a changed grammar fail. Split out a DOM-free wasm compile entry point for the worker and keep lint() on the window side.

#!/bin/bash
set -euo pipefail

printf '\n== worker calls lint ==\n'
rg -nC2 '\blint\s*\(' static/scripts/parser_worker.ts

printf '\n== lint / compile path in Rust ==\n'
rg -nC3 'pub fn lint|fn compile_grammar|fn add_rules_to_select' src/lib.rs

printf '\n== DOM access reached by that path ==\n'
rg -nC2 'query_selector|create_element|document\(' src/lib.rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` around lines 17 - 19, The worker is calling
lint(grammar), which on the Rust side routes through compile_grammar() ->
add_rules_to_select() and thus touches window.document, causing failures in a
Web Worker; replace the worker call to lint(grammar) in parser_worker.ts with a
call to a new DOM-free WASM compile entrypoint (e.g., compile_grammar_no_dom or
compile_grammar_for_worker) that you must add in Rust and export from the wasm
module; implement that new Rust function to perform the grammar compilation
without calling add_rules_to_select() (or any
document/query_selector/create_element paths) and ensure the worker rebuilds its
VM using this DOM-free function while leaving lint() unchanged for the main
window.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@static/scripts/parser_worker.ts`:
- Line 10: Replace the direct global assignment to onmessage with the Worker
event API: locate the onmessage = (e) => { ... } assignment in parser_worker.ts
and change it to self.addEventListener("message", (e) => { ... }); this avoids
the global reassignment lint (noGlobalAssign), allows multiple handlers, and
preserves the existing message handling logic and event parameter usage.

---

Duplicate comments:
In `@static/scripts/parser_worker.ts`:
- Around line 6-12: The current onmessage handler drops the first parse request
if it arrives before init() resolves because it returns when loaded is false;
add a single-request buffer (e.g., pendingRequest) to capture and overwrite the
latest incoming event when loaded is false, then after init() completes (in the
init().then(...) continuation where loaded = true) check if pendingRequest
exists, replay it into the same message-processing path used by onmessage (e.g.,
call the same handler or dispatch logic used for incoming events), and clear
pendingRequest; update onmessage to store e (or e.data) into pendingRequest and
return when !loaded so the worker does not lose the first parse request and will
process the latest buffered request once ready.
- Around line 17-19: The worker is calling lint(grammar), which on the Rust side
routes through compile_grammar() -> add_rules_to_select() and thus touches
window.document, causing failures in a Web Worker; replace the worker call to
lint(grammar) in parser_worker.ts with a call to a new DOM-free WASM compile
entrypoint (e.g., compile_grammar_no_dom or compile_grammar_for_worker) that you
must add in Rust and export from the wasm module; implement that new Rust
function to perform the grammar compilation without calling
add_rules_to_select() (or any document/query_selector/create_element paths) and
ensure the worker rebuilds its VM using this DOM-free function while leaving
lint() unchanged for the main window.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 628189fd-d808-47b6-95c5-88bbdc14a20c

📥 Commits

Reviewing files that changed from the base of the PR and between 29a5dff and 07cb9cd.

📒 Files selected for processing (2)
  • src/lib.rs
  • static/scripts/parser_worker.ts

loaded = true;
});

onmessage = (e) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find the file and check if it exists
find . -name "parser_worker.ts" -o -name "parser_worker.js"

Repository: pest-parser/site

Length of output: 93


🏁 Script executed:

# Check the directory structure around the file
git ls-files | grep -E "(parser_worker|worker)" | head -20

Repository: pest-parser/site

Length of output: 91


🏁 Script executed:

# Read the file to see the current implementation
cat -n static/scripts/parser_worker.ts

Repository: pest-parser/site

Length of output: 859


🏁 Script executed:

# Check for Biome configuration
find . -maxdepth 2 -name "biome.*" -o -name ".biomerc*" -o -name "biome.json*"

Repository: pest-parser/site

Length of output: 72


🏁 Script executed:

# Check if there's any lint configuration in package.json or similar
cat package.json | head -50

Repository: pest-parser/site

Length of output: 894


🏁 Script executed:

# Search for any references to noGlobalAssign or biome lint config
rg -i "noGlobalAssign|biome" --max-count 10

Repository: pest-parser/site

Length of output: 132


🏁 Script executed:

# Read the biome.json configuration
cat biome.json

Repository: pest-parser/site

Length of output: 1834


🏁 Script executed:

# Check if there's an ESLint configuration file
find . -maxdepth 2 -name ".eslintrc*" -o -name "eslint.config.*"

Repository: pest-parser/site

Length of output: 75


🏁 Script executed:

# Search for noGlobalAssign references in the codebase
rg "noGlobalAssign" .

Repository: pest-parser/site

Length of output: 42


🏁 Script executed:

# Read the ESLint configuration
cat .eslintrc.cjs

Repository: pest-parser/site

Length of output: 267


🌐 Web query:

Biome lint/suspicious/noGlobalAssign rule onmessage worker

💡 Result:

Biome’s lint/suspicious/noGlobalAssign flags onmessage = … in a worker because it treats onmessage as a global and disallows reassigning globals. [1]

Preferred fix (avoid global assignment)

Use an event listener instead of assigning to onmessage:

self.addEventListener("message", (event) => {
  // ...
});

If you really want onmessage = …

Suppress just that line:

// biome-ignore lint/suspicious/noGlobalAssign: setting worker handler
onmessage = (event) => {};

Biome supports targeted inline / file suppressions via biome-ignore… comments. [2]

Or disable the rule for worker files via config override

Use overrides to change linter rules for matching globs. [3]

{
  "overrides": [
    {
      "includes": ["**/*.worker.{js,ts}", "**/workers/**"],
      "linter": {
        "rules": {
          "suspicious": {
            "noGlobalAssign": "off"
          }
        }
      }
    }
  ]
}

Sources: Biome rule docs for noGlobalAssign [1], Biome suppression syntax [2], Biome config overrides / per-glob linter configuration [3].


🌐 Web query:

Web Worker API addEventListener message vs onmessage assignment

💡 Result:

In Web Workers, both patterns listen for the same "message" events; the difference is how many handlers you can have and how they’re managed:

  • onmessage = handler (event handler property)

    • Only one handler “slot”. Assigning again overwrites the previous one.
    • Setting onmessage = null removes it. (developer.mozilla.org)
  • addEventListener("message", handler, options) (event listener list)

    • You can register multiple independent handlers for "message" (and remove specific ones with removeEventListener).
    • Supports listener options like signal / once (where supported). (developer.mozilla.org)

Both can coexist (handlers registered via addEventListener still run even if you also set onmessage; changing onmessage only replaces the property-based handler). (developer.mozilla.org)

Citations:


Use the Worker event API instead of reassigning onmessage.

The onmessage property assignment triggers Biome's lint/suspicious/noGlobalAssign rule (enabled by default in your biome.json configuration). Using self.addEventListener("message", ...) is the preferred modern pattern for Web Workers, supports multiple independent message handlers, and avoids the global assignment lint violation.

Proposed fix
-onmessage = (e) => {
+self.addEventListener("message", (e) => {
   if (!loaded) return;

   const { id, grammar, rule, input } = e.data;

   try {
     // This has the side effect of compiling the grammar for the wasm worker.
     if (grammar !== lastGrammar) {
       lint(grammar);
       lastGrammar = grammar;
     }

     const result = parse(rule, input);
     postMessage({ id, type: "success", result });
   } catch (err) {
     postMessage({ id, type: "error", error: err.toString() });
   }
-};
+});
🧰 Tools
🪛 Biome (2.4.4)

[error] 10-10: A global variable should not be reassigned.

(lint/suspicious/noGlobalAssign)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@static/scripts/parser_worker.ts` at line 10, Replace the direct global
assignment to onmessage with the Worker event API: locate the onmessage = (e) =>
{ ... } assignment in parser_worker.ts and change it to
self.addEventListener("message", (e) => { ... }); this avoids the global
reassignment lint (noGlobalAssign), allows multiple handlers, and preserves the
existing message handling logic and event parameter usage.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR moves potentially unbounded parser execution off the main thread into a Web Worker and adds timeout-based worker termination to prevent recursive grammars from freezing the UI (issue #66).

Changes:

  • Added a dedicated parser_worker.ts that loads the Wasm module and performs lint + parse on request.
  • Refactored the Wasm-side UI trigger to dispatch a trigger_worker browser event and exposed a worker-callable parse() export.
  • Wired the editor to manage the worker lifecycle, correlate responses via parseId, and enforce a 1000ms timeout.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
static/scripts/parser_worker.ts New worker implementation to run parsing in the background with simple grammar caching.
static/scripts/editor.ts Spawns/respawns the worker, posts parse requests, and adds timeout + response correlation.
src/lib.rs Replaces direct DOM parsing with event dispatch; adds exported parse() for worker use.
Cargo.toml Enables web-sys CustomEvent support for the event-driven refactor.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +24 to +30
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}

const { id, type, result, error } = e.data;
if (id === parseId) {
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parserWorker.onmessage clears parseTimeout before checking the response id. If an older response arrives after a newer parse request was sent, this will cancel the timeout for the in-flight (latest) parse and can leave a hung parse without any timeout protection. Only clear the timeout when id matches the active parseId (or track timeouts per-request).

Suggested change
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}
const { id, type, result, error } = e.data;
if (id === parseId) {
const { id, type, result, error } = e.data;
if (id === parseId) {
if (parseTimeout !== null) {
clearTimeout(parseTimeout);
parseTimeout = null;
}

Copilot uses AI. Check for mistakes.
if (parseId === currentId) {
outputDom.value = "Parser timeout";
spawnWorker();
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the timeout fires, parseTimeout isn’t reset to null. This leaves a stale handle around and later code may attempt to clear an already-fired timeout. Consider setting parseTimeout = null inside the timeout callback after it runs.

Suggested change
}
}
parseTimeout = null;

Copilot uses AI. Check for mistakes.
Comment on lines +6 to +12
init().then(() => {
loaded = true;
});

onmessage = (e) => {
if (!loaded) return;

Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The worker drops any message received before init() completes (if (!loaded) return;). On slower loads this can cause the main thread to time out and restart the worker even though parsing never began. Consider queuing the first request until loaded is true, or sending a "ready" message from the worker and waiting for it before posting parse requests.

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +22
// This has the side effect of compiling the grammar for the wasm worker.
if (grammar !== lastGrammar) {
lint(grammar);
lastGrammar = grammar;
}

const result = parse(rule, input);
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The worker calls lint(grammar) but ignores its returned errors and unconditionally caches lastGrammar = grammar. Since compile_grammar only updates the global VM on success, this can leave the worker using a stale VM from a previous valid grammar (or None) while thinking the current grammar is "cached", producing incorrect parse results or confusing errors. Consider checking the lint result and only updating lastGrammar (and calling parse) when there are zero lint errors; otherwise respond with a clear grammar-error message and skip parsing.

Copilot uses AI. Check for mistakes.
const result = parse(rule, input);
postMessage({ id, type: "success", result });
} catch (err) {
postMessage({ id, type: "error", error: err.toString() });
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err.toString() can produce unhelpful output (e.g., "[object Object]") for non-Error throws. Consider serializing errors as String(err) or err instanceof Error ? err.message : String(err) (optionally include err.stack when available) to improve diagnostics.

Suggested change
postMessage({ id, type: "error", error: err.toString() });
postMessage({ id, type: "error", error: err instanceof Error ? err.message : String(err) });

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants