feat(sync-rules): railroad diagram generation from EBNF grammars #535

Open
Sleepful wants to merge 12 commits into powersync-ja:main from Sleepful:grammar-svgs

Sleepful commented Feb 27, 2026

This PR goes hand in hand with powersync-ja/powersync-docs#372.

Summary

Tool that parses W3C EBNF grammar files and generates railroad diagram documentation (SVG + MDX + HTML).

  • Script at packages/sync-rules/scripts/generate-grammar-docs.ts
  • Local review: flat HTML with inline SVGs and clickable cross-references
  • Docs output: flat single-page MDX per grammar with co-located SVGs — for the Mintlify docs site

What it generates

From two EBNF grammars (bucket-definitions.ebnf and sync-streams-compiler.ebnf):

  • 41 SVG railroad diagrams (15 Sync Rules + 26 Sync Streams)
  • Flat MDX pages: one per grammar with all productions, cross-reference links (Used by / References), operator table, and lexical rules — ready for Mintlify
  • Flat HTML review pages: all diagrams on one page with clickable cross-references and inline SVGs
  • Resolved EBNF files: committed for diffability of inlining changes
  • Operator precedence tables embedded in ScalarExpr sections
  • Lexical rules with token summary tables and per-rule descriptions
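
As a rough illustration of the first step (reading productions out of a W3C EBNF file), here is a minimal sketch. It is not the parser used in the PR: `parseProductions` and the `Production` shape are hypothetical, it only handles single-line `Name ::= body` rules, and it treats every bare identifier outside quotes and character classes as a reference.

```typescript
// Hypothetical sketch, not the PR's parser: single-line rules only.
interface Production {
  name: string;
  body: string;
  refs: string[]; // names referenced in the body, de-duplicated
}

function parseProductions(ebnf: string): Production[] {
  const productions: Production[] = [];
  for (const rawLine of ebnf.split("\n")) {
    const match = /^([A-Za-z_][A-Za-z0-9_]*)\s*::=\s*(.+)$/.exec(rawLine.trim());
    if (!match) continue; // skip blank lines, comments, anything else
    const [, name, body] = match;
    // Drop quoted terminals and [..] character classes so that keywords
    // such as "NOT" are not mistaken for references to other productions.
    const withoutTerminals = body.replace(/"[^"]*"|'[^']*'|\[[^\]]*\]/g, " ");
    const refs: string[] = [];
    const identifier = /[A-Za-z_][A-Za-z0-9_]*/g;
    let m: RegExpExecArray | null;
    while ((m = identifier.exec(withoutTerminals)) !== null) {
      if (refs.indexOf(m[0]) === -1) refs.push(m[0]);
    }
    productions.push({ name, body, refs });
  }
  return productions;
}

// The rule below is the ParameterQuery line shown in this PR's diff.
const sample =
  "ParameterQuery ::= TableValuedParameterQuery | TableParameterQuery | StaticParameterQuery";
console.log(parseProductions(sample).map((p) => p.refs));
```

A real grammar needs multi-line rules and comment handling, which this sketch deliberately leaves out.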

Workflow

From packages/sync-rules/:

# Generate local review output (flat HTML + MDX + resolved EBNF)
pnpm generate:grammar

# Also generate docs output (flat MDX + co-located SVGs for the docs site)
pnpm exec tsx scripts/generate-grammar-docs.ts --outdir /path/to/powersync-docs/sync/grammar --base-url /sync/grammar

# Preview in Mintlify (from the docs repo)
cd /path/to/powersync-docs && npx mintlify dev --port 3333

  • --outdir — where docs MDX + SVG files are written (per-grammar subdirectories: sync-rules/, sync-streams/)
  • --base-url — URL prefix for absolute <img> src paths in MDX (must match the docs site structure)
  • Local review output always goes to packages/sync-rules/grammar/docs/ regardless of --outdir
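
To make the role of --base-url concrete, the sketch below builds the absolute `<img>` path for one diagram. The helper `imgTag` and the `<grammar>/<Production>.svg` naming scheme are assumptions for illustration; the PR text only states that the src paths are absolute and prefixed with the base URL.

```typescript
// Hypothetical helper: the file naming scheme is an assumption, not taken from the script.
function imgTag(baseUrl: string, grammarDir: string, production: string): string {
  const prefix = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash on --base-url
  const src = `${prefix}/${grammarDir}/${production}.svg`;
  return `<img src="${src}" alt="${production} railroad diagram" />`;
}

console.log(imgTag("/sync/grammar", "sync-streams", "ScalarExpr"));
// prints: <img src="/sync/grammar/sync-streams/ScalarExpr.svg" alt="ScalarExpr railroad diagram" />
```

If the prefix does not match where the docs site serves the files, the images would presumably fail to load, which is why the flag has to follow the docs site structure.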

Key design decisions

  • Inlining is controlled by a fixed config (GRAMMARS array), not heuristics
  • SVGs are static files co-located with MDX (not inline SVG — Mintlify MDX parser strips <style> and <text> elements)
  • Three coverage guards ensure every grammar term is accounted for
  • Each production section includes Used by (parent terms) and References (child terms) cross-links
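
The cross-links and coverage guards can be pictured with a small sketch. Everything here is hypothetical: `RefMap`, `usedBy`, and `assertAllRefsCovered` are illustrative names, and the guard is only written in the spirit of the coverage checks described in this PR, not copied from the script.

```typescript
// Hypothetical sketch: production name -> names it references.
type RefMap = Record<string, string[]>;

// "Used by" is the inverse of "References".
function usedBy(references: RefMap): RefMap {
  const parents: RefMap = {};
  for (const name of Object.keys(references)) parents[name] = [];
  for (const parent of Object.keys(references)) {
    for (const child of references[parent] ?? []) {
      const list = parents[child] ?? (parents[child] = []);
      if (list.indexOf(parent) === -1) list.push(parent);
    }
  }
  return parents;
}

// Fail loudly if a referenced term has no diagram and is not explicitly
// exempt (for example lexical rules, which are documented as text).
function assertAllRefsCovered(references: RefMap, diagrammed: string[], exempt: string[]): void {
  const missing: string[] = [];
  for (const parent of Object.keys(references)) {
    for (const child of references[parent] ?? []) {
      const covered = diagrammed.indexOf(child) !== -1 || exempt.indexOf(child) !== -1;
      if (!covered && missing.indexOf(child) === -1) missing.push(child);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Terms referenced but not diagrammed: ${missing.join(", ")}`);
  }
}
```

Throwing instead of warning means a grammar change that introduces an undocumented term stops the generation run rather than silently producing incomplete docs.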

Docs repo PR

The generated MDX pages are committed in the docs repo: powersync-ja/powersync-docs#372

Files changed

  • packages/sync-rules/scripts/generate-grammar-docs.ts — the generation script (~1400 lines)
  • packages/sync-rules/grammar/README.md — explains the grammar directory
  • packages/sync-rules/grammar/docs/*.resolved.ebnf — committed resolved grammars
  • packages/sync-rules/package.json — added scripts + devDependencies
  • packages/sync-rules/.gitignore — gitignore generated output except resolved EBNF

changeset-bot bot commented Feb 27, 2026

⚠️ No Changeset found

Latest commit: c61c08d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Sleepful force-pushed the grammar-svgs branch 3 times, most recently from 8e9a4a6 to c8b0324 (March 4, 2026 10:24)
Sleepful marked this pull request as ready for review March 5, 2026 04:46
Sleepful added 12 commits March 4, 2026 22:52
Script at packages/sync-rules/scripts/generate-grammar-docs.ts parses W3C
EBNF grammar files and generates railroad diagram documentation:
- SVG diagrams with configurable production inlining
- Flat HTML review page with inline SVGs and clickable cross-references
- Flat MDX page for docs site consumption
- Resolved EBNF files showing post-inlining grammar (committed for diffability)
- descriptions.yaml extracted from EBNF comments

Run: pnpm --filter='./packages/sync-rules' generate:grammar-flat
…cing, and improve README

- Add two coverage guards: assertNoSkippedTerms and assertAllRefsAreDiagrammed
- Add lexical rule diagrams (Identifier, StringLiteral, IntegerLiteral, NumericLiteral) with summary table
- Add HTML review features: inlining notes per production, inlined-only terms table, clickable NonTerminal links
- Fix terminal modifier bug ("NOT"? now correctly renders as optional NOT)
- Rebalance diagrams: decompose FromSource, Predicate, CaseExpr into smaller sections
- Extract CaseCondition from SearchedCaseExpr to reduce width (1821px -> 1325px)
- Inline SVGs in HTML for text selection and link interactivity
- Add xmlns attribute to SVGs for browser rendering
- Update grammar/README.md with inlining config, lexical rules, and coverage checks documentation
- Remove comment preprocessor and descriptions.yaml output (docs written directly in MDX)
- Add stale config check (assertNoStaleInlineRefs) to catch orphaned inline references
- Decompose grammars: promote TableValuedCall, Reference, WhereAtom, MemberSuffix,
  ScalarBinaryOp, CaseExpr variants (WhenCaseExpr, WhenScalarExpr) to own diagrams
- Merge TableRef into Reference, remove Literal wrapper, extract ScalarBinaryOp
- Inline Predicate into WhereAtom for sync-streams only
- Promote ParameterQuery variants to top-level diagrams in bucket-definitions
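
For readers unfamiliar with config-driven inlining, the sketch below shows the general idea together with a stale-reference check. It is an assumption-laden illustration: `Grammar` and `inlineInto` are hypothetical, grammar bodies are plain strings, and the example rule bodies are invented for the demonstration (only the names WhereAtom and Predicate come from the commit message).

```typescript
// Hypothetical sketch: a reference is any whole-word occurrence of the name.
type Grammar = Record<string, string>;

function inlineInto(grammar: Grammar, host: string, inlined: string): string {
  const hostBody = grammar[host];
  const inlinedBody = grammar[inlined];
  if (hostBody === undefined || inlinedBody === undefined) {
    throw new Error(`Unknown production in inline config: ${host} <- ${inlined}`);
  }
  const reference = new RegExp(`\\b${inlined}\\b`, "g");
  // A function replacer keeps "$" in the body from being read as a replacement pattern.
  const result = hostBody.replace(reference, () => `( ${inlinedBody} )`);
  if (result === hostBody) {
    // The config asks for an inline that no longer applies: treat it as stale.
    throw new Error(`Stale inline config: ${host} does not reference ${inlined}`);
  }
  return result;
}

// Invented rule bodies, for illustration only.
const grammar: Grammar = {
  WhereAtom: 'Predicate | "(" WhereExpr ")"',
  Predicate: 'ScalarExpr "IS" "NULL"',
};
console.log(inlineInto(grammar, "WhereAtom", "Predicate"));
// prints: ( ScalarExpr "IS" "NULL" ) | "(" WhereExpr ")"
```

A string-based substitution like this would also match the name inside a quoted terminal, so a real implementation would more likely work on a parsed tree.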
Add per-production MDX pages, lexical rules page, and index page for
each grammar (sync-streams, sync-rules). SVGs are generated with
split-mode links baked in; flat HTML replaces them with #anchor links.

- Per-production pages with diagram, TODO placeholder, inlines note,
  referenced terms list with cross-links
- ScalarExpr pages include embedded operator precedence table
- Lexical rules page with prose descriptions and EBNF code blocks
- Index page per grammar listing all productions and lexical rules
- Coverage guards updated: lexical and operator-table rules exempt
- Lexical rules no longer get SVG diagrams (EBNF text only)
- Add generate:grammar-docs pnpm script (split mode)
- Resolved EBNF updated for upstream grammar changes
Lexical rules section now shows a summary table (Token | Examples | Rule)
followed by per-rule description subsections. All lexical rule links in
SVG diagrams point to the Lexical Rules heading so the table is the first
thing seen. Fixed flat HTML to also link lexical rules (previously stripped).
…it-mode MDX

Replace inline SVG approach (which fought Mintlify MDX parser stripping
<style> and <text> elements) with co-located .svg files referenced via
<img> tags. Add --base-url CLI param for absolute URL paths. Strip <a>
links from static SVGs since they do not work inside <img> tags.
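
A minimal sketch of the link-stripping step, assuming the SVG is available as a string and that a regex pass is acceptable; `stripSvgLinks` is a hypothetical name and the actual script may do this differently:

```typescript
// Hypothetical helper: unwrap <a ...>...</a> while keeping the linked content.
// The \b after "a" keeps elements such as <animate> untouched.
function stripSvgLinks(svg: string): string {
  return svg.replace(/<a\b[^>]*>/g, "").replace(/<\/a>/g, "");
}

const linked = '<svg><a xlink:href="#ScalarExpr"><text>ScalarExpr</text></a></svg>';
console.log(stripSvgLinks(linked));
// prints: <svg><text>ScalarExpr</text></svg>
```
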
…g in split MDX

Remove repeated ## heading from production pages (title in frontmatter
is sufficient). Escape | in lexical rule patterns so markdown tables
render correctly. Remove redundant name from description frontmatter.
Remove --mode/split from grammar docs generator. The docs repo now gets
flat single-page MDX per grammar with co-located SVGs via --outdir.

Each production section includes Used by / References cross-links.
Lexical term links point to the Lexical Rules heading. Pipe and angle
bracket escaping applied to flat MDX operator table.
Backtick code spans don't decode HTML entities, so &lt; &gt; were
rendering as literal text. Use raw < > characters instead.
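
The two escaping rules from the last commits can be summarised in a small sketch. `tableCell` is a hypothetical helper, and it assumes GFM-style tables, where a pipe must be backslash-escaped even inside a code span:

```typescript
// Hypothetical helper for one Markdown table cell.
function tableCell(text: string, asCode: boolean): string {
  const piped = text.replace(/\|/g, "\\|"); // an unescaped | would split the cell
  if (asCode) {
    // Code spans show their content literally, so < and > stay raw here;
    // writing &lt; inside backticks would render as the literal text "&lt;".
    return "`" + piped + "`";
  }
  // Outside a code span, angle brackets are escaped as entities.
  return piped.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

console.log(tableCell("a <= b | c", true));
// prints: `a <= b \| c`
```
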

rkistner left a comment

I did not review the script in detail, but the output looks good to me.

  "build:tests": "tsc -b test/tsconfig.json",
- "test": "vitest"
+ "test": "vitest",
+ "generate:grammar": "tsx scripts/generate-grammar-docs.ts"

No need for tsx anymore - you can run this directly with node (and remove the tsx dependency below).

@@ -0,0 +1,29 @@
ParameterQuery ::= TableValuedParameterQuery | TableParameterQuery | StaticParameterQuery

These resolved definitions are ignored by .gitignore, but are still present in the PR here. Should they be removed from the PR?
