feat(sync-rules): railroad diagram generation from EBNF grammars#535
feat(sync-rules): railroad diagram generation from EBNF grammars#535Sleepful wants to merge 12 commits intopowersync-ja:mainfrom
Conversation
|
8e9a4a6 to
c8b0324
Compare
Script at packages/sync-rules/scripts/generate-grammar-docs.ts parses W3C EBNF grammar files and generates railroad diagram documentation: - SVG diagrams with configurable production inlining - Flat HTML review page with inline SVGs and clickable cross-references - Flat MDX page for docs site consumption - Resolved EBNF files showing post-inlining grammar (committed for diffability) - descriptions.yaml extracted from EBNF comments Run: pnpm --filter='./packages/sync-rules' generate:grammar-flat
…cing, and improve README
- Add two coverage guards: assertNoSkippedTerms and assertAllRefsAreDiagrammed
- Add lexical rule diagrams (Identifier, StringLiteral, IntegerLiteral, NumericLiteral) with summary table
- Add HTML review features: inlining notes per production, inlined-only terms table, clickable NonTerminal links
- Fix terminal modifier bug ("NOT"? now correctly renders as optional NOT)
- Rebalance diagrams: decompose FromSource, Predicate, CaseExpr into smaller sections
- Extract CaseCondition from SearchedCaseExpr to reduce width (1821px -> 1325px)
- Inline SVGs in HTML for text selection and link interactivity
- Add xmlns attribute to SVGs for browser rendering
- Update grammar/README.md with inlining config, lexical rules, and coverage checks documentation
- Remove comment preprocessor and descriptions.yaml output (docs written directly in MDX) - Add stale config check (assertNoStaleInlineRefs) to catch orphaned inline references - Decompose grammars: promote TableValuedCall, Reference, WhereAtom, MemberSuffix, ScalarBinaryOp, CaseExpr variants (WhenCaseExpr, WhenScalarExpr) to own diagrams - Merge TableRef into Reference, remove Literal wrapper, extract ScalarBinaryOp - Inline Predicate into WhereAtom for sync-streams only - Promote ParameterQuery variants to top-level diagrams in bucket-definitions
Add per-production MDX pages, lexical rules page, and index page for each grammar (sync-streams, sync-rules). SVGs are generated with split-mode links baked in; flat HTML replaces them with #anchor links. - Per-production pages with diagram, TODO placeholder, inlines note, referenced terms list with cross-links - ScalarExpr pages include embedded operator precedence table - Lexical rules page with prose descriptions and EBNF code blocks - Index page per grammar listing all productions and lexical rules - Coverage guards updated: lexical and operator-table rules exempt - Lexical rules no longer get SVG diagrams (EBNF text only) - Add generate:grammar-docs pnpm script (split mode) - Resolved EBNF updated for upstream grammar changes
Lexical rules section now shows a summary table (Token | Examples | Rule) followed by per-rule description subsections. All lexical rule links in SVG diagrams point to the Lexical Rules heading so the table is the first thing seen. Fixed flat HTML to also link lexical rules (previously stripped).
…it-mode MDX Replace inline SVG approach (which fought Mintlify MDX parser stripping <style> and <text> elements) with co-located .svg files referenced via <img> tags. Add --base-url CLI param for absolute URL paths. Strip <a> links from static SVGs since they do not work inside <img> tags.
…g in split MDX Remove repeated ## heading from production pages (title in frontmatter is sufficient). Escape | in lexical rule patterns so markdown tables render correctly. Remove redundant name from description frontmatter.
Remove --mode/split from grammar docs generator. The docs repo now gets flat single-page MDX per grammar with co-located SVGs via --outdir. Each production section includes Used by / References cross-links. Lexical term links point to the Lexical Rules heading. Pipe and angle bracket escaping applied to flat MDX operator table.
Backtick code spans don't decode HTML entities, so < > were rendering as literal text. Use raw < > characters instead.
rkistner
left a comment
There was a problem hiding this comment.
I did not review the script in detail, but the output looks good to me
| "build:tests": "tsc -b test/tsconfig.json", | ||
| "test": "vitest" | ||
| "test": "vitest", | ||
| "generate:grammar": "tsx scripts/generate-grammar-docs.ts" |
There was a problem hiding this comment.
No need for tsx anymore - you can run this directly with node (and remove the tsx dependency below).
| @@ -0,0 +1,29 @@ | |||
| ParameterQuery ::= TableValuedParameterQuery | TableParameterQuery | StaticParameterQuery | |||
There was a problem hiding this comment.
These resolved definitions are ignored by .gitignore, but still present in the PR here - should this be removed from the PR?
This PR goes in hand with powersync-ja/powersync-docs#372
Summary
Tool that parses W3C EBNF grammar files and generates railroad diagram documentation (SVG + MDX + HTML).
packages/sync-rules/scripts/generate-grammar-docs.tsWhat it generates
From two EBNF grammars (
bucket-definitions.ebnfandsync-streams-compiler.ebnf):Workflow
From
packages/sync-rules/:--outdir— where docs MDX + SVG files are written (per-grammar subdirectories:sync-rules/,sync-streams/)--base-url— URL prefix for absolute<img>src paths in MDX (must match the docs site structure)packages/sync-rules/grammar/docs/regardless of--outdirKey design decisions
GRAMMARSarray), not heuristics<style>and<text>elements)Docs repo PR
The generated MDX pages are committed in the docs repo: powersync-ja/powersync-docs#372
Files changed
packages/sync-rules/scripts/generate-grammar-docs.ts— the generation script (~1400 lines)packages/sync-rules/grammar/README.md— explains the grammar directorypackages/sync-rules/grammar/docs/*.resolved.ebnf— committed resolved grammarspackages/sync-rules/package.json— added scripts + devDependenciespackages/sync-rules/.gitignore— gitignore generated output except resolved EBNF