The EDN specification currently restricts the hex form of character literals to exactly 4 hex digits (`\uXXXX`). This limitation prevents EDN from representing Unicode characters in the Supplementary Planes (U+10000 to U+10FFFF), which include:
- Emoji (U+1F300–U+1F9FF)
- Historic scripts (Linear B, Egyptian hieroglyphs, cuneiform)
- Musical notation symbols
- Mathematical alphanumeric symbols
- CJK ideograph extensions
- And many other modern and historic characters
Extending the character literal hex form to support 4-6 hex digits would allow direct representation of any valid Unicode code point (U+0000 to U+10FFFF).
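As a rough illustration of what the extended form could mean for a reader implementation, the sketch below parses a single character-literal token with 4 to 6 hex digits and checks it against the U+10FFFF upper bound. The function name `parse_hex_char`, the regular expression, and the error messages are hypothetical and are not taken from the EDN specification or any existing EDN library. It also assumes the reader has already isolated the token, and it does not decide how surrogate code points (U+D800 to U+DFFF) should be handled.

```python
import re

# Hypothetical pattern for the proposed form: \u followed by 4 to 6 hex digits.
_HEX_CHAR = re.compile(r"\\u([0-9A-Fa-f]{4,6})")

# Highest code point defined by Unicode.
MAX_CODE_POINT = 0x10FFFF


def parse_hex_char(token: str) -> str:
    """Return the character named by a token such as \\u0041 or \\u1F600.

    Raises ValueError if the token is not \\u plus 4 to 6 hex digits,
    or if the value lies above U+10FFFF.
    """
    match = _HEX_CHAR.fullmatch(token)
    if match is None:
        raise ValueError(f"not a hex character literal: {token!r}")
    code_point = int(match.group(1), 16)
    if code_point > MAX_CODE_POINT:
        raise ValueError(f"code point out of range: U+{code_point:X}")
    return chr(code_point)


if __name__ == "__main__":
    # Existing 4-digit form keeps working; 5- and 6-digit forms reach
    # the Supplementary Planes.
    for literal in (r"\u0041", r"\u1F600", r"\u10FFFF"):
        print(literal, "->", f"U+{ord(parse_hex_char(literal)):04X}")
```

Because six hex digits can express values up to 0xFFFFFF, the explicit range check is what rejects a token such as `\u110000`. The digit count alone is not enough.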
Many languages that can use EDN already support extended Unicode:
- JavaScript: Supports `\u{XXXXX}` syntax (ES6+)
- Python: Supports `\UXXXXXXXX` (8 hex digits)
- Ruby: Supports `\u{XXXXX}`