BIP Draft: Formosa — Themed mnemonic sentences for generating deterministic keys #2108

Open
Yuri-SVB wants to merge 11 commits into bitcoin:master from Yuri-SVB:master

@Yuri-SVB
Mnemonic *sentences* instead of words, proposed as a forwards- and backwards-compatible expansion of BIP39 and submitted as a Bitcoin Improvement Proposal.
Member

@murchandamus murchandamus left a comment

Hi Yuri, thank you for your submission. I see that your proposal was posted to the mailing list in 2023. Since then, we deployed BIP3 as a new BIP Process, so there are a few formatting changes that would be needed to the preamble. I would also suggest that you add a link to the prior discussion to the Discussion header.

At first glance, your document appears to be missing a Specification, a Rationale, and a Backwards Compatibility section. Please refer to BIP3 for more information.

Comment thread bip.mediawiki Outdated
Comment thread bip.mediawiki Outdated
@murchandamus added the "New BIP" and "PR Author action required" (needs updates, has unaddressed review comments, or is otherwise waiting for PR author) labels on Mar 4, 2026
@murchandamus changed the title from "Formosa as BIP" to "BIP Draft: Formosa — Themed mnemonic sentences for generating deterministic keys" on Mar 16, 2026
@murchandamus
Member

Hi @Yuri-SVB, I haven’t given this document a full review yet, because the initial submission has some formatting issues. If you are still working on this, please update your submission to meet the formatting requirements.

@Yuri-SVB
Author

Yuri-SVB commented Mar 23, 2026

Hello, Murchandamus. Thank you for your attention, and thank you for remembering my earlier attempt from 3 years ago!
I believe the requirements are met now.

Yuri-SVB and others added 2 commits March 23, 2026 18:41
Co-authored-by: Mark "Murch" Erhardt <murch@murch.one>
Satisfying requirement of title in fewer than 50 characters.
@Yuri-SVB
Author

> Hi @Yuri-SVB, I haven’t given this document a full review yet, because the initial submission has some formatting issues. If you are still working on this, please update your submission to meet the formatting requirements.

Hello, Murch!
Could you confirm all the remaining formatting requirements were met?
Thank you!

@murchandamus
Member

Hey Yuri,
sorry for not getting around to looking at this yet. The preamble looks much better. I’m afraid I’m gonna be afk next week, so I will not be able to give this a full read until I’m back the week after.

@murchandamus removed the "PR Author action required" label on Mar 29, 2026
@Yuri-SVB
Author

Yuri-SVB commented Apr 1, 2026

> Hey Yuri, sorry for not getting around to looking at this yet. The preamble looks much better. I’m afraid I’m gonna be afk next week, so I will not be able to give this a full read until I’m back the week after.

Hello, Murch. No problem.
I hope this compilation of references on Formosa (as I call this BIP39 expansion) can be of help.

Member

@murchandamus murchandamus left a comment

This already reads pretty well, although the specification could be presented in a more technical manner. It seems a bit light on the Rationale. It would be preferable if there were a Backwards Compatibility section instead of the mention in the Abstract.

I think an example of a Formosa-encoded seed could help illustrate what you are trying to do; I was firmly expecting to see one until I got to the end.

Comment thread bip.mediawiki Outdated
Comment thread bip.mediawiki Outdated
Comment thread bip.mediawiki Outdated
Comment thread bip.mediawiki Outdated
@murchandamus added the "PR Author action required" label on Apr 14, 2026
Restructure the draft to follow BIP-3 conventions and resolve the issues
raised by reviewers in bitcoin#2108:

- Introduce explicit Specification section with a Terminology subsection
  that distinguishes 'word', 'category', 'theme', 'sentence' and
  'mnemonic' / 'mnemonic story', removing the ambiguity of using
  'sentence' at two different scales.
- Replace the unclear 'if the category is led by another category'
  wording with an explicit LED_BY field description and a step-by-step
  algorithm that covers both the leaderless and led cases.
- Reflow the theme-property list (previously a/b/c/d/e split by an
  intervening paragraph) into a single numbered list so it renders as a
  list rather than as code blocks.
- Add a dedicated Rationale section covering the 33-bit sentence size,
  themed sentences, free-form theme schema, the LED_BY mechanism, the
  re-encoding-through-BIP-39 design, and why custom themes are
  discouraged.
- Add a dedicated Backwards Compatibility section describing
  compatibility at the mnemonic, entropy, and seed levels.
- Add a worked Example section showing a 128-bit entropy being encoded
  into a 4-sentence mnemonic story under a small illustrative theme,
  including bit splitting, FILLING_ORDER vs NATURAL_ORDER, and the
  LED_BY lookup.
- Tighten the Abstract and Motivation; clarify that BIP-39 is itself a
  Formosa theme.
Reviewer on PR bitcoin#2108 asked for no abbreviations in table labels. Replace:

- ENT / CS / S / MS column headers with 'Initial entropy bits',
  'Checksum bits', 'Total bits', 'Number of sentences', 'Mnemonic
  words (6-word theme)' and 'Mnemonic words (BIP-0039)'.
- 'List size / Bits / Chars to identify / Density (bits/char)' with
  'Wordlist size / Bits per word / Characters to identify / Density
  (bits per character)'.
- ADJ. with ADJECTIVE in the example bit-assignment diagram, and the
  surrounding narrative ENT/MS uses with the spelled-out forms.

The accompanying formulas now use the expanded names too, so the
algorithm description and the table column headers stay consistent.
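
The relationship between the renamed table columns can be sketched in a few lines of Python. This is only an illustration: it assumes the BIP-0039 checksum rule (one checksum bit per 32 entropy bits), the 33-bit sentence size mentioned in this PR, and a 6-word theme. The function name `formosa_sizes` and the dictionary keys are hypothetical and not taken from the draft.

```python
def formosa_sizes(entropy_bits: int, words_per_sentence: int = 6) -> dict:
    """Derive the table columns from the initial entropy size (a sketch)."""
    # BIP-0039 allows 128 to 256 bits of entropy in multiples of 32.
    if entropy_bits % 32 != 0 or not 128 <= entropy_bits <= 256:
        raise ValueError("entropy must be 128..256 bits, in multiples of 32")
    checksum_bits = entropy_bits // 32           # BIP-0039 checksum rule
    total_bits = entropy_bits + checksum_bits    # always a multiple of 33
    number_of_sentences = total_bits // 33       # one sentence per 33 bits
    return {
        "initial_entropy_bits": entropy_bits,
        "checksum_bits": checksum_bits,
        "total_bits": total_bits,
        "number_of_sentences": number_of_sentences,
        "mnemonic_words_6_word_theme": number_of_sentences * words_per_sentence,
        "mnemonic_words_bip39": total_bits // 11,  # 11 bits per BIP-0039 word
    }


if __name__ == "__main__":
    for bits in (128, 160, 192, 224, 256):
        print(formosa_sizes(bits))
```

Because 33 is a multiple of 11, each sentence carries exactly as many bits as three BIP-0039 words, so a 128-bit entropy gives 4 sentences under this sketch.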
Replace the previous hypothetical 5-category example with one that
mirrors the medieval_fantasy theme actually shipped at
https://github.com/Yuri-SVB/formosa/tree/master/src/mnemonic/themes,
including:

- the real 6 categories with their actual BIT_LENGTHs
  (VERB=5, SUBJECT=6, OBJECT=6, ADJECTIVE=5, WILDCARD=6, PLACE=5,
  summing to 33);
- the real FILLING_ORDER and NATURAL_ORDER;
- the real lead tree (VERB → SUBJECT; SUBJECT → OBJECT and WILDCARD;
  OBJECT → ADJECTIVE; WILDCARD → PLACE), showing that a single
  leader can have several dependent categories;
- a 33-bit block whose decoded indices (28, 32, 63, 27, 46, 29)
  pick existing words and existing sub-list entries: VERB[28]
  =unveil, SUBJECT_under_unveil[32]=king, OBJECT_under_king[63]
  =wine, ADJECTIVE_under_wine[27]=sweet, WILDCARD_under_king[46]
  =queen, PLACE_under_queen[29]=throne_room, yielding the sentence
  'king unveil sweet wine queen throne_room'.

This keeps the worked example faithful to the reference
implementation rather than to a fabricated theme, so that anyone can
reproduce the encoding by parsing medieval_fantasy.json.
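
The bit splitting in this worked example can be sketched as follows. The BIT_LENGTHs, indices, and words are the ones quoted in the commit message above. The orders are assumptions: `FILLING_ORDER` is taken to be the order in which the categories are listed there, bits are read most-significant first, and `NATURAL_ORDER` is inferred from the resulting sentence. The real values should be read from medieval_fantasy.json.

```python
BIT_LENGTHS = {"VERB": 5, "SUBJECT": 6, "OBJECT": 6,
               "ADJECTIVE": 5, "WILDCARD": 6, "PLACE": 5}

# Assumed orders (not confirmed by the PR text); see the lead-in.
FILLING_ORDER = ["VERB", "SUBJECT", "OBJECT", "ADJECTIVE", "WILDCARD", "PLACE"]
NATURAL_ORDER = ["SUBJECT", "VERB", "ADJECTIVE", "OBJECT", "WILDCARD", "PLACE"]


def split_block(bits: str) -> dict:
    """Split a 33-character bit string into one index per category."""
    if len(bits) != sum(BIT_LENGTHS.values()):
        raise ValueError("a sentence block must be exactly 33 bits")
    indices, pos = {}, 0
    for category in FILLING_ORDER:
        width = BIT_LENGTHS[category]
        indices[category] = int(bits[pos:pos + width], 2)
        pos += width
    return indices


def join_indices(indices: dict) -> str:
    """Inverse of split_block: pack the indices back into a bit string."""
    return "".join(format(indices[c], f"0{BIT_LENGTHS[c]}b")
                   for c in FILLING_ORDER)


# Indices and words quoted in the commit message.
example_indices = {"VERB": 28, "SUBJECT": 32, "OBJECT": 63,
                   "ADJECTIVE": 27, "WILDCARD": 46, "PLACE": 29}
example_words = {"VERB": "unveil", "SUBJECT": "king", "OBJECT": "wine",
                 "ADJECTIVE": "sweet", "WILDCARD": "queen",
                 "PLACE": "throne_room"}

block = join_indices(example_indices)
sentence = " ".join(example_words[c] for c in NATURAL_ORDER)

if __name__ == "__main__":
    print(block, split_block(block))
    print(sentence)
```

The round trip through `join_indices` and `split_block` shows that the six indices and the 33-bit block carry the same information. Only the word order differs between filling and reading.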
Add a paragraph to the LED_BY rationale clarifying that a Formosa theme
behaves as a primitive language model (next-word predictor): each LED_BY relation
skews the conditional distribution over the next word so that probability
mass falls only on the 2^BIT_LENGTH words compatible with the already-
chosen leader, and zero elsewhere. The theme designer plays the role of
training data, hand-curating which combinations are semantically coherent.
This framing makes explicit why themes produce sentences that 'sound right'
while still covering all 2^33 bit patterns of a sentence.
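
A minimal sketch of this "conditional wordlist" idea, using a hypothetical two-category toy theme with one bit per category (the words, the field names, and the `encode` helper are illustrative and not taken from the draft or from the reference implementation):

```python
from itertools import product

# Hypothetical toy theme: OBJECT is led by VERB, so its wordlist depends
# on which verb was already chosen.
TOY_THEME = {
    "VERB": {"bit_length": 1, "led_by": None,
             "words": ["pour", "ride"]},
    "OBJECT": {"bit_length": 1, "led_by": "VERB",
               "words": {"pour": ["wine", "water"],
                         "ride": ["horse", "wagon"]}},
}
ORDER = ["VERB", "OBJECT"]  # leaders must be filled before the led category


def encode(bits: str) -> tuple:
    """Map a bit string to one word per category, honouring LED_BY."""
    chosen, pos = {}, 0
    for category in ORDER:
        spec = TOY_THEME[category]
        index = int(bits[pos:pos + spec["bit_length"]], 2)
        pos += spec["bit_length"]
        leader = spec["led_by"]
        # Leaderless: one flat list. Led: the sub-list keyed by the leader's word.
        wordlist = spec["words"] if leader is None else spec["words"][chosen[leader]]
        chosen[category] = wordlist[index]
    return tuple(chosen[c] for c in ORDER)


# Every 2-bit pattern yields a distinct, semantically coherent pair.
all_sentences = {encode("".join(p)) for p in product("01", repeat=2)}

if __name__ == "__main__":
    print(sorted(all_sentences))
```

In this toy theme "ride wine" can never be produced, yet all four bit patterns still map to four different sentences. That is the property the paragraph describes at the 2^33 scale.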
…oncake)

which builds on this property by rendering each Formosa category as an
on-screen table whose rows and columns are permuted per input session.

Combined with the randomized-indexation property,
an attacker watching only the screen still learns nothing without also
recovering the press sequence.

Add a Rationale paragraph explaining a further benefit of splitting the
vocabulary into several short wordlists (32-128 entries each): such tables
fit on a mobile-device screen and admit input via on-screen lookup, which
a single 2048-word list does not.

The randomized indexation:

- defeats pure key-logging (keystrokes alone don't reveal words; the
  attacker also needs the session permutation),
- raises the bar for shoulder surfing (as with key-logging, only the keys
  AND the session's permutation together suffice; either alone is
  uninformative).

This gives an operational, security-focused argument for the
many-small-lists design that complements the existing memorization and
information-density arguments.
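
The randomized indexation can be illustrated with a short sketch. It assumes a single 32-entry category and a fresh uniform shuffle per input session. The names `new_session_table` and `word_at`, and the placeholder wordlist, are hypothetical and do not describe Mooncake's actual implementation.

```python
import secrets


def new_session_table(wordlist):
    """Return a fresh random permutation of one category's wordlist."""
    table = list(wordlist)
    secrets.SystemRandom().shuffle(table)  # OS-backed randomness
    return table


def word_at(table, position):
    """The word the user selects when their input lands on `position`."""
    return table[position]


# Placeholder 32-entry category (2^5 words).
category = [f"word{i:02d}" for i in range(32)]

session_a = new_session_table(category)
session_b = new_session_table(category)

if __name__ == "__main__":
    # The same logged position generally maps to different words per session.
    position = 7
    print(word_at(session_a, position), word_at(session_b, position))
```

A key-logger sees only `position`, and a screen observer sees only the table. Recovering the chosen word requires both, which is the argument the list above makes.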

Formosa: document Mooncake's volume-key input on mobile

Add a paragraph to the Mooncake rationale describing the proposed mobile
input mechanism: reuse of the volume-up / volume-down keys as a two-button
binary selector. Because every Formosa category is sized 2^BIT_LENGTH and
the on-screen table is laid out in rows, sub-rows and columns whose counts
are powers of two, narrowing to a single cell takes exactly BIT_LENGTH
presses (5 for a 32-entry category, 6 for 64, 7 for 128). The per-category
press count is therefore invariant, hence uninformative, and equal to the
number of entropy bits encoded; the 'one bit per press' bound matches the
existing side-channel argument.
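
The two-button selection can be sketched as a plain binary narrowing over a table of 2^BIT_LENGTH cells. The mapping of "up" to the first half and "down" to the second half is an assumption made for illustration, as is the function name `select_by_presses`. The draft's rows, sub-rows, and columns layout is flattened here into a single list.

```python
from itertools import product


def select_by_presses(table, presses):
    """Narrow a power-of-two table to one cell, one bit per press."""
    bit_length = len(table).bit_length() - 1
    if 1 << bit_length != len(table):
        raise ValueError("table size must be a power of two")
    if len(presses) != bit_length:
        raise ValueError(f"exactly {bit_length} presses are required")
    lo, hi = 0, len(table)
    for press in presses:
        mid = (lo + hi) // 2
        if press == "up":      # assumed: keep the first half
            hi = mid
        elif press == "down":  # assumed: keep the second half
            lo = mid
        else:
            raise ValueError("press must be 'up' or 'down'")
    return table[lo]


table = [f"w{i}" for i in range(32)]  # placeholder 32-entry category

# All 2^5 press sequences reach distinct cells, always in exactly 5 presses.
reachable = {select_by_presses(table, p)
             for p in product(["up", "down"], repeat=5)}

if __name__ == "__main__":
    print(select_by_presses(table, ["down", "up", "up", "down", "up"]))
    print(len(reachable))
```

Since every selection in a category takes the same number of presses, the count itself reveals nothing about which word was chosen.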

Add three concrete reasons why volume-key input on mobile resists visual
shoulder surfing better than an on-screen keyboard:

- Subtler input motions: a single finger pressing a side rocker, much
  harder to read from a distance than multi-finger taps on a glass
  keyboard.
- Easy occlusion with the second hand: both volume keys are on one edge
  of the device, so the free hand (or the holding hand's thumb) can
  cover them without obscuring the screen for the user.
- Pocket input via headphone volume buttons: because the protocol is
  purely binary, headphone volume controls are sufficient, letting the
  user keep the buttons in a pocket while operating it by feel and
  removing the input motion from the observer's field of view entirely.
@murchandamus removed the "PR Author action required" label on Apr 27, 2026
Fixed typo from "dektop" to "desktop"
Fixed agreement of number from "Those of a mobile device" to "Those of mobile devices"
Member

@murchandamus murchandamus left a comment

Good improvements, this reads great. I’m gonna look into a number assignment.
It would probably be good if some wallet developers that have worked with BIP39 reviewed it, too.

Comment thread bip.mediawiki Outdated
Comment thread bip.mediawiki Outdated
Yuri-SVB and others added 2 commits April 29, 2026 19:46
Substituted triple hyphen for —

Co-authored-by: Murch <murch@murch.one>
Updated title to mention Formosa and be more self-explanatory.

Co-authored-by: Murch <murch@murch.one>
@Yuri-SVB
Author

> Good improvements, this reads great. I’m gonna look into a number assignment. It would probably be good if some wallet developers that have worked with BIP39 reviewed it, too.

Do you have someone in mind? Would you like me to invite a wallet developer?
