[bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support

Weiji Guo weiji.g at gmail.com
Thu Nov 22 17:25:07 UTC 2018


Hi Everyone,

Thank you very much in this thanks giving day for the detailed and well
thought out responses. :)

Steven Hatzakis via bitcoin-dev <bitcoin-dev at
lists.linuxfoundation.org
<https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>>:

>* *Option 2*: Perhaps a revision is needed to how the BIP39 seed is
*>* generated in the first place, such as by hashing the entropy instead of the
*>* words. Any thoughts on how viable that could be where the initial entropy
*>* is fed into the PBKDF2 function and not the words?*

If we go this direction, I'd suggest that we pull Shamir's Secret Sharing
into the game. Trezor's
SLIP-0039 proposal is great and has many security aspects already covered.
However, it does
not allow any language other than English and Trezor team clearly stated
that no other language
will be supported.

While I really want to keep the language independent design. So in the
revision, I'd like to see
a language id (allocated to each one having a defined wordlist) in the SSS
share, as well as
share id, threshold, index, share value, checksum etc.

Regarding checksum scheme, SLIP-0039 proposals a 3-word Reed-Solomon
design. It has a very
good error checking capability but not very good at providing hints to
error recovery. Trezor team
opposes to the idea of providing hints to users regarding how to fix an
error. This could lead to
difficulties for some vendors, and in small probability, confusions to
users (when there is a 2-word
error)

I do agree with Trezor team that it should be users' responsibility to
recover from a detected error.
However, there is a better way than solely rely on checksum. That is, as in
our revision, we can
support mnemonic in multiple languages simultaneously, why don't we use two
languages, or one
language + numbers to check each other? In Steven's example (language id,
share id, etc. skipped)
we could record a SSS share (assuming it is one of the shares just for the
sake of example) like:

>* *In English*: minimum fee sure ticket faculty banana gate purse caught
*>* valley globe shift
*>* *In Spanish*: mercado faja soledad tarea evadir aries gafas peine búho
*>* tumor gerente reja*

Or

>* *In English*: minimum fee sure ticket faculty banana gate purse caught
*>* valley globe shift*

>* Word Indexes: 1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927,
794, 1582*


Then software will have to check checksum as well as to check if words
match each other. For
example, "minimum"'s index value in English wordlist should equal to "
*mercado*"'s in Spanish,
or should equal to 1128.

If any error is detected, combining the checksum value and dual-encoding
information, it is much
easier to figure out which word was handprinted incorrectly.

BTW, it is very error prone to handprint. Some study suggests about 0.9%
per word rate. See
http://panko.shidler.hawaii.edu/HumanErr/Basic.htm

Hotopf [1980]

W sample (written exam). Per word

0.9%

It is important to have an error recovery mechanism easy to understand and
implement.

Thanks,
Weiji
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20181123/883b184a/attachment-0001.html>


More information about the bitcoin-dev mailing list