[Bitcoin-ml] BCH address change - analysis and alternative proposal

Roy Badami roy at gnomon.org.uk
Fri Nov 17 11:45:02 UTC 2017


Hi all, this is my first time posting here.

I've been thinking recently about the proposed BCH address change, and
I have to say that I'm not a big fan of either the cashaddr proposal
or the Bitpay proposal.

TL;DR: scroll down to 'PROPOSAL'

ANALYSIS
========

A big part of my lack of enthusiasm for the cashaddr proposal stems
from the fact that I was not a big fan of bech32 to begin with (on
which cashaddr is based).  But I'll leave my specific comments on
bech32 and cashaddr to an appendix to this post - I want to start with
an analysis of the requirements, followed by a specific proposal to
meet the immediate short term requirements without compromising future
extensibility.

So, to start by briefly analysing the requirements.  The primary
objective is to reduce the risk of accidentally sending BCH to a BTC
address or vice versa.  This gives us the requirements:

(1) A new-style BCH address should not be a valid BTC address; and

(2) A BTC address should not be a valid new-style BCH address.

(Or at least, that the possibility of violating (1) or (2) should be
sufficiently small as to be improbable.)

An additional requirement that is desirable (perhaps not essential)
is:

(3) It should be possible to reliably visually distinguish new-style
BCH addresses from BTC addresses/old-style BCH addresses (with thanks
to freetrader for helping clarify this requirement to me when
critiquing an earlier version of this proposal).

Of course, both proposals on the table satisfy (1), (2) and (3).

But I'd like to consider a further requirement, stemming from the
practicality of changing address format for a coin that already has
significant use.  In fact, we have an (admittedly limited) case study
here in Litecoin: despite using a different version byte for P2PKH
addresses from the outset, when LTC initially adopted P2SH addresses
they used the same version byte as BTC, 5, for P2SH addresses,
resulting in the addresses (beginning with '3') that can be confused
with BTC P2SH addresses.  Earlier this year, LTC transitioned to a
version byte of 50, resulting in P2SH addresses beginning with 'M'.
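To illustrate how the version byte determines the leading character,
here is a minimal Python sketch of base58check encoding.  The function
names (base58check, leading_chars) are my own, chosen for illustration;
the payload is assumed to be a 20-byte hash, as for P2PKH and P2SH
addresses.

```python
import hashlib

# The base58 alphabet: no '0', 'O', 'I' or 'l'.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload with a 4-byte double-SHA256 checksum."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    n = int.from_bytes(raw, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = B58_ALPHABET[rem] + digits
    # Each leading zero byte is written as a literal '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + digits


def leading_chars(version: int) -> set:
    """First character for the smallest and largest 20-byte payloads."""
    lowest = base58check(version, bytes(20))
    highest = base58check(version, b"\xff" * 20)
    return {lowest[0], highest[0]}


if __name__ == "__main__":
    for version in (0, 5, 50):
        print(version, sorted(leading_chars(version)))
```

With version bytes 0, 5 and 50 this gives leading characters '1', '3'
and 'M' respectively, matching the BTC P2PKH, BTC/old LTC P2SH and new
LTC P2SH prefixes described above.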

Now, my impression (and I'd welcome feedback from LTC users) is that,
many months after the transition, support for the new 'M' addresses is
still far from universal.  A recent case in point - a new user
purchases a Trezor and creates an LTC wallet.  Trezor gives them a
segwit wallet by default - so the user has P2SH addresses (beginning
with 'M').  The user finds a service to purchase some BTC with fiat,
and then tries to use Shapeshift to convert some of their coin to LTC.
But Shapeshift doesn't recognise their 'M' address - they need to use
Trezor's conversion tool to convert their address to an old style '3'
address before being able to use Shapeshift.

I think it is reasonable to expect that a BCH address transition would
be at least somewhat similar: take up of the new address format would
likely be patchy for a significant period of time, and the reality is
that users would need to be able to convert between old and new style
BCH addresses for a significant period of time, if not indefinitely.

Now, relying on address conversion tools is inconvenient, and also
introduces the risk of misappropriation of funds by malicious
conversion tools.  So I'd like to posit a fourth requirement:

(4) It should be possible to easily convert between old- and new-style
BCH addresses without having to trust software to perform the
conversion.

Neither cashaddr nor the original Bitpay proposal satisfies (4).

Note, the proposal in this note specifically does *not* address future
requirements.  It is solely focussed on representing *existing*
address types in a way that satisfies requirement (4), as well as
other existing data such as private keys and xpubs.

It is likely that a new address format will still be needed to address
future requirements, but to satisfy (4) this new address format should
only be used for *new* address types.  This is similar to what BTC is
doing: retaining base58 for existing address types, but introducing
bech32 for new address types.

Future extensibility is discussed further in the appendix.


PROPOSAL
========

New-style P2PKH and P2SH addresses in BCH are created by using base58,
with the same prefixes as existing BCH addresses (or BTC addresses),
and then inserting the character '0' (digit zero) between the first
and second characters of the base58 string.

This is also done for all other existing base58-encoded data
structures, such as private keys, extended public keys, and extended
private keys.

This immediately satisfies (1), (2) and (3) since a base58 string can
never contain the character '0'.  It also satisfies (4) since address
conversion between old- and new-style BCH addresses consists of adding
or removing a single character.

That is, BTC and new-style BCH addresses are not only incompatible,
and visually distinguishable, but it is also possible to manually
convert addresses without resort to software tools - such conversion
being required during the (possibly quite long) transitional period.
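As a rough sketch of how small the conversion is, the following Python
shows it in both directions.  The function names are hypothetical, and
the code deliberately does not verify the base58 checksum; it only
checks the character set.  The example address is the well-known
genesis block address, used purely as sample input.

```python
# The base58 alphabet: no '0', 'O', 'I' or 'l'.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def is_base58(s: str) -> bool:
    """True if s is non-empty and uses only base58 characters."""
    return bool(s) and all(c in B58_ALPHABET for c in s)


def to_new_style(old: str) -> str:
    """Insert '0' between the first and second characters."""
    if not is_base58(old):
        raise ValueError("not a base58 string")
    return old[0] + "0" + old[1:]


def to_old_style(new: str) -> str:
    """Remove the '0' that follows the first character."""
    if len(new) < 2 or new[1] != "0":
        raise ValueError("not a new-style string")
    old = new[0] + new[2:]
    if not is_base58(old):
        raise ValueError("not a base58 string after removing '0'")
    return old


if __name__ == "__main__":
    old = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
    new = to_new_style(old)
    print(new)                # 10A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
    print(to_old_style(new))  # 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
```

Because '0' is not in the base58 alphabet, to_new_style rejects a
string that is already new-style, and any strict base58 decoder rejects
a new-style string outright, which is what requirements (1) and (2)
ask for.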

Additional features of this proposal (which, while not requirements,
are nonetheless desirable) are:

(5) New-style BCH addresses are consistent with existing user
understanding of '1' and '3' prefixes for P2PKH and P2SH addresses
(which will now begin '10' or '30')

(6) The same approach can (and probably should) be applied to all
base58 serialisations, such as private keys, extended public keys, and
extended private keys (which will now begin 'K0', 'L0' or '50',
'x0pub', or 'x0prv')

As to the downsides:

The first, and probably biggest, is that we introduce the character
'0' (digit zero) into addresses.  This isn't completely ideal, because
it could be confused with 'O' (capital O) - but as capital O does not
occur in base58 addresses the scope for confusion is limited.  There
are three other characters that are particularly easily confusable -
'1' (digit one), 'I' (capital I) and 'l' (lowercase el) - and the
existing format discards two of those, but manages to use digit one
without significant confusion.  Digit zero is also already used in ETH
hex addresses, without significant confusion.

The second is that it doesn't explicitly address extensibility and
future requirements.  However, the use of prefixes has proved flexible
enough to date to handle multiple address formats and other data
structures for multiple coins, so may be adequate to address any near
term requirements, if desired.  In any case, a new address format
probably should be adopted for new address types, but to avoid
usability issues changes to existing address types should not be made
that violate requirement (4).  This would mean supporting two address
types forever, but the code complexity can be minimised by reusing the
base58 serialisation.  And in any case fully removing base58 from the
ecosystem would be an ambitious, and probably unrealistic, goal, given
its use for private keys and extended public and private keys.
Extensibility is briefly discussed further in the appendix.


APPENDIX
========

BIP173 bech32
-------------

bech32 confuses me because it seems to be a proposed solution without
a clear set of requirements:

* Much of the rationale for bech32 seems to be based on the perceived
  need to reduce the probability that a mistyped address will result
  in a valid, different, address below the one-in-four-billion
  probability of such an event that exists with the current base58
  encoding.  However no evidence or substantive discussion is provided
  to justify the (rather surprising) implicit proposition that
  one-in-four-billion is an unacceptably high risk.

* Much work in the design has gone into optimising the encoding based
  on an analysis of the similarity of printed characters.  And yet we
  end up with an address format that uses both '1' (digit 1) and 'l'
  (lower case el).  Admittedly, confusing the two cannot result in
  accepting an invalid address, but this is a highly undesirable
  feature for cryptocurrency addresses (and a surprising feature for
  an address format supposedly based on such an analysis!).  It's also
  unclear whether we should solely be analysing visual similarity of
  printed characters rather than, for example, the proximity of keys
  on typical keyboard layouts (which might be an equally plausible
  model of errors).

* A large part of the rationale for bech32 seems to be based on an
  attempt to reduce the risks and inconvenience of manually
  transcribing Bitcoin addresses - and yet the justification for the
  acceptability of using longer addresses is that Bitcoin addresses
  are normally cut-and-pasted rather than being manually transcribed!

* BIP173 recognises the danger of using an error correcting code for
  addresses, and requires that implementations don't suggest a
  correction, but only identify a possible location for the error.
  But there is a real risk that some implementations may fail to
  follow this advice, and try to be 'helpful' by suggesting
  replacements.  If this were to happen, there is a possible risk that
  users may be inclined to trust the suggestions more than they
  should, risking sending funds to an incorrect address.

* The complexity of base58 is cited as a reason for desiring an
  alternative.  However, the reality is that there are multiple
  existing implementations of base58 in a wide variety of languages
  so, however complex, there is little reason for anyone to
  reimplement it.  On the other hand, bech32 introduces the
  requirement to add a new algorithm without (in the case of Bitcoin,
  at least) removing the need for base58.  As discussed elsewhere in
  this note, removal of base58 is in any case a probably unrealistic
  goal.  BIP173 clearly adds complexity and adds to technical debt.

That's not to say that BIP173 is without benefits, though.  More
efficient use of QR codes is a benefit, and bech32 addresses
potentially are a little easier to read out over the telephone, due to
the lack of mixed case, even given the increased length.  It's not
clear to me however, that the complexity and other downsides of bech32
justify these minor benefits.

cashaddr
--------

Most of the concerns about BIP173 bech32 above also apply to cashaddr,
since cashaddr is derived from it.  Notably, though, the criticism
that addresses can contain both '1' (digit one) and 'l' (lower case
el) does not apply to cashaddr, since cashaddr uses ':' as the
separator rather than '1'.  However:

* The use of lower case letter 'l' is still undesirable, given
  cryptocurrency users are currently used to the fact that, given a
  character from the set '1', 'l', 'I', it should be assumed to be a
  digit one.  This is true for all existing coins that use base58
  addresses as well as Monero (which uses the same alphabet as base58)
  and ETH (which uses hex).

* cashaddr specifies that the 'bitcoincash:' prefix can be omitted when
  presented to the users.  This is pragmatic, and the reality is that
  users and applications will inevitably omit the prefix anyway,
  whether or not this is mandated by the standard - in the same way
  that the 'http://' prefix is widely omitted from HTTP URLs.  However,
  without the prefix, mainnet and testnet addresses will not be
  visually distinguishable: P2PKH addresses will begin 'q' (whether
  mainnet or testnet) and P2SH addresses will begin 'p' (whether
  mainnet or testnet).  Note that a mainnet address will never be a
  valid testnet address (or vice versa) because of the way the
  checksum computation is performed, but visually distinctive testnet
  addresses are nonetheless a useful feature of existing address formats.
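
The 'q' and 'p' prefixes follow from how the payload is packed.  The
Python sketch below assumes, as I understand the cashaddr
specification, that the payload starts with a version byte holding the
address type in bits 3-6 and a hash-size code in bits 0-2, and that the
payload is then split into 5-bit groups mapped through the bech32
character set.  The function name is my own, and this is not a full
cashaddr encoder (no checksum, no hash).

```python
# The 32-character set shared by bech32 and cashaddr.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

# Address type codes as assumed from the cashaddr specification.
TYPE_P2PKH = 0
TYPE_P2SH = 1


def first_payload_char(addr_type: int, size_code: int = 0) -> str:
    """First character after the prefix, for the given type and size code.

    size_code 0 is assumed to mean a 160-bit hash.
    """
    version_byte = (addr_type << 3) | size_code
    # The first 5-bit group is the top five bits of the version byte.
    return CHARSET[version_byte >> 3]


if __name__ == "__main__":
    print(first_payload_char(TYPE_P2PKH))  # q
    print(first_payload_char(TYPE_P2SH))   # p
    # 'l' is in the character set; '1' is not.
    print("l" in CHARSET, "1" in CHARSET)  # True False
```

Note that the network (mainnet or testnet) does not enter into this
calculation at all; under these assumptions it only affects the prefix
and the checksum, which is why the first character is the same on both
networks.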


Extensibility
-------------

This proposal is primarily intended to provide a safe and
easy-to-implement way of ensuring that existing deployed base58
serialisations are not confused with BTC.  It is not specifically
intended to be used to meet future requirements, whether new address
types or interchange of any other data, although of course it would be
possible to use it for new purposes by changing the base58 version
byte.

It is assumed that a new address format will probably be needed in
time; however this proposal removes the urgency in finalising such a
proposal.

It is suggested that the following design features might be worthy of
consideration in such a future proposal:

* Consider encoding all metadata in a humanly-readable part at the
  beginning of the address string.  (However care needs to be taken to
  avoid formatting the humanly-readable part in such a way that the
  users may assume it is an optional prefix when it actually carries
  important data.)

* For the main body of the address (typically containing a hash)
  consider using a modified base58 as the serialisation.  Even in the
  case of the cashaddr proposal, implementations will need to retain a
  base58 implementation to handle private keys, xpubs, etc, so adding
  an additional serialisation is best avoided to minimise technical
  debt.  (The checksum calculation will have to be modified so as to
  also protect the humanly-readable metadata.  One approach might be
  to use a modified base58 without the 4-byte checksum, and then use
  another checksum algorithm to protect the entire textual address
  string.)


Regards,

Roy Badami

