[bitcoin-dev] Segregated witnesses and validationless mining

Peter Todd pete at petertodd.org
Wed Dec 23 01:31:19 UTC 2015


# Summary

1) Segregated witnesses separates transaction information about what
coins were transferred from the information proving those transfers were
legitimate.

2) In its current form, segregated witnesses makes validationless mining
easier and more profitable than the status quo, particularly as
transaction fees increase in relevance.

3) This can be easily fixed by changing the protocol to make having a
copy of the previous block's (witness) data a precondition to creating a
block.


# Background

## Why should a miner publish the blocks they find?

Suppose Alice has negligible hashing power. She finds a block. Should
she publish that block to the rest of the hashing power? Yes! If she
doesn't publish, the rest of the hashing power will build a longer chain
than her chain, and she won't be rewarded. Right?

Well, can other miners build on top of Alice's block? If she publishes
nothing at all, the answer is certainely no - block headers commit to
the previous block's hash, so without knowing at least the hash of
Alice's block other miners can't build upon it.


## Validationless mining

Suppose Bob knows the hash of Alice's new block, as well as the height
of it. This is sufficient information for Bob to create a new, valid,
block building upon Alice's block. The hash is needed because of the
prevhash field in the block header; the height is needed because the
coinbase has to contain the block height. (technically he needs to know
nTime as well to be 100% sure he's satisfying the median time rule) What
Bob is doing is validationless mining: he hasn't validated Alice's
block, and is assuming it is valid.

If Alice runs a pool her stratum or getblocktemplate interfaces give
sufficient information for Bob to figure all this out. Miners today take
advantage of this to reduce their orphan rates - the sooner you can
start mining on top of the most recently found block the more money you
earn. Pools have strong incentives to only publish work that's valid to
their hashers, so as long as the target pool doesn't know who you are,
you have high assurance that the block hash you're building upon is
real.

Of course, when this goes wrong it goes very wrong, greatly amplifying
the effect of 51% attacks and technical screwups, as seen by the July
4th 2015 chain fork, where a majority of hashing power was building on
top of an invalid block.


## Transactions

However other than coinbase transactions, validationless mined blocks
are nearly always empty: if Bob doesn't know what transactions Alice
included in her block, he doesn't know what transaction outputs are
still unspent and can't safely include transactions in his block. In
short, Bob doesn't know what the current state of the UTXO set is. This
helps limit the danger of validationless mining by making it visible to
everyone, as well as making it not as profitable due to the inability to
collect transaction fees. (among other reasons)


# Segregated witnesses and validationless mining

With segregated witnesses the information required to update the UTXO
set state is now separate from the information required to prove that
the new state is valid. We can fully expect miners to take advantage of
this to reduce latency and thus improve their profitability.

We can expect block relaying with segregated witnesses to separate block
propagation into four different parts, from fastest to propagate to
slowest:

1) Stratum/getblocktemplate - status quo between semi-trusting miners

2) Block header - bare minimum information needed to build upon a block.
Not much trust required as creating an invalid header is expensive.

3) Block w/o witness data - significant bandwidth savings, (~75%) and
allows next miner to include transactions as normal. Again, not much
trust required as creating an invalid header is expensive.

4) Witness data - proves that block is actually valid.

The problem is #4 is optional: the only case where not having the
witness data matters is when an invalid block is created, which is a
very rare event. It's also difficult to test in production, as creating
invalid blocks is extremely expensive - it would be surprising if an
anyone had ever deliberately created an invalid block meeting the
current difficulty target in the past year or two.


# The nightmare scenario - never tested code ~never works

The obvious implementation of highly optimised mining with segregated
witnesses will have the main codepath that creates blocks do no
validation at all; if the current ecosystem's validationless mining is
any indication the actual code doing this will be proprietary codebases
written on a budget with little testing, and lots of bugs. At best the
codepaths that actually do validation will be rarely, if ever, tested in
production.

Secondly, as the UTXO set can be updated without the witness data, it
would not be surprising if at least some of the wallet ecosystem skips
witness validation.

With that in mind, what happens in the event of a validation failure?
Mining could continue indefinitely on an invalid chain, producing blocks
that in isolation appear totally normal and contain apparently valid
transactions. It's easy to imagine this happening from an engineering
perspective: a simple implementation would be to have the main mining
codepaths be a separate, not-validating, process that receives "invalid
block" notifications from another process containing a validating
implementation of the Bitcoin protocol. If a bug/exploit is found that
causes that validation process to crash, what's to guarantee that the
block creation codepath will even notice? Quite likely it will continue
creating blocks unabated - the invalid block notification codepath is
never tested in production.


# Easy solution: previous witness data proof

To return segregated witnesses to the status quo, we need to at least
make having the previous block's witness data be a precondition to
creating a block with transactions; ideally we would make it a
precondition to making any valid block, although going this far may
receive pushback from miners who are currently using validationless
mining techniques.

We can require blocks to include the previous witness data, hashed with
a different hash function that the commitment in the previous block.
With witness data W, and H(W) the witness commitment in the previous
block, require the current block to include H'(W)

A possible concrete implementation would be to compute the hash of the
current block's coinbase txouts (unique per miner for obvious reasons!)
as well as the previous block hash. Then recompute the previous block's
witness data merkle tree (and optionally, transaction data merkle tree)
with that hash prepended to the serialized data for each witness.

This calculation can only be done by a trusted entity with access to all
witness data from the previous block, forcing miners to both publish
their witness data promptly, as well as at least obtain witness data
from other miners. (if not actually validate it!) This returns us to at
least the status quo, if not slightly better.

This solution is a soft-fork. As the calculation is only done once per
block, it is *not* a change to the PoW algorithm and is thus compatible
with existing miner/hasher setups. (modulo validationless mining
optimizations, which are no longer possible)


# Proofs of non-inflation vs. proofs of non-theft

Currently full nodes can easily verify both that inflation of the
currency has no occured, as well as verify that theft of coins through
invalid scriptSigs has not occured. (though as an optimisation currently
scriptSig's prior to checkpoints are not validated by default in Bitcoin
Core)

It has been proposed that with segregated witnesses old witness data
will be discarded entirely. This makes it impossible to know if miner
theft has occured in the past; as a practical matter due to the
significant amount of lost coins this also makes it possible to inflate
the currency.

How to fix this problem is an open question; it may be sufficient have
the previous witness data proof solution above require proving posession
of not just the n-1 block, but a (random?) selection of other previous
blocks as well. Adding this to the protocol could be done as soft-fork
with respect to the above previous witness data proof.

-- 
'peter'[:-1]@petertodd.org
000000000000000002c7cfc8455339de54444ac9798cad32cbfbcda77e0f2b09
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 650 bytes
Desc: Digital signature
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20151222/6792b37a/attachment.sig>


More information about the bitcoin-dev mailing list