[Bitcoin-ml] Partial UTXO tree as commitment

Amaury Séchet deadalnix at gmail.com
Tue Sep 5 19:42:55 UTC 2017


A few quick remarks:

- This is maybe good enough, but far from perfect. I think we still want to
have a mechanism to update the UTXO set format no matter what. There is a
ton of work done on accumulator these days, so it's not completely crazy to
think that a better solution may be around the corner.
 - If the sharding is based on a constant, the size of the proof or the
number of shard grows with the size of the UTXO set. A nice trick is to
grow the number of shard based on the square root of the size of the UTXO
set. Now number of shards and proof size both grow in a sub linear manner
with the size of the UTXO set.
 - If we have a way to upgrade the way we compute the set (see first point)
I'd rather go for the simplest option at first.


2017-09-05 20:07 GMT+02:00 Chris Pacia via bitcoin-ml <
bitcoin-ml at lists.linuxfoundation.org>:

> Given the primary use case of fraud proofs would likely be rarely used I
> think the proof size is pretty good. Haven't had time to digest the rest
> of it yet.
>
>
> On 09/05/2017 08:40 AM, Tomas via bitcoin-ml wrote:
> > I would like to propose an efficient UTXO commitment scheme.
> >
> > A UTXO commitment can be useful for:
> >
> > 1. Fast syncing a full node, by downloading the UTXO-set
> > 2. Proofing (non) existence of a UTXO..
> >
> > Various schemes have been proposed:
> >
> > * Merkle/radix trees and variants; all of which have the problem that
> > they significantly increase the burden of maintaining the UTXO set.
> > Furthermore, such schemes tend to practically prescribe the UTXO storage
> > format severely limiting continuous per-implementation optimizations.
> > * A "flat" rolling hash, eg the ECMH proposed by Pieter Wiulle which is
> > cheap to calculate but only solves (1) and not (2).
> >
> > I propose a hybrid approach, with very limited extra burden to maintain
> > and reasonably small proofs:
> >
> > We divide the UTXO set in buckets by prefix of their TXID, then maintain
> > a rolling hash for each bucket. The commitment is then the root of the
> > tree constructed from the resulting bucket hashes. To construct the
> > tree: For each depth, we group the hashes of the next depth per 64
> > hashes and calculate the rolling hash of each. (Effectively, this is a
> > prefix tree with a fixed branch-size of 64).
> >
> > Bucketcount
> > -------------------
> > txcount = number of TXIDs in the UTXO set
> > bucketcount = (smallest power of 2 larger than sqrt(txcount))  << 6
> >
> > Rationale for bucketcount:
> >
> > * This currently gives a bucketcount of 2^19, which is very cheap to
> > maintain with a 16mb array of rolling hashes.
> > * This currently gives an average bucket size of 4kb. With a rolling
> > hash, full nodes don't need to maintain the buckets themselves, but they
> > are used for proofs.
> > * The burden of future UTXO growth is divided among maintaining the
> > rolling hashes and size of the proof: 10,000x as large UTXO set (20TB),
> > gives ~400kb buckets and ~1.6gb in maintaining rolling hashes.
> > * This gives a tree depth of 5, which means the cost of every UTXO
> > update is increased  by ~3 rolling hashes (and a double SHA), as the
> > lowest depths don't benefit from caching.
> > * A proof for (non) existence of a UTXO is ~ 4*64*32 =8kb (branch-nodes)
> > + 4kb (bucket) = ~12kb
> >
> > Specification [WIP]
> > ---------------------------
> > We define the "UTXO commitment" as the serialized byte array: "U" "T"
> > "X" "O" VARINT(version) VARINT(txcount) UINT256(UTXO-root)    [todo
> > clarify]
> >
> > A block that contains an output in the coinbase whose scriptPubKey
> > consists solely of OP_RETURN [UTXO commitment] must be rejected if in
> > the UTXO commitment the version equals 1 and either
> > * After updating the UTXO state, the number of distinct TXIDs in the
> > UTXO set is not equal to the txcount value of the UTXO commitment
> > * After updating the UTXO state, the UTXO-root in the UTXO commitment is
> > not equal to the UTXO-root defined below.
> >
> > The UTXO-root can be calculated as follows:
> >
> > * Define _bucketcount_ as (smallest power of 2 larger than
> > sqrt(txcount))  << 6
> > * Given a TXID in the UTXO set, define UTXO(TXID) as the double SHA256
> > of (TXID + coins). (coins is the serialization of unspent outputs to be
> > spec'ed).
> > * Let bucket N be the set of values UTXO(TXID) for each TXID in the
> > UTXO-set where (TXID mod _bucketcount_) equals N.
> > * Let rhash N be the rolling hash (TBD) of all values in bucket N
> > * Let the hash sequence be the ordered sequence  rhash
> > [0,_bucketcount_).
> >
> > 1. If the hash sequence contains at most 64 entries, then the UTXO-root
> > is the rolling hash of all entries in the hash sequence, otherwise:
> > 2. Group the hash sequence in ordered subsequences of 64 entries each.
> > 3. Find the rolling hash of each subsequence
> > 4. Continue with 1., with the hash sequence being the ordered sequence
> > of these rolling hashes.
> >
> > Note: an implementation may want to maintain and update the set of
> > rolling hashes at higher depths on each UTXO set operation.
> >
> > Note: the secure ECMH is a good candidate for the bucket hash. This
> > could also be used for the branch rolling hashes, but it might be worth
> > considering XOR for those as there seem to be simply not enough
> > candidates to find a colliding set?
> >
> > Note: two magic numbers are used: "<< 6" for the bucket count, and "64"
> > for the branch size. They work nicely but are pulled out of a dark place
> > and merit some experimentation.
> >
> > Use cases for light clients
> > -------------------------------------
> > These UTXO proofs could be used as compact fraud proofs, although the
> > benefit of this is not generally agreed upon.
> >
> > They can also be used to increase low-conf security to light clients, by
> > validating the signatures and order-validity of incoming transactions
> > against the right bucket of the current UTXO set.
> >
> > An interesting use case may be another type of light client. It could be
> > interesting for a light client to abandon the bloom filters, and instead
> > use the UTXO proofs to verify whether an incoming or outgoing
> > transaction is confirmed. This could be beneficial for "rarely active"
> > light clients such as smartphone apps, as it prevents the need to
> > synchronize previous blocks with bloom filters, and allows syncing to
> > the latest block with 12kb/output.
> >
> > Summary
> > --------------
> > * Allows fast full node syncing.
> > * Costs full nodes ~20mb extra in RAM
> > * Costs full nodes ~3 rolling hash operations per UTXO operation.
> > * Allows UTXO (non) existence proofs for currently avg ~12kb.
> > * Size of proof grows O(sqrt(N)) with UTXO set
> > * Size of extra full node memory grows O(sqrt(N)) with UTXO set
> >
> >
> > Tomas van der Wansem
> > bitcrust
> >
> > (Thanks to deadalnix for slack discussion)
> > _______________________________________________
> > bitcoin-ml mailing list
> > bitcoin-ml at lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-ml
> >
>
> _______________________________________________
> bitcoin-ml mailing list
> bitcoin-ml at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-ml
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-ml/attachments/20170905/a4ce549e/attachment-0001.html>


More information about the bitcoin-ml mailing list