[bitcoin-dev] Making UTXO Set Growth Irrelevant With Low-Latency Delayed TXO Commitments

Eric Lombrozo elombrozo at gmail.com
Tue May 17 14:25:22 UTC 2016


Nice!

We’ve been talking about doing this forever and it’s so desperately needed.

> On May 17, 2016, at 3:23 PM, Peter Todd via bitcoin-dev <bitcoin-dev at lists.linuxfoundation.org> wrote:
> 
> # Motivation
> 
> UTXO growth is a serious concern for Bitcoin's long-term decentralization. To
> run a competitive mining operation potentially the entire UTXO set must be in
> RAM to achieve competitive latency; your larger, more centralized, competitors
> will have the UTXO set in RAM. Mining is a zero-sum game, so the extra latency
> you incur by not doing the same directly impacts your profit margin. Secondly,
> having possession of the UTXO set is one of the minimum requirements to run a
> full node; the larger the set the harder it is to run a full node.
> 
> Currently the maximum size of the UTXO set is unbounded as there is no
> consensus rule that limits growth, other than the block-size limit itself; as
> of writing the UTXO set is 1.3GB in the on-disk, compressed serialization,
> which expands to significantly more in memory. UTXO growth is driven by a
> number of factors, including the fact that there is little incentive to merge
> inputs, lost coins, dust outputs that can't be economically spent, and
> non-btc-value-transfer "blockchain" use-cases such as anti-replay oracles and
> timestamping.
> 
> We don't have good tools to combat UTXO growth. Segregated Witness proposes to
> give witness space a 75% discount, in part to make reducing the UTXO set size
> by spending txouts cheaper. While this may encourage wallets to more often spend
> dust, it's hard to imagine an incentive sufficiently strong to discourage most,
> let alone all, UTXO growing behavior.
> 
> For example, timestamping applications often create unspendable outputs due to
> ease of implementation, and because doing so is an easy way to make sure that
> the data required to reconstruct the timestamp proof won't get lost - all
> Bitcoin full nodes are forced to keep a copy of it. Similarly anti-replay
> use-cases like using the UTXO set for key rotation piggyback on the uniquely
> strong security and decentralization guarantee that Bitcoin provides; it's very
> difficult - perhaps impossible - to provide these applications with
> alternatives that are equally secure. These non-btc-value-transfer use-cases
> can often afford to pay far higher fees per UTXO created than competing
> btc-value-transfer use-cases; many users could afford to spend $50 to register
> a new PGP key, yet would rather not spend $50 in fees to create a standard two
> output transaction. Effective techniques to resist miner censorship exist, so,
> without resorting to whitelists, blocking non-btc-value-transfer use-cases as
> "spam" is not a long-term, incentive-compatible solution.
> 
> A hard upper limit on UTXO set size could create a more level playing field in
> the form of fixed minimum requirements to run a performant Bitcoin node, and
> make the issue of UTXO "spam" less important. However, making any coins
> unspendable, regardless of age or value, is a politically untenable economic
> change.
> 
> 
> # TXO Commitments
> 
> With a merkle tree committing to the state of all transaction outputs, both
> spent and unspent, we can provide a method of compactly proving the current
> state of an output. This lets us "archive" less frequently accessed parts of
> the UTXO set, allowing full nodes to discard the associated data while still
> providing a mechanism to spend those archived outputs by proving to those
> nodes that the outputs are in fact unspent.
> 
> Specifically, TXO commitments propose a Merkle Mountain Range¹ (MMR), a
> type of deterministic, indexable, insertion ordered merkle tree, which allows
> new items to be cheaply appended to the tree with minimal storage requirements,
> just log2(n) "mountain tips". Once an output is added to the TXO MMR it is
> never removed; if an output is spent its status is updated in place. Both the
> state of a specific item in the MMR, as well as the validity of changes to items
> in the MMR, can be proven with log2(n) sized proofs consisting of a merkle path
> to the tip of the tree.
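> 
> For concreteness, here's a minimal sketch of the MMR append operation in
> Python, assuming a hypothetical hash_node() combining function; real
> implementations also track tree sizes so merkle paths can be derived:
> 
>     import hashlib
> 
>     def hash_node(left, right):
>         return hashlib.sha256(left + right).digest()
> 
>     def mmr_append(peaks, leaf):
>         """peaks: list of (height, digest) mountain tips, highest first."""
>         new = (0, leaf)
>         # Merge equal-height peaks like binary addition carries; at most
>         # log2(n) merges happen, and at most log2(n) tips remain.
>         while peaks and peaks[-1][0] == new[0]:
>             height, digest = peaks.pop()
>             new = (height + 1, hash_node(digest, new[1]))
>         peaks.append(new)
>         return peaks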
> 
> At an extreme, with TXO commitments we could even have no UTXO set at all,
> entirely eliminating the UTXO growth problem. Transactions would simply be
> accompanied by TXO commitment proofs showing that the outputs they wanted to
> spend were still unspent; nodes could update the state of the TXO MMR purely
> from TXO commitment proofs. However, the log2(n) bandwidth overhead per txin is
> substantial, so a more realistic implementation is to have a UTXO cache for
> recent transactions, with TXO commitments acting as an alternative for the (rare)
> event that an old txout needs to be spent.
> 
> Proofs can be generated and added to transactions without the involvement of
> the signers, even after the fact; there's no need for the proof itself to be
> signed and the proof is not part of the transaction hash. Anyone with access to
> TXO MMR data can (re)generate missing proofs, so minimal, if any, changes are
> required to wallet software to make use of TXO commitments.
> 
> 
> ## Delayed Commitments
> 
> TXO commitments aren't a new idea - the author proposed them years ago in
> response to UTXO commitments. However it's critical for small miners' orphan
> rates that block validation be fast, and so far it has proven difficult to
> create (U)TXO implementations with acceptable performance; updating and
> recalculating cryptographically hashed, merkelized datasets is inherently more
> work than not doing so. Fortunately if we maintain a UTXO set for recent
> outputs, TXO commitments are only needed when spending old, archived, outputs.
> We can take advantage of this by delaying the commitment, allowing it to be
> calculated well in advance of it actually being used, thus changing a
> latency-critical task into a much easier average throughput problem.
> 
> Concretely each block B_i commits to the TXO set state as of block B_{i-n}, in
> other words what the TXO commitment would have been n blocks ago, if not for
> the n block delay. Since that commitment only depends on the contents of the
> blockchain up until block B_{i-n}, the contents of any block after are
> irrelevant to the calculation.
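> 
> In sketch form, the rule is simply the following (the names here are
> hypothetical placeholders, and the delay value is only an example):
> 
>     COMMITMENT_DELAY = 100  # n, an example value only
> 
>     def expected_txo_commitment(block_height, txo_mmr_root_as_of):
>         # Block B_i commits to the TXO set state as of B_{i-n}; blocks
>         # after B_{i-n} cannot affect this digest, so it can be computed
>         # well in advance of being needed.
>         return txo_mmr_root_as_of(block_height - COMMITMENT_DELAY)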
> 
> 
> ## Implementation
> 
> Our proposed high-performance/low-latency delayed commitment full-node
> implementation needs to store the following data (a rough sketch in code
> follows the list):
> 
> 1) UTXO set
> 
>    Low-latency K:V map of txouts definitely known to be unspent. Similar to
>    the existing UTXO implementation, but with the key difference that old,
>    unspent, outputs may be pruned from the UTXO set.
> 
> 
> 2) STXO set
> 
>    Low-latency set of transaction outputs known to have been spent by
>    transactions after the most recent TXO commitment, but created prior to the
>    TXO commitment.
> 
> 
> 3) TXO journal
> 
>    FIFO of outputs that need to be marked as spent in the TXO MMR. Appends
>    must be low-latency; removals can be high-latency.
> 
> 
> 4) TXO MMR list
> 
>    Prunable, ordered list of TXO MMR's, mainly the highest pending commitment,
>    backed by a reference counted, cryptographically hashed object store
>    indexed by digest (similar to how git repos work). High-latency ok. We'll
>    cover this in more detail later.
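> 
> As a rough sketch, the four stores might look like this in Python (the
> backing types are placeholders; only the latency requirements matter here):
> 
>     from collections import deque
> 
>     class DelayedCommitmentState:
>         def __init__(self):
>             self.utxo = {}          # 1) low-latency map: outpoint -> txout
>             self.stxo = set()       # 2) low-latency set of archived spends
>             self.journal = deque()  # 3) FIFO: fast appends, slow pops ok
>             self.mmr_store = {}     # 4) digest -> (refcount, node data)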
> 
> 
> ### Fast-Path: Verifying a Txout Spend In a Block
> 
> When a transaction output is spent by a transaction in a block we have two
> cases:
> 
> 1) Recently created output
> 
>    Output created after the most recent TXO commitment, so it should be in the
>    UTXO set; the transaction spending it does not need a TXO commitment proof.
>    Remove the output from the UTXO set and append it to the TXO journal.
> 
> 2) Archived output
> 
>    Output created prior to the most recent TXO commitment, so there's no
>    guarantee it's in the UTXO set; the transaction will have a TXO commitment
>    proof for the most recent TXO commitment showing that it was unspent.
>    Check that the output isn't already in the STXO set (double-spent), and if
>    not add it. Append the output and TXO commitment proof to the TXO journal.
> 
> In both cases recording an output as spent requires no more than two key:value
> updates, and one journal append. The existing UTXO set requires one key:value
> update per spend, so we can expect new block validation latency to be within 2x
> of the status quo even in the worst case of 100% archived output spends.
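> 
> A sketch of this fast path, assuming the stores above and a hypothetical
> verify_txo_proof() (sketched later); note it does at most two key:value
> updates plus one journal append per spend:
> 
>     def spend_output(state, outpoint, txo_proof=None):
>         if outpoint in state.utxo:            # case 1: recently created
>             del state.utxo[outpoint]
>             state.journal.append((outpoint, None))
>         else:                                 # case 2: archived output
>             if txo_proof is None or not verify_txo_proof(outpoint, txo_proof):
>                 raise ValueError("missing or invalid TXO commitment proof")
>             if outpoint in state.stxo:
>                 raise ValueError("archived output already spent")
>             state.stxo.add(outpoint)
>             state.journal.append((outpoint, txo_proof))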
> 
> 
> ### Slow-Path: Calculating Pending TXO Commitments
> 
> In a low-priority background task we flush the TXO journal, recording the
> outputs spent by each block in the TXO MMR, and hashing MMR data to obtain the
> TXO commitment digest. Additionally this background task removes STXO's that
> have been recorded in TXO commitments, and prunes TXO commitment data no longer
> needed.
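> 
> Sketched in the same style as above (the MMR method names are hypothetical,
> and batching/ordering details are glossed over):
> 
>     def flush_txo_journal(state, mmr):
>         # Low-priority background task: apply journaled spends to the TXO
>         # MMR and compute the pending commitment digest.
>         while state.journal:
>             outpoint, txo_proof = state.journal.popleft()
>             if txo_proof is not None:
>                 mmr.unprune(txo_proof)   # archived spend: restore nodes
>             mmr.mark_spent(outpoint)     # update the leaf in place
>         pending_commitment = mmr.root()
>         # Additionally: drop STXO entries now recorded in a commitment,
>         # and prune TXO commitment data that is no longer needed.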
> 
> Throughput for the TXO commitment calculation will be worse than the existing
> UTXO only scheme. This impacts bulk verification, e.g. initial block download.
> That said, TXO commitments provide other possible tradeoffs that can mitigate
> the impact of slower validation throughput, such as skipping validation of old
> history, as well as fraud proof approaches.
> 
> 
> ### TXO MMR Implementation Details
> 
> Each TXO MMR state is a modification of the previous one with most information
> shared, so we can space-efficiently store a large number of TXO commitment
> states, where each state is a small delta of the previous state, by sharing
> unchanged data between each state; cycles are impossible in merkelized data
> structures, so simple reference counting is sufficient for garbage collection.
> Data no longer needed can be pruned by dropping it from the database, and
> unpruned by adding it again. Since everything is committed to via cryptographic
> hash, we're guaranteed that regardless of where we get the data, after
> unpruning we'll have the right data.
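> 
> A toy version of such a store, assuming 32-byte SHA256 digests (since
> merkelized structures can't contain cycles, plain reference counting is
> enough for garbage collection):
> 
>     import hashlib
> 
>     class NodeStore:
>         def __init__(self):
>             self.nodes = {}  # digest -> (refcount, serialized node)
> 
>         def put(self, node_bytes):
>             d = hashlib.sha256(node_bytes).digest()
>             count, _ = self.nodes.get(d, (0, node_bytes))
>             self.nodes[d] = (count + 1, node_bytes)
>             return d
> 
>         def release(self, digest):
>             count, node_bytes = self.nodes[digest]
>             if count == 1:
>                 del self.nodes[digest]  # prune: drop from the database
>             else:
>                 self.nodes[digest] = (count - 1, node_bytes)
> 
>         # Unpruning is just put() again: the digest guarantees we got the
>         # right data back, regardless of where it came from.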
> 
> Let's look at how the TXO MMR works in detail. Consider the following TXO MMR
> with two txouts, which we'll call state #0:
> 
>      0
>     / \
>    a   b
> 
> If we add another entry we get state #1:
> 
>        1
>       / \
>      0   \
>     / \   \
>    a   b   c
> 
> Note how 100% of the state #0 data was reused in commitment #1. Let's
> add two more entries to get state #2:
> 
>            2
>           / \
>          2   \
>         / \   \
>        /   \   \
>       /     \   \
>      0       2   \
>     / \     / \   \
>    a   b   c   d   e
> 
> This time part of state #1 wasn't reused - it wasn't a perfect binary
> tree - but we've still got a lot of re-use.
> 
> Now suppose state #2 is committed into the blockchain by the most recent block.
> Future transactions attempting to spend outputs created as of state #2 are
> obliged to prove that they are unspent; essentially they're forced to provide
> part of the state #2 MMR data. This lets us prune that data, discarding it,
> leaving us with only the bare minimum data we need to append new txouts to the
> TXO MMR, the tips of the perfect binary trees ("mountains") within the MMR:
> 
>            2
>           / \
>          2   \
>               \
>                \
>                 \
>                  \
>                   \
>                    e
> 
> Note that we're glossing over some nuance here about exactly what data needs to
> be kept; depending on the details of the implementation the only data we need
> for nodes "2" and "e" may be their hash digest.
> 
> Adding three more txouts results in state #3:
> 
>                  3
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          2               3
>                         / \
>                        /   \
>                       /     \
>                      3       3
>                     / \     / \
>                    e   f   g   h
> 
> Suppose recently created txout f is spent. We have all the data required to
> update the MMR, giving us state #4. It modifies three inner nodes (including
> the root) and one leaf node:
> 
>                  4
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          2               4
>                         / \
>                        /   \
>                       /     \
>                      4       3
>                     / \     / \
>                    e  (f)  g   h
> 
> Spending an archived txout requires the transaction to provide the merkle
> path to the most recently committed TXO, in our case state #2. If txout b is
> spent that means the transaction must provide the following data from state #2:
> 
>            2
>           /
>          2
>         /
>        /
>       /
>      0
>       \
>        b
> 
> We can add that data to our local knowledge of the TXO MMR, unpruning part of
> it:
> 
>                  4
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          2               4
>         /               / \
>        /               /   \
>       /               /     \
>      0               4       3
>       \             / \     / \
>        b           e  (f)  g   h
> 
> Remember, we haven't _modified_ state #4 yet; we just have more data about it.
> When we mark txout b as spent we get state #5:
> 
>                  5
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          5               4
>         /               / \
>        /               /   \
>       /               /     \
>      5               4       3
>       \             / \     / \
>       (b)          e  (f)  g   h
> 
> Secondly, by now state #3 has been committed into the chain, and transactions
> that want to spend txouts created as of state #3 must provide a TXO proof
> consisting of state #3 data. The leaf nodes for outputs g and h, and the inner
> node above them, are part of state #3, so we prune them:
> 
>                  5
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          5               4
>         /               /
>        /               /
>       /               /
>      5               4
>       \             / \
>       (b)          e  (f)
> 
> Finally, let's put this all together, by spending txouts a, c, and g, and
> creating three new txouts i, j, and k. State #3 was the most recently committed
> state, so the transactions spending a and g are providing merkle paths up to
> it. This includes part of the state #2 data:
> 
>                  3
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          2               3
>         / \               \
>        /   \               \
>       /     \               \
>      0       2               3
>     /       /               /
>    a       c               g
> 
> After unpruning we have the following data for state #5:
> 
>                  5
>                 / \
>                /   \
>               /     \
>              /       \
>             /         \
>            /           \
>           /             \
>          5               4
>         / \             / \
>        /   \           /   \
>       /     \         /     \
>      5       2       4       3
>     / \     /       / \     /
>    a  (b)  c       e  (f)  g
> 
> That's sufficient to mark the three outputs as spent and add the three new
> txouts, resulting in state #6:
> 
>                        6
>                       / \
>                      /   \
>                     /     \
>                    /       \
>                   /         \
>                  6           \
>                 / \           \
>                /   \           \
>               /     \           \
>              /       \           \
>             /         \           \
>            /           \           \
>           /             \           \
>          6               6           \
>         / \             / \           \
>        /   \           /   \           6
>       /     \         /     \         / \
>      6       6       4       6       6   \
>     / \     /       / \     /       / \   \
>   (a) (b) (c)      e  (f) (g)      i   j   k
> 
> Again, state #4 related data can be pruned. In addition, depending on how the
> STXO set is implemented, we may also be able to prune data related to spent
> txouts after that state, including inner nodes where all txouts under them have been
> spent (more on pruning spent inner nodes later).
> 
> 
> ### Consensus and Pruning
> 
> It's important to note that pruning behavior is consensus critical: a full node
> that is missing data due to pruning it too soon will fall out of consensus, and
> a miner that fails to include a merkle proof that is required by consensus
> is creating an invalid block. At the same time many full nodes will have
> significantly more data on hand than the bare minimum so they can help wallets
> make transactions spending old coins; implementations should strongly consider
> separating the data that is, and isn't, strictly required for consensus.
> 
> A reasonable approach for the low-level cryptography may be to actually treat
> the two cases differently, with the TXO commitments committing to what data
> does and does not need to be kept on hand by the UTXO expiration rules. On the
> other hand, leaving that uncommitted allows for certain types of soft-forks
> where the protocol is changed to require more data than it previously did.
> 
> 
> ### Consensus Critical Storage Overheads
> 
> Only the UTXO and STXO sets need to be kept on fast random access storage.
> Since STXO set entries can only be created by spending a UTXO - and are smaller
> than a UTXO entry - we can guarantee that the peak size of the UTXO and STXO
> sets combined will always be less than the peak size of the UTXO set alone in
> the existing UTXO-only scheme (though the combined size can be temporarily
> higher than what the UTXO set size alone would be when large numbers of
> archived txouts are spent).
> 
> TXO journal entries and unpruned entries in the TXO MMR have log2(n) maximum
> overhead per entry: a unique merkle path to a TXO commitment (by "unique" we
> mean that no other entry shares data with it). On a reasonably fast system the
> TXO journal will be flushed quickly, converting it into TXO MMR data; the TXO
> journal will never be more than a few blocks in size.
> 
> Transactions spending non-archived txouts are not required to provide any TXO
> commitment data; we must have that data on hand in the form of one TXO MMR
> entry per UTXO. Once spent however the TXO MMR leaf node associated with that
> non-archived txout can be immediately pruned - it's no longer in the UTXO set
> so any attempt to spend it will fail; the data is now immutable and we'll never
> need it again. Inner nodes in the TXO MMR can also be pruned if all leaves under
> them are fully spent; detecting this is easy if the TXO MMR is a merkle-sum tree,
> with each inner node committing to the sum of the unspent txouts under it.
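> 
> A sketch of the merkle-sum combining step (the exact serialization is an
> assumption; the point is that each inner node commits to the count of
> unspent txouts below it, so a zero sum marks a fully prunable subtree):
> 
>     import hashlib
> 
>     def sum_node(left_digest, left_sum, right_digest, right_sum):
>         total = left_sum + right_sum
>         payload = total.to_bytes(8, 'big') + left_digest + right_digest
>         return hashlib.sha256(payload).digest(), total
> 
>     def fully_spent(node_sum):
>         return node_sum == 0  # every leaf underneath is spent: prunable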
> 
> When an archived txout is spent the transaction is required to provide a merkle
> path to the most recent TXO commitment. As shown above that path is sufficient
> information to unprune the necessary nodes in the TXO MMR and apply the spend
> immediately, reducing this case to the TXO journal size question (non-consensus
> critical overhead is a different question, which we'll address in the next
> section).
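> 
> For illustration, verifying such a path might look like the following
> (ignoring the merkle-sum detail above for brevity; the path format is an
> assumption). The nodes recomputed along the way are exactly the nodes that
> get unpruned:
> 
>     import hashlib
> 
>     def verify_txo_proof(leaf_digest, path, committed_root):
>         d = leaf_digest
>         for sibling_digest, sibling_on_right in path:
>             pair = d + sibling_digest if sibling_on_right else sibling_digest + d
>             d = hashlib.sha256(pair).digest()
>         return d == committed_root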
> 
> Taking all this into account the only significant storage overhead of our TXO
> commitments scheme when compared to the status quo is the log2(n) merkle path
> overhead; as long as less than 1/log2(n) of the UTXO set consists of active,
> non-archived UTXO's we've come out ahead, even in the unrealistic case where
> all storage available is equally fast. In the real world that isn't yet the
> case - even SSD's are significantly slower than RAM.
> 
> 
> ### Non-Consensus Critical Storage Overheads
> 
> Transactions spending archived txouts pose two challenges:
> 
> 1) Obtaining up-to-date TXO commitment proofs
> 
> 2) Updating those proofs as blocks are mined
> 
> The first challenge can be handled by specialized archival nodes, not unlike
> how some nodes make transaction data available to wallets via bloom filters or
> the Electrum protocol. There's a whole variety of options available, and the
> data can be easily sharded to scale horizontally; the data is
> self-validating allowing horizontal scaling without trust.
> 
> While miners and relay nodes don't need to be concerned about the initial
> commitment proof, updating that proof is another matter. If a node aggressively
> prunes old versions of the TXO MMR as it calculates pending TXO commitments, it
> won't have the data available to update the TXO commitment proof to be against
> the next block, when that block is found; the child nodes of the TXO MMR tip
> are guaranteed to have changed, yet aggressive pruning would have discarded that
> data.
> 
> Relay nodes could ignore this problem if they simply accept the fact that
> they'll only be able to fully relay the transaction once, when it is initially
> broadcast, and won't be able to provide mempool functionality after the initial
> relay. Modulo high-latency mixnets, this is probably acceptable; the author has
> previously argued that relay nodes don't need a mempool² at all.
> 
> For a miner, though, not having the data necessary to update the proofs as
> blocks are found means potentially losing out on transaction fees. So how much
> extra data is necessary to make this a non-issue?
> 
> Since the TXO MMR is insertion ordered, spending a non-archived txout can only
> invalidate the upper nodes of the archived txout's TXO MMR proof (if this
> isn't clear, imagine a two-level scheme, with per-block TXO MMRs, committed
> by a master MMR for all blocks). The maximum number of relevant inner nodes
> changed is log2(n) per block, so if there are n non-archival blocks between the
> most recent TXO commitment and the pending TXO MMR tip, we have to store
> log2(n)*n inner nodes - on the order of a few dozen MB even when n is a
> (seemingly ridiculously high) year worth of blocks.
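> 
> A quick sanity check of that estimate, assuming 32-byte digests:
> 
>     import math
> 
>     n = 365 * 24 * 6   # ~one year of blocks at 10 minutes/block
>     node_size = 32     # bytes per inner-node digest (assumed)
>     print(n * math.log2(n) * node_size / 1e6)  # ~26 MB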
> 
> Archived txout spends on the other hand can invalidate TXO MMR proofs at any
> level - consider the case of two adjacent txouts being spent. To guarantee
> success requires storing full proofs. However, they're limited by the blocksize
> limit, and additionally are expected to be relatively uncommon. For example, if
> 1% of 1MB blocks were archival spends, our hypothetical year-long TXO commitment
> delay is only a few hundred MB of data with low-IO-performance requirements.
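> 
> The same back-of-the-envelope arithmetic for the archival-spend case:
> 
>     n = 365 * 24 * 6              # ~one year of blocks
>     per_block = 0.01 * 1_000_000  # 1% of each 1MB block, in bytes
>     print(n * per_block / 1e6)    # ~526 MB, i.e. a few hundred MB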
> 
> 
> ## Security Model
> 
> Of course, a TXO commitment delay of a year sounds ridiculous. Even the slowest
> imaginable computer isn't going to need more than a few blocks of TXO
> commitment delay to keep up ~100% of the time, and there's no reason why we
> can't have the UTXO archive delay be significantly longer than the TXO
> commitment delay.
> 
> However, as with UTXO commitments, TXO commitments raise issues with Bitcoin's
> security model by allowing miners to profitably mine transactions
> without bothering to validate prior history. At the extreme, if there was no
> commitment delay at all, then at the cost of some extra network bandwidth
> "full" nodes could operate and even mine blocks completely statelessly by
> expecting all transactions to include "proof" that their inputs are unspent; a
> TXO commitment proof for a commitment you haven't verified isn't a proof that a
> transaction output is unspent, it's a proof that some miners claimed the txout
> was unspent.
> 
> At one extreme, we could simply implement TXO commitments in a "virtual"
> fashion, without miners actually including the TXO commitment digest in their
> blocks at all. Full nodes would be forced to compute the commitment from
> scratch, in the same way they are forced to compute the UTXO state, or total
> work. Of course a full node operator who doesn't want to verify old history can
> get a copy of the TXO state from a trusted source - no different from how you
> could get a copy of the UTXO set from a trusted source.
> 
> A more pragmatic approach is to accept that people will do that anyway, and
> instead assume that sufficiently old blocks are valid. But how old is
> "sufficiently old"? First of all, if your full node implementation comes "from
> the factory" with a reasonably up-to-date minimum accepted total-work
> thresholdⁱ - in other words it won't accept a chain with less than that amount
> of total work - it may be reasonable to assume any Sybil attacker with
> sufficient hashing power to make a forked chain meeting that threshold with,
> say, six months worth of blocks has enough hashing power to threaten the main
> chain as well.
> 
> That leaves public attempts to falsify TXO commitments, done out in the open by
> the majority of hashing power. In this circumstance the "assumed valid"
> threshold determines how long the attack would have to go on before full nodes
> start accepting the invalid chain, or at least, newly installed/recently reset
> full nodes. The minimum age that we can "assume valid" is a tradeoff between
> political/social/technical concerns; we probably want at least a few weeks to
> guarantee the defenders a chance to organise themselves.
> 
> With this in mind, a longer-than-technically-necessary TXO commitment delayʲ
> may help ensure that full node software actually validates some minimum number
> of blocks out-of-the-box, without taking shortcuts. However this can be
> achieved in a wide variety of ways, such as the author's prev-block-proof
> proposal³, fraud proofs, or even a PoW with an inner loop dependent on
> blockchain data. Like UTXO commitments, TXO commitments are also potentially
> very useful in reducing the need for SPV wallet software to trust third parties
> providing them with transaction data.
> 
> i) Checkpoints that reject any chain without a specific block are a more
>   common, if uglier, way of achieving this protection.
> 
> j) A good homework problem is to figure out how the TXO commitment could be
>   designed such that the delay could be reduced in a soft-fork.
> 
> 
> ## Further Work
> 
> While we've shown that TXO commitments certainly could be implemented without
> increasing peak IO bandwidth/block validation latency significantly with the
> delayed commitment approach, we're far from being certain that they should be
> implemented this way (or at all).
> 
> 1) Can a TXO commitment scheme be optimized sufficiently to be used directly
> without a commitment delay? Obviously it'd be preferable to avoid all the above
> complexity entirely.
> 
> 2) Is it possible to use a metric other than age, e.g. priority? While this
> complicates the pruning logic, it could use the UTXO set space more
> efficiently, especially if your goal is to prioritise bitcoin value-transfer
> over other uses (though if "normal" wallets nearly never need to use TXO
> commitment proofs to spend outputs, the infrastructure to actually do this may
> rot).
> 
> 3) Should UTXO archiving be based on a fixed size UTXO set, rather than an
> age/priority/etc. threshold?
> 
> 4) By fixing the problem (or possibly just "fixing" the problem) are we
> encouraging/legitimising blockchain use-cases other than BTC value transfer?
> Should we?
> 
> 5) Instead of TXO commitment proofs counting towards the blocksize limit, can
> we use a different miner fairness/decentralization metric/incentive? For
> instance it might be reasonable for the TXO commitment proof size to be
> discounted, or ignored entirely, if a proof-of-propagation scheme (e.g.
> thinblocks) is used to ensure all miners have received the proof in advance.
> 
> 6) How does this interact with fraud proofs? Obviously furthering dependency on
> non-cryptographically-committed STXO/UTXO databases is incompatible with the
> modularized validation approach to implementing fraud proofs.
> 
> 
> # References
> 
> 1) "Merkle Mountain Ranges",
>   Peter Todd, OpenTimestamps, Mar 18 2013,
>   https://github.com/opentimestamps/opentimestamps-server/blob/master/doc/merkle-mountain-range.md
> 
> 2) "Do we really need a mempool? (for relay nodes)",
>   Peter Todd, bitcoin-dev mailing list, Jul 18th 2015,
>   https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-July/009479.html
> 
> 3) "Segregated witnesses and validationless mining",
>   Peter Todd, bitcoin-dev mailing list, Dec 23rd 2015,
>   https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-December/012103.html
> 
> -- 
> https://petertodd.org 'peter'[:-1]@petertodd.org