[Bitcoin-development] Chain pruning

Pieter Wuille pieter.wuille at gmail.com
Thu Apr 10 16:59:54 UTC 2014

On Thu, Apr 10, 2014 at 6:47 PM, Brian Hoffman <brianchoffman at gmail.com> wrote:
> Looks like only about ~30% disk space savings so I see your point. Is there
> a critical reason why blocks couldn't be formed into "superblocks" that are
> chained together and nodes could serve a specific superblock, which could be
> pieced together from different nodes to get the full blockchain? This would
> allow participants with limited resources to serve full portions of the
> blockchain rather than limited pieces of the entire blockchain.

As this is a suggestion that I think I've seen come up once a month
for the past 3 years, let's try to answer it thoroughly.

The actual "state" of the blockchain is the UTXO set (stored in
chainstate/ by the reference client). It's the set of all unspent
transaction outputs at the currently active point in the block chain.
It is all you need for validating future blocks.
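
To make "all you need for validating future blocks" concrete, here is
a minimal sketch of applying one block to a UTXO set. The OutPoint and
Tx types and the apply_block function are hypothetical and heavily
simplified: no scripts, signatures, or subsidy rules, and a
transaction without inputs is simply treated as creating coins. It is
not how the reference client represents chainstate/.

```python
from dataclasses import dataclass

# Hypothetical, simplified types for illustration only.
@dataclass(frozen=True)
class OutPoint:
    txid: str
    index: int

@dataclass
class Tx:
    txid: str
    inputs: list   # OutPoints being spent
    outputs: list  # integer amounts being created

def apply_block(utxos: dict, block: list) -> dict:
    """Return the UTXO set after applying a list of Tx, or raise ValueError."""
    new = dict(utxos)  # leave the caller's set untouched if the block is invalid
    for tx in block:
        spent = 0
        for op in tx.inputs:
            if op not in new:
                raise ValueError(f"missing or already spent: {op}")
            spent += new.pop(op)
        # Assumption: a tx with no inputs is coinbase-like and may create coins.
        if tx.inputs and sum(tx.outputs) > spent:
            raise ValueError("outputs exceed inputs")
        for i, amount in enumerate(tx.outputs):
            new[OutPoint(tx.txid, i)] = amount
    return new
```

Note that the function only ever consults the current set: nothing
about older blocks is looked up once their effect has been applied.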

The problem is, you can't just give someone the UTXO set and expect
them to trust it, as there is no way to prove that it was the result
of processing the actual blocks.

As Bitcoin's full node uses a "zero trust" model, where (apart from
one detail: the order of otherwise valid transactions) it never
assumes any data received from the outside is valid, it HAS to see the
previous blocks in order to establish the validity of the current UTXO
set. This is what initial block syncing does. Nothing but the actual
blocks can provide this data, and it is why the actual blocks need to
be available. It does not require everyone to have all blocks, though:
they just need to have seen them during processing.
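
The "seen them during processing" point can be sketched as a sync
loop that consumes blocks one at a time and keeps only the tip hash
and the resulting UTXO set. Everything here is a toy assumption: the
block layout (prev_hash, spent, created), the block_hash
serialization, and the absence of proof of work and script checks.

```python
import hashlib

def block_hash(prev_hash: str, spent: list, created: list) -> str:
    # Toy serialization, not Bitcoin's block header format.
    data = "|".join([prev_hash, ",".join(spent), ",".join(created)])
    return hashlib.sha256(data.encode()).hexdigest()

def sync(blocks):
    """Consume blocks in order; return (tip_hash, utxo_set). Blocks are not stored."""
    tip = "0" * 64  # hypothetical "previous hash" of the first block
    utxos = set()
    for prev_hash, spent, created in blocks:
        if prev_hash != tip:
            raise ValueError("block does not extend current tip")
        for out in spent:
            if out not in utxos:
                raise ValueError(f"spends unknown output {out}")
            utxos.remove(out)
        utxos.update(created)
        tip = block_hash(prev_hash, spent, created)
    return tip, utxos
```

Because sync accepts any iterable, the blocks can come from a stream
and be dropped as soon as they are processed; the node still ends up
with a UTXO set it derived itself.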

A related, but not identical, evolution is merkle UTXO commitments.
This means that we shape the UTXO set as a merkle tree, compute its
root after every block, and require that the block commits to this
root hash (by putting it in the coinbase, for example). This means a
full node can copy the chain state from someone else, and check that
its hash matches what the block chain commits to. It's important to
note that this is a strict reduction in security: we're now trusting
that the longest chain (with most proof of work) commits to a valid
UTXO set (at some point in the past).
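
As a rough illustration of the commitment check, the sketch below
hashes a UTXO set into a merkle root and compares a copied snapshot
against a root taken from a block. The leaf encoding, the plain
binary tree with the last node duplicated on odd levels, and the
function names are all assumptions made for the example; actual
proposals use different tree structures.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def utxo_merkle_root(utxos: dict) -> bytes:
    """Root over entries sorted by key; utxos maps 'txid:index' -> amount."""
    level = [h(f"{k}={v}".encode()) for k, v in sorted(utxos.items())]
    if not level:
        return h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def snapshot_matches(utxos: dict, committed_root: bytes) -> bool:
    # Only proves the snapshot equals what the block committed to,
    # not that the committed set was itself the result of valid blocks.
    return utxo_merkle_root(utxos) == committed_root
```

A node that downloads a snapshot from a peer would call
snapshot_matches with the root found in the chain; a mismatch means
the peer's data is not what the chain committed to.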

In essence, combining both ideas means you get "superblocks" (the UTXO
set is essentially the summary of the result of all past blocks), in a
way that is less-than-currently-but-perhaps-still-acceptably-validated.

