[Bitcoin-development] Squashing redundant tx data in blocks on the wire
keziahw at gmail.com
Thu Jul 17 21:35:35 UTC 2014
To improve block propagation, add a new block message that doesn't include
transactions the peer is known to have. The message must never require an
additional round trip due to any transactions the peer doesn't have, but
must be compatible with peers sometimes forgetting transactions they have
known.
For peers advertising support for squashed blocks: a node tracks what txes it
knows each peer has seen (inv received, tx sent, tx appeared in competing
block known to peer). Nodes push block contents as txes-not-already-known +
txids-known.
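
The push rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: transactions are modelled as (txid, raw bytes) pairs, the per-peer known set is a plain Python set, and `squash_block` / `unsquash_block` are hypothetical names. No wire format is implied.

```python
from typing import Dict, List, Set, Tuple, Union

Tx = Tuple[bytes, bytes]   # (txid, raw serialized tx); illustrative model only
Entry = Union[bytes, Tx]   # a bare txid reference, or a full transaction


def squash_block(block_txs: List[Tx], known_to_peer: Set[bytes]) -> List[Entry]:
    """Sender side: replace txes the peer is believed to have with their txid."""
    squashed: List[Entry] = []
    for txid, raw in block_txs:
        if txid in known_to_peer:
            squashed.append(txid)          # peer should already hold this tx
        else:
            squashed.append((txid, raw))   # send in full, no round trip needed
    return squashed


def unsquash_block(entries: List[Entry], local_txs: Dict[bytes, bytes]) -> List[Tx]:
    """Receiver side: fill in referenced txes from local storage, keeping order."""
    full: List[Tx] = []
    for entry in entries:
        if isinstance(entry, bytes):
            # A KeyError here means the sender's belief about us was wrong.
            full.append((entry, local_txs[entry]))
        else:
            full.append(entry)
    return full
```

Transaction order is preserved in both directions, so the receiver can recompute the merkle root over the reconstructed list.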
A node should be able to forget invs it has seen without invalidating what
peers know about its known txes. To allow for this, a node assembles a bloom
filter of a set of txes it is going to forget, and sends it to peers. The node
can forget the txes as soon as no blocks requested before the filter was
pushed are in flight (relying on the assumption that messages can be expected
to be processed in order).
When a node receives a forgotten-filter, it ORs it into its forgotten-filter
for that peer. Any transactions matching the forgotten-filter are always
included in full with a block. If the filter is getting full, the node can
just clear it along with peer.setTxKnown.
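
A minimal sketch of the receiving side of a forgotten-filter, assuming a fixed filter size shared by both peers. The class names, the 50% fill threshold, and the trick of slicing the txid for bit positions are all assumptions made for illustration, not part of the proposal.

```python
class ForgottenFilter:
    """Tiny Bloom filter kept as one big integer bitfield (illustrative)."""

    def __init__(self, n_bits: int = 1 << 16, n_hashes: int = 4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0

    def _positions(self, txid: bytes):
        # Assumption: txids are already uniformly distributed hashes, so
        # 4-byte slices of the txid stand in for cheap hash functions.
        for i in range(self.n_hashes):
            yield int.from_bytes(txid[4 * i:4 * i + 4], "little") % self.n_bits

    def add(self, txid: bytes) -> None:
        for pos in self._positions(txid):
            self.bits |= 1 << pos

    def __contains__(self, txid: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(txid))

    def merge(self, other: "ForgottenFilter") -> None:
        # ORing only makes sense for filters with identical parameters.
        assert (self.n_bits, self.n_hashes) == (other.n_bits, other.n_hashes)
        self.bits |= other.bits

    def fill_ratio(self) -> float:
        return bin(self.bits).count("1") / self.n_bits


class PeerState:
    """What we believe one peer knows, plus what it told us it forgot."""

    def __init__(self):
        self.tx_known = set()               # stands in for peer.setTxKnown
        self.forgotten = ForgottenFilter()

    def on_forgotten_filter(self, incoming: ForgottenFilter, max_fill: float = 0.5):
        self.forgotten.merge(incoming)
        if self.forgotten.fill_ratio() > max_fill:
            # Filter is getting full: drop it together with the known set.
            self.tx_known.clear()
            self.forgotten = ForgottenFilter()

    def can_squash(self, txid: bytes) -> bool:
        # Anything matching the forgotten-filter is always sent in full.
        return txid in self.tx_known and txid not in self.forgotten
```

A false positive in this filter only costs sending one transaction in full that could have been squashed, so correctness never depends on the filter being precise.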
Since the bloom filter is likely to grow slowly and can be dropped when it is
becoming full, a cheap set of hash functions and element size can be used to
keep overhead more restricted than the bloom filtering done for SPV. It's
important for testing txes against the filter to be fast so that it doesn't
delay pushing the block more than the squashing helps.
Nodes currently forget txes rarely, so the bloom filters would only need to
be used at all under conditions that are not currently common -- but I think
they're important to include to allow for different node behavior in this
regard in the future.
Tracking txes known to peers:
A multimap of txid->peerId would obviate the current setCurrentlyKnown, and
would not take much more space since each additional peer adds about 1 byte
per txid (setCurrentlyKnown keeps a uint256 per peer per txid, although it
tracks somewhat fewer txids per node).
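
As a rough illustration of the txid->peerId multimap, here is a sketch using a dictionary of sets. `KnownTxIndex` and its method names are hypothetical, and the memory figures discussed above do not apply to this Python model.

```python
from collections import defaultdict


class KnownTxIndex:
    """One shared txid -> {peer_id} map instead of a txid set per peer."""

    def __init__(self):
        self._peers_by_txid = defaultdict(set)

    def mark_known(self, txid: bytes, peer_id: int) -> None:
        # Called on inv received, tx sent, or tx seen in a block the peer knows.
        self._peers_by_txid[txid].add(peer_id)

    def peer_knows(self, txid: bytes, peer_id: int) -> bool:
        # .get avoids creating empty entries for unknown txids.
        return peer_id in self._peers_by_txid.get(txid, ())

    def forget_tx(self, txid: bytes) -> None:
        self._peers_by_txid.pop(txid, None)

    def forget_peer(self, peer_id: int) -> None:
        # E.g. on disconnect, or when the peer's forgotten-filter is reset.
        for txid in list(self._peers_by_txid):
            peers = self._peers_by_txid[txid]
            peers.discard(peer_id)
            if not peers:
                del self._peers_by_txid[txid]
```

Each txid is stored once regardless of how many peers know it, which is where the space argument above comes from.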
- Since the bloom filters will have lower maximum overhead than the current
filters and can be dropped at will, this shouldn't enable any resource
exhaustion attacks that aren't already possible.
- A squashed block with bogus or missing data would be easily detected, since
it would not produce the correct merkle root for its BlockHeader.
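
The merkle check mentioned in the last point could look roughly like this. It is a sketch that assumes txids are the double-SHA256 of the raw transaction bytes and that hashes are handled in internal byte order. `squashed_block_is_consistent` is a hypothetical helper name.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin txids and merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Bitcoin-style merkle root: pair up hashes, duplicating the last if odd."""
    if not txids:
        raise ValueError("a block has at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def squashed_block_is_consistent(raw_txs, header_merkle_root: bytes) -> bool:
    """Check a reconstructed tx list against the merkle root in the header."""
    txids = [dsha256(raw) for raw in raw_txs]
    return merkle_root(txids) == header_merkle_root
```

A receiver would run this after filling in the squashed transactions from its own storage. Any substituted, missing, or reordered transaction changes the computed root.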
Assuming a fairly typical 500 tx block with transaction sizes averaging 300
bytes (both on the low side), for a 150 kB block, and replacing each pruned
transaction with its 32-byte txid:
% of txes pruned | block size reduction | relative size reduction
---------------- | -------------------- | -----------------------
100              | 134 kB               | 89%
50               | 67 kB                | 45%
25               | 33.5 kB              | 22%
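
The figures can be reproduced with a few lines of arithmetic, assuming each pruned transaction is replaced by its 32-byte txid and ignoring any per-message framing overhead. The function name and defaults are illustrative.

```python
TXID_BYTES = 32  # size of the reference that replaces a pruned transaction


def savings(n_tx: int = 500, avg_tx_bytes: int = 300, pruned_fraction: float = 1.0):
    """Return (bytes saved, fraction of the full block size saved)."""
    block_bytes = n_tx * avg_tx_bytes
    saved = n_tx * pruned_fraction * (avg_tx_bytes - TXID_BYTES)
    return saved, saved / block_bytes


for frac in (1.0, 0.5, 0.25):
    saved, rel = savings(pruned_fraction=frac)
    print(f"{frac:.0%} pruned: {saved / 1000:.1f} kB saved, {rel:.0%} of block")
```

With the defaults this gives 134 kB (89%), 67 kB (45%) and 33.5 kB (22%) for 100%, 50% and 25% pruned respectively.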
I've been doing some logging, and when my node pushes a block to a peer it
seems to typically know that the peer has seen most of the txes in the block.
Even in the case of a small block with only 25% known-known transactions,
total bandwidth saved is greater than the bloom filters transmitted unless a
node is forgetting transactions so rapidly that it pushes new maximum-size
forget-filters every block.
So this is a net gain even in total bandwidth usage, but most importantly
it's an improvement in block propagation rate and in how block propagation
rate scales with additional transactions.
- How should block squashing capability be advertised -- new service bit?
- How fast could testing against a suitable bloom filter be made?
- How much memory would each filter need to take, at maximum?
- Can the inputs all being 32 byte hashes be used to optimize filter hash
calculations?
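
As a starting point for the memory question, the textbook Bloom filter sizing formulas give a rough bound. This sketch assumes ideal independent hash functions. The item count and false-positive rate in the usage note are placeholders, not measurements.

```python
import math


def bloom_parameters(n_items: int, false_positive_rate: float):
    """Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes."""
    m_bits = math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


m_bits, k_hashes = bloom_parameters(10_000, 0.01)
print(f"{m_bits} bits (~{m_bits / 8 / 1000:.1f} kB), {k_hashes} hash functions")
```

For 10,000 forgotten txes at a 1% false-positive rate this comes to roughly 12 kB and 7 hash functions. A false positive here only means one extra transaction sent in full, so a much looser rate and a smaller filter may be acceptable.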
If there's support for this proposal, I can begin working on the specific
implementation details, such as the bloom filters, message format, and
capability advertisement, and draft a BIP once I have a concrete proposal for
what those would look like and a corresponding precise cost/benefit analysis.