[bitcoin-dev] death to the mempool, long live the mempool

ZmnSCPxj ZmnSCPxj at protonmail.com
Thu Oct 28 01:04:10 UTC 2021


Good morning Gloria, et al,


> > Removing the mempool would... naturally resolve all current issues inherent in package relay and rbf rules.
>
> Removing the mempool does not help with this. How does a miner decide whether a conflicting transaction is an economically-advantageous replacement or a DoS attack? How do you submit your CPFP if the parent is below the mempool minimum feerate? Do they already have a different relay/mempool implementation handling all of these problems but don't aren't sharing it with the rest of the community?

This seems an important key point: *even if* miners maintain some kind of "accept any transaction!!!" endpoint, the miner still wants to have *some* policy on how to evict transactions from its pool of transactions, for the simple reason that *everyone* has limited resources, even miners.

Perhaps the issue is that eviction is done *immediately*, i.e. if a node receives a transaction below some feerate treshhold, it immediately drops the transaction.

What if instead we did the eviction lazily?

Suppose we used something like a garbage collector.
Unconfirmed transactions are objects that point to other objects (i.e. the input of a transaction "points to" another object).
"Primitive" objects are UTXOs of *confirmed* transactions, i.e. the UTXO set at the block tip.
Then, a GC algorithm would start at some roots and then terminate when it reaches primitive objects.

I describe here an algorithm based on semispace GC, but the GC algorithm space is well-studied and other algorithms may also be devised (in particular, spam is likely to match quite well with "infant mortality" concept in GC, i.e. "most objects die young", so some kind of nursery / generational GC may work better against spam in practice).

A semispace GC has two "spaces" for memory.
One is the "from-space" and the other is the "to-space".
During normal operation, the "from-space" is used and the "to-space" is empty.
(Note that we can implement a "space" as a table (`std::map`) from txid to transaction, and avoid having to *actually* copy the transaction data; the important thing is having two spaces)
There is a maximum size that from-space and to-space can be.

As we receive transactions, we allocate them on the from-space.
Once the from-space is filled, we stop operations and perform a GC cycle.

We select "roots" by ordering all transactions in the from-space, from highest-feerate to lowest-feerate (figure out some way to handle ties later, maybe put a timestamp or monotonic counter?).
Starting with the highest-feerate tx, we trace all the transactions they refer to, recursively, copying them from from-space to to-space.
We stop once the to-space is filled more than half.

At this point, we drop all transactions in the from-space that are not already in to-space, and then delete the from-space.
Then we promote the to-space as the from-space, and continue our merry way, allocating more transactions.

(Nothing prevents doing this on disk rather than in memory; xref Eric Voskuil)

Note that the algorithm operates on individual transactions, not packages of transactions.
The algorithm is vulnerable to spam where the spammer creates several large low-feerate transactions, then anchors them all using a tiny high-feerate transaction (a "tall" attack).
It is also vulnerable to spam where the spammer creates several high-feerate transactions spending the same UTXO (a "wide" attack), thus preventing anyone else from getting any transactions into the mempool.

Against these exploit, we can mitigate by *first* moving objects to a smaller "packagespace" instead of directly on the to-space.
When tracing a new root, we first move the transactions that are not already in to-space to the packagespace, then measure the aggregate fee divided by the aggregate memory usage.
If this is below, say, half the feerate of the root transaction, then we drop the packagespace (put it back into from-space) and move on to the next root.
This protects against "tall" attacks.
To protect against "wide" attacks, if the packagespace consumes a TXO that is already consumed in the to-space, we also drop the packagespace (i.e. only retain the highest-feerate version in a "wide" attack).
Once the above checks pass, we merge the packagespace into the to-space.

This algorithm means that we do not need package relay; instead, we just send transactions in the correct order (parents first, then children), and if the receiver does not need to do a GC in-between, then everything ends up in the mempool.
If the child transaction is high-fee enough to be a root transaction, and pays enough that its feerate dominates in the packagespace result, then the entire sequence will remain in the mempool.

The algorithm allows for conflicting transactions to be retained in the mempool temporarily, until the next GC triggers (at which point conflicting transactions are resolved by whoever is higher-feerate).
This is helpful since a conflicting transaction may be what ends up getting confirmed in a block from a miner whose mempool did not contain the "best" feerate transaction.


WDYT?
Regards,
ZmnSCPxj


More information about the bitcoin-dev mailing list