[bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

Olaoluwa Osuntokun laolu32 at gmail.com
Thu Nov 9 23:44:07 UTC 2017


Hi y'all,

Since my last email we've made a number of changes to the BIP. The changes
made
were driven by the feedback we've received so far in this thread, and also
as a
result of real-world testing using this new proposal as the basis for our
light
weight LN node which powers the demo Lightning desktop application we
recently
released.

A highlight of the changes made between this version and the last follows:

  * We've removed the modulus operation in the inner loop when constructing
    filters. This has been replaced with an alternative, more efficient
    mapping[1] as suggested by gmaxwell and sipa. In our implementation, we
    perform the operation in a piece-wise fashion by hand. Alternative
    implementations can take advantage of 128-bit arithmetic extensions on
    supporting CPU's.

  * The txid has been moved from the extended filter to the regular filter.
    During out testing of the new light client with our LN node
implementation,
    we found that we were able to reduce network traffic as we only need the
    extended filter for rare on-chain events.

  * We now use the 6th service bit. We realized that the bit we had chosen
    prior was already being used to signal support of x-thin block syncing.
To
    select this bit number, we ran a scanner on the addrman of our nodes and
    also the network to fin da bit that wasn't used widely.

  * An error in the BIP that didn't include the public key script of
coinbase
    transactions in the filter has been fixed.

  * An error in the BIP when constructing the initial "genesis" filter has
been
    fixed.

  * We no longer use the ProtocolVersion field in the getcfheaders message
or
    its response.

  * The specification of several newly defined messages were incorrect and
have
    been fixed.

  * A number of typos spotted by several reviewers have been fixed.

The full commit history of the BIP draft can be found here:
https://github.com/Roasbeef/bips/commits/gcs-bip-draft

At this point, we're ready to make a PR against the official BIP repo and to
request a number to be assigned to our proposal. Thanks to all those that
have
reviewed, and contributed to the proposal!

[1]:
https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/

-- Laolu


On Thu, Jun 8, 2017 at 8:59 PM Olaoluwa Osuntokun <laolu32 at gmail.com> wrote:

> Hi y'all,
>
> Thanks for all the comments so far!
>
> I've pushed a series of updates to the text of the BIP repo linked in the
> OP.
> The fixes include: typos, components of the specification which were
> incorrect
> (N is the total number of items, NOT the number of txns in the block), and
> a
> few sections have been clarified.
>
> The latest version also includes a set of test vectors (as CSV files),
> which
> for a series of fp rates (1/2 to 1/2^32) includes (for 6 testnet blocks,
> one of
> which generates a "null" filter):
>
>    * The block height
>    * The block hash
>    * The raw block itself
>    * The previous basic+extended filter header
>    * The basic+extended filter header for the block
>    * The basic+extended filter for the block
>
> The size of the test vectors was too large to include in-line within the
> document, so we put them temporarily in a distinct folder [1]. The code
> used to
> generate the test vectors has also been included.
>
> -- Laolu
>
> [1]: https://github.com/Roasbeef/bips/tree/master/gcs_light_client
>
>
> On Thu, Jun 1, 2017 at 9:49 PM Olaoluwa Osuntokun <laolu32 at gmail.com>
> wrote:
>
>> > In order to consider the average+median filter sizes in a world worth
>> larger
>> > blocks, I also ran the index for testnet:
>> >
>> >     * total size:  2753238530
>> >     * total avg:  5918.95736054141
>> >     * total median:  60202
>> >     * total max:  74983
>> >     * regular size:  1165148878
>> >     * regular avg:  2504.856172982827
>> >     * regular median:  24812
>> >     * regular max:  64554
>> >     * extended size:  1588089652
>> >     * extended avg:  3414.1011875585823
>> >     * extended median:  35260
>> >     * extended max:  41731
>> >
>>
>> Oops, realized I made a mistake. These are the stats for Feb 2016 until
>> about a
>> month ago (since height 400k iirc).
>>
>> -- Laolu
>>
>>
>> On Thu, Jun 1, 2017 at 12:01 PM Olaoluwa Osuntokun <laolu32 at gmail.com>
>> wrote:
>>
>>> Hi y'all,
>>>
>>> Alex Akselrod and I would like to propose a new light client BIP for
>>> consideration:
>>>    *
>>> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>>>
>>> This BIP proposal describes a concrete specification (along with a
>>> reference implementations[1][2][3]) for the much discussed client-side
>>> filtering reversal of BIP-37. The precise details are described in the
>>> BIP, but as a summary: we've implemented a new light-client mode that
>>> uses
>>> client-side filtering based off of Golomb-Rice coded sets. Full-nodes
>>> maintain an additional index of the chain, and serve this compact filter
>>> (the index) to light clients which request them. Light clients then fetch
>>> these filters, query the locally and _maybe_ fetch the block if a
>>> relevant
>>> item matches. The cool part is that blocks can be fetched from _any_
>>> source, once the light client deems it necessary. Our primary motivation
>>> for this work was enabling a light client mode for lnd[4] in order to
>>> support a more light-weight back end paving the way for the usage of
>>> Lightning on mobile phones and other devices. We've integrated neutrino
>>> as a back end for lnd, and will be making the updated code public very
>>> soon.
>>>
>>> One specific area we'd like feedback on is the parameter selection.
>>> Unlike
>>> BIP-37 which allows clients to dynamically tune their false positive
>>> rate,
>>> our proposal uses a _fixed_ false-positive. Within the document, it's
>>> currently specified as P = 1/2^20. We've done a bit of analysis and
>>> optimization attempting to optimize the following sum:
>>> filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
>>> has made a JS calculator that allows y'all to explore the affect of
>>> tweaking the false positive rate in addition to the following variables:
>>> the number of items the wallet is scanning for, the size of the blocks,
>>> number of blocks fetched, and the size of the filters themselves. The
>>> calculator calculates the expected bandwidth utilization using the CDF of
>>> the Geometric Distribution. The calculator can be found here:
>>> https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
>>> script he's been running on actual data, and the results seem to match up
>>> rather nicely.
>>>
>>> We we're excited to see that Karl Johan Alm (kallewoof) has done some
>>> (rather extensive!) analysis of his own, focusing on a distinct encoding
>>> type [5]. I haven't had the time yet to dig into his report yet, but I
>>> think I've read enough to extract the key difference in our encodings:
>>> his
>>> filters use a binomial encoding _directly_ on the filter contents, will
>>> we
>>> instead create a Golomb-Coded set with the contents being _hashes_ (we
>>> use
>>> siphash) of the filter items.
>>>
>>> Using a fixed fp=20, I have some stats detailing the total index size, as
>>> well as averages for both mainnet and testnet. For mainnet, using the
>>> filter contents as currently described in the BIP (basic + extended), the
>>> total size of the index comes out to 6.9GB. The break down is as follows:
>>>
>>>     * total size:  6976047156
>>>     * total avg:  14997.220622758816
>>>     * total median:  3801
>>>     * total max:  79155
>>>     * regular size:  3117183743
>>>     * regular avg:  6701.372750217131
>>>     * regular median:  1734
>>>     * regular max:  67533
>>>     * extended size:  3858863413 <(385)%20886-3413>
>>>     * extended avg:  8295.847872541684
>>>     * extended median:  2041
>>>     * extended max:  52508
>>>
>>> In order to consider the average+median filter sizes in a world worth
>>> larger blocks, I also ran the index for testnet:
>>>
>>>     * total size:  2753238530
>>>     * total avg:  5918.95736054141
>>>     * total median:  60202
>>>     * total max:  74983
>>>     * regular size:  1165148878
>>>     * regular avg:  2504.856172982827
>>>     * regular median:  24812
>>>     * regular max:  64554
>>>     * extended size:  1588089652
>>>     * extended avg:  3414.1011875585823
>>>     * extended median:  35260
>>>     * extended max:  41731
>>>
>>> Finally, here are the testnet stats which take into account the increase
>>> in the maximum filter size due to segwit's block-size increase. The max
>>> filter sizes are a bit larger due to some of the habitual blocks I
>>> created last year when testing segwit (transactions with 30k inputs, 30k
>>> outputs, etc).
>>>
>>>      * total size:  585087597
>>>      * total avg:  520.8839608674402
>>>      * total median:  20
>>>      * total max:  164598
>>>      * regular size:  299325029
>>>      * regular avg:  266.4790836307566
>>>      * regular median:  13
>>>      * regular max:  164583
>>>      * extended size:  285762568
>>>      * extended avg:  254.4048772366836
>>>      * extended median:  7
>>>      * extended max:  127631
>>>
>>> For those that are interested in the raw data, I've uploaded a CSV file
>>> of raw data for each block (mainnet + testnet), which can be found here:
>>>      * mainnet: (14MB):
>>> https://www.dropbox.com/s/4yk2u8dj06njbuv/mainnet-gcs-stats.csv?dl=0
>>>      * testnet: (25MB):
>>> https://www.dropbox.com/s/w7dmmcbocnmjfbo/gcs-stats-testnet.csv?dl=0
>>>
>>>
>>> We look forward to getting feedback from all of y'all!
>>>
>>> -- Laolu
>>>
>>>
>>> [1]: https://github.com/lightninglabs/neutrino
>>> [2]: https://github.com/Roasbeef/btcd/tree/segwit-cbf
>>> [3]: https://github.com/Roasbeef/btcutil/tree/gcs/gcs
>>> [4]: https://github.com/lightningnetwork/lnd/
>>>
>>> -- Laolu
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20171109/9c95b929/attachment-0001.html>


More information about the bitcoin-dev mailing list