[Lightning-dev] Quick analysis of channel_update data

Tue Jan 8 05:23:10 UTC 2019

Fabrice Drouin <fabrice.drouin at acinq.fr> writes:
> Follow-up: here's more detailed info on the data I collected and
> potential savings we could achieve:
>
> I made hourly routing table backups for 12 days, and collected routing
> information for 17 000 channel ids.
>
> There are 130 000 different channel updates: on average each channel
> has been updated 8 times. Here, “different” means that at least the
> timestamp has changed, and a node would have queried this channel
> update during its syncing process.

Side note: some implementations are also sending out updates with the
*same* timestamp.  This is not allowed...

> But only 18 000 pairs of channel updates carry actual fee and/or HTLC
> value change. 85% of the time, we just queried information that we
> already had!

Note that this can happen in two legitimate cases:
1. The weekly refresh of channel_update.
2. A node updated too fast (A->B->A) and the ->A update caught up with the
   ->B update.

Fortunately, this seems fairly easy to handle: discard the newer
duplicate (unless > 1 week old).  For future more advanced
reconstruction schemes (eg. INV or minisketch), we could remember the
latest timestamp of the duplicate, so we can avoid requesting it again.
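
A minimal sketch of that rule, in Python for illustration only. The names
(ChannelUpdate, GossipStore, handle) are hypothetical and not taken from any
implementation. It assumes "duplicate" means same fees and HTLC min/max as the
update we already hold, and it reads "unless > 1 week old" as "unless the
update we hold is more than a week older than the new one"; a real node would
probably compare more fields (e.g. cltv_expiry_delta and the disable flag).

```python
from dataclasses import dataclass, replace

WEEK = 7 * 24 * 3600  # seconds


@dataclass(frozen=True)
class ChannelUpdate:
    # Hypothetical, trimmed-down view of a channel_update.
    short_channel_id: int
    direction: int
    timestamp: int
    fee_base_msat: int
    fee_proportional_millionths: int
    htlc_minimum_msat: int
    htlc_maximum_msat: int

    def content(self):
        # Assumption: only fees and HTLC min/max count as "actual" changes.
        return (self.fee_base_msat, self.fee_proportional_millionths,
                self.htlc_minimum_msat, self.htlc_maximum_msat)


class GossipStore:
    def __init__(self):
        self.updates = {}      # (scid, direction) -> stored ChannelUpdate
        self.latest_seen = {}  # (scid, direction) -> newest timestamp seen

    def handle(self, new):
        """Return True if `new` replaces the stored update."""
        key = (new.short_channel_id, new.direction)
        old = self.updates.get(key)
        if old is not None and new.timestamp <= old.timestamp:
            return False  # stale, or same timestamp (not allowed)
        # Remember the newest timestamp even if we discard the update,
        # so a later sync need not request it again.
        self.latest_seen[key] = max(self.latest_seen.get(key, 0),
                                    new.timestamp)
        if (old is not None and new.content() == old.content()
                and new.timestamp - old.timestamp <= WEEK):
            return False  # newer duplicate: discard
        self.updates[key] = new
        return True


store = GossipStore()
first = ChannelUpdate(1, 0, 1000, 1000, 1, 1000, 10**9)
store.handle(first)                                    # stored
store.handle(replace(first, timestamp=1000 + 3600))    # duplicate, discarded
store.handle(replace(first, timestamp=1001 + WEEK))    # refresh, stored
```

Keeping latest_seen separate from the stored update is what would let a
timestamp-based query skip the discarded duplicate later on.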

> Adding a basic checksum (4 bytes for example) that covers fees and
> HTLC min/max value to our channel range queries would be a significant
> improvement and I will add this to the open BOLT 1.1 proposal to extend
> queries with timestamps.
>
> I also think that such a checksum could be used
> - in “inventory” based gossip messages
> - in set reconciliation schemes: we could reconcile [channel id |
> timestamp | checksum] first

I think this is overkill?
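
For reference, the basic 4-byte checksum over fees and HTLC min/max from the
quoted proposal could look roughly like the sketch below. The quoted text does
not name an algorithm, so the use of CRC32 (via Python's zlib) is an
assumption, as is the function name update_checksum. Field widths follow the
channel_update layout in BOLT 7 (two 32-bit fee fields, two 64-bit HTLC
fields, big-endian).

```python
import struct
import zlib


def update_checksum(fee_base_msat, fee_proportional_millionths,
                    htlc_minimum_msat, htlc_maximum_msat):
    """Hypothetical 4-byte checksum over the fee and HTLC min/max fields."""
    payload = struct.pack(">IIQQ",
                          fee_base_msat,
                          fee_proportional_millionths,
                          htlc_minimum_msat,
                          htlc_maximum_msat)
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    return zlib.crc32(payload)


a = update_checksum(1000, 1, 1000, 10**9)
b = update_checksum(1000, 1, 1000, 10**9)  # same values, later timestamp
c = update_checksum(1001, 1, 1000, 10**9)  # fee changed
print(a == b, a == c)
```

Because the timestamp is left out of the payload, two updates that differ
only in timestamp yield the same checksum, which is the case a peer would
want to skip when querying.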

Thanks,
Rusty.