[Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)

Fri Apr 5 05:50:58 UTC 2019

Good morning m.a.holden,

>
> I think you might have misunderstood what I was proposing, but it's probably my fault for not expressing it well. With my suggestion, all nodes continue to be equal participants and there are no special nodes. I used the term "endpoint" previously to mean one of the two nodes which a regular channel belongs to, and not to mean some kind of special node. Any node can open channels to any other node - the "buckets" are merely a strategy for locally organizing gossip so that it can be made more efficient. The term "buckets" is used in the descriptions of other DHTs such as Kademlia too, and they don't refer to splitting up the global network, they merely provide a perspective for looking at subsections of it.

This "bucket"ization, unfortunately, allows the possibility of targeted attacks within particular buckets (which I will describe later below).
It is this I refer to, when I bring up the possibility of politically-motivated attacks.

> I came up with the idea of using the quadtree specifically for trying to reduce the maxmimum (or typical) route length to query where possible, at the expense of storing much more information than existing DHTs, but trying to get reasonable savings on resources. Although the worst-case query cost is still O(log n), it seems that this is unlikely to occur and O(1) seems plausible for small depth.

I believe that for trees, lookup is only O(log n) if perfectly balanced, as all things should be.
For trees, worst case is always O(n) if the possibility of unbalanced trees exists.
And in a scenario where nobody controls the data that can be placed in a tree, you can always expect unbalanced trees in attacks.

Now, it seems to me what you propose, is to have octrees contain octrees, and so on.
Some node promises to keep track of the map, at some level of octree fineness.

Suppose there is a node I wish to attack, for whatever reason (probably politically-motivated).
Now, I know there are N nodes that have promised to keep track of the octant the target is on.

I can overload those N nodes by generating node addresses that also lie in that octant.
By just iterating over scalars, about 1/8 of the generated node addresses will lie in the target octant.
Once I get a node address in the targeted octant, I can commit a nominal amount of Bitcoin into some 2-of-2 with an existing actual node I control and the generated node address, then synthesize some node gossip containing those node addresses.
This unbalances the octree such that one octant has a far lerger number of nodes inside it than the other octants.

The N nodes that promised to keep track of the routemap within the octant find that they need to take up more and more memory to store the octant data because of this targeted attack.
Eventually, the N nodes start dropping out of this responsibility since they run out of resources.

Once the number of nodes that hold that octant drops, I can then switch to targeted DDoS attacks on the remaining octant-mapping nodes, especially easy to locate them since they openly broadcast the fact that they promise to map the octant they are in.

Now the octant becomes hard to access for 7/8 of the network.
I have successfully executed a partitioning attack and disrupted the operation of an entire octant.

This is not a situation I would like to enable, especially since we can expect politically-motivated attacks on Bitcoin and Lightning.

Now it is possible I misunderstand what you are proposing.
Is the above attack scenario plausible?

This is the primary reason why I think it is dangerous to split the network, even by just considering subsections of the network.
While we must at some point operate on network subsections as the network grows, I think we should hold the following position when designing:

* No node shall reveal the extent of its knowledge of the network, since if it reveals that it knows the entire network, it may become a target for attack in order to degrade the network.
   * This also implies that if a node does not know the location of some node, instead of admitting its ignorance, it should instead delegate to another node that might have better information; otherwise it would be possible to profile nodes to determine how much of the network they know.

The intent is not to have targets.
Nodes that openly broadcast that they know the nodes within the same octant they are, are nodes that want to be DDoS'ed.

Thus, my definition of a "special" node is a lot looser than you seem to define it.
Anything that makes it possible to point to a node and say "this is a good node to attack, in order to disrupt Lightning" is special.

I expressed, a similar opinion in the Trampoline Routing thread: https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-April/001967.html

> There are currently ~4000 nodes with an average of ~20 channels per node (~40,000 channels) which we need to keep information about at depth=0.
> <snip>

I appreciate you did the calculations, but I would like to point out that in an attack scenario, your computations will become meaningless as the attacker can trivially manufacture fake nodes to cause whatever buckets to overflow etc.

------

Now let us return, to my proposal of a distance measurement.
This effectively maps the LN universe onto a circle.
Each node commits itself to storing an arc of this circle, and possibly various other nodes it happens to be directly connected to that may be far from the arc near it.
Crucially no node admits to how large an arc of this circle they map out, only that they are more likely to map points nearer to themselves.

In order to disrupt a particular node, I must place fake nodes near that node, in an attempt to force its neighbors to reduce the arc of the circle that they can map.
However, I need to generate fake nodes that are nearer to that node than genuine honest nodes.
This means the probability of me generating such node addresses are much lower than 1/8, thus requiring that I devote more energy to generating the falsified attack nodes.

Further, in executing this attack, while I disrupt one node very well, and nearby nodes somewhat, my effect will be far less than disrupting 1/8 of the network.

Finally, nodes can easily use heuristics to filter out attack patterns that have been identified.
This is because nodes only commit to *probabilistically* being more likely to know nodes near to them, than nodes that are not near to them.
They do not commit, as in your proposal, to absolutely knowing everything within their committed area.
This lets nodes degrade their performance more gracefully.

Regards,
ZmnSCPxj