[Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)

Thu Apr 4 07:08:44 UTC 2019

Good morning ZmnSCPxj, thanks for the response.

> I would be hesitant to divide the world in such manner.
> I understand that in typical computer science, splitting your objects up into smaller parts is a long-accepted method of doing things.

> However, when it comes to finances and political power (the power to censor or disrupt), such splitting is highly undesirable.

> Thus I am strongly allergic to things that would "hierarchical"ize or "bucket"ize or even just have a separation between "endpoint" and "core" network.

> I would rather propose for acceptance into BOLT, such proposals that would keep the network homogeneous.

Firstly, I completely agree with you that we should not be splitting up the network in any way which nodes or channels can be politically targeted, or in which any node is designated special privelege or status. I have the same allergy as you and took this into considerations when proposing this, and some of your suggestions are along a simlar  line of thinking as mine.

I think you might have misunderstood what I was proposing, but it's probably my fault for not expressing it well. With my suggestion, all nodes continue to be equal participants and there are no special nodes. I used the term "endpoint" previously to mean one of the two nodes which a regular channel belongs to, and not to mean some kind of special node. Any node can open channels to any other node - the "buckets" are merely a strategy for locally organizing gossip so that it can be made more efficient. The term "buckets" is used in the descriptions of other DHTs such as Kademlia too, and they don't refer to splitting up the global network, they merely provide a perspective for looking at subsections of it.

I'm certainly interested in any solutions which keep the network permissionless because we really don't want to end up with DNS 2.0 or similar.

> Various nodes may have different resources (BTC, CPU, memory, bandwidth, latency).
> Such inequalities are inevitable in this universe.
> But these nodes should still, as much as we can, remain peers on the network.

My suggestion accounts for the difference in computational requirements, as each node can determine its own depth in  the tree based on the approximate quantity of information it wishes to gossip. A node could filter information to whichever depth of the tree they wished, by setting the bucket size at which they spill. This also allows for the size of gossip to dynamically shrink as the network grows, and is similar to garbage collection, in which anything which isn't part of the destination bucket on spilling is purged.

Nodes could also pick (multiple) specific quadtree buckets to communicate all gossip about through filters they negotiate (the filter being the hash prefix of the desired bucket). It might be desirable to broadcast their filter as part of the gossip itself, so that other nodes can learn who are the better information providers, but again this would be completely optional.

---

> This rule can be probabilistically fulfilled by having each node N know *at least* routes to some nodes Ls..., where for all L <- Ls, dist(N, L) < some constant.
> The constant here can be selected by each node independently, depending on its memory and CPU capacity.
(the node knowing more routes than that is perfectly fine)

The quadtree proposes a similar idea, but instead of using some constant, N knows about routes to Ls, where each L is within the same bucket. Essentially, i < dist(N, L) <= j, where i=BUCKET_MIN and j=BUCKET_MAX. For example, if using 8-bit ids and the first two bits identify the bucket, then i=00000000, j=00111111. I guess the equivalent distance using your constant would be k=00100000, to cover the same range but without discriminating the results based on any boundaries like my suggestion does.

> Then, we can have the following global rule:
> * Given three nodes X, Y, and Z, and dist(X, Z) < dist(Y, Z), then X SHOULD be more likely to know a route from itself to Z, than Y to know a route from itself to Z, is also met for nodes which are in different buckets.

This rule is essentially the same as what I was thinking for the distance between buckets. With the autopilot suggestion of decreasing the number of channels opened as distance increases, the probability of knowing about a neighbouring bucket is increased compared with a distant one.

---

> It looks for any node L <- Ls such that dist(L, D) < dist(N, D).
> In fact, it can just sort those nodes according to dist(L, D) and start with the lowest distance, and fail once it reaches a dist(L, D) that exceeds its own dist(N, D).
> The above algorithm converges since dist(N, D) for each N that is delegated to will progressively get smaller until we reach some N that knows the destination D.

This proposal also seems very similar to how existing DHTs work, unless I'm mistaken. My problem with the approach of existing DHTs is that they are suboptimal for the number of queries which must be made to find a route - which is worst-case O(log n) for example, in Chord or Kademlia. This isn't a problem for things like file-sharing where latency isn't a major concern, but we don't really want to be waiting for a bunch of queries if we're stood in queue to pay for a coffee. (Given also that some routes may fail due to inherent constraints of the LN itself). As the network grows, the efficiency of route finding declines too.

I came up with the idea of using the quadtree specifically for trying to reduce the maxmimum (or typical) route length to query where possible, at the expense of storing much more information than existing DHTs, but trying to get reasonable savings on resources. Although the worst-case query cost is still O(log n), it seems that this is unlikely to occur and O(1) seems plausible for small depth.

The other expense is that this approach will not find the most optimal routes, as it prioiritizes considering the smallest number of buckets. However, it is not possible to know the most optimal path without knowing about the entire network topology anyway, so this problem exists with the DHT too. I've optimized for reduced querying.

> All nodes remain peers, there are no special "flare nodes", no special "knows the entire map" nodes, no "knows my octant" nodes, no "endpoint" nodes, no hubs and spokes.

There are no special nodes in my approach, only a commitment to maintain information about nodes (and their channels) whose PKH prefix matches your own at the depth you have chosen to gossip, regardless of whether or not they've advertized the same feature. A node expressing depth=0 would be a "knows the entire map" node, but it does not give them any special status.

> The network remains homogeneous and all are still peers, some might have smaller routemaps than others, but all are still equal in the network, as all things should be.

All nodes are still equal, but the network isn't entirely homogeneous (in the distribution of channels) because of the autopilot suggestions which prioritize local channels and deprioritize distant channels, with the goal of improving efficiency. I guess there is some trade-off with this approach, but it's still worth considering, because "perfectly unstructured" isn't the end goal - there may be some middle ground between that and other approaches which tend towards centralization.

It may even be harmful to try to keep it perfectly unstructured, because with no structure, the network will effectively be "maximally inefficient" for those not maintaining sufficient gossip information. If by default, it is inefficient to find a route, then there will inevitably be somebody who will fill that market gap. Big players will have no problem keeping knowledge of the whole network and giving information about paths in just one query - in exchange for people's data. Like DNS.

A semi-structured approach might provide enough incentive for everyone to keep as much information locally as realistic for the resources they have, and like Adam Smith's invisible hand, by doing so they not only benefit themselves, but they benefit everybody by improving the efficiency of the network overall. The network would become "reasonably efficient" for everyone, rather than inefficient for most except the few with sufficient resources to maintain a useful routemap.

---

I'll try to convey by example how I think censorship should not be cause for concern with this quadtree, and what savings might be expected. This makes some crude assumptions using the current network size, with my previous suggestions for autopilot (ie, nodes open 1/2 their channels to nodes in their own bucket, and decreasing with distance), and assumes uniform distribution of node ids (we assume there is no numeric bias in the hashes of public keys).

There are currently ~4000 nodes with an average of ~20 channels per node (~40,000 channels) which we need to keep information about at depth=0.

If we spill to depth=1, then ~1,000 nodes will go into each bucket. If each has (on average) ~10 channels to other nodes in the same bucket, then there will be ~5,000 "intra-bucket" channels in each bucket. Each node also has ~10 channels to nodes in other buckets ("inter-bucket" channels), which is another ~10,000 channels it maintains information for.

Note that "intra-bucket" and "inter-bucket" channels are just plain channels - there's nothing special about them and they are merely the difference in perspective which each node will view them based on whether either node's PKH prefix is the same bucket as their own PKH prefix at the depth which they filter gossip.

Since we've filtered out all channels where neither node is in our own bucket, we now only need to know about ~15,000 channels in total, compared to the original 40,000 (~37.5% of total channels in the network). We still need to know about 1,000 nodes in our own bucket, and we keep information about the nodes at the other end of the 10,000 inter-bucket channels we know about, which could still be anything up to 100% of the nodes in the global network, but nodes will be filtered if none of the 10,000 inter-bucket channels we know about belong to them, so nodes we track information about is 1000 < N <= 4000.

We don't suddenly have no fault-tolerance or a censorship threat to find a node in another bucket here. Therea are still ~3,333 known channels on average between our bucket and each of the other 3 buckets at depth=1. Since they keep all of the gossip for their own bucket, then the chance that any of them know the destination for a payment is likely - so the number of queries you would typically need to make is 1 (or several *in parallel*). The node you query should know the remaining route from himself to the destination. You may chose to query more information for potential privacy enhancement.

If we spill to depth=2, then we now have 16 total leaf-buckets, with ~250 nodes each. Each bucket would have ~1250 intra-bucket channels, and ~2500 inter-bucket channels. Information requirement per bucket is now ~3750 channels (~10% of the total network, which is a reasonable saving). Of the inter-bucket channels, 1/2 of them are to the 3 sibling buckets, making ~400 potential routes to any of the siblings, and the other 1/2 are to "cousin" buckets, of which there are 12, resulting in ~100 channels on average to those buckets. Still likely sufficient to have a typical query of 1, with fault-tolerance.

At depth=3 it is ~62 nodes per bucket, ~310 intra-bucket channels, ~620 inter-bucket channels and 64 possible leaf-buckets. resulting in ~100 channels to each sibling, ~13 channels to each cousin, and just 1 or 2 channels to each second-cousin (the most distant buckets).

Only at this point does querying start to become potentially expensive if you want to make the payment to a distant node. You might still have some direct routes to the second-cousin buckets, but not much fault-tolerance. However, your cousins or the siblings of your second-cousin have a higher probability of knowing about a node at that distance than local nodes will, so you still have numerous options for discovering a node but it might take multiple queries. Queries would use a similar greedy algorithm approach to the one you have suggested. It should still be significantly less than worst-case query cost.

Resource constrained devices might spill to these limits at the cost of requiring more queries for payments. In practice, you probably wouldn't go beyond depth=2 as above unless the global network gets sufficiently large that bandwidth requirements at depth=2 are a problem, which probably isn't going to be the case until the network is a magnitude larger than it already is - and in which case the number of "inter-bucket" channels will be tenfold the current amount and spilling to depth=4 or 5 may become plausible.

There is also potential for analysts to find out which buckets do not have any, or have few channels between them, and to take the opportunity to fill that gap to try and benefit from routing fees, as those new channels would be prioritized over longer routes which span multiple buckets. The autopilot suggestions are only a starting point, but eventually it will be mostly market driven.

---

My idea is not fully researched, but since the topic was raised, I decided to share my incomplete thoughts about it. The choice of a quadtree itself was quite arbitrary, but it seemed reasonable after briefly considering alternative arity trees, and the potential for linking this to approximate geographic location to optimize local payments seemed like it could be valuable too.

I'll give some more thought to your suggestions and reconsider my own with these new suggestions.

Regards,
Mark H