[Lightning-dev] General question on routing difficulties

Sat Nov 25 19:16:12 UTC 2017

(final try as the prior mail hit the size limit, sorry for the spam!)

Hi Pedro,

I came across this paper a few weeks ago, skimmed it lightly, and noted a
few interesting aspects I wanted to dig into later. Your email reminded me
to re-read the paper, so thanks for that! Before reading the paper, I
wasn't aware of the concept of coordinate embedding, nor how that could be
leveraged in order to provide sender+receiver privacy in a payment network
using a distance-vector-like routing system. Very cool technique!

After reading the paper again, my current conclusion is that while the
protocol presents some novel traits in the design a routing system for
payment channel based networks, it lends much better to a
closed-membership, credit network, such as Ripple (which is the focus of
the paper).

In Ripple, there are only a handful of gateways, and clients that seek to
interact with the network must chose their gateways *very* carefully,
otherwise consensus faults can occur, violating safety properties of the
network. It would appear that this gateway model nicely translates well to
the concept of landmarks that the protocol is strongly dependant on.
Ideally, each gateway would be a landmark, and as there are a very small
number of gateways within Ripple (as you must be admitted to be a verified
gateway in the network), then parameter L (the total number of landmarks)
is kept small which minimizes routing overhead, the average path-length,
etc.

When we compare Ripple to LN, we find that the two networks are nearly
polar opposites of each other. LN is an open-membership network that
requires zero initial configuration by central administrators(s). It more
closely resembles *debit* network (a series of tubes of money), as the
funds within channels must be pre-committed in order to establish a link
between two nodes, and cannot be increased without an additional on-chain
control transaction (to add or remove funds). Additionally, AFAIK (I'm no
expert on Ripple of course), there's no concept of fees within the
network. While within LN, the fee structure is a critical component of the
inventive for node operators to lift their coins onto this new layer to
provider payment routing services.  Finally, in LN we rely on time-locks
in order to ensure that all transactions are atomic which adds another set
of constraints. Ripple has no such constraint as transfers are based on
bi-lateral trust.

With that said, the primary difference between this protocol is that
currently we utilize a source-routed system which requires the sender to
know "most" of the path to the destination. I say "most" as currently,
it's possible for the receiver of a payment to use a poor man's rendezvous
system to provide the sender with a set of suffix paths form what one can
consider ad-hoc landmarks. The sender can then concatenate these with
their own paths, and construct the Sphinx routing package which encodes
the full route. This itself only gives sender privacy, and the receiver
doesn't know the identity of the sender, but the sender learns the
identity of the receiver.

We have plans to achieve proper sender/receiver privacy by extending our
Sphinx usage to leverage HORNET, such that the payment descriptor (payment
request containing details of the payment) also includes several paths
from rendezvous nodes (Rodrigo's) to the receiver. The rendezvous route
itself will be nested as a further Anonymous Header (AHDR) which includes
the information necessary to complete the onion circuit from Rodrigo to
the receiver. As onion routing is used, only Rodrigo can decrypt the
payload and finalize the route. With such a structure, the only nodes that
need to advertise their channels are nodes which seek to actively serve as
channel routers. All other nodes (phones, laptops, etc), don't need to
advertise their channels to the greater network, reducing the size of the
visible network, and also the storage and validation overhead. This serves
to extend the "scale ceiling" a bit.

My first question is: is it possible to adapt the protocol to allow each
intermediate node to communicate their time lock and fee references to the
sender? Currently, as the full path isn't known ahead of time, the sender
is unable to properly craft the timelocks to ensure safety+atomicity of
the payment. This would mean they don't know what the total timelock
should be on the first outgoing link. Additionally, as they don't know the
total path and the fee schedule of each intermediate node, then once
again, they don't know how much to send on the first out going link. It
would seem that one could extend the probing phase to allow backwards
communication by each intermediate node back to the sender, such that they
can properly craft a valid HTLC. This would increase the set up costs of
the protocol however, and may also increase routing failures as it's
possible incompatibilities arise at run-time between the preferences of
intermediate nodes. Additionally, routes may fail as an intermediate node
consumes too many funds as their fee, causing the funds to be insufficient
when it reaches the destination. One countermeasure would maybe: the
sender always sends waaay more than necessary, and gives the receiver a
one-time payment identifier, requiring that they route the remainder of
the funds *back* to them.

To solve this issue presently, we extend the header in Sphinx to include a
per-hop payload which allows the sender to precisely dictate the
structure of the route, allows the intermediate nodes to authenticate the
information given to it, and also allow the intermediate node to verify
that their policies have properly been respected. These payloads can also
be utilized by applications to communicate a small-ish amount of data to
construct higher-level protocols on top of the system. Examples include:
cross-chain swaps, chance payment games, higher-level B2B protocols,
flavors of ZKCP's, media streaming, internet access proxying, etc.

>From my point-of-view, when extended to LN, the core component of the
protocol (landmarks), becomes the weakest component. From my reading,
*all* nodes need to be ware of an *identical* set of landmarks (more or
less similar to the desired homogeneity of Gateways), otherwise the
coordinate embedding scheme breaks down. Currently, there's no requirement
that all nodes have a globally consistent view of the network. So then an
important questions arises: who choose the landmarks? A desirable property
of a routing system for LN (IMO) is that is has close to zero required
initial set up by a central administrator. With this protocol, it would
seem that all nodes much ship with a hard coded set of global landmarks
for the path finding to succeed.  This itself pins a hard coordination
requirement amongst implementers to have something like this deployed.
Even ignoring this requirement for a minute, I see several other
downsides:

   * As *all* payments must flow through landmarks (since nodes break up
     their payment into L sub-flows), the landmarks must be very, very
     well capitalized. This would cause strong consolidation of the
     selection of landmarks, as they need extremely large channels in
     order to facilitate transfer within the network.

   * As landmarks must be globally known, this it seems this would
     introduce fragility in the network. If most of the landmarks go down
     (fails stop crashes) due to hardware issues, DoS, exploited bugs,
     etc, then the network's throughput instantly becomes crippled.

   * If all payment flow *must* go through landmarks, and the transfers
     within the network are relatively uni-directional (all payment going
     to Candy Crush Ultra: Lighting Strikes Twice), then their
     channels would become unbalanced very quickly.

The last point there invokes another component of the network: passive
channel rebalancing. With source routing, it's possible for nodes to
passive rebalance their channels, in order to keep the in equilibrium,
such that on average they'll be able to handle a payment flow coming from
any direction. This is possible as with source routing, it's easy for a
node to simply send a payment to himself incoming/outgoing from the pair
of channels they wish to adjust the available flow of. With
distance-vector-like protocols, this doesn't seem possible, as the node
doesn't have any control of the incoming channel that the payment will
arrive on.

Finally, the notion of value privacy within the scheme seems a bit weak.
>From this definition, any protocol that didn't broadcast intents to send
payments to the world would achieve this trait. The base Bitcoin
blockchain doesn't mask the values of transfers (yet), but even if it did
unconditionally maintaining value privacy of channel doesn't seem
compatible with multi-hop payment networks (nodes can simply perform
probing/tagging attacks to ascertain a range of the size of a channel). A
possible mitigation would be for nodes to probabilistically drop incoming
payments, with all nodes sampling from the same distribution. However,
this would dramatically increase routing failures by senders, removing the
"low-latency" trait of payment networks that many find desirable.

Personally, I've very excited to see additional research on the front of
routing within the network! Excellent work by all authors.

In the end, I don't think it'll be a one-size fits all solution, as each
routing protocol delivers with it a set of tradeoffs that should be
weighed depending on target characteristics, and use-cases. There's no
strong requirement that the network as a whole uses a *single* routing
protocol. Instead several distinct protocols can be deployed based on
use-case requirements, as we only need to share a single end-to-end
construct: the HTLC. I could see a future in a few years where we have
several deployed protocols, similar to the wide array of existing routing
protocols deployed on the Internet. What we have currently gets us from
Zero to One. We'll definitely need to experiment with additional
approaches as the size of the network grows, and the true economic flow
patterns emerge after we all deploy to mainnet.

-- Laolu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/lightning-dev/attachments/20171125/d57bbed0/attachment-0001.html>