[Lightning-dev] General question on routing difficulties

Thu Nov 30 16:59:35 UTC 2017

Hi Laolu,
Thanks for your detailed and interesting reply. Please see below some
points I would like to make in some of your comments and the answers to
your questions in the last email. And of course, I would be happy to
further discuss with you.

On 11/25/17 2:16 PM, Olaoluwa Osuntokun wrote:
> (final try as the prior mail hit the size limit, sorry for the spam!)
> 
> Hi Pedro, 
> 
> I came across this paper a few weeks ago, skimmed it lightly, and noted a
> few interesting aspects I wanted to dig into later. Your email reminded me
> to re-read the paper, so thanks for that! Before reading the paper, I
> wasn't aware of the concept of coordinate embedding, nor how that could be
> leveraged in order to provide sender+receiver privacy in a payment network
> using a distance-vector-like routing system. Very cool technique!
> 
> 
> After reading the paper again, my current conclusion is that while the
> protocol presents some novel traits in the design a routing system for
> payment channel based networks, it lends much better to a
> closed-membership, credit network, such as Ripple (which is the focus of
> the paper). 
> 
I personally do not agree with this conclusion. The Ripple network uses
a global ledger that contains all the nodes, all the credit links among
them and the weight of each credit link at all points in time.
Therefore, as such, every user can simply check such public information
and decide a route with enough credit on its own. There is no need for a
decentralized routing algorithm on Ripple as all the information is
being continuously updated in the ledger.

In the work we present in this paper, we focus instead on a
decentralized payment network, where the links’ states (e.g., bitcoin
balances in the payment channels within LN) are not continuously logged
in the blockchain but instead, every link is locally maintained by the
corresponding pair of users. With such setting in mind, we have designed
our routing algorithm (SpeedyMurmurs) and we believe it has potential to
be used in the Lightning network as well.
> 
> In Ripple, there are only a handful of gateways, and clients that seek to
> interact with the network must chose their gateways *very* carefully,
> otherwise consensus faults can occur, violating safety properties of the
> network. It would appear that this gateway model nicely translates well to
> the concept of landmarks that the protocol is strongly dependant on.
> Ideally, each gateway would be a landmark, and as there are a very small
> number of gateways within Ripple (as you must be admitted to be a verified
> gateway in the network), then parameter L (the total number of landmarks)
> is kept small which minimizes routing overhead, the average path-length,
> etc.
> 
I think this description confuses the terms: gateway, validator and
landmark. A gateway is an online exchange service that is trusted to
correctly create a credit link with a new user in the Ripple network. If
Alice has a Ripple wallet but no credit link yet in the Ripple network,
she can get her first issued credits on a link with a gateway.
Similarly, Alice can use such online exchange service to get her first
bitcoins by paying fiat currency, for instance. Importantly, a gateway
does not take part in the Ripple Consensus Algorithm, and therefore a
gateway cannot provoke consensus failures and violate safety properties.

The Ripple consensus algorithm is run among a set of validators. A
validator is the one that can provoke consensus failures. Note that a
validator is a participant in the consensus protocol that might not even
have a wallet in the Ripple network and therefore might not even be a
gateway. For example, MIT and Microsoft have run a validator and are not
gateways in the Ripple network.

Finally, a landmark as used in SpeedyMurmurs, the routing algorithm we
propose in this paper,  is simply a node in the network that it is
*only* used to bootstrap the routing algorithm, and any node in the
network can do so. In the very specific case of a decentralized version
of the Ripple network such as SilentWhispers
(https://eprint.iacr.org/2016/1054), gateways become good candidates to
be such landmarks. However, the routing algorithm that we propose in
SpeedyMurmurs is by no means tied to gateways and in particular,
landmarks can be simply nodes chosen at random from the complete set of
nodes in the network, as we show in the paper. Therefore, we believe
that SpeedyMurmurs might be suitable for different payment networks, and
in particular for the LN.
> 
> When we compare Ripple to LN, we find that the two networks are nearly
> polar opposites of each other. LN is an open-membership network that
> requires zero initial configuration by central administrators(s). It more
> closely resembles *debit* network (a series of tubes of money), as the
> funds within channels must be pre-committed in order to establish a link
> between two nodes, and cannot be increased without an additional on-chain
> control transaction (to add or remove funds). Additionally, AFAIK (I'm no
> expert on Ripple of course), there's no concept of fees within the
> network. While within LN, the fee structure is a critical component of the
> inventive for node operators to lift their coins onto this new layer to
> provider payment routing services.  Finally, in LN we rely on time-locks
> in order to ensure that all transactions are atomic which adds another set
> of constraints. Ripple has no such constraint as transfers are based on
> bi-lateral trust.
> 
I agree that comparing Ripple and LN is like comparing apples and
oranges. For starters, Ripple is a “blockchain-focussed” payment network
as *all* the operations in the network are registered in the blockchain
while the LN only uses the blockchain for opening and closing a payment
channel. Ripple is a credit network while the LN is a debit network. And
many other differences.

There are some similarities though. In Ripple *there is* also the
concept of fee being charged for forwarding a payment and for currency
exchange. This fee structure has a similar motivation as in the LN:
create an incentive for users to be used as intermediate hops in
payments. Both networks face/will have to face similar challenges:
bootstrapping, i.e., how to create the first credit link/payment channel
for a new user in the network and with whom so that one can transact
with the rest of the network; liquidity: i.e., what should be the
network topology and how much credit/debit there should be in the
links/channels so that every user can transact to any other user in the
network; and many others.

But again, in SpeedyMurmurs we are not presenting a routing algorithm
for a “blockchain-centered” payment network like Ripple but rather a
routing algorithm for a decentralized payment network such as
SilentWhispers, LN, Raiden Network or Interledger.
> 
> With that said, the primary difference between this protocol is that
> currently we utilize a source-routed system which requires the sender to
> know "most" of the path to the destination. I say "most" as currently,
> it's possible for the receiver of a payment to use a poor man's rendezvous
> system to provide the sender with a set of suffix paths form what one can
> consider ad-hoc landmarks. The sender can then concatenate these with
> their own paths, and construct the Sphinx routing package which encodes
> the full route. This itself only gives sender privacy, and the receiver
> doesn't know the identity of the sender, but the sender learns the
> identity of the receiver. 
> 
> We have plans to achieve proper sender/receiver privacy by extending our
> Sphinx usage to leverage HORNET, such that the payment descriptor (payment
> request containing details of the payment) also includes several paths
> from rendezvous nodes (Rodrigo's) to the receiver. The rendezvous route
> itself will be nested as a further Anonymous Header (AHDR) which includes
> the information necessary to complete the onion circuit from Rodrigo to
> the receiver. As onion routing is used, only Rodrigo can decrypt the
> payload and finalize the route. With such a structure, the only nodes that
> need to advertise their channels are nodes which seek to actively serve as
> channel routers. All other nodes (phones, laptops, etc), don't need to
> advertise their channels to the greater network, reducing the size of the
> visible network, and also the storage and validation overhead. This serves
> to extend the "scale ceiling" a bit.
> 
Two comments here:
 * As Christian Decker indicated in a previous email, this routing
technique assumes that the routing layer and the onion layer are
together by default. This is obviously an interesting alternative worth
exploring and investigating what advantages/disadvantages it has and
what are the inherent tradeoffs.

 * Nevertheless, it is obviously not the only alternative. We could
separate the routing layer from the onion layer and study what
advantages/disadvantages we can get from there. For instance, in
SpeedyMurmurs, nodes only need to announce their channels to their
neighbors and yet they are able to reconstruct a path from sender to the
receiver.  Our current experiments show that this simplicity has
enormous benefits in terms of performance.

In my opinion, it is interesting to look at tradeoffs and the
necessary/sufficient guarantees for the routing algorithm in a
decentralized payment network such as the LN before we stick to a solution.

> 
> My first question is: is it possible to adapt the protocol to allow each
> intermediate node to communicate their time lock and fee references to the
> sender? Currently, as the full path isn't known ahead of time, the sender
> is unable to properly craft the timelocks to ensure safety+atomicity of
> the payment. This would mean they don't know what the total timelock
> should be on the first outgoing link. Additionally, as they don't know the
> total path and the fee schedule of each intermediate node, then once
> again, they don't know how much to send on the first out going link. It
> would seem that one could extend the probing phase to allow backwards
> communication by each intermediate node back to the sender, such that they
> can properly craft a valid HTLC. This would increase the set up costs of
> the protocol however, and may also increase routing failures as it's
> possible incompatibilities arise at run-time between the preferences of
> intermediate nodes. Additionally, routes may fail as an intermediate node
> consumes too many funds as their fee, causing the funds to be insufficient
> when it reaches the destination. One countermeasure would maybe: the
> sender always sends waaay more than necessary, and gives the receiver a
> one-time payment identifier, requiring that they route the remainder of
> the funds *back* to them.
> 
> 
> To solve this issue presently, we extend the header in Sphinx to include a
> per-hop payload which allows the sender to precisely dictate the
> structure of the route, allows the intermediate nodes to authenticate the
> information given to it, and also allow the intermediate node to verify
> that their policies have properly been respected. These payloads can also
> be utilized by applications to communicate a small-ish amount of data to
> construct higher-level protocols on top of the system. Examples include:
> cross-chain swaps, chance payment games, higher-level B2B protocols,
> flavors of ZKCP's, media streaming, internet access proxying, etc.
> 
For what I understand, what you are asking/proposing is a mixture of the
routing layer (route from sender to receiver) + onion layer (using
“adapted”/”optimized” sphinx)+ payment layer (HTLCs). Even in this mixed
alternative, we might want to investigate what are the guarantees and
tradeoffs provided. For instance, in another recent work from us
(https://eprint.iacr.org/2017/820), we proposed multi-hop HTLC: A
modification of the HTLC contract so that the condition of the
conditional payment does not lead to privacy issues.

However, I rather prefer the approach of looking at one layer/component
at a time and see what optimizations we can perform there and what are
the inherent impossibilities. Therefore, we have not thoroughly studied
yet how to integrate the routing algorithm we propose in SpeedyMurmurs
with a possible payment operation. Having said that, some variation of
what you sketched might work. Another proposal might consist on a
payment operation that does not assume source-routing to start with.
There are many possibilities to investigate and think about.
> 
> From my point-of-view, when extended to LN, the core component of the
> protocol (landmarks), becomes the weakest component. From my reading,
> *all* nodes need to be ware of an *identical* set of landmarks (more or
> less similar to the desired homogeneity of Gateways), otherwise the
> coordinate embedding scheme breaks down. Currently, there's no requirement
> that all nodes have a globally consistent view of the network. So then an
> important questions arises: who choose the landmarks? A desirable property
> of a routing system for LN (IMO) is that is has close to zero required
> initial set up by a central administrator. With this protocol, it would
> seem that all nodes much ship with a hard coded set of global landmarks
> for the path finding to succeed.  This itself pins a hard coordination
> requirement amongst implementers to have something like this deployed.

I think your description here might not be accurate for what I
understand. What you describe here is indeed a problem inherent to the
original landmark routing mechanism. However, it is no longer an issue
in SpeedyMurmurs. In particular, any node could be a landmark or two
users could have a different view of what set of nodes constitute the
set of  landmarks. As soon as sender and receiver have a non-empty
intersection of their set of landmark-nodes, SpeedyMurmurs provides a
route among them.

In the approach we present in SpeedyMurmurs, the landmark is just a node
that declares himself a landmark by triggering to his neighbors the
chain of messages required to construct the embeddings in the whole
network. Any node in the network can do this. Moreover, in
SpeedyMurmurs, this is a one-time thing. Once a set of embedding are
created, the landmark is not used again. Changes in the network topology
are locally handled by the neighbors of the newly connected/disconnected
node.

> Even ignoring this requirement for a minute, I see several other
> downsides:
> 
>    * As *all* payments must flow through landmarks (since nodes break up
>      their payment into L sub-flows), the landmarks must be very, very
>      well capitalized. This would cause strong consolidation of the
>      selection of landmarks, as they need extremely large channels in
>      order to facilitate transfer within the network.

This is true in the original landmark routing. However, this is no
longer the case in SpeedyMurmurs. Paths no longer require to have the
landmark as intermediate user. Even further, a path from a sender to the
receiver might use links that are not even part of the spanning tree. We
call them “shortcuts” in the paper. We designed it like this having the
capitalization problem you mentioned in mind.

> 
>    * As landmarks must be globally known, this it seems this would
>      introduce fragility in the network. If most of the landmarks go down
>      (fails stop crashes) due to hardware issues, DoS, exploited bugs,
>      etc, then the network's throughput instantly becomes crippled.

Again, this is true in the original landmark routing, not the case in
SpeedyMurmurs. We use the landmarks as a one-time setup and any node
could be a landmark.  In particular, every node in the network could be
the landmark if it is really DoS-sensitive. We just require that a
sender and a receiver have a non-empty intersection of the landmarks
that they have seen so far.

> 
>    * If all payment flow *must* go through landmarks, and the transfers
>      within the network are relatively uni-directional (all payment going
>      to Candy Crush Ultra: Lighting Strikes Twice), then their
>      channels would become unbalanced very quickly.

Again, this is true in the original landmark routing, not the case in
SpeedyMurmurs. Payments *must not* go through landmarks any longer with
our approach. Moreover, unbalanced payment channels are an issue to
consider independently of the routing algorithm that is in use.

All in all, it seems that there might be some misconceptions and/or
aspects in the current draft of the paper that might need clarifications
so that the approach is well understood. We are more than happy to
further talk about it and answer questions, doubts or concerns that
might arise.

> 
> 
> The last point there invokes another component of the network: passive
> channel rebalancing. With source routing, it's possible for nodes to
> passive rebalance their channels, in order to keep the in equilibrium,
> such that on average they'll be able to handle a payment flow coming from
> any direction. This is possible as with source routing, it's easy for a
> node to simply send a payment to himself incoming/outgoing from the pair
> of channels they wish to adjust the available flow of. With
> distance-vector-like protocols, this doesn't seem possible, as the node
> doesn't have any control of the incoming channel that the payment will
> arrive on.
> 
I think this is another case of mixing functionalities at different
layers. I do agree that unbalanced channels is an interesting problem to
deal with. I, however, do not agree that we should choose one routing
approach or another based on how unbalanced channels are handled.

In my understanding, what you propose requires several assumptions: (i)
the node should be aware of “enough” network topology so that it is
aware of those “special topology shapes”. However, if opening channel
operations are indistinguishable from other payments in the Bitcoin
blockchain, this is not trivial. However, company X might not want to
reveal its channel with company Y to its competitors; (ii) such
circular/suitable paths actually exist. However, this might just not be
the case. (iii) The chosen path(s) has enough liquidity for the
“equilibrium”. As the node is not aware of the current capacity of each
of the channels, it might be the case that several payments are required
for the equilibrium to happen.

In any case, my point is not to discuss whether the problem of
unbalanced channels is interesting. My point is that I think we should
not stick to one routing algorithm depending on how another
algorithm/functionality at another layer is handled, at least not before
we explore and fully understand the tradeoffs, benefits and
impossibilities that we will have to face here.

> 
> Finally, the notion of value privacy within the scheme seems a bit weak.
> From this definition, any protocol that didn't broadcast intents to send
> payments to the world would achieve this trait. The base Bitcoin
> blockchain doesn't mask the values of transfers (yet), but even if it did
> unconditionally maintaining value privacy of channel doesn't seem
> compatible with multi-hop payment networks (nodes can simply perform
> probing/tagging attacks to ascertain a range of the size of a channel). A
> possible mitigation would be for nodes to probabilistically drop incoming
> payments, with all nodes sampling from the same distribution. However,
> this would dramatically increase routing failures by senders, removing the
> "low-latency" trait of payment networks that many find desirable. 
> 
I totally agree with you here. I believe that having stronger notions of
value privacy is conceptually hard. Imagine that the payment value must
be routed through the path from sender to the receiver. In particular,
any intermediate node must know what value is receiving from its
predecessor to forward the corresponding value to the successor node in
the path.
> 
> Personally, I've very excited to see additional research on the front of
> routing within the network! Excellent work by all authors.

Thanks. Really looking forward for further conversations and discussions.

> 
> 
> In the end, I don't think it'll be a one-size fits all solution, as each
> routing protocol delivers with it a set of tradeoffs that should be
> weighed depending on target characteristics, and use-cases. There's no
> strong requirement that the network as a whole uses a *single* routing
> protocol. Instead several distinct protocols can be deployed based on
> use-case requirements, as we only need to share a single end-to-end
> construct: the HTLC. I could see a future in a few years where we have
> several deployed protocols, similar to the wide array of existing routing
> protocols deployed on the Internet. What we have currently gets us from
> Zero to One. We'll definitely need to experiment with additional
> approaches as the size of the network grows, and the true economic flow
> patterns emerge after we all deploy to mainnet.
> 
I also agree with you here. I also think that there might be several
routing approaches at the routing layer, the same way we have BGP, OSPF,
RIP protocols for routing in IP today. I also believe that we might have
more than just HTLC-based payments in the LN, but this is the topic for
another long email :)

Many thanks again for the feedback and looking forward for more
conversations,
Pedro.

> 
> -- Laolu
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <http://lists.linuxfoundation.org/pipermail/lightning-dev/attachments/20171130/9fc27a52/attachment-0001.sig>