[Lightning-dev] Onion messages rate-limiting

Olaoluwa Osuntokun laolu32 at gmail.com
Thu Jun 30 00:22:08 UTC 2022


Hi t-bast,

Happy to see this finally written up! With this, we have two classes of
proposals for rate limiting onion messaging:

  1. Back propagation based rate limiting as described here.

  2. Allowing nodes to express a per-message cost for their forwarding
  services, which is described here [1].

I still need to digest everything proposed here, but personally I'm more
optimistic about the 2nd category than the 1st.

One issue I see w/ the first category is that a single party can flood the
network and cause nodes to trigger their rate limits, which then affects the
usability of onion messages for all other well-behaving parties. As an
example, this might mean I can't fetch invoices: I give up after a period of
time (how long?), then resort to a direct connection (with the perceived
payment latency accumulating along the way).

With the 2nd category, if an attacker floods the network, they need to
directly pay for the forwarding usage themselves, though they may also cause
nodes to adjust their forwarding rates accordingly. However, in this case the
attacker has incurred a concrete cost, and even if the rates rise, those
that really need the service (notifying an LSP that a user is online or w/e)
can continue to pay the new rate. In other words, by _pricing_ the resource
utilization, demand preferences can be expressed, leading to more efficient
long-term resource allocation.

W.r.t this topic, one event that imo is worth pointing out is that a very
popular onion routing system, Tor, has been facing a severe DDoS attack that
has lasted weeks and isn't yet fully resolved [2]. The ongoing flooding
attack on Tor has actually started to affect LN (iirc over half of all
public routing nodes w/ an advertised address are Tor-only), and other
related systems like Umbrel that rely 100% on Tor for network traversal.
Funnily enough, Tor developers have actually suggested adding some PoW to
attempt to mitigate DDoS attacks [3]. In that same post they throw around
the idea of using anonymous tokens that nodes can hand out to "good"
clients, which is pretty similar to my lofty Forwarding Pass idea as it
relates to onion messaging, and also general HTLC jamming mitigation.

In summary, we're not the first to attempt to tackle the problem of rate
limiting relayed message spam in an anonymous/pseudonymous network, and we
can probably learn a lot from what is and isn't working w.r.t how Tor
handles things. As you note near the end of your post, this might just be
the first avenue in a long line of research to best figure out how to handle
the spam concerns introduced by onion messaging. From my PoV, it still seems
to be an open question whether the same network can be _both_ a reliable
micro-payment system _and_ a reliable arbitrary message transport layer. I
guess only time will tell...

> The `shared_secret_hash` field contains a BIP 340 tagged hash

Any reason to use a tagged hash here vs just a plain ol' HMAC? Under the
hood, they have a pretty similar construction [4].
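For concreteness, here's what the two constructions look like side by side (a quick sketch; the all-zero placeholder secret is mine, not from the proposal):

```python
import hashlib
import hmac

def tagged_hash(tag: bytes, msg: bytes) -> bytes:
    # BIP 340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)
    tag_digest = hashlib.sha256(tag).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    # RFC 2104 HMAC: SHA256((key ^ opad) || SHA256((key ^ ipad) || msg))
    return hmac.new(key, msg, hashlib.sha256).digest()

# Placeholder 32-byte Sphinx shared secret, purely for illustration.
sphinx_shared_secret = bytes(32)

print(tagged_hash(b"onion_message_drop", sphinx_shared_secret).hex())
print(hmac_sha256(b"onion_message_drop", sphinx_shared_secret).hex())
```

Both mix a domain-separation string into a SHA256-based construction; the main difference is how the key material is padded and combined.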

[1]:
https://lists.linuxfoundation.org/pipermail/lightning-dev/2022-February/003498.html
[2]: https://status.torproject.org/issues/2022-06-09-network-ddos/
[3]: https://blog.torproject.org/stop-the-onion-denial/
[4]: https://datatracker.ietf.org/doc/html/rfc2104

-- Laolu



On Wed, Jun 29, 2022 at 1:28 AM Bastien TEINTURIER <bastien at acinq.fr> wrote:

> During the recent Oakland Dev Summit, some lightning engineers got together to discuss DoS
> protection for onion messages. Rusty proposed a very simple rate-limiting scheme that
> statistically propagates back to the correct sender, which we describe in detail below.
>
> You can also read this in gist format if that works better for you [1].
>
> Nodes apply per-peer rate limits on _incoming_ onion messages that should be relayed (e.g.
> N per second with some burst tolerance). It is recommended to allow more onion messages from
> peers with whom you have channels, for example 10/second when you have a channel and 1/second
> when you don't.
>
> When relaying an onion message, nodes keep track of where it came from (by using the `node_id` of
> the peer who sent that message). Nodes only need the last such `node_id` per outgoing connection,
> which ensures the memory footprint is very small. Also, this data doesn't need to be persisted.
>
> Let's walk through an example to illustrate this mechanism:
>
> * Bob receives an onion message from Alice that should be relayed to Carol
> * After relaying that message, Bob stores Alice's `node_id` in its per-connection state with Carol
> * Bob receives an onion message from Eve that should be relayed to Carol
> * After relaying that message, Bob replaces Alice's `node_id` with Eve's `node_id` in its
> per-connection state with Carol
> * Bob receives an onion message from Alice that should be relayed to Dave
> * After relaying that message, Bob stores Alice's `node_id` in its per-connection state with Dave
> * ...
>
> We introduce a new message that will be sent when dropping an incoming onion message because it
> reached rate limits:
>
> 1. type: 515 (`onion_message_drop`)
> 2. data:
>    * [`rate_limited`:`u8`]
>    * [`shared_secret_hash`:`32*byte`]
>
> Whenever an incoming onion message reaches the rate limit, the receiver sends `onion_message_drop`
> to the sender. The sender looks at its per-connection state to find where the message came from
> and relays `onion_message_drop` to that last sender, halving the rate limit it applies to that peer.
>
> If the sender doesn't overflow the rate limit again, the receiver should double the rate limit
> after 30 seconds, until it reaches the default rate limit again.
>
> The flow will look like:
>
> Alice                      Bob                      Carol
>   |                         |                         |
>   |      onion_message      |                         |
>   |------------------------>|                         |
>   |                         |      onion_message      |
>   |                         |------------------------>|
>   |                         |    onion_message_drop   |
>   |                         |<------------------------|
>   |    onion_message_drop   |                         |
>   |<------------------------|                         |
>
> The `shared_secret_hash` field contains a BIP 340 tagged hash of the Sphinx shared secret of the
> rate limiting peer (in the example above, Carol):
>
> * `shared_secret_hash = SHA256(SHA256("onion_message_drop") || SHA256("onion_message_drop") || sphinx_shared_secret)`
>
> This value is known by the node that created the onion message: if `onion_message_drop` propagates
> all the way back to them, it lets them know which part of the route is congested, allowing them
> to retry through a different path.
>
> Whenever there is some latency between nodes and many onion messages, `onion_message_drop` may
> be relayed to the incorrect incoming peer (since we only store the `node_id` of the _last_ incoming
> peer in our outgoing connection state). The following example highlights this:
>
>  Eve                       Bob                      Carol
>   |      onion_message      |                         |
>   |------------------------>|      onion_message      |
>   |      onion_message      |------------------------>|
>   |------------------------>|      onion_message      |
>   |      onion_message      |------------------------>|
>   |------------------------>|      onion_message      |
>                             |------------------------>|
> Alice                       |    onion_message_drop   |
>   |      onion_message      |                    +----|
>   |------------------------>|      onion_message |    |
>   |                         |--------------------|--->|
>   |                         |                    |    |
>   |                         |                    |    |
>   |                         |                    |    |
>   |    onion_message_drop   |<-------------------+    |
>   |<------------------------|                         |
>
> In this example, Eve is spamming but `onion_message_drop` is propagated back to Alice instead.
> However, this scheme will _statistically_ penalize the right incoming peer (with a probability
> depending on the volume of onion messages that the spamming peer is generating compared to the
> volume of legitimate onion messages).
>
> It is an interesting research problem to find formulas for those probabilities to evaluate how
> efficient this will be against various types of spam. We hope researchers on this list will be
> interested in looking into it and will come up with a good model to evaluate that scheme.
>
> To increase the accuracy of attributing `onion_message_drop`, more data could be stored in the
> future if it becomes necessary. We need more research to quantify how much accuracy would be
> gained by storing more data and making the protocol more complex.
>
> Cheers,
> Bastien
>
> [1] https://gist.github.com/t-bast/e37ee9249d9825e51d260335c94f0fcf
>
> _______________________________________________
> Lightning-dev mailing list
> Lightning-dev at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
>