[bitcoin-dev] Composable MuSig

ZmnSCPxj ZmnSCPxj at protonmail.com
Mon Nov 25 11:00:22 UTC 2019


So I heard you like MuSig.


Introduction
============

Previously on lightning-dev, I propose Lightning Nodelets, wherein one signatory of a channel is in fact not a single entity, but instead an aggregate: https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-October/002236.html

Generalizing:

* There exists some protocol that requires multiple participants agreeing.
  * This can be implemented by use of MuSig on the public keys of the participants.
* One or more of the participants in the above protocol is in fact an aggregate, not a single participant.
  * Ideally, no protocol modification should be needed to support such aggregates, "only" software development without modifying the protocol layer.
  * Obviously, any participant of such a protocol, whether a direct participant, or a member of an aggregated participant of that protocol, would want to retain control of its own money in that protocol, without having to determine if it is being Sybilled (and all other participants are in fact just one participant).
  * Motivating example: a Lightning Network channel is the aggregate of two participants, the nodes creating that channel.
    However, with nodelets as proposed above, one of the participants is actually itself an aggregate of multiple nodelets.
    * This requires that a Lightning Network channel with a MuSig address, to have one or both participants, be potentially an aggregate of two or more nodelet participants, e.g. `MuSig(MuSig(A, B), C)`

This is the "MuSig composition" problem.
That is, given `MuSig(MuSig(A, B), C)`, and the *possibility* that in fact `B == C`, what protocol can A use to ensure that it uses the three-phase MuSig protocol (which has a proof of soundness) and not inadvertently use a two-phase MuSig protocol?

Schnorr Signatures
==================

The scheme is as follows.

Suppose an entity A needs to show a signature.
At setup:

* It generates a random scalar `a`.
* It computes `A` as `A = a * G`, where `G` is the standard generator point.
* It publishes `A`.

At signing a message `m`:

* It generates a random scalar `r`.
* It computes `R` as `R = r * G`.
* It computes `e` as `h(R | m)`, where `h()` is a standard hash function and `x | y` denotes the serialization of `x` concatenated by the serialization of `y`.
* It computes `s` as `s = r + e * a`.
* It publishes as signature the tuple of `(R, s)`.

An independent validator can then get `A`, `m`, and the signature `(R, s)`.
At validation:

* It recovers `e[validator]` as so: `e[validator] = h(R | m)`
* It computes `S[validator]` as so: `S[validator] = R + e[validator] * A`.
* It checks if `s * G == S[validator]`.
  * If `R` and `s` were indeed generated as per signing algorithm above, then:
    * `S[validator] = R + e[validator] * A`
    * `== r * G + e[validator] * A`; subbstitution of `R`
    * `== r * G + h(R | m) * A`; substitution of `e[validator]`
    * `== r * G + h(R | m) * a * G`; substitution of `A`.
    * `== (r + h(R | m) * a) * G`; factor out `G`
    * `== (r + e * a) * G`; substitution of `h(R | m)` with `e`
    * `== s * G`; substitution of `r + e * a`.

MuSig
=====

Under MuSig, validation must remain the same, and multiple participants must provide a single aggregate key and signature.

Suppose there exist two participants A and B.
At setup:

* A generates a random scalar `a` and B generates a random scalar `b`.
* A computes `A` as `A = a * G` and B computes `B` as `B = b * G`.
* A and B exchange `A` and `B`.
* They generate the list `L`, by sorting their public keys and concatenating their representations.
* They compute their aggregate public key `P` as `P = h(L) * A + h(L) * B`.
* They publish the aggregate public key `P`.

Signing takes three phases.

1.  `R` commitment exchange.
  * A generates a random scalar `r[a]` and B generates a random scalar `r[b]`.
  * A computes `R[a]` as `R[a] = r[a] * G` and B computes `R[b]` as `R[b] = r[b] * G`.
  * A computes `h(R[a])` and B computes `h(R[b])`.
  * A and B exchange `h(R[a])` and `h(R[b])`.
2.  `R` exchange.
  * A and B exchange `R[a]` and `R[b]`.
  * They validate that the previous given `h(R[a])` and `h(R[b])` matches.
3.  `s` exchange.
  * They compute `R` as `R = R[a] + R[b]`.
  * They compute `e` as `h(R | m)`.
  * A computes `s[a]` as `s[a] = r[a] + e * h(L) * a` and B computes `s[b]` as `s[b] = r[b] + e * h(L) * b`.
  * They exchange `s[a]` and `s[b]`.
  * They compute `s` as `s = s[a] + s[b]`.
  * They publish the signature as the tuple `(e, s)`.

At validation, the validator knows `P`, `m`, and the signature `(R, s)`.

* It recovers `e[validator]` as so: `e[validator] = h(R | m)`
* It computes `S[validator]` as so: `S[validator] = R + e[validator] * P`.
* It checks if `s * G == S[validator]`.
  * `S[validator] = R + e[validator] * P`
  * `== R[a] + R[b] + e[validator] * P`; substitution of `R`
  * `== r[a] * G + r[b] * G + e[validator] * P`; substitution of `R[a]` and `R[b]`
  * `== r[a] * G + r[b] * G + e * P`; substitution of `e[validator]` with `e`
  * `== r[a] * G + r[b] * G + e * (h(L) * A + h(L) * B)`; substitution of `P`
  * `== r[a] * G + r[b] * G + e * h(L) * A + e * h(L) * B`; distribution of `e` inside parentheses.
  * `== r[a] * G + r[b] * G + e * h(L) * a * G + e * h(L) * b * G`; substitution of `A` and `B`.
  * `== (r[a] + r[b] + e * h(L) * a + e * h(L) * b) * G`; factoring out of `G`
  * `== (r[a] + e * h(L) * a + r[b] + e * h(L) * b) * G`; rearrangement of terms
  * `== (s[a] + s[b]) * G`; substitution of `r[a] + e * h(L) * a` and `r[b] + e * h(L) * b`
  * `== s * G`;  substitution of `s[a] + s[b]`


Two-Phase MuSig Unsafe
======================

Original proposal of MuSig only had two phases, `R` exchange and `s` exchange.
However, there was a flaw found in the security proof in this two-phase MuSig.
In response, an earlier phase of exchanging commitments to `R` was added.

Thus, two-phase MuSig is potentially unsafe.

https://eprint.iacr.org/2018/417.pdf describes the argument.
Briefly, with two-phase MuSig, one of the participants can deliberately delay its side of a `R` exchange and control the resulting sum `R` by cancelling the `R` of the other participant.
Executed over many (aborted) signing sessions, one participant can induce the other to create a signature for a message it might not agree to, by using the Wagner Generalized Birthday Paradox attack.

Briefly, a two-phase MuSig signing would go this way:

1.  `R` exchange.
  * A generates random scalar `r[a]` and B generates random scalar `r[b]`.
  * A computes `R[a]` as `r[a] * G` and B computes `R[b]` as `r[b] * G`.
  * They exchange `R[a]` and `R[b]`.
2.  `s` exchange.
  * They compute `R` as `R = R[a] + R[b]`.
  * They compute `e` as `h(R | m)`.
  * A computes `s[a]` as `s[a] = r[a] + e * h(L) * a` and B computes `s[b]` as `s[b] = r[b] + e * h(L) * b`.
  * They exchange `s[a]` and `s[b]`.
  * They compute `s` as `s = s[a] + s[b]`.
  * They publish the signature as the tuple `(R, s)`.

The sticking point is "exchange" here.
Given that we live in a relativistic universe where there is no such thing as simultaneity-in-time-without-simultaneity-in-space, it is impossible to ensure that both A and B send their data "at the same time" in such a way that it is impossible for, for example, the send of B to be outside the future light cone of the send of A.
Or in human-level terms, it is not possible to ensure over the network that B will not send `R[b]` *after* it receives `R[a]`.

Suppose that instead of B generating a random `r[b]` and *then* computing `R[b] = r[b] * G`, it instead selects an arbitrary `R[selected]` it wants to target, then compute `R[b]` as `R[selected] - R[a]`.
Then at `s` exchange:

* They compute `R` as `R[a] + R[b]`, which is in fact `R[a] + R[selected] - R[a]`, or `R[selected]`, i.e. `R == R[selected]`.
* They compute `e` as `h(R[selected] | m)`.
* A computes `s[a]` as `s[a] = r[a] + e * h(L) * a`.
* B is unable to compute `s[b]` as it has no `r[b]` it can use in the computation, and aborts the signing.

The attack involved is that multiple such signing sessions, for the same message or for separate distinct messages, might be done in parallel.
Suppose that there are `n` such sessions, such that A provides `n` different `R[a][i]`, e.g. `R[a][1]`, `R[a][2]`, `R[a][3]` up to `R[a][n]`.
Then:

* B delays each session, pretending to have Internet connectivity problems.
* B selects a message `m[target]` that it knows A will never sign (e.g. "I, A, give all my money to B").
* B computes `R[target]` as `sum where i = 1 to n of R[a][i]`.
* B uses the Wagner Generalized Birthday Paradox technique to find `R[selected][i]` with the following constraint:
  * `h(R[target] | m[target]) == sum where i = 1 to n of h(R[selected][i] | m[i])`.
  * Given a large enough number of parallel sessions `n`, this can greatly reduce the effort from 2^128 to find a match to within the realm of a large computer to compute within a few seconds.
* B computes `R[b][i]` as `R[selected][i] - R[a][i]`, for each `i` from 1 to `n`.
* B provides `R[b][i]` for each session.
* A computes `R[i]` as `R[a][i] + R[b][i]` for each session.
  * However we know that `R[b][i] == R[selected][i] - R[a][i]` for each session, cancelling out `R[a][i]` and leaving `R[i] == R[selected][i]`
* A computes `s[a][i]` as `r[a][i] + h(R[selected][i] | m[i]) * h(L) * a` for each session.
* A gives `s[a][i]` for each session.
* B aborts each session.
* B sums up all the `s[a][i]`:
  * `(sum where i = 1 to n of r[a][i]) + (sum where i = 1 to n of h(R[selected][i] | m[i]) * h(L) * a)`.
  * Remember, B has specifically selected `R[selected][i]` such that `h(R[target] | m[target])` is equal to the sum of `h(R[selected][i] | m[i])`.
  * `== (sum where i = 1 to n of r[a][i]) + h(R[target] | m[target]) * h(L) * a)`.
* B adds `h(R[target] | m[target]) * h(L) * b` to the above sum.
  * This results in a signature for MuSig(A, B) to the message `m[target]`, even though A would never have agreed to this message.

Thus, 2-phase MuSig enables a Wagner attack on the participant, thus it is unsafe.

Now, any method of ensuring a "simultaneous" exchange of `R` points is largely the same as adding a "commit to `R`" phase, i.e. the fix for this is simply to add the "`R` commitment exchange" phase.

References: https://eprint.iacr.org/2018/417.pdf

MuSig Composition
=================

Let us suppose that we have some protocol that requires a MuSig signing session between signers with public keys `P` and `C`.
Let us further suppose that in fact, `P = MuSig(A, B)`, i.e. one of the public keys in this protocol is, in reality, itself a MuSig of two participants.

At the point of view of signer C, P is a single participant and it acts as such.
However, in reality, P is an aggregate.

We want to have the following properties:

* C should never need to know that P is in fact an aggregate.
* Even if B is secretly the same as C, the entire protocol as a whole (including the aggregate signing of `MuSig(A, B)`) should remain three-phase MuSig.

Now, from the point of view of C, what it sees are:

At setup:

* It generates a random scalar `c` and the public key `C` as `C = c * G`.
* It exchanges keys with P and gets the public key `P`.
* It computes `L` by sorting `C` and `P` and concatenating them.
* It determines their aggregate key as `h(L) * C + h(L) * P`.

At signing:

1.  `R` commitment exchange.
  * It generates a random scalar `r[c]` and computes `R[c]` as `R[c] = r[c] * G`.
  * It computes `h(R[c])`.
  * It exchanges the hash `h(R[c])` with P and gets `h(R[p])`.
2.  `R` exchange.
  * It exchanges `R[c]` with P and gets `R[p]`.
  * It validates that the hash `h(R[p])` matches the previously-committed hash.
3.  `s` exchange.
  * It computes `R` as `R = R[c] + R[p]`.
  * It computes `e` as `e = h(R | m)`.
  * It computes `s[c]` as `s[c] = r[c] + e * c`.
  * It exchanges `s[c]` with P and gets `s[p]`.
  * It computes `s` as `s = s[c] + s[p]`.

However, from point of view of A and B, what actually happens is this:

At setup:

* A generates a random scalar `a` and computes `A = a * G`, B generates a random scalar `b` and computes `B = b * G`.
* They exchange `A` and `B`.
* They generate their own `L[ab]`, by sorting `A` and `B` and concatenating their representations.
* They compute the inner MuSig pubkey `P` as `P = h(L[ab]) * A + h(L[ab]) * B`.
* They exchange `P` with C, and get `C`.
* They compute the outer MuSig pubkey as `h(L) * P + h(L) * C`.

At signing:

1.  `R` commitment exchange.
  * A generates a random scalar `r[a]` and computes `R[a] = r[a] * G`, B generates a random scalar `r[b]` and computes `R[b] = r[b] * G`.
  * A computes `h(R[a])`, B computes `h(R[b])`.
  * They exchange `h(R[a])` and `h(R[b])`.
  * They need to compute `h(R[p])` for the protocol with C.
    * However, even if we know `R[p] == R[a] + R[b]`, we cannot generate `h(R[p])`.
    * Thus, they have no choice but to exchange `R[a]` and `R[b]` at this phase, even though this is supposed to be the `R` commitment exchange phase (and they should not share `R[a]` and `R[b]` yet)!

Unfortunately, this means that, between A and B, we are now reduced to a two-phase MuSig.
This is relevant if B and C happen to be the same entity or are cooperating.

Basically, before C has to provide its `h(R[c])`, B now knows the generated `R[a]` and `R[b]`.
If B and C are really the same entity, then C can compute `R[c]` as `R[selected] - R[a] - R[b]` before providing `h(R[c])`.
Then this devolves to the same attack that brings down 2-phase MuSig.

Thus, composition with the current MuSig proposal is insecure.

Towards a Composable Multi-`R` MuSig
====================================

A key element is that the first phase simply requires that all participants provide *commitments* to their individual `R`, and the second phase reveals their `R`.

I propose here the modification below:

* In the first phase, any participant in the MuSig may submit one *or more* `R` commitments.
* In the second phase, the participant in the MuSig submits each `R` that decommits each of the `R` commitments it sent.

I call this the Remote R Replacement Remanded: Redundant R Required Realistically, or, in shorter terms, the Multi-`R` proposal.

This is a simple and direct extension of the MuSig protocol, and expected to not have any effect on the security proof of MuSig.

In the case where all MuSig participants are singletons and each participant just generates and sends a single `R` commitment, then this proposal reduces to the original MuSig proposal.

However, in the case where one participant is in reality itself an aggregate, then we shall describe it below.
The below example is `MuSig(MuSig(A, B), C)`.

1.  `R` commitment exchange.
  * A generates a random number `r[a]`, B generates a random number `r[b]`, C generates a random number `r[c]`.
  * A computes `R[a]` as `r[a] * G`, B computes `R[b]` as `r[b] * G`, C computes `R[c]` as `r[c] * G`.
  * A computes `h(R[a])`, B computes `h(R[b])`, C computes `h(R[c])`.
  * A and B exchange their hashes with each other.
  * A and B jointly exchange their `h(R[a])` and `h(R[b])` with the `h(R[c])` from C.
2.  `R` exchange.
  * A and B reveal their `R[a]` and `R[b]` with each other.
  * A and B validate the given `R[a]` matches `h(R[a])` and the given `R[b]` matches `h(R[b])`.
  * A and B jointly exchange their `R[a]` and `R[b]` with the `R[c]` from C.
  * C validates `R[a]` and `R[b]`, A and B validate `R[c]`.
  * They compute `R` as the sum of all `R[a] + R[b] + R[c]`.
3.  `s` exchange.
  * They compute `e` as `h(R | m)`.
  * A computes `s[a]` as `r[a] + e * h(L[abc]) * h(L[ab]) * a`, B computes `s[b]` as `r[b] + e * h(L[abc]) * h(L[ab]) * b`.
  * C computes `s[c]` as `r[c] + e * h(L[abc]) * c`.
  * A and B exchange `s[a]` and `s[b]`.
  * A and B compute `s[ab]` as `s[a] + s[b]`.
  * A and B jointly exchange their `s[ab]` with `s[c]` from C.
  * They compute `s` as `s[ab] + s[c]`.
  * They publish the signature as the tuple `(R, s)`.

Of note, is that the number of `R` a participant provides is a strong hint as to whether it is actually an aggregate or not, and forms an upper bound as to the size of the aggregate (i.e. an aggregate of `n` members can pretend to be an aggregate of `m` members where `n < m`, but cannot pretend with `n > m`).
Thus, C here learns that its counterparty is actually itself an aggregate rather than a singleton.
This may be acceptable as a weak reduction in privacy (in principle, C should never learn that it is talking to an aggregate rather than a single party).

Alternative Composable MuSig Schemes
====================================

The above proposal is not the only one.
Below are some additional proposals which have various flaws, ranging from outright insecurity to practical implementation complexity issues.

Pedersen Commitments in Phase 1
-------------------------------

My initial proposal was to use Pedersen commitments in phase 1.
At phase 1, each party would generate a `r[x]` and `q[x]`, and exchange the Pedersen commitments `r[x] * G + q[x] * H`, where `H` is a NUMS point used as a second standard generator.
Then at phase 2, each party reveals its `q[x]`.
All the Pedersen commitments are summed, then all `q[x]` are summed, multiplied by `H`, then subtracted from the sum of Pedersen commitments.

Unfortunately, this leads to a Wagner attack.

Suppose A and B have an aggregate MuSig(A, B).

* B initiates multiple parallel signing sessions with A.
* B selects a message `m[target]` that it knows A will never sign (e.g. "I, A, give all my money to B").
* In the first phase, B selects random points `R[b][i]` for each session `i` and provides that as its Pedersen commitment, receiving `R[a][i] + q[a][i] * H` in exchange.
* In the second phase, B delays each session, pretending to have Internet connectivity problems.
* A sends B the `q[a][i]` for all `i`.
* B computes `R[a][i]` for all `i` by subtracting `q[a][i] * H` from the Pedersen commitments given by A.
* B computes `R[target]` as `sum where i = 1 to n of R[a][i]`.
* B uses the Wagner Generalized Birthday Paradox technique to find `q[b][i]` with the following constraint:
  * First compute `R[selected][i]` as `R[a][i] +  R[b][i] - q[b][i] * H` for all `i`.
  * Then ensure this constraint: `h(R[target] | m[target]) == sum where i = 1 to n of h(R[selected][i] | m[i])`.
* B sends the `q[b][i]` found above.
* A computes `R[i]` as `R[a][i] + q[a][i] * H + R[b][i] - q[a][i] * H - q[b][i] * H` for all `i`.
  * This resolves down to `R[a][i] + R[b][i] - q[b][i] * H`, or `R[selected][i]`.
* A computes `s[a][i]` as `r[a][i] + h(R[selected][i] | m[i]) * a` for all `i`.
* B sums all `s[a][i]` for all `i` together, forming `(sum where i = 1 to n of r[a][i]) + (sum where i = 1 to n of h(R[selected][i] | m[i])) * a`.
  * This is also a valid signature on `m[target]`, since `sum where i = 1 to n of h(R[selected][i] | m[i])` equals `h(R[target] | m[target])`.

Thus, Pedersen commitments for phase 1 are insecure, as it allows counterparties to control `R`.

ElGamal Commitments in Phase 1
------------------------------

ElGamal commitments prevent B from just giving random `q[b][i]`, thus preventing the above Wagner attack.
However, it is still possible for B to control the resulting `R`, but in doing so this prevents the protocol from completing (thus, even with full control of `R`, B is still unable to promote this to an `R`-reuse attack or the above Wagner attack schema).
This is not quite as bad as the above case, but having just one participant control the nonce `R` should make us worry that other attacks may now become possible (unless we acquire some proof that this will be equivalent in security to the hash-using MuSig).

Briefly, with ElGamal commitments in Phase 1:

1. `R` commitment exchange.
  * A generates random numbers `r[a]` and `q[a]`, B generates random numbers `r[b]` and `q[b]`.
  * A computes its commitment as two points, `q[a] * G` and `r[a] * G + q[a] * H`, B computes its commitment as two points, `q[b] * G` and `r[b] * G + q[b] * H`.
    * `H` is a NUMS point used as a second standard generator.
    * Note that one point uses `q[] * G` while the other adds `q[] * H` to `r[] * G`.
  * They exchange their pairs of points.
2. `R` exchange.
  * They exchange `q[a]` and `q[b]`, and the points `r[a] * G` (== `R[a]`) and `r[b] * G` (== `R[b]`).
  * They validate the exchanged data from the previous `R` commitments.
  * They compute `R` as `R[a]` + `R[b]`.
3. `s` exchange.
  * Same as before.

B can attack this by delaying its phases as below:

1. `R` commitment exchange.
  * B generates random `q[selected]`.
  * B selects target `R[selected]`.
  * After receiving `q[a] * G` and `r[a] * G + q[a] * H`, B computes `q[selected] * G - q[a] * G` and `R[selected] + q[selected] * H - r[a] * G - q[a] * H` and sends those points as its own commitment.
2. `R` exchange.
  * After receiving `q[a]` and `R[a]`, B computes `q[b]` as `q[selected] - q[a]` and computes `R[b]` as `R[selected] - R[a]` and sends both as its decommitment.
  * The resulting `R` will now be `R[selected]` chosen by B.

`s` exchange cannot complete, as that would require that B know `r[selected] - r[a]` where `R[selected] = r[selected] * G`.
Even if B generates `R[selected]` from `r[selected]`, it does not know `r[a]`.
A would provide `r[a] + h(R[selected] | m) * h(L[ab]) * a`, but B would be unable to complete this signature.

The difference here is that B has to select `R[selected]` before it learns `R[a]`, and thus is unable to mount the above Wagner attack schema.
In particular, B first has to compute an `R[target]` equal to `sum where i = 1 to n of R[a][i]` across `n` sessions numbered `i`, in addition to selecting a message `m[i]`.
Then B has to perform a Wagner attack with the constraint `h(R[target] | m[target]) == sum where i = 1 to n of h(R[selected][i] | m[i])`
Fortunately for this scheme, B cannot determine such an `R[target]` before it has to select `R[selected]`, thus preventing this attack.

It may be possible that this scheme is safe, however I am not capable of proving it safe.

Acknowledgments
===============

I contacted Yannick Seurin, Andrew Poelstra, Pieter Wuille, and Greg Maxwell, the authors of MuSig, regarding this issue, and proposing to use Pedersen commitments for the first phase.
They informed me that Tim Ruffing had actually been thinking of similar issue before I did, and also pointed out that Pedersen commitments do not commit to `r * G`, only to `r` (i.e. would have to reveal `r` to serve as a verifiable commitment).
It seemed to me that the general agreement was that ElGamal commitments should work for committing to `r * G`.
However as I show above, this still allows a delaying participant to cancel the `R` contributions of the other parties, allowing it to control the aggregate `R` (though not quite to launch a Wagner attack).

`nickler` and `waxwing` on IRC confirmed my understanding of the attack on 2-phase MuSig.
`waxwing` also mentioned that the paper attacking 2-phase MuSig really attacks CoSi, and that the paper itself admits that given a knowledge-of-secret-keys, CoSi (and presumably 2-phase MuSig) would be safe.



More information about the bitcoin-dev mailing list