[bitcoin-dev] Merkle branch verification & tail-call semantics for generalized MAST

Mon Oct 2 17:15:38 UTC 2017

(Subject was: [bitcoin-dev] Version 1 witness programs (first draft)), but
I'm moving part of that conversation to this thread.

On Sun, Oct 1, 2017 at 5:32 PM, Johnson Lau <jl2012 at xbt.hk> wrote:

> 3. Do we want to allow static analysis of sigop?
> BIP114 and the related proposals are specifically designed to allow static
> analysis of sigop. I think this was one of the main reason of OP_EVAL not
> being accepted. This was also the main reason of Ethereum failing to do a
> DAO hacker softfork, leading to the ETH/ETC split. I’m not sure if we
> really want to give up this property. Once we do it, we have to support it
> forever.

I would very much like to retain the ability to do static analysis.  More
generally, the idea of interpreting arbitrary data as code, as done in
OP_EVAL and in TAILCALL, makes me quite anxious.  This at the root of many
security problems throughout the software industry, and I don't relish
giving more fuel to the underhanded Bitcoin Script contestants.

On Sun, Oct 1, 2017 at 8:45 PM, Luke Dashjr <luke at dashjr.org> wrote:

> > 3. Do we want to allow static analysis of sigop?
> > BIP114 and the related proposals are specifically designed to allow
> static
> > analysis of sigop. I think this was one of the main reason of OP_EVAL not
> > being accepted. This was also the main reason of Ethereum failing to do a
> > DAO hacker softfork, leading to the ETH/ETC split. I’m not sure if we
> > really want to give up this property. Once we do it, we have to support
> it
> > forever.
>
> It seems inevitable at this point. Maybe we could add a separate
> "executable-
> witness" array (in the same manner as the current witness was softforked
> in),
> and require tail-call and condition scripts to merely reference these by
> hash,
> but I'm not sure it's worth the effort?
>
> Thinking further, we could avoid adding a separate executable-witness
> commitment by either:
> A) Define that all the witness elements in v1 are type-tagged (put the
> minor
>    witness version on them all, and redefine minor 0 as a stack item?); or
> B) Use an empty element as a delimiter between stack and executable items.
>
> To avoid witness malleability, the executable items can be required to be
> sorted in some manner.
>
> The downside of these approaches is that we now need an addition 20 or 32
> bytes per script reference... which IMO may possibly be worse than losing
> static analysis. I wonder if there's a way to avoid that overhead?
>

Actually, I have a half-baked idea I've been thinking about along these
lines.

The idea is to add a flag to each stack item in the Script interpreter to
mark whether the item in the stack is "executable" or "non-executable", not
so different from how computers mark pages to implement executable space
protection.  By default, all stack items are marked "non-executable".  We
then redefine OP_PUSHDATA4 as OP_PUSHCODE within ScriptSigs.  The
operational semantics of OP_PUSHCODE would remain the same as OP_PUSHDATA4
except it would set the pushed item's associated flag to "executable".  All
data pushed by OP_PUSHCODE would be subject to the sigops limits and any
other similar static analysis limits.

Segwit v0 doesn't use OP_PUSHDATA codes to create the input stack, so we
would have to mark executable input stack items using a new witness v1
format. But, IIUC, TAILCALL isn't going to be compatible with Segwit v0
anyway.

During a TAILCALL, it is required that the top item on the stack have the
"executable" flag, otherwise TAILCALL is not used (and the script succeeds
or fails based on the top item's data value as usual).

All other operations can treat "executable" items as data, including the
merkle branch verification.  None of the Script operations can create
"executable" items; in particular, OP_PUSHDATA4 within the ScriptPubKey
also would not create "executable" items.  (We can talk about the behaviour
of OP_CAT when that time comes).

One last trick is that when "executable" values are duplicated, by OP_DUP,
OP_IFDUP, OP_PICK. then the newly created copy of the value on top of the
stack is marked "non-executable".

Because we make the "executable" flag non-copyable, we are now free to
allow unbounded uses of TAILCALL (i.e. TAILCALL can be used multiplie times
in a single input).  Why is this safe?  Because the number of "executable"
items decreases by at least one every time TAILCALL is invoked. the number
of OP_PUSHCODE occurrences in the witness puts an upper bound on the number
of invocations of TAILCALL allowed.  Using static analysis of the script
pubkey and the data within the OP_PUSHCODE data, we compute an upper bound
on the number of operations (of any type) that can occur during execution.

Unbounded TAILCALL should let us (in the presence of OP_CHECKSIGFROMSTACK)
have unbounded delegation.

Overall, I believe that OP_PUSHCODE

1. is fully backwards compatible.
2. maintains our ability to perform static analysis with TAILCALL.
3. never lets us interpret computed values as executable code.
4. extends TAILCALL to safely allow multiple TAILCALLs per script.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20171002/794f9cab/attachment-0001.html>