Liquid and Taproot Activation

Blockstream posted today about Elements 0.21 and activating taproot on the Liquid sidechain. I think that’s worth talking about in a couple of ways.

First is that it’s another consensus update, scheduled about six weeks after the previous “dynafed” update. That update failed fairly badly, causing almost a full day’s downtime for the network, during which no blocks could be generated. There was an additional consensus bug in the development version of elements that prevented it from following the new chain once block signing resumed, though the most recent released version of elements was able to validate blocks without a problem.

I think there are probably a few problems that played a part in causing that train-wreck.

First is that the block signing code for liquid is proprietary. It’s not quite clear to me whether that’s proprietary to Blockstream, or a shared trade secret between Blockstream and the functionaries that use the code and do the signing, but either way it’s code that’s not included in elements, and not something that is widely available and able to be tested thoroughly before it’s used in deployment. That’s probably a legitimate tradeoff to make: keeping the signing mechanism secret is security by obscurity, but provided obscurity is not the only protection, it can still be a valuable additional measure; and additionally, selling a secure way of allowing the functionaries to coordinate around signing the sidechain blocks is (to my understanding) what makes this a business for Blockstream. So I think the conclusion there is that if it’s possible to open more of the block signing code, and then better automate testing of it, that’s great; but it may well not be reasonable to do that, and if so, it should be treated as a much higher-risk module than it seems to have been.

A simple way to mitigate that risk is in fork design. One of the principles we apply in Bitcoin soft forks is ensuring that we don’t break any mining hardware when introducing consensus changes: people have made large, real capital investments, and a software change that devalues that investment isn’t a great way of building mutual trust. We had an instance of exactly that occur in taproot signalling, where a modest amount of hashpower simply wasn’t able to signal for activation; and I’d argue that was the fundamental cause of many of the difficulties with segwit — it (unintentionally) reduced the value of significant amounts of capital investment due to being incompatible with covert ASICBoost.

So I think the second factor in giving rise to the dynafed activation issues was not taking enough advantage of that philosophy.

In the context of a hard-fork — which means accepting blocks that would previously have been unacceptable — a simple way to implement the same principle is to make it a pure hard-fork: that is, make sure you accept any block that would have been acceptable under the old rules, so that if it does turn out you have a bug, you can just keep building blocks as if the hard fork never happened. That way, rather than the chain dying until a fix is rolled out, you can keep building blocks, just without using the new features you were hoping to enable. This is complicated by the fact that, as a hard fork, it is not possible to continue running old validation software once a single block using the new features has been accepted; and because liquid has a two-block finality rule, reorgs of more than one block are not acceptable.
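To make the “pure hard-fork” idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the Block fields and the three functions are illustrative names, not anything taken from elements or the functionary software. It only shows the shape of the acceptance rule, where the new rules accept a superset of what the old rules accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Toy block; both fields are hypothetical stand-ins for real checks."""
    well_formed: bool         # passes every check the two rule sets share
    uses_new_features: bool   # carries data only the new rules understand


def valid_under_old_rules(block: Block) -> bool:
    return block.well_formed and not block.uses_new_features


def valid_under_new_rules(block: Block) -> bool:
    return block.well_formed


def accept_block(block: Block, fork_active: bool) -> bool:
    """Pure hard fork: anything the old rules accepted stays acceptable."""
    if valid_under_old_rules(block):
        return True
    return fork_active and valid_under_new_rules(block)


old_style = Block(well_formed=True, uses_new_features=False)
new_style = Block(well_formed=True, uses_new_features=True)

# After activation, signers that hit a bug can fall back to old-style blocks.
print(accept_block(old_style, fork_active=True))
# New-style blocks are accepted only once the fork is active.
print(accept_block(new_style, fork_active=True))
print(accept_block(new_style, fork_active=False))
```

A fork that instead rejected old_style once fork_active is true would be the fragile case: a bug in the new code path then leaves no valid block to build.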

Without being able to see the block signer code, it’s hard to suggest specifics here, but the fact that a majority of “nine functionaries, running an earlier version of the functionary software, reported errors but did not terminate” suggests to me that it should have been possible to design dynafed in a way that failed more gracefully than it did: perhaps by making it so a single non-upgraded functionary proposing non-dynafed blocks would be able to have its blocks signed by a majority of functionaries, with no observed downtime; or by making it so a quick downgrade by a majority of functionaries was enough to continue producing valid blocks, rather than an emergency patch having to be written, validated and deployed.

Another standard in the blockchain world is to have a live testnet: somewhere you can deploy code changes and test them out before they start affecting anything worth real money. To the best of my knowledge liquid doesn’t have a testnet anymore. There was originally “Elements Alpha” but that was discontinued at some point (probably because Bitcoin’s testnet isn’t really reliable enough to use with liquid’s peg-in/peg-out feature), and you can spin up your own “liquidv1test” test environment for local use, but that doesn’t test the proprietary block signing code. Testnets certainly aren’t a silver bullet for bugs, and you’d need to put some thought into how to partition block signing for the testnet from liquid itself to prevent potential exploits, while also keeping the block signing code itself appropriately secret. Those seem like solvable problems, however, and worth the effort in order to detect consensus bugs earlier. So this is perhaps a third approach that could have caught this bug before it caused problems.

That’s not to say any of that is necessarily a crisis for liquid, or something that should necessarily be their single highest priority to fix. Rather, it seems to be about par for the course: Solana had 17 hours of downtime a month ago due to increased transaction volume, Infura had a 7-hour outage last year due to an unexpected consensus incompatibility, and Ethereum had a consensus bug that was exploited to cause a chain split affecting just over 50% of nodes and a minority of miners. But on the other hand, if liquid isn’t trying to be substantially more reliable/robust than those alternatives, what’s its advantage over them?

I think Blockstream and the Liquid Federation need to step up their game here… Though, to be fair, I’d say the same about everything else in this space, including Bitcoin.

Anyway, the activation of taproot on liquid will be quite different. It’s a soft-fork that only affects transactions, so it should be possible for it to fall back cleanly if there ends up being a problem, and many of the updates for taproot have been ported from Bitcoin, and have been reasonably well tested there. On the other hand there are two substantial sets of consensus features that aren’t in Bitcoin that will be in liquid: one is a variety of additional opcodes which should be quite interesting, and the other is changes to signature hashing to support liquid’s pegged L-BTC and confidential assets features. (There are also various wallet features added to liquid that don’t have the same failure modes as consensus changes.)

I think those changes should have had more review. They have an in-repo document explaining the tapscript additions, rather than something like a BIP or EIP proposing the changes, for example, and they’ve been merged relatively quickly. I expect one consequence of that is that tapscript for liquid and tapscript for Bitcoin will diverge: liquid has claimed a bunch of opcodes for these new features, and I expect that will start to conflict with Bitcoin wanting to claim opcodes for its own features. With a bit more time spent on actively seeking feedback on the proposal, that could have been avoided pretty easily, but, oh well. There’s always a risk those changes could result in a consensus incompatibility of some sort, though I think it’s low in this case. There’s a much bigger risk it could result in buggy smart contracts and thus loss of funds. Maybe that’s just a “buyer beware” thing, but if using liquid isn’t substantially safer than transacting on random other multicoin blockchains, then again, what’s the point? (Perhaps confidentiality is enough; I’m not sure if any of the other chains that do multiple assets also do confidential transactions on chain.)

The other thing about how they’re deploying taproot compared to how they deployed dynafed is the activation parameters. Dynafed used the default activation parameters of 75% signalling over a 144 block period (so about 2.4 hours, given liquid generates blocks every minute) and had a special rule that would require functionaries to explicitly enable signalling via the -con_dyna_deploy_signal parameter (not setting that parameter on non-mining nodes also results in an erroneous “unknown new rules activated” warning). Activation doesn’t occur until after a locked-in period, so another 2.4 hours after signalling reaches the threshold.
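The arithmetic behind those dynafed numbers can be made explicit with a short Python sketch. The function name is hypothetical, and rounding the threshold up to a whole number of blocks is an assumption on my part rather than something checked against the elements source.

```python
from datetime import timedelta

BLOCK_INTERVAL = timedelta(minutes=1)  # liquid's block time, per the text


def activation_timing(period_blocks: int, threshold_percent: int):
    """Return (blocks needed, period length, fastest time to activation)."""
    # Round the threshold up to a whole number of blocks (an assumption).
    blocks_needed = (period_blocks * threshold_percent + 99) // 100
    period = period_blocks * BLOCK_INTERVAL
    # One period to reach the threshold plus one locked-in period is the
    # fastest possible path from the start of signalling to activation.
    fastest = 2 * period
    return blocks_needed, period, fastest


blocks_needed, period, fastest = activation_timing(144, 75)
print(blocks_needed, period, fastest)
```

For the dynafed parameters this works out to 108 signalling blocks in a period of 2 hours 24 minutes, and a little under 5 hours from the start of a successful signalling period to activation.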

By contrast, liquid’s taproot activation has customised parameters requiring 100% signalling over 10080 blocks (exactly one week) and signalling will occur by default for any functionary that upgrades. A locked-in period is also required, so activation is then delayed for a week after the 100% signalling threshold is reached.

That setup means that any functionary that is not upgraded by the time signalling begins (presumably on the 1st of November) will delay activation by a week, and means that if any functionary finds a problem with the taproot activation on liquid, they can unilaterally prevent activation indefinitely. All in all, probably fine, but a pretty big change from how dynafed was activated.
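The delay and veto behaviour falls straight out of the 100% threshold, which a small Python sketch can show. This is a deliberately simplified, BIP9-style model: the function name is hypothetical, and it ignores start times and any timeout the real deployment may have.

```python
from typing import Optional, Sequence


def first_active_period(signals: Sequence[int], threshold: int) -> Optional[int]:
    """Index of the first period in which the new rules are enforced.

    signals[i] is the number of signalling blocks in period i. If period i
    is the first to reach the threshold, period i + 1 is locked in and
    period i + 2 is active. Returns None if no period reaches the threshold.
    """
    for i, count in enumerate(signals):
        if count >= threshold:
            return i + 2
    return None


PERIOD = 10080  # blocks per period: one week at one block per minute

# Everyone upgraded before signalling starts: active in the third week.
print(first_active_period([PERIOD], PERIOD))
# A single non-signalling block in the first week costs a full extra week.
print(first_active_period([PERIOD - 1, PERIOD], PERIOD))
# One functionary withholding a single signal every week blocks activation.
print(first_active_period([PERIOD - 1] * 52, PERIOD))
# Under a 75% threshold the same single missing signal would not matter.
print(first_active_period([PERIOD - 1], PERIOD * 75 // 100))
```

Periods are counted from zero, so a result of 2 means the rules take effect two weeks after signalling begins.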
