Dismantling the Data Availability Layer: The Overlooked Lego Bricks of the Modular Future

This article is original content from IOSG. It is intended only for industry learning and communication and does not constitute investment advice. If you cite it, please indicate the source. For reprints, please contact the IOSG team for authorization and instructions.

  • For the data availability of light clients, there is little objection to using erasure codes; the differences lie in how to ensure that the erasure codes are correctly encoded. KZG commitments are used in Polygon Avail and Danksharding, while fraud proofs are used in Celestia.

  • For the data availability of Rollups, if a DAC is understood as a consortium chain, then what Polygon Avail and Celestia have done is to make the data availability layer more decentralized: they effectively provide a “DA-specific” public chain that raises the level of trust.

  • In the next 3 to 5 years, the architecture of the blockchain will inevitably evolve from monolithic to modular, with each layer showing a low coupling state. In the future, providers of many modular components such as Rollup-as-a-Service (RaaS) and Data Availability-as-a-Service (DAaaS) may appear to realize the composability of the blockchain architecture. Modular blockchain is one of the important narratives underpinning the next cycle.

  • In the modular blockchain landscape, the execution layer is already divided among four major players, leaving little room for latecomers; the consensus layer is fiercely contested, with Aptos and Sui emerging. Although the competitive landscape of public chains has not yet settled, the narrative is old wine in a new bottle, and reasonable investment opportunities are hard to find. The value of the data availability layer remains to be discovered.

Modular Blockchain 

Before we talk about data availability, let’s take a moment to briefly review modular blockchains.

Image credit: IOSG Ventures, remodeled by Peter Watts

There is no strict definition of the layering of modular blockchains. Some layering methods start from Ethereum, while others tend to be generalized, depending on the context in which the discussion is conducted.

  • Execution layer: Two things happen at the execution layer. For a single transaction, the transaction is executed and the state changes; for a batch of transactions, the state root of the batch is calculated. Part of Ethereum’s execution work is now delegated to Rollups such as StarkNet, zkSync, Arbitrum, and Optimism.

  • Settlement layer: This can be understood as the process in which the Rollup contract on the main chain verifies the validity of the state root (zkRollup) or a fraud proof (Optimistic Rollup).

  • Consensus layer: Whether PoW, PoS, or another consensus algorithm is used, the job of the consensus layer is to reach agreement on something in a distributed system, namely on the validity of state transitions. In the modular context, the settlement layer and the consensus layer have similar meanings, so some researchers merge the two.

  • Historical state layer: Proposed by Polynya (for Ethereum only). After Proto-Danksharding is introduced, Ethereum will only maintain immediate data availability for a certain time window and will then prune the data, leaving long-term storage to others. Portal Network and other third parties that store this data can be classified in this layer.

  • Data availability layer: What is the problem with data availability, and what are the corresponding solutions? That is the focus of the rest of this article, so we do not summarize it here.

Image credit: IOSG Ventures

Back in 2018 and 2019, data availability was discussed mostly in the context of light clients; later, from the Rollup perspective, it took on another meaning. This article explains data availability in these two contexts: “nodes” and “Rollups”.

DA in Nodes

Image credit: https://medium.com/metamask/metamask-labs-presents-mustekala-the-light-client-that-seeds->

Let’s first look at the concepts of full nodes and light clients.

Since full nodes download and verify every transaction in every block themselves, they need no honesty assumption to be sure that the state was executed correctly, which gives them strong security guarantees. However, running a full node demands storage, computing power, and bandwidth. Apart from miners, ordinary users and applications have little incentive to run one. Moreover, if a node only needs to verify certain information on the chain, running a full node is unnecessary.

This is where light clients come in. “Light client” is a term defined in contrast to full nodes. Light clients usually do not interact with the chain directly; instead, they rely on neighboring full nodes as intermediaries and request information from them, for example to download block headers or verify account balances.

As a node, a light client can synchronize the entire chain quickly because it only downloads and verifies block headers. In the cross-chain bridge model, the light client takes the form of a smart contract: the light client on the target chain only needs to verify whether the tokens on the source chain are locked, without verifying every transaction on the source chain.

What’s the problem?

There is an implicit problem here: since light clients only download block headers from full nodes, rather than downloading and validating each transaction themselves, a malicious full node (block producer) can construct a block containing invalid transactions and send it to light clients to trick them.

An obvious remedy is the “fraud proof”: only one honest full node is needed to monitor block validity. When it finds an invalid block, it constructs a fraud proof and sends it to light clients to warn them. Alternatively, after receiving a block, a light client actively asks the network whether any fraud proof exists; if none arrives within a certain period, the block is treated as valid by default. In this way, light clients can achieve almost the same level of security as full nodes (while still relying on an honesty assumption).
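The "wait for a fraud proof, then accept by default" rule can be sketched in a few lines. This is only an illustration: the `LightClient` class, the `challenge_window` parameter, and the boolean `proof_is_valid` flag (standing in for real fraud-proof verification) are hypothetical names, not part of any specific client.

```python
from dataclasses import dataclass, field


@dataclass
class LightClient:
    """Toy light client that accepts a header once a challenge window has passed."""

    challenge_window: int                         # seconds to wait (hypothetical value)
    pending: dict = field(default_factory=dict)   # block hash -> time the header arrived
    rejected: set = field(default_factory=set)    # block hashes proven invalid

    def on_header(self, block_hash: str, now: int) -> None:
        # Record when the header was first seen; never re-admit a rejected block.
        if block_hash not in self.rejected and block_hash not in self.pending:
            self.pending[block_hash] = now

    def on_fraud_proof(self, block_hash: str, proof_is_valid: bool) -> None:
        # A real client would verify the proof itself; here the result is passed in.
        if proof_is_valid and block_hash in self.pending:
            del self.pending[block_hash]
            self.rejected.add(block_hash)

    def is_accepted(self, block_hash: str, now: int) -> bool:
        # Accept by default once the window has elapsed without a valid fraud proof.
        received = self.pending.get(block_hash)
        return received is not None and now - received >= self.challenge_window
```

A header received at time 0 with a 600-second window is not accepted at time 599, is accepted at time 600, and is never accepted if a valid fraud proof arrives first.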

However, the discussion above assumes that block producers always publish all block data, which is the basic premise for generating fraud proofs. A malicious block producer may instead hide part of the data when publishing a block. Full nodes try to download the whole block, so they can notice the problem and reject it; light clients, by their nature, cannot. And because the data is missing, full nodes cannot generate fraud proofs to warn light clients either.

Another possibility is that, perhaps because of network conditions, part of the data is only published later. In that case we cannot even tell whether the data was missing because of objective conditions or because the block producer withheld it on purpose, so the reward and punishment mechanism around fraud proofs cannot take effect either.

This is the problem of data availability in nodes that we are going to discuss.

‍Image source: https://github.com/ethereum/research/wiki/A-note-on->

The figure above shows two situations. In the first, a malicious block producer publishes a block with missing data, an honest full node raises an alarm, and the producer then publishes the remaining data. In the second, an honest block producer publishes the full block, but a malicious full node raises a false alarm. In both cases, the block data seen by everyone else in the network after T3 is complete, yet someone has misbehaved, and observers cannot tell who.

For this reason, relying on fraud proofs alone to ensure data availability for light clients is fragile.

Solution

In September 2018, Mustafa Al-Bassam (now CEO of Celestia) and Vitalik Buterin proposed, in a co-authored paper, using multi-dimensional erasure codes to check data availability: light clients only need to randomly download and verify a small portion of the data to be confident that all of the data is available, and the full data can be reconstructed if necessary.
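The sampling argument can be made concrete with a little arithmetic. The sketch below assumes the two-dimensional layout from that paper, in which a k × k block is extended to 2k × 2k shares and an attacker must withhold at least (k + 1)² shares to make the block unrecoverable. The function names are hypothetical, and the model ignores network effects and coordination between multiple light clients.

```python
from fractions import Fraction


def undetected_probability(k: int, samples: int) -> Fraction:
    """Chance that `samples` distinct random shares all miss the withheld region.

    Assumes a k x k block extended to 2k x 2k shares, with the attacker
    withholding the minimum (k + 1)^2 shares needed to block reconstruction.
    """
    total = (2 * k) ** 2
    withheld = (k + 1) ** 2
    available = total - withheld
    if samples > available:
        return Fraction(0)  # more samples than available shares: a miss is certain
    p = Fraction(1)
    for i in range(samples):
        # Sampling without replacement: each draw must land on an available share.
        p *= Fraction(available - i, total - i)
    return p


def samples_needed(k: int, confidence: float) -> int:
    """Smallest number of samples whose detection probability reaches `confidence`."""
    s = 0
    while float(1 - undetected_probability(k, s)) < confidence:
        s += 1
    return s
```

Because more than a quarter of the shares must be withheld, the chance of missing the gap after s samples is below (3/4)^s, which is why a small, fixed number of samples is enough regardless of block size.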

There is little objection to using erasure codes to solve the data availability problem for light clients: Reed-Solomon erasure codes are used in Polygon Avail, Celestia, and Ethereum’s Danksharding.

The difference lies in how to ensure that the erasure code is correctly encoded: KZG commitments are used in Polygon Avail and Danksharding, while fraud proofs are used in Celestia. Each has advantages and disadvantages: KZG commitments are not quantum-resistant, while fraud proofs rely on certain honesty and synchrony assumptions.

In addition to KZG commitments, there are schemes based on STARKs and FRI that can prove the correctness of erasure codes.
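To show what the erasure coding itself buys, here is a toy one-dimensional Reed-Solomon-style code over a small prime field. It is a sketch for intuition only: the modulus `P`, the function names, and the use of plain Lagrange interpolation are illustrative assumptions, and none of the projects above is claimed to implement it this way. It also does not cover the separate question of proving that the encoding was done correctly.

```python
P = 65537  # prime modulus of the toy field (an assumption for this sketch)


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points` (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data):
    """Extend k symbols (each < P) to 2k shares; the first k shares are the data itself."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [_interpolate_at(points, x) for x in range(k, 2 * k)]


def reconstruct(shares, k):
    """Recover the k original symbols from any k shares given as {index: value}."""
    if len(shares) < k:
        raise ValueError("need at least k shares to reconstruct")
    points = list(shares.items())[:k]
    return [_interpolate_at(points, x) for x in range(k)]
```

With four data symbols extended to eight shares, any four surviving shares, for example those at indices 1, 4, 6, and 7, are enough to rebuild the original data.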

DA in Rollup

In a Rollup, data availability means the following: in a zkRollup, anyone must be able to rebuild the Layer 2 state by themselves, which ensures censorship resistance; in an Optimistic Rollup, all Layer 2 data must be published, which is a prerequisite for constructing fraud proofs. So where is the problem?

‍Image source: https://forum.celestia.org/t/ethereum-rollup-call->

Let’s look at the cost structure of Layer 2. Apart from fixed costs, the variable costs that scale with the number of transactions per batch are mainly Layer 2 gas and on-chain data availability. The impact of the former is minimal; the latter requires paying 16 gas per byte and accounts for as much as 80%-95% of a Rollup’s cost.
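A rough back-of-the-envelope calculation shows how the per-byte charge comes to dominate. The sketch below uses the 16 gas per byte figure quoted above, which applies to non-zero calldata bytes; zero bytes cost 4 gas under EIP-2028. The function names and the `other_gas` figure are hypothetical and chosen only for illustration.

```python
NONZERO_BYTE_GAS = 16  # gas per non-zero calldata byte (the figure quoted above)
ZERO_BYTE_GAS = 4      # gas per zero calldata byte under EIP-2028


def calldata_gas(payload: bytes) -> int:
    """Gas paid just for posting `payload` as calldata on the main chain."""
    zeros = payload.count(0)
    return zeros * ZERO_BYTE_GAS + (len(payload) - zeros) * NONZERO_BYTE_GAS


def da_share(payload: bytes, other_gas: int) -> float:
    """Fraction of a batch's total gas that goes to data availability.

    `other_gas` stands for everything else in the batch (fixed overhead,
    proof verification and so on); it is a made-up input for this sketch.
    """
    da = calldata_gas(payload)
    return da / (da + other_gas)
```

For instance, a batch of 1,000 non-zero bytes costs 16,000 gas in calldata; if everything else in the batch costs 4,000 gas, data availability makes up 80% of the total.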

(On-chain) data availability is expensive. What can be done?

One option is to reduce the cost of storing data on-chain, which is the job of the protocol layer. In previous articles, we mentioned that Ethereum is considering introducing Proto-Danksharding and Danksharding to provide “big blocks” for Rollups, that is, more data availability space, and to adopt erasure coding and KZG commitments to address the resulting burden on nodes. From a Rollup’s perspective, however, passively waiting for Ethereum to adapt is unrealistic.

The second option is to put data off-chain. The following figure lists the current off-chain data availability solutions. The generalized solutions include Celestia and Polygon Avail; the user-selectable solutions offered by Rollups include StarkEx, zkPorter, and Arbitrum Nova.

Image credit: IOSG Ventures

(Note: Validium originally refers to a scaling scheme that combines zkRollup with off-chain data availability. For convenience, this article uses Validium to refer to off-chain data availability schemes in general and includes them in the comparison.)

Below we look at these options in detail.

DA Provided by Rollup

In the simplest Validium solution, a centralized data operator is responsible for ensuring data availability, and users must trust the operator not to misbehave. The benefit is low cost, but there are virtually no security guarantees.

As a result, in 2020 StarkEx proposed a Validium scheme maintained by a Data Availability Committee (DAC). DAC members are well-known individuals or organizations within legal jurisdictions, and the trust assumption is that they will not collude to do evil.

Arbitrum proposed AnyTrust this year; it likewise uses a data availability committee, and Arbitrum Nova is built on AnyTrust.
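The committee-based schemes above share one basic mechanic: a batch is only accepted once enough committee members have attested that they hold its data. The sketch below illustrates that quorum check under loud assumptions: HMAC tags stand in for the public-key signatures a real committee would use, and the function names and the `quorum` parameter are hypothetical. The actual thresholds and signature schemes of StarkEx and AnyTrust are not specified here.

```python
import hashlib
import hmac


def attest(member_key: bytes, batch: bytes) -> bytes:
    """A member's attestation that it holds `batch` (HMAC stands in for a signature)."""
    digest = hashlib.sha256(batch).digest()
    return hmac.new(member_key, digest, hashlib.sha256).digest()


def quorum_reached(batch: bytes, attestations: dict, member_keys: dict, quorum: int) -> bool:
    """Return True if at least `quorum` known members gave a valid attestation for `batch`.

    attestations: member name -> attestation tag
    member_keys:  member name -> secret key (toy model; real systems verify public keys)
    """
    digest = hashlib.sha256(batch).digest()
    valid = 0
    for member, tag in attestations.items():
        key = member_keys.get(member)
        if key is None:
            continue  # ignore attestations from non-members
        expected = hmac.new(key, digest, hashlib.sha256).digest()
        if hmac.compare_digest(expected, tag):
            valid += 1
    return valid >= quorum
```

The security of such a scheme rests entirely on the committee's keys and honesty, which is the weakness the next section turns to.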

zkPorter proposes that Guardians (zkSync token holders) maintain data availability. They must stake zkSync tokens, and if a data availability failure occurs, the staked funds are slashed.

All three offer an option called Volition: users can freely choose on-chain or off-chain data availability as needed, trading off security against cost according to the specific use case.

Image source: https://blog.polygon.technology/from-rollup-to-validium-with-polygon-avail/

General DA Scenarios

The schemes above rest on one idea: since an ordinary operator’s reputation is not strong enough, a more authoritative committee is introduced to improve credibility.

Is a small committee safe enough? Two years ago, the Ethereum community raised the issue of ransom attacks on Validium: if enough committee members’ private keys were stolen to make the off-chain data unavailable, the attacker could hold users to ransom, allowing Layer 2 withdrawals only after a sufficient ransom was paid. Given the thefts at Ronin Bridge and Harmony Horizon Bridge, this possibility cannot be ignored.

Since an off-chain data availability committee is not sufficiently secure, what if a blockchain is introduced as the trusted party that ensures off-chain data availability?

If the aforementioned DAC is understood as a consortium chain, then what Polygon Avail and Celestia do is make the data availability layer more decentralized: they effectively provide a “DA-specific” public chain, with its own validator nodes, block producers, and consensus mechanism, to raise the level of trust.

Beyond the security improvement, if the data availability layer is itself a chain, it need not be limited to serving a single Rollup or chain; it can act as a generalized solution.

Image source: https://blog.celestia.org/celestiums/

We take Celestia’s Quantum Gravity Bridge, applied to an Ethereum Rollup, as an example. The L2 contract on the Ethereum main chain verifies validity proofs or fraud proofs as usual; the difference is that data availability is provided by Celestia. There are no smart contracts on the Celestia chain and no computation is performed on the data; the chain only guarantees that the data is available.

The L2 operator publishes the transaction data to the Celestia main chain. Celestia validators sign the Merkle root of the DA attestation and send it to the DA bridge contract on the Ethereum main chain for verification and storage.

In this way, the Merkle root of the DA attestation stands in as proof that all of the data is available. The DA bridge contract on the Ethereum main chain only needs to verify and store this Merkle root, so the overhead is greatly reduced.
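The reason a single root is cheap to store yet still useful is that any individual piece of data can later be checked against it with a short proof. The sketch below shows this with a plain binary SHA-256 Merkle tree. It is a generic illustration, not Celestia’s actual tree construction or the bridge contract’s logic; the function names, the prefix bytes, and the rule of duplicating the last node on odd levels are assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Hash adjacent pairs; 0x01 marks an inner node, 0x00 (used below) marks a leaf.
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root of a binary Merkle tree over `leaves` (last node duplicated on odd levels)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf `index` up to the root, each with a 'sibling is left' flag."""
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Check that `leaf` is committed to by `root`, using only the short proof."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root
```

A verifier holding only the 32-byte root can thus confirm membership of any leaf with a proof whose length grows logarithmically with the number of leaves.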

(Note: Other data availability schemes include Adamantium and EigenLayr. In the Adamantium scheme, users can choose to host their own off-chain data and sign after each state transition to confirm that the data is available; otherwise, their funds are automatically sent back to the main chain to ensure security. Alternatively, users can freely choose a data provider. EigenLayr is an academic solution that proposes the Coded Merkle Tree and the data availability oracle ACeD. We will not discuss them here.)

Summary

Image source: IOSG Ventures, based on Celestia Blog

Having discussed the solutions above one by one, we now compare them side by side in terms of security/decentralization and gas cost. Note that this chart reflects only the author’s personal understanding; it is a rough division rather than a quantitative comparison.

Pure Validium, in the lower left corner, has the lowest security/decentralization and the lowest gas cost.

In the middle are the DAC schemes of StarkEx and Arbitrum Nova, the Guardian validator set scheme of zkPorter, and the generalized Celestia and Polygon Avail schemes. The author believes that zkPorter’s use of Guardians as a validator set is slightly more secure/decentralized than a DAC, and that a DA-specific blockchain is slightly more so than a validator set. Gas cost rises accordingly. Again, this is only a rough comparison.

The box in the upper right corner contains the on-chain data availability schemes, which have the highest security/decentralization and the highest gas cost. Within the box, since data availability in all three schemes is provided by the Ethereum main chain, they share the same degree of security/decentralization. Compared with using Ethereum alone, a pure Rollup scheme clearly costs less gas, and after Proto-Danksharding and Danksharding are introduced, the cost of data availability will fall further.

Note: Most of the discussion of data availability in this article is set in the Ethereum context. Celestia and Polygon Avail, however, are general solutions that are not limited to Ethereum.

Finally, we summarize the above solutions in the table.

Image credit: IOSG Ventures

Closing Thoughts

  1. Having discussed the data availability issues above, we find that all of the solutions are essentially trade-offs under the mutual constraints of the trilemma, and they differ in how fine-grained those trade-offs are.

  2. From a user’s perspective, it is reasonable for a protocol to offer both on-chain and off-chain data availability, because users in different application scenarios and user groups differ in their sensitivity to security and cost.

  3. The discussion above focuses on how the data availability layer supports Ethereum and Rollups. In cross-chain communication, Polkadot’s relay chain provides native data availability guarantees for its parachains; Cosmos IBC relies on the light client model, so it is crucial that light clients can verify the data availability of both the source chain and the target chain.

    The benefits of modularity lie in pluggability and flexibility: components can be adapted to a protocol as needed. For example, the data availability burden can be lifted from Ethereum while security and trust levels are preserved, or the security of the light client communication model in a multi-chain ecosystem can be raised by lowering its trust assumptions. Data availability is not limited to Ethereum; it can also play a role in multi-chain ecosystems and in more application scenarios in the future.

  4. We believe that in the next 3 to 5 years, the architecture of the blockchain will inevitably evolve from monolithic to modular, with each layer showing a low coupling state. In the future, providers of many modular components such as Rollup-as-a-Service (RaaS) and Data Availability-as-a-Service (DAaaS) may appear to realize the composability of the blockchain architecture. Modular blockchain is one of the important narratives underpinning the next cycle.

    Among these layers, the highly valued giants of the execution layer (i.e., Rollups) have already divided the market four ways, leaving little room for latecomers; the consensus layer (i.e., the various Layer 1s) is fiercely contested. With public chains such as Aptos and Sui beginning to emerge, the competitive landscape of public chains has not yet settled, but the narrative is old wine in new bottles, and reasonable investment opportunities are hard to find.

    The value of the data availability layer remains to be discovered.

Posted by:CoinYuppie,Reprinted with attribution to:https://coinyuppie.com/dismantling-the-data-availability-layer-the-overlooked-lego-bricks-of-the-modular-future/