Introducing EIP-4444: Limiting the Historical Data Kept by Execution-Layer Clients

EIP-4444 proposes setting HISTORY_PRUNE_EPOCHS to 82,125 epochs (one year on the beacon chain). Execution-layer clients on proof-of-stake Ethereum would then stop serving block headers, block bodies, and receipts older than one year on the p2p network, and could prune that historical data locally. One of the authors of this EIP, @lightclients, introduced the proposal in a Twitter thread, and this article is a translation of that thread.
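The one-year figure can be checked with a little arithmetic. The sketch below assumes the beacon chain mainnet timing constants (12-second slots, 32 slots per epoch); the helper names prune_window_seconds and is_prunable are illustrative only and do not come from the EIP.

```python
# Beacon chain mainnet timing constants (assumed here, not quoted from the article).
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

# Value proposed by EIP-4444, as stated above.
HISTORY_PRUNE_EPOCHS = 82_125


def prune_window_seconds(epochs: int = HISTORY_PRUNE_EPOCHS) -> int:
    """Length of the retention window in seconds."""
    return epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def is_prunable(block_timestamp: int, now: int) -> bool:
    """Hypothetical helper: True if a block is older than the retention window."""
    return now - block_timestamp > prune_window_seconds()


window = prune_window_seconds()
print(window)            # 31536000
print(window // 86_400)  # 365
```

So 82,125 epochs is exactly 365 days of 86,400 seconds each.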

Ethereum clients currently store 275 GB of historical data that is not needed to validate the chain, and this figure is growing by 140 GB per year. EIP-4444 proposes that clients prune data older than one year. So why don’t we simply prune the data right away?

To understand why the data has not been pruned, and why this needs to be discussed, it is necessary to understand how historical data is used today. There are two main usage categories: synchronization and user requests via JSON-RPC.

There are two main approaches to synchronization:

  • Full Sync: Download and execute every block from genesis to the head of the chain.
  • State Sync: There are many variants, but the main one is to sync the block headers with a proof-of-work check and download the state of the latest block.

In both cases, clients request historical data over the p2p network to extend their view of the chain. The trust model is usually to trust the genesis state and then verify everything else, either fully or lightly through proof-of-work checks.
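To make "trust genesis, verify everything else" concrete, here is a minimal sketch of the linkage part of light verification: each header must commit to the hash of its parent, and the first header must match a trusted genesis hash. The Header type and its SHA-256 encoding are simplified stand-ins, not Ethereum's real header format, and the proof-of-work check itself is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Toy header: real Ethereum headers are RLP-encoded and hashed with Keccak-256."""
    number: int
    parent_hash: str
    payload: str

    def hash(self) -> str:
        encoded = f"{self.number}|{self.parent_hash}|{self.payload}".encode()
        return hashlib.sha256(encoded).hexdigest()


def verify_chain(trusted_genesis_hash: str, headers: list) -> bool:
    """Check that headers start at the trusted genesis and link parent to child."""
    if not headers or headers[0].hash() != trusted_genesis_hash:
        return False
    for parent, child in zip(headers, headers[1:]):
        if child.number != parent.number + 1:
            return False
        if child.parent_hash != parent.hash():
            return False
    return True


genesis = Header(0, "0" * 64, "genesis")
block1 = Header(1, genesis.hash(), "first block")
block2 = Header(2, block1.hash(), "second block")
print(verify_chain(genesis.hash(), [genesis, block1, block2]))  # True
```

A header that does not commit to its parent, or a chain that starts from a different genesis, fails the check.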

Proof of stake changes this. Because it is vulnerable to long-range attacks, we must rely on “weak subjectivity checkpoints”. This essentially means that we trust a block on the canonical chain in the same way we trust the genesis block under PoW.

Weak subjectivity checkpoints allow a client to skip the bootstrapping step of requesting historical data over the p2p network. Of course, the client still needs to sync the data that comes after the checkpoint, so the checkpoint should always be more recent than the pruning boundary.
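The constraint can be read as follows: a client needs every block after its checkpoint, so the checkpoint has to fall inside the window of history that peers still serve. A minimal sketch of that check, measured in epochs; the function names are hypothetical and not taken from the EIP or any client.

```python
HISTORY_PRUNE_EPOCHS = 82_125  # value proposed by EIP-4444


def prune_boundary_epoch(current_epoch: int) -> int:
    """Oldest epoch whose data peers are still expected to serve."""
    return max(0, current_epoch - HISTORY_PRUNE_EPOCHS)


def checkpoint_is_syncable(checkpoint_epoch: int, current_epoch: int) -> bool:
    """True if all data after the checkpoint is still inside the retention window."""
    boundary = prune_boundary_epoch(current_epoch)
    return boundary <= checkpoint_epoch <= current_epoch


current = 100_000
print(prune_boundary_epoch(current))            # 17875
print(checkpoint_is_syncable(99_000, current))  # True
print(checkpoint_is_syncable(10_000, current))  # False
```

A checkpoint older than the boundary would leave a gap of blocks that no peer is obliged to serve.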

This sounds like a step backwards in security. Previously, we had a genesis hash from July 30, 2015 to verify against. Now, what we have is a weak subjectivity checkpoint that keeps moving. But in fact, we have always relied on weak subjectivity.

When was the last time you reviewed the code differences between client versions? Most people do not have the technical background to do this. Therefore, every time you update your client, you are relying on your client team to faithfully implement the Ethereum protocol.

Fortunately, many people keep a close eye on software like go-ethereum. It takes only one whistleblower to expose a malicious commit in the code. Similarly, it takes only one whistleblower to point out that a client has shipped a malicious weak subjectivity checkpoint.

In fact, it is much easier to verify that a client ships the correct weak subjectivity checkpoint than to make sure its code implements the protocol correctly.

Therefore, from a security point of view, there is actually no regression. That covers synchronization. The other main category of use for historical data is serving user requests.

Users can request two types of data:

  • Current data, such as the value of a storage slot, an account balance, the latest block height, etc.
  • Historical data, such as the value of a storage slot at block N, the header of block N, transaction receipts, etc.

Current data will continue to be accessible. Once EIP-4444 is implemented, whether a piece of historical data can be accessed depends on how old it is.

The main users of historical data are dapp developers. Many dapps index historical data into their own databases and serve it to users through their front ends. For them, it is important to be able to traverse all transactions and logs.

There are multiple ways to support this use case. Currently the most favored one is to release a client multiplexer, in which each client version supports a certain range of blocks and executes the blocks in that range. For example, geth version A may support blocks up to a block height of 10m, while geth version B supports blocks after 10m.

The multiplexer will use version A to execute blocks 0 to 10m, export the state database and import it into geth version B, and then continue executing the blocks after 10m. Each JSON-RPC request will be routed to the client version that can answer it.
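The routing idea can be sketched as a lookup from block height to client version. Everything below is hypothetical: the ClientVersion and Multiplexer names, and the exact split at block 10,000,000, are illustrative and do not describe any released geth tooling.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientVersion:
    name: str
    first_block: int  # first block height this version can execute


class Multiplexer:
    """Hypothetical router that picks the client version covering a block height."""

    def __init__(self, versions):
        self.versions = sorted(versions, key=lambda v: v.first_block)
        self.starts = [v.first_block for v in self.versions]
        if not self.starts or self.starts[0] != 0:
            raise ValueError("some version must cover the chain from block 0")

    def route(self, block_number: int) -> ClientVersion:
        if block_number < 0:
            raise ValueError("block number must be non-negative")
        # The right-most version whose first_block is <= block_number.
        index = bisect_right(self.starts, block_number) - 1
        return self.versions[index]


mux = Multiplexer([
    ClientVersion("geth version A", 0),           # blocks 0 through 10m
    ClientVersion("geth version B", 10_000_001),  # blocks after 10m
])
print(mux.route(9_999_999).name)   # geth version A
print(mux.route(10_000_000).name)  # geth version A
print(mux.route(10_000_001).name)  # geth version B
```

The same lookup could serve both block execution and JSON-RPC request routing, since each only needs to know which version owns a given height.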

However, if historical blocks are no longer available on the p2p network, who will provide the data? It is expected that many large, trusted institutions will host mirrors of this data. Since the data is static, it is easy to agree on and to verify by its hash. This is a 1-of-N trust model.
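Under a 1-of-N model, a client only needs one honest mirror, because every download can be checked against a hash it already trusts. A minimal sketch, assuming mirrors are plain callables and using SHA-256 as a stand-in for Ethereum's Keccak-256 (which is not in the Python standard library); fetch_verified is a hypothetical name.

```python
import hashlib


def digest(data: bytes) -> str:
    """Stand-in content hash (SHA-256 here, not Ethereum's Keccak-256)."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(expected_hash: str, mirrors):
    """Return (mirror_name, data) from the first mirror whose data matches the hash.

    `mirrors` is a list of (name, fetch) pairs, where fetch() returns bytes,
    or None if the mirror is unavailable.
    """
    for name, fetch in mirrors:
        data = fetch()
        if data is not None and digest(data) == expected_hash:
            return name, data
    raise LookupError("no mirror returned data matching the expected hash")


archive = b"historical block bodies and receipts"
expected = digest(archive)

mirrors = [
    ("tampering-mirror", lambda: b"altered history"),
    ("offline-mirror", lambda: None),
    ("honest-mirror", lambda: archive),
]
name, data = fetch_verified(expected, mirrors)
print(name)  # honest-mirror
```

Dishonest or unavailable mirrors cost the client extra downloads, but they cannot make it accept wrong data.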

The new default will be to not store historical data or run a client multiplexer. This means that the standard storage footprint of an Ethereum client will shrink by 275 GB. There is one last issue that needs to be mentioned, though.

Currently, when the requested data does not exist, Ethereum’s JSON-RPC returns an empty response. Assuming that the client is not syncing, this is taken to mean “this data does not exist in the canonical chain or in the most recent fork”.

Once clients start pruning old data, this invariant will be broken. When a user requests the receipt for a particular transaction, the client will not know whether the receipt was pruned or never existed. For now, the expectation is that the RPC returns an empty response in both cases.

I would love to get feedback on this approach. What do users of JSON-RPC think about it? How often do you access historical data that is more than 1 year old? Another (though heavier) option is to maintain an index of the hashes of the pruned data, so that more information can be returned to the user.
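The heavier alternative can be sketched as a store that remembers the hashes of what it pruned, so that a lookup can tell "pruned" apart from "never existed". The ReceiptStore class and the Lookup states are hypothetical illustrations, not part of the JSON-RPC specification or of any client.

```python
from enum import Enum


class Lookup(Enum):
    FOUND = "found"
    PRUNED = "pruned"    # existed once, but was removed locally
    UNKNOWN = "unknown"  # never seen by this client


class ReceiptStore:
    """Hypothetical receipt store that keeps an index of pruned transaction hashes."""

    def __init__(self):
        self.receipts = {}
        self.pruned_hashes = set()

    def add(self, tx_hash: str, receipt: dict) -> None:
        self.receipts[tx_hash] = receipt

    def prune(self, tx_hash: str) -> None:
        # Drop the receipt body but remember that the hash once existed.
        if self.receipts.pop(tx_hash, None) is not None:
            self.pruned_hashes.add(tx_hash)

    def get(self, tx_hash: str):
        if tx_hash in self.receipts:
            return Lookup.FOUND, self.receipts[tx_hash]
        if tx_hash in self.pruned_hashes:
            return Lookup.PRUNED, None
        return Lookup.UNKNOWN, None


store = ReceiptStore()
store.add("0xaaa", {"status": 1})
store.add("0xbbb", {"status": 0})
store.prune("0xaaa")

print(store.get("0xaaa")[0].value)  # pruned
print(store.get("0xbbb")[0].value)  # found
print(store.get("0xccc")[0].value)  # unknown
```

Without the pruned_hashes index, the first and third lookups would be indistinguishable, which is exactly the ambiguity described above. The cost is that the index itself grows with every pruned item.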

The 275 GB figure comes from the output of geth db inspect (shown as a screenshot in the original post).


The official EIP-4444 specification (pronounced “EIP four fours”, incidentally) can be found here:

Posted by: CoinYuppie. Reprinted with attribution to: