In-depth understanding of EVM storage mechanism and security issues

Preface

The EVM is a lightweight virtual machine. It was designed to give the Ethereum network a virtual execution environment for smart contracts that is independent of the underlying hardware and operating system.

Simply put, the EVM is a completely isolated sandbox: code running in the EVM cannot access the network, the file system, or other processes, so faulty code cannot damage other smart contracts or affect the external environment.

With that in mind, the Knownsec Blockchain Security Lab will walk through the EVM's storage mechanism and the security issues that come with it.

EVM storage structure

Data in the EVM falls into two categories:

  • Data stored in code and storage is non-volatile (it persists between transactions)
  • Data stored in stack, args, and memory is volatile (it is lost when execution ends)

The meaning of each storage location

Code

Code is the space that holds the content of the contract. It is populated from the data field of the transaction that deploys the contract; in other words, it is the space dedicated to storing the compiled bytecode of the smart contract.

Storage

Storage is a persistent storage space that can be read, written, and modified, and it is also a place where each contract stores data persistently. Storage is a huge map with a total of 2^256 slots, and each slot has 32 bytes. The “state variables” in the contract will be stored in these slots according to their specific types.

Stack

The stack is the so-called "running stack", which holds the input and output data of EVM instructions. Stack operations are among the cheapest in terms of gas. The maximum depth of the stack is 1024 items of 32 bytes each, but only the top 16 items are directly reachable, which limits the number of local variables a function can keep on the stack to 16.

Args

Args, also called calldata, is a read-only addressable space that stores the parameters of a function call. Unlike the stack, data in calldata can only be used by explicitly specifying the offset and the number of bytes to read.
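To make the "offset and number of bytes" point concrete, here is a minimal sketch using inline assembly. The contract name CalldataPeek and the function peek are made up for illustration, and the sketch assumes a 0.4.24 or later 0.4.x compiler. It relies on the standard ABI encoding of an external call: bytes 0..3 of calldata hold the function selector, so the first 32-byte argument starts at offset 4.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract CalldataPeek {
    // Returns the argument twice: once as decoded by the compiler,
    // once read manually from calldata at an explicit byte offset.
    function peek(uint256 x)
        public
        pure
        returns (uint256 decoded, uint256 manual, uint256 size)
    {
        decoded = x;
        assembly {
            // calldataload(p) reads 32 bytes starting at byte offset p.
            // Offset 4 skips the 4-byte function selector.
            manual := calldataload(4)
            // Total length of the calldata in bytes.
            size := calldatasize()
        }
    }
}
```

When peek is called externally with a single argument, decoded and manual should agree, and size should be 36 (4 bytes of selector plus one 32-byte argument).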

Memory

Memory is a simple byte array that mainly holds data during execution and passes parameters to internal functions. It is byte-addressable and expands in 32-byte words.

EVM data storage overview

As mentioned earlier, storage is where each contract stores its data persistently, and the data is organized in slots. The following sections describe in detail how each type is laid out:

State variables

1. State variables of fixed-size value types (at most 32 bytes each) are assigned slots in the order in which they are defined. That is, the index of the first variable is key(0), the index of the second variable is key(1), and so on.

2. Consecutive small values may be packed into the same slot. For example, if the first four state variables in the contract are all of type uint64, their values are packed into a single 32-byte word stored at position 0.

Not optimized:

pragma solidity ^0.4.11;
contract C {
    // Each uint256 fills a whole 32-byte slot: slots 0, 1, 2 and 3.
    uint256 a = 12;
    uint256 c = 12;
    uint256 b = 12;
    uint256 d = 12;

    function m() view public returns (uint256, uint256, uint256, uint256) {
        return (a, b, c, d);
    }
}


Optimized:

pragma solidity ^0.4.11;
contract C {
    // Four uint64 values (8 bytes each) are packed into slot 0.
    uint64 a = 12;
    uint64 c = 12;
    uint64 b = 12;
    uint64 d = 12;

    function m() view public returns (uint64, uint64, uint64, uint64) {
        return (a, b, c, d);
    }
}
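The packing can be observed directly by reading the raw slot. The following is a minimal sketch, assuming a 0.4.24 or later 0.4.x compiler; the contract PackedSlots and the helper rawSlot are hypothetical names introduced here. Distinct values are used so that the position of each variable inside the word is visible.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract PackedSlots {
    uint64 a = 1;   // slot 0, lowest-order 8 bytes
    uint64 b = 2;   // slot 0, next 8 bytes
    uint64 c = 3;   // slot 0, next 8 bytes
    uint64 d = 4;   // slot 0, highest-order 8 bytes

    // Returns the raw 32-byte word stored in the given slot.
    function rawSlot(uint256 slot) public view returns (bytes32 word) {
        assembly {
            word := sload(slot)
        }
    }
}
```

Calling rawSlot(0) should return 0x0000000000000004000000000000000300000000000000020000000000000001: the first variable sits in the lowest-order bytes and later variables are placed to its left. rawSlot(1) should return zero, because all four values fit into slot 0.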


Structure

A struct is also stored sequentially: its members occupy consecutive slots, starting at the slot of the struct variable. For example, if a struct variable with two uint256 members is defined at position 0, its two members are stored in slots 0 and 1.

pragma solidity ^0.4.11;
contract C {
    struct Info {
        uint256 a;
        uint256 b;
    }
    // First state variable: info.a occupies slot 0 and info.b slot 1.
    Info info;

    function m() external returns (uint256, uint256) {
        info.a = 12;
        info.b = 24;
        return (info.a, info.b);
    }
}


Map

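The storage location of a mapping value is calculated as keccak256(bytes32(key) + bytes32(position)), where + denotes concatenation and position is the slot assigned to the mapping variable itself. That slot stays empty; only the hashed locations hold data.

pragma solidity ^0.4.11;
contract Test {
    // First state variable, so position = 0.
    mapping(uint256 => uint256) knownsec;

    function go() public {
        // Stored at keccak256(bytes32(0x60) + bytes32(0)).
        knownsec[0x60] = 0x40;
    }
}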
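The formula can be checked on-chain by computing the slot and reading it back with sload. This is a minimal sketch, assuming a 0.4.24 or later 0.4.x compiler (abi.encode is not available in 0.4.11); the contract MappingSlot and the functions slotOf and readRaw are hypothetical names. abi.encode pads both values to 32 bytes and concatenates them, which matches bytes32(key) + bytes32(position).

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract MappingSlot {
    // First state variable, so its position is 0.
    mapping(uint256 => uint256) knownsec;

    constructor() public {
        knownsec[0x60] = 0x40;
    }

    // Slot that holds knownsec[key]: keccak256(bytes32(key) ++ bytes32(0)).
    function slotOf(uint256 key) public pure returns (bytes32) {
        return keccak256(abi.encode(key, uint256(0)));
    }

    // Reads the value straight from storage, bypassing the mapping syntax.
    function readRaw(uint256 key) public view returns (uint256 value) {
        bytes32 slot = slotOf(key);
        assembly {
            value := sload(slot)
        }
    }
}
```

Here readRaw(0x60) should return 0x40, the same value the mapping lookup would give, while any key that was never written reads as 0.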


Array

Fixed-length array

A fixed-length array is also stored sequentially: its elements occupy consecutive slots, starting at the slot of the array variable. Because the length is known, the compiler rejects out-of-range constant indexes at compile time, and other indexes are checked at run time.

pragma solidity ^0.4.11;
contract C {
    // a[0], a[1] and a[2] occupy slots 0, 1 and 2.
    uint256[3] a = [12, 24, 48];

    function m() public view returns (uint256, uint256, uint256) {
        return (a[0], a[1], a[2]);
    }
}


Variable length array

Since the length of a variable-length array is not known at compile time, its elements cannot be laid out next to the other state variables. Instead, the slot at the position of the state variable stores the length of the array.

The elements themselves start at keccak256(bytes32(position)), which is the address of the first element; adding the index of an element (its offset from the start of the array) gives the slot of that element.

pragma solidity ^0.4.11;
contract C {
    // Slot 0 holds the length (3); the elements start at keccak256(bytes32(0)).
    uint256[] a = [12, 24, 48];

    function m() public view returns (uint256, uint256, uint256) {
        return (a[0], a[1], a[2]);
    }
}
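The two-step lookup (length at the variable's position, elements at the hash of that position) can be sketched as follows. This assumes a 0.4.24 or later 0.4.x compiler; the contract DynamicArraySlot and the functions storedLength and element are hypothetical names. The sketch only covers uint256 elements, where each element takes exactly one slot.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract DynamicArraySlot {
    // First state variable, so its position is 0.
    uint256[] a;

    constructor() public {
        a.push(12);
        a.push(24);
        a.push(48);
    }

    // The slot at the variable's position holds the array length.
    function storedLength() public view returns (uint256 len) {
        assembly {
            len := sload(0)
        }
    }

    // Element i lives at keccak256(bytes32(0)) + i (no bounds check here).
    function element(uint256 i) public view returns (uint256 value) {
        uint256 slot = uint256(keccak256(abi.encode(uint256(0)))) + i;
        assembly {
            value := sload(slot)
        }
    }
}
```

With the three pushes above, storedLength() should return 3, and element(0), element(1), and element(2) should return 12, 24, and 48.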


Byte arrays and strings

If the length is less than or equal to 31 bytes:

1. Fixed-length byte arrays (bytes1 to bytes32) are stored in the same way as other fixed-size values;

2. For variable-length byte arrays (bytes) and strings, the data is stored left-aligned in the slot of the variable and padded with zeros up to 32 bytes, and the last (lowest-order) byte of the slot stores the encoded length, which is the length in bytes * 2.

pragma solidity ^0.4.4;
contract A {
    string public name0 = "knownsec";          // slot 0: data plus encoded length
    bytes8 public name = 0x6b6e6f776e736563;   // slot 1: fixed-length byte array
    bytes public g;                            // slot 2: variable-length byte array

    function test() public {
        g.push(0xAA);
        g.push(0xBB);
        g.push(0xCC);
    }

    function go() public view returns (bytes) {
        return g;
    }
}
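The short-string encoding can be inspected by reading the raw slot. This is a minimal sketch, assuming a 0.4.24 or later 0.4.x compiler; the contract ShortString and the functions raw and decodedLength are hypothetical names introduced for illustration.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract ShortString {
    // First state variable, so it lives in slot 0. Length: 8 bytes.
    string name0 = "knownsec";

    // Raw 32-byte word of slot 0: data on the left, encoded length on the right.
    function raw() public view returns (bytes32 word) {
        assembly {
            word := sload(0)
        }
    }

    // For the short form, the lowest-order byte holds length * 2.
    function decodedLength() public view returns (uint256) {
        uint256 w;
        assembly {
            w := sload(0)
        }
        return (w & 0xff) / 2;
    }
}
```

For "knownsec", raw() should return 0x6b6e6f776e73656300000000000000000000000000000000000000000000000010: the eight ASCII bytes, zero padding, and a final byte of 0x10 (8 * 2). decodedLength() should return 8.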


When the length of the byte array or string is greater than 31 bytes:

1. The slot of the variable stores only the encoded length, and the formula changes to encoded length = length in bytes * 2 + 1

2. The first slot of the actual data is given by keccak256(bytes32(position)), the remaining data is stored in the slots that follow it, and the last slot is again padded with zeros up to 32 bytes.

string public name = "knownsecooooooooooooooooooooooooo";
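The long form can be decoded the same way. The following is a minimal sketch, assuming a 0.4.24 or later 0.4.x compiler; the contract LongString and the functions byteLength and chunk are hypothetical names. It uses the lowest bit of the slot to confirm the long form before applying the length * 2 + 1 formula in reverse.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract for illustration; not part of the original article.
contract LongString {
    // First state variable (slot 0); longer than 31 bytes.
    string name = "knownsecooooooooooooooooooooooooo";

    // Slot 0 holds length * 2 + 1, so the lowest bit is 1 for the long form.
    function byteLength() public view returns (uint256) {
        uint256 w;
        assembly {
            w := sload(0)
        }
        require((w & 1) == 1);
        return (w - 1) / 2;
    }

    // The i-th 32-byte chunk of the data lives at keccak256(bytes32(0)) + i.
    function chunk(uint256 i) public view returns (bytes32 word) {
        uint256 slot = uint256(keccak256(abi.encode(uint256(0)))) + i;
        assembly {
            word := sload(slot)
        }
    }
}
```

Here byteLength() should return the byte length of the string, chunk(0) should hold its first 32 bytes (starting with the bytes of "knownsec"), and chunk(1) should hold the remainder, padded with zeros on the right.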


Security issues

The previous sections covered the storage structure and storage mechanism of the EVM; this section discusses the security issues that follow from them.

Uninitialized variable

Vulnerability principle:

According to the official Solidity documentation for versions before 0.5.0, local variables of struct, array, or mapping type are placed in storage by default; the default data location of a local variable declared inside a function depends on its type.

Therefore, if such a storage-typed variable is declared inside a function but not initialized, it acts as a storage pointer to slot 0 and aliases whatever state variables of the contract are stored there; writing to it changes those state variables. In the vulnerable contract below, the attacker's goal is to change owner to an address of their choice:

pragma solidity ^0.4.0;
contract testContract {
    bool public unlocked = false;    // slot 0, lowest-order byte
    address public owner = 0xCA35b7d915458EF540aDe6068dFe2F44E8fa733c;  // packed into slot 0 next to unlocked
    struct Person {
        bytes32 name;
        address mappedAddress;
    }
    function test(bytes32 _name, address _mappedAddress) public {
        Person person;                          // uninitialized: storage pointer to slot 0
        person.name = _name;                    // overwrites all of slot 0
        person.mappedAddress = _mappedAddress;  // writes to slot 1
        require(unlocked);
    }
}

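Analysis of the vulnerable contract:

The struct declared inside the function is never initialized, so writes to it land on the contract's state variables, and the function can be used to modify owner. The call must still pass the require check, but that is not difficult, because the state variable unlocked is also within the range we control.

Note that, under the packing rule described earlier, unlocked (1 byte) and owner (20 bytes) share slot 0. person.name therefore overwrites both of them at once, while person.mappedAddress is written to slot 1, which no state variable uses.

Specific operation:

Call the test function with _name: 0x0000000000000000000000000000000000000000000000000000000000000001 (sets unlocked to true)

_mappedAddress: 0xfB89eCb0188cb83c220aADDa1468C1635208e821 (personal address)

With these inputs the lowest-order byte of slot 0 becomes 1, so require(unlocked) passes, and the 20 bytes that hold owner are overwritten with the corresponding bytes of _name, which are zero here. To install a chosen address as owner, an attacker would presumably encode it into _name, shifted left by one byte, for example 0x0000000000000000000000fB89eCb0188cb83c220aADDa1468C1635208e82101.

Reading owner before and after the call shows that the stored address has been changed.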
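One way to avoid the problem is to give the local struct an explicit data location. The following is a minimal sketch of the same contract with the struct held in memory, assuming a 0.4.24 or later 0.4.x compiler; the name TestContractFixed is made up for illustration. From Solidity 0.5.0 onward, the compiler rejects uninitialized storage pointers outright, so upgrading the compiler is another option.

```solidity
pragma solidity ^0.4.24;

// Hypothetical fixed variant for illustration; not part of the original article.
contract TestContractFixed {
    bool public unlocked = false;
    address public owner = 0xCA35b7d915458EF540aDe6068dFe2F44E8fa733c;

    struct Person {
        bytes32 name;
        address mappedAddress;
    }

    function test(bytes32 _name, address _mappedAddress) public view {
        // "memory" makes this a fresh temporary copy, so the writes
        // below no longer touch storage slots 0 and 1.
        Person memory person;
        person.name = _name;
        person.mappedAddress = _mappedAddress;
        // Still false, because the caller can no longer overwrite it.
        require(unlocked);
    }
}
```

With this change the call reverts at require(unlocked), and unlocked and owner keep their initial values regardless of the arguments passed in.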

Summary

The storage of the EVM is a key => value database whose contents can be verified to ensure consistency. However, it also interacts with the rules of the smart contract language, and where those rules conflict, the mismatch is likely to be exploited by people with ulterior motives. Using the smart contract language in a disciplined, standard way is therefore a necessary condition for avoiding vulnerabilities.

Posted by: CoinYuppie. Reprinted with attribution to: https://coinyuppie.com/in-depth-understanding-of-evm-storage-mechanism-and-security-issues/