According to common definitions found online, the Metaverse is a mapped or 3D virtual world, linked to and created from the real world by technological means (AR/VR/MR/XR, etc.). The author suggests that the Metaverse can be regarded as a space parallel to the real world, and that an infinite number of such parallel spaces can be established through different technical means. The AR/VR/MR/XR industry not only upends hardware interaction, but also opens new possibilities for how humans understand and use space.
With Apple expected to launch its MR device, Reality, next year, Meta constantly iterating on its Oculus VR headsets, and domestic AR device manufacturers such as Rokid, Light Particles, and Nreal shipping one after another, we are already standing on the eve of the Metaverse explosion. Based on an understanding of Metaverse hardware development, this article deduces the computing and transmission requirements for a large-scale explosion of Metaverse applications, and argues why Web3 and distributed computing power will become the computing infrastructure of the Metaverse era.
The current state of development of the Metaverse
Content is the underlying logic driving the development of the whole ecosystem: content interacts with computing and storage resources, and reaches AR/VR/MR/XR hardware devices through transmission. Users enter the Metaverse through terminal hardware. This article discusses the development of AR/VR with a focus on computing and transmission; storage is another large topic that will be analyzed separately in subsequent articles.
AR/VR devices are the hardware entrance for users to enter the Metaverse.
AR stands for Augmented Reality, which uses computer simulation to superimpose virtual objects onto the real environment, so that they exist in the same picture and space.
The technical trend of AR is optical waveguide + Micro LED.
VR stands for Virtual Reality, which uses computer simulation to generate a 3D virtual world and provide users with realistic sensory simulation.
The technical trend of VR is folded light path + Micro OLED.
Although AR and VR follow different specific technical routes, their overall technical frameworks are similar: both require choosing a light source and an imaging scheme to present images on a display and create a sense of visual immersion for the user.
In this article, immersion is defined as the state in which a person's perception of the virtual environment created and displayed by a computer system is infinitely close to their perception of the real natural environment. To achieve the highest immersion, that is, the feeling closest to human perception of the real world, AR/VR displays must meet retina-like requirements. Two display parameters directly affect the user's sense of immersion:
- Resolution. For near-eye devices such as AR/VR, clarity is measured in PPD (Pixels Per Degree), the pixel density per degree of field of view, rather than the absolute pixel counts used for traditional screens. A retina-like screen needs to reach 60 PPD. A related parameter is the FOV (Field of View), the angular extent of what the user can see. Assuming a horizontal FOV of 110° per eye and a vertical FOV of about 120°, the required resolution is 2 × (110 × 60) horizontal pixels by (120 × 60) vertical pixels, i.e. 13,200 × 7,200; in other words, a roughly 13k screen can meet retina-like display requirements.
- Refresh rate, which can be understood via fps (frames per second), i.e. how many frames of images are displayed per second (one frame is one image). As the figure below shows, when the fps is low, the motion the human eye sees is discontinuous; as the number of frames increases, the eye perceives the object as moving continuously.
Source: “Frame Rate: A Beginner’s Guide for Live Streaming”
When the fps is low, the discontinuity of the picture makes people feel dizzy, so increasing the fps reduces the dizziness felt when using AR/VR hardware. At 60 fps, there is basically no dizziness; at 144 fps, the frame rate reaches the limit the human eye can perceive, matching human perception of the real environment.
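The two display parameters above can be turned into concrete numbers. A minimal sketch in Python, using the article's figures (60 PPD, 110° horizontal FOV per eye, 120° vertical FOV, 144 fps):

```python
# Retina-like display arithmetic, using the article's assumptions.
TARGET_PPD = 60        # pixels per degree for a retina-like display
H_FOV_PER_EYE = 110    # horizontal field of view per eye, degrees
V_FOV = 120            # vertical field of view, degrees
FPS = 144              # target refresh rate

h_pixels = 2 * H_FOV_PER_EYE * TARGET_PPD   # both eyes side by side
v_pixels = V_FOV * TARGET_PPD
frame_budget_ms = 1000 / FPS                # time available per frame

print(f"{h_pixels} x {v_pixels} @ {FPS} fps "
      f"({frame_budget_ms:.2f} ms per frame)")
```

This yields the 13,200 × 7,200 ("13k") resolution cited above, and shows that at 144 fps each frame must be rendered and delivered in under 7 ms.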
Number of users
IDC reported that global AR/VR shipments reached 11.2 million units in 2021, a year-on-year increase of 92.1%. Drawing an analogy with smartphone shipments (see chart below), we are on the eve of the AR/VR era.
As total AR/VR shipments increase, more and more users will enter the Metaverse through AR/VR devices, and the number of online users using AR/VR devices will also increase.
Status and pain points
Immersion is achieved with a computer-generated 3D virtual world, or a 3D world reconstructed from images, projected onto a 2D screen in a way that conveys distance, light and shadow, and color. The 3D virtual world is built with professional tools (such as Unity), and how the 3D world (including a 3D reconstructed world) and the models inside it appear on the 2D screen is determined by rendering. Rendering is the computing job that consumes the most computing power in the Metaverse. A rendering algorithm takes a set of parameters (such as lighting angle and intensity) and computes the shadows, color gradations, and so on that each model should have, producing pictures that look very close to the real world. What the user sees in an AR/VR device is in fact a rendered image stream. There are three main rendering architectures (see below):
- Local host: the application runs on the AR/VR device itself, renders directly on the device's hardware, and pushes the finished image stream to the device's display. The local-host rendering architecture has several major pain points:
- AR/VR application packages are large, making it impossible to download and store many applications on the device at once;
- Because AR/VR OEMs need to control device cost and price, they do not put much computing hardware on the device. As a result, AR/VR devices cannot handle scenes with high rendering requirements, which limits the number and types of applications that can run on them;
- An AR/VR device is in contact with the human body and must keep its operating temperature within a range the body can tolerate. Doing too much computation locally raises the temperature, making thermal management a problem;
- As a wearable device, doing too much computation locally drains the battery faster, forcing users to charge frequently and shortening wearing time.
Because of these pain points, AR/VR content providers and OEMs generally recommend against doing too much computation locally.
- Streaming: the application runs on the user’s computer, renders on the computer’s hardware, and pushes the finished image stream to the AR/VR device’s display. The biggest pain point of the streaming architecture is that rendering speed is limited by the user’s hardware; very few users own hardware capable of rendering at 13k@144 fps. Achieving retina-like rendering at scale therefore requires more economical and scalable computing solutions.
- Cloud rendering: the application runs on a cloud server, renders on the cloud server’s hardware, and pushes the finished image stream to the AR/VR device’s display. Since GPUs are the hardware best suited to rendering-type computation, assume:
- All rendering work is done by cloud GPU computing resources;
- Each GPU delivers 7.4 TFLOPS (7.4 trillion floating-point operations per second), the computing power of the highest-shipping GPU;
- Cloud vendors own 30% of all computing power, a figure based on the assumption that the global cloud penetration rate is 15.7%.
Total global GPU shipments and ownership reached about 400 million in 2021 and are estimated to reach ~1.5 billion by the end of 2025. Assuming cloud vendors hold 30% of the market’s computing power, cloud vendors would own 120 million GPUs in 2021 and 450 million GPUs in 2025.
From the resolution and fps requirements, we can calculate how many GPUs are needed simultaneously to render each frame, and from the total number of GPUs, the number of AR/VR devices that can be online globally at the same time. When rendering intensity is retina-like, i.e. the requirement is 13k@144 fps, the cloud rendering framework will support only about 6 million simultaneously online AR/VR devices by the end of 2025, while forecasts put AR/VR device shipments above 100 million units in 2025 alone.
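A back-of-the-envelope version of this capacity estimate can be sketched as follows. The GPU throughput and cloud GPU count come from the assumptions above; the per-pixel rendering cost is a hypothetical figure chosen for illustration, since the article does not state the per-pixel cost it used:

```python
# Capacity estimate for cloud rendering at retina-like quality.
GPU_FLOPS = 7.4e12          # 7.4 TFLOPS per GPU (article assumption)
CLOUD_GPUS_2025 = 450e6     # 30% of ~1.5B GPUs (article assumption)
WIDTH, HEIGHT, FPS = 13_200, 7_200, 144   # retina-like 13k @ 144 fps
FLOPS_PER_PIXEL = 40_000    # hypothetical average shading cost per pixel

pixels_per_second = WIDTH * HEIGHT * FPS
gpus_per_device = pixels_per_second * FLOPS_PER_PIXEL / GPU_FLOPS
devices_supported = CLOUD_GPUS_2025 / gpus_per_device

print(f"~{gpus_per_device:.0f} GPUs per device, "
      f"~{devices_supported / 1e6:.1f}M devices supported")
```

Under these assumptions, each retina-like device ties up on the order of 70+ cloud GPUs, and the 2025 cloud GPU pool supports roughly 6 million simultaneous devices, consistent with the figure above.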
Therefore, for the foreseeable future, centralized rendering computing power with cloud vendors as the main providers will not be enough to support the arrival of the AR/VR era; new computing solutions and frameworks are needed to support the explosion of the Metaverse.
Development Status of Web3-Driven Distributed Computing Networks
In this article, nodes owned by users are called distributed computing nodes, and computing nodes owned by cloud vendors/IDCs are called centralized computing nodes.
Blockchain technology plus tokenomics can incentivize the growth of real-world computing power. Multicoin Capital calls this mechanism Proof of Physical Work (PoPW). PoPW rewards users for performing verifiable physical work, in this case rendering computation: the protocol’s algorithm verifies the state of each device, and contributors and validators are rewarded according to a predetermined set of rules.
Distributed computing nodes perform a complete rendering job, or a part of a rendering job, and send the rendered image to the client (AR/VR device side) in the form of an image stream.
In the Metaverse, rendering is the most computationally demanding job. The rendering process can be roughly divided into four stages, together known as the rendering pipeline:
Source: “Introduction to 3D Visualization: Seeing How the Rendering Pipeline Works on the GPU”
- Application stage: executed on the CPU, usually to prepare the geometric data, uniform data, etc. required for drawing and send them to the GPU;
- Geometry processing stage: subdivided into vertex shading, tessellation shading, geometry shading, stream out (writing geometry-stage results back to a buffer), clipping (cropping coordinates outside NDC and generating new vertices), and screen mapping (mapping NDC coordinates to screen coordinates);
- Rasterization stage: converts the primitives (points, lines, triangles) processed in the previous stage into discrete pixel blocks;
- Pixel processing stage: performs pixel shading on the pixels generated by rasterization, then writes the computed pixels into the frame buffer, or a buffer specified by the developer, according to certain rules.
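The data flow through these stages can be illustrated with a toy, CPU-only sketch for a single triangle. Real pipelines run on the GPU with many more features (clipping, depth testing, interpolation); this only shows how geometry becomes pixels:

```python
# Toy software rendering pipeline: one triangle, flat white shading.
def application_stage():
    # Prepare geometry: one triangle in normalized device coordinates (NDC).
    return [(-0.5, -0.5), (0.5, -0.5), (0.0, 0.5)]

def geometry_stage(ndc_verts, width, height):
    # Screen mapping: NDC [-1, 1] -> pixel coordinates (y flipped).
    return [((x + 1) * 0.5 * width, (1 - (y + 1) * 0.5) * height)
            for x, y in ndc_verts]

def rasterize(tri, width, height):
    # Emit the discrete pixels covered by the triangle (edge-function test).
    (x0, y0), (x1, y1), (x2, y2) = tri
    def edge(ax, ay, bx, by, px, py):
        return (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    area = edge(x0, y0, x1, y1, x2, y2)
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5
            w0 = edge(x1, y1, x2, y2, cx, cy)
            w1 = edge(x2, y2, x0, y0, cx, cy)
            w2 = edge(x0, y0, x1, y1, cx, cy)
            if (w0 >= 0 and w1 >= 0 and w2 >= 0 and area > 0) or \
               (w0 <= 0 and w1 <= 0 and w2 <= 0 and area < 0):
                yield px, py

def pixel_stage(framebuffer, pixels):
    # Shade each covered pixel (flat white here) into the frame buffer.
    for px, py in pixels:
        framebuffer[py][px] = 255

W, H = 16, 16
fb = [[0] * W for _ in range(H)]
tri = geometry_stage(application_stage(), W, H)
pixel_stage(fb, rasterize(tri, W, H))
print(sum(v == 255 for row in fb for v in row), "pixels covered")
```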
Several of these steps, such as pixel shading, vertex transformation, primitive assembly, and general-purpose computation, can be processed in parallel. These parallelizable computations can be dispatched to different nodes, letting the computing power of distributed nodes become the computing infrastructure of the Metaverse.
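A minimal sketch of this idea: the frame is cut into horizontal tiles, each tile is shaded independently (here by threads standing in for remote nodes), and the results are reassembled in tile order. The gradient shader is a stand-in for real pixel shading:

```python
# Tile-parallel pixel shading: each tile shares no state with the others,
# so each could in principle be rendered by a different node.
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT, TILE_ROWS = 64, 64, 8

def shade_tile(tile_index):
    # Shade this tile's rows; (x + y) % 256 stands in for a real shader.
    y0 = tile_index * (HEIGHT // TILE_ROWS)
    y1 = y0 + HEIGHT // TILE_ROWS
    return tile_index, [[(x + y) % 256 for x in range(WIDTH)]
                        for y in range(y0, y1)]

with ThreadPoolExecutor(max_workers=4) as pool:
    tiles = list(pool.map(shade_tile, range(TILE_ROWS)))

# Reassemble: sort by tile index, then concatenate rows into one frame.
frame = [row for _, tile in sorted(tiles) for row in tile]
```

In a distributed setting the tiles would travel over the network rather than between threads, but the key property is the same: the shading work decomposes into independent pieces.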
How distributed computing power becomes the Metaverse infrastructure that solves computing power anxiety
As noted above, when rendering intensity is retina-like, i.e. the requirement is 13k@144 fps, our assumptions and estimates show that the cloud rendering framework will support only ~6 million simultaneously online AR/VR devices at the end of 2025. Once that threshold is crossed, distributed computing power will no longer be a mere supplement; it will become a hard requirement for the Metaverse's development. Furthermore, because cloud vendors' data centers are physically farther from users than edge nodes are, images rendered in the cloud arrive more slowly than images rendered at the edge; to achieve a 144 fps refresh rate, the shorter transmission distance of edge nodes means faster delivery. Tokenomics can motivate nodes worldwide to contribute computing power, with PoPW verifying data integrity and correctness, drawing capable nodes into the network and better serving the Metaverse rendering demand of users in different parts of the world.
The computing power distribution of the Metaverse we envision will be similar to the following structure:
- Cloud vendors' GPU resources will still be fully utilized, but as general compute providers: beyond rendering, a large share of their GPUs will be consumed by AI-related fields, so additional computing resources must be mobilized;
- Through the incentive model, more nodes join the computing power network, and rendering work is distributed according to the evaluated computing strength of each node;
- The AR/VR device ultimately receives image streams pushed by multiple nodes, and only needs to sort them before presenting them on the display.
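The client-side sorting step in the last point can be sketched as follows: frames arrive from different nodes tagged with a sequence number, and the headset releases them to the display strictly in order, buffering out-of-order arrivals in a min-heap. The node names are hypothetical:

```python
# Reordering frames that arrive out of order from multiple render nodes.
import heapq

def reorder(arrivals):
    """Yield (seq, frame) in sequence order as tagged frames arrive."""
    heap, next_seq = [], 0
    for seq, frame in arrivals:
        heapq.heappush(heap, (seq, frame))
        # Release every frame whose turn has come.
        while heap and heap[0][0] == next_seq:
            yield heapq.heappop(heap)
            next_seq += 1

# Frames 0..4 arriving out of order from three hypothetical nodes:
arrivals = [(2, "node-A"), (0, "node-B"), (1, "node-C"),
            (4, "node-A"), (3, "node-B")]
ordered = [seq for seq, _ in reorder(arrivals)]
print(ordered)  # [0, 1, 2, 3, 4]
```

A real client would also drop frames that arrive after their display deadline, but the ordering logic is the core of what the AR/VR side has to do.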
In theory, electrical signals propagate at the speed of light. Physical distance causes high transmission latency because signals attenuate along the way: long-distance transmission must pass through relay stations, which receive the data, amplify the signal, and restore it to its original form. Because this process takes time, the greater the geographic distance, generally, the higher the latency.
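A rough illustration of this distance/latency relationship: signals in optical fiber travel at about two-thirds the speed of light, and each relay adds processing delay. The per-relay delay and relay spacing below are assumed figures for illustration only:

```python
# One-way latency as a function of distance, under assumed relay figures.
SIGNAL_SPEED_KM_S = 200_000   # ~2e5 km/s in optical fiber
RELAY_DELAY_MS = 0.5          # assumed regeneration delay per relay
KM_PER_RELAY = 100            # assumed relay spacing

def one_way_latency_ms(distance_km):
    propagation = distance_km / SIGNAL_SPEED_KM_S * 1000
    relays = distance_km // KM_PER_RELAY
    return propagation + relays * RELAY_DELAY_MS

for km in (10, 500, 5000):    # edge node vs. regional vs. distant cloud
    print(f"{km:>5} km -> {one_way_latency_ms(km):.2f} ms one way")
```

At 144 fps the per-frame budget is only about 6.9 ms, so under these assumptions a distant data center can consume the entire budget on transmission alone, while a nearby edge node leaves almost all of it for rendering.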
- As shown in the figure above, when client 1 is physically far from the cloud vendor's data center, it chooses the nearest distributed network nodes to complete the rendering task;
- When client 2 is physically close to both the cloud vendor's data center and some distributed nodes, it selects a hybrid rendering scheme, i.e. centralized + distributed;
- The AR/VR device receives image streams pushed by multiple nodes, sorts them, and presents them on the display.
Tokenomics + PoPW
Tokenomics motivates users across the global network to join the computing power network, while the PoPW consensus protocol ensures the correctness and integrity of the data.
Tokenomics for infrastructure networks has two benefits:
- It can expand the network quickly on a global scale, better matching global rendering demand;
- Large-scale physical networks require capital investment, from early hardware purchases to later operation and maintenance. Using tokenomics to distribute the full benefit to the supply side is a more cost-effective way to build and maintain the network.
Mapping the distributed computing power market
The author believes that in the short term, realizing Web3-driven distributed computing power is very difficult; whether for rendering models, lowering the barrier to GPU use, or transmission, it will take time to develop. In the medium to long term, however, it is an inevitable outcome and choice: it will support the large-scale explosion of the Metaverse and become an important part of its computing infrastructure.
Posted by: CoinYuppie. Reprinted with attribution to: https://coinyuppie.com/how-does-the-distributed-computing-network-driven-by-web3-become-an-important-infrastructure-of-the-metaverse/