If the hot word in science and technology in 2021 was the Metaverse, then this year’s seat will most likely be reserved for “Web3”. All of a sudden, explainers, analyses, forecasts, and skeptical takes are arriving one after another, and the term has become a well-deserved traffic magnet.

Although everyone defines Web3 differently, there is a consensus that Web3 gives users ownership of and autonomy over their own data, which is also a key factor driving the evolution from Web2 to Web3. As our lives and work are digitized more thoroughly, that is, as all human activity comes to be represented as data streams, this transfer of data rights becomes particularly critical.

We therefore have reason to believe that the Web3 data track will become the most important part of the new order, with broad room for development. From an entrepreneur’s perspective, a decentralized network driven by blockchain technology is in essence an open, permissionless distributed database, so the data direction naturally contains many scenarios that need to be served. Choosing it gives a project a high probability of evolving and growing on the right technology tree. In today’s article, I will map out the market structure and typical players of the existing Web3 data track, briefly interpret its future trends, and share some of SevenX’s investment judgments.

The core point of this article:

1. Web3 breaks data silos and returns data rights to individual users, who can carry their data with them at any time and freely combine it and interact with applications.

2. The structure of Web3 data track can be divided into four levels, namely data source, data acquisition, data query and index, data analysis and application. The degree of decentralization, scalability, speed and accuracy of the services provided, and the irreplaceability of scenarios are the main dimensions for us to judge the project.

3. As data market participants grow more diverse and data itself accumulates, the value of data will rise substantially. How to use data to generate greater value while still honoring the blockchain’s fundamental spirit of protecting privacy is another important issue.

4. Building a decentralized reputation system from multi-dimensional data vectors is one of the next most important use cases in the Web3 data market. A reputation system makes it possible to unlock financial scenarios such as credit lending.

What am I talking about when I’m talking about Web3 data

In the course of human civilization, vast amounts of data are generated. Some of it is forgotten and lost to time; some is recorded and settles into known history. The Internet has made recording and sharing data far more efficient and far-reaching, the value of data has been explored further, and its importance has gradually become a consensus across society. The cover story of the May 2017 issue of The Economist called data “the most valuable resource in the world”.

However, as more and more data accumulates on the Internet, a fundamental problem has emerged: the data individuals generate creates value, but that data does not belong to them, and the value it creates is not distributed to them. People yearn for a new order, and Web3 arose in response.

So how does Web3 reshape the value of data? There are three main aspects:

  • Data is open, transparent and immutable

In the Web2 world, applications provide free services to obtain user data, then monopolize that data to profit and build their own business moats. The data sits on their centralized servers, where the outside world cannot access it and has no way of knowing which data is stored, in what form, or at what granularity. If these applications are attacked or voluntarily shut down, user data can vanish overnight. With blockchain technology as the underlying framework of Web3, on-chain data is open, transparent, and immutable, which is the precondition for putting it to better use.

  • Break data silos and improve interoperability

Having to go through a registration process again and again whenever you use a new application is probably the most intuitive way users feel the negative impact of Web2 data silos. Each application has its own database, independent of and disconnected from the others, which causes this repetitive collection. At the same time, user behavior data is fragmented across different applications and can neither be reused across platforms nor integrated. In the Web3 world, broadly speaking, a user needs only one address to access and use all kinds of decentralized applications, and the data from every on-chain interaction at that address can be combined without any application’s permission.

  • Better value distribution through token economy

How the value created by data can be distributed to the individuals who generate it is an important question that Web3 has to answer, and the evolving token economy may be the core means of achieving this redistribution. Users who have benefited from airdrops should have an intuitive feel for this. In the Web3 context, the data a user accumulates and generates by interacting with any application is the carrier of value capture.

In fact, the evolution of the crypto market itself has also largely driven the development of the Web3 data track. On the supply side, the formation of the multi-chain universe, the explosion of applications, the boom in NFTs, and the influx of new users have led to exponential growth in the types and quantity of data. On the demand side, increasingly multidimensional and complex needs have spawned countless imaginative scenarios and opportunities around data acquisition, organization, access, query, processing, and analysis.

Web3 data track structure diagram

One article to understand the unicorns, game breakers and future stars of the Web3 data track

The structure of the Web3 data track can be divided into four levels, which are the bottom-level data source, the second-level data acquisition, the third-level data query and indexing, and the top-level data analysis and application.

The first layer, data sources

Data sources are generally divided into on-chain and off-chain data. On-chain data mainly includes chain-level data (such as hashes and timestamps), transfer transactions, wallet addresses, smart contract events, and some pending data (such as queued transactions in the Ethereum mempool). This data is maintained by a decentralized database, and its reliability is guaranteed by blockchain consensus. Storage is another major source of on-chain data, currently concentrated in protocols such as IPFS, Arweave, and Storj. Off-chain data mainly includes centralized exchange data, social media data, GitHub data, and typical Web2 metrics such as PV, UV, daily active users, monthly active users, downloads, and search index.

In the past two years, the types and quantity of data have grown exponentially, but three problems remain at the data source level:

1. Some public chains, such as Solana, adopt a light node mode, which results in incomplete on-chain data.

2. The storage layer is congested because of the sheer volume of data. My good friend REVA once uploaded her NFT works to IPFS, but when she wanted to retrieve them, she could not finish downloading a file of a few hundred megabytes in two hours (imagine how maddening it would be to spend two hours downloading a standard-definition movie). There are already projects on the market working on this problem, such as SevenX portfolio company Meson Network. It is a decentralized CDN that aggregates idle servers through mining, schedules bandwidth resources, and serves the file and streaming media acceleration market. Its targets include traditional websites, video, live streaming, and blockchain storage solutions. Arweave (AR), IPFS, and others are already supported.

3. Off-chain data lacks a method for ensuring its authenticity, and the dimensions of the data also need to be expanded.

The second layer, data acquisition

The most important players in this layer are node service providers. Building your own nodes to obtain on-chain data demands considerable time, money, and technical effort, and you may also run into problems such as memory leaks and insufficient disk space. Node service providers greatly simplify this process. As the infrastructure of the entire data track, they were the first players to enter it, and unicorns valued at tens of billions of dollars have already emerged here.

The well-known service providers at present are Infura, QuickNode, Alchemy, and Pocket. When choosing among them, developers and entrepreneurs mainly consider the number of chains covered, the business model, and the range of additional services (Are there CDN-like services? Can mempool data be accessed? Are private nodes available?). Because Infura has suffered node downtime more than once, the degree of decentralization is also one of the selection criteria. (In November 2020, Infura was not running the latest version of the Geth client. Certain transactions triggered a bug in the older version, Infura went down, and a chain reaction followed: mainstream trading platforms could not process deposits and withdrawals of ERC-20 tokens, MetaMask became unusable, and so on.)
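To make concrete what a node service provider actually serves, here is a minimal sketch of the request an application would send. It uses the standard Ethereum JSON-RPC method `eth_getBlockByNumber`; the endpoint URL and the helper names are hypothetical, and no network call is made.

```python
import json

# Hypothetical endpoint: a real provider issues its own URL and API key.
NODE_URL = "https://node-provider.example/v1/YOUR_API_KEY"


def rpc_payload(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body for an Ethereum node."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def block_request(block_number, full_transactions=False):
    """Request one block by number. Quantities are hex-encoded in Ethereum JSON-RPC."""
    return rpc_payload("eth_getBlockByNumber", [hex(block_number), full_transactions])


def parse_quantity(hex_value):
    """Decode a hex quantity such as '0x10' into an integer."""
    return int(hex_value, 16)


if __name__ == "__main__":
    # This JSON body is what would be POSTed to NODE_URL.
    print(json.dumps(block_request(14_000_000)))
```

The same payload works against a self-hosted node or any provider, which is why switching providers usually means changing only the URL.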

A simple comparison of the four node service providers is as follows:


On February 8 this year, Alchemy completed a $200 million financing at a valuation of $10.2 billion; Infura’s parent company ConsenSys also completed a $200 million financing last year, with a valuation of $3.2 billion; as of March 2022, Pocket’s circulating market value reached $3.28 billion.

The third layer, data query and index

On top of the node service providers that directly interact with various public chains, there are market participants who provide data query and indexing services. They make raw data easier to access and use by parsing and formatting data.

  • The Graph

The Graph is a decentralized on-chain data indexing protocol. Its mainnet launched in December 2020, and it can so far index data from more than 30 different networks, including Ethereum, NEAR, Arbitrum, Optimism, Polygon, Avalanche, Celo, Fantom, Moonbeam, Arweave, and more.

It resembles a traditional cloud-based API, with one difference: traditional APIs are operated by centralized companies, while on-chain data indexing is performed by decentralized index nodes. Through the GraphQL API, users can fetch information directly from subgraphs, which is fast and saves resources. The Graph designed the GRT token mechanism to encourage multiple parties to participate in its network, including Delegators, Indexers, Curators, and Developers. The business flow can be summarized as follows: users submit queries, indexers run The Graph nodes, delegators stake GRT tokens to indexers, and curators use GRT to signal which subgraphs are worth querying.
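As a rough illustration of querying a subgraph through GraphQL, the sketch below builds the JSON body a client would POST to a subgraph endpoint. The `transfers` entity and its fields are hypothetical (each subgraph defines its own schema); the `first`, `orderBy`, `orderDirection`, and `_gte` filter conventions follow The Graph's query API.

```python
import json

# Hypothetical schema: the `transfers` entity and its fields are assumptions
# for illustration. A real subgraph publishes its own schema.
QUERY = """
query RecentTransfers($first: Int!, $minValue: BigInt!) {
  transfers(first: $first, orderBy: timestamp, orderDirection: desc,
            where: {value_gte: $minValue}) {
    id
    from
    to
    value
  }
}
"""


def build_graphql_request(first, min_value):
    """Return the JSON-serializable body for a GraphQL POST request."""
    return {
        "query": QUERY.strip(),
        # BigInt values are passed as strings to avoid precision loss.
        "variables": {"first": first, "minValue": str(min_value)},
    }


if __name__ == "__main__":
    print(json.dumps(build_graphql_request(5, 10**18), indent=2))
```

The client names only the fields it wants, so the indexer returns exactly that shape and nothing more.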

  • Covalent

Covalent provides a data query layer that lets users retrieve data quickly through an API. It currently supports Ethereum, BNB Chain, Avalanche, Ronin, Fantom, Moonbeam, Klaytn, HECO, Shiden, and mainstream Layer 2 networks.

Covalent supports queries not only over all blockchain data types, such as transactions, balances, and log events, but also over the data of a specific protocol. Its most prominent feature is querying across multiple chains: there is no need to rebuild an index along the lines of The Graph’s subgraphs, because switching chains only requires changing the Chain ID. The project also has its own token, CQT, which holders can stake and use to vote on matters such as database updates.
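The "change the Chain ID" idea can be sketched as a URL builder in which the chain is just one path parameter. The base URL and route below are placeholders rather than Covalent's documented API; the chain IDs are the standard EVM network IDs.

```python
from urllib.parse import quote

# Placeholder base URL and route, assumed for illustration only.
BASE_URL = "https://api.example.com/v1"

# Standard EVM chain IDs.
CHAIN_IDS = {"ethereum": 1, "bnb-chain": 56, "avalanche": 43114}


def balances_url(chain, address):
    """Build a token-balance query URL; only the chain ID differs per network."""
    return f"{BASE_URL}/{CHAIN_IDS[chain]}/address/{quote(address)}/balances/"


def balances_urls_all_chains(address):
    """Same query, fanned out across every configured chain."""
    return {chain: balances_url(chain, address) for chain in CHAIN_IDS}


if __name__ == "__main__":
    for chain, url in balances_urls_all_chains("0xabc").items():
        print(chain, url)
```

Because the query shape stays fixed, adding a network is a one-line change to the mapping rather than a new index.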

  • SubQuery

SubQuery provides data query services for Polkadot and Substrate projects, letting developers focus on their core use cases and front ends without spending time building custom back ends for data processing. Inspired by The Graph, SubQuery also uses the GraphQL language, and its token economics are similar. There are three roles in the SubQuery system: consumers, indexers, and delegators. Consumers publish tasks, indexers provide data, and delegators delegate idle SQT tokens to indexers, which incentivizes them to participate honestly.

  • Blocknative

Blocknative focuses on retrieving real-time transaction data and provides a mempool data explorer covering address tracking, internal transaction tracking, failed transaction information, and replaced transaction (sped-up or canceled) information. Because mempool data may differ from the final block data, the real-time requirements are demanding. The field-level queries Blocknative provides are more immediate and more precise.
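To illustrate what "replaced transaction" tracking involves, here is a simplified sketch. On Ethereum, a pending transaction can be replaced by another one from the same sender with the same nonce and a higher fee. The classification rule below (a zero-value transfer to oneself counts as a cancel) is a common heuristic assumed for illustration, not Blocknative's method, and it ignores the minimum fee-bump rules that real clients enforce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    tx_hash: str
    sender: str
    to: str
    nonce: int
    value: int      # in wei
    gas_price: int  # in wei


def classify_replacement(old, new):
    """Return 'speedup', 'cancel', or None if `new` does not replace `old`."""
    # A replacement must reuse the sender's nonce...
    if (old.sender, old.nonce) != (new.sender, new.nonce):
        return None
    # ...and must pay more than the transaction it displaces.
    if new.gas_price <= old.gas_price:
        return None
    # Heuristic: a zero-value self-transfer is treated as a cancellation.
    if new.to == new.sender and new.value == 0:
        return "cancel"
    return "speedup"
```

For example, resubmitting the same transfer at a higher gas price yields `"speedup"`, while a zero-value self-send on the same nonce yields `"cancel"`.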

  • KOII

KOII is a decentralized ecosystem for creators that helps them own their content permanently and earn from its value. Anyone can earn token rewards on KOII by deploying tasks, running nodes, or producing or registering content, and the system rewards participants based on data processed through proofs of real traffic, creating an “attention economy” cycle. In addition, the Atomic NFT developed by the KOII team stores and verifies an NFT and its Meta-Info (meta information, that is, the actual digital content the NFT represents) on the same chain, so all content on the KOII platform can be generated to the same standard. If this scalability succeeds in driving content accumulation to a sufficient scale, KOII will also become an important content data indexing platform.

The projects listed below provide data query and indexing services and also offer products that belong to the data analysis and application layer. For convenience, they are described here.

  • Dune Analytics

Dune Analytics is a comprehensive Web3 data platform for querying, analyzing, and visualizing massive amounts of on-chain data. It parses on-chain data stored in key-value databases and loads it into a PostgreSQL relational database. Users do not need to write scripts and can query with simple SQL statements. The tables Dune Analytics provides include raw transaction tables, project-level tables, and aggregated tables.

Dune Analytics encourages data sharing. By default, all queries and datasets are public, and users can directly copy other people’s dashboards and use them as references. Some of the best data analysts in the Web3 field have gathered here. Dune Analytics currently supports data queries for Ethereum, Polygon, Binance Smart Chain, Optimism, and Gnosis Chain. In February this year, it completed a Series B round of 69.42 million US dollars at a valuation of 1 billion US dollars, officially joining the ranks of unicorns.
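The kind of "simple SQL statement" involved can be sketched locally. The example below uses Python's built-in `sqlite3` as a stand-in for a hosted relational database; the `transactions` table, its columns, and the sample rows are all made up for illustration and do not reflect Dune's actual schema.

```python
import sqlite3

# Made-up sample rows: (block_date, sender, value_eth).
SAMPLE_ROWS = [
    ("2022-04-01", "0xaaa", 1.5),
    ("2022-04-01", "0xbbb", 0.5),
    ("2022-04-02", "0xaaa", 2.0),
]


def daily_volume(rows):
    """Aggregate transaction count and volume per day with plain SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (block_date TEXT, sender TEXT, value_eth REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT block_date, COUNT(*) AS tx_count, SUM(value_eth) AS volume_eth
        FROM transactions
        GROUP BY block_date
        ORDER BY block_date
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in daily_volume(SAMPLE_ROWS):
        print(row)
```

Once raw chain data has been decoded into relational tables, a dashboard metric is often just a `GROUP BY` like this one.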

  • Flipside Crypto

Like Dune Analytics, Flipside uses visual tools and automatically generated API interfaces to let users query complex data with simple SQL statements, and to copy and edit SQL queries written by others. Flipside actively works with leading crypto projects to incentivize on-demand analytics through structured bounty programs and mentoring, helping projects quickly gain the data insights they need to grow.

Flipside currently supports Ethereum, Solana, Terra, Algorand, and other public chain networks. On April 19, Flipside announced the completion of a $50 million financing round.

  • DeBank

DeBank is a tracker for DeFi portfolios. Through DeBank, users can track and manage in one place the DeFi applications they have interacted with, and check address balances and changes, asset distribution, authorizations, claimable rewards, loan positions, and more. It currently supports 1,147 protocols on 27 networks.

In April last year, DeBank officially launched its OpenAPI plan, which comprises 28 APIs, including access to all protocols on a chain, access to all chains supported by a protocol and their contract addresses, and access to real-time portfolios within a protocol. All institutions and individual developers can apply to become official partners and access DeBank’s DeFi analysis data in real time. imToken, TokenPocket, Math Wallet, Mask, Hashkey Me, OneKey, and Zerion are all currently using DeBank’s API, and DeBank has thereby successfully extended its market from data applications to data query and indexing.

  • CyberConnect

CyberConnect is a decentralized social graph protocol. Its solution is to build a scalable, standardized social graph module that developers can port into new applications with a small amount of code, saving time and money. For end users, their own social data becomes a portable personal asset that can also be carried easily into new applications, breaking down the platform-to-platform barriers of the Web2 world.

  • RSS3

RSS3 is a next-generation data indexing and distribution protocol derived from the RSS protocol. It lets users generate RSS3 files based on their addresses and link their Twitter, Mirror, Instant, and other social platforms to the file. The file synchronizes the user’s assets, content, and behavior data (transactions, likes, reposts, and so on) in real time and stores this information in RSS3’s decentralized network. With users’ permission, developers can retrieve the content users have published on different platforms through the corresponding API interfaces, then filter and display it according to the characteristics of their application.

  • Go+

Built on its own “security engine”, Go+ is committed to creating a “secure data layer” for the Web3 world. Its token security monitoring feature for consumer (C-end) users has already been released. Users can enter a token contract address to get nearly 30 security checks across three areas: contract security, transaction security, and information security, covering ETH, BSC, Polygon, Avalanche, Arbitrum, HECO, and other public chain ecosystems. Go+’s security API can also be called by other developers and downstream applications to build a more secure crypto ecosystem around their own projects. These security APIs include token detection, NFT detection, real-time risk warnings, dApp contract security, interaction security, and more.

The emergence of Go+ points to a trend in the Web3 data track: the verticalization of data indexing. SevenX found in its research that as the number of protocols and projects surges and user behavior grows more complex, more and more vertical data scenarios are appearing in the data market. These scenarios are characterized by non-generic data and high-frequency user demand, and their users are both data consumers and data providers. In the future, more and more data indexing, query, and analysis services for these vertical scenarios will emerge as market disruptors.

The fourth layer, data analysis and application

This layer is aimed directly at C-end users (C-end in a broad sense, not only individual users) and delivers ready-to-use data products. These products do all the heavy, complicated work for users and present the value of data directly, each through the lens of its own data methodology. Participants in this layer can be roughly grouped by data type: on-chain transactions, token prices, DeFi protocols, DAOs, NFTs, security, social networking, and so on. Of course, more and more projects are no longer limited to a single type of data and aim to become more comprehensive data analysis platforms.

  • Blockchain explorers

These are probably the earliest products in the data application layer. They let users search on-chain information directly through a web page, including chain data, block data, transaction data, smart contract data, address data, and more.

Glassnode & Messari &

Blockchain data and information provider, providing investors with on-chain data and transaction intelligence from different perspectives & indicators, and outputting market analysis insights and research reports.

CoinGecko & CoinMarketCap

A token analysis tool to observe and track token prices, trading volumes, market caps, and more.

Token Terminal

Analyze DeFi projects with traditional financial metrics such as P/S ratio, P/E ratio, and protocol revenue. The analysis of the NFT trading market is also currently supported.
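The metrics mentioned here reduce to simple arithmetic. A minimal sketch, assuming one common convention: P/S divides market capitalization by annualized total fees paid by users, and P/E divides it by the annualized share of revenue that accrues to the protocol. The function names and the 30-day annualization window are assumptions, not Token Terminal's exact methodology.

```python
def annualize(amount_30d):
    """Scale a trailing 30-day figure to a full year."""
    return amount_30d * 365 / 30


def price_to_sales(market_cap, total_fees_30d):
    """P/S: market cap divided by annualized total fees paid by users."""
    annual = annualize(total_fees_30d)
    if annual <= 0:
        raise ValueError("annualized fees must be positive")
    return market_cap / annual


def price_to_earnings(market_cap, protocol_revenue_30d):
    """P/E: market cap divided by annualized revenue kept by the protocol."""
    annual = annualize(protocol_revenue_30d)
    if annual <= 0:
        raise ValueError("annualized protocol revenue must be positive")
    return market_cap / annual
```

With these definitions, a protocol that passes most fees to liquidity providers will show a P/E far above its P/S, which is exactly the gap such dashboards are meant to surface.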


DeFiLlama

DeFiLlama is a data analysis platform focused on DeFi TVL. It covers the TVL of nearly a thousand DeFi protocols on 107 Layer 1 and Layer 2 networks, which can be classified, compared, and viewed across different indicators and time dimensions. DeFiLlama now also supports NFT analysis, focusing on the transaction volume and collection types of different marketplaces on different chains.


A data platform focusing on the NFT market, providing services such as data analysis and whale wallet monitoring, aiming to help users better track and evaluate the value of NFT projects and assets, and help make informed investment decisions.


Nansen

If Nansen could be summed up in one word, it would be “label”. Nansen has cumulatively analyzed more than 50 million Ethereum wallet addresses and their activity, combining on-chain data with a database of millions of labels to help users find signals and new investment opportunities. Nansen is currently one of the highest-profile projects in the Web3 data analysis and application layer, and it completed a $75 million financing round in December last year at a valuation of $750 million.


Chainalysis

Founded in 2014, Chainalysis, known as the “FBI on-chain,” is an enterprise data solutions company that monitors and analyzes on-chain data to help clients such as governments, cryptocurrency exchanges, international law enforcement agencies, and banks meet compliance requirements, assess risks, and identify illegal activities. Last June, Chainalysis announced a $100 million Series E financing at a valuation of $4.2 billion.

Footprint Analytics

Footprint is a comprehensive data analytics platform for discovering and visualizing blockchain data. Compared with other applications, Footprint has a lower barrier to entry and is very friendly to novice users. The platform provides rich data analysis templates, supports one-click forking, and helps users easily create and manage personalized dashboards. Footprint also labels wallet addresses and their on-chain activity, helping users make investment decisions.

Zerion & Zapper

The earliest DeFi portfolio trackers and managers have also added support for NFT assets.


DeepDAO

DeepDAO is a comprehensive data platform focused on DAOs. Users can easily view treasury size and changes, the distribution of treasury tokens, governance token holdings, active members, proposals, voting status, and more. DeepDAO also provides dozens of tools for creating and managing DAOs.

There are many more applications in this layer, which are not listed here.

In fact, SevenX has been following the data track since very early on and has invested in DeBank, Footprint, Zerion, DeepDAO, RSS3, CyberConnect, and Go+. In the course of screening projects, we have formed some experience and judgments, which are shared briefly here:

In general, application-layer traffic is no longer the core barrier. Users may migrate quickly at any time because another product is easier to use or updates faster. Products that can supply data themselves and form a closed data loop with their users will be more competitive, although traffic-driven products may still be able to feed that traffic back into such capabilities before barriers form.

How do we evaluate projects? We use the following five dimensions:

1. Scenario selection:

(1) Is there demand, and is that demand already mature or will it only emerge in the future?

When a project looks for a need to serve, it has to judge the maturity or stage of that need. Take Go+ (GoPlus) as an example. In the DeFi world, “security” is already a necessity and something almost everyone agrees they need, while threats are endless and varied and hard for ordinary users to identify and guard against. This demand was activated by security incidents and has gradually matured since. Today people would rather take an extra step, or pay a reasonable price, for a safer experience.

(2) Build the C-end product or the protocol first?

We believe that when the needs of a scenario have not yet been fully stimulated, a team should build a C-end product first to find user pain points. Otherwise it can easily end up as a hammer looking for nails. For example, GoPlus built the Go Pocket wallet in its early days, which works like a show home. With a show home, partners can better understand what problem the product solves, which is a great help in acquiring B-end clients when the protocol is extended later.

Going forward, SevenX will focus on scenarios such as GameFi, DeFi, DAO, NFT, social, and security.

2. Data capability:

Data acquisition and structuring are the basic skills, but the key is whether the team’s data capabilities are grounded in insight into the industry.

3. C-end product capabilities:

C-end product capability mainly depends on whether the team can identify an urgent need among its audience to use for a cold start, and whether the product is easy to use.

4. To B expansion capabilities:

Expanding To B is a complex decision-making process. Whether the team can win benchmark customers, and whether it can efficiently acquire long-tail users in line with its product positioning, both need to be considered.

5. Team background:

(1) Web2 experience in the relevant vertical, including having independently run a project

(2) Open source community experience

(3) Ability to learn quickly and without prejudice

Possibilities of Web3 Data

As on-chain analysis increases, the anonymity of the blockchain is gradually being eroded. For example, you can follow the transaction addresses and trading behavior of whales using Nansen’s labels, and you can identify the activities and organizations an address takes part in from the address and its on-chain behavior. This exposes our data to daylight and takes away our ability to choose privacy. Nansen recently said that more than 100 million wallets have been labeled, which makes the need for privacy ever more pressing.

Current privacy solutions mainly include privacy coins, privacy computing protocols, privacy transaction networks, and privacy applications.

If we want to protect our on-chain transactions, disclose our activities selectively, or keep a process invisible while making its results visible, we can choose privacy computing protocols such as Oasis Network. Commonly used technologies include zero-knowledge proofs, secure multi-party computation, federated learning based on modern cryptography, and Trusted Execution Environments (TEE).

However, the protocols currently available are relatively limited, and most are still in development. The most popular is Secret Network, which has launched applications such as the cross-chain bridge Secret Bridge, the privacy DeFi protocol Sienna Network, the privacy transaction protocol Secret Swap, and the trustless Bitcoin privacy protocol Shinobi Protocol.

Since the second half of 2021, leading VCs and developers have been flooding into the privacy track. We believe that as this market develops, people will find a balance between using data to generate greater value and honoring the blockchain’s fundamental spirit of protecting privacy.

Finally, let’s briefly discuss our judgment on the market trend: building a decentralized reputation system from multi-dimensional data vectors is one of the most important use cases in the Web3 data market. A reputation system makes it possible to unlock financial scenarios such as credit lending.

Lending has always been an important part of the DeFi ecosystem. The products on the market today are mainly collateralized lending (usually over-collateralized) and flash loans. Credit lending that does not rely (or does not rely entirely) on collateral has long been considered the most important direction of evolution, because credit creates a freer market for exchange.

However, the biggest obstacle to the introduction of credit lending in DeFi is that the lender only faces one address and cannot effectively verify the solvency of the borrower at the other end of the address and whether he has a bad credit record. Some solutions try to accomplish this goal by introducing off-chain credit data onto the chain, but the question of how to ensure the authenticity of the off-chain data itself and the on-chain process has not been well answered.

Now, with the gradual improvement of on-chain identity systems and the parallel growth of both analyzable data and data analysis tools, what a user creates, contributes, earns, and owns on-chain can gradually accumulate into that user’s reputation, making it possible for one address to evaluate the credit of another effectively. The Lens Protocol, backed by Aave, is doing exactly this: using NFTs to manage data and laying the foundation for unsecured on-chain credit loans.
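How a multi-dimensional data vector could turn into a credit decision can be sketched as follows. Everything here is a hypothetical illustration: the chosen dimensions, the weights, the caps, and the linear mapping from score to collateral ratio are assumptions, not a description of Lens Protocol or any live system.

```python
from dataclasses import dataclass


@dataclass
class AddressActivity:
    account_age_days: int
    repaid_loans: int
    defaulted_loans: int
    governance_votes: int


def reputation_score(a):
    """Combine normalized on-chain signals into a 0-100 score (assumed weights)."""
    age = min(a.account_age_days / 730, 1.0)           # capped at two years
    total_loans = a.repaid_loans + a.defaulted_loans
    repayment = a.repaid_loans / total_loans if total_loans else 0.0
    governance = min(a.governance_votes / 20, 1.0)     # capped at 20 votes
    return round(100 * (0.3 * age + 0.5 * repayment + 0.2 * governance), 2)


def required_collateral_ratio(score, max_ratio=1.5, min_ratio=1.0):
    """Linearly relax the collateral requirement as reputation grows."""
    return max_ratio - (max_ratio - min_ratio) * score / 100
```

Under these assumptions a brand-new address still faces the full 150% over-collateralization, while a long-lived address with a clean repayment record approaches 100%, which shows how reputation could gradually substitute for collateral.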

Closing thoughts

Although unicorns worth tens of billions of dollars have already grown up here, the Web3 data track has only just begun. In the torrent of on-chain applications, every bit and byte is defining what kind of Web3 citizen you are. We need to find a new order and a new paradigm to jointly resist the increase of entropy in this new world.

