Data has become a key strategic resource, often compared to oil, and is reshaping the development model of many industries. As the digital economy is built out and the data market expands rapidly, data is continuously generated, circulated and exchanged across industries.
Sharing and circulating data is the key step in releasing its value. As data exchange and sharing increase, risks around ownership, compliance and security have begun to emerge, making data hard to share and privacy and security hard to guarantee. How to share and circulate data and release its value while ensuring security and privacy is an urgent problem.
Status of data sharing business
At present, data owners in many industries and fields cannot share data successfully. The main obstacles are:
“Unwilling” to share: the data owner cannot benefit from the shared data, the industry lacks a system for evaluating the value of data, and the sharing participants lack incentives.
“Dare not” share: the data owner's security and privacy requirements cannot be met. Once data leaves its original usage scenario it becomes uncontrollable and may be used improperly or abused, which harms the owner's interests. There is also no real-time monitoring of shared data and no authorization mechanism for data use, and the owner cannot determine the scope of sharing or verify the legitimacy of the sharing participants.
“Not easy” to share: information standards differ across institutions. Without continuous, multi-source and standardized data resources, data exchange and sharing cannot become more efficient.
To address these three pain points, the industry proposes combining blockchain with privacy computing technology.
1 Privacy Computing
Privacy computing addresses the core issue, data privacy, and removes the data holder's concern that it “dare not” share. Privacy computing technologies include secure multi-party computation (MPC), trusted execution environments (TEE) and federated learning (FL), which together enable private data to be shared securely.
Secure multi-party computation lets several parties compute safely over their combined data without a trusted third party, ensuring that no data owner exposes anything other than the calculation result. It is used for private arithmetic operations, set operations and statistical analysis. A trusted execution environment relies mainly on trusted hardware: an isolated environment is built with the help of the CPU, encrypted data is decrypted and processed inside it, and software outside it (the operating system, BIOS, etc.) cannot read the data, which protects the privacy and security of the original data.
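To make the idea of computing on multi-party data without a trusted third party concrete, here is a minimal sketch of a joint sum using additive secret sharing, one common MPC building block. The article does not say which MPC protocol a platform would use, so the technique, the modulus and the function names below are assumptions chosen for illustration. The sketch runs all parties in one process and ignores networking and malicious behavior.

```python
import secrets

# Public modulus (an assumption for this sketch); all share arithmetic is done mod this value.
MODULUS = 2**61 - 1


def split_into_shares(value: int, n_parties: int) -> list[int]:
    """Split a value into n additive shares that sum to the value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_inputs: list[int]) -> int:
    """Simulate a joint sum in which no party reveals its own input."""
    n = len(private_inputs)
    # Each party i splits its input and sends share j to party j.
    all_shares = [split_into_shares(v, n) for v in private_inputs]
    # Party j adds up the shares it received. One partial sum on its own
    # is a uniformly random value and says nothing about any single input.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are published and combined into the result.
    return sum(partial_sums) % MODULUS
```

For example, secure_sum([120, 45, 300]) yields the total 465 while each simulated party only ever sees random-looking shares of the others' inputs. Inputs are assumed to be non-negative integers whose sum stays below MODULUS.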
In practice, application protocols must be built on top of secure multi-party computation or TEE so that privacy computing can be used in specific scenarios, including joint query, joint statistics, joint modeling and joint prediction.
Joint query: private set operations, including hidden queries and private set intersection, difference and union.
Joint statistics: numerical operations on private data, including addition, subtraction, multiplication, division, mean and variance.
Joint modeling: modeling under privacy computing, which supports multi-party joint training of models without exposing private data.
Joint prediction: prediction under privacy computing, which lets participants run offline or online predictions with trained models.
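As an illustration of the joint query scenario, the following is a toy sketch of private set intersection based on commutative blinding (a Diffie-Hellman style construction). The article does not specify a protocol, so this choice, the group parameter P and all function names are assumptions. The group is far too small and the hashing too naive for real use, and both parties are simulated in a single process.

```python
import hashlib
import math
import secrets

# A Mersenne prime used as a toy group modulus (an assumption; not production-grade).
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an item to a nonzero element of the multiplicative group mod P."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret_exponent() -> int:
    """Pick a secret exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(items: list[str], key: int) -> list[int]:
    """First blinding: H(x)^key mod P for each item."""
    return [pow(hash_to_group(x), key, P) for x in items]


def reblind(values: list[int], key: int) -> list[int]:
    """Second blinding applied by the other party: v^key mod P."""
    return [pow(v, key, P) for v in values]


def private_intersection(items_a: list[str], items_b: list[str]) -> set[str]:
    key_a, key_b = new_secret_exponent(), new_secret_exponent()
    # A sends H(x)^a to B, and B returns H(x)^(a*b) in the same order.
    a_double = reblind(blind(items_a, key_a), key_b)
    # B sends H(y)^b to A, and A raises each value to a.
    b_double = set(reblind(blind(items_b, key_b), key_a))
    # Exponentiation commutes, so equal items produce equal double-blinded values.
    return {x for x, v in zip(items_a, a_double) if v in b_double}
```

Party A learns which of its own items are also held by B, while the items outside the intersection are only ever exchanged in blinded form.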
2 Blockchain
Blockchain technology provides co-governance and co-management capabilities and underpins trusted collaboration among data participants, data users and the operators of data circulation infrastructure throughout the data circulation process. It plays a key role in all three pain points of data sharing.
Solving “unwilling” to share: blockchain alliance governance provides a collaborative governance mechanism among the participants in data circulation and a service management mechanism for the trusted circulation of data elements. Through a voting strategy based on blockchain smart contracts, the alliance decides on adjustments to its members' profit distribution parameters, the entry and exit of members, and system upgrades and changes, which addresses the incentive problem for data sharers.
Solving “dare not” share: privacy computing addresses data privacy directly. Blockchain complements it by ensuring that the use, authorization and supervision of data during circulation are authentic, which eases the data owner's fear of data misuse, unauthorized use and forged authorization. A smart-contract-based ownership confirmation and authorization service provides ownership confirmation and permission control for each piece of data and, combined with mechanisms such as digital identity, ensures that data permissions and authorizations are traceable to the individual. A traceability audit service registers the key steps of trusted data circulation on the chain and provides multi-dimensional record auditing based on this trusted data, so that regulators can supervise and query the whole data circulation process.
Solving “not easy” to share: blockchain provides an on-chain data catalog and lifecycle management for data circulation tasks, which make retrieval convenient and collaboration smooth during sharing. The on-chain data catalog records the metadata of all data involved in circulation, including data name, unit, access method and release time, and provides services such as retrieval, classification and verification. Lifecycle management of data circulation tasks covers the lifecycle of distributed privacy computing tasks, task status management and participant management.
The blockchain provides secure, trusted storage for sharing, introduces data sharing contracts to achieve precise on-chain authorization of data, matches data supply with demand, and records transfers and exchanges such as receipts and grants. The issuance and revocation of data use certificates, and the arbitration of disputes arising during their use, are completed through the blockchain.
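The traceability and authorization records described above depend on the ledger being tamper-evident. The sketch below shows that single property with a hash-chained, append-only log. It is not a blockchain (there is no consensus, no smart contract and no network), and the class name AuditLedger and the event fields are hypothetical names introduced for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLedger:
    """Append-only log in which every entry commits to its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {
            "prev": prev_hash,
            "event": event,
            "hash": _entry_hash(prev_hash, event),
        }
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

An auditor who replays the log can detect a rewritten authorization record, because the stored hash no longer matches the recomputed one. On a real chain, consensus among the alliance members would additionally prevent a single operator from rewriting the whole log.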
Data sharing process
The previous sections introduced, at a macro level, the two core technologies of a blockchain-based data sharing platform. The sharing process described below ties these technologies together.
In short, the blockchain serves as a trusted store for meta-information and forms a data collaboration network on which business sharing processes are modeled and carried out. A business computing model is programmed against the on-chain meta-information, travels along the process, and is executed on each institution's local data; the sharing goal is achieved as the process runs. Throughout, the data of each institution never leaves its own database, and only calculation results are shared and transmitted.
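The idea that the model travels to the data, rather than the data to the model, can be sketched as follows. The Institution class and the local_stats model are hypothetical names for this illustration. Note that the per-institution counts and totals returned here are themselves disclosed to the caller; a real deployment would presumably protect them with the privacy computing techniques described earlier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Institution:
    name: str
    local_rows: list[float]  # raw data; never returned to the caller

    def run(self, model: Callable[[list[float]], dict]) -> dict:
        """Execute a shipped model on local data and return only its result."""
        return model(self.local_rows)


def local_stats(rows: list[float]) -> dict:
    """The 'business computing model': it reduces raw rows to two aggregates."""
    return {"count": len(rows), "total": sum(rows)}


def joint_mean(institutions: list[Institution]) -> float:
    # The same model is sent to every institution in turn.
    results = [inst.run(local_stats) for inst in institutions]
    total = sum(r["total"] for r in results)
    count = sum(r["count"] for r in results)
    return total / count
```

With two institutions holding [1.0, 2.0, 3.0] and [5.0], joint_mean returns 2.75 without either list being copied out of its owner.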
1 Participant role
Data provider: the owner of the data. The data provider protects its local users' data with cryptographic processing.
Initiator: the party that needs the shared results. It submits sharing task requests to the platform.
Participant: the actual executor of a data exchange and sharing task, which also contributes local data to the calculation.
Coordinator: coordinates computing tasks, including process scheduling and task execution.
A sharing task flow can have one or more data providers and participants, and a coordinator must always take part.
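The role constraints stated above (at least one data provider, at least one participant, and a mandatory coordinator) can be captured in a small data structure. The SharingTask class and its field names are hypothetical and only mirror the roles listed in this section.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharingTask:
    initiator: str
    coordinator: Optional[str] = None
    data_providers: list[str] = field(default_factory=list)
    participants: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of rule violations; an empty list means the task is valid."""
        problems = []
        if not self.coordinator:
            problems.append("a coordinator must take part")
        if not self.data_providers:
            problems.append("at least one data provider is required")
        if not self.participants:
            problems.append("at least one participant is required")
        return problems
```

A platform could run such a check before scheduling a task, rejecting any request for which validate() returns a non-empty list.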
2 Sharing process
The blockchain-based data sharing platform establishes a unified set of implementation standards for the data sources in the network, covering data representation, indexing, location, query, exchange and traceability auditing for collaboration and sharing, and provides capabilities such as business process customization. It supports rapid development and deployment of shared services and business cooperation, enables trusted data interconnection between institutions, and addresses security and privacy issues in data collaboration.
The process has two parts: release by the data provider, and acquisition and use by the data demander.
(1) Release process of the data provider
Import: upload data to the local data sharing node. Import addresses data management requirements, and the import process converts the data to a uniform representation. Processing differs by data type and may involve sharding and file system services.
Naming: a self-describing data structure yields a network-wide unique ID, and a file-system-style path is also provided, so that data can be located and searched for in the network.
Publishing: publish the meta-information of shareable data (such as data title and usage description) to the blockchain, optionally setting default access permissions for certain institutions.
Synchronization: synchronize the metadata to the blockchain-based data sharing platform. The platform holds the collection of on-chain data records, which any party can query to discover the data available.
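The release steps above can be sketched in a few lines. This example assumes the network-wide unique ID is derived from a hash of the content (content addressing), which is one common way to build self-describing identifiers; the article does not state the actual scheme. LocalNode, OnChainCatalog and their methods are hypothetical names, and the catalog is an in-memory dictionary standing in for on-chain storage.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Naming: derive an ID from the content itself, prefixed with the hash type."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


class LocalNode:
    """Import: raw data stays in the institution's own node."""

    def __init__(self, institution: str):
        self.institution = institution
        self.store: dict[str, bytes] = {}

    def import_data(self, data: bytes) -> str:
        data_id = content_id(data)
        self.store[data_id] = data
        return data_id


class OnChainCatalog:
    """Publishing and synchronization: only metadata is recorded here."""

    def __init__(self):
        self.records: dict[str, dict] = {}

    def publish(self, data_id, owner, title, usage, allowed_institutions=()):
        self.records[data_id] = {
            "owner": owner,
            "title": title,
            "usage": usage,
            "allowed_institutions": sorted(allowed_institutions),
        }

    def search(self, keyword: str) -> list[str]:
        """Return the IDs of records whose title contains the keyword."""
        keyword = keyword.lower()
        return [i for i, m in self.records.items() if keyword in m["title"].lower()]
```

Because the ID depends only on the bytes, any party holding the same data derives the same ID, and the catalog entry can later be used to verify that delivered data matches what was published.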
(2) Acquisition process of the data demander
The demander obtains data, arranges business processes and initiates data sharing. Nodes assign sharing tasks according to the process definition, and each node's virtual machine loads the business computing model and executes its logic on local data.
Retrieval: search for the required data through the blockchain-based data sharing platform, whose retrieval index is built from the unified data descriptions and on-chain metadata.
Request: request an access credential (Token) for the relevant data; the platform provides a points mechanism. Depending on how the data was released, points are transferred and the data authorization is recorded when the data is requested. When access requires authorization, the demander submits a data access application to the data provider and, after the provider's review, obtains a Token issued by the smart contract.
Acquisition: the data demander applies its own customized business process and calculation model, which travels to the different institutions as the process dictates. It obtains data from the provider using the network node ID and the acquired Token, and the data or calculation results are transmitted point-to-point through the blockchain-based data sharing platform.
Use: the calculation model works on data obtained through authorization or purchased with points; it is loaded at the data-holding party and run on that party's data to produce the calculation result.
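The Token issued after the provider's review can be pictured as a signed, expiring statement of who may access which data. The sketch below uses an HMAC with a shared key as a stand-in for the smart contract's signature; this is an assumption made to keep the example within the standard library, and a real contract would more likely use asymmetric signatures. The function names and claim fields are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(contract_key: bytes, data_id: str, requester: str,
                ttl_seconds: int, now: float = None) -> str:
    """Issue a token binding a requester to one data ID until an expiry time."""
    now = time.time() if now is None else now
    claims = {"data_id": data_id, "requester": requester,
              "expires_at": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    signature = hmac.new(contract_key, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + signature


def check_token(contract_key: bytes, token: str, data_id: str,
                requester: str, now: float = None) -> bool:
    """Provider-side check: signature, data ID, requester and expiry must all match."""
    now = time.time() if now is None else now
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(contract_key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return (claims["data_id"] == data_id
            and claims["requester"] == requester
            and now < claims["expires_at"])
```

The provider's node would run check_token before releasing data or executing a model, so that an expired, altered or borrowed Token is rejected.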
Despite the rapid development of blockchain and privacy computing, their applications are still limited. Data circulation still relies mainly on transmitting raw data, and its privacy and security problems urgently need to be solved. The data circulation industry is developing rapidly in business, technology, policy and standards. To maximize the value of data, data must circulate fully, yet traditional data processing technology does too little for privacy and security, which restricts circulation. As blockchain, privacy computing, big data and other multi-party trusted collaboration and data processing technologies continue to develop, and as national policies, regulations and standards continue to improve, the problem of sharing and circulating data safely will gradually be solved and the value of data gradually released.
Posted by: CoinYuppie. Reprinted with attribution to: https://coinyuppie.com/a-preliminary-study-on-a-data-sharing-platform-based-on-blockchain-privacy-computing-technology/