The impact of zero-width characters on ENS may be greater than you think

Yesterday, “Hero Jie”, who describes himself as a senior Web2.0 domain name trader, published an article on his personal Mirror titled “Please stop registering all ENS domain names, because they are worthless”. He says he has sold many well-known domain names, such as xiaomiquan and wuyinli, and that he still holds the high-quality domain name ouyi.

The article pointed out that a design oversight involving the “ZWJ” (zero-width joiner), a zero-width character invisible to the naked eye, creates a major security risk for ENS. The article circulated widely in some crypto communities and led some investors to question ENS.

This problem allows multiple .eth domain names that look identical to the naked eye to exist at the same time. Just as Web3.0 set out to revolutionize the traditional Internet, ENS in the Web3.0 era has also brought an upgraded form of phishing attack that never appeared in Web2.0.

At this stage, a “.eth” domain name is used mostly as a “screen name”. A unique .eth domain name is like a QQ name in the Web2.0 era. In an application scenario like this, which can hardly be called infrastructure, some design oversights may trouble users, but they will not shake the leading position of ENS among decentralized domain names.

But once the ENS vision is realized, can this oversight still be ignored? “Decentralised naming for wallets, websites, & more.” This is the grand mission written in bold on the ENS official website. In this vision, ENS becomes the domain name system that names all digital resources, and opening theblockbeats.eth is as natural as opening theblockbeats.info. At that point, the zero-width characters in ENS will bring profound security risks to the entire Web3.0 world.

Zero-width characters put ENS one step further away from becoming Web3.0 infrastructure.

I am V God, send me money

When you see “vitalik.eth”, who do you think this person is? There is no doubt that this ENS domain name is owned by V God (Vitalik Buterin). So, can I register this domain name? Under the rules of ENS, a domain name that has already been registered cannot be registered again by another user. But note that this only applies to names that are identical from the computer’s point of view. So, can I find a domain name that is different from V God’s to the computer but looks the same to a person?

Of course you can, as long as you insert a ZWJ at any position.

ZWJ (zero width joiner) is a rather special zero-width character. To a computer, ZWJ is still a character, with its own code point in the Unicode character set. If you type this character in Word, it is still counted in the character count. But its width is 0, which means that it is completely invisible to the naked eye.

This means that as long as I insert zero-width characters anywhere in the word “vitalik”, I can register an ENS domain name that looks exactly the same as V God’s domain name to the naked eye.
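
As a minimal sketch of this point, the Python snippet below compares the real label with a look-alike. The look-alike is a hypothetical example built for illustration (a ZWJ placed after “vita”), not a name taken from the article.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

real = "vitalik"
# Hypothetical look-alike: the same letters with a ZWJ after "vita".
fake = "vita" + ZWJ + "lik"

print(real == fake)           # False: to the computer these are different strings
print(len(real), len(fake))   # 7 8: the invisible character still counts
print(unicodedata.name(ZWJ))  # ZERO WIDTH JOINER
print([f"U+{ord(ch):04X}" for ch in fake])  # code points, including U+200D
```

Printed in most terminals or browsers, both strings render as “vitalik”, which is exactly why the comparison has to be done on code points rather than by eye.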

When registering an ENS domain name, you can insert a zero-width character into the name by typing “%E2%80%8C” or “%E2%80%8D” at any position. In this way, a name that looks the same as V God’s ENS name can be registered successfully. If the name with one zero-width character has already been taken, you can insert two, three, four, or more zero-width characters in a row until you reach a name that no one has registered.
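
The sketch below shows, under simple assumptions, how a percent-encoded input like the ones above decodes into a label with hidden characters, and how a checker could locate them. The function names `decode_label` and `zero_width_positions` are hypothetical helpers, not part of any ENS tool.

```python
from urllib.parse import unquote

# The two zero-width characters named in the article: U+200C and U+200D.
ZERO_WIDTH = {"\u200c", "\u200d"}


def decode_label(percent_encoded: str) -> str:
    """Decode percent-encoded input; unquote() assumes UTF-8 by default."""
    return unquote(percent_encoded)


def zero_width_positions(label: str) -> list[int]:
    """Return the indexes of every zero-width character in the label."""
    return [i for i, ch in enumerate(label) if ch in ZERO_WIDTH]


one = decode_label("vita%E2%80%8Dlik")
several = decode_label("vitalik%E2%80%8C%E2%80%8D")

print(zero_width_positions("vitalik"))  # []
print(zero_width_positions(one))        # [4]
print(zero_width_positions(several))    # [7, 8]
```

An empty list means the label contains neither character; any other result is a reason to warn the user before showing or resolving the name.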

If ENS does not work well as a domain name system, its narrative cannot be sustained

ENS is not only an important piece of infrastructure for the Ethereum network, but also for the Web3.0 network as a whole. The founder of ENS has publicly stated that the vision of ENS is to be “a domain name service provider for every digital resource in the world”: a naming system not only for users’ account names, but for the entire Web3.0 network.

Remember how the first version of Web3.0 was imagined in the early years? Decentralized storage preserves files, decentralized domain names provide the addressing system, smart contracts provide on-chain computation, and decentralized wallets act as payment channels. In this version of Web3.0, everything runs on decentralization. It is a truly free Internet, without permission and without censorship. In this version, using a Web3.0 browser to access theblockbeats.eth is as natural as opening theblockbeats.info.

Unfortunately, this version of Web3.0 has not yet been realized, and mainstream browsers do not yet support access to .eth domain names. Although ENS is still under continuous construction, it seems difficult for it to become the mainstream infrastructure for this version of Web3.0. And if it really is built, it will leave a huge security risk for browsing in the Web3.0 era.

Recall carefully, how did you open this article?

You must have seen a link to this article somewhere, and a mouse or finger click brought you to this page. You almost certainly did not type the long string https://www.theblockbeats.info/news/28611 into the address bar. Almost all users surf the web by following hyperlinks. Crisscrossing hyperlinks make up the Internet of our age: they organize its complex information, give search engines a technical basis for finding information, and provide an open and free channel for interconnecting information. It can be said that without hyperlinks, there would be no Internet as we know it today.

Can Web3.0 websites based on ENS domain names do all of this? At the moment, at least, it is extremely difficult, because such links carry a great security risk.

In the Web2.0 era, phishing websites have continually caused serious losses to netizens all over the world, and that is in a system where domain names cannot be exactly duplicated. Imagine that you see a link shared by a netizen while surfing the Internet. To the naked eye, the link points to a well-known platform: the spelling of the domain name matches the real address exactly, so you click on it. But in fact, it is a phishing website forged with zero-width characters.

As long as users are only making point-to-point transfers, the habit of typing names manually makes zero-width characters little more than a trivial prank. Once ENS tries to fulfil its mission of naming all digital resources, all this changes. A Web2.0 phishing domain only resembles the real domain name, while a Web3.0 phishing domain can look completely identical to it. This will be a major security hazard.

We live in an Internet woven from hyperlinks: DeFi, trading platforms, Web3.0 blogs, Web3.0 social networking; website links, dapp links, API links, entry links for all use cases… If a .eth domain name in the form of a link is no longer credible, how can .eth expand its use cases beyond the “screen name”? How can it become Web3.0 infrastructure? How can the grand narrative of ENS domain names continue to unfold? This risk may fundamentally affect ENS’s valuation.

Ironically, this problem does not even exist in Web 2.0.

How does Web2.0 solve this problem?

The Web2.0 solution is simple and straightforward: it does not allow zero-width characters to be mixed with Latin letters in a domain name. For details, refer to the “IDNA2008” specification and the “UTS46” standard.

Earlier, we mentioned the two mysterious codes for zero-width characters, “%E2%80%8C” and “%E2%80%8D”. These are percent-encoded UTF-8 bytes written in hexadecimal. The corresponding Unicode code points are “U+200C” (zero width non-joiner, ZWNJ) and “U+200D” (zero width joiner, ZWJ). These characters are usually used in Arabic and Indic scripts to control whether adjacent characters form ligatures. Keyboard layouts for most other languages provide no way to type them.
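
To make the link between the code points and the “%E2…” codes concrete, here is a small worked sketch. It applies the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx) by hand and checks the result against Python’s built-in encoder. The helper name `utf8_three_bytes` is made up for this illustration.

```python
from urllib.parse import quote


def utf8_three_bytes(code_point: int) -> bytes:
    """Encode a code point in the range U+0800..U+FFFF by hand.

    Layout: 1110xxxx 10xxxxxx 10xxxxxx (16 payload bits in total).
    """
    assert 0x0800 <= code_point <= 0xFFFF
    return bytes([
        0xE0 | (code_point >> 12),          # top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # middle 6 bits
        0x80 | (code_point & 0x3F),         # low 6 bits
    ])


for cp in (0x200C, 0x200D):
    by_hand = utf8_three_bytes(cp)
    assert by_hand == chr(cp).encode("utf-8")  # matches the built-in encoder
    hex_bytes = by_hand.hex(" ").upper()
    print(f"U+{cp:04X}", hex_bytes, quote(chr(cp)))
# U+200C E2 80 8C %E2%80%8C
# U+200D E2 80 8D %E2%80%8D
```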

In the domain registration of Web2.0, such relatively special characters are not accepted. But this does not mean that Web2.0 has no similar means of attack. In fact, phishing websites disguised by similar-looking domain names have always been widespread in the Web 2.0 world.

For example, can you accurately distinguish between “e” and “е”, “a” and “а”, “Ο” and “O” and “О”? These letters include the Latin alphabet that we frequently use, as well as the Cyrillic and Greek letters that are rarely used.
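
A quick way to answer that question is to ask the computer rather than your eyes. The sketch below prints the Unicode names of look-alike pairs of this kind. The specific code points are written as escapes chosen for this illustration, so they are an assumption about which characters the examples above use.

```python
import unicodedata

# (Latin letter, look-alike from another script), written as escapes so the
# difference is visible in the source code.
pairs = [
    ("e", "\u0435"),  # Cyrillic small letter ie
    ("a", "\u0430"),  # Cyrillic small letter a
    ("O", "\u039f"),  # Greek capital letter omicron
    ("O", "\u041e"),  # Cyrillic capital letter o
]

for latin, other in pairs:
    print(
        latin == other,  # always False: different code points
        f"U+{ord(latin):04X} {unicodedata.name(latin)}",
        "|",
        f"U+{ord(other):04X} {unicodedata.name(other)}",
    )
```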

At first, domain name registration only supported ASCII characters, that is, the “English letters” and Arabic numerals of everyday speech. ASCII is also the most widely used character set in the world: almost every device that can display characters supports ASCII, but may not display other characters correctly. After IDN (Internationalized Domain Names) became popular, domain name registration added support for multiple languages and scripts, extending the supported characters from ASCII to part of the Unicode character set. This allows people all over the world to register domain names in their native language. Taking Chinese as an example, you can access Xinhuanet directly through “http://Xinhuanet.中国/”.

Among so many scripts, it is not difficult to find characters that resemble Latin letters. Fraud that uses such look-alike characters to impersonate legitimate websites is gradually increasing. This kind of fraud is called a homograph attack.

As early as 2001, Israeli security researchers published a paper called “Homograph Attack” and registered a variant of microsoft.com containing Cyrillic letters. This is the earliest verifiable homograph spoofing domain name. The homograph problem thus has a long history in the Web2.0 era, but its harm and severity are far less than those of the same problem in Web3.0 ENS domain names.

Let’s take a set of IDN domain names as an example: ԚԚ.com, аӏірау.com, аӏірау.com. If you open these domain names, what do you see?

The browser automatically converts the domain name into one starting with “xn--”. This encoding is called Punycode.
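
The conversion can be sketched with Python’s built-in `punycode` codec. This is a simplified illustration: the helpers `to_ace` and `from_ace` are hypothetical names, they handle a single label, and they skip the case mapping and validity checks that a real IDNA implementation performs before encoding.

```python
def to_ace(label: str) -> str:
    """Return the ASCII ("xn--") form of one label; ASCII labels pass through."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Reverse to_ace(): decode an "xn--" label back to Unicode."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


print(to_ace("example"))        # example (already ASCII)
print(to_ace("b\u00fccher"))    # xn--bcher-kva

# A look-alike of "apple" that starts with Cyrillic U+0430 instead of Latin "a".
spoof = "\u0430pple"
ace = to_ace(spoof)
print(ace.startswith("xn--"), from_ace(ace) == spoof)  # True True
```

Because the ASCII form of the look-alike no longer resembles the real name, showing it in the address bar gives the user a chance to notice the substitution.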

In the “IDNA2003” specification, domain names undergo a preprocessing step that includes “compatibility normalization (NFKC)”, which folds many visually equivalent character variants into a single form. A domain name containing non-ASCII characters is then encoded with Punycode into an ASCII-only form (the “xn--” domain name). Mainstream browsers display this form for suspicious names, in line with the recommendations of UTR36, which reduces the risk of homograph attacks on the user side.
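
It helps to see what NFKC does and does not catch. The sketch below uses Python’s `unicodedata.normalize`. The three look-alike labels are hypothetical examples, and the snippet applies NFKC only, without the other mapping and prohibition rules of a full IDNA implementation.

```python
import unicodedata


def nfkc(label: str) -> str:
    """Apply Unicode compatibility normalization (NFKC) to a label."""
    return unicodedata.normalize("NFKC", label)


fullwidth = "\uff56italik"    # starts with FULLWIDTH LATIN SMALL LETTER V
cyrillic = "vit\u0430lik"     # Cyrillic "a" in place of Latin "a"
zero_width = "vita\u200dlik"  # ZWJ hidden in the middle

print(nfkc(fullwidth) == "vitalik")   # True: the compatibility variant is folded
print(nfkc(cyrillic) == "vitalik")    # False: a different letter, not a variant
print(nfkc(zero_width) == "vitalik")  # False: NFKC alone leaves the ZWJ in place
```

NFKC on its own removes only one class of look-alikes, so homoglyphs and zero-width characters need separate rules.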

Similarly, ICANN has set corresponding rules for IDN domain name registration, and domain name registries in various countries are gradually following suit. For example, the domain name registry in Russia has banned mixing Cyrillic and Latin letters in .ru domain names.
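
A rule of this kind can be approximated in a few lines. The sketch below is a heuristic, not any registry’s actual implementation: it guesses each letter’s script from the first word of its Unicode character name, which works for Latin and Cyrillic letters but is not a substitute for the real Unicode Script property. The function names are hypothetical.

```python
import unicodedata


def scripts(label: str) -> set[str]:
    """Approximate the set of scripts used by the letters in a label.

    Heuristic: take the first word of each letter's Unicode name,
    e.g. "LATIN SMALL LETTER A" -> "LATIN".
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return found


def mixes_latin_and_cyrillic(label: str) -> bool:
    """True if the label contains both Latin and Cyrillic letters."""
    return {"LATIN", "CYRILLIC"} <= scripts(label)


print(mixes_latin_and_cyrillic("paypal"))        # False: Latin only
print(mixes_latin_and_cyrillic("p\u0430ypal"))   # True: one Cyrillic letter inside
print(mixes_latin_and_cyrillic("\u043f\u0440\u0438\u0432\u0435\u0442"))  # False: Cyrillic only
```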

ENS domain names undoubtedly support far more characters than DNS domain names. Not only can you register names in various scripts as with DNS, you can also register names containing emoji, as well as names containing the zero-width characters whose security risks are being hotly discussed this time. (It is worth mentioning that emoji domain names are not unique to Web3.0. Traditional top-level domains such as “.tk” and “.ws” also support emoji registration.)

In Web3.0, can we eliminate this hidden danger by similar means?

Facing a homograph attack, ENS developers have an ambiguous attitude

Unfortunately, the ENS developers do not seem to plan to solve this problem at the point of registration.

In ENS community discussions, users raised this issue as early as April 2021. An ENS developer explained that support for zero-width characters sits at the contract level, so these characters, which may be used for fraud, cannot be removed. There is also a more important reason: zero-width characters are what make emoji work in ENS.

ENS founder nick.eth responded to the issue of zero-width characters: “We are not as strict as ICANN is on most gTLDs. Domain names such as emoji names make good use of ENS.” “ENS prohibits resolving non-UTS-46 domain names. This is not implemented at the contract level (it is impractical to write the specification into the contract); it should be part of what the client needs to solve.” He also gave users a positive signal, saying, “We will consider adding a rule to the normalization rules to prohibit the situation you have discovered.”

There are many emoji, and a large number of them are in fact composites of basic emoji. For example, “woman”, a zero-width character, and “rocket” together are recognized by the computer as “astronaut”. With zero-width characters, more expressions can be built on top of a compact code set, and this is the basis on which ENS supports almost all emoji. Therefore, ENS cannot simply block zero-width characters.
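
The “astronaut” example corresponds to the sequence U+1F469 (woman), U+200D (ZWJ), U+1F680 (rocket). The sketch below builds it and then shows one possible contextual rule: accept a ZWJ only when it sits between two symbol characters. The rule and the function names are assumptions for illustration, not the rule used by ENS or UTS46. It is deliberately simplified: sequences that contain skin-tone modifiers or variation selectors would need extra handling, and U+200C is not covered.

```python
import unicodedata

ZWJ = "\u200d"
astronaut = "\U0001F469" + ZWJ + "\U0001F680"  # woman + ZWJ + rocket

print(len(astronaut))                           # 3 code points, drawn as one emoji
print([unicodedata.name(ch) for ch in astronaut])


def is_symbol(ch: str) -> bool:
    """Rough emoji test: general category "So" (Symbol, other)."""
    return unicodedata.category(ch) == "So"


def zwj_only_between_symbols(label: str) -> bool:
    """Accept a ZWJ only when both of its neighbours are symbol characters."""
    for i, ch in enumerate(label):
        if ch != ZWJ:
            continue
        if i == 0 or i == len(label) - 1:
            return False  # a ZWJ at either end joins nothing
        if not (is_symbol(label[i - 1]) and is_symbol(label[i + 1])):
            return False  # a ZWJ next to a letter is suspicious
    return True


print(zwj_only_between_symbols(astronaut))        # True
print(zwj_only_between_symbols("vita\u200dlik"))  # False
print(zwj_only_between_symbols("vitalik"))        # True (contains no ZWJ)
```

A rule along these lines would keep composite emoji working while rejecting a ZWJ hidden between letters, which is the case that enables the look-alike names.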

Earlier, we mentioned the Web2.0 “.tk” domain, the first domain in the world to support emoji. How do traditional Web2.0 domain names solve this problem? The “IDNA2008” specification and the “UTS46” standard mentioned above strictly regulate the use of zero-width characters in different scripts, as well as the use of emoji.

During the discussion in April, Nick explained to community members that the use of zero-width characters is handled at the smart contract level: “However, this is very good, and the design of ENS has always been like this.” “A whitelist rule is useless here, because a domain name can contain many kinds of characters, not just emoji.”

Risk control and hidden danger elimination

So far, we have not seen the ENS team take any measures to fix this security risk at the contract level. All precautions against this risk are taken by centralized Web2.0 front ends.

In OpenSea, ENS domain names containing zero-width characters are marked with a yellow exclamation mark.

In Etherscan, ENS domains with similar risks are marked with an asterisk.

In MetaMask, although there is no additional risk warning, MetaMask can recognize that the string contains a zero-width character and displays this character as “?”.
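
Front-end safeguards of this kind are simple to sketch. The snippet below is a hypothetical illustration of the two ideas described above (a warning flag, and a visible placeholder for invisible characters). It is not the actual code of OpenSea, Etherscan, or MetaMask, and the function names are made up.

```python
# Zero-width characters discussed in the article: U+200C and U+200D.
ZERO_WIDTH = {"\u200c", "\u200d"}


def needs_warning(name: str) -> bool:
    """True if the name contains a zero-width character and should be flagged."""
    return any(ch in ZERO_WIDTH for ch in name)


def display_name(name: str, placeholder: str = "?") -> str:
    """Render the name with every zero-width character made visible."""
    return "".join(placeholder if ch in ZERO_WIDTH else ch for ch in name)


suspicious = "vita\u200dlik.eth"
print(needs_warning("vitalik.eth"), display_name("vitalik.eth"))  # False vitalik.eth
print(needs_warning(suspicious), display_name(suspicious))        # True vita?lik.eth
```

Note that a blanket check like this would also flag legitimate composite emoji names, so a real interface would need to decide how to treat those.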

With the help of centralized methods, the security risks of ENS domain names are reduced to some extent. But when we enter a completely open Web3.0 world, how much effect will centralized methods have? If this hidden danger cannot be eliminated, ENS remains far from its vision of naming all digital resources.

One day in the future, someone will send you a link to an announcement at http://www.binance.eth. Do you dare to click it?

Posted by: CoinYuppie. Reprinted with attribution to: https://coinyuppie.com/the-impact-of-zero-width-characters-on-ens-may-be-greater-than-you-think/
