Anyone who follows the consumer electronics industry will be familiar with TSMC. As the world's largest contract semiconductor manufacturer with the most available capacity, it handles chip foundry work for many giants, including Apple, AMD, and MediaTek, and its production process directly affects the release schedules, shipment volumes, performance, and even quality of those companies' chip products.
Because of this, TSMC's recent announcement of a series of experimental results on "on-chip water cooling" naturally piqued our curiosity.
What is on-chip water cooling? Integrating the heat dissipation structure into the chip itself
As we all know, today's semiconductor chips generate a great deal of heat during operation, so a heatsink is needed to carry it away. The question is: how does the heatsink actually pick up the heat from the chip?
Some may say this is simple: apply a layer of thermal grease to the chip, then mount the heatsink. The grease transfers heat from the surface of the CPU to the base of the heatsink, heat pipes or water-cooling tubes carry it to the fins, and finally a fan blows the heat off the fins or radiator. Isn't that the whole heat dissipation process?
On a high-end desktop CPU, a layer of thermally conductive material sits under the metal lid, and beneath that is the actual silicon die
That is true, but there is a common misunderstanding here: what we usually think of as the "top" of the chip is, in most cases, just a protective metal lid. Below it is a layer of thermally conductive material, and only then the actual silicon die. In other words, heat must pass through several layers of conduction (silicon die → internal thermal interface material → CPU metal lid → external thermal grease) before it reaches the heatsink.
Moreover, even on chips without a metal lid, such as GPUs and notebook CPUs, the heat-generating parts do not actually touch the heatsink directly. The smooth layer of silicon on top of the die is essentially a "lid" of its own: it protects the circuit structure beneath, but it also impedes the transfer of heat.
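The layered heat path described above behaves like resistors in series: each layer adds thermal resistance, and the die's temperature rise above ambient is simply the chip's power times the total resistance. Here is a minimal sketch of that model; the resistance values are illustrative assumptions, not measured data for any real product:

```python
# Series thermal-resistance model of a conventional desktop CPU stack.
# Heat flows: silicon die -> internal TIM -> metal lid (IHS) -> thermal
# grease -> heatsink fins -> air. Resistances are in kelvin per watt (K/W);
# all values here are illustrative assumptions, not measurements.

LAYERS = {
    "silicon die bulk": 0.02,
    "internal TIM":     0.08,
    "metal lid (IHS)":  0.03,
    "thermal grease":   0.10,
    "heatsink to air":  0.15,
}

def junction_temp(power_w: float, ambient_c: float = 25.0) -> float:
    """Die temperature = ambient + power * total series resistance."""
    return ambient_c + power_w * sum(LAYERS.values())

for watts in (95, 125, 250):
    print(f"{watts:>3} W -> die at about {junction_temp(watts):.1f} deg C")
```

Note how every extra layer in the stack raises the die temperature at a given wattage, which is exactly why the layers matter.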
With this in mind, the principle of TSMC's "on-chip water cooling" is not hard to understand. In short, TSMC experimented with three different designs. The first etches many grooves directly into the silicon cap layer on the surface of the chip and lets the heat-transfer liquid flow through them, carrying heat away. TSMC calls this design "DWC", presumably short for Direct Water Contact.
The remaining two "on-chip water cooling" designs are more conservative. During chip manufacturing, an extra layer of silicon etched with water channels is bonded on top of the die's smooth cap layer, so the chip leaves the factory with a cooling water path already built into its top. Compared with the first design, this requires an additional silicon layer on the surface of the die, so the two silicon layers must be joined with a thermally conductive material. Depending on that material, two variants have emerged: OX TIM (silicon-oxide thermal interface material) and LMT (liquid-metal thermal interface material).
Compared with the traditional approach, in which the user (or device manufacturer) applies thermal grease to the chip surface and mounts a heatsink, TSMC's "on-chip water cooling" works on the same basic principle. In essence, it thins the chip's surface layers, shortens the distance between the heat-generating parts and the cooling structure, and introduces more advanced, thinner thermal interface materials directly in the chip production step, so that heat moves more easily from the inside of the chip to its surface and on into the cooling structure.
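Viewing each layer as a thermal resistance in series makes the benefit concrete: DWC removes the lid and both TIM layers entirely, while the LMT variant keeps one thin bonding layer. A rough before-and-after comparison, where all resistance values are assumed for illustration and are not TSMC's data:

```python
# Compare a conventional lidded stack with thinner on-chip cooling stacks,
# treating each layer as a series thermal resistance in K/W.
# All resistance values are illustrative assumptions, not TSMC data.

CONVENTIONAL = [0.02, 0.08, 0.03, 0.10, 0.15]  # die, TIM1, lid, grease, heatsink
DWC          = [0.02, 0.05]                    # die, coolant channels (assumed)
LMT          = [0.02, 0.03, 0.05]              # die, liquid-metal bond, water path

def delta_t(power_w: float, stack: list) -> float:
    """Temperature rise of the die above the coolant, in kelvin."""
    return power_w * sum(stack)

POWER = 500  # watts, the class of chip such designs target
for name, stack in (("conventional", CONVENTIONAL), ("DWC", DWC), ("LMT", LMT)):
    print(f"{name:>12}: die runs {delta_t(POWER, stack):.0f} K above coolant")
```

Even with made-up numbers, the direction is clear: deleting layers from the stack cuts the temperature rise at the same power level.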
It also suggests that TSMC is well aware that future semiconductor chips will inevitably run hotter, and that new cooling designs closer to the "heat source" will be needed. So the question becomes: why will chips heat up more and more in the future?
First, chip design and semiconductor manufacturing have indeed hit a bottleneck
Semiconductor manufacturers keep insisting that Moore's Law has never failed. But whether you look at product design choices, power consumption figures, or actual energy-efficiency test results, the consumer chip industry has in fact entered a vicious cycle of "significantly better performance, but rapidly rising power consumption."
Take the familiar desktop CPU as an example. For years, most flagship products from Intel and AMD stayed at a 95 W power level. Then, starting with the 9th-generation Core, Intel raised the CPU's typical power consumption to 125 W, and the subsequent 10th- and 11th-generation Core parts exceed 200 W at turbo. The high-end workstation segment is no different: the Xeon E5-2699 v3 of 2014 drew only 145 W; in 2017 the E5-2699P v4, with a default power consumption of 300 W, appeared; and today the top-end 56-core Xeon Platinum 9282 carries a design power of a colossal 400 W.
AMD is not much better. The original Ryzen 7 1700X was also only 95 W; the Zen 2-based 3800X raised the typical figure to 105 W, and with PBO unlocked we even measured instantaneous power draw above 300 W. On the higher-end Ryzen Threadripper platform, the original 1950X was "only" 180 W, the 2990WX climbed to 250 W, and the latest Ryzen Threadripper PRO 3995WX has become a 280 W "monster" that conventional coolers struggle to tame.
Not only that: according to what has been revealed so far, Intel's next-generation Core will introduce a hybrid "big and little core" design to reduce everyday power consumption, but it may also bring higher peak performance and peak power draw. AMD's next-generation CPUs will again greatly expand cache capacity; unfortunately, cache itself is power-hungry, so the high-frequency models of the next-generation consumer Ryzen are said to reach a thermal design power of 170 W, a new record for the default TDP of a consumer CPU.
Die-area comparison of a top-tier, a high-end, and a low-end CPU
And rising power consumption is not confined to desktop CPUs. In the graphics card field, overseas enthusiasts recently tallied the performance and power consumption of past generations of NVIDIA and AMD products. It turns out that ever since the DX11 era many years ago, the performance per transistor per clock of both vendors' GPUs has been going steadily downhill. In other words, the performance gains of graphics cards over the years have essentially come from stacking more cores and pushing higher frequencies, while efficiency at the level of transistor structure and compute architecture has actually regressed.
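The per-transistor, per-clock figure those enthusiasts computed can be reproduced for any pair of GPUs by normalizing a benchmark score by transistor count and frequency. A sketch of the calculation; the GPU names and numbers below are hypothetical, invented only to illustrate the method:

```python
# Normalize a benchmark score by transistor count and clock frequency to
# estimate per-transistor, per-clock efficiency across GPU generations.
# The GPU names and figures below are hypothetical, for illustration only.

gpus = {
    # name: (benchmark score, transistors in billions, boost clock in GHz)
    "older_gpu": (10_000, 7.0, 1.6),
    "newer_gpu": (18_000, 17.0, 1.8),
}

def efficiency(score: float, transistors_b: float, ghz: float) -> float:
    """Score per (billion transistors x GHz): a scale-independent metric."""
    return score / (transistors_b * ghz)

for name, spec in gpus.items():
    print(f"{name}: {efficiency(*spec):.0f} points per (B transistors x GHz)")
```

In this hypothetical pair, the newer GPU scores 80% higher overall yet delivers fewer points per transistor-GHz, which is exactly the "faster by brute force" pattern the article describes.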
Even that may not be the worst of it. According to a report from TSMC, future "large chips" with an area above 500 square millimeters (roughly 2 cm × 2.5 cm or more) may have target design power consumption as high as 2,000 W or more. Ordinary low-end CPUs and GPUs are unlikely to get that big, but the chip area of enthusiast products such as Threadripper has already exceeded 1,000 square millimeters, so TSMC's prediction is hardly unfounded.
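TSMC's numbers imply a startling power density. The 2,000 W over 500 mm² figures come from the report as quoted above; for comparison, we assume a 125 W desktop CPU on a die of roughly 200 mm², which is a ballpark guess rather than a spec for any particular part:

```python
# Power density implied by TSMC's projection versus a typical desktop CPU.
# The 2000 W / 500 mm^2 figures come from the article; the 125 W desktop CPU
# on a ~200 mm^2 die is an assumed ballpark for comparison.

def power_density(power_w: float, area_mm2: float) -> float:
    """Watts dissipated per square millimetre of die."""
    return power_w / area_mm2

future_chip = power_density(2000, 500)  # W/mm^2
desktop_cpu = power_density(125, 200)   # W/mm^2 (assumed die size)

print(f"future large chip:   {future_chip:.2f} W/mm^2")
print(f"typical desktop CPU: {desktop_cpu:.3f} W/mm^2")
print(f"ratio:               {future_chip / desktop_cpu:.1f}x")
```

Several times the power per square millimeter of today's parts is precisely the regime where grease-and-heatsink stacks run out of headroom and coolant next to the die starts to make sense.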
In addition, a change in software thinking has also driven power consumption up
Almost two decades ago, when we encountered the first wave of PC software, the deepest impression it often left was "slow." This was not only because PCs of the time performed poorly, but also because for the software of that era, "minimize the performance drain on the computer, even at the expense of execution speed" was a defining design principle.
That is why, back then, antivirus scans that took several hours and Windows updates that consumed an entire afternoon from download to installation were commonplace. This "low-consumption, low-speed" style of software design did spare many underpowered PCs the embarrassment of freezing just by launching an application, but on the other hand it wasted the capability of high-performance hardware and did nothing to push hardware technology forward.
So, roughly since the Windows Vista era, major software vendors have gradually changed their development thinking: optimize fully for new hardware, "eat" as much of the computer's performance as possible, and thereby achieve faster operating speeds.
Chrome is not kind to old computers, but it flies on new ones
Take antivirus software as an example. The Defender antivirus built into Windows today not only consumes CPU compute but also uses the graphics card's general-purpose compute power to speed up scanning. Or take Chrome, now the absolute mainstream of the browser industry: it is notoriously hungry for both memory and CPU, but no one can deny that its high hardware usage makes its rendering and page-loading speed far faster than its competitors'. In office software, Office has used the GPU for font rendering and interface acceleration for several versions now, which is why many people find that Office, though hardly lightweight, feels smoother and faster than many "lightweight" office suites.
The same goes for games: DirectX 12 was designed from the outset on the premise of making better use of many-core CPUs, and in the latest Windows 11, Microsoft provides a series of game enhancements aimed at high-end PCs, giving users faster game loading, better in-game effects, and better display quality than other machines. Of course, all of this consumes more compute, which means higher power consumption and higher demands on cooling.
With really good hardware, who cares about power consumption? (pictured: a 56-core dual-socket PC with dual RTX 3090s)
In other words, in both hardware and software, the whole PC industry now shows a general trend of "trading power consumption for performance." Environmentalists may consider this a very bad situation, but users who benefit from such substantial performance gains may not mind the increase in power consumption and heat at all.
After all, however hot it runs, today's PC is far faster than the PCs of the past.
Posted by: CoinYuppie. Reprinted with attribution to: https://coinyuppie.com/is-water-cooling-not-enough-why-are-the-chips-getting-hotter-now/