Roughly every two years, the density of transistors that can be fit onto a silicon chip doubles. This is Moore’s Law. Roughly every five years, the cost to build a factory for making such chips doubles, and the number of companies that can do it halves. 25 years ago, there were about 40 such companies and the cost to build a fab was about $2-4 billion. Today, there are either two or three such companies left (depending on your optimism toward Intel) and the cost to build a fab is in excess of $100 billion. Project these trends forward another ten years and you can expect a single factory to cost nearly half a trillion dollars, and the number of companies that can do it should drop to less than one.
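As a quick sanity check on that projection, here is the arithmetic, assuming the quoted figures and a clean doubling every five years (a naive extrapolation, nothing more):

```python
# Naive extrapolation of fab cost, doubling every five years.
# Starting figures are the ones quoted above; the model is pure extrapolation.
def projected_fab_cost(cost_today_billions, years_ahead, doubling_period_years=5):
    return cost_today_billions * 2 ** (years_ahead / doubling_period_years)

print(projected_fab_cost(3, 25))    # ~$96B: roughly today's cost, starting from ~$3B in 2000
print(projected_fab_cost(100, 10))  # ~$400B: "nearly half a trillion dollars" ten years out
```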
Nanometer Numbers
The cutting edge of transistors is the “2 nanometer” node. Intel has abandoned nanometers as a metric and has begun using terms like 20A, 18A, and 14A, measuring in angstroms (1/10 of a nanometer, or the approximate width of a typical atom). However, these nanometer numbers are entirely fake today. The number used to be an objective measure of the width of the gate on a planar transistor, but planar transistors stopped working 15 years ago. Their current substitute, FinFET transistors, have no equivalent feature to measure. The physical feature these numbers once measured no longer exists, but some naming scheme is still needed.
The advertised transistor density of modern 2nm nodes is somewhere around 200-250 million transistors per square mm. There are one trillion square nanometers in a square millimeter, and dividing this out gives a footprint of 4000-5000 square nanometers per transistor, or roughly a square 60-70nm on each side. The gate pitch, or minimum distance between transistors, is generally in the range of 30-40nm, and the larger 60-70nm footprint is a result of geometric constraints on how densely irregular circuits can be packed, leaving unavoidable empty space.
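For concreteness, here is that back-of-the-envelope arithmetic as a short Python sketch (the density figures are the advertised ones above; nothing else is assumed):

```python
# Footprint arithmetic for an advertised "2nm" node.
# Density figures are the advertised 200-250 million transistors per mm^2.
NM2_PER_MM2 = 1e12  # one trillion square nanometers in a square millimeter

for density_millions_per_mm2 in (200, 250):
    transistors_per_mm2 = density_millions_per_mm2 * 1e6
    footprint_nm2 = NM2_PER_MM2 / transistors_per_mm2
    side_nm = footprint_nm2 ** 0.5
    print(f"{density_millions_per_mm2}M/mm^2 -> {footprint_nm2:.0f} nm^2 per transistor, "
          f"~{side_nm:.0f}nm on a side")
```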
The extremely tiny features on a chip are produced via photolithography – light is flashed through a photomask to project an image onto a wafer coated in a light-sensitive photoresist, which hardens or becomes soluble (depending on the type of resist) wherever it is exposed. The tinier the image can be made while retaining focus and detail, the smaller the components that can be made.
The hype over the past few years around EUV (Extreme UltraViolet) lithography is mostly about its extremely fine resolution. Previous generations of lithography equipment relied on wavelengths of ultraviolet light measured in hundreds of nanometers, which limits how small a pattern can be created on a chip. This was cheated somewhat using multipatterning – if your “pixels” are 200nm across, you can shift the image by 100nm and project a new image, gaining some extra resolution from how the two images overlap. You can even do this multiple times, and employ a number of other tricks to cheat out much finer details than should be theoretically possible, at the expense of much more complicated and defect-prone manufacturing. EUV uses light with a wavelength of only 13.5nm, which lands in the overlap between “extreme UV” and “soft X-rays”. Because it is technically a form of X-ray (prone to passing through things), this light is difficult to manipulate optically – focusing it requires atomically precise curved multilayer mirrors that rely on Bragg diffraction to reflect the X-rays backward, as conventional materials barely reflect X-rays at all. X-ray lenses are effectively off the table.
EUV gives us “pixels” of about 13.5nm to build our chip from. ASML is now rolling out its “High-NA EUV” machines, which use a larger numerical aperture in their optics to push this resolution down to about 8nm.
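As a rough sanity check on those numbers, the classic Rayleigh criterion – minimum feature size ≈ k1 × wavelength / numerical aperture – lands in the same ballpark. The k1 value below is an illustrative assumption; real processes tune it aggressively:

```python
# Rayleigh criterion: smallest printable feature ~ k1 * wavelength / numerical aperture.
# k1 = 0.33 is an illustrative assumption; real processes push it lower with many tricks.
def min_feature_nm(wavelength_nm, numerical_aperture, k1=0.33):
    return k1 * wavelength_nm / numerical_aperture

print(min_feature_nm(193, 1.35))    # ArF immersion DUV: ~47nm (hence all the multipatterning)
print(min_feature_nm(13.5, 0.33))   # standard EUV: ~13.5nm
print(min_feature_nm(13.5, 0.55))   # High-NA EUV: ~8nm
```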
These “pixels” are still dozens of atoms across. We could theoretically scale down further – a 6x6x6nm cube contains on the order of ten thousand atoms of silicon. I suspect that if someone were simply willing to bite the bullet and attempt a 1:1 scale between wafers and photomasks, all of these complicated optics could be eliminated and soft X-rays with wavelengths of 1-2nm or less could be used. A typical medical X-ray tube is orders of magnitude more energy-efficient at turning electricity into photons than the method of lasering molten tin droplets currently used to produce EUV light. An X-ray tube is a simple vacuum tube with an electron beam between a cathode and anode, and the specific wavelength produced is a function of the atomic number of whatever material the anode is made from.
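To gauge what kind of anode such a tube would need, Moseley’s law gives a rough estimate of the Kα emission line as a function of atomic number. This is a crude sketch that ignores L-lines and bremsstrahlung, and the real line energies differ slightly:

```python
# Rough Moseley's-law estimate of the K-alpha line for a given anode material.
# E ~= 13.6 eV * (3/4) * (Z - 1)^2, and wavelength = 1239.84 eV*nm / E.
def k_alpha(atomic_number):
    energy_ev = 13.6 * 0.75 * (atomic_number - 1) ** 2
    wavelength_nm = 1239.84 / energy_ev
    return energy_ev, wavelength_nm

for name, z in [("aluminum", 13), ("copper", 29), ("tungsten", 74)]:
    energy_ev, wavelength_nm = k_alpha(z)
    print(f"{name} (Z={z}): ~{energy_ev / 1000:.1f} keV, ~{wavelength_nm:.2f}nm")
```

By this rough estimate, even a light anode like aluminum already lands under 1nm.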
The next problem however is not the wavelength of the photons, but rather the chemistry of the photoresist. Conventional photoresists are based on polymers – long chains of molecules that link or unlink when struck with photons. Linking them together forms a spaghetti-like knot on top of the silicon. However, the links of these chains are not infinitely small – they are made from atoms just like everything else, and they only work when forming long enough chains. Conventional photoresists stop working below 7-10nm, and most unconventional resists stop working below 5. I’m not aware of any clear plan for scaling past this fast-approaching point.
Transistor Geometry
In a transistor, the voltage of the gate lying on top of the channel controls the conductivity of the channel beneath it, either creating an insulating “depletion region” or leaving the silicon naturally conductive. Of course, the effect is localized, and the further from the gate, the weaker it becomes. About 15 years ago, the gates got small enough that the insulating region could no longer reach very deep into the silicon. This shallow depletion region is hardly a barrier to electrons at all – the electrons simply began to travel underneath it.
Planar transistors, which had served as the workhorse of Moore’s Law for nearly 50 years, had to be abandoned in favor of FinFET transistors – rather than the gate lying flat across the channel, the channel is now a thin, vertical fin, and the gate wraps around three of the four sides. This increases the surface area between the gate and the channel, and means that this shallow depletion region needs only to reach toward the center of this thin fin to block off the flow of electrons.
FinFETs brought a dramatic increase in manufacturing complexity, and forced several big players like GlobalFoundries to abandon chasing Moore’s Law. They have been the new workhorse, but will not be nearly as long-lived as planar transistors. Instead, after a mere 15 years they have already reached their limits and are being phased out in favor of GAAFETs – Gate-All-Around Field Effect Transistors. Intel calls them RibbonFETs. Rather than a vertical fin with a gate on three sides, GAAFET wraps the gate around all four sides of the channel, and then stacks several channels on top of each other to recover drive current.
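A rough way to see why each generation improves control over the channel is to compare how much gated channel perimeter each geometry offers. The dimensions below are purely illustrative assumptions, not any foundry’s actual figures:

```python
# Gated channel width ("effective width" under the gate) for each transistor geometry.
# All dimensions in nm and purely illustrative -- not any real process's figures.
planar_width = 60                                    # gate sits on one face only
fin_height, fin_width = 50, 6                        # gate wraps three faces of the fin
sheet_width, sheet_thickness, num_sheets = 30, 5, 3  # gate wraps all four faces, stacked

planar = planar_width
finfet = 2 * fin_height + fin_width
gaafet = num_sheets * 2 * (sheet_width + sheet_thickness)

for name, width in [("planar", planar), ("FinFET", finfet), ("GAAFET", gaafet)]:
    print(f"{name}: ~{width}nm of gated channel perimeter per transistor")
```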
GAAFET is a dramatic increase in complexity on top of FinFETs – the layering is particularly hard, especially as there must be a thin insulating layer between the channel and the gate, and this layer must now wrap around all sides of a stack of channels. This alone multiplies the number of manufacturing steps, and thus multiplies the number of opportunities for defects, which eat into the bottom line of running a chip fab.
Even GAAFET is only expected to last a couple of generations before it too runs out of steam. The end of the road looks like it may be CFET. Transistors naturally come in two flavors – PMOS and NMOS – and the CMOS logic that serves as the foundation of nearly all digital logic uses the two together. CFET is simply GAAFET, but with the NMOS transistors stacked on top of the PMOS transistors, without much further attempt to scale them.
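The reason stacking the two flavors is natural is that every CMOS gate already pairs a PMOS pull-up network with a complementary NMOS pull-down network, so the two types are always wired together anyway. A minimal, purely illustrative sketch of that pairing for a NAND gate:

```python
# CMOS NAND gate: the PMOS pull-up network (parallel devices) and the NMOS pull-down
# network (series devices) are exact complements -- every gate needs both flavors,
# which is why CFET stacks one directly on top of the other.
def cmos_nand(a: bool, b: bool) -> int:
    pull_up = (not a) or (not b)   # parallel PMOS: conducts when either input is low
    pull_down = a and b            # series NMOS: conducts only when both inputs are high
    assert pull_up != pull_down    # complementary networks: exactly one path conducts
    return 1 if pull_up else 0

for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), "->", cmos_nand(a, b))
```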
Other Limitations
AMD launched the first 1 GHz processor in 2000, and Intel was soon boasting that they’d have 10 GHz processors by 2010. It is now 2025, and the CPU in my laptop maxes out at a mere 4 GHz boost clock. Where’s my other 6 GHz? Where are all the further gains Intel was supposed to deliver over the past 15 years?
What had enabled this clock speed growth for decades was Dennard scaling, the observation that as transistors are scaled down the power density of the silicon (watts / sq mm) should stay the same. However, leakage currents and other factors do not scale with size. At large scales these are negligible, but on small scales they are not. Dennard scaling died around 2006, and clock speeds have largely stagnated ever since. The rising power density of the silicon has several other effects – rising power consumption, more extreme cooling systems, and an increasing need to disable large portions of the chip at a time to reduce power consumption.
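Here is what classic Dennard scaling promised, as a short sketch: shrink every dimension and the supply voltage by the same factor, and power density cancels out exactly. The leakage line is an illustrative placeholder for the terms that refused to shrink along with everything else:

```python
# Classic Dennard scaling: shrink dimensions and voltage by a factor s < 1.
# Capacitance ~ s, voltage ~ s, frequency ~ 1/s, density ~ 1/s^2.
s = 0.7  # one classic "full node" shrink

capacitance, voltage, frequency = s, s, 1 / s
dynamic_power_per_transistor = capacitance * voltage**2 * frequency  # ~ s^2
transistor_density = 1 / s**2

power_density = dynamic_power_per_transistor * transistor_density
print(f"dynamic power density after shrink: {power_density:.2f}x (ideally unchanged)")

# Leakage, by contrast, does not follow this recipe -- per transistor it stays roughly
# flat (or worsens), so packing transistors 1/s^2 more densely drives power density up.
leakage_per_transistor = 1.0  # illustrative: does not shrink with s
print(f"leakage power density after shrink: {leakage_per_transistor * transistor_density:.2f}x")
```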
Currently, one of the big technologies that Intel is betting hard on is backside power delivery. There are two types of wiring on a chip – power and logic. The logic wiring tends to consist of many small, thin wires that switch quickly and have low capacitance. The power wiring consists of a smaller number of large, thick wires held at a consistent voltage, with high capacitance meant to absorb the combined voltage swings of all the circuits attached. For a variety of engineering reasons it is not ideal to have the two near each other, so Intel has begun fabricating chips with the bulk of the power wiring on the back side of the wafer, with millions of tiny vias passing through the silicon.
This is a very difficult technology – one that TSMC has been far less daring about bringing to market. If Intel can fix its presently high defect rates, survive its internal corporate warfare, and stave off bankruptcy and buyouts, this technology may serve as a valuable advantage. On the other hand, this is an optimization that could have been introduced decades ago. Why now? Because it is a high-risk, relatively low-reward, one-time optimization. You cannot repeat it multiple times for greater gains. The gains, while meaningful, are certainly not enormous. This is not the kind of optimization you make when the future looks optimistic – this is the kind of thing you do when you’re running out of ideas and have to start scraping the bottom of the barrel.
The trend toward chiplets and die stacking over the past decade, while not completely new, is also perhaps another modest optimization that’s becoming increasingly attractive as alternatives dry up.
Photomask prices have begun to skyrocket. Photomasks serve as a fixed cost for the production of chips – you need to pay for a set of masks whether you intend to manufacture a thousand chips or a billion. A mask set used to cost a few hundred thousand dollars, but hit roughly $1M at 28nm, roughly $10M at 7nm, and is somewhere around $40M at 3nm.
If you’re manufacturing 10,000 chips at 3nm, the masks alone will add $4,000 to the price of each chip, which may otherwise be <$100. If you’re at Nvidia’s scale and are manufacturing hundreds of millions of chips, then masks may only add a few dimes to the price of each chip. This essentially kills small players at advanced nodes, and perhaps risks undermining foundry business models like those of TSMC, which assume a large number of small and medium-sized customers. With rising photomask prices, all but TSMC’s biggest customers may simply be unable to afford to make their chips at these advanced nodes. As fabs continue to get more expensive to build, at some point it may no longer be possible to find any way to fund building them – or worse, investments are raised and a factory is built on the false assumption that customers will materialize as they always do, but this time with unrealistically deep pockets.
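The amortization math is simple enough to write down, using the $40M mask-set figure quoted above and a few illustrative volumes:

```python
# Amortizing a fixed mask-set cost over production volume.
# $40M is the ~3nm mask-set figure quoted above; the volumes are illustrative.
mask_set_cost = 40_000_000

for volume in (10_000, 1_000_000, 200_000_000):
    per_chip = mask_set_cost / volume
    print(f"{volume:>11,} chips -> ${per_chip:,.2f} of mask cost per chip")
```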
If you’re paying $40M or more for masks alone (not counting all the silicon, manufacturing, packaging, testing, etc.), perhaps a $150M EUV machine doesn’t look so pricey anymore. Or perhaps the machines start looking a bit overengineered and overpriced and you start reading up on X-ray physics and photolithography.
Future
Gordon Moore always emphasized that his “law” was fundamentally rooted in economics, not physics. How small or fast or efficient a transistor can be made in a lab is of absolutely no relevance if they can’t be mass-manufactured at a price anyone is willing to pay. The death of Moore’s Law has been predicted for many decades, and progress has continued to march on far longer than anyone expected. It outlived even its namesake.
Nevertheless, nothing here looks good. This is not some single technical obstacle, but many hard technical and financial walls fast approaching all at once. If there is any way forward, it will require a serious departure from the past few decades of work. If it is not competitive on day one with manufacturing processes that have been optimized and perfected over more than six decades, it stands no chance in the mass market.
Invention often favors the easiest solutions to find over those that are the most efficient. Estimates for the size of the codebase of ASML’s lithography machines border on the billions of lines of code. This seems to be less a result of necessity and more the product of a culture of copy-pasting code out of paranoia and extreme pressure to ship on a tight schedule. There are tricky things that such a machine does, but a billion lines of code is implausible for a company of this scale to even write without heavy automation or copy-and-paste. Some of this is essential complexity; the rest is likely one of the world’s largest mountains of technical debt.
Once it is no longer feasible to simply build a bigger fab, the only way forward is to go back and simplify things. If costs can be reduced substantially, then what would have been a half-trillion-dollar next-gen fab could be a bit more affordable, or a current-gen fab could be built on a much smaller budget. It may take a lot longer than two years to simplify things enough though.
This isn’t even particularly unreasonable – there are already multiple examples of people doing lithography in their garages that is on par with the state of the art of the early 1990s. Once a viable path has been found, speedrunning it is often far easier and cheaper than the original stumbling in the dark. Technology also improves over time in many dimensions, and what may have been an absurdly difficult technical problem 10 or 20 or 30 years ago may be easily solved with a more recent innovation. If a literal teenager can build a 1990-level chip fab in his parents’ garage today, then what can be achieved with some startup capital?
By far the largest factor driving high costs in semiconductors is the demand for extremely low defect rates. Chips are often assumed to be near-perfect. Highly parallel processors and memory chips can often tolerate higher defect rates through redundancy. Highly parallel chips designed from the ground up for extremely high defect tolerance could perhaps be manufactured with a vastly lower-quality, and by extension vastly cheaper, chip fab. Memory chips have even been using complex 3D structures for decades that logic chips shy away from – high defect tolerance may not only lower costs but may also open the door to extremely advantageous manufacturing techniques that a defect-sensitive fab like TSMC may be too afraid to touch.
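One way to see how much defect tolerance changes the economics is a simple Poisson yield model: the odds of a die having zero defects fall off exponentially with area and defect density, while a design that can tolerate a handful of defects recovers most of that loss. The die area and defect densities below are illustrative assumptions, not real fab data:

```python
import math

# Poisson yield model: P(k defects on a die) = exp(-A*D) * (A*D)^k / k!
# A = die area in cm^2, D = defect density in defects/cm^2. All numbers are illustrative.
def usable_yield(area_cm2, defects_per_cm2, max_tolerated_defects=0):
    expected = area_cm2 * defects_per_cm2
    return sum(math.exp(-expected) * expected**k / math.factorial(k)
               for k in range(max_tolerated_defects + 1))

die_area = 6.0  # a large ~600 mm^2 die
for defect_density in (0.1, 0.5, 2.0):  # a tight fab vs. progressively sloppier ones
    perfect = usable_yield(die_area, defect_density)
    tolerant = usable_yield(die_area, defect_density, max_tolerated_defects=10)
    print(f"D={defect_density}: zero-defect yield {perfect:.1%}, "
          f"tolerate-10-defects yield {tolerant:.1%}")
```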
This is an industry where dozens of equipment suppliers must heavily coordinate to build tools for three major companies. This is an industry that likely should have embraced vertical integration at least a decade ago. Technological progress very rarely follows a smooth curve, and often follows more of an uneven stair-stepped trajectory. Moore’s Law is perhaps a self-fulfilling prophecy, and perhaps served as a powerful solution to the difficult problem of coordinating such a large and complex supply chain to innovate in sync. However, it’s very unlikely that this was the most efficient possible path, free of any wasteful diversions. I wouldn’t be surprised if huge efforts for short-lived technologies like strained silicon could have been completely skipped.
Used Car Markets
Another possibility that has long been on my personal list of “future articles to write” is that the future of computing may look more like used cars. If there is little meaningful difference between a chip manufactured in 2035 and a chip from 2065, then buying a still-functional 30-year-old computer may be a much better deal than it is today. If there is less of a need to buy a new computer every few years, then investing a larger amount upfront may make sense – buying a $10,000 computer rather than a $1,000 computer, and just keeping it for much longer or reselling it later for an upgraded model.
Most cars aren’t purchased new, but are glorified hand-me-downs. People with deep pockets (or perhaps poor financial sense) purchase new cars for tens or hundreds of thousands of dollars, and several years later sell them at an enormous discount and go buy a new one. This makes a valuable asset that would otherwise be prohibitively expensive affordable to a much larger portion of the population.
Lately, I’ve been busy working on startup stuff with The American Compute Corporation. My priorities and ambitions have shifted somewhat and I’m not as stubborn about bootstrapping as I used to be. I’m currently raising some pre-seed funds, aiming to build high-performance, high-efficiency, general-purpose CPUs designed from first principles. I’m also increasingly convinced that, with the right compromises and specialization, a startup-scale fab may be viable and is at least well worth the experiment.
A new chip architecture is also as much a software problem as it is a hardware problem. The Bzo compiler project went on the backburner for a while, but I’ve picked it up again recently and will hopefully have a dev log with new things to share soon. I have big ideas for the software for these chips. There’s an immense opportunity to rethink many different aspects of computing.
We’re entering the post-Moore era, and I’m busy designing chips (and maybe a fab) for this new world. I’d be happy to talk to investors.