Proto-writing traces back over 20,000 years. Logographic writing is at least 5,000 years old, but alphabetic writing took another 2,000 years to become widespread. Even then, many forms of punctuation, and even things as basic as placing spaces between words, are far more recent innovations. If reading text without spaces between words sounds difficult, the advice historically given was that reading the text aloud made it easier, and until only a few centuries ago the act of reading silently was generally seen either as a difficult talent or as an antisocial activity reserved for those who wished to hide the ideas they consumed.
Business investment is always risky, but tech VC is notorious for its extreme tolerance of risk, far beyond that of most other types of investors. Very often this is simply accepted as the price paid for innovation, without any deeper thought. Often the exact same people who insist that extremely high risk is simply what innovation demands will, in a different conversation, preach about how every field outside of computing has stagnated since the 70s, an era when Silicon Valley’s spray-and-pray VC strategies would have been seen as insane.
Founding a business is fundamentally an assertion of contrarian truth about the world: that there is some opportunity everyone else has missed which you can take advantage of. There is always a serious chance that you’re wrong about this, or at the very least wrong about how to reach the opportunity. There is always risk. However, this axiom only asserts that risk must exist; it tells us absolutely nothing about how much risk. Is there a 10% chance of failure? 50%? 90%? 99%? Fundamentally, the risk involved is the inverse of how much certainty there is.
If you decide to open a new Starbucks, there is a tremendous amount of data about the last 10,000 Starbucks locations, so picking the location that maximizes profit is more science than art. People have been digging up rocks from the Earth for a very long time, and the risks of opening a mine can be greatly mitigated by some surveying. If you decide to start a software business, there’s far less risk if you’re building some dime-a-dozen SaaS product for which a standard playbook already exists. The risks are far greater if you’re venturing off into something more experimental.
There is risk in opening a new Starbucks location and there is risk in starting a wild new startup. The difference is not one of kind, only of magnitude.
Certainly, some of the risk of tech investment is probably downstream from the fact that software businesses often produce very high returns; it’s far more common to 100x an investment in a software company than it is in a company with physical products or other conventional constraints. Investors may be engaging in a form of risk homeostasis.
Tech bubbles pretty universally fail to live up to their hype and expectations. Each new technology will absolutely reshape the world in unimaginable ways, if you buy into the hype.
I recall watching the Internet of Things bubble in 2017, when the near future was apparently full of smart toasters and smart juice presses that would collect so much data that we’d invent a dozen new SI prefixes to describe how many bytes we’d need to store it all. Buzzwords like “Brontobytes” and “Geopbytes” were thrown around, describing the scale of datacenters supposedly a couple of years away. It turns out that very few of these smart devices provided any meaningful value, and the data they collected was mostly worthless while creating a myriad of new security concerns. This somehow dovetailed into the 2017/2018 crypto bubble, and soon everyone was raving about how your toaster would be uploading the data it gathered from you onto a public blockchain.
There is a certain mythology to how technology is understood, and it shapes how people imagine the future and where they invest their time and money in building it. This mythology may be widely believed but can easily turn out to be wrong. I remember when the 2017/2018 crypto bubble was flooded with millions of people who asserted with absolute certainty that Bitcoin was a slow and primitive prototype, and that by some immutable law of technology it would soon be slain by a newer, more innovative cryptocurrency. This never happened, and the contrarian Bitcoin-maximalist narrative, that Bitcoin would become a store-of-value reserve currency trusted by nation states, one where transaction speed doesn’t matter much, seems far more plausible today.
On the other hand, I also recall a great deal of fuss about cryptocurrencies like EOS and Solana, which abandoned the core ideology of decentralization that motivated other crypto projects in favor of maximizing performance. Scaling blockchains while preserving all the properties people wanted never succeeded. The wild optimism I saw in 2017 about using arcane mathematics and protocols to reshape the world seems long dead, and the Nasdanq meme-gambling thesis seems to have won out against all odds. Solana is now extremely popular as a host for meme coins, presumably because decentralization is of zero importance if you’re simply running a casino.
Facebook has repeatedly chosen to make gigantic bets on these tech bubbles. After its ambitions to create a cryptocurrency ran into political resistance, it renamed itself Meta and dumped $40 billion into building “the Metaverse”, with game development legend John Carmack, who had come aboard through the Oculus acquisition, among those building it. Carmack eventually quit, and the company faced enormous difficulties getting its own employees to even enjoy using the technology they were building. Apple also chased this trend, building its own AR/VR headset. My perspective was that while it might eventually succeed, there is currently no real killer app for AR/VR, and that escapism into gaming is a dystopian vision that normies won’t adopt as readily as tech people expect. I debated many friends at the time, who insisted that Apple was a magical company whose products could never fail, that the billions being spent were a guarantee of success, and that we’d all be wearing their VR headsets in a matter of months. I recall a friend who would respond to every criticism with “Siri, show me that girl naked”, and insist that all progress on the internet was driven by porn. These friends have all long since forgotten that they ever made such claims and have gone on to start AI companies.
As for the present AI bubble, I consider it a mixed bag, and a long discussion for another time.
Venturing into the Fog
Computing is a very strange, new, and unprecedented field. We are venturing into the fog, and the tools we’ve built for understanding the world over thousands of years are not very enlightening here. The family of visions of the future of computing that you will find in San Francisco and its derivative tech hubs, the visions that guide us in building the future and tell us where the fruit worth picking can be found, is perhaps only 5% correct. This implies enormous potential in building alternative tech hubs with wildly different visions of the future of computing, provided some contrarian truths can be found and deeply mined. Perhaps they will prove equally blind and misguided, but getting even a different 5% correct could be profoundly valuable.
Given a mere 32 bytes of data, the number of possible configurations (2^256, roughly 10^77) is comparable to the number of atoms in the observable universe. Moore’s Law could have ended decades ago and we’d still be able to mine the Software Library of Babel for new algorithms and ideas until the end of time. Even then, we still wouldn’t have made the slightest dent; we would simply have exhausted all of the resources provided by the entire universe first, which is unfathomably tiny in comparison.
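A quick back-of-the-envelope check of that scale (the atom count is a rough, commonly cited cosmological estimate, used purely for comparison):

```python
# Rough scale comparison: the state space of 32 bytes (256 bits)
# versus a commonly cited estimate of ~10^80 atoms in the observable universe.
configurations = 2 ** 256   # every distinct value 32 bytes can hold
atoms_estimate = 10 ** 80   # rough cosmological estimate

print(f"2^256 configurations ≈ {configurations:.2e}")  # ~1.16e+77
print(f"atoms (estimate)     ≈ {atoms_estimate:.2e}")  # 1.00e+80
```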
We are in the very, very, very early days of computing, and there is a strange future that stretches out far ahead of us, vastly longer and vastly stranger than any of us can possibly imagine.
The modern computing industry is full of path dependencies: decisions made decades ago have heavily shaped it, and a different outcome at any of those junctures could easily have led us down a wildly divergent path.
Today’s popular operating systems were designed long before the modern internet, on the assumption that the user was an expert who wasn’t stupid enough to run untrusted software on such expensive business equipment. The failure of operating systems to provide acceptable sandboxing of applications littered the early internet with security landmines. Browsers, on the other hand, provide sandboxing by default, and web apps began growing in popularity as soon as Google made JavaScript run fast with V8. If the operating systems of the 90s and 00s had had better security models, the domination of the browser as a de facto operating system might never have occurred.
If CPUs had supported efficient context switches directly between userspace programs, the performance pressures that pushed the 1990s’ explosion of driver code into the operating system might never have arisen. Early Linux was only on the order of ten thousand lines of code, and it grew rapidly as drivers, USB and otherwise, were bolted into the kernel. We could easily have taken an alternate path where operating systems stayed small, and where hobbyists or startups could compete with Microsoft.
If digital payments had been adopted and trusted more quickly, it’s possible that advertising wouldn’t be such a common monetization model for software. The extreme focus on attention maximization has undoubtedly shaped much of modern computing, and many of the widespread complaints about the internet and its impact are likely downstream of this. Our interactions with computers favor quantity and volume over quality, whereas if people more often had to open their wallets to use software, we might instead have seen a bias toward quality over quantity.
Complexity Theory has a number of very misleading theorems and conjectures, where “real world problems” frequently land in weird corner cases where harsh laws turn out to be less harsh. There certainly exist programs whose behavior is undecidable, but there are also infinitely many programs which are perfectly decidable, and those are the building blocks humans prefer to build from. There are many problems which are very useful to solve and which happen to lie in complexity classes like NP, and are therefore exponentially difficult. But O(1.0001^N) is no less an exponential than O(2^N), and real-world problems frequently lie closer to the former than the latter. We have likely stunted progress substantially by spending decades wrongly teaching millions of CS students that good static analysis is mathematically impossible and that broad classes of useful algorithms are “intractable”.
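To illustrate how different two “exponentials” can be in practice, here’s a small sketch; the 2^64 threshold is just an arbitrary stand-in for “astronomically expensive”:

```python
import math

# For each growth rate, find the N at which base^N first reaches 2^64,
# an arbitrary stand-in for "astronomically expensive".
# Solving base^N = 2^64 gives N = 64 * ln(2) / ln(base).
for base in (2.0, 1.1, 1.0001):
    n = 64 * math.log(2) / math.log(base)
    print(f"O({base}^N) reaches 2^64 at N ≈ {n:,.0f}")

# O(2.0^N)    hits the wall at N ≈ 64
# O(1.1^N)    hits the wall at N ≈ 465
# O(1.0001^N) hits the wall at N ≈ 443,636
```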
If we took such harsh laws at face value and considered the information-theoretic limits on Kolmogorov complexity, by which almost every possible string is incompressible, we would be forced to conclude that the entire field of data compression should be laughably impossible, something that is empirically very untrue. The data humans actually care about is nothing like a typical random string.
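A quick illustration of that gap using Python’s standard zlib; the sample data below is arbitrary and only meant to contrast structured input with random input:

```python
import os
import zlib

# Structured, repetitive data, the kind software actually handles,
# compresses dramatically...
structured = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n" * 1000

# ...while uniformly random bytes (the "typical" strings counted by
# information-theoretic arguments) barely compress at all.
random_bytes = os.urandom(len(structured))

for label, data in (("structured", structured), ("random", random_bytes)):
    compressed = zlib.compress(data, 9)
    print(f"{label:>10}: {len(data)} bytes -> {len(compressed)} bytes")
```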
Where else may the future take us? Where can we look for new threads to pull, new orchards to search for unexpected fruit?
Computing is a rather odd field, though I expect that over the next century we will increasingly realize that an enormous amount of geometric structure can be found from the right perspectives. Lattices, one of the fundamental tools used in the static analysis of code, are deeply connected to topological spaces: the open sets of any topological space themselves form a lattice. A great number of difficult problems in reasoning about the properties of code boil down to reasoning about the shapes of the trajectories that programs take.
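To make that connection a little more concrete (a standard fact, sketched rather than developed, writing O(X) for the open sets of a space X): the open sets of any topological space, ordered by inclusion, form a complete lattice, the same kind of ordered structure abstract interpretation uses to rank program facts.

```latex
\big(\mathcal{O}(X), \subseteq\big): \qquad
U \vee V = U \cup V, \qquad
U \wedge V = U \cap V, \qquad
\bigvee_{i} U_i = \bigcup_{i} U_i
```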
In many ways, computing is vastly simpler than physics. We’ve engineered computers to be understood, but the universe has no obligation to be legible to us. Nevertheless, much of the progress of the past few centuries of physics has come directly from applying geometry to the problem. The ancient Greek thesis that geometry is the foundation of all mathematics will likely prove to have a great deal of truth to it, and I expect any disbelief that computation and logic may have sown will gradually fade with time.
It’s notable that we already describe the complexity of algorithms with polynomial expressions. The convention with Big-O notation is to drop all constants and all but the most significant term, but if we choose to ignore this convention we are left describing our algorithms with mathematical tools that often closely resemble the Taylor series approximations that are widespread in physics. We could even very easily employ tools like calculus for studying these algorithms.
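As a sketch of the analogy, with a made-up cost function rather than a measurement of any real algorithm: a full operation count is an ordinary polynomial, Big-O keeps only its leading behavior, and a truncated Taylor series likewise approximates a function by its leading terms.

```latex
T(n) = 3n^2 + 12n + 40
\;\rightsquigarrow\; T(n) = O(n^2)
\qquad \text{vs.} \qquad
f(x) \approx f(a) + f'(a)\,(x-a) + \tfrac{1}{2} f''(a)\,(x-a)^2
```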
There are also other very interesting ideas I’ve been thinking about lately that seem to imply a connection between decidability and geometry, though I’ll save those for a future article.
Human engineering is constrained by what we understand. The word “ape” refers both to the branch of primates humans descend from and to a clumsy mimic or mime. Humans are fundamentally a species that acts through mimicry, learning to replicate patterns in the world and then extrapolating and composing them to reach goals, even if we march past countless simpler solutions in the process.
Much of the value of machine learning is that it can synthesize algorithms for many kinds of difficult problems without the biases and blind spots of human engineering, or at least with very different biases. The incomprehensibility of the weights inside machine learning models is a glimpse of algorithms that humans may not fully comprehend and master for a very long time. You may stare into these arrays of numbers like ancient people stared into the chaos of ocean waves and storm clouds, seeing powerful, unknowable gods in what their descendants would describe with a few Navier-Stokes equations.
There is still abundant chaos, but we’ve at least made progress, and we can even harness this understanding to build fantastical things like flying machines.
Complexity Theory describes a vast menagerie of classes of computation. The class P contains the vast majority of algorithms humans use, and even then we mostly confine ourselves to a very tiny, quasi-linear subset of it. While NP problems often carry an exponential factor that can limit scaling, the class contains many useful problems which are vastly tamer than people expect. Furthermore, P and NP are only a tiny part of a much larger family of classes called the Polynomial Hierarchy (PH). Most classes here are generalizations of NP, which tend to be exponential-time problems (often tamer than they look) with an additional polynomial slowdown. All of PH is contained within PSPACE, the class of problems which can be solved with polynomial space and at most exponential time.
There are also many problems which technically don’t fall into PH, such as counting problems and interactive proof systems, and these are not always infeasible to solve either. This is generally the result of PH being defined in a very constrained way to limit its scope and make mathematical progress easier; many kinds of algorithms break those constraints without necessarily becoming impractically hard. They just play by slightly different rules.
Optimally solving chess is technically an EXPTIME problem, a class that contains PSPACE and, by extension, all of PH. Exhausting the entire chess search space is pretty infeasible, but approximating it well enough to outperform human grandmasters was done in 1997.
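For reference, the standard containments among the classes mentioned above line up as follows; each inclusion is believed to be strict, but the only separation actually proven along the chain is that P differs from EXPTIME.

```latex
\mathsf{P} \;\subseteq\; \mathsf{NP} \;\subseteq\; \mathsf{PH}
\;\subseteq\; \mathsf{PSPACE} \;\subseteq\; \mathsf{EXPTIME}
```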
Yet the vast majority of the tools we’ve chosen to use remain in the very tiny quasi-linear subset of P. We’ve mapped out a vast wilderness, but we’ve so far chosen not to explore much of it.
We’ve seen Moore’s Law drive over half a century of exponential growth in computing power. Even if it ends soon and computing power stagnates, the range of things we could do with the hardware that we have is far greater than anything we can possibly imagine.