What to expect when you’re expecting a (AI) Bubble

Don’t look to the dot-com bubble to understand the consequences of the current one.

23 Mar 2026 • 42 min read

This is my first post over that is 10,000 words. My table of contents includes common questions I’ve seen asked about AI, so feel free to skim by clicking around. Otherwise, grab a cup of tea, and read the post at your leisure. Also check out my first three posts on AI if you haven’t already: A less technical guide for understanding LLMs, LLM intelligence is a dark pattern, and AI hype is a mirror of market fundamentalism.

Toward the fall of 2025, pundits in American media began asking in unison “Is AI a bubble?” Nothing specific led to this realization; by this point, everyone knew AI hyperscalers had spent eye-popping amounts of money on GPUs^[1]¹ and data center construction. In fact, as I pointed out in my LLM dark pattern post, it was for this reason that the release of the DeepSeek model in January 2025 gave markets a brief scare. The DeepSeek moment had investors questioning why Silicon Valley was spending so much on GPUs if Chinese companies were purportedly doing more with less. Investors and the media had ample time and reason to want to question the economics of generative AI, well before September 2025.

What the media seemingly picked up on was the circular dealing being used to finance GPUs that will be worthless for frontier compute in three years, optimistically speaking. Around the same time, it became apparent that increasing amounts of debt fueled data center financing. Oddly while probing bubble fears, the media simultaneously rationalized a would-be bubble as a big bet that LLMs would (soon^tm) take your job. So what’s going on? Are we in a bubble?

Yes (generative) AI is a bubble

Yes, we absolutely are in a bubble.^[2]² You’ve likely noticed that as of February 2026, the media has become more confident in this assessment, too.

One reason we’re in this mess is because our information ecosystem has caused people to conflate different types of AI systems. It’s true that “AI” is being deployed in a lot of places, but the current bubble centers around transformer and diffusion based large language models (LLMs) or generative AI (GenAI). These are systems that can make video, images, music, text, and code. Conflation of GenAI and LLMs with other types of AI systems makes it seem like the entire field is having a renaissance. But investment is mainly concentrated in data centers and GPUs that are mostly only compatible with GenAI technologies. Machine learning engineers working with other types of AI have expressed concern that LLM and GenAI hype is directing attention, funding, and resources from other viable AI technologies. There is also concern that scaling LLMs cannot address the fundamental limitations of the technology. The key issue is that there is no way to consistently guarantee the accuracy of LLM outputs as they are probabilistic systems.

FatherPhi is a YouTuber who recently began giving current-gen LLMs unhinged but basic tests and watching them fail in spectacular ways. Also see: https://youtu.be/thKjvF1dZQY

Everyone involved in this frenzy is going to spend trillions of dollars by 2030 with the goal being to replace huge swaths of labor with machines. Not just on the margins, and not in a way where humans are simply rehired, for less pay, to fix the outputs of LLMs. That’s why I’m personally insistent on using the term “capital-as-labor” to describe what needs to happen for this not to be a bubble. I’m talking about a market transformation approaching a one-to-one substitution of labor (people) for capital (machines). This replacement wouldn’t have to happen across every industry, but hyperscalers and frontier LLM providers that are spending billions of dollars need their investment to buy them something durable. And that something cannot just be trillion dollar productivity software any tech company can spin up. Unfortunately, they’re not going to find that in LLMs, GPUs, electricity, or other infrastructure that must be replaced over time. If anything these will contribute to growing operating costs as every new model must be trained from scratch and these companies must purchase new GPUs every one to three years to train better models.

While Silicon Valley’s historical winners were at some point blitzscaling loss-leaders,^[3]³ there’s a specific order of operations that enables so-called blitzscaling^[4]⁴ to be successful. A traditional blitzscaler uses investor money to essentially purchase an entire market for themselves. By purchasing a market, I mean that the blitzscaler becomes the only game in town, usually by creating a platform that has inescapable network effects. But nothing that OpenAI, for example, is currently investing in will net them that kind of advantage. New GPUs every one to three years means a frontier provider like OpenAI is a perpetual customer of an asset that immediately depreciates when used. Furthermore, from a competitive standpoint, companies like OpenAI are at a huge disadvantage as rival LLM providers can distill their models and offer LLM services for cheaper (though still at a loss; see MiniMax). This is currently what Google and Anthropic are worried about.^[5]⁵

You could argue that, at the very least, the frontier LLM startups following the 2000s era Silicon Valley playbook are gaining ~~trapped~~ loyal customers, much like Amazon did way back yesteryear. The irony, though, is that as these frontier LLM services become more desperate for revenue, they must enshittify their product by finding ways to aggressively monetize their users. This is supposed to be the last stop on the blitzscaling train, when you have no competitors and user adoption is skyrocketing. Instead, frontier LLM startups must spend increasing amounts of money on GPUs and cloud infrastructure while squeezing their users. I suspect as this happens, the only users likely to stay are whales—a term used in freemium video game markets (and gambling) referring to superusers who are price insensitive—but here too is another irony. Generative AI whales sometimes use thousands of times more compute than the average user, so a pricing structure that relies on capturing a handful of whales might not be effective for this reason. This cost model works in freemium markets because the basic unit economics make sense. Basically, serving whales in Candy Crush does not require orders of magnitude more resources compared to free users. This allows whales to subsidize the vast majority of a user base, but the same is simply not true for generative AI because the compute, electricity, and labor to manage current models are not cheap.

From this perspective, much of the AI ecosystem looks like investors and firms ritualistically cargo culting the image of a successful VC-backed tech firm. Burning billions of dollars is a legitimate strategy assuming your unit economics make sense, you’re investing in durable infrastructure, or you can capture a market. So too is rushing to add advertisements to your product, assuming you have customer lock-in. Everything on paper, decontextualized from the specifics of AI economics, makes sense, and I think this is why the bubble narrative was resisted by the media for such a long time. People saw that Amazon and Uber lost a lot of money in their earlier years and assumed that LLM companies could do the same. But the circumstances are different here, partly because investors are spending many multiples more^[6]⁶ and will want to see returns at the same scale. But even if investors had the patience of saints, it’s not clear that LLM firms will enjoy the level of lock-in of earlier tech platforms. Optimists of the technology will happily acknowledge that LLMs look like a commodity, but commodities don’t work well for VC-backed firms with multi-billion dollar valuations (also known as unicorns). Unicorns want to sell something exclusive or have a market structure that makes substitution hard so that they can rent seek their way to stable incomes. I’m stuck on Facebook because my friends are there, but I’m not stuck on ChatGPT or Gemini.

However, none of this should be news to anyone paying attention. Praetorian Capital founder Harris “Kuppy” Kupperman said something similar last summer, and tech analyst Ed Zitron has been reporting on the rising meteoric costs facing companies like OpenAI and Anthropic for most of this hype cycle.

What I’d expect if generative AI weren’t a bubble

Whenever someone brings up the possibility of a bubble, suggests that the capabilities of today’s LLMs might be overstated, or points out that the technology is costly, the discussion often shifts to the merits of LLMs. I don’t think that’s productive, but more importantly the existence of an LLM bubble is not predicated on the actual merits of the technology. All the following can be true:

Model inference costs are coming down, in some use cases, as older models can run on consumer grade hardware and newer models use techniques like mixture-of-experts (MoE) or model routing for better cost control.
Conversely, however, excitement and demand for LLMs is driven by current state of the art (SotA) frontier models that are actually more expensive to run because they’re expensive to train and rely on techniques like reasoning which use more compute at runtime.
In order to remain competitive, frontier labs must keep training and scaling newer models. This is why costs for firms like OpenAI and Anthropic are still growing even if older models are cheaper to run. Cost efficiencies mainly benefit consumers in the form of open source models or older frontier models. But, so long as there is an AI race, providers will engage in two to three year GPU purchase cycles, large training runs, and talent wars all of which are part of AI’s true, unsubsidized cost.
Even maintaining existing and previous SotA models requires armies of engineers to make LLMs “just work” because engineers A/B test system prompts, build ecosystems of tools for specific tasks, and handle cybersecurity. The last one is extremely important, as LLMs by design intake and process any arbitrary input, making securing this tech impossible without guardrails and 24/7 observability.^[7]⁷
Exhaustion of real data has likely caused LLM labs to move towards synthetic data to train future models which have prompted fears of model collapse. This is when models become incapable of producing good output because they’ve been trained on less rich data. Synthetic data, or output produced by LLMs, cannot replace the rich data produced by people.
- To be clear, as far as I know, model collapse hasn’t been extensively documented in current LLMs, but some users suspect it exists. Given how LLMs work, though, degradation from synthetic data is a mathematical certainty. Even research that suggests there are ways to prevent total collapse, like an oft-cited paper by Gerstgrasser et al., does not say that synthetic data is harmless to future models, but that loss from data quality can be partly mitigated with access to original human-quality data.
- Anecdotally, there are cases where users prefer older models for specific tasks. See the bitter disappointment over LLaMA 4 as an example. It’s worth noting that users’ bad vibes, were backed up by benchmarks showing LLaMA 4 really was worse at certain tasks than LLaMA 3.
LLM productivity gains are mixed at best. There are people who claim they’ve become extremely productive, with some research supporting this in larger enterprises. Simultaneously, though, other research shows either no productivity impact or productivity loss. This is true even in domains like coding where LLMs have extensive training data. Even where there is productivity gain, quality might be impacted. Recent high-profile incidents at Amazon illustrate this well. In the last three months, both Amazon Web Services and Amazon.com have experienced multiple service outages affecting millions of customers as a result of LLM workflows that the company is now revamping.
Business adoption and return on investment looks mixed at best too, with the most successful use cases being automation of workflows rather than wholesale replacement of labor. There are also many instances where companies regret replacing workers with LLMs, and expect to hire them back if they haven’t already.
Part of why business adoption is weak may be that AI cyberinsurance coverage is sparse depending on which insurers firms want to work with.
Finally, rising electricity costs, rising PC component costs, and the spamming of open source maintainers with worthless pull requests and false accusations is cannibalizing resources for future tech sector growth while souring public sentiment. For lack of a better metaphor, LLMs are starting to piss in everyone’s beer at the party.

None of this means that LLMs will die or that tech firms won’t find better machine learning architectures. But right now, the relevant question is if billions of dollars in investment can be sustained when there are practically no returns and continued financing requires debt. If the answer is no, then we’re looking at a bubble that will impact much of the current crop of AI companies and maybe the broader economy. For LLMs to not be a bubble, I think one of two things would have to be true:

If we could rewind back to January 2025, when DeepSeek terrified markets, and convince investors to understand that small, built-for-purpose non-language model transformers and bespoke language model workflows were driving value. Then they’d find firms actually working on those things instead of handing money to OpenAI to hand to Microsoft and Nvidia to hand back to OpenAI and a small subset of very large companies.
Barring the above counterfactual, some form of capital-as-labor has to be true. I’m not sure if banks are buying this story, though, as some are now hedging to cover their bets.

I alluded to this elsewhere, but this soon-to-be bubble is a result of LLMs becoming victims of their own success. The minute Sam Altman saw dollar signs around GPT-3, he turned LLMs into a commodity, which directed money away from research and towards productization and subsidizing demand. Instead of money going towards new and more capable machine learning paradigms, it’s going to yet-to-be-profitable data centers (with no tenants?!) and rapidly depreciating GPUs. Things are so topsy-turvy that Llion Jones, one of the creators of the transformer architecture behind LLMs, thinks the current environment is hurting real AI progress. As I said in my LLM dark pattern post, Jones is far from the only AI luminary who feels this way.

Now, on the off chance that capital-as-labor is imminent, and I’m wrong… First, silly me, but more importantly that would also be bad. Workers are consumers who will have no income to spend if labor is made partly or mostly irrelevant. And even in a world where the capital-as-labor bet was winner-take-all, which investors seem to think it is, the costs of labor might not go down. If you’re a business owner who thinks unions are bad, then you should absolutely hate a world where most labor comes from a cartel of like five companies that rely on circular investment to survive.

So, how bad is the AI bubble going to be?

No one knows the future and anyone giving you a specific sequence of events without hedging is likely selling something.

The trillion-dollar question that probably can’t be answered until the bubble pops is the extent to which the real economy and AI economy are intertwined. A smaller, but equally relevant question is what other current risks are in play within the current economy. Discussing these will give us a sense of what the bursting bubble might do to the economy. I can’t answer either in full, but here is what concerns me:

High level economic indicators, like GDP and asset prices are increasingly being driven by tech and AI in particular. If a bubble were to pop there would, at the very least, be an economic slowdown. It might be a double whammy, too. In my recent plutonomy post I pointed out that the wealthy are driving consumption because higher asset prices are subsidizing their spending. If tech companies pull back on AI and tech stocks fall as a result, both consumption by the wealthy and tech investment into the economy are likely to shrink: two contributors to GDP hit at once.
There is already financial stress in the US economy given the Trump administration’s trade policy, which has created economic uncertainty and rising costs for American produers. More relevant, however, is the fact that private credit markets also show signs of elevated risk via NDFIs (non-depository financial institutions) for reasons mostly unrelated to AI spending. If AI or AI-related debt are also concentrating in NDFIs this could increase credit stress.^[8]⁸

While these trends have been happening for some time, things are still in-flux, making it hard to extrapolate how they interact in the event of a bubble. It’s clear, however, that any slowdown in AI spending will have direct macroeconomic implications.

How severe? I don’t know. The Dallas Federal Reserve report I included in a footnote in the plutonomy post suggests that plutonomy-like asset-driven consumption doesn’t significantly amplify recession severity. However, their model relies on data from past recessions. Today, both business investment and asset growth are far more concentrated than in the past, given that the lion’s share of stock market and real GDP growth are coming from the very industry creating the bubble.

Now, whether we see a financial crisis like 2008 is going to depend on how credit exposure and risk evolve within data center deals and bank lending. Any combination of these issues alone would be worth monitoring, and it’s possible for all of these to converge. But I honestly can’t say much more than that without speculating, but I’m sure someone somewhere has discussed this scenario in greater detail. A recent Chicago Federal Reserve article highlights this concern as a tail risk, but points out this is hard to quantify:

In a tail risk event, stress in one AI-adjacent industry can spill over to multiple interconnected AI-adjacent industries. For example, if software companies are stressed and unable to maintain their infrastructure spending levels, semiconductor manufacturers, energy companies, and data centers may be affected, impacting their ability to repay their debts to commercial banks. Banks most likely have additional exposure to AI-adjacent industries through lending to nonbank financial institutions (NBFIs). For example, a bank may lend to a private credit institution providing funding for a data center or lend to an investment fund that specializes in AI investments, and stress in the underlying companies may lead to stress in the NBFI borrowers.

From current reporting, the primary AI bubble risk seems concentrated to the core economy in a 2001-like event, rather than a systemic credit-driven crisis (See January’s International Monetary Fund Report). However, we seem to be flying blind here and this can change as these separate risks evolve. For example, recent moves from the private credit firm Blue Owl and from one of Blackrock’s private credit funds now have some analysts worried about AI investment concentration risk in NDFIs/NBFIs.^[9]⁹

None of this analysis takes into account Donnie Dumb-Dumb’s stupid-no-good war which is affecting energy prices and technology imports. 😬

If it’s a bubble, is it a “good” bubble?

The media, now entertaining the possibility of an imminent bubble, has taken to speculating if it will be good for society. Maybe even comparable to the likes of the dotcom crash. This frustrated me to no end for many reasons that I’m going to spend the rest of this post addressing. But first, you might be asking why bubbles even happen.

Literature on bubbles tend to focus on psychological exuberance of investors or distinct phases of technological growth as essential parts of why bubbles are a recurring part of capitalism. With a systems lens, though, the answer becomes much simpler to understand. What we call capitalism actually has two components: a market economy and a political/legal regime in the form of capitalism, a system that “labels” things in the market economy so that its owners receive entitlements.

Markets enable capital holders to create and toss out into the world competing configurations of profit-generating and profit-sustaining capital. These configurations, in the form of firms and other entities, compete to filter for ones that create durable advantage, and thus income. From the outside looking in, this search process is highly inefficient. This is partly because it is an open-ended, unbounded search process that allows capitalists with enough capital to modify the legal, political, social, and cultural environments to compete. Over time, competition becomes about control over the environment that capital functions in. Economists refer to some of these imperfections as market failures.

Bubbles are a normal part of this process. They often combine various technological innovations with legal and political transformations. For example, the 19th century US railroad boom was enabled by many factors, including the technical development of the locomotive and the legal evolution of eminent domain, right of way, and other processes for legally partitioning land. I don’t mean to footnote this, but this enabled the theft of land en masse from indigenous peoples. Similarly, LLMs were enabled by both the technological innovation of the transformer architecture and the large-scale “disregard” of copyright that enabled the training of these systems at scale.

While bubbles can be wasteful, the urge to be the first in a very profitable emerging frontier motivates capitalists to finance risks they otherwise wouldn’t. The public can, in some cases, benefit from the excess of capital caused by bubbles, as this makes the future use of this capital cheaper. This is what the media means when they use phrases like “good” or “productive” bubble. The dark fiber planted into the ground during the height of the dotcom bubble is pointed to as an example of this and is increasingly being used as an anecdote about how the AI bubble will go. This excess cabling, which sat in the ground unused for years, eventually made internet bandwidth cheap which subsidized and enabled the rise of sites like YouTube and Netflix. Some nuance here is warranted, though, as bubbles are extremely destructive. They cost people their jobs, their savings, and in some cases their lives.

You can never know if a bubble is productive from within, it’s only in hindsight you can tell. Attempts to invoke this in the midst of a bubble are a narrative act meant to restore financial and social confidence. As for what productive even means, most analysis categorizes bubbles as mispricing events. Producers overproduce something thinking it’s valuable, when it isn’t, but may eventually be. Essentially, a bubble will be called productive if anyone someday uses the thing that was overproduced.

Viewing bubbles this way, though, blinds us to the path-dependent nature of technology. This is because productive bubbles can lock in the trajectory of an emerging technology, even if there are negative consequences or if better trajectories for a technology are possible. Going back to the US Railroad bubble, while it lowered the cost of railroads over time, it also effectively locked in routes well in advance of any rational urban planning. This created redundancies and other quirks that future rail had to build around, and it also indirectly influenced which cities would benefit from rail, as those close to tracks laid during the bubble typically saw more growth. There are entire counterfactual worlds you could compare by asking questions like what types of innovation could you see in the absence of asset oversupply? Or what else could have been done with all the money that went into the bubble?

We tend to think of technological development as a flash-in-the-pan event where an invention single-handedly transforms society. However, I’ve come to believe strongly that governance is the superpower that unlocks technology and determines the form it takes. For example, there would be no modern electricity without utilities regulation, or no modern car without the victory of E30 over other fuels (let alone rules of the road and interstate highways). The particular shape governance takes is not fixed or determined, and emerges from the social, political, and economic structures of a society. AI, or rather, LLMs will likely be the same. That’s why I feel that a culture unable to imagine alternative ways a technology could have manifested loses the ability to shape its own future.

In hearing that “AI” (LLMs) would be “just like the internet” I sought out to understand the history of the internet to evaluate how true that claim could be. I don’t yet feel comfortable giving an answer, but in the process I developed a toy model to help me think about technological diffusion and governance. I call it the Great Technological Railway^[10]¹⁰ (GTR).

Internet counterfactuals as parallel worlds

The most interesting thing about the internet is that it’s not one technology, it’s a technology stack built over decades by different communities solving different problems. Each layer—routing, naming, hypertext, distribution, applications—forced its designers to make tradeoffs about openness, performance, governance, and control. Those choices were never purely technical—funding models, regulatory regimes, military priorities, university norms, and corporate incentives all shaped what was built and what was abandoned. Ultimately the internet we have now is the product of technical engineering, deliberate governance, and circumstance. While dark fiber is essential to the web as we know it today, there are still worlds where the internet exists without the dotcom fiber bubble.

This Intelligencer article is a fun listicle that includes a few alternative internets. I however have attempted to build out a more comprehensive one using my GTR metaphor. This is part conceptual genealogy and part speculative history, designed to track what I think are critical inflection points in the history of the Internet.

Simplified "GTR" technology map of the Internet - made in tennessine.co

I don’t want you to get lost in trying to understand every detail of this graphic, I just want you to notice the chain of dependencies required to reach Web 2.0 (aka, the Internet as we know it today), and how many branching paths away from Web 2.0 exist. Not only is there internet without dark fiber, there were (and still kind of are) different kinds of internets. We’ll do a full analysis of the pathways in this graphic in a separate post, but I want to now walk you through the features of the GTR metaphor to give you insight into my thoughts about technological diffusion and governance.

Riding the GTR

The GTR is a metaphor that imagines the history of a specific technology as a railway. The idea is that the destination or direction of the tech can be influenced by decisions made by different stakeholders or critical actors. Essentially, technology evolves through tracks laid by earlier builders, switches thrown by decision-makers, and upgrades that change what the line can carry. It’s a way to visualize how social coordination, and network effects, collectively shape technological trajectories.

Routes on the GTR contain the following:

Stations are major technological developments that fundamentally change capability or access. Protocols like TCP/IP, DNS, or HTTP are stations: once they exist, entire classes of applications become possible.
Junctions are places where multiple solutions were possible. At these points engineers, governments, and firms chose between alternatives—TCP/IP versus its many alternatives, HTTP versus Gopher, centralized platforms versus federated systems. Each junction leads to tracks that may become dominant routes or quiet sidings.
Ghost stations are alternate worlds on routes that were rarely traveled or taken. They are either abandoned infrastructure, or purely speculative parallel worlds that can be imagined if a single vision of the internet won out.

Train cars are, of course, implicitly part of this metaphor. They predominately represent what diffuses across society when a railway expands. On what I call the Internet’s “golden route,” (basically the world we got) diffusion enabled scale and network effects. ARPANET connected a handful of universities. NSFNET expanded access across academia. HTTPS made publishing globally accessible. Essentially, at every successive station, more and more people got on. Part of why Silicon Valley Web 2.0 companies are so valuable is that they inherited and leveraged the network effects present at lower protocol layers, built in earlier decades of the Internet’s life.

The power of the GTR is that the metaphor breaks down the complexity of a large distributed technology like the internet and allows us to ask crucial questions, like: Who owns a “station” or piece of technology? What “cargo” did the train drop off when it left a station (what diffused when a technology was adopted)? What financed a new path or station? What types of governance and power relations could have existed had we ended up on a different set of tracks?

What will remain after the AI bubble?

To understand what could remain after the generative AI/LLM bubble, let’s look at a simplified GTR I created just for transformers, the architecture driving the current AI bubble.

Simplified "GTR" technology map of the transformers AI architecture - made in tennessine.co

I want to note several things:

This is a highly simplified model. But the goal was mainly to focus on what was needed to describe the path to LLMs at the highest level. But just know I could have easily added dozens of branching stations for each of these core stations. I have a slightly more zoomed out diagram, but here too I deliberately focused on the bare minimum of what I think was essential to get to transformers.

Simplified "GTR" technology map of AI as a field - made in tennessine.co

Unlike with the internet GTR, older AI stations are still in play. The field likes to revisit ideas from time to time as newer technologies become available. Neural nets are a good illustration of this; they’re an older idea that was once considered somewhat niche until modern GPUs and the Internet made it feasible to train larger networks. That’s why web 2.0 and GPU appear as stations in the zoomed out diagram.

This means that the future of AI may very well come from something that already exists or has yet to be invented. With that in mind, let’s zoom in and look at future routes that might emerge on this rail system given current developments with transformers.

Post-bubble path 1: GenAI as a service (GAIaaS) and hyperscale compute

We’re currently building tracks for what I’m referring to as Generative AI as a Service (GAIaaS). This involves frontier model labs spending billions of dollars to train transformer or diffusion generative models. Some of these models will be able to run on local hardware and some will only be hosted in the cloud.

To understand why we’re on this path, it helps to appreciate the history of AI as a field. For decades AI research was unusually academic and open. Much of the field’s technical repertoire was published and accessible, allowing researchers and companies alike to revisit older ideas and test them as compute and data scaled. Many techniques that power modern systems originated as academic work that initially seemed impractical at the time.

Google extended this tradition by pairing open publication norms with industrial scale resources. The company would work with researchers and provide them the flexibility to conduct exploratory, academic adjacent work. It was through these efforts that the transformer architecture was created. This makes transformers an open source or common knowledge invention. If there’s one takeaway I want you to have from reading this post it’s this: LLMs and more broadly cloud-based subscription AI services are actually attempts to enclose or privatize the transformer architecture.

LLMs are not the only possible form this technology could take, and in some ways are less preferable to scoped use cases of the AI transformer architecture. Presenting LLMs as an “everything machine” basically lets the tech industry supplant and subsume transformers. Essentially, Silicon Valley is spending trillions to reclaim what they gave away, by creating a distorted version of it as a proprietary replacement.Tech companies effectively want to keep the train parked at the GAIaaS station to capture its value for themselves.^[11]¹¹

This is the epic crash out I imagine Google CEO Sundar Pichai had when he realized that OpenAI monetized transformers, which his company created.

There is, of course, still the inconvenient fact that GAIaaS is unprofitable. But many of today’s biggest hyperscalers are Web 2.0 victors like Microsoft, Amazon, and Google, which can use their gargantuan revenues to subsidize LLM training runs and offer LLM services at a loss. In contrast, newer firms like OpenAI, Anthropic, and MiniMax have to burn investor money to survive, and thus are more likely to go out of business. So, while the bubble might kill off the broader ecosystem of AI startups, I don’t expect generative AI services or more narrowly LLMs to go away. Big Tech will likely prevent that from happening.

But even if GAIaaS suddenly disappeared, the efforts to build out this station will leave behind something that diffuses.^[12]¹² The next major development that stems from investments made in this bubble may come in the form of future hardware motivated by the current boom. Google, for example, has begun investing heavily in building TPUs (Tensor Processing Units) which could play a role in future computational needs for other AI architectures. My suspicion is that GAIaaS by itself won’t be super profitable, even for firms still capable of running such services once the bubble pops. It will likely be a bridge to something else, whatever that might be.

Post-bubble path 2: Scoped LLMs

LLMs are a technology begging to be better scoped. Something that might diffuse from the train pulling into GAIaaS station are specific types of know-how alongside local LLMs whose weights are accessible. Know-how, along with infrastructure are the most common things to diffuse when bubbles pop. For the current LLM cycle this looks like MLOps specialists, data scientists, and software infrastructure like vector databases and local LLMs helping build highly scoped business use cases. Individuals will likely also seek out scoped, personal use cases as Cory Doctorow discusses in his Reverse Centaur thesis.

Some examples of use cases include business processes that can be automated, like sorting documents, queuing support ticket requests, things of that nature. Even for these tasks, though, you cannot simply plop in a LLM and expect results. You have to validate outcomes with MLOps and build bespoke orchestration workflows with the tools I’ve covered in my previous posts. I think from this path we will see things like small, fine-tuned LLMs on edge devices to do things like PII redaction for outbound emails, and orchestrated LLMs handle back office tasks.

Of course, we have a computer parts shortage now, so buildout of tracks to this station might be delayed. Some people are speculating Sam Altman deliberately caused this shortage in order to make on-device LLM use more expensive (OpenAI reserved 40% of the world’s RAM directly jacking up the cost of PC parts). We needn’t be conspiratorial, though, as this was probably just another move OpenAI made to signal relevance.

Post-bubble path 3: Non-LLM paths

People often compare LLMs to the internet; however, historically AI hype cycles, like the one we’re currently in, tend to lead to “AI winters.” These are named after the apocalyptic scenario of a nuclear winter where nothing is left behind. A third path might include a winter and/or development of non-LLM based AIs.

One of the first major winters occurred in the 1970s following the publication of the Lighthill report in 1973. The report was an investigation into the status of AI research within UK universities, and it highlighted that while research in restricted and highly bounded domains was successful, this would never generalize to broader environments. As a result, AI funding across the country was reduced for years.

A series of debates hosted by BBC about Lighthill's findings on AI research

However, the most well-known AI winter happened in the late 1980s, and it has some surprising parallels to today. American funding for AI had continued from the 70s and into the 80s, with symbolic AI showing promise, above other methods, through machines called expert systems. By the early 1980s, an entire AI industry around expert systems had already developed. The bubble was also fueled by Japan’s decade-long investment into AI that it dubbed the Fifth Generation computer project. Having a national rival that already dominated electronics and automobiles effectively supercharged interest in American AI. To be fair, there was also organic interest in expert systems, which were valuable in specialized settings. Companies like Digital Equipment Corporation claimed that the technology saved them over 40 million dollars in just six years. As a result of all these factors, in 1985, companies were spending more than 1 billion dollars annually on AI, much of it through in-house expert-system projects.

Despite these successes, though, expert systems had critical limitations and were only useful for a narrow set of problems. Systems like these use fixed facts and formal rules to produce outputs. Building them required extensive “knowledge engineering” (sound familiar?) which required interviewing experts and encoding their decision processes manually. This resulted in a brittleness that failed under complexity. If a rule were misencoded or an inaccurate assumption was made, these systems would produce nonsensical outputs. Simultaneously, as the expert systems buildout was occurring, the broader computer industry was evolving quickly. Personal computers, unix workstations, and the expansion of mainstream programming languages made task specific applications and software more common. While these weren’t expert systems, they led to the growth of applications that could manage business logic. Even LISP, which is a specific programming language for expert systems, was eventually able to run on workstations. That’s why by the early 90s, support for expert systems all but faded. Funding had dried up, interest had declined, and the second AI winter had begun.

As disastrous as winters are, they often encourage the growth of alternative tracks on the GTR. The move away from symbolic AI and expert systems encouraged the field to revisit other areas of AI. The move to statistical methods and neural nets specifically set the stage for today’s AI boom (and now bubble).

I’m recounting this history to point out that in 2026 we have many “non-LLM” and “non-GenAI” paths that could be built if they haven’t already been. Transformers utility outside of large language models has simply been overshadowed. For example, biomedical transformers trained on nucleotide sequences have been helpful in scientific research.^[13]¹³ This is a scoped domain, where outputs can more easily be verified, and the technology doesn’t require insane amounts of power or data centers to be useful. In fact, it’s possible that technologies on Path 3 may not even benefit from the buildout the bubble is encouraging. Technically, path 3 doesn’t necessitate a winter, but it would require LLM hype to stop. However, such a sharp move away from LLMs would still probably generate pessimism that would lessen investment into AI.

Which path is most likely (and what have we lost)?

For better or worse, I don’t see LLMs going away despite the parallels between this technology and expert systems. Both types of AI tend to struggle with complexity, but for different reasons. While LLMs’ command of natural language makes them more capable than expert systems, this is also a double-edge sword and the reason LLMs often speak nonsense.

LLMs work by modeling high level regularities in language use, inferred semantic relationships, syntax, and grammar without explicitly understanding any of these things (let alone know how to spell words). This means that the power of LLMs comes from the power of language itself. As I said in my guide to LLMs, the ways humans use language holds a mirror to our metaphorical collective social cognition. Fields like distributional semantics have understood this since the 1950s.

Professor Michael John Wooldridge, a veteran AI researcher talks about how LLMs fit into the history of AI

Jorge Luis Borges’s short story the Library of Babel provides an interesting way to think about this. In the story, Borges imagines a library consisting of a seemingly infinite number of galleries. Each one contains an assortment of books whose pages use some combination of 25 characters. In this library, you will find every possible truth and every plausible falsehood buried in a sea of absolute nonsense. LLMs aren’t the Library of Babel, but as others have noted, their relationship to language and truth is Babel-like. The statistical patterns and regularities that LLMs can identify give them an impressive but brittle “understanding” of the world. In that sense, if you imagine language itself as the library, LLMs are like a tour guide in Babel—though one with severe amnesia and dementia. When you ask an LLM to do something, you can’t be certain whether they’re reading from a book with real ground truth, one containing total nonsense, or something in between.

So it’s no surprise that when LLMs are deployed in novel environments, especially without tooling, they tend to perform poorly. But people are willing to put up with LLMs’ failures because there tends to be ambiguity when they make mistakes. Was it your prompt that caused the error? Maybe something else? In fact, as others have said, the variability in output quality is itself something that makes LLMs psychologically addictive, just like random payouts at a gambling table.

Just because LLMs aren’t going away, however, doesn’t mean that non-LLM transformers won’t grow. I see a world where parts of paths 1, 2, and 3 are all at play. As for what we’ve lost, I’m concerned that with LLMs, and the quest for AGI overshadowing other types of utility, we’ve lost a more neutral version of the transformer architecture. LLMs have become the face of this technology and very clearly have high economic, social, and ecological costs. I believe these will come down in time, but things will be painful in the meantime.

What should we expect to happen?

I realize this post has been fairly abstract so far, so I want to close with a few concrete things that I expect to happen based on what I’ve been talking about. Feel free to hold me to these predictions, though I’m mainly calling these out as signs you should look for to know the bubble is bursting.

1. Implosion of the AI startup ecosystem

Much of the AI startup ecosystem consists of companies that have little technology of their own. Many on the internet pejoratively refer to these companies as wrappers.^[14]¹⁴They are basically websites that make requests to ChatGPT, Claude, or some other GAIaaS. I expect many of these companies to go away if there’s a bubble because their viability is tied to the (currently) cheap pricing of the most popular LLMs.

Cursor’s price change last summer is a good preview of what’s to come. Cursor is a coding application that makes requests to the most popular LLMs to produce code. The service used to be offered at a fixed monthly cost, but overnight the token or credit caps offered within monthly plans changed suddenly to significant user backlash.

Analysts like Ed Zitron pointed out that this price change was just downstream of compute cost, increases Anthropic passed on to Cursor. As the economic reality of loss leading GAIaaS takes hold, we should expect this to break the viability of these wrapper businesses who are basically glorified middlemen. This is without even mentioning that GAIaaS frontier companies could easily replicate these wrapper startups’ features and that the venture capital money (and good will) sustaining these startups may dry up.

What you should look for are dramatic changes in the pricing of AI services that are not Claude, ChatGPT, DeepSeek, MiniMax, or other GAIaaS providers. This will likely be followed by many of these wrapper services going bust. I am less confident about assuming that GAIaaS startups like Anthropic (Claude) and OpenAI (ChatGPT) will implode, but their economics are also pretty unsustainable. Both of these companies are racing to go public, but if that fails perhaps their IP will be acquired by the tech firms that have invested in them.

2. Abandoned AI data center projects

It might be more accurate to consider the current bubble an AI data center bubble. Companies are increasingly taking on debt to finance the data center build out. Debt isn’t inherently problematic, but increasing delays in data center production, as well as questions of demand, have created uncertainty. Some investors, like Deutsche Bank, are hedging by betting that there will be empty and negative margin data centers while still feeding the bubble.

Keep building those ~~homes~~ data centers; someone will buy them. They're an essential part of the American Dream!

What's more, a lot of these data centers are being financed by many of the same institutions. Recent reporting from Zitron shows that the same seven institutions appear as investors in 26 prominent data center deals, for example.

I expect that sometime in the not-too-distant future, AI data center growth will end as financing these projects gets more difficult. Projects in progress will stall, and this may or may not be followed with the crash we’re all expecting given how coupled data center deals are with debt. What “dark fiber” emerges from this, aside from already existing open source models is anyone’s guess.

You may have noticed I’ve been reserved in describing the data we have on bubble risk. We can’t see things like how much debt is building up in NDFIs or how much AI exposure they have, but I’m pretty pessimistic given the reporting I’ve read. As I write these words my country has started a regional conflict in the area that supplies much of the world’s energy. Even if NDFI and bank exposure to AI ends up being extremely limited, the bubble won’t be popping in a vacuum. It’s going to take place in a macro environment that is, technically speaking, already fucked. There’s a chance we’re closer to 2008 territory than 2001, and the tariffs, wars, and all the other little disasters will merge together to make something truly unholy and painful.

3. Enshittified GAIaaS

While Big Tech is poised to survive this bubble, and even continue to offer LLM and generative AI services at a loss, I don’t think it comes without cost. A bubble, as well as the depreciation of GPUs bought during this current cycle, might cause even giants like Microsoft and Meta to become more conservative about service costs.

As a result, I expect many generative AI services to get worse. This is pretty easy to do, as successful unscoped use cases of LLMs and generative AI simply rely on managing a user’s psychological affect. Companies can route requests to cheaper models, or fiddle with token caps on the back end with users being none the wiser. Users have been conditioned to accept some responsibility for the quality of LLM outputs via terms like prompt engineering, which motivates them to rationalize circumstances where LLMs fail tasks.

Over time, I suspect whales and users who are addicted to the more emotional AI use cases will end up paying more for performance that is at or slightly above what they are used to right now. Today we have top of the line GAIaaS being offered for $200 a month. Why not $500 a month or more?

Sadly, I don’t think rising costs of GAIaaS will slow down what appears to be growing instances of AI psychosis. Chatbots lend themselves heavily toward anthropomorphizing, and so as adoption grows psychosis likely will too. Additionally, as with social media, I think some of the strategies that will increase profitability of LLMs are going to involve training them to be more “human-like.”

Suffice it to say LLMs are going to remain a polarizing technology because they’re easily anthropomorphized and poorly scoped. ELIZA, the first chatbot program actually illustrates this very well. ELIZA was a simple symbolic AI program, made in 1966, that used simple rules to rephrase a user’s response as a question. Despite this, a number of ELIZA’s users developed personal attachment to the system, deeply disturbing its creator Joseph Weizenbaum.

LLMs are much more sophisticated than ELIZA, but unsurprisingly turning a transformer into a chatbot has had the exact same consequences as Weizenbaum’s program, just on a greater scale. Even worse is the fact that LLMs are systems that are managed by engineers making decisions based on how their users behave. Essentially, there is a sociotechnical feedback loop at play here: GAIaaS services collect user data to continuously train their systems so that LLMs can be tweaked according to users’ preferences and intent. The result is a system that can be personally tailored to your preferences, essentially a very intimate echo cathedral where the only voice is yours, reflected in a program trained on your preferences.

What this means is that LLMs are going to become a societal Rorschach that will look different depending on which part of the loop you’re in. My suspicion is that a world that adopts this technology in its current form will only achieve productivity gains where people deliberately scope it. Unlike expert systems, there is no inherent failure mode for LLMs because their success depends on user affect. Technical users likely already know to scope language models to avoid their worst effects. But those who rely on LLMs in contexts where they are still acquiring expertise (like school or training) might actually be hindered by the technology as they acquire bad habits due to LLMs’ inconsistencies.

How this all ultimately plays out is anyone’s guess. I’m sensing parallels to DuPont’s decision to knowingly use lead, a neurotoxin, as an additive in gasoline. This substantially improved fuel economy, but at a high price, despite the fact that safer additives existed. Society, solely focused on the increased performance of cars, completely sleepwalked into environmental and neurodevelopmental consequences that we're still dealing with a century later.

LLMs are not leaded gasoline, but I see a world where talking heads point to productivity gains from technically proficient people and well scoped use cases to justify the tech’s worse abuses. We’ve already got a glimpse of this in the form of accountability sinks to automate life and death decisions, usage in military scenarios, and incomplete lesson plans. Bluntly, for people with no scruples GenAI will make it easier to sell snake oil. I’m not talking about spam and scams, though they're certainly part of the picture too. Because it’s nearly impossible to evaluate an LLM’s efficacy, grifters will be able to sell “AI solutions” that make problems worse. How long this will go on will depend on the economics of GenAI and whether the GAIaaS providers who survive the bubble find some way to soften their losses over time.

Wait, what about white collar jobs?

I don’t really have any predictions about job losses due to LLMs. An important part of the job loss conversation that talking heads and thought leaders don’t want to discuss is that job cutting is not just a technical decision. Many times job cuts are done to signal expectations. This could be to investors, as cutting jobs, even ones sorely needed, make companies look lean. This could be to other employees to create a chilling effect. Hell, a manager could simply be making this decision because there’s a specific group of employees they don’t like.

So, with that in mind, whether LLMs cause layoffs is not just a capability question. Nothing I’ve seen from generative AI makes me believe that the technology is technically capable of replacing 50% of entry level white collar jobs as Dario Amodei’s crocodile tears suggest. I don’t just mean today, either. I mean that LLMs, which rely on statistical associations, are not suited to universally be capital-as-labor. Using high level regularities to find relationships is not the only reason why white collar workers get paid. However, that absolutely does not mean companies will not try their damndest to make this work. In cases where immediately judging a LLMs output is difficult it will be much easier to justify this substitution. Outside these areas full replacement will be much harder.

This is exactly why, for example, LLMs have failed to fully replace customer service workers despite benchmarks indicating that they’ve gotten substantially better at following instructions. For many companies, customer service is seen as a cost center—a nasty but necessary price of doing business. From their point of view, the less they can spend here, the better. That’s why companies have degraded support for years, streamlining it to death and outsourcing it overseas. So if there’s one job begging for disruption, given the way companies currently think, it would be customer service.

It’s surprising, then, that multi-billion dollar language models trained on all the language ever languaged, which includes customer support interactions, struggle to follow scripts to fulfill basic support requests. But you’ve seen the headlines: Major fast food restaurants have to slow AI adoption after models fulfill orders that include 18,000 cups of water. Airlines and other service providers have to walk back promises made by models offering refunds that their policies don’t allow. You may have seen in the clip at the beginning of this very post that even today’s models, the best of the best, suggest you don’t drive your car to the car wash. Frequently, an organization relying on language models for arguably the most denigrated white collar support jobs has to issue an embarrassing mea culpa as their model says something inappropriate to a customer or client. This doesn’t mean that jobs are safe, it just means that job losses are not solely about what a technology can or cannot do, which is something we’ve known forever.

The future of productive LLMs use probably looks a lot like driverless cars

One final point to drive home what LLMs are: Consider Driverless cars. They are another deep learning AI technology that had their own hype cycle in recent memory. So they’re likely going to be instructive about what to expect from LLMs as people begin finding sustainable, productive use cases.

In the mid 2010s, many industry insiders considered self-driving cars a solved problem. Some ~~corporate puffers~~ like Elon Musk even made product claims based on this optimism, repeatedly, like every year for a decade, expecting the public to believe it.

It’s 2026 and something imitating the driverless cars we were promised exists. Companies like Waymo offer rides in cars where a driver is not physically present, but vehicles are monitored by remote managers in the Philippines, and cars can only drive within designated geofenced areas. The work that goes into enabling cars to function is painstaking. While driverless vehicles have sensors that enable them to react in real-time, high resolution maps that take into account construction and other road conditions must be created and maintained. Maps have centimeter level precision and are arguably as important as the neural net driving the vehicle. Essentially, today’s driverless cars are a highly scoped combination of infrastructure, curated data collection, AI models, and humans in the loop. This means just like LLMs, there’s an engineering layer that’s deeply hidden from users.

Moving away from this to something like the fictional self-driving car KITT would require an architectural change. Until then, driverless cars are an engineering problem that involves many smart people developing processes that help simplify the world for driverless cars. This could be in the form of maps containing ever greater levels of detail or by physically modifying the world by adding feedback sensors to traffic lights or other parts of the road. But even with this level of modification, cars likely still will not be able to drive in conditions with limited visibility or infrastructure.

Just as today’s driverless vehicle companies are picking which cities and environments allow for viable deployment, anyone getting value out of LLMs will be forced to think in much the same way. This will inevitably transform the technology from an everything machine to something more mundane that doesn’t fully replace labor.

How should you navigate this future?

So, now you know that a bubble is likely imminent and LLMs will probably stick around. What should you do?

As pessimistic as I am about LLMs’ many externalities I’ve been careful not to dismiss their utility. Not because I’m a shill or because I’m happy LLMs exist, but because I recognize that people, even those who know about these harms, will find good in these systems. As I said elsewhere, the best thing is for a harmful tech to not be invented, but if that isn’t possible we need people who will find and promote good uses.

Anglo-capitalist cultures have this penchant for subjecting their populations to high externality technologies, probably best exemplified by the Kehoe Principle. So named after Robert Kehoe, the toxicologist who defended DuPont’s decision to use lead as an additive in gasoline (among justifying DuPont’s other harmful inventions). Leaded gasoline is an extreme case, but many other technologies that we take for granted today like trains, factories, electricity, and cars had their own disastrous social consequences. It was activists, workers, and citizens who made these technologies and their creators accountable to humanity through regulation and promotion of socially productive use. I liken this to Karl Polanyi’s concept of the double movement where society tries to properly embed or internalize new forms of marketization.

So how do we try to scope LLMs and GenAI? I’m still trying to figure this out, but thinking and acting like a hacker might be a good starting point. Seriously! I don’t mean criminally, I mean in the sense that in order for us to push this technology into a properly scoped role we need people to demonstratively prove the ways it fails, both from a use case standpoint and a security standpoint.

I know I started this post by talking about how this bubble and the dotcom bubble shouldn’t be conflated, but in a lot of ways we’re entering a world that will be just as hackable as the web in its earliest days. This genuinely reminds me of stories like Michael Calce’s and those of many, many, many, other teenagers whose trivial exploits made the inherent risks of using the Internet very clear a long time ago. It feels like we’re really going to have to relearn these lessons because LLMs and LLM agents are laughably insecure. This is because if data takes a form that an agent interprets as an instruction, much like a shackled genie, it must obey. If you’re a millennial who grew up watching media where kids tricked genies into destroying themselves, that's what will be happening for a while until LLMs get scoped. Part of how we get there involves encouraging people to approach the tech more critically and then, where possible, find ways to substitute GAIaaS with local models untethered to problematic use cases and monetization.

I know that for me, and readers of this blog, hacking won’t take the form of black hat escapades. Instead, it will look more like carrying out anti-AI demos to illustrate to friends, family, or coworkers why LLMs are ill-suited to a specific use case. For those of you are using LLMs it may look like testing outcomes multiple times for a given use case.

Of course, individual action alone is not enough to shape the future. We obviously need to treat the harms from novel techs like LLMs as a collective action problem by connecting with computer science researchers, technology scholars, developmental psychologists, and other specialists who can provide informed knowledge to better scope this technology. This is all easier said than done, though, and I don’t have any solutions today. For now, I’m hoping to further develop frameworks like my GTR to help with thinking through counterfactual histories of technology.

If you liked this blog post, support my tea habit by tipping me!

Graphics Processing Units. These are computer chips that output images, usually for video editing or video games. For reasons beyond the focus of this post, those types of chips are also good for certain types of machine learning tasks.
In saying this I join brave voices like Jeff Bezos, Mark Zuckerberg, Sundar Pichai, Sam Altman, and Clem Delangue.
Loss leading is a strategy where a firm takes losses, usually on a specific item as a way of improving market position. Companies like Amazon, Uber, and other Big Tech firms are unusual in that they didn't just lose money on a single product, but were broadly unprofitable at first. This allowed them to offer their services for cheap.
Blitzscaling is one name for a popular Silicon Valley strategy from the mid-2000s onwards where investors would finance losses so a company could quickly capitalize and become a monopoly. I talked about it briefly in another topic.
The cost of training the original DeepSeek model is disputed, but believed to be cheaper than any of OpenAI's models. DeepSeek and other smaller labs are leveraging a technique called distillation where they send dozens of requests to an existing model and capture its responses to help build their own model. Distillation is probably preventable, but will probably lead to arms races that will make it harder to detect.
Uber's Series C raised $258 million on a valuation of $3.5 billion. OpenAI's Series C raised $6.6 billion at a valuation of $157 billion. OpenAI's series C valuation already puts it near Uber's total current valuation, but more importantly shows how much more money AI companies demand from venture capital.
LLMs are one giant security attack surface. The prompt and response format of LLM interaction means that if a model interprets something as a command it will act on it. Acting on any arbitrary input like malicious code, malicious instructions, or poisoned data will cause security issues. AI agents amplify this risk, as agents encounter data when they interact independently with data sources the user doesn’t control.
NDFIs or NBFIs (nonbank financial institutions) are sometimes referred to as shadow banks.
Blue Owl is a major investor in multiple AI Data Center build outs and so has direct exposure to a would-be bubble. The pessimistic case is that the move to restrict investor outflows is a sign that the bubble is near, but there's limited visibility here.
Friends of the blog might recognize this was formerly called the Great Relay Race. I had to change the model for reasons I will explain.
I don't want to simplify the AI build out to just this; I think there are true believers in some concept of an AI that can do everything motivating this frenzy for example.
When I say something diffuses, it doesn't mean that it justifies the total cost of the bubble.
These genomic foundation models are technically built on GPTs which are functionally LLM precursors. So, you could argue they fuzzily sit on the boundary as a tech that might not exist if language models didn't exist. It still highlights a non-llm use case of transformers that's scoped.
The "wrapper" insult refers to the fact that foundation models provide API (Application Programming Interface) access to their LLMs to third-party. If the majority of a service's functionality simply comes from making an API call to a LLM, then they're a middleman that can easily be replaced.