There’s a seismic shift happening in the  datacenter industry. A shift that’s so fast, and of such proportion, it’s not only changing  datacenters, but entire industries. And you don’t have to take my word for it, let me show you. This is Temple, a town about one hour outside of Austin, Texas. But we are not here for the city  and its historical Santa Fe Railroad Depot. We are
here to look at a random field in the industrial area of town. To be precise, right now you are looking at pictures from back in August of 2021. Fast forward one year to August of 2022 and the entire field is gone, replaced by a giant construction site. It's clear that Meta, the company that bought the land, has big plans. And they mean business. Just four months later – now
in December of 2022 – we can see that the  construction of a datacenter is well underway. But then something strange happens. Another  five months later, in April of 2023, our curious field in Temple looks like this:  all the previous construction gone, razed to the ground. Meta just deleted their entire datacenter  halfway through its construction. An estimated
70 million dollars just gone. Wasted, it seems. But why? Was there a problem during construction? Or maybe they lacked the proper permits to continue? The truth is, none of that has anything to do with it. In this video we will not only figure out why Meta made such a radical decision, but we will also take a look at how a modern datacenter works and explore why the entire industry is changing so rapidly. Because if this trend continues,
AI datacenters will start to eat the world. So, what happened in Temple? In order to understand what’s going on in our field in Texas,  and the entire datacenter industry in general, we have to understand how a datacenter works.  Because a modern datacenter is much more than just a shed with some computers inside. But for a long time, that’s basically what
a datacenter was. Early datacenters were often  located inside business properties, where a few rooms, for example in the basement, were converted  to server rooms, housing the IT equipment. With the rise of the internet, datacenters  became more and more important. And they grew in size. Gone were the days of being  an afterthought in a basement somewhere. Datacenters became massive construction  projects and fueled the rise of the Web 2.0.
They basically turned into the “factories” of  internet companies such as Google, Microsoft, Meta or Amazon. Their primary function was to host  and distribute data. Thus, the name: data center. Right now, you are watching a YouTube  video, which means it’s stored inside a Google datacenter somewhere. And that  somewhere is hopefully not too far from you, because you want fast and uninterrupted access. When it comes to content like YouTube, Netflix
or basically anything else in the cloud that you want to access, a datacenter not only has to provide a massive amount of data storage, but also a lot of network bandwidth and good latency. And that's where location comes into play. Datacenters, as we know them, are more location-sensitive than you might think. Great examples of this are communities like Ashburn and Sterling, which are located right next to Dulles International Airport, just outside Washington D.C.
This small area contains a lot of datacenters  because it combines a prime location, close to major population centers, with  massive networking infrastructure that grew each time a new datacenter was built  there. It’s a perfect networking hub. If you are providing cloud services or  are streaming content to your customers, that’s where you want your datacenter to be  located. The bigger the fiber network and the
closer to your customers' physical location, the better for a traditional datacenter. But all of this is changing with AI. Which is difficult to see at first, because from the outside, an AI datacenter looks very much like a traditional one. It's a large, mostly flat, storage-type building with power and cooling infrastructure surrounding it. But that's
quite literally where the similarities  end. If all you take away from this video is that AI datacenters are nothing like  traditional datacenters, it’s already a win. Calling it “AI datacenter” might even be a bit  misleading, but it has become the established name. In my opinion a more fitting name  would be “AI supercomputer”, because that’s what it actually is. Let me explain. From a high-level overview a datacenter has
four main components: compute, connectivity,  cooling and power (C3P). If we use these four areas to compare a traditional to  an AI datacenter, the differences become quickly apparent. Which brings us one step  closer to solving our Temple, Texas mystery. Let’s start with connectivity, because we just  talked about how important location is for a traditional datacenter. But location literally  doesn’t matter for an AI datacenter, at least
not in the way it does for a traditional one. There are two things an AI datacenter can be used for. First, it can train a Large Language Model, which is simply called training, and second, it can use that pre-trained model to generate output, which is called running inference. A training cluster is more or less a closed system. It literally doesn't matter where you place it, at least not in the sense of being close to customers. Because
there are no customers accessing data. It still has networking, though, and there are efforts to connect large training facilities with each other over massive fiber lines, with the goal of conducting large-scale training runs across multiple AI datacenters. But that's not the same type of network access and cross-provider routing a traditional datacenter requires. But what about inference? If you are asking ChatGPT a question, you directly communicate
with the datacenter that runs the inference. While that is true, inference also doesn't require the same networking as a traditional datacenter, because it's not latency sensitive. The compute part – basically calculating the answer – can take multiple seconds. Even if you add 500 milliseconds of latency on top, which is a lot, it doesn't change the experience. A chatbot is not a latency-sensitive application as long as it is limited by compute.
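To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. Only the 500 milliseconds of added latency comes from the example above; the decoding speed and answer length are illustrative assumptions, not measurements of any real deployment.

# Rough illustration: how much does extra network latency matter for a chatbot?
tokens_per_second = 50       # assumed decoding speed of the model
answer_tokens = 500          # assumed length of a typical answer
network_latency_s = 0.5      # the 500 ms of added latency mentioned above

compute_time_s = answer_tokens / tokens_per_second   # 10 seconds of pure compute
total_time_s = compute_time_s + network_latency_s    # 10.5 seconds end to end
overhead = network_latency_s / compute_time_s

print(f"Compute time:   {compute_time_s:.1f} s")
print(f"Total response: {total_time_s:.1f} s ({overhead:.0%} added by the network)")
# As long as compute dominates, even a large network latency barely changes
# the user experience - unlike a video call or a video game.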
It also doesn't require a lot of bandwidth. Even considering new applications like image and video generation, the compute time still outweighs the network connection when it comes to response time. This might change in the future, once AI becomes more responsive. But for now, neither training nor inference has strong latency or bandwidth requirements, at least not consumer-facing ones. Netflix streaming thousands and thousands of 4K movies at the same time is in a whole
different ballpark, and a video call or a video game has much tighter latency requirements. For AI datacenters, it's not an important factor. And the differences only get bigger when we look at compute. As I've said, AI datacenters are actually more like supercomputers. Their only goal is to deliver as much computational performance as possible, as efficiently as possible.
And in order to increase compute efficiency for AI workloads, you have to increase density, which starts at the chip level. If we look at the number one provider of AI compute – Nvidia – we can see that ever since Volta, Nvidia's first tensor core GPU, the performance and power consumption of each GPU generation have skyrocketed. While Volta had
an almost tame TDP of only 250 watts, Ampere, its successor, raised it to 400 watts. Next, Hopper increased the TDP to 700 watts, and Nvidia's newest generation, Blackwell, is reaching 1,000 watts. For a single GPU. A GB200 superchip, which combines two Blackwell GPUs with an Nvidia Grace CPU, has
a whopping 2,700-watt TDP, for a single board. And this trend will continue. Nvidia has already announced GPUs that consist of up to four reticle-sized chips. That's twice as much silicon as Blackwell. And even with the increased efficiency of more advanced process nodes in mind, the first 2,000-watt GPU isn't too far away. The compute density is massively increasing at the chip level.
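To make that trajectory concrete, here is a tiny Python sketch. The generation names and wattages come straight from the narration; the specific model names in parentheses are added for context.

# TDP progression of Nvidia's datacenter GPUs, per the figures above (watts).
tdp_watts = {
    "Volta (V100)": 250,
    "Ampere (A100)": 400,
    "Hopper (H100)": 700,
    "Blackwell (B200)": 1000,
}

previous = None
for generation, tdp in tdp_watts.items():
    growth = f" ({tdp / previous:.2f}x over the previous generation)" if previous else ""
    print(f"{generation}: {tdp} W{growth}")
    previous = tdp

# A GB200 superchip pairs two Blackwell GPUs with one Grace CPU on a single board.
blackwell_w = tdp_watts["Blackwell (B200)"]
cpu_and_board_w = 2700 - 2 * blackwell_w   # what's left of the quoted 2,700 W budget
print(f"GB200 superchip: 2 x {blackwell_w} W + ~{cpu_and_board_w} W CPU/board = 2,700 W")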
But it doesn't stop there. Not only is each new chip offering much higher compute performance, the number of GPUs in a single server rack is increasing at the same time. When you are building a modern AI datacenter, you have to build for efficiency. Every watt not spent on actual compute is wasted. And while optical interconnects are great over long distances, and honestly there's
no other option over a certain distance, they need optical transceivers and retimers, which require a lot of power. For that reason, you want to use as much copper as possible. Nvidia's GB200 NVL72 compute rack contains over 5,000 wires and two miles of copper. If Nvidia had used optics instead, it would have consumed 20,000 watts more than the
current copper-based NVLink solution. But copper is really only viable at rack scale. Even within a single datacenter you have to switch to optics at some point. That's why you want as many GPUs in a single rack as possible, so you can connect as many of them as possible over copper. Compute density is the holy grail when it comes to AI datacenters. You want your GPUs to use as
much silicon as possible, have as many GPUs  on a single board as possible and as many of these in a single rack as possible.  That’s why the power requirements for a single rack are continuing to grow. The best way to see just how different traditional and AI datacenters are is to look  at how much compute, and as a direct effect, how much energy a single rack in each  of these datacenter types is using.
If you pick a random server in a traditional datacenter, outside of hyperscalers like Google, Meta and so on, you'd be hard-pressed to find a rack that uses more than 10 kilowatts. The typical rack power consumption is in the range of 3 to maybe 7 kilowatts. Everything above 10 kilowatts per rack is already considered high-performance for a traditional datacenter. And while hyperscalers are building racks in the 15 to 20 kilowatt range, even that
doesn’t compare to racks used for AI compute. The Nvidia GB200 NVL72 we just talked about, which is Nvidia’s fastest rack-sized  solution, has four power shelves that provide 33 kilowatts each. That’s a total of 132  kilowatts for a single rack. 10x what would be considered a high-performance setup in a  traditional datacenter and 30-to-40x the
rack power of a standard, run-of-the-mill server rack. We aren't talking about small differences here, it's night and day. I wasn't kidding when I said "AI datacenter" is a somewhat misleading name, because these numbers even trump supercomputers. If it were possible, AI hyperscalers would build a gigawatt rack. Because density is king.
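That gap is easy to verify with the numbers just mentioned. A minimal sketch, using the quoted typical ranges rather than measurements of any specific facility:

# Rack power comparison, using the figures quoted above (kilowatts).
typical_rack_kw = (3, 7)       # typical traditional rack
high_performance_kw = 10       # already "high-performance" for a traditional datacenter

power_shelves = 4
kw_per_shelf = 33
nvl72_rack_kw = power_shelves * kw_per_shelf   # 132 kW for one GB200 NVL72 rack

print(f"GB200 NVL72 rack: {nvl72_rack_kw} kW")
print(f"vs. a 10 kW high-performance rack: {nvl72_rack_kw / high_performance_kw:.0f}x")
low_kw, high_kw = typical_rack_kw
print(f"vs. a typical {low_kw}-{high_kw} kW rack: "
      f"{nvl72_rack_kw / high_kw:.0f}x to {nvl72_rack_kw / low_kw:.0f}x")
# The exact multiples depend on the baseline you pick, but the gap is an
# order of magnitude or more either way.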
As you can imagine, this massive increase in compute and power density also directly affects cooling. Traditional datacenters, with a lower critical IT power, need smaller cooling solutions. Makes sense. Until recently, almost all datacenters were air-cooled. But this is changing. Datacenters that run AI compute are quickly transitioning to liquid cooling. There are three specific reasons for this, all related to density. This is AMD's MI300X AI accelerator. One GPU,
ready to be installed on a server  blade. But about 90% of its volume is taken up by a massive heatsink. The small  PCB below all that metal is the actual GPU. Unlike consumer GPUs, server GPUs don’t  have individual fans. They just come with massive heatsinks and are cooled by industrial  high-performance fans that cool every component on a single blade. Switching to liquid cooling  drastically reduces the physical footprint of
each GPU, because a liquid cooler is much more compact than the massive heatsinks required for air cooling. It quite literally increases density, because it allows you to pack more hardware into a single server blade and rack, since less space is wasted on heatsinks. Of course, liquid cooling requires a lot of additional infrastructure in and around the datacenter. But that sits outside of the rack, where density doesn't matter anymore.
The second aspect is cooling performance. A liquid coolant can absorb about 4,000 times more heat per unit of volume than air. If you have to remove a lot of heat, because you have to cool lots of GPUs in a very dense setup, it's the only option. Super high-density designs are only feasible with liquid cooling.
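That "about 4,000 times" figure can be sanity-checked with textbook physical constants. The values below are approximate properties of water and air at room temperature, not datacenter measurements.

# Volumetric heat capacity: heat absorbed per cubic meter of coolant per kelvin.
water_density_kg_m3 = 997
water_cp_j_per_kg_k = 4182      # specific heat of water
air_density_kg_m3 = 1.2
air_cp_j_per_kg_k = 1005        # specific heat of air

water_j_per_m3_k = water_density_kg_m3 * water_cp_j_per_kg_k   # ~4.2 MJ/(m^3*K)
air_j_per_m3_k = air_density_kg_m3 * air_cp_j_per_kg_k         # ~1.2 kJ/(m^3*K)

ratio = water_j_per_m3_k / air_j_per_m3_k
print(f"Water: {water_j_per_m3_k / 1e6:.2f} MJ/(m^3*K)")
print(f"Air:   {air_j_per_m3_k / 1e3:.2f} kJ/(m^3*K)")
print(f"Ratio: ~{ratio:,.0f}x")   # on the order of 3,500x, the same ballpark as the claim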
And while there are some Blackwell implementations that still use air cooling, next-gen AI accelerators will almost exclusively use liquid cooling. Google, for example, switched to liquid cooling for their in-house high-performance TPUs a long time ago. But there's a third, sometimes overlooked aspect to liquid cooling. Running silicon at lower temperatures not only increases its life span, but also increases energy efficiency. If you run just a single GPU there's not much to it. But if you run 100,000 GPUs,
the savings add up. And that energy can be used for more important things, like more compute. Of course, liquid cooling is something you have to plan for from the very beginning. An air-cooled datacenter is designed very differently from a liquid-cooled one. It's not your average desktop PC where you can just swap an air cooler for a water cooler; it completely changes the layout of the datacenter. You need to include water pipes,
from the rack level to the building level, and install massive cooling towers. Not only is it worth it when you strive for the highest amount of compute density, there's simply no viable alternative if you want to stay competitive. Now that we've covered connectivity, compute and cooling, let's talk about power. But not at the rack level, we've already discussed that. I'm talking about power at the level of the
entire facility, which has become the number one metric when we talk about datacenter "size". It's not the actual size of the building we are talking about, it's the total power capacity of the datacenter, also called "critical IT power". Traditional "retail datacenters" often provide less than 10 megawatts of critical IT power. Even the larger "wholesale datacenters", like the cluster of datacenters around Dulles Airport outside D.C., are only in the 10-to-30-megawatt range.
Modern hyperscale datacenters from the likes of Microsoft, Google, Amazon, and Meta, and I'm still talking about traditional datacenters that actually host data, can reach 40 to 100 megawatts of critical IT power. But they all pale in comparison to the critical IT power of AI datacenters. There are multiple AI datacenters with critical IT power of over 200 megawatts. Microsoft, for example, operates two 300-megawatt AI
datacenters for OpenAI. And this is just the  beginning. AI campuses with one gigawatt of critical IT power are already under construction. All of this is further amplified by the fact that while a traditional datacenter has fluctuating  power demand based on usage patterns and rarely runs at full power, AI datacenters are more or  less constantly running at close to full load.
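To get a feel for what running at close to full load means in energy terms, here is a rough estimate for one of the 300-megawatt sites just mentioned. The utilization figures are illustrative assumptions, and the calculation ignores cooling overhead (PUE).

# Rough annual energy estimate: AI datacenter vs. a large traditional one.
hours_per_year = 8760

ai_critical_it_mw = 300        # one of the Microsoft/OpenAI sites mentioned above
ai_utilization = 0.9           # assumed: near-constant full load
ai_energy_gwh = ai_critical_it_mw * ai_utilization * hours_per_year / 1000

trad_critical_it_mw = 30       # upper end of the "wholesale" range from earlier
trad_utilization = 0.5         # assumed: fluctuating demand, rarely at full power
trad_energy_gwh = trad_critical_it_mw * trad_utilization * hours_per_year / 1000

print(f"AI datacenter:          ~{ai_energy_gwh:,.0f} GWh per year")   # ~2,365 GWh
print(f"Traditional datacenter: ~{trad_energy_gwh:,.0f} GWh per year") # ~131 GWh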
They not only have this massive critical IT power; they actually use it. With the power requirements of AI datacenters, we are talking about direct access to major high-voltage power lines. And because server racks don't run on high voltage, which is over 100 kilovolts, you need transformers to step down the voltage, first to medium voltage and then to low voltage, which for datacenters is usually 415 volts.
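A simple calculation shows why this step-down chain is unavoidable: at low voltage, the currents involved are enormous. A sketch using the 132-kilowatt NVL72 rack from earlier; the medium- and high-voltage levels and the ideal power factor are assumed example values.

import math

# Line current needed to deliver 132 kW at different voltage levels
# (balanced three-phase AC, ideal power factor assumed).
rack_power_w = 132_000
power_factor = 1.0

def three_phase_current(power_w: float, line_voltage_v: float) -> float:
    # Line current for a balanced three-phase load: I = P / (sqrt(3) * V * pf)
    return power_w / (math.sqrt(3) * line_voltage_v * power_factor)

for label, volts in [("low voltage (415 V)", 415),
                     ("medium voltage (13.8 kV)", 13_800),
                     ("high voltage (115 kV)", 115_000)]:
    print(f"{label:>25}: {three_phase_current(rack_power_w, volts):8.2f} A")
# Roughly 184 A at 415 V versus well under 1 A at transmission voltage, which is
# why power travels at high voltage and is only stepped down close to the racks.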
With a power consumption that rivals that of large cities, these massive AI datacenters require a lot of transformers. So many, in fact, that order books already have a backlog. Transformers, which previously were mostly bought by governments to serve large cities and industrial centers, are suddenly in high demand because of AI. Another interesting difference between traditional and AI datacenters is the idea of backup power. For a traditional datacenter, loss of power is a critical failure point. That's why
they need a system that ensures uninterrupted power supply. For the very short term that means batteries, which have to bridge the time until the emergency generators come online. But AI datacenters have such a massive power demand that they need a large number of generators, which not only have to be bought, but also require the proper permits. This not only adds a lot of additional cost, but takes time to set up. And since time to market is crucial in the AI race,
AI datacenters often have a very limited UPS system. In this case, the datacenter simply stops working if the main power source fails. Which, funnily enough, isn't that big of a deal for training runs, as the GPUs already introduce somewhat frequent failures. When the power goes out, you can simply continue the training run from the last checkpoint once it comes back. And I'm really just scratching the surface here. No matter if compute, connectivity,
cooling or power, there’s so much more depth  to it. Most of what we just covered is based on the amazing datacenter anatomy series from  SemiAnalysis, which covers every aspect of a modern AI datacenter in great detail. If you  want to know how the power stage of a massive AI datacenter really works, or what’s required  to make a datacenter ready for liquid cooling,
I highly recommend that you check out the articles I've linked in the video description below. I've been a SemiAnalysis subscriber since long before we started collaborations like this one, and it's definitely worth the money. But the secret is that about 80% of a SemiAnalysis article is not behind a paywall. That's how they got me. For every topic I researched, I found a super detailed article from SemiAnalysis that actually explained how the industry works in a
digestible way. Even without a subscription it’s  a top tier resource. And with a subscription it only gets better. I mean it in the sincerest  way. If you are even somewhat interested in the semi space and AI datacenters, check out the  links and read the articles. It’s so worth it. Now that we have learned all about AI  datacenters and how they are nothing like a traditional datacenter, let’s get back  to our mystery field in Temple, Texas. Why did
Meta start building a new datacenter only to tear it down halfway through construction? As a quick refresher: construction started in mid-2022 and progressed until at least the end of 2022, but by April of 2023 the entire construction site was flattened. What could have happened during that time that led Meta to make such a radical move? The brainiacs among you might have already figured out that this timeline aligns almost
perfectly with the release of ChatGPT in November of 2022. So, is that our answer? Did Meta start building a traditional datacenter and realize halfway through construction that it was outdated? Well, kind of. But it's even more radical than that. The initial construction site was for Meta's tried-and-tested H-type datacenter. It's called that because the final shape of the datacenter looks like
the letter "H". If we take a closer look at the satellite images, we can see that the initial build would have looked like an H if it had been finished. Many of these H-type datacenters are used in a more traditional datacenter role, filled with CPUs and hard drives. But while the design was optimized for maximum energy efficiency, Meta's H-type datacenters are already capable of running GPUs. Meta has a massive Nvidia
Hopper-based AI cluster that combines 100,000 H100 GPUs across multiple of these H-type datacenters. So Meta was already building an AI-capable campus in Temple. But it wasn't offering a high enough energy density to stay competitive. That's how fast the industry is moving. Even AI datacenters that are built on a very fast
timeline can become outdated during construction. The story has a happy ending – at least for Meta. If we look at the most recent satellite images from 2025, we can see that our little field in Temple now houses not one, but two AI datacenters. This is Meta's new high-density design, with each building providing about 85 megawatts for a combined critical IT power of 170 megawatts.
The new design also has the added benefit of supporting liquid cooling, which makes the high-density layout possible in the first place. The older H-design would have only supported a total of 60 megawatts. Too little in today's AI datacenter race. And it's only the beginning. This is Three Mile Island, a nuclear power plant located on the Susquehanna River, outside of Harrisburg, Pennsylvania. In March of 1979, the Three Mile Island power
plant became infamous when its TMI-2 reactor had a critical failure and suffered a partial meltdown. To this day, it remains the most severe nuclear accident in United States history. The core of TMI-2 has been removed from the site, and the second reactor, TMI-1, was shut down in 2019 because it was operating at a loss. The entire site has since been marked for decommissioning. But that changed. Last year, in 2024,
Microsoft announced a deal with Constellation  Energy, the owner of the site, to restart the still working TMI-1 reactor. The nuclear power  plant is expected to resume operation in 2027, with all energy going to Microsoft for the  next 20 years. And I’m sure you already know what Microsoft needs all that power for. To  power the next generation of AI datacenters. And if you think this is an extreme example,  think again. Not far from Three Mile Island,
only about 2 hours by car, is the Susquehanna Steam Electric Station, a nuclear power plant with an output of about 2,500 megawatts. In 2023, Talen Energy, the operator of the power plant, started to build a massive on-site datacenter, which was acquired by Amazon AWS in 2024 for about 650 million dollars. And there's only one reason to place a datacenter right next to a nuclear power
plant: to feed the massive energy demand of AI. We can see similar moves happening across the entire industry. Meta not only rebuilt the datacenter on our field in Temple, they are starting to place new AI datacenters in tents, because it reduces construction time. And aside from compute density, being fast is important in the AI race. Meta's top two AI locations are Prometheus in Ohio, an already existing AI cluster
powered by gas turbines that's supposed to scale to over 1 gigawatt within the next year. But number one is the Hyperion supercluster, which is supposed to reach a truly staggering scale. By 2030, the site, located in Louisiana, is supposed to reach a combined critical IT power of 2 gigawatts, with room to grow to 5 gigawatts. For comparison, the country of Germany has
an average power usage of about 60 gigawatts. CoreWeave, a hyperscaler, acquired and retrofitted an old crypto-mining datacenter in Denton, Texas, that was previously used to mine Bitcoin. If we take a look at satellite images, we can see a large cluster of buildings in this location. But one is not like the others. The center of the site is a massive gas-fired power plant that directly
supplies power to the AI datacenters. Elon Musk's xAI is building the largest cluster yet, with 150,000 Nvidia GB200 GPUs. Datacenters are growing so quickly, they have to be fed with mobile generators because the main power sources take too long to get online. Everyone is scrambling as fast as they can. What we are seeing right now is only the beginning. 300-megawatt clusters might seem big in 2025, but the first gigawatt
clusters will come online next year. Right now, 200,000 GPUs are a lot for an AI cluster, but there are already plans for a million GPUs. And these won't be the same Hopper or Blackwell generations as today; they will be upcoming GPU generations with even higher TDP numbers. The simple fact is that it doesn't matter if you believe in AGI or not. All that matters is that the major players very clearly believe it's a race to AGI and whoever
gets there first takes the entire cake. And because that cake is worth trillions of dollars, they are willing to do everything in their power to get there first. And that quite literally requires power. A lot of power. Google has announced that it's funding the construction of three advanced nuclear power plants. It won't be long until most major hyperscalers are major players in the energy business, including owning
and operating multiple nuclear power plants. Next-gen AI datacenters not only have nothing in common with traditional datacenters, soon individual datacenter campuses will surpass even the power demand of megacities and huge industrial parks. And the largest AI clusters are adding power demand that rivals that of industrial nations. It's not slowing down. The race for AGI is not only about compute, it's about power. Both literally
and figuratively. If this trend continues, AI will become the number one consumer of energy. There's so much going on in the datacenter industry, it's almost impossible to follow all the developments. Unlike supercomputers, which are eager to get listed in the Top500 to show off what they have achieved, AI datacenters are much more private. Outside of PR announcements, the AI race isn't happening in the open, at least not if you don't know where to look.
Hyperscalers don't want you or the competition to know how much actual compute they have, how much they will add over the coming months and years, and how competitive - or rather how dense - their AI datacenters are. And how much power all their datacenters consume. But then, how do we know about all these projects? How do we know about their critical IT power, how many GPUs they run, how efficient they are and which power source they use?
The answer is a combination of a large knowledge base, lots of high-quality research and actually spending the money on high-resolution satellite images. And I don't mean the kind on Google Earth that maybe gets a low-res update every half year. I mean professional satellite images. I'm not describing a fantasy of mine here, that's actually what the brainiacs of the SemiAnalysis datacenter team are doing. It might be a bit insane, but SemiAnalysis is
tracking over 5,000 datacenters world-wide. And  by tracking I don’t mean a simple Excel sheet with a name and an address. I’m talking  about the true Sherlock Holmes stuff. The datacenter model not only tracks new  construction projects, but also existing datacenters. High-res satellite images are  analyzed in detail. And because the power stages, generators and cooling infrastructure are  visible, the datacenter team can actually
create very detailed insights for each  datacenter. Of course that only works if you know what to look for. But when you do -  and the SemiAnalysis datacenter team certainly does - a datacenter is like an open book. I don’t think there’s anything that comes even close in terms of coverage and insights to what  the AI datacenter model from SemiAnalysis offers. If you are working in or with the industry and are  interested in a highly detailed overview of the
current AI datacenter market, you have to check out the datacenter model. Not only is it really cool, it delivers the most extensive insights available into the fast-paced race to AGI. ChatGPT was released in November of 2022, less than three years ago. Ever since, it feels like everything is speeding up. The race for AGI has created an insatiable demand for AI compute that
only seems to accelerate. The datacenter industry is now more focused on building AI supercomputers than actual data centers. And with it comes a massive demand for power. That not only means more transformers, more generators and liquid cooling; energy generation itself is more and more a focus point of hyperscalers and big tech companies. From the launch of ChatGPT to next year, AI compute will add an estimated 40 to 50 gigawatts of global power demand. These are
numbers comparable to the average usage of entire countries like France and Germany. And I know I'm repeating myself, but this is just the beginning. If this trend continues, it will take just a few more years before Google, Microsoft, Meta, Amazon and other hyperscalers operate more nuclear power plants than most countries in the world, and add, every single year, AI datacenters with critical
IT power that surpasses that of most nations. All of this in hopes of being the first to achieve AGI. And if they get there, the AI power demand will surge even more. It truly starts to look like AI datacenters are going to eat the world. The first bites are already visible, if you know where to look. Thank you again to the entire SemiAnalysis team
and especially Jeremie, who was very patient in  answering all of my stupid questions. Go check out their amazing work and if you want to know when  the next bites are coming, the SemiAnalysis data center model is your best weather forecast. I hope you found this video interesting and see you in the next one. Oh, and subscribe if  you want to see more videos like this one.
End of transcript
