When I wrote about xAI's Colossus project a few weeks ago, the scale of the initiative seemed, for lack of a better word, dizzying.
You can watch Patrick from ServeTheHome lead viewers through the halls of the supercomputing cluster ostensibly built to power xAI's Grok chatbot: row after row, rack after rack of processors humming away like one giant machine.
Now, a major AI hardware vendor says three of its biggest customers are planning Colossus-sized projects. According to Broadcom, by 2027 no fewer than three customers may build data centers housing 1,000,000 processors each.
The news comes from the company's fourth-quarter earnings call on December 12, where executives said Broadcom's AI revenue soared 220% this year.
According to reports, Broadcom President and CEO Hock Tan said on the call: “As you know, we currently have three hyperscale customers who have developed their own multi-generation AI XPU roadmaps. By 2027, we believe they each plan to deploy clusters of 1,000,000 XPUs on a single fabric.”
For reference, Colossus was first reported to need 100,000 GPUs. After quickly doubling that order, Musk offered a candid update: the center will eventually need 1 million Nvidia GPUs. Industry leaders like Jensen Huang have marveled at the breakneck pace of the data center's construction, and so have prominent journalists. For now it looks like a unicorn. But will this kind of project become the norm in the next few years?
Who could it be now?
The reports are clear that no one has specifically identified who is planning these massive builds. Ask ChatGPT and it will confirm as much, though the model does offer a list of the companies it believes are best positioned to mount such initiatives, including:
· NVIDIA
· Microsoft
· Amazon Web Services
· OpenAI
· Meta
· Tesla
In addition, ChatGPT lists a group of Chinese companies with large-scale data center capabilities of their own, including Alibaba, Tencent and Baidu.
Dig deeper, though, and you'll find that most of these companies are nowhere near that number. Mark Zuckerberg, for example, has publicly stated that Meta aims to have approximately 350,000 data center GPUs by the end of this year. Google is estimated to have about 2 million GPUs in total across all of its data centers worldwide, not in any single facility.
As for AWS's flagship cluster, the hardware behind its B2B services, one official figure describes a virtual supercomputer delivering 9.95 petaflops, which looks to require several hundred Nvidia H100s.
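As a rough sanity check on that figure, here is a back-of-envelope sketch. The per-GPU throughput and efficiency values are my assumptions (roughly 34 FP64 teraflops peak per H100, sustained at perhaps 70%), not anything AWS has published:

```python
# Back-of-envelope: how many H100s does a 9.95-petaflop cluster imply?
# Assumed values, not from AWS: ~34 FP64 teraflops peak per H100,
# sustained at ~70% of peak on a Linpack-style workload.
CLUSTER_TFLOPS = 9.95 * 1000       # 9.95 petaflops, in teraflops
H100_PEAK_TFLOPS = 34.0            # assumed FP64 peak per GPU
EFFICIENCY = 0.70                  # assumed sustained fraction of peak

gpus_needed = CLUSTER_TFLOPS / (H100_PEAK_TFLOPS * EFFICIENCY)
print(f"~{gpus_needed:.0f} H100s") # ~418 H100s, i.e. several hundred
```

That lands squarely in the "several hundred" range, three to four orders of magnitude short of a million-GPU fabric.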
So a million GPUs is still a big deal.
The main concerns
Impressive as the news is, not everyone will be enthusiastic about the prospect of multiple data centers of this size.
Going back to Colossus, critics have already voiced concerns about competing for resources with this power-hungry behemoth.
For one thing, Colossus is estimated to need up to 1,000,000 gallons of water per day; for comparison, municipal water usage is often cited at about 100 gallons per person per day.
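Dividing those two figures makes the scale concrete; this is just the arithmetic implied by the numbers above:

```python
# Water draw implied by the figures above.
colossus_gallons_per_day = 1_000_000  # estimated daily usage cited above
per_capita_gallons_per_day = 100      # typical municipal per-person figure

equivalent_population = colossus_gallons_per_day / per_capita_gallons_per_day
print(f"Colossus drinks like a town of {equivalent_population:,.0f} people")
# -> Colossus drinks like a town of 10,000 people
```

In other words, a single cluster draws as much water as a town of about 10,000 residents.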
Natural gas turbines are also being used to supply all of that energy…
Experts have been suggesting for some time that the United States prepare to expand its nuclear power infrastructure to serve data centers, specifically by siting power sources next to superclusters to improve efficiency.
But even at 100% efficiency, the power requirements would obviously be enormous.
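To put a rough number on "enormous", here is a hedged back-of-envelope. The 700 W per GPU (an Nvidia H100 SXM's TDP) and the PUE of 1.3 are typical published figures, not values reported for these specific projects:

```python
# Rough power draw of a 1,000,000-GPU cluster.
# Assumptions (typical figures, not project-specific): ~700 W per GPU
# and a power usage effectiveness (PUE) of 1.3 for cooling and overhead.
NUM_GPUS = 1_000_000
WATTS_PER_GPU = 700   # H100 SXM TDP, an assumed stand-in
PUE = 1.3             # assumed facility overhead multiplier

total_megawatts = NUM_GPUS * WATTS_PER_GPU * PUE / 1e6
print(f"~{total_megawatts:,.0f} MW")  # ~910 MW per cluster
```

That is roughly the output of one large nuclear reactor per cluster, which is why the siting argument above keeps coming up.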
As leading journalists and others covering the issue have noted, things are moving at an alarming rate.
We need to figure out what impact these superclusters will have on our society, because while Colossus is the only publicly identified project of this scale under development, others may soon follow.