Despite their $30,000+ price, Nvidia’s H100 GPUs are a hot commodity — to the point where they are typically back-ordered. Earlier this year, Google Cloud announced the private preview launch of its H100-powered A3 GPU virtual machines, which combine Nvidia’s chips with Google’s custom-designed 200 Gbps Infrastructure Processing Units (IPUs). Now, at its Cloud Next conference, Google announced that it will launch the A3 into general availability next month.
We’ll have to see if Google Cloud will be able to keep up with demand for these chips, given that the focus here is on training and serving generative AI models and large language models.
When it announced the A3 earlier this year, Google Cloud said that it would offer up to 26 exaflops of AI performance and, thanks in part to the custom IPUs, up to 10x more network bandwidth compared to the previous-generation A2 machines.
“A3 is really purpose-built to train, tune and serve incredibly demanding and scalable generative AI workloads and large language models,” Mark Lohmeyer, the VP and GM for compute and ML infrastructure at Google Cloud, said during a press conference ahead of today’s announcement. “It leverages a number of unique Google innovations, including Google networking technologies such as their infrastructure processing and offloads, that help support the massive scale and performance that these workloads require.”