Microsoft revealed two new chips at its Ignite conference in Seattle on Wednesday: the Maia 100, an artificial intelligence chip that could rival Nvidia's highly sought-after AI graphics processing units, and the Cobalt 100 Arm chip, designed for general computing tasks and positioned to compete with Intel processors.
Deep-pocketed tech companies are now giving their cloud customers more chip options for the infrastructure that runs their applications. Alibaba, Amazon, and Google have done so for years. Microsoft, which held about $144 billion in cash at the end of October, had a 21.5% share of the cloud market in 2022, second only to Amazon, according to one estimate.
Microsoft’s Maia 100 and the Evolution of AI Chips
In 2016, Google introduced its original tensor processing unit for AI. Amazon Web Services followed suit in 2018 with its Graviton Arm-based chip and Inferentia AI processor. In 2020, Amazon also announced Trainium for training models.
Custom AI chips from cloud providers could help meet demand amid the GPU shortage. However, unlike Nvidia or AMD, Microsoft and other cloud computing companies don’t plan to sell servers containing their chips.
The company developed its AI computing chip based on customer feedback, explained Rani Borkar, who leads Azure hardware systems and infrastructure at Microsoft.
Microsoft is currently testing how well the Maia 100 performs in applications such as its Bing search engine’s AI chatbot (now called Copilot instead of Bing Chat), the GitHub Copilot coding assistant, and GPT-3.5-Turbo, a large language model supported by Microsoft-backed OpenAI.
Borkar mentioned that OpenAI has trained its language models using extensive data from the internet. These models can generate email messages, summarize documents, and answer questions with just a few words of human instruction.
The GPT-3.5-Turbo model operates in OpenAI’s ChatGPT assistant, which gained popularity shortly after its release last year. Following that, companies quickly integrated similar chat capabilities into their software, leading to a higher demand for GPUs.
At an Evercore conference in New York in September, Colette Kress, Nvidia’s finance chief, mentioned, “We’ve been collaborating extensively with all our suppliers to enhance our supply situation and meet the demands placed by many of our customers.”
OpenAI has previously used Nvidia GPUs in Azure to train its models.
Microsoft’s New Cooling System and GPU Challenges in Data Centers
Beyond the Maia chip itself, Microsoft has developed custom liquid-cooled hardware called Sidekicks, designed to fit in racks right next to the racks containing Maia servers. Both the server racks and the Sidekick racks can be installed without retrofitting, a spokesperson said.
GPUs can make efficient use of limited data-center space challenging. Steve Tuck, co-founder and CEO of server startup Oxide Computer, said some companies place a few GPU-equipped servers at the bottom of a rack like “orphans” to prevent overheating, rather than filling the rack from top to bottom. Others, Tuck noted, add cooling systems to lower temperatures.
If Amazon’s experience is any indication, Microsoft may see faster adoption of the Cobalt processors than of the Maia AI chips. Microsoft is currently testing its Teams app and Azure SQL Database service on Cobalt. So far, Microsoft says, the chip has performed 40% better than Azure’s existing Arm-based chips, sourced from startup Ampere.
Graviton’s Success and Savings in the Cloud
In the last eighteen months, as prices and interest rates have risen, many companies have been looking for ways to make their cloud spending more efficient. For AWS customers, Graviton has been a popular choice. According to Vice President Dave Brown, all of AWS’ top 100 customers are now using the Arm-based chips, which can offer a 40% improvement in price-performance.
Transitioning from GPUs to AWS Trainium AI chips can be more complicated than moving from Intel Xeons to Gravitons, because each AI model has its own quirks. Brown said that while many tools have been adapted to run on Arm thanks to its prevalence in mobile devices, that’s less true of silicon designed for AI.
However, he anticipates that organizations will eventually experience similar price-performance gains with Trainium compared to GPUs.
“We have shared these specifications with the ecosystem and with many of our partners, benefiting all of our Azure customers,” said the spokesperson.
Borkar said she didn’t have details on how Maia’s performance compares with alternatives such as Nvidia’s H100. Nvidia recently announced that its H200 will begin shipping in the second quarter of 2024, and Borkar offered no comparison with that upcoming chip either.
With the Maia 100 and Cobalt 100 Arm chips, Microsoft joins its cloud rivals in designing custom silicon for AI and general computing. Inventive engineering such as the Sidekick cooling racks reflects the industry’s push to meet demand efficiently, and AWS’ success with Graviton suggests customers will adopt in-house chips when the price-performance gains are real. As Microsoft continues testing and Nvidia readies the H200, competition in AI chips and cloud infrastructure is only intensifying.