In a quiet office building in Austin, Texas, two nondescript rooms house a handful of Amazon employees designing microchips. These chips, known as Inferentia and Trainium, are built specifically to accelerate generative AI workloads. They give Amazon Web Services (AWS) customers an alternative to the scarce, expensive Nvidia GPUs typically used to train large language models.
AWS CEO Adam Selipsky said in June that demand for chips suited to generative AI is high, and that Amazon believes it is better positioned than anyone else to meet it.
Other companies, however, have moved faster and made bigger bets on generative AI. Microsoft gained attention not only for hosting the viral ChatGPT chatbot but also for investing a reported $13 billion in OpenAI, and it quickly built OpenAI’s models into Bing. Google launched its own chatbot, Bard, and invested $300 million in OpenAI rival Anthropic.
Amazon joined the race in April, introducing its Titan family of language models along with Bedrock, a service that helps developers add generative AI to their software. Chirag Dekate, VP analyst at Gartner, noted that Amazon is usually a pioneer, not a follower. Meta also released its own large language model (LLM), Llama 2, which is available for testing on Microsoft’s Azure public cloud.
Custom silicon as a true differentiator
Looking ahead, Dekate said Amazon’s custom chips could give the company a real competitive advantage in generative AI.
“The true differentiation is the technical capabilities they’re bringing to bear,” he said. “Trainium and Inferentia are exclusive to Amazon. Microsoft doesn’t have them.”
AWS quietly began producing custom silicon back in 2013 with a piece of specialized hardware called Nitro, which is now the most widely deployed chip across AWS infrastructure. Amazon says every AWS server contains at least one Nitro chip, putting total usage at more than 20 million units.
In 2015, Amazon bought Israeli chip startup Annapurna Labs. Then in 2018, it launched Graviton, an Arm-based server chip positioned as a rival to x86 CPUs from industry giants like AMD and Intel.
“Probably high single digits to maybe 10% of total server sales are Arm, and a good chunk of those are Amazon. So on the CPU side, they’ve done quite well,” said Stacy Rasgon, a senior analyst at Bernstein Research.
Also in 2018, Amazon announced its first chips built for AI, two years after Google unveiled its first Tensor Processing Unit (TPU). Microsoft, meanwhile, has yet to announce details of the Athena AI chip it has reportedly been developing in partnership with AMD.
Amazon’s chip lab in Austin, Texas, is where Trainium and Inferentia are developed and tested. During a tour of the facility, Matt Wood, vice president of product, explained what each chip does.
“Machine learning breaks down into two distinct stages: you train the machine learning models, and then you run inference against those trained models,” Wood said. “Trainium provides about a 50% improvement in terms of price performance relative to any other way of training machine learning models on AWS.”
Trainium came to market in 2021, following the 2019 launch of Inferentia, which is now in its second generation.
“Inferentia allows customers to deliver very low-cost, high-throughput, low-latency machine learning inference: the processing of the predictions that turn an input prompt into a generative AI model’s response,” Wood said.
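For readers less familiar with the split Wood describes, here is a minimal, hypothetical PyTorch sketch (illustrative only, not AWS code): the loop at the top is the gradient-heavy training work that chips like Trainium target, and the final no-gradient call is the per-request inference work that chips like Inferentia target.

```python
import torch
import torch.nn as nn

# Phase 1: training, the compute-heavy step Trainium is aimed at.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(64, 16)          # toy batch of features (illustrative)
labels = torch.randint(0, 2, (64,))   # toy labels

for _ in range(100):                  # many passes over the data
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()                   # gradient computation: training-only work
    optimizer.step()

# Phase 2: inference, the per-request step Inferentia is aimed at.
model.eval()
with torch.no_grad():                 # no gradients: cheaper, lower latency
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
print(prediction)
```

The economics differ accordingly: training is a large one-off (or periodic) cost, while inference cost scales with every user request, which is why AWS markets the two chips separately.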
For now, though, Nvidia’s GPUs remain the dominant choice for model training. In July, AWS launched new AI acceleration hardware powered by Nvidia H100 GPUs.
“The software ecosystem that has grown up around Nvidia’s chips over the past 15 or so years is unmatched by anyone else. Right now, the big beneficiary in AI is Nvidia,” Rasgon remarked.
Leveraging cloud dominance
Amazon’s dominance in cloud computing, however, is a major point of differentiation.
“Amazon doesn’t need the headlines as much; it already has a really strong cloud installed base. Its focus is on enabling existing customers to expand into value-creating work with generative AI,” Dekate said.
When companies choose between Amazon, Google, and Microsoft for generative AI, Amazon has a built-in advantage: millions of AWS customers already know the platform, run their applications on it, and store their data there.
“It comes down to velocity: how quickly these companies can move to develop generative AI applications. That is driven by starting with the data they already have in AWS and using the compute and machine learning tools that we provide,” explained Mai-Lan Tomsen Bukovec, vice president of technology at AWS.
AWS is the world’s largest cloud computing provider, with a 40% market share in 2022, according to technology industry research firm Gartner. Although AWS operating income has declined year over year for three consecutive quarters, the unit still accounted for 70% of Amazon’s $7.7 billion operating profit in the second quarter, and its operating margins have historically been far wider than Google Cloud’s.
AWS is also steadily broadening its portfolio of developer tools for generative AI.
“Rewind the clock to even before ChatGPT: the timeline shows this wasn’t a hastily devised response. You can’t engineer a chip, let alone build a service like Bedrock, in just 2 to 3 months,” said Swami Sivasubramanian, vice president of database, analytics, and machine learning at AWS.
Growing the AWS AI ecosystem
Bedrock gives AWS customers access to large foundation models built by Anthropic, Stability AI, and AI21 Labs, as well as Amazon’s own Titan models.
“We don’t believe that one model is going to rule the world,” Sivasubramanian said. “We want our customers to have state-of-the-art models from multiple providers, so they can pick the right tool for the job.”
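As a rough illustration of what that multi-model choice looks like in practice, here is a minimal sketch using the AWS SDK for Python (boto3). The model ID, region, prompt, and request/response shapes shown are assumptions for illustration; each provider on Bedrock defines its own body format, so consult the current documentation before relying on them.

```python
import json

import boto3

# Hypothetical sketch: invoke a Bedrock-hosted model via boto3.
# Model IDs and payload schemas vary by provider and may change;
# treat the values below as assumptions, not a definitive reference.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "inputText": "Summarize the benefits of custom AI chips.",  # Titan-style request
})

response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",  # swap in, e.g., an Anthropic or AI21 model ID
    contentType="application/json",
    accept="application/json",
    body=body,
)

result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])  # Titan-style response shape
```

Because each provider defines its own schema, switching models is mostly a matter of changing the model ID and the body format, which is the “right tool for the job” flexibility Sivasubramanian describes.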
Among Amazon’s newest AI offerings is AWS HealthScribe, a service unveiled in July that uses generative AI to help doctors draft patient visit summaries. Amazon also offers SageMaker, a machine learning hub with a range of algorithms, models, and related tooling.
Another notable tool is CodeWhisperer, a coding assistant that Amazon says has helped developers complete tasks 57% faster on average. Last year, Microsoft reported similar productivity gains from its own coding assistant, GitHub Copilot.
In June, AWS announced a $100 million “center” dedicated to generative AI innovation.
“We have so many customers who say they want to do generative AI, but who don’t necessarily know what that means for their own businesses. So we’re bringing in solutions architects, engineers, strategists, and data scientists to work with them one on one,” said AWS CEO Selipsky.
While AWS has focused largely on tools rather than building a direct competitor to ChatGPT, a recently leaked internal email shows Amazon CEO Andy Jassy is directly overseeing a new central team building out expansive large language models.
Balancing growth and security
On the second-quarter earnings call, Jassy said a substantial portion of AWS business is now driven by AI and the more than 20 machine learning services it offers. Customers include Philips, 3M, Old Mutual, and HSBC.
AI’s explosive growth has been accompanied by a wave of security concerns from companies worried that employees will leak proprietary information into the training data used by publicly available large language models.
“It’s astonishing how many Fortune 500 companies I’ve talked to that have banned ChatGPT,” Selipsky said. “With our approach to generative AI and our Bedrock service, anything you do, any model you use through Bedrock, runs in an isolated virtual private cloud environment. It’s encrypted, and it has the same AWS access controls.”
So far, Amazon has only accelerated its push into generative AI, saying that “over 100,000” customers are using machine learning on AWS today. While that is a small fraction of AWS’s customer base, analysts expect the picture to change.
“What we’re not seeing is businesses suddenly changing their infrastructure strategies and migrating everything to Microsoft because they think Microsoft is far ahead in generative AI,” Dekate said. “If a company is already an Amazon customer, chances are it will explore Amazon’s ecosystem that much more deeply.”
Conclusion
With Inferentia and Trainium, Amazon’s Austin chip lab gives AWS customers a homegrown alternative for training and running generative AI models. Selipsky sees vast demand for such chips, and analysts like Dekate argue that custom silicon, paired with AWS’s enormous installed base, gives the company a strategic edge. Services like Bedrock and HealthScribe reflect a deliberate, tools-first approach, and in a shifting landscape AWS is betting that innovation combined with data security will keep customers building within its ecosystem.