In Amazon's offices in Austin, Texas, a small team of specialized engineers is designing microchips that will let AWS customers train their own generative AI models. The chips, called Inferentia and Trainium, give AWS customers a way to train their own large language models (LLMs), offering an alternative to Nvidia's GPUs, which many consider costly and difficult to use.
Trainium came to market in 2021, two years after the launch of the first-generation Inferentia chips. Inferentia is now in its second generation, a sign of how quickly these technologies are evolving.
Adam Selipsky, the CEO of Amazon Web Services (AWS), said in a June interview: “The entire world would like more chips for doing generative AI, whether that’s GPUs or whether that’s Amazon’s own chips we're designing. I think we’re in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want.”
Amazon vs Google vs Microsoft vs Meta: The Generative AI Competition Remains Tough
Interestingly, some companies are ahead of Amazon and have captured a large chunk of the generative AI market. Microsoft, for example, reportedly invested $13 billion in OpenAI after the latter launched ChatGPT in November 2022. Soon after, Microsoft began incorporating generative AI models into its products, starting with Bing in February 2023.
Also in February, Google invested $300 million in OpenAI rival Anthropic and launched Bard, its own LLM-powered chatbot.
In April, Amazon launched its own family of large language models, Titan, along with Bedrock, a service that helps developers enhance their software with generative AI.
Chirag Dekate, a VP analyst at Gartner, said: "Amazon is not used to chasing markets. Amazon is used to creating markets. And I think for the first time in a long time, they are finding themselves on the back foot and they are working to play catch up."
However, the generative AI competition is not only between Google, Microsoft, and Amazon. Meta, Facebook's parent company, is also in the race. Meta released its own open-source LLM, Llama 2, which is available to the public on Microsoft's Azure public cloud. The model is Meta's answer to OpenAI's ChatGPT.
Despite the intense competition, Dekate believes Amazon's custom silicon will be an important competitive advantage for the company: "I think the true differentiation is the technical capabilities that they are bringing to bear because guess what? Microsoft does not have Trainium or Inferentia."
AWS Chips: Amazon’s Generative AI Competitive Advantage?
Amazon did not start producing chips in response to the current generative AI boom. Swami Sivasubramanian, AWS VP of database, analytics and machine learning, emphasized this point: "let's rewind the clock even before ChatGPT. It's not like after that happened, suddenly we hurried and came up with a plan because you can't engineer a chip in that quick a time, let alone you can't build a Bedrock service in a matter of 2 to 3 months."
Taking Sivasubramanian's advice and rewinding to 2013: that year, Amazon began producing its own custom silicon, starting with a piece of specialized hardware called Nitro. The company didn't stop there, acquiring Annapurna Labs, an Israeli chip startup, in 2015. By 2018, Amazon had created Graviton, an Arm-based server chip that rivals x86 CPUs from AMD and Intel. Today, Amazon's custom chip is present in every AWS server, and more than 20 million units have been sold so far.
In the AI sphere, Amazon launched its first AI-focused chips in 2018, two years after Google unveiled its Tensor Processing Unit (TPU). Microsoft, for its part, has reportedly been working with AMD on a similar chip, codenamed Athena, but has yet to make an announcement.
In a recent interview with CNBC, Matt Wood, AWS VP of product, explained what Amazon's two chips, Trainium and Inferentia, represent. In his words, "Machine learning breaks down into these two different stages. So you train the machine learning models and then you run inference against those trained models. Trainium provides about 50% improvement in terms of price performance relative to any other way of training machine learning models on AWS. Inferentia allows customers to deliver very, very low cost, high-throughput, low-latency, machine learning inference, which is all the predictions of when you type in a prompt into your generative AI model, that's where all that gets processed to give you the response."
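The two stages Wood describes apply to any machine learning workload, not just LLMs. A toy pure-Python sketch (a one-feature linear model standing in for an LLM; this is an illustration, not AWS code) makes the split concrete:

```python
# Toy illustration of the two ML phases Wood describes:
# (1) training fits model parameters to data; (2) inference runs
# the frozen model against new inputs.

def train(samples, epochs=500, lr=0.01):
    """Training phase: fit a weight and bias by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def infer(params, x):
    """Inference phase: apply the already-trained parameters to new input."""
    w, b = params
    return w * x + b

# Train once (the expensive step -- Trainium's territory)...
params = train([(1, 2), (2, 4), (3, 6)])
# ...then serve many cheap predictions (Inferentia's territory).
print(infer(params, 10))  # roughly 20 once training has converged on y = 2x
```

Training is done once and dominates the compute bill, while inference runs on every user prompt, which is why AWS ships a separate chip optimized for each stage.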
Training Large Language Models: Nvidia's GPUs The Current Winner
When it comes to training AI models, Nvidia's GPUs remain the best. According to Stacy Rasgon of Bernstein Research, Nvidia's lead is no accident: "Nvidia chips have a massive software ecosystem that's been built up around them over the last like 15 years that nobody else has, and the big winner from AI right now is Nvidia." That lead is underscored by Amazon's own use of Nvidia H100s to power its newly launched AI acceleration hardware.
Moreover, Amazon's Bedrock service lets AWS users explore and use LLMs from multiple providers in the industry, including Anthropic, Stability AI, and AI21 Labs, alongside Amazon's own Titan. Sivasubramanian summed up Amazon's current focus: "we don't believe that one model is going to rule the world, and we want our customers to have the state-of-the-art models from multiple providers because they are going to pick the right tool for the right job."
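In practice, "picking the right tool for the job" on Bedrock mostly means changing a model identifier on a single API. A hypothetical sketch using boto3 (the model ID and the Titan request/response shape below are assumptions based on AWS's public documentation, not details from this article):

```python
import json

# Hypothetical sketch: Bedrock exposes models from several providers
# behind one API, so switching providers is largely a matter of
# swapping the modelId and the provider-specific request body.

def build_titan_request(prompt: str, max_tokens: int = 256) -> str:
    """Build the JSON body assumed by Amazon Titan text models."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": max_tokens},
    })

def invoke(prompt: str, model_id: str = "amazon.titan-text-express-v1") -> str:
    """Call Bedrock (requires AWS credentials and model access)."""
    import boto3  # deferred so the request helper works without AWS set up
    client = boto3.client("bedrock-runtime")
    resp = client.invoke_model(modelId=model_id, body=build_titan_request(prompt))
    return json.loads(resp["body"].read())["results"][0]["outputText"]
```

Swapping in, say, an Anthropic model would mean a different `model_id` and a different request body, while the surrounding `invoke_model` plumbing stays the same.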
Furthermore, AWS CEO Selipsky said, “We have so many customers who are saying, ‘I want to do generative AI,’ but they don’t necessarily know what that means for them in the context of their businesses. And so we’re going to bring in solutions architects and engineers and strategists and data scientists to work with them one on one.”
More Generative AI Competitive Advantage For Amazon: Cloud Computing Dominance
According to Dekate, “Amazon does not need to win headlines. Amazon already has a really strong cloud install base. All they need to do is to figure out how to enable their existing customers to expand into value creation motions using generative AI.” Mai-Lan Tomsen Bukovec, who is the VP of technology at AWS, further explained that “it’s a question of velocity. How quickly can these companies move to develop these generative AI applications is driven by starting first on the data they have in AWS and using compute and machine learning tools that we provide.”
Indeed, Amazon Web Services (AWS) remains the world's leading cloud computing provider, claiming 40% of the total market share as of 2022, according to technology industry researcher Gartner. AWS's profit margins have also historically been far wider than Google Cloud's: AWS accounted for 70% of Amazon's $7.7 billion in total operating profit in the second quarter of 2023.
While a leaked email revealed that Andy Jassy, Amazon's CEO, is overseeing the creation of further large language models, Dekate's and Tomsen Bukovec's comments explain why Amazon appears more focused on building useful tools for its existing customer base than on shipping a ChatGPT competitor. For example, AWS HealthScribe uses generative AI to help clinicians draft patient visit summaries; SageMaker serves as a hub of AI models and algorithms; and CodeWhisperer reportedly helps developers complete coding tasks 57% faster.
Training Large Language Models: Information Security Concerns Abound, and How Amazon Plans To Solve Them
Following the surge in AI's popularity, many businesses are wary of feeding their proprietary information into public large language models as training data.
Selipsky emphasized how Amazon is planning to solve this problem for its customers by saying “I can’t tell you how many Fortune 500 companies I’ve talked to who have banned ChatGPT. So with our approach to generative AI and our Bedrock service, anything you do, any model you use through Bedrock will be in your own isolated virtual private environment. It'll be encrypted, it'll have the same AWS access controls.”
While over 100,000 customers are currently using generative AI on AWS, the number is still extremely small when compared to the millions of businesses using its cloud computing services.
Nonetheless, Dekate explains why Amazon isn't troubled by that figure: "What we are not seeing is enterprises saying, 'Oh, wait a minute, Microsoft is so ahead in generative AI, let's just go out and let's switch our infrastructure strategies, migrate everything to Microsoft.' If you're already an Amazon customer, chances are you're likely going to explore Amazon ecosystem quite extensively."