AWS and NVIDIA Collaborate on Next-Generation Infrastructure for Training Large Machine Learning Models and Building Generative AI Applications

New Amazon EC2 P5 instances deployed in EC2 UltraClusters are fully optimized to harness NVIDIA Hopper GPUs for accelerating generative AI training and inference at massive scale

MENA, March 24, 2023 — Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced a multi-faceted collaboration focused on building out the world’s most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

The collaboration will leverage next-generation Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS’s state-of-the-art networking and scalability that will deliver up to 20 exaFLOPS of compute performance for building and training the largest deep learning models. P5 instances will be the first GPU-based instance to take advantage of AWS’s second-generation Elastic Fabric Adapter (EFA) networking, which provides 3,200 Gbps of low-latency, high bandwidth networking throughput, enabling customers to scale up to 20,000 H100 GPUs in EC2 UltraClusters for on-demand access to supercomputer-class performance for AI. Among globally known companies that leverage this technology is Pinterest, which is accelerating its product development and introducing new Empathetic AI-based experiences to their customers.

“AWS and NVIDIA have collaborated for more than 12 years to deliver large-scale, cost-effective GPU-based solutions on demand for various applications such as AI/ML, graphics, gaming, and HPC,” said Adam Selipsky, CEO at AWS. “AWS has unmatched experience delivering GPU-based instances that have pushed the scalability envelope with each successive generation, with many customers scaling machine learning training workloads to more than 10,000 GPUs today. With second-generation EFA, customers will be able to scale their P5 instances to over 20,000 NVIDIA H100 GPUs, bringing supercomputer capabilities on demand to customers ranging from startups to large enterprises.”

“The arrival of accelerated computing and AI is timely. As businesses strive to do more with less, accelerated computing provides step-function speedups while decreasing cost and power. The advent of generative AI has prompted businesses to reimagine their products and business models and become disruptors, rather than disrupted,” said Jensen Huang, founder and CEO of NVIDIA. “AWS is a long-time partner and was the first cloud service provider to offer NVIDIA GPUs. We are thrilled to combine our expertise, scale, and reach to help customers harness accelerated computing and generative AI to engage the enormous opportunities ahead.”

New Supercomputing Clusters and Server Designs

The new P5 instances powered by NVIDIA GPUs are ideal for training increasingly complex LLMs and computer vision models that power the most demanding and compute-intensive generative AI applications, such as question answering, code generation, video and image generation, and speech recognition. And the new server designs targeted scalable, efficient AI by leveraging the thermal, electrical, and mechanical expertise of the NVIDIA and AWS engineering teams, building servers that harness GPUs to deliver AI at scale, with an emphasis on energy efficiency in AWS infrastructure. GPUs are typically 20 times more energy efficient than CPUs for specific AI workloads, with the H100 being up to 300 times more energy efficient than CPUs for LLMs.

About Amazon Web Services

For over 15 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud offering. AWS has been continually expanding its services to support virtually any cloud workload, and it now has more than 200 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 96 Availability Zones within 30 geographic regions, with announced plans for 15 more Availability Zones and five more AWS Regions in Australia, Canada, Israel, New Zealand, and Thailand. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs.

About Amazon

Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Amazon strives to be Earth’s Most Customer-Centric Company, Earth’s Best Employer, and Earth’s Safest Place to Work. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Career Choice, Fire tablets, Fire TV, Amazon Echo, Alexa, Just Walk Out technology, Amazon Studios, and The Climate Pledge are some of the things pioneered by Amazon. For more information, visit amazon.com/about and follow @AmazonNews.

About NVIDIA

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.