Ever since Amazon launched its cloud computing division, commonly known as AWS (Amazon Web Services), in 2006, the company has been on a mission to reshape how computing resources are acquired and deployed, and to make them as versatile as possible. That strategy was on full display at this year’s re:Invent conference.
AWS debuted several new compute options, some built on its own new custom chips, along with a staggering array of tools and services for organizing, analyzing, and connecting data. The sheer number and complexity of the new features and services makes it difficult to keep track of all the choices now available to customers. The abundance of options, however, is not the result of unchecked sprawl.
In his keynote and other appearances, new AWS CEO Adam Selipsky emphasized that the organization is “obsessed” with customers, and that most of its product decisions and strategies are driven by customer needs. It turns out that when you have many different types of customers with many different types of workloads and requirements, you end up with a complex set of options.
This approach will presumably reach its logical limit at some point, but in the meantime it means that the wide range of AWS products and services is likely a mirror of the fullness (and complexity) of today’s enterprise computing landscape. Indeed, analyzing which services are being used, to what extent, and how that mix has changed over time could provide a wealth of insight into enterprise computing trends, but that’s a topic for another time.
On the compute side, the company acknowledged that it now offers over 600 different EC2 (Elastic Compute Cloud) instance types, each made up of a different combination of CPU and other acceleration silicon, memory, network connectivity, and more. While that number is hard to fully grasp, it is another indication of how diverse today’s computing needs have become. From cloud-native containerized AI and ML applications that demand the latest dedicated AI accelerators or GPUs, to legacy “lift and shift” enterprise applications that run only on older x86 processors, cloud computing services like AWS now need to be able to handle all of them.
Among the new instances announced this year are several based on 3rd Gen Intel Xeon Scalable processors. However, the ones that garnered the most attention were based on three of Amazon’s own new silicon designs. The Hpc7g instance is built on an updated version of the Arm-based Graviton3 processor, dubbed the Graviton3E, which the company claims delivers 2x the floating-point performance of the previous Hpc6g instance and 20% better overall performance than the current Hpc6a.
Like many offerings, Hpc7g is targeted at a specific set of workloads: in this case, high-performance computing (HPC) applications such as weather forecasting, genomics processing, and fluid dynamics. It is also intended for larger machine learning models that often run across thousands of cores. What’s interesting is that this demonstrates both how far Arm-based processors have come in the types of workloads they are used for, and the degree of refinement that AWS is bringing to its various EC2 instances.
Separately, in several other sessions, AWS highlighted the growing momentum behind Graviton for many other types of workloads, particularly cloud-native containerized applications from AWS customers such as DirecTV and Stripe.
One intriguing takeaway from these sessions is that, thanks to the nature of the tools used to develop these applications, the challenge of porting code from x86 to native Arm instructions (once considered a major hindrance to the adoption of Arm-based servers) has largely faded away.
Instead, all it takes is flipping a few options before the code is compiled and deployed to the instance. This makes further growth of Arm-based cloud computing much more likely, especially for new applications.
Of course, some of these organizations aim to build applications that are completely instruction-set independent, which would presumably make the choice of architecture irrelevant. Even then, however, the compute instances with the best price/performance or performance per watt, which frequently feature Arm-based processors, remain the more attractive option.
For machine learning workloads, Amazon introduced its second-generation Inferentia inference processor as part of the new Inf2 instance. Inferentia2 is designed to support inference on models with billions of parameters, such as the many new large language models being developed for applications like real-time speech recognition.
The new architecture is designed to scale across thousands of cores, which is what these enormous new models, such as GPT-3, require. In addition, Inferentia2 includes support for a mathematical technique known as stochastic rounding, which AWS describes as “a probabilistic rounding method that provides high performance and greater accuracy than legacy rounding modes.” To get the most out of distributed computing, the Inf2 instance also supports a next-generation version of the NeuronLink ring network architecture, which is said to deliver 4x the throughput and 1/10th the latency of existing Inf1 instances. The bottom line is that Inf2 can offer up to 45% better performance per watt for inference than any other option, including GPU-based ones. Given that, according to AWS, the power demands of inference are often 9x those of model training, that’s a big deal.
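To see why stochastic rounding helps at low precision, consider a minimal sketch in plain Python (an illustration of the general technique, not the Inferentia2 hardware implementation; the function name is hypothetical). Instead of always rounding toward the nearest value, each number is rounded up with probability equal to its fractional part, so the expected result equals the original value and errors cancel out on average rather than accumulating across billions of operations:

```python
import random

def stochastic_round(x: float) -> int:
    """Round x to an adjacent integer, choosing the upper neighbor
    with probability equal to the fractional part of x.

    Unlike truncation or round-to-nearest, the expected value of the
    result equals x, so rounding is unbiased on average -- the property
    that matters when low-precision arithmetic is repeated many times.
    """
    lower = int(x // 1)      # floor of x (works for negatives too)
    frac = x - lower         # fractional part in [0, 1)
    return lower + (1 if random.random() < frac else 0)

# Averaged over many trials, the stochastic estimate converges to x,
# whereas truncation would be biased low by the fractional part.
x = 2.3
n = 100_000
avg = sum(stochastic_round(x) for _ in range(n)) / n
print(avg)  # close to 2.3
```

Hardware implementations apply the same idea to low-precision floating-point formats rather than integers, but the unbiasedness argument is identical.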
The third new chip-driven instance, called C7gn, features new AWS Nitro network cards powered by fifth-generation Nitro chips. Designed specifically for workloads that demand extremely high throughput, such as firewalls, virtual networking, and real-time encryption/decryption of data, C7gn is claimed to offer twice the network bandwidth and 50% higher packet-per-second processing than previous instances. Importantly, the new Nitro cards reach those levels with a 40% improvement in performance per watt over their predecessors.
Overall, Amazon’s emphasis on custom silicon and an increasingly diverse range of compute options amounts to a comprehensive toolbox for companies looking to move more of their workloads to the cloud. As with many other aspects of its AWS offerings, the company continues to refine and enhance what has unquestionably become a very robust, mature set of tools. Together, they offer a notable and promising glimpse into the future of computing and the new kinds of applications it can enable.
Bob O’Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and the professional financial community. You can follow him on Twitter @bobodtech.