Why is it important: Deep learning workloads are outpacing currently available hardware as model complexity grows, resource requirements diverge, and existing architectures impose hard constraints. Several Nvidia researchers recently published a technical article detailing the company’s commitment to multi-chip modules (MCMs) to meet these evolving requirements. The article presents the team’s case for a Composable-On-Package (COPA) GPU that can better handle different classes of deep learning workloads.
Graphics processing units (GPUs) have become one of the main resources supporting DL thanks to their inherent parallelism and optimizations. COPA-GPU is based on the realization that traditional converged GPU designs using specialized hardware are quickly becoming impractical. These converged solutions rely on a traditional monolithic die along with specialized hardware such as high-bandwidth memory (HBM), Tensor Cores (Nvidia)/Matrix Cores (AMD), ray-tracing (RT) cores, and so on. As a result of this converged design, the hardware may be well suited to some tasks but inefficient for others.
Unlike current monolithic GPUs, which bundle all execution components and caches in one package, the COPA-GPU architecture provides the ability to mix and match multiple hardware blocks to better match the dynamic workloads found in today’s high-performance computing (HPC) and deep learning (DL) environments. This ability to include more features and adapt to multiple types of workloads can lead to higher levels of GPU reuse and, more importantly, give data scientists a greater ability to push the boundaries of what is possible with their existing resources.
Although the concepts of artificial intelligence (AI), machine learning (ML), and deep learning (DL) are often used interchangeably, they are distinct. DL, a subset of both AI and ML, attempts to emulate the way the human brain processes information, using learned filters to predict and classify data. DL is the driving force behind many automated AI capabilities that can do anything from driving our cars to monitoring financial systems for fraudulent activity.
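To make the "filters" idea concrete, here is a minimal sketch of the kind of operation a DL model applies millions of times: sliding a small kernel over an input to detect a pattern. The kernel values below are hand-set for illustration (in a real network they are learned), and the function name `apply_filter` is our own, not an API from any library.

```python
import numpy as np

def apply_filter(image, kernel):
    """Valid 2D cross-correlation of `image` with `kernel` (no padding)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Response = element-wise product of the window with the kernel, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector: responds where intensity changes left-to-right.
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)

# Toy "image": bright left half, dark right half -> one vertical edge.
image = np.zeros((5, 5))
image[:, :3] = 1.0

response = apply_filter(image, edge_kernel)
# The response peaks in the columns where the brightness transition sits.
```

A deep network stacks many such filters in sequence, which is exactly the dense, regular arithmetic that GPU hardware blocks like Tensor Cores accelerate.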
While AMD and others have touted chiplet and chip-stacking technology as the next step in the evolution of their processors and GPUs over the past few years, the concept of the MCM is far from new. Its history can be traced back to IBM’s bubble-memory MCMs and the 3081 mainframes of the 1970s and 1980s.