"We need bigger GPUs," said Jensen Huang. He also introduced the NIM microservices and Omniverse Cloud APIs.
GTC 2024 has started and introduced various tech innovations NVIDIA has been busy with. Here is what you need to know from the keynote.
"We need bigger GPUs," the company's CEO Jensen Huang decided and announced Blackwell, a new computing platform that can handle the generative AI efforts NVIDIA has been working on and then some. Its increased computing power can be used "for everything from software to services, robotics to medical technology and more."
"We need another way of doing computing – so that we can continue to scale so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable. Accelerated computing is a dramatic speedup over general-purpose computing, in every single industry."
A Blackwell chip delivers up to 5 times more performance than the previous-generation Hopper in neural network training. It features a 5th-generation NVLink interconnect twice as fast as Hopper and scales up to 576 GPUs.
Image credit: NVIDIA
The Blackwell GPU architecture features six transformative technologies for accelerated computing:
- World’s Most Powerful Chip — Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using a custom-built 4NP TSMC process with two-reticle limit GPU dies connected by 10 TB/second chip-to-chip link into a single, unified GPU.
- Second-Generation Transformer Engine — Fueled by new micro-tensor scaling support and NVIDIA’s advanced dynamic range management algorithms integrated into NVIDIA TensorRT™-LLM and NeMo Megatron frameworks, Blackwell will support double the compute and model sizes with new 4-bit floating point AI inference capabilities.
- Fifth-Generation NVLink — To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA NVLink® delivers groundbreaking 1.8TB/s bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
- RAS Engine — Blackwell-powered GPUs include a dedicated engine for reliability, availability and serviceability. Additionally, the Blackwell architecture adds capabilities at the chip level to utilize AI-based preventative maintenance to run diagnostics and forecast reliability issues. This maximizes system uptime and improves resiliency for massive-scale AI deployments to run uninterrupted for weeks or even months at a time and to reduce operating costs.
- Secure AI — Advanced confidential computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols, which are critical for privacy-sensitive industries like healthcare and financial services.
- Decompression Engine — A dedicated decompression engine supports the latest formats, accelerating database queries to deliver the highest performance in data analytics and data science. In the coming years, data processing, on which companies spend tens of billions of dollars annually, will be increasingly GPU-accelerated.
"NVIDIA's GB200 Grace Blackwell Superchip connects two Blackwell NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900GB/s ultra-low-power NVLink chip-to-chip interconnect." The company also noted that GB200-powered systems can be connected with the NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms, which deliver advanced networking at speeds up to 800Gb/s.
Huang also presented NVIDIA NIM (NVIDIA inference microservices), "a new way of packaging and delivering software that connects developers with hundreds of millions of GPUs to deploy custom AI of all kinds."
He believes that in the future, we'll be building software using a team of AIs, and NIMs will help you with this. They support industry-standard APIs and are easy to connect and work across NVIDIA’s large CUDA installed base.
"The enterprise IT industry is sitting on a goldmine. They have all these amazing tools (and data) that have been created over the years. If they could take that goldmine and turn it into copilots, these copilots can help us do things."
Moreover, NVIDIA announced that NVIDIA Omniverse Cloud will be available as APIs, helping create industrial digital twin applications and workflows for software makers. Omniverse is the company's platform that integrates Universal Scene Description (OpenUSD) and RTX rendering technologies. The technology will be available on Apple's Vision Pro, and the new Omniverse Cloud APIs will let developers stream interactive digital twins into the headsets.
Robotics is also important for NVIDIA, so Huang announced the Isaac Perceptor SDK with multi-camera visual odometry, 3D reconstruction, and occupancy map, and depth perception to help robots better see their environment.
Also, the new Isaac Manipulator robotic arm perception, path planning, and kinematic control library will help make robotic arms more adaptable.
Other reveals include Project GR00T, a foundation model for humanoid robots, and Jetson Thor, a new computer for humanoid robots based on the NVIDIA Thor SOC.
Find out more about NVIDIA's innovations here and join our 80 Level Talent platform and our Telegram channel, follow us on Instagram, Twitter, and LinkedIn, where we share breakdowns, the latest news, awesome artworks, and more.