New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers (2024)

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations.

Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.
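
In practice, moving ORT inference onto DirectML is a one-line change: request the DirectML execution provider when creating the session. Here is a minimal Python sketch, assuming the onnxruntime-directml package is installed and using "model.onnx" as a placeholder for any exported model with a single float32 input:

    import numpy as np
    import onnxruntime as ort

    # Ask ORT for the DirectML execution provider first, with CPU fallback.
    session = ort.InferenceSession(
        "model.onnx",  # placeholder path; any ONNX model exported for ORT
        providers=["DmlExecutionProvider", "CPUExecutionProvider"],
    )

    # Build a dummy input matching the model's first declared input,
    # resolving any symbolic (dynamic) dimensions to 1 for the demo.
    inp = session.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.rand(*shape).astype(np.float32)

    outputs = session.run(None, {inp.name: x})
    print([o.shape for o in outputs])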

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and run inference on complex AI models natively on Windows. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs.
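
The PyTorch integration is still rolling out, but Microsoft's existing torch-directml package already illustrates the usage pattern: tensors are placed on a DirectML device and ordinary PyTorch ops run through it. A minimal sketch, assuming torch-directml is installed (pip install torch-directml):

    import torch
    import torch_directml

    dml = torch_directml.device()   # default DirectML adapter, e.g. an RTX GPU
    x = torch.randn(4, 4, device=dml)
    w = torch.randn(4, 4, device=dml)
    y = torch.relu(x @ w)           # matmul and activation execute via DirectML
    print(y.to("cpu"))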

These advancements build on NVIDIA’s world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.

RTX AI PCs — Enhanced AI for Gamers, Creators and Developers

NVIDIA introduced the first PC GPUs with dedicated AI acceleration, the GeForce RTX 20 Series with Tensor Cores, along with the first widely adopted AI model to run on Windows, NVIDIA DLSS, in 2018. Its latest GPUs offer up to 1,300 trillion operations per second of dedicated AI performance.

In the coming months, Copilot+ PCs equipped with new power-efficient systems-on-a-chip and RTX GPUs will be released, giving gamers, creators, enthusiasts and developers increased performance to tackle demanding local AI workloads, along with Microsoft’s new Copilot+ features.

For gamers on RTX AI PCs, NVIDIA DLSS boosts frame rates by up to 4x, while NVIDIA ACE brings game characters to life with AI-driven dialogue, animation and speech.

For content creators, RTX powers AI-assisted production in apps like Adobe Premiere Pro, Blackmagic Design DaVinci Resolve and Blender, automating tedious tasks and streamlining workflows. From 3D denoising and accelerated rendering to text-to-image and video generation, these tools empower artists to bring their visions to life.

For game modders, NVIDIA RTX Remix, built on the NVIDIA Omniverse platform, provides AI-accelerated tools to create RTX remasters of classic PC games. It makes it easier than ever to capture game assets, enhance materials with generative AI tools and incorporate full ray tracing.

For livestreamers, the NVIDIA Broadcast application delivers high-quality AI-powered background removal and noise removal, while NVIDIA RTX Video provides AI-powered upscaling and auto high-dynamic-range (HDR) conversion to enhance streamed video quality.

Enhancing productivity, RTX GPUs run the LLMs behind AI assistants and copilots faster, and can process multiple requests simultaneously.

And RTX AI PCs allow developers to build and fine-tune AI models directly on their devices using NVIDIA’s AI developer tools, which include NVIDIA AI Workbench, NVIDIA cuDNN and CUDA on Windows Subsystem for Linux. Developers also have access to RTX-accelerated AI frameworks and software development kits like NVIDIA TensorRT, NVIDIA Maxine and RTX Video.
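
As a quick sanity check that the CUDA-on-WSL path is working, developers can run a small PyTorch snippet inside the WSL 2 distro (assuming a CUDA-enabled PyTorch build is installed there):

    import torch

    if torch.cuda.is_available():
        print("CUDA device:", torch.cuda.get_device_name(0))
        x = torch.randn(1024, 1024, device="cuda")
        print((x @ x.T).sum().item())  # matmul runs on the RTX GPU through WSL
    else:
        print("CUDA not visible; check the NVIDIA Windows driver and WSL 2 setup.")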

The combination of AI capabilities and performance delivers enhanced experiences for gamers, creators and developers.

Faster LLMs and New Capabilities for Web Developers

Microsoft recently released the generative AI extension for ORT, a cross-platform library for AI inference. The extension adds support for optimization techniques like quantization for LLMs like Phi-3, Llama 3, Gemma and Mistral. ORT supports different execution providers for inferencing via various software and hardware stacks, including DirectML.
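
For a sense of how the extension is used, here is a minimal Python sketch with the onnxruntime-genai package. The model folder name is a placeholder for a locally downloaded, DirectML-built model such as Phi-3 Mini, and the exact API details vary slightly across onnxruntime-genai releases:

    import onnxruntime_genai as og

    # Placeholder folder: a DirectML build of Phi-3 Mini exported for the
    # generative AI extension for ORT.
    model = og.Model("phi3-mini-directml")
    tokenizer = og.Tokenizer(model)

    params = og.GeneratorParams(model)
    params.set_search_options(max_length=128)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode("Explain DirectML in one sentence."))
    while not generator.is_done():
        generator.generate_next_token()

    print(tokenizer.decode(generator.get_sequence(0)))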

ORT with the DirectML backend offers Windows AI developers a quick path to develop AI capabilities, with stability and production-grade support for the broad Windows PC ecosystem. NVIDIA optimizations for the generative AI extension for ORT, available now in R555 Game Ready, Studio and NVIDIA RTX Enterprise Drivers, help developers get up to 3x faster performance on RTX compared to previous drivers.

Developers can unlock the full capabilities of RTX hardware with the new R555 driver, bringing better AI experiences to consumers, faster. It includes:

  • Support for the DQ-GEMM metacommand to handle INT4 weight-only quantization for LLMs
  • New RMSNorm normalization kernels for Llama 2, Llama 3, Mistral and Phi-3 models (both DQ-GEMM and RMSNorm are sketched after this list)
  • Grouped-query and multi-query attention mechanisms, and sliding-window attention, to support Mistral
  • In-place KV updates to improve attention performance
  • Support for GEMM of non-multiple-of-8 tensors to improve context phase performance
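
For readers unfamiliar with the first two items, the reference math is compact. Below is a NumPy sketch of RMSNorm and of the dequantize-then-GEMM pattern behind DQ-GEMM; the driver's metacommands fuse and accelerate these steps, so this is purely illustrative:

    import numpy as np

    def rms_norm(x, weight, eps=1e-6):
        # RMSNorm: scale activations by their root mean square; unlike
        # LayerNorm there is no mean-centering step.
        rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
        return x / rms * weight

    def dq_gemm(x, q_weight, scale, zero_point):
        # Weight-only INT4 quantization: weights are stored as small integer
        # codes (0..15 here) and dequantized to float just before the GEMM.
        w = (q_weight.astype(np.float32) - zero_point) * scale
        return x @ w

    x = np.random.randn(2, 8).astype(np.float32)
    print(rms_norm(x, np.ones(8, dtype=np.float32)).shape)           # (2, 8)

    q = np.random.randint(0, 16, size=(8, 4)).astype(np.int8)        # 4-bit codes
    print(dq_gemm(x, q, scale=np.float32(0.1), zero_point=8).shape)  # (2, 4)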

Additionally, NVIDIA has optimized AI workflows within WebNN to deliver the powerful performance of RTX GPUs directly within browsers. The WebNN standard helps web app developers accelerate deep learning models with on-device AI accelerators, like Tensor Cores.

Now available in developer preview, WebNN uses DirectML and ORT Web, a JavaScript library for in-browser model execution, to make AI applications more accessible across multiple platforms. With this acceleration, popular models like Stable Diffusion, SD Turbo and Whisper run up to 4x faster on WebNN compared with WebGPU and are now available for developers to use.

Microsoft Build attendees can learn more about developing on RTX in the Accelerating development on Windows PCs with RTX AI in-person session on Wednesday, May 22, at 11 a.m. PT.

FAQs

What is the advantage of RTX AI?

AI Enhanced, RTX On

An AI-powered boost gives you maximum performance in over 300 games and apps thanks to DLSS. Plus, enjoy portable, powerful laptops with AI-powered Max-Q technologies that optimize your laptop's performance, power, acoustics, and more for peak efficiency.

Which NVIDIA graphics card is best for AI?

5 Best GPUs for AI and Deep Learning in 2024
  1. NVIDIA A100: an excellent GPU for deep learning. ...
  2. NVIDIA RTX A6000: a powerful GPU well suited to deep learning applications. ...
  3. NVIDIA RTX 4090
  4. NVIDIA A40
  5. NVIDIA V100

Is NVIDIA developing AI?

NVIDIA AI Platform

Transform any enterprise into an AI organization with NVIDIA AI, the world's most advanced platform with full stack innovation across accelerated infrastructure, enterprise-grade software, and AI models.

What is the latest RTX?

The Nvidia GeForce RTX 4080 was unveiled on September 20, 2022, and began shipping in November 2022 from third-party manufacturers like Asus, Gigabyte, PNY and more. It launched with a $1,199 price tag, which is still its current price.

Do you need a GPU to run AI?

The GPU handles the more difficult mathematical and geometric computations, which means it can provide superior performance for AI training and inference while also accelerating a wide range of other computing workloads.

How much faster is a GPU than a CPU for AI?

Because they have thousands of cores, GPUs are optimized for training deep learning models and can process multiple parallel tasks up to three times faster than a CPU.

Is Nvidia AI free?

Kick-start your AI journey with access to NVIDIA AI workflows—for free.

How many GPUs do I need for AI?

Keep in mind that a single GPU like the NVIDIA RTX 3090 or A5000 can provide significant performance and may be enough for your application. Having 2, 3 or even 4 GPUs in a workstation can provide a surprising amount of compute capability and may be sufficient for even many large problems.

What GPU is needed for generative AI?

NVIDIA RTX GPUs — capable of running a broad range of applications at the highest performance — unlock the full potential of generative AI on PCs. Tensor Cores in these GPUs dramatically speed AI performance across the most demanding applications for work and play.

Why did Apple stop using Nvidia?

As far as anyone can tell, Nvidia and Apple are not the best of friends, and Apple decided that it would not sign any newer Nvidia drivers. This may have been at least partly because doing so restricts the number of people who try making Hackintoshes out of existing machines.

How much does the Nvidia AI chip cost?

Nvidia's newest AI chip will cost anywhere from $30,000 to $40,000, CEO Jensen Huang told CNBC. The Blackwell chip, unveiled Monday, is a big step up from its predecessor, Hopper. Hopper could cost roughly $40,000 in high demand; the A100 before it cost much less at around $10,000.

What is the most powerful Nvidia AI chip?

Nvidia's must-have H100 AI chip made it a multitrillion-dollar company, one that may be worth more than Alphabet and Amazon, and competitors have been fighting to catch up. But perhaps Nvidia is about to extend its lead — with the new Blackwell B200 GPU and GB200 “superchip.”

What is the strongest RTX in the world?

The NVIDIA GeForce RTX 4090 is the ultimate GeForce GPU.

What does RTX stand for?

RTX stands for Ray Tracing Texel eXtreme. RTX cards provide real-time ray tracing to make rendered scenes look more realistic. The line is a variant of GeForce, announced in 2018.

Is Nvidia better than AMD in 2024?

Nvidia's growth trajectory positions it to pull far ahead of AMD in total revenue by 2024. Analysts are incredibly bullish on Nvidia's prospects in AI. They see Nvidia dominating the market for AI training with its industry-leading combination of GPU hardware and the CUDA software platform.

What is the benefit of RTX?

Realistic Graphics

Ray tracing is the signature feature of Nvidia GeForce RTX gaming laptops, and this technology lets you enjoy the most realistic graphics possible in the latest gaming titles. Ray tracing optimizes in-game reflections and shadows and simulates how light interacts with 3D space.

What is the point of RTX?

RTX enables a new development in computer graphics: generating interactive images that react realistically to lighting, shadows and reflections.

Is RTX better for machine learning?

RTX GPUs often feature more CUDA cores and Tensor Cores than GTX GPUs. Tensor Cores, in particular, are essential for accelerating AI and deep learning tasks.

Why is the Nvidia chip better for AI?

Once Nvidia realized that its accelerators were highly efficient at training AI models, it focused on optimizing them for that market. Its chips have kept pace with ever more complex AI models: in the decade to 2023, Nvidia increased the speed of its computations 1,000-fold.
