Nvidia has announced its latest generation of GeForce RTX graphics cards at its GTC AI conference. The Nvidia RTX 4090 and RTX 4080 are the first GPUs from the ‘Ada Lovelace’ series, and they should be up to twice as fast as their last-gen counterparts in rasterised games – and up to four times as fast in ray-traced games.
The RTX 4090 24GB will be available on October 12th and costs $1599, while the RTX 4080 16GB debuts in November at $1199. There’s also an RTX 4080 12GB model that costs $899. The 4090 and 4080 are reportedly two to four times faster than their Ampere equivalents, which are the 3090 Ti and the 3080 Ti, respectively.
Note that Nvidia RTX 30-series cards will remain on sale for the moment, filling in the bottom part of the market until lower-tier RTX 40-series GPUs arrive.
So how are these GPUs so fast? Well, the 40-series GPUs use TSMC’s ‘4N’ process and boast up to 76 billion transistors and up to 18,000 CUDA cores, 70 percent more than were in last-gen Ampere. The new process allows the generation to be significantly more power-efficient too, although we expect the flagship cards to be as power-hungry as rumoured – you’re just getting a ton of extra performance to offset that in efficiency terms.
The new streaming multiprocessor uses a new technique, shader execution re-ordering (SER), which Nvidia claims provides a ‘two to three times speed-up’ for ray tracing and a 25 percent improvement for rasterised games. SER works by dynamically rescheduling shading workloads to better use GPU resources. There are similar advancements in the dedicated ray tracing silicon, with a doubling of ray-triangle intersection throughput, a new opacity micromap engine that doubles the speed of ‘ray tracing of alpha test geometry’ and a micromesh engine that ‘increases geometric richness without the BVH build and storage cost’. There’s also a new and more powerful Tensor core for AI tasks.
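To give a feel for why re-ordering shading work can help, here is a toy Python sketch. It is not Nvidia's implementation: it assumes threads run in groups of 32, that each group needs one pass per distinct shader among its threads, and it stands in a plain sort for whatever scheduling the hardware actually does. The names `divergent_passes` and `hits` are our own.

```python
import random
from typing import List

WARP_SIZE = 32  # assumed number of threads that execute together in this toy model


def divergent_passes(shader_ids: List[int], warp_size: int = WARP_SIZE) -> int:
    """Count passes needed if each group runs once per distinct shader it contains."""
    total = 0
    for start in range(0, len(shader_ids), warp_size):
        group = shader_ids[start:start + warp_size]
        total += len(set(group))
    return total


random.seed(0)
# Hypothetical workload: 1024 ray hits, each needing one of 8 different shaders.
hits = [random.randrange(8) for _ in range(1024)]

before = divergent_passes(hits)          # hits processed in arrival order
after = divergent_passes(sorted(hits))   # hits grouped by shader first

print(f"passes before re-ordering: {before}")
print(f"passes after re-ordering:  {after}")
```

In this model, grouping hits by shader means most groups run a single shader, so far fewer passes are needed. Real GPUs are far more complicated, but the intuition carries over.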
The new cards are also the first to support Nvidia’s DLSS 3, a new technique that generates entirely new frames – essentially adding in interpolated frames between ‘real’ ones to dramatically increase frame-rates. Frame generation has a deleterious effect on input latency, so it’s combined with Reflex to reduce that latency as much as possible. Generating frames this way eases the load on both the CPU and the GPU, so you actually see frame-rate benefits in both CPU and GPU-limited games – although the input latency penalty means it’ll be less useful for, say, competitive FPS games.
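The basic idea of slotting generated frames between rendered ones can be sketched in a few lines of Python. This is a deliberately naive illustration that uses a simple linear blend of two frames, which is not how DLSS 3 produces its frames. The `Frame` type and the function names are ours, and one generated frame per pair of rendered frames is an assumption.

```python
from typing import List

Frame = List[float]  # toy "frame": a flat list of pixel brightness values


def blend(a: Frame, b: Frame, t: float = 0.5) -> Frame:
    """Linearly blend two equally sized frames; t=0 gives a, t=1 gives b."""
    if len(a) != len(b):
        raise ValueError("frames must be the same size")
    return [(1.0 - t) * pa + t * pb for pa, pb in zip(a, b)]


def interpolate_sequence(rendered: List[Frame]) -> List[Frame]:
    """Insert one blended frame between each pair of rendered frames."""
    if not rendered:
        return []
    out: List[Frame] = [rendered[0]]
    for prev, nxt in zip(rendered, rendered[1:]):
        out.append(blend(prev, nxt))  # generated frame
        out.append(nxt)               # real frame
    return out


rendered = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
displayed = interpolate_sequence(rendered)
print(len(rendered), "rendered ->", len(displayed), "displayed")
```

Note that each in-between frame can only be built once the following real frame exists, which is one intuitive reason an interpolation approach like this toy one adds input latency.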
Nvidia demonstrated Cyberpunk 2077 running at ~22fps at 4K with RT enabled and DLSS disabled, then ~100fps with RT enabled and DLSS 3 engaged – a massive speedup even if this is a cherry-picked demo. The firm also played a video demo of Flight Sim 2020, with the game running at ~60fps with RT + DLSS disabled and ~135fps with RT + DLSS 3 enabled – good evidence that the DLSS 3 technique does ease CPU limitations.
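For a quick sanity check on those demos, the speed-ups can be worked out from the approximate frame-rates quoted above. This is a back-of-the-envelope Python sketch: the helper names are ours, and the inputs are the rough on-screen figures from Nvidia's presentation rather than measured results.

```python
def speedup(base_fps: float, new_fps: float) -> float:
    """Return how many times faster new_fps is than base_fps."""
    if base_fps <= 0:
        raise ValueError("base_fps must be positive")
    return new_fps / base_fps


def frame_time_ms(fps: float) -> float:
    """Convert a frame-rate into the time spent on each frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Approximate figures from the presentation: (DLSS off, DLSS 3 on)
demos = {
    "Cyberpunk 2077": (22, 100),
    "Flight Sim 2020": (60, 135),
}

for name, (off, on) in demos.items():
    print(
        f"{name}: {speedup(off, on):.2f}x faster, "
        f"{frame_time_ms(off):.1f} ms -> {frame_time_ms(on):.1f} ms per frame"
    )
```

That works out to roughly 4.5x for Cyberpunk 2077 and 2.25x for Flight Sim 2020.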
As well as upgrading existing games with RTX and DLSS 3, Nvidia also announced a ‘new’ title: Portal with RTX. This remaster, 15 years on from the original release, looks impressive from the short teaser trailer. It’ll be released for free for Portal owners in November. The mod was created in what Nvidia is calling ‘RTX Remix’, an application that lets you produce mods for a wide range of games that add ray tracing and AI-upscaled textures – very neat.
Elsewhere, Nvidia announced Racer RTX, a ‘simulated world’ with incredible realism – think their marble demo turned up to 11.
It was an impressive presentation from Nvidia, and we’re looking forward to testing both the performance and features of the new generation. It’ll be fascinating to see whether DLSS 3 lives up to Team Green’s lofty promises… and whether these new cards will actually be available at their announced prices if the performance claims are substantiated. We’ll share more information when we have it!