Around the first week of July 2021, I had the privilege of being humbled by NVIDIA representatives – for a good reason. Long story short, I was working on a topic with DLSS as one of the subjects in mind. With that said, NVIDIA offered to help explain, in the most basic form, how DLSS actually works – slides attached.
Suffice it to say, what could have been supplemental to my previous topic became the main subject of this one. Not gonna lie, with further research done, the technology impressed me so much that I decided to give it a proper rundown.
What is DLSS?
Before we even discuss how DLSS works, we first need to understand why it was developed in the first place. DLSS, short for Deep Learning Super Sampling, is a feature announced back in August 2018, when NVIDIA revealed the ray-tracing-capable RTX 20 series of graphics cards with the all-new RT and Tensor Cores inside.
Ray tracing, while pretty, is resource intensive, and NVIDIA knows this very well. This is where DLSS comes in: to provide an uplift in performance without (possibly) sacrificing visual fidelity. It does this by rendering the game at a lower resolution and using the Tensor Cores built into RTX GPUs to run a deep learning network that upscales the image – while solving aliasing problems at the same time.
For example, at 4K resolution with DLSS turned on, the GeForce RTX 3080 Ti gets an average FPS gain of about 34.44% over 4K native. Gamers with mid-range graphics cards do not need to worry either, with even the RTX 2060 able to breach the 120 FPS mark with the feature turned on.
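For anyone wondering how a percentage like that is derived, here is a minimal sketch. The two FPS values are made-up placeholders picked only to show the arithmetic; they are not the benchmark numbers behind the figure quoted above.

```python
def fps_gain_percent(native_fps: float, dlss_fps: float) -> float:
    """Relative FPS uplift of DLSS over native rendering, in percent."""
    if native_fps <= 0:
        raise ValueError("native_fps must be positive")
    return (dlss_fps - native_fps) / native_fps * 100.0


# Hypothetical averages, NOT measured data from this article.
example_native_fps = 90.0
example_dlss_fps = 121.0

example_gain = fps_gain_percent(example_native_fps, example_dlss_fps)
print(f"{example_gain:.2f}% faster than native")
```

The gain is always expressed relative to the native result, so the same absolute FPS difference looks bigger on a slower card.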
Right now, more than 55 games support this feature, and NVIDIA’s supercomputers are working non-stop to optimize game after game for roll-out. This also explains why, back then, a new driver usually came out every time a game got DLSS support. “Usually” is the keyword here for the early iteration of DLSS – more on this later.
This June alone, NVIDIA announced 5 new games that will be added to the roster. As of this writing, the following titles have confirmed support for DLSS:
- Amid Evil
- Anthem
- Aron’s Adventure
- Battlefield V
- Bright Memory
- Call of Duty: Black Ops Cold War
- Call of Duty: Modern Warfare
- Call of Duty: Warzone
- Chernobylite
- Control
- CRSED: F.O.A.D. (Formerly Cuisine Royale)
- Crysis Remastered
- Cyberpunk 2077
- Death Stranding
- Deliver Us the Moon
- Doom Eternal
- Dying: 1983
- Edge of Eternity
- Enlisted
- Everspace 2
- F1 2020
- Final Fantasy XV
- Fortnite
- Ghostrunner
- Gu Jian Qi Tan Online
- Icarus
- Into the Radius VR
- Iron Conflict
- Justice
- LEGO Builder’s Journey
- Marvel’s Avengers
- MechWarrior 5: Mercenaries
- Metro Exodus
- Metro Exodus PC Enhanced Edition
- Minecraft With RTX For Windows 10
- Monster Hunter: World
- Moonlight Blade
- Mortal Shell
- Mount & Blade II: Bannerlord
- Necromunda: Hired Gun
- Nine to Five
- Naraka: Bladepoint
- No Man’s Sky
- Nioh 2 – The Complete Edition
- Outriders
- Pumpkin Jack
- Rainbow Six Siege
- Ready or Not
- Red Dead Redemption 2
- Redout: Space Assault
- Rust
- Scavengers
- Shadow of the Tomb Raider
- Supraland
- System Shock
- The Ascent
- The Fabled Woods
- The Medium
- The Persistence
- War Thunder
- Watch Dogs: Legion
- Wolfenstein: Youngblood
- Wrench
- Xuan-Yuan Sword VII
Perhaps what makes DLSS so appealing right now is the support it gets from game developers and, of course, their game engines. For example, Unreal Engine already has a DLSS plugin of its own, which makes integration a breeze. Another example is the Steam Proton collaboration between NVIDIA, Valve and the Linux gaming community to bring Windows-based, DLSS-supporting games to the beloved open-source Linux platform.
Key Differences between DLSS 1.0 and DLSS 2.0
Prior to DLSS 2.0’s development, DLSS 1.0 was troubled with visual artifacts, especially on moving objects. The image below from the game Control shows this perfectly, with DLSS 1.0 struggling to accurately represent the mesh behind the ventilation system. DLSS 2.0 improved upon this significantly with its new temporal feedback techniques.
Another change with DLSS 2.0 is the addition of various modes. We now have Ultra Performance (DLSS 2.1 and above), Performance, Balanced and Quality, as opposed to DLSS 1.0’s set-it-and-forget-it approach. NVIDIA added these selectable modes to aid higher-resolution upscaling and to let gamers choose what works best with their setup.
Last but certainly not least, DLSS 2.0 comes with a generalized AI network as opposed to the original’s per-game trained AI network. This not only makes more efficient use of the Tensor Cores for better performance, but also makes game integration easier. That’s why recent DLSS-supported games do not need a driver update unless one is explicitly required. Some gamers are even backporting the new DLSS DLL file into games that shipped with older DLSS versions, and it works quite well. (Source)
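To give a feel for what these modes mean in practice, here is a rough sketch that maps each mode to the internal resolution the game renders at before DLSS upscales it. The per-axis scale factors are the commonly cited values for DLSS 2.x and are an assumption on my part, not something taken from NVIDIA’s slides; individual games may deviate from them.

```python
# Commonly cited per-axis render scales for DLSS 2.x modes.
# ASSUMPTION: illustrative values only, not sourced from this article.
ASSUMED_RENDER_SCALE = {
    "Quality": 2 / 3,
    "Balanced": 0.58,
    "Performance": 0.5,
    "Ultra Performance": 1 / 3,
}


def internal_resolution(output_width: int, output_height: int, mode: str) -> tuple:
    """Return the (width, height) rendered internally for a given output size."""
    scale = ASSUMED_RENDER_SCALE[mode]
    return round(output_width * scale), round(output_height * scale)


# Example: what a 4K output would be rendered at under each mode.
for mode in ASSUMED_RENDER_SCALE:
    width, height = internal_resolution(3840, 2160, mode)
    print(f"{mode:>17}: {width}x{height}")
```

Under these assumed factors, Performance mode at 4K renders only a quarter of the pixels, which is where most of the speed-up comes from.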
So How Does DLSS 2.0 Work?
To finally explain how DLSS 2.0 works, we first have to explain what deep learning is and how it works. Basically, deep learning is a subset of machine learning, itself a branch of AI, that uses layered neural networks loosely modeled on the human brain to process data and make decisions.
Picture it with this analogy: thousands of you are tasked to identify whether something is dangerous or not, in a controlled environment. As time goes by, with that knowledge shared between thousands, the group gets better and better at judging what is dangerous and what is not. In other words, the pooled experience makes threat assessment safer and more optimized. That knowledge then gets handed to those in the field, who use it to assess in real time what is a threat and what is not.
Deep learning works in a similar way: the network trains itself over and over until it can do one specific thing accurately, such as telling which is which. NVIDIA’s deep learning technology has already made leaps of its own by using AI to colorize images, turn a 2D sketch into 3D and much more – like NVIDIA Canvas, which you can try right now.
In the case of DLSS, we are looking at the capabilities of NVIDIA’s DGX supercomputers and your RTX graphics card. Basically, the AI network gets trained on the supercomputer, and the trained result then gets used by the Tensor Cores. Let me elaborate.
During the DLSS training process, the network’s output from a 1080p rendered image is compared to an offline-rendered 16K reference image. The difference between the two is then fed back into the AI network so that it can learn and improve its results. This happens within the supercomputer thousands or maybe even millions of times in a learning loop.
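The learning loop described above can be sketched with a deliberately tiny toy. This is not NVIDIA’s network or training code: the "network" here is a single hypothetical blend weight `w`, the "frames" are short 1D lists, and the loop is plain gradient descent on the squared difference to the reference.

```python
def downsample(reference):
    """Keep every other sample to mimic a lower-resolution render."""
    return reference[::2]


def upscale(low_res, w):
    """2x upscale: copy known samples, blend neighbours for the missing ones."""
    out = []
    for i in range(len(low_res) - 1):
        out.append(low_res[i])
        out.append(w * low_res[i] + (1.0 - w) * low_res[i + 1])
    out.append(low_res[-1])
    return out


def loss_and_gradient(low_res, reference, w):
    """Mean squared error against the reference, and its derivative w.r.t. w."""
    prediction = upscale(low_res, w)
    n = len(reference)
    loss = sum((p - r) ** 2 for p, r in zip(prediction, reference)) / n
    grad = 0.0
    for i in range(len(low_res) - 1):
        error = prediction[2 * i + 1] - reference[2 * i + 1]
        # d(prediction)/dw for a blended sample is low_res[i] - low_res[i + 1]
        grad += 2.0 * error * (low_res[i] - low_res[i + 1]) / n
    return loss, grad


def train(reference, w=0.0, learning_rate=0.1, steps=200):
    """Repeat: upscale, compare to the reference, feed the error back into w."""
    low_res = downsample(reference)
    for _ in range(steps):
        _, grad = loss_and_gradient(low_res, reference, w)
        w -= learning_rate * grad
    return w


reference_frame = [float(x) for x in range(9)]  # stand-in for the 16K reference
learned_w = train(reference_frame)
print(f"learned blend weight: {learned_w:.3f}")
```

The real thing swaps the single weight for a deep network with millions of parameters, but the compare-and-feed-back cycle is the same idea.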
Once the AI network reliably outputs high-quality images close to the 16K reference, the trained network gets delivered to your RTX GPU via a driver update and/or via an update from the game itself – remember that DLL we discussed earlier? This is where the Tensor Cores of your RTX GPU kick in: armed with a network trained against high-resolution references, they can reliably reconstruct an image at the target resolution from the lower-resolution frames the game renders.
We noted earlier that DLSS 2.0 also comes with temporal feedback as one of its main differences from the original implementation. It essentially builds on Temporal Anti-Aliasing, taking motion vectors and jittered lower-resolution frames as inputs. This improves temporal stability, which in layman’s terms helps mitigate ghosting and artifacts – and boy, it does. This is also one of the reasons why you get to see the new and improved DLSS technology implemented in titles with TAA support.
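To make the jitter and motion vector part less abstract, here is a small 1D sketch of classic TAA-style accumulation. It is only an illustration of the general idea: the fixed blend factor `alpha`, the two-step jitter pattern and the hard-edge "scene" are all assumptions of mine, whereas DLSS 2.0 lets its neural network decide how history and the current frame are combined.

```python
def reproject(history, motion_vectors):
    """Fetch each pixel's previous value from where it was last frame."""
    last = len(history) - 1
    out = []
    for x, mv in enumerate(motion_vectors):
        src = min(max(x - mv, 0), last)  # clamp at the screen border
        out.append(history[src])
    return out


def accumulate(history, current, alpha=0.1):
    """Exponentially blend the new frame into the reprojected history."""
    return [(1.0 - alpha) * h + alpha * c for h, c in zip(history, current)]


def render(edge_position, jitter, width=8):
    """Point-sample a hard black/white edge at jittered pixel centres."""
    return [1.0 if (x + 0.5 + jitter) >= edge_position else 0.0
            for x in range(width)]


def run(frames=60, edge_position=4.5, width=8):
    """Accumulate jittered frames of a static scene (all motion vectors 0)."""
    jitter_pattern = (-0.25, 0.25)  # hypothetical sub-pixel offsets
    still = [0] * width
    history = None
    for frame in range(frames):
        current = render(edge_position, jitter_pattern[frame % 2], width)
        if history is None:
            history = current
        else:
            history = accumulate(reproject(history, still), current)
    return history


result = run()
print([round(v, 2) for v in result])
```

The pixel sitting on the edge flickers between black and white in any single frame, but the accumulated history settles near a mid grey, which is the anti-aliased answer. When things move, `reproject` uses the motion vectors so the history follows the object instead of smearing behind it.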
Finally, should I use it?
DLSS 2.0 should by all means be used. It is a free performance upgrade if you have a GeForce RTX graphics card. I have been experimenting with it for the past few weeks and have seen major improvements in games, especially when it comes to latency. The heavier the game is, the more you get from it – albeit with an overhead to consider. Check out the performance below with the RTX 3060 Ti, for example. DLSS in Performance mode had the best rendering time out of all the modes tested in Rainbow Six Siege. In a competitive gaming scenario, these milliseconds shaved off rendering a frame count for a lot – and that’s without even taking NVIDIA Reflex into account.
Now, for those asking why it is not compatible with GTX graphics cards: in theory, the GTX family should be able to run it, but it is not recommended because the cost of running the deep learning calculations would exceed the benefit. For example, if the Tensor Cores could process a DLSS-improved frame in 2 ms, then it would take Pascal’s CUDA Cores 16 ms, based on the 8x better throughput figure provided by NVIDIA for the Tensor Cores.
In closing, DLSS is no longer a gamble – something I would have disagreed with a few years ago. That said, I have proved myself wrong with how DLSS 2.0 has been implemented and got really humbled in the process. Nice one, NVIDIA. Frames do indeed win games.
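To put those 2 ms and 16 ms figures in perspective, here is a quick back-of-the-envelope sketch that compares them against the time available per frame. The 2 ms cost and the 8x factor come from the example above; the 60 and 120 FPS targets are just assumed reference points.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available to produce one frame at a given frame rate."""
    return 1000.0 / target_fps


def upscale_cost_ms(tensor_core_ms: float, slowdown_factor: float) -> float:
    """Estimated cost of the same work without Tensor Cores."""
    return tensor_core_ms * slowdown_factor


TENSOR_CORE_MS = 2.0  # example cost from the paragraph above
CUDA_SLOWDOWN = 8.0   # the quoted 8x throughput advantage of Tensor Cores

cuda_ms = upscale_cost_ms(TENSOR_CORE_MS, CUDA_SLOWDOWN)

# ASSUMPTION: 60 and 120 FPS are illustrative targets, not tested settings.
for fps in (60, 120):
    budget = frame_budget_ms(fps)
    print(
        f"{fps} FPS budget {budget:.2f} ms | "
        f"left with Tensor Cores: {budget - TENSOR_CORE_MS:.2f} ms | "
        f"left with CUDA only: {budget - cuda_ms:.2f} ms"
    )
```

At 60 FPS the CUDA-only cost would eat almost the whole frame, and at 120 FPS it would not fit at all, leaving nothing for actually rendering the game.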