Talk:Deep Learning Super Sampling

{{Talk header}}
{{WikiProject banner shell|class=C|collapsed=yes|1=
{{WikiProject Computing|importance=Low|hardware=yes|hardware-importance=Low|software=yes|software-importance=Low|science=yes|science-importance=Low}}
{{WikiProject Computer graphics|importance=High}}
{{WikiProject Video games|importance=Low}}
{{WikiProject Technology}}
{{WikiProject Artificial Intelligence}}
}}
 
== Article is self-contradictory, lacks verifiability, uses vague, non-descriptive language and is written like an advertisement ==
:There is still a [contradictory] tag in the article. I am trying to focus attention on the issues while staying uninvolved so am asking if ''contradictory'' still applies. The justification for the tag appears to be "{{tq|Contradicts itself by both claiming that the technology is based on deep learning while simultaneously claiming a game with the technology was released without deep learning.}}" The article says "Nvidia advertised DLSS" at launch in 2018 and a particular game was issued in 2019 where deep learning was not used. Is the tag justified? [[User:Johnuniq|Johnuniq]] ([[User talk:Johnuniq|talk]]) 07:16, 30 May 2020 (UTC)
:: Hello, I tried to discuss the remarks about [[#Remark about Deep learning / Not Deep learning issue|deep learning]] in this talk page [[User:Hervegirod|Hervegirod]] ([[User talk:Hervegirod|talk]]) 13:13, 30 May 2020 (UTC)
::Yes, the article is still contradictory. The very first sentence of the article states that the technology is deep learning technology "{{tq|(DLSS) is a technology developed by Nvidia, using deep learning}}", whereas the release history states that the "{{tq|2.0 (first iteration)}}" does "{{tq|not use machine learning}}", which [[Media:AI-ML-DL.png|by definition includes deep learning]]. Also, I would not propose deletion anymore. [[Special:Contributions/62.248.185.87|62.248.185.87]] ([[User talk:62.248.185.87|talk]]) 19:30, 30 May 2020 (UTC)
::As of now / after [[User:Dicklyon|Dicklyon]]'s revision 959815010 added "{{tq|(in some versions)}}" to the article, it doesn't contradict itself, but I personally don't feel confident in removing the tags, since I believe it will take a miracle for the remark to stay in the article long enough to be clarified with a reliable, sourced reference rather than just being edit-warred out of the article. I base this on the experience of how removing a single sentence entirely contradicted by its citation from this article got me character assassinated on my user talk and a mod talk page, how my revisions concerning a 2716-word referenced article were undone seconds after making them while none of the concerns were or ever have been addressed, and worst of all how these behaviors only led to actions against me. [[Special:Contributions/62.248.185.87|62.248.185.87]] ([[User talk:62.248.185.87|talk]]) 21:07, 30 May 2020 (UTC)
DLSS 2.0 is still based on AI. What made the editor think DLSS 2.0 for Control doesn't use deep learning? It just says that DLSS 2.0 doesn't need specific training for each game.
[[Special:Contributions/103.27.230.134|103.27.230.134]] ([[User talk:103.27.230.134|talk]]) 04:15, 19 August 2020 (UTC)
 
== Contradictions fixes ==
Searching for the phrase in Google yields five pages of results and a total of 48 links.
* The first link gets you straight back to this article
* 26 are copies of this article, such as Wikipedia mirrors or short copies for [[Spamdexing|SEO spam]] purposes
* 5 do not use the words together, but Google picked them up due to a line break or similar (such as in a formatted document)
* 3 are pure [[Spamdexing|SEO spam]] without being copies
* All in all '''73% of the results are the article or copies of it, SEO spam, or mistaken results''' (arithmetic spelled out right after this list)
* 2 of them are the same comment, referring to physical clearance from a connector
* 1 refers to video card memory usage in an unclear manner; either way, an unrelated concept
* 1 refers to saved resource usage in a [[Headless software|headless installation]]
* 1 refers to non-utilized performance of a video card while the video card is still performing something
* 1 refers to delay in a video stream caused by a video card
* 1 refers to the general concept of running software consuming performance
* 1 is about someone's computer crashing if their video card is stressed
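
(For the 73% figure above: <math>\tfrac{1 + 26 + 5 + 3}{48} = \tfrac{35}{48} \approx 73\%</math>, counting the article itself, its copies, the pure SEO spam, and the mistaken hits.)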
 
There's a single reference in a book about the 3D modelling software Maya that seems to use the phrase the way this article does; it appears 4 times in the results. But even that doesn't really specify what it means beyond lost performance.
::Thanks, although as you would know, the original is standard terminology: [[Overhead (computing)]]. [[User:Johnuniq|Johnuniq]] ([[User talk:Johnuniq|talk]]) 06:59, 30 May 2020 (UTC)
:::Yes, I'm aware. It doesn't seem to be the right term/concept here, though I note that a source did use it. [[User:Dicklyon|Dicklyon]] ([[User talk:Dicklyon|talk]]) 15:47, 30 May 2020 (UTC)
:::It's not really the same as the article's examples. For example, based on that article, "protocol overhead" is the overhead taken by a protocol's control and signalling data. So if data is sent in a bitstream with no protocol control and signalling data, then there's no protocol overhead. The phrase is almost self-explanatory; all one needs to know is what protocol data is. In stark contrast, knowing what a video card is doesn't help you understand what the phrase "video card overhead" means, and removing the video card certainly wouldn't be the way to get rid of whatever it is that is referred to by the phrase here. It certainly isn't self-explanatory and doesn't seem to make much sense. [[Special:Contributions/62.248.185.87|62.248.185.87]] ([[User talk:62.248.185.87|talk]]) 18:55, 30 May 2020 (UTC)
::::Overhead is talking about the amount of work the video card has to do, and that greatly affects the speed at which graphics are drawn and the power used (heat that needs to be dissipated). [[User:Johnuniq|Johnuniq]] ([[User talk:Johnuniq|talk]]) 03:54, 31 May 2020 (UTC)
 
== Remark about Deep learning / Not Deep learning issue ==
I just read the contradict tag added again by [[User talk:62.248.185.87]] with this summary: "''If they called it DLSS before it had deep learning then the article can't say it's a technology based on deep learning''". It's not me who called it "Deep learning super sampling" but Nvidia; furthermore, the sources I found state that:
* It used deep learning for DLSS 1.0, and learning was processed on a per game basis. So when Nvidia first promoted their technology, it used deep learning
* The first version did not work well, so Nvidia did not use deep learning anymore for the first iteration of DLSS 2.0. I'm writing ''first iteration'' because it seems that they did not change the release number of their technology between this one and the current one. However the sources explain why this version was very different from the "current" DLSS 2.0 one
* The current version uses deep learning again, this time with a generic learning not specific to a game, for the current iteration of DLSS 2.0
I tried to clearly say and source that in the article, especially in the "Release history" section. It's not my fault if Nvidia was not always clear in the promotion of their own technology. Maybe the wording could be improved to make it clearer (it seems clear to me in this article, but I am the one who originally wrote the "Release history" section) [[User:Hervegirod|Hervegirod]] ([[User talk:Hervegirod|talk]])
 
:In summary, what you have described is that DLSS is not necessarily based on deep learning. Thus describing it as a technology which uses deep learning, as is done in the lead section, is not correct. Nvidia's failure to use the DLSS name only for technologies that use deep learning is perhaps a problem for their marketing team, but not for Wikipedia, and it means this article can't say DLSS is based on deep learning in the introduction.
 
:Also I don't think anyone claimed it's your fault. My edit summary was in response to earlier edit by [[User:Dicklyon|Dicklyon]], which removed the tag with the commentary "''yes they might have called it DLSS even before it had DL''" - hence the response "''If they called it DLSS before it had deep learning then the article can't say it's a technology based on deep learning''".
 
:If Wikipedia allows technologies which are not necessarily deep learning based to be introduced as deep learning, then soon half of the technology articles will be saying they are deep learning technologies. [[Special:Contributions/62.248.185.87|62.248.185.87]] ([[User talk:62.248.185.87|talk]]) 16:57, 30 May 2020 (UTC)
 
:: That's not up to us to decide; many sources say that the first and last iterations were / are deep learning based. [[User:Hervegirod|Hervegirod]] ([[User talk:Hervegirod|talk]]) 17:31, 30 May 2020 (UTC)
 
::: Whether the last and first iteration were deep learning based was never under contention. You yourself added the remark that a version was not.
 
::: Contradictory should remain as long as the article contradicts itself. Currently it does so by introducing DLSS as a deep learning technology and then later says there's a version which isn't. [[Special:Contributions/62.248.185.87|62.248.185.87]] ([[User talk:62.248.185.87|talk]]) 17:44, 30 May 2020 (UTC)
 
== DLSS 2.1 ==
 
Information about DLSS 2.1 should be added. It apparently adds an ultra performance mode as a major differentiating factor from previous versions. [[User:Svetroid|Svetroid]] ([[User talk:Svetroid|talk]]) 20:31, 8 December 2020 (UTC)
 
== These two paragraphs... smh ==
 
I'll start, before somebody goes fanboy on me, by saying that I couldn't care less which of the two video card companies is "winning" the game of screwing people into buying new cards / switching brands, but I have to comment on this part:
 
<blockquote>
Tensor Cores are available since the Nvidia Volta GPU microarchitecture, which was first used on the Tesla V100 line of products. Their specificity is that each Tensor Core operates on 16 bits floating point 4 x 4 matrices, and seem to be designed to be used at the CUDA C++ level, even at the compiler level.
</blockquote>
Aside from the bizarre grammar, "operates on..." is a given; the real specificity (and how there are so many of them) is that they can do approximately jack squat besides multiply, add, load, and store elements to produce said matrices. This has no immediate appeal to the consumer market.
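
For concreteness (this comes from Nvidia's own Volta descriptions, not from the article): the single operation each tensor core implements is the fused 4x4 matrix multiply-accumulate
<math>D = A B + C, \qquad A, B \in \mathrm{FP16}^{4 \times 4}, \quad C, D \in \mathrm{FP16}^{4 \times 4} \text{ or } \mathrm{FP32}^{4 \times 4},</math>
so anything larger, like the 16x16 tiles further down, has to be assembled from many of these.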
 
Fabrication of a chip for something like the V100 produces lots of defective dies, which they needed something to do with, since enterprise customers won't accept a board where 1/4 of the SMs won't produce accurate results and another 1/4 won't work at all when they're doing something critical like drug discovery or fusion plasma simulation; hence the invention of features like this, where the LSB being wrong in an element 1 time in 10,000 won't be noticed. Slap some RGB lights on it and you can market your e-waste to gamers. AMD and Intel do the same thing, so I'm not picking on anybody here.
 
"Designed to be used at the CUDA C++ level" is kind of a given since nVidia wants people locked in to CUDA, which again, everybody else tries to do, although AMD at least stuck with open standards.
 
"or even the compiler level" means absolutely nothing. You don't "use" the machine languages you're compiling for at the compiler level. You compile high level languages through possible multiple stages until you're at either target-specific assembly or binary machine code. Every processor feature in history (since HLLs have existed) was designed to be used at the compiler level, because writing programs in hexadecimal is painful even if you know the instruction set encoding like the back of your hand. Most people don't know or bother with learning assembly, either, so a feature that can't be implemented as a compiler feature / optimization might as well not exist. This is why Itanium was a failure, but I've already ranted enough and won't go into that.
 
Features like the tensor units, which would require an expert at multithreaded programming to program manually and require constantly updated system state to run optimally, aren't even left to the compiler, except to simplify things a little for the end user, who will then use a Python module that interfaces with the C++ CUDA libs because all those semicolons in C++ are too confusing for them. Then they'll release a pixel art metroidvania that requires TensorFlow, only runs on one processor core, and can't maintain 30fps at 1080p, and complain about computers being too slow. See below for how the instructions actually work.
 
<blockquote>
The Tensor Cores use CUDA Warp-Level Primitives on 32 parallel threads to take advantage of their parallel architecture. A Warp is a set of 32 threads which are configured to execute the same instruction.
</blockquote>
 
Sorta... it's an insanely verbose instruction (but a warp primitive in CUDA compiles down to multiple instructions, as below) that's sent to the warp scheduler to be executed as a ton of individual instructions on 32 hopefully optimally scheduled and localized (no guarantees are made) tensor subunits, then reassembled into a result when they finish. This cuts down on silicon per multiply-add subunit of the tensor cores (no need to do anything but multiply and accumulate, and a couple of simple bitwise ops on integers), but as far as I can tell you're SoL if you want a 4x4 matrix multiply-add; the smallest "shape" they list is 8x4. There is no way, afaik, to individually address these sub-processing units from anywhere ''but'' the warp scheduler on chip, except maybe overriding data distribution. This is what the assembly looks like for a 16x16 matrix multiplication, which requires 32 tensor units, each scheduled internally to do the required series of 4x4 operations:
<syntaxhighlight lang="text">
.global .align 32 .f16 A[256], B[256];
.global .align 32 .f32 C[256], D[256];
.reg .b32 a<8> b<8> c<8> d<8>;
wmma.load.a.sync.aligned.m16n16k16.global.row.f16 {a0, a1, a2, a3, a4, a5, a6, a7}, [A];
wmma.load.b.sync.aligned.m16n16k16.global.col.f16 {b0, b1, b2, b3, b4, b5, b6, b7}, [B];
wmma.load.c.sync.aligned.m16n16k16.global.row.f32 {c0, c1, c2, c3, c4, c5, c6, c7}, [C];
wmma.mma.sync.aligned.m16n16k16.row.col.f32.f32 {d0, d1, d2, d3, d4, d5, d6, d7}, {a0, a1, a2, a3, a4, a5, a6, a7}, {b0, b1, b2, b3, b4, b5, b6, b7}, {c0, c1, c2, c3, c4, c5, c6, c7};
wmma.store.d.sync.aligned.m16n16k16.global.col.f32 [D], {d0, d1, d2, d3, d4, d5, d6, d7};
 
</syntaxhighlight><ref>{{cite web|url=https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions|title=PTX ISA v7.2: Section 9.7.13: Warp Level Matrix Multiply-Accumulate Instructions}}</ref>
 
So that's the level it's *designed* to be used at. Since most humans can't smoke enough meth without dying to want to write it like that, compiler support for the PTX instructions is constantly worked on and a CUDA API was created. I went through that pedantic mess because the article's wording implies that being designed for use in C++ (it was either that or C) is somehow notable, and it goes on with "wow, you can even use it at the ''compiler'' level" as if that's some super elite feature or even makes any sense. The miracle would be if it were ''supported'' by any compilers that nVidia employees didn't add the feature to. As a former compiler engineer I'd rather mate with a garbage disposal than have to implement just the backend encoding for their instruction set, let alone an assembler / assembly printer or any kind of language support.
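
To make "a CUDA API" concrete: here's a minimal sketch of the same warp-wide 16x16x16 multiply-accumulate through the CUDA C++ WMMA interface in <code>mma.h</code>, which is roughly what the PTX above is generated from. The kernel name and launch setup are just illustrative assumptions; it needs an sm_70+ device and a launch with at least one full warp, e.g. <code>wmma_16x16x16<<<1, 32>>>(A, B, C, D)</code>.
<syntaxhighlight lang="cuda">
// Sketch only: one warp computes D = A*B + C on 16x16 tiles (A, B in half; C, D in float).
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void wmma_16x16x16(const half *A, const half *B, const float *C, float *D) {
    // Fragments are warp-wide: every thread in the warp holds a slice of each tile.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    // Cooperative loads; 16 is the leading dimension of each source matrix.
    wmma::load_matrix_sync(a_frag, A, 16);
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::load_matrix_sync(acc_frag, C, 16, wmma::mem_row_major);

    // Warp-level multiply-accumulate; the hardware farms this out to the tensor cores.
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);

    // Store the 16x16 result tile, column-major to mirror the PTX store above.
    wmma::store_matrix_sync(D, acc_frag, 16, wmma::mem_col_major);
}
</syntaxhighlight>
Note that nothing in there addresses an individual tensor core; you only ever get the warp-granularity wmma operations, which is exactly the point above.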
 
The big snafu, as far as I can tell, is that GPUs don't do conditional speculative execution in the normal sense. If there are multiple code paths, they execute all of them at once and throw away the results for anything not needed. This makes the AI / deep learning sound fancy, but it's really a way to minimize the number of times that happens on a user's machine by pre-deciding most things and letting the computer run the resulting state machine. The deep learning part of it never happens on the GPU of the DLSS end user; it's just running the net Nvidia trained, which the article got right... and I'd be interested to see how much of the tensor core capacity is in ''actual'' use with features like this. I'll quit my rant here and fix the grammar and try to improve the paragraph a bit later, but I couldn't help myself; marketing speak, and the current GPU market price disaster that resulted from it, anger me.
 
Cheers. [[User:A Shortfall of Gravitas (other machine)|A Shortfall of Gravitas (other machine)]] ([[User talk:A Shortfall of Gravitas (other machine)|talk]]) 08:22, 4 April 2021 (UTC)
 
{{reflist-talk}}
 
== Article still contains inaccuracies, contradictory claims, and marketing language ==
 
Currently the second paragraph claims that "As of June 2021, this technology is available exclusively on GeForce RTX 20 and GeForce RTX 30 series of graphics cards." Aside from being an obviously false claim, it seems to me to be a clear violation of [[MOS:PUFFERY]]. Either way, this untrue claim has stood in the article since its creation, when Hervegirod put it in. It's also contradicted later in the article. [[Special:Contributions/62.248.185.4|62.248.185.4]] ([[User talk:62.248.185.4|talk]]) 20:31, 7 May 2022 (UTC)
 
== "Tensor cores", "Tensor Cores", or "tensor cores"? ==
 
How should it be spelled? "Tensor cores", "Tensor Cores", or "tensor cores"?
 
--[[User:Mortense|Mortense]] ([[User talk:Mortense|talk]]) 13:31, 25 November 2022 (UTC)
 
== DLSS 1.9 to this day is complete conjecture ==
 
Neither Nvidia nor Remedy have ever confirmed that such a thing as DLSS 1.9 exists. When Remedy implemented DLSS 1.0 in the game Control, it surpassed the quality of similar implementations in other games.
 
After the release of DLSS 2.0, someone coined the previous implementation "DLSS 1.9", which spread like wildfire.
 
But it was only coined because of its qualitative similarity to 2.0, which never excluded Remedy and Nvidia working together to provide a better implementation (and better machine learning training).
 
That does not imply it being a different implementation or version of DLSS 1.0 at all, only that Remedy's own TAA implementation and their implementation of DLSS 1.0 were an improvement.
 
But the same could have been said about the implementations of DLSS 1.0 in Metro Exodus and, for example, Battlefield V, both of which had implementations so vastly differing in quality that one could conclude they were different versions of DLSS 1.0 when they were not.
 
This self-proclaimed fact of DLSS 1.9 being an actual iteration differing from DLSS 1.0 comes from the perception that DLSS 1.9 is/was the same (or almost the same) algorithm as DLSS 2.0 but ran on shader cores instead of tensor cores.
 
This thinking is fueled by the presumption (and common disdain for Nvidia's marketing and sales tactics) that Nvidia forced obsolescence by making DLSS 2.0 work only on RTX GPUs when it could have worked on all GPUs as a shader-based implementation.
 
That is completely unfounded, biased, and pure speculation.
 
I believe much of this is based on a wishful perception by opponents of Nvidia and an unfortunate miswording or misunderstanding by Techspot.com, who likely coined the term, or at the very least picked it up in a more official journalistic capacity and therefore helped its propagation (see https://www.techspot.com/article/1992-nvidia-dlss-2020/).
 
The only facts that remain are that only DLSS 1.0 and DLSS 2.0 ever officially existed as upscaling techniques, and that among releases of games with DLSS 1.0 support we always had vast differences in image quality.
 
These differences have never been proven to be anything but the quality of implementation and the quality of game-specific machine learning data.
 
That, as we know, was one major factor in the differing quality of DLSS 1.0 implementations: the fact that DLSS 1.0 had to be trained on a per-game basis.
 
 
I therefore conclude that, until Nvidia ever officially states otherwise, DLSS 1.9 never existed.
 
Remedy's implementation of DLSS 1.0 simply outshone other implementations, either through the amount of effort put into their own engine and/or the amount of game-specific machine-learning training, which made people believe it was a different version entirely. [[User:Fnna509|Fnna509]] ([[User talk:Fnna509|talk]]) 11:49, 13 August 2023 (UTC)
 
== DLSS 3 is NOT Exclusive to Ada Lovelace NVIDIA GPUs. ==
 
DLSS 3 has 2 main components:<br>
1. Upscaling<br>
2. Frame Generation<br>
While DLSS Frame Generation is indeed exclusive to Ada Lovelace NVIDIA GPUs, DLSS 3 ''(i.e. the third generation of DLSS)'' upscaling <u>isn't</u>. [[Special:Contributions/85.198.63.121|85.198.63.121]] ([[User talk:85.198.63.121|talk]]) 16:07, 30 October 2023 (UTC)
 
== Family of upscaling technologies? ==
 
Is it really a family of upscaling technologies, or rather just one technology with several updated iterations? In my opinion, "family" implies that there are several different versions for corresponding compatible devices, possibly each with their own advantages and disadvantages, which is not the case. It is just the same technology updated with new, optimized iterations over time. Sure, Frame Generation serves a different purpose and is only compatible with RTX 40-series cards, but it's just an optional function of DLSS 3.5, which is also explained in the article. The upscaling technology of DLSS 3.5 itself hasn't branched off into different subversions depending on the compatibility of the graphics card used, as the term "family" suggests. [[User:Maxeto0910|Maxeto0910]] ([[User talk:Maxeto0910|talk]]) 16:33, 16 September 2024 (UTC)
 
:I now [https://en.m.wikipedia.org/w/index.php?title=Deep_Learning_Super_Sampling&diff=1273058950&oldid=1273058758 changed the wording] as there hasn't been opposition to it for quite some months. [[User:Maxeto0910|Maxeto0910]] ([[User talk:Maxeto0910|talk]]) 13:38, 31 January 2025 (UTC)
 
== DLSS 4 exclusive to RTX 50 series ==
 
Looking at the benchmarks on the [https://www.nvidia.com/en-me/geforce/graphics-cards/50-series/ official website of the series], it seems like DLSS 4.0 will be exclusive to the GPUs of the RTX 50 series, as the 40 series GPUs use FG while the 50 series GPUs use MFG in the comparison, likely to highlight the new exclusive feature. [[User:Maxeto0910|Maxeto0910]] ([[User talk:Maxeto0910|talk]]) 04:56, 7 January 2025 (UTC)
 
:Okay, it's explained in [https://m.youtube.com/watch?v=qQn3bsPNTyI this video]. DLSS 4 is a set of multiple features, most of which are also available on the 40 series GPUs and some even on older ones, but MFG will indeed only come to the GPUs of the 50 series. [[User:Maxeto0910|Maxeto0910]] ([[User talk:Maxeto0910|talk]]) 05:32, 7 January 2025 (UTC)
 
== Tensor Cores as separate page? ==
 
For some reason, Tensor Cores seems to redirect to this page. It seems like the Tensor Core as a technology has grown enough to be considered a separate thing from DLSS. [[User:Anothercat613|Anothercat613]] ([[User talk:Anothercat613|talk]]) 09:59, 3 April 2025 (UTC)
 
:[[Tensor Core]] redirects to [[Deep_Learning_Super_Sampling#Architecture]]. There is not enough material here at the moment to justify a [[WP:SPLIT]]. ~[[User:Kvng|Kvng]] ([[User talk:Kvng|talk]]) 15:55, 3 April 2025 (UTC)
::Yeah. The section needs a lot of work and is rather outdated. Perhaps a split can be done after more material is added. Or someone can just start a new page about tensor cores. [[User:Anothercat613|Anothercat613]] ([[User talk:Anothercat613|talk]]) 17:48, 3 April 2025 (UTC)