Talk:Integer overflow: Difference between revisions

Explanation for removal: C compilers do it, so it should be described here.
 
{{WikiProject banner shell|class=Start |
{{WikiProject Computer Security |importance=}}
}}
{{merged|Arithmetic overflow}}
 
:
:I might agree that the previous one isn't perfect, but recent discussions of C compilers indicate that it is needed, in some form or other. C optimizers have been doing things that I wouldn't suggest for some time. I disagree with this one, too, but it seems that they are doing it. [[User:Gah4|Gah4]] ([[User talk:Gah4|talk]]) 05:11, 12 September 2023 (UTC)
:: Sorry, I can't understand your comment. What exactly are compilers doing, and what evidence can you offer? [[User:Zero0000|Zero]]<sup><small>[[User_talk:Zero0000|talk]]</small></sup> 12:58, 12 September 2023 (UTC)
:: Incidentally, I know of some things that look somewhat similar but are actually not. For example, "if (x + 2 > x) ''something''" can be optimized to "''something''". It is valid because if no overflow occurs then the test is true arithmetically, and if there is overflow the result of the test is up to the implementation so treating it as true is allowed. A number of such gotchas are known. My example "z = x + y; if (z < 0) ''something''" is not like that. Although the result of "z = x + y" is implementation-defined if it overflows, the meaning of "if (z < 0) ''something''" is perfectly well defined. If execution flow reaches the statement "if (z < 0) ''something''" with z < 0 but ''something'' is not evaluated, that's a bug. [[User:Zero0000|Zero]]<sup><small>[[User_talk:Zero0000|talk]]</small></sup> 13:53, 12 September 2023 (UTC)
:::
:::I haven't followed it exactly, but I am pretty sure that there are plenty of sources. There is a lot of discussion, especially as programs fail. C has always had overflow undefined, as different processors do different things. But many people know what the common processors do, and assume that. Now they get surprised. The (x+2 > x) is the easy case, but there are many that are harder to see, and that compiler optimizers seem to find. [[User:Gah4|Gah4]] ([[User talk:Gah4|talk]]) 20:26, 12 September 2023 (UTC)
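::::As a hedged illustration of the point above about overflow surprising people: one way to avoid depending on what a compiler does with an overflowing signed addition is to test the operands before adding, so the overflowing operation is never executed. The sketch below is in C; the helper name <code>checked_add</code> is hypothetical and is not taken from any source discussed here.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (name is an assumption, not a standard function).
   Returns true and stores x + y in *sum when the result fits in an int;
   returns false and leaves *sum untouched otherwise. The range test runs
   before the addition, so no signed overflow is ever executed. */
static bool checked_add(int x, int y, int *sum)
{
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y)) {
        return false;
    }
    *sum = x + y;
    return true;
}
```

::::A caller would branch on the return value instead of inspecting the sign of the sum after the fact, which keeps the check independent of how a given optimizer treats overflow.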
 
== Integer Underflow ==
 
This page had a similar passage trying to prove Integer Underflow with the same bogus sources as Arithmetic Underflow's page. In Arithmetic Underflow's Talk I have posted information discrediting the sources provided.
 
https://en.m.wikipedia.org/wiki/Talk:Arithmetic_underflow
 
I am under the belief these sections were provided by a biased individual, not formally educated in Computer Science, trying to prove a falsehood. The mere existence of a term in blogs, bug reports, section titles and circular reference should not be enough to prop a term into existence. Integer Underflow is not a quality term and is formed from a basic misunderstanding of what Integer Overflow is.
 
In Arithmetic Underflow, I removed the passage as I easily discredited the sources. Arithmetic underflow is more focused on floating point operations. Here I left it in place as there was an ambiguity section. I felt adding proper disqualification of the sources made the most sense. [[Special:Contributions/24.112.251.203|24.112.251.203]] ([[User talk:24.112.251.203|talk]]) 08:21, 18 June 2025 (UTC)
 
== Multiple problems in new text ==
 
There are multiple problems in the new version. Some highlights that don't include many examples of bad writing:
* {{tq|"The most common implementation of integers in modern computers are [[Two's complement|two's complement]]"}} — Bad grammar, plus this is only for signed integers.
* {{tq|"Unfortunately, for most [[Computer architecture|architectures]] the [[Arithmetic logic unit|ALU]] doesn't know the [[Binary number|binary representation]] is [[Signedness|signed]]."}} — This is nonsense. What does it mean for the ALU to not know something? It knows the bit pattern of memory and it knows how to execute machine instructions. The closest relevance is that there is usually only one machine instruction for integer addition, and similarly subtraction. However, for multiplication, division and many comparisons, there are separate signed and unsigned instructions. What should be said is that it is the programmer, via the program, who gives interpretations to bit patterns in memory. The hardware just obeys the interpretation given to it by the program by virtue of the choice of instructions that the program provides.
* {{tq|"Most [[Arithmetic logic unit|ALUs]] perform operations on [[Signedness|unsigned]] (positive) [[Binary number|binary numbers]]. These ALUs do not have any capability of dealing with [[Signedness|signed]] (positive and negative) numbers."}} — The examples of multiplication, division and comparison show this is not true. Even for the case of addition, although there is usually only one instruction, processors like Intel and ARM have flags that indicate signed overflow and flags that indicate unsigned overflow. Both are set or cleared and the program chooses which, if any, to query. In all cases, the computer knows perfectly well how to handle signed or unsigned integers.
* {{tq|"When an [[Two's_complement#Arithmetic_operations|operation]] occurs that results in a [[Carry (arithmetic)|carry]] past the 31-bits allocated for the number, the sign bit is overwritten."}} — Wrong. The addition (-2)+(-2)=(-4) causes overflow into the sign bit, but it isn't signed overflow. The carry-wise description of signed overflow is that the carry into the sign bit differs from the carry out of the sign bit.
* {{tq|"The ALU doesn't know it did anything wrong."}} — In 2025, computers still don't have moral sensibility (but wait another decade).
* {{tq|"Using integers of the same size as the [[Arithmetic logic unit|ALU]]'s [[register width]] will have the best performance in most applications. ..."}} — This paragraph is not about the topic of the page and should be removed.
* {{tq|"Integer Underflow is an improper term used to signify the negative side of overflow. This terminology confuses the prefix "over" in overflow to be related to the [[Sign (mathematics)|sign]] of the number."}} — "Improper" is not a defined concept, "negative side of overflow" has dubious meaning, and the second sentence is an unsourced opinion.
* The table — The footnotes have things like "most common", "typically", "default" without basis. Names like "uint32" are vendor-specific (for example they are not in the C or C++ standards). [[.NET]] naming conventions are not suitable for presentation as if they are more widely accepted. Integers with 128 bits are software provided, not hardware provided in any current cpus afaik.
* Probably more, but I'm running out of bullets. [[User:Zero0000|Zero]]<sup><small>[[User_talk:Zero0000|talk]]</small></sup> 03:15, 21 June 2025 (UTC)
 
:If there are grammar issues that you can fix, just do it.
:There are sections that are generalized versus specific, and it is unreasonable to detail every variant in summaries and origin sections. That is why the statements are not written as absolutes. I believe all of the updates are in the context of the most used architectures. If anyone is using anything else, I believe they will understand that they are a possible outlier. To include all variants would be a form of over-complication and analysis paralysis.
:Although I appreciate the snarkiness about the ALU not having intelligence, in the next paragraph the ALU does know there is something incomplete and thus returns a flag in those operations. The two paragraphs are supposed to highlight the difference in signed versus unsigned int operations. The ALU is stupid and I am trying to outline it's limitations. More information below following up on your math example.
:I think you need to review how the ALU operates. I'm not sure if you have ever taken a processor design course or designed an ALU before, but I have, after the x86 implementation, the most common. For example, the add and subtract operations in the ALU operate on registers of, let's say, 32 bits. The ALU doesn't have a Signed Int Add operator versus an Unsigned Int Add; it has a binary add. The program/implementation of the datatype places the constraints on the interpretation of the bits. Because the ALU only has a binary add operation, the program/implementation has to handle the sign logic.
:For your example the ALU can never do the (-4) + (-4) operation you gave as an example. (I'm going to use 8 bits for simplicity.) The ALU would do 1000 0100 + 1000 0100. The ALU doesn't know what the bits mean; it only knows operations like add and subtract. To be clear: int8 1000 0100 is -4, and uint8 1000 0100 is 132. You tell the ALU to add reg containing 1000 0100 and it does that in binary. This binary is not signed at the ALU level. The construct of int8 or uint8 is at the program language level, not the ALU.
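:Let me quickly explain just the half adder to you. Although I am writing this as code, it is electrical silicon transistors forming AND gates, OR gates and XOR gates at the ALU level.
:<code>
:// Half adder for one bit position: the sum bit is XOR, the carry bit is AND.
:void HalfAdder(bool[] reg1, bool[] reg2, bool[] result, int index, out bool carryOut) {
:bool left = reg1[index], right = reg2[index];
:result[index] = left ^ right; // sum bit for this position
:carryOut = left & right; // carry passed to the next position
:}
:</code>
:For each bit index the add logic is just an XOR gate and an AND gate (a full adder, which also takes the incoming carry, is two half adders plus an OR gate). So every index is added and then carries over to the next index. There is no magic here in understanding signs or integers or what the data means. The ALU does basic math operations on binary bits.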
:Also, to clarify why I am being general here: I am aware of instruction set extensions that do have separate registers for the sign, but that is not an appropriate topic for the summary or origin sections in my opinion.
:About using integers of the same size for performance. This should be basic knowledge to anyone well educated in CS, but I wanted to highlight the purpose of not jumping into using longer ints for all operations. I felt that the reader could interpret that section as, oh, to solve this issue I just need to make sure all my integers are of infinite size. If you have a better idea on how to communicate that concept I'm all ears.
:For Integer Underflow, it is hard to talk about ambiguous terminology, which is the section. It is almost impossible to find a quality citation for a made-up erroneous dubious term. This section is still improved over the last version that said, ~ Integer Underflow must be a thing because we found it in the following 6 citations. Those citations included no definitions and were partly blogs. I'd be more than happy to write a blog article so I can cite it, but what a petty thing to do.
:.NET isn't a vendor. It is a valid enough citation for a "typically and alias". This table was previously a list without any citations listing the typical ranges. I added some aliases to also highlight the difference between byte, sbyte, int, and uint. The reason I chose the .NET reference is the book Framework Design Guidelines ed.2, which outlines the debate over creating universal naming of said types. I find the request to refine a possible alias because it isn't in C/C++ a bit narrow-minded, especially when C/C++ is a much older language that only has a subset of what modern languages have. There are plenty of other citations out there, like https://users.cs.utah.edu/~germain/PPS/Topics/unsigned_integer.html but I felt that table was already heavy with citations. [[Special:Contributions/24.112.251.203|24.112.251.203]] ([[User talk:24.112.251.203|talk]]) 06:08, 21 June 2025 (UTC)
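:To make the per-bit description above concrete, here is a minimal C sketch of an 8-bit ripple-carry adder built from one-bit full adders. The names <code>full_adder</code>, <code>ripple_add8</code> and <code>as_signed8</code> are hypothetical, and the loop only models what hardware does with gates. It is meant to show that the adder produces a single bit pattern, and that reading the pattern as signed or unsigned is a separate step.

```c
#include <stdint.h>

/* One-bit full adder: sum is the XOR of the three inputs; the carry out
   is set when at least two of the inputs are 1. */
static unsigned full_adder(unsigned a, unsigned b, unsigned cin, unsigned *cout)
{
    *cout = (a & b) | (cin & (a ^ b));
    return a ^ b ^ cin;
}

/* Hypothetical 8-bit ripple-carry adder: chains eight full adders from
   bit 0 to bit 7. The carry out of bit 7 is discarded here. */
static uint8_t ripple_add8(uint8_t x, uint8_t y)
{
    uint8_t result = 0;
    unsigned carry = 0;
    for (int i = 0; i < 8; i++) {
        unsigned a = (x >> i) & 1u;
        unsigned b = (y >> i) & 1u;
        unsigned next_carry;
        unsigned s = full_adder(a, b, carry, &next_carry);
        result |= (uint8_t)(s << i);
        carry = next_carry;
    }
    return result;
}

/* Reads an 8-bit pattern as a two's complement value using plain
   arithmetic, so no implementation-defined conversion is involved. */
static int as_signed8(uint8_t v)
{
    return v >= 128 ? (int)v - 256 : (int)v;
}
```

:With this sketch, <code>ripple_add8(0xFC, 0xFC)</code> yields the pattern 1111 1000, which reads as 248 when taken as unsigned and as -8 through <code>as_signed8</code>.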
 
:: I stand by everything I wrote (except that "vendor" was a wrong word choice). I taught the basics of computer architecture for more than 40 years, which suggests that I'm not totally ignorant on the subject.
::* (-4)+(-4) is not 1000 0100 + 1000 0100 in two's complement. It is 1111 1100 + 1111 1100. (Your words "the remaining least significant bits represent the number" are only true in an indirect sense.) When these are added as 8-bit binary numbers, a carry into the sign bit occurs, and a carry out of the sign bit also occurs (but is discarded). Because those two carries are the same, no signed overflow has occurred. The result is 1111 1000 (-8) with the unsigned overflow flag set and the signed overflow flag unset. Your description and your example are both wrong.
::* "The ALU is stupid and I am trying to outline it's limitations." is not helpful. The ALU does exactly what its designers want it to do. They could have implemented separate signed and unsigned ADD instructions, but they chose to implement one instruction that does both signed and unsigned addition at the same time. '''The output from the instruction is not just the bits in the register but also the flags.''' That's not a limitation at all. The data bits plus the flags in fact indicate the full exact results of both signed and unsigned addition. In the case of operations like multiplication, the combination of signed and unsigned operations into a single instruction is impractical, so separate instructions are provided.
::* "sbyte" is a C#/.NET type name, and vbnet has "SByte". It is not a standard type name in C, C++, Java, Rust, Go, or Python. Both old and new languages there. It is misleading to present the name as if it is something more general. Similarly with some of the other type names. People who want to know C#/.NET conventions can go to those pages.
::* No, you can't cite your own blog articles unless they have your real name on them and you are an established subject matter expert, see [[WP:BLOGS]].
::* Overall this page needs a major refresh. When I find the time I will do that. [[User:Zero0000|Zero]]<sup><small>[[User_talk:Zero0000|talk]]</small></sup> 13:55, 21 June 2025 (UTC)
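::The (-4)+(-4) example and the carry-in versus carry-out rule above can be sketched in C as follows. This is an illustrative model of an 8-bit add that returns flags alongside the data bits; the struct and function names are hypothetical and do not come from any particular processor's documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the data bits plus the two flags. */
struct add8_result {
    uint8_t bits;   /* low 8 bits of the sum */
    bool carry;     /* unsigned overflow: carry out of bit 7 */
    bool overflow;  /* signed overflow: carry into bit 7 differs from carry out */
};

static struct add8_result add8(uint8_t x, uint8_t y)
{
    /* Adding only bits 0..6 exposes the carry into the sign bit. */
    unsigned low7 = (x & 0x7Fu) + (y & 0x7Fu);
    unsigned carry_in = low7 >> 7;

    /* The full 9-bit sum exposes the carry out of the sign bit. */
    unsigned full = (unsigned)x + (unsigned)y;
    unsigned carry_out = full >> 8;

    struct add8_result r;
    r.bits = (uint8_t)(full & 0xFFu);
    r.carry = carry_out != 0;
    r.overflow = carry_in != carry_out;
    return r;
}
```

::In this model, <code>add8(0xFC, 0xFC)</code> gives 1111 1000 with the carry flag set and the overflow flag clear, while <code>add8(0x7F, 0x01)</code> gives 1000 0000 with the carry flag clear and the overflow flag set.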