Talk:Advanced Vector Extensions: Difference between revisions

 
Line 1:
{{WikiProject banner shell|class=C|
{{WikiProject Computing |hardware=yes |hardware-importance=Low |importance=Low |software=y |software-importance=Low}}
}}
==Introduction==
The intro doesn't explain what AVX is for, i.e. its purpose. It refers only vaguely to "new features".
 
==AVX-512==
The section on AVX-512 looks like it has been copied from a news release. Maybe it should be a separate article. It needs to point out the distinction between the unnamed instruction set of Knights Corner and the AVX-512 instruction set of Knights Landing. The former uses an MVEX prefix and the latter uses an EVEX prefix. These two prefixes differ by a single bit, even for otherwise identical instructions. Therefore the two instruction sets are not mutually compatible, but both are backwards compatible with AVX2. Does anybody have info about the fate of the Knights Corner instruction set? Is it obsolete or will both lines be continued? [[User:Afog|Afog]] 09:48, 2 October 2013 (UTC)
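A minimal sketch of the single-bit distinction mentioned above. It assumes, from the published encodings as I understand them, that both prefixes start with byte 0x62 and that the distinguishing bit is bit 2 of the third prefix byte (set for EVEX, clear for MVEX). The function name is hypothetical, and the bit position should be checked against the manuals before relying on it.

```python
def classify_62_prefix(prefix: bytes) -> str:
    """Classify a 4-byte prefix starting with 0x62 as EVEX or MVEX.

    Assumption: bit 2 of the third byte is 1 for EVEX and 0 for MVEX.
    """
    if len(prefix) != 4 or prefix[0] != 0x62:
        raise ValueError("expected a 4-byte prefix starting with 0x62")
    return "EVEX" if prefix[2] & 0x04 else "MVEX"


# 62 F1 74 48 is the EVEX prefix of: vaddps zmm0, zmm1, zmm2
evex_prefix = bytes([0x62, 0xF1, 0x74, 0x48])
# Same bytes with only the assumed distinguishing bit cleared (illustrative only).
flipped = bytes([0x62, 0xF1, 0x74 & ~0x04, 0x48])

print(classify_62_prefix(evex_prefix))
print(classify_62_prefix(flipped))
```

Because only one bit separates the two forms, a decoder for one instruction set cannot silently accept the other, which is the incompatibility described above.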
Line 18 ⟶ 21:
 
:Note: Extended double precision is 80 bits in size, but is often stored as 128-bit for alignment. True 128-bit floating point numbers are "quadruple precision". I have no idea if AVX supports either of these things. [[User:Aaronfranke|Aaronfranke]] ([[User talk:Aaronfranke|talk]]) 19:40, 18 September 2019 (UTC)
:AVX does not support new data types, i.e. only the 32 and 64-bit floating-point numbers remain supported. F16C adds support for converting to/from 16-bit floating-point numbers. 80 or 128-bit floating-point numbers are not natively supported. [[Special:Contributions/2A02:2168:84E0:CE00:D38C:F7B5:4F24:FDE5|2A02:2168:84E0:CE00:D38C:F7B5:4F24:FDE5]] ([[User talk:2A02:2168:84E0:CE00:D38C:F7B5:4F24:FDE5|talk]]) 16:06, 1 November 2024 (UTC)
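To illustrate what converting to and from 16-bit floating point means, here is a small sketch using Python's `struct` module, whose `e` format is IEEE 754 binary16. This is a software conversion from a 64-bit Python float, not the F16C instructions themselves, and the helper name is hypothetical. It only shows the storage size and the rounding that a 16-bit format implies.

```python
import struct


def to_half_and_back(x: float) -> float:
    """Round x to IEEE 754 binary16, then widen it back to a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


print(len(struct.pack("<e", 1.5)))  # size in bytes of one 16-bit value
print(to_half_and_back(1.5))        # exactly representable in 16 bits
print(to_half_and_back(0.1))        # rounded to the nearest 16-bit value
```

Arithmetic would still be done in 32-bit or 64-bit precision after widening, which matches the point that the 16-bit type is a conversion format rather than a new arithmetic type.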
 
== why does AVX look like altivec? ==
Line 59 ⟶ 63:
 
I'm disappointed - no fast exp(), cos() etc. Unlike CUDA. Intel really has a problem. Oh, and have to get a new operating system to use the thing. Oh, really. <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/113.190.231.252|113.190.231.252]] ([[User talk:113.190.231.252|talk]]) 11:51, 21 July 2012 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->
 
:It computes a square root of the FP64 element in the lower half of xmm3, then combines it with the FP64 element in the upper half of xmm2 and puts the result in xmm1. SSE sqrtsd used to do the same, only the first two arguments were the same register (that is, the square root was placed in the lower element of the target register, upper element unaffected). See SDM for instruction descriptions.
:More complex math functions are more expensive to implement in hardware, even division and square root are difficult. And no, the fact that there are such functions in CUDA runtime does not mean they are actually implemented as dedicated hardware instructions. [[Special:Contributions/188.32.106.30|188.32.106.30]] ([[User talk:188.32.106.30|talk]]) 22:56, 9 July 2022 (UTC)
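The register behaviour described above can be modelled in a few lines. This is only a sketch: it assumes the instruction under discussion is the three-operand `vsqrtsd xmm1, xmm2, xmm3`, and it represents each register as a hypothetical `(low, high)` pair of FP64 values rather than real hardware state.

```python
import math


def vsqrtsd(src1, src2):
    """Model of the three-operand form on (low, high) FP64 pairs.

    Low result: square root of the low element of src2.
    High result: copied unchanged from the high element of src1.
    """
    return (math.sqrt(src2[0]), src1[1])


def sqrtsd(dest, src):
    """Model of the SSE two-operand form: the destination is also the first source."""
    return vsqrtsd(dest, src)


xmm2 = (1.0, 7.0)
xmm3 = (9.0, 100.0)
xmm1 = vsqrtsd(xmm2, xmm3)
print(xmm1)  # (3.0, 7.0)
```

In the two-operand model the upper element of the destination is simply carried over, which is the "upper element unaffected" behaviour mentioned for SSE.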
 
== Windows XP and AVX ==
Line 150 ⟶ 157:
In any case, my point is that if any hard numbers are to be left in that section, they should probably be given as examples specific to those individual processors and expressed in terms of base clock multipliers. Percentages make little or no sense even on chips of similar speed, as seen from the 14-core Skylake server chip downclocking immensely to stay within its low TDP. --[[User:A Shortfall Of Gravitas|A Shortfall Of Gravitas]] ([[User talk:A Shortfall Of Gravitas|talk]]) 08:46, 18 July 2021 (UTC)
 
:The essential point this section should get across is that there is a separate downclocking mechanism, distinct from thermal and power regulation. In other words, the downclocking discussed in this section is triggered solely by the execution of certain instructions. The Ice Lake reference describes that in detail. The exact amount of downclocking is configurable, but there are default clock multiplier reductions. Changing these settings is considered overclocking and is beside the point of this section. I do agree, though, that it would be more accurate to use clock multiplier reductions instead of percentages. --[[Special:Contributions/5.142.43.254|5.142.43.254]] ([[User talk:5.142.43.254|talk]]) 21:02, 8 January 2022 (UTC)
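A short worked example of why a multiplier reduction reads more cleanly than a percentage. All numbers here are hypothetical and chosen only for the arithmetic: a 100 MHz base clock, two made-up chips with multipliers of 40 and 25, and the same reduction of 5 multiplier steps on both.

```python
BCLK_MHZ = 100  # assumed base clock, hypothetical


def reduced_clock_mhz(multiplier: int, reduction: int) -> int:
    """Clock after lowering the multiplier by a fixed number of steps."""
    return (multiplier - reduction) * BCLK_MHZ


def percent_drop(multiplier: int, reduction: int) -> float:
    """The same reduction expressed as a percentage of the original clock."""
    return 100 * reduction / multiplier


# Two hypothetical chips with the same multiplier reduction of 5 steps.
for name, multiplier in [("chip A", 40), ("chip B", 25)]:
    print(name, reduced_clock_mhz(multiplier, 5), "MHz,",
          percent_drop(multiplier, 5), "% lower")
```

The identical setting shows up as 12.5% on one chip and 20% on the other, so a percentage quoted for one processor does not carry over to another.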
 
== Introduction ==
 
It says "AVX provides new features..."
 
Would it be possible to provide one or two examples of the functions that get the most benefit from this instruction set(s)? Thanks [[User:Nei1|Nei1]] ([[User talk:Nei1|talk]]) 15:17, 23 February 2023 (UTC)
 
:There is an entire section listing software that benefits from AVX. [[Special:Contributions/2A02:2168:84D9:F00:FA48:DA12:73B7:376E|2A02:2168:84D9:F00:FA48:DA12:73B7:376E]] ([[User talk:2A02:2168:84D9:F00:FA48:DA12:73B7:376E|talk]]) 17:03, 9 June 2023 (UTC)
 
== Mac OS appears to support AVX 2 ==
 
The OS has apparently supported it since the release of the "[https://developer.apple.com/games/game-porting-toolkit/ Game Porting Toolkit 2]". However, the real change to the OS that allows AVX2 instructions to execute was a modification to Rosetta 2, as evidenced by this [https://github.com/official-stockfish/Stockfish/issues/5707 GitHub thread], which uses it and finds that there is also an issue with its implementation. [[User:Sussis Amogus|Sussis Amogus]] ([[User talk:Sussis Amogus|talk]]) 21:32, 27 February 2025 (UTC)