Talk:Advanced Vector Extensions: Difference between revisions

I'm disappointed - no fast exp(), cos() etc. Unlike CUDA. Intel really has a problem. Oh, and have to get a new operating system to use the thing. Oh, really. <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/113.190.231.252|113.190.231.252]] ([[User talk:113.190.231.252|talk]]) 11:51, 21 July 2012 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->
 
:It computes the square root of the FP64 element in the lower half of xmm3, combines it with the FP64 element in the upper half of xmm2, and puts the result in xmm1. The SSE2 sqrtsd did the same, except that the first two arguments were the same register (that is, the square root was placed in the lower element of the target register and the upper element was left unaffected). See the SDM for instruction descriptions.
:More complex math functions are more expensive to implement in hardware; even division and square root are difficult. And no, the fact that such functions exist in the CUDA runtime does not mean they are actually implemented as dedicated hardware instructions. [[Special:Contributions/188.32.106.30|188.32.106.30]] ([[User talk:188.32.106.30|talk]]) 22:56, 9 July 2022 (UTC)
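::The lane behaviour described in the reply above can be sketched as a small Python model. It assumes the instruction under discussion is the three-operand VEX form (<code>vsqrtsd xmm1, xmm2, xmm3</code>). The function names and the (low, high) tuple representation of a register are illustrative only and do not come from the SDM. The model also ignores NaN handling and floating-point exceptions.

```python
import math

def sqrtsd_three_operand(src1, src2):
    """Hypothetical model of the three-operand form.

    Each register is a (low, high) pair of FP64 values.
    """
    low = math.sqrt(src2[0])  # square root of the low element of the second source
    high = src1[1]            # high element copied unchanged from the first source
    return (low, high)

def sqrtsd_two_operand(dst, src):
    """Hypothetical model of the legacy two-operand form.

    The destination doubles as the first source, so its high element survives.
    """
    return sqrtsd_three_operand(dst, src)

xmm2 = (1.0, 2.0)
xmm3 = (9.0, 16.0)
xmm1 = sqrtsd_three_operand(xmm2, xmm3)
print(xmm1)  # (3.0, 2.0)
```

::In the two-operand model the caller passes the old destination contents as the first argument, which is why the upper element appears "unaffected" in the legacy encoding.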