Content deleted Content added
Homshnogen (talk | contribs) m →Spinlock: Changed heading to match other headings in the section, and tweaked the grammar for readability. This section is still loosely worded and can use more clarity. |
m →Minimization: Fixed grammar Tags: Mobile edit Mobile app edit Android app edit App section source |
||
(6 intermediate revisions by 5 users not shown) | |||
Line 34:
==Minimization==
One of the challenges for exascale algorithm design is to minimize or reduce synchronization.
Synchronization takes more time than computation, especially in distributed computing. Reducing synchronization drew attention from computer scientists for decades. Whereas it becomes an increasingly significant problem recently as the gap between the improvement of computing and latency increases. Experiments have shown that (global) communications due to synchronization on distributed computers takes a dominated share in a sparse iterative solver.<ref>{{cite journal|title=Minimizing synchronizations in sparse iterative solvers for distributed supercomputers |author=Shengxin, Zhu and Tongxiang Gu and Xingping Liu|journal=Computers & Mathematics with Applications|volume=67|issue=1|pages=199–209|year=2014|doi=10.1016/j.camwa.2013.11.008|doi-access=free|hdl=10754/668399|hdl-access=free}}</ref> This problem is receiving increasing attention after the emergence of a new benchmark metric, the High Performance Conjugate Gradient (HPCG),<ref>{{cite web|url=http://hpcg-benchmark.org/|title=HPCG Benchmark}}</ref> for ranking the top 500 supercomputers.
==Problems==
Line 72:
Barriers are simple to implement and provide good responsiveness. They are based on the concept of implementing wait cycles to provide synchronization. Consider three threads running simultaneously, starting from barrier 1. After time t, thread1 reaches barrier 2 but it still has to wait for thread 2 and 3 to reach barrier2 as it does not have the correct data. Once all the threads reach barrier 2 they all start again. After time t, thread 1 reaches barrier3 but it will have to wait for threads 2 and 3 and the correct data again.
Thus, in barrier synchronization of multiple threads there will always be a few threads that will end up waiting for other threads as in the above example thread 1 keeps waiting for thread 2 and 3. This results in severe degradation of the process performance.<ref name=":0">{{
The barrier synchronization wait function for i<sup>th</sup> thread can be represented as:
|