Chandra–Toueg consensus algorithm: Difference between revisions

Content deleted Content added
AnomieBOT (talk | contribs)
m Dating maintenance tags: {{Citation needed}}
Line 16:
 
== Why it works ==
The consensus problem requires:
 
# termination (all processes decide);
# validity (all processes decide on a value that was some process's input value); and
# agreement (all processes decide on the same value).
 
Before arguing that the algorithm above satisfies the 3 properties above, we recall the definition of an ''eventually strong failure detector''. An eventually strong failure detector is one that ''never'' identifies ''some'' specific non-faulty (or correct) process as having failed, after some initial period of confusion, and, at the same time, eventually identifies ''all'' faulty processes as failed.

''Termination'' holds because eventually the failure detector stops suspecting ''some'' non-faulty process p and eventually p becomes the coordinator. If the algorithm has not terminated before this occurs in some round r, then every non-faulty process in round r waits to receive p's preference and responds with ack(r). This allows p to collect enough acknowledgments to send decide(preference), causing every process to terminate. Alternatively, it may be that some faulty coordinator sends decide only to a few processes; but if any of these processes are non-faulty, they echobroadcast the decision to all the remaining processes, causing them to decide and terminate.
 
''Validity'' follows from the fact that every preference starts out as some process's input; there is nothing in the protocol that generates new preferences.
 
''Agreement'' is potentially the most difficult to achieve. It iscould be possible that a coordinator, in one round r, might send a decide message from some value v that propagates only to a few processes before some other coordinator, in a later round r', sends a decide message for some other value v'. To show that this does not occur, observe that before the first coordinator can send decide(v), it must have received ack(r) from a majority of processes; but, then, when any later coordinator polls a majority of processes, the later majority will overlap the earlier one and v will be the most recent value. So, any two coordinators that send out decide message send out the same value.
 
== References ==