Bully algorithm

In [[distributed computing]], the '''bully algorithm''' is a method for dynamically [[Leader election|electing]] a [[Distributed computing#Election|coordinator]] or leader from a group of distributed computer processes. The process with the highest process ID number from amongst the non-failed processes is selected as the coordinator.
 
==Assumptions==
 
The algorithm assumes that:<ref>{{cite book |last1=Coulouris |first1=George F. |last2=Dollimore |first2=Jean |last3=Kindberg |first3=Tim |title=Distributed Systems: Concepts and Design |date=2000 |publisher=Addison Wesley |isbn=978-0201619188 |edition=3rd}}</ref>
* the system is synchronous and timeouts identify process failure.
* processes may fail at any time, including during execution of the algorithm.
* a process fails by stopping and returns from failure by restarting.
* there is a failure detector which detects failed processes.
* message delivery between processes is reliable.
* each process knows its own process ID and address, and those of every other process.
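
The timeout-based failure detection assumed above can be illustrated with a minimal sketch in Python. The function name <code>suspected_failed</code>, the <code>last_heard</code> table and the <code>timeout</code> parameter are hypothetical and chosen only for illustration; the sketch assumes that a known upper bound on response time exists, as the synchronous model implies.

```python
def suspected_failed(last_heard, now, timeout):
    """Return the IDs of processes whose last message is older than `timeout`.

    last_heard maps a process ID to the time (in seconds) its last message
    arrived. In a synchronous system a silent process can safely be treated
    as failed once the bound `timeout` has elapsed.
    """
    return {pid for pid, t in last_heard.items() if now - t > timeout}


# Process 1 has been silent for 10 s, process 2 for only 0.5 s.
print(suspected_failed({1: 0.0, 2: 9.5}, now=10.0, timeout=2.0))
```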
 
==Algorithm==
The [[algorithm]] uses the following message types:
* Election Message: Sent to announce an election.
* Answer (Alive) Message: Responds to the Election message.
* Coordinator (Victory) Message: Sent by the winner of the election to announce victory.
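
The three message types could be modelled as follows. This is a minimal sketch in Python, not a prescribed wire format: the names <code>MessageType</code>, <code>Message</code> and <code>reply_to</code> are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MessageType(Enum):
    ELECTION = "election"        # announces an election
    ANSWER = "answer"            # "alive" reply to an Election message
    COORDINATOR = "coordinator"  # victory announcement from the winner


@dataclass(frozen=True)
class Message:
    kind: MessageType
    sender: int    # process ID of the sender
    receiver: int  # process ID of the recipient


def reply_to(msg: Message) -> Optional[Message]:
    """Return the Answer owed for an Election from a lower-ID sender, else None."""
    if msg.kind is MessageType.ELECTION and msg.sender < msg.receiver:
        return Message(MessageType.ANSWER, sender=msg.receiver, receiver=msg.sender)
    return None
```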
 
When a process {{var|P}} recovers from failure, or the failure detector indicates that the current coordinator has failed, {{var|P}} performs the following actions:
 
# If {{var|P}} has the highest process ID, it sends a Victory message to all other processes and becomes the new Coordinator. Otherwise, {{var|P}} broadcasts an Election message to all other processes with higher process IDs than itself.
# If {{var|P}} receives no Answer after sending an Election message, then it broadcasts a Victory message to all other processes and becomes the Coordinator.
# If {{var|P}} receives an Answer from a process with a higher ID, it sends no further messages for this election and waits for a Victory message. (If there is no Victory message after a period of time, it restarts the process at the beginning.)
# If {{var|P}} receives an Election message from another process with a lower ID, it sends an Answer message back and, if it has not already started an election, it starts the election process at the beginning by sending an Election message to higher-numbered processes.
# If {{var|P}} receives a Coordinator message, it treats the sender as the coordinator.
Note that if {{var|P}} receives a Victory message from a process with a lower ID number, it immediately initiates a new election. This is how the algorithm gets its name: a process with a higher ID number will bully a lower-ID process out of the coordinator position as soon as it comes online.
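
The steps above can be sketched as a small single-threaded simulation in Python. This is an illustrative model under simplifying assumptions, not a networked implementation: the function name <code>bully_election</code> and its parameters are hypothetical, a crashed process is modelled as one that never answers (standing in for a timeout), no process crashes or recovers during the run, and the initiator is assumed to be alive.

```python
from collections import deque


def bully_election(all_ids, alive, initiator):
    """Simulate one election; return (coordinator_id, log of (kind, src, dst))."""
    alive = set(alive)
    log = []
    started = set()             # processes that have already run their election
    pending = deque([initiator])
    coordinator = None
    while pending:
        p = pending.popleft()
        if p in started:
            continue            # step 4: do not start a second election
        started.add(p)
        got_answer = False
        for q in (x for x in all_ids if x > p):
            log.append(("election", p, q))
            if q in alive:      # a crashed process never answers (timeout)
                log.append(("answer", q, p))
                got_answer = True
                pending.append(q)   # step 4: q starts its own election
        if not got_answer:
            # steps 1 and 2: no live process has a higher ID, so p wins
            for q in all_ids:
                if q != p:
                    log.append(("coordinator", p, q))
            coordinator = p
        # step 3: otherwise p stays silent and waits for a Victory message
    return coordinator, log


# Process 5 (the old coordinator) has crashed; process 2 notices and starts.
winner, messages = bully_election([1, 2, 3, 4, 5], alive={1, 2, 3, 4}, initiator=2)
print(winner)
```

In this run process 4 becomes coordinator, because it is the highest-numbered process that is still alive.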
 
===Analysis===
 
====Safety====
The safety property expected of [[leader election]] protocols is that every non-faulty process either elects a process {{var|Q}}, or elects none at all. All [[process (computing)|processes]] that elect a leader must decide on the same process {{var|Q}} as the leader. The Bully algorithm satisfies this property (under the system model specified), and at no point in time is it possible for two processes in the group to have a conflicting view of who the leader is, except during an election. To see this, suppose the contrary: then there are two processes {{var|X}} and {{var|Y}} that both sent the Coordinator (victory) message to the group, so {{var|X}} and {{var|Y}} must also have sent each other victory messages. But this cannot happen, since before sending the victory message, Election messages would have been exchanged between the two, and the process with the lower process ID of the two would never send out victory messages. This contradiction shows that the assumption of two leaders in the system at any given time is false, and hence that the bully algorithm is safe.
 
====Liveness====
[[Liveness]] is also guaranteed in the [[synchronous]], crash-recovery model. Consider the would-be leader failing after sending an Answer (Alive) message but before sending a Coordinator (victory) message. If it does not recover before the set timeout on lower ID processes, one of them will become leader eventually (even if some of the other processes crash). If the failed process recovers in time, it simply sends a Coordinator (victory) message to all of the group.
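
The liveness argument can be illustrated with a deliberately abstract sketch in Python. It assumes that each completed election round is won by the highest live ID, and that a would-be leader listed in the hypothetical parameter <code>crashes_before_victory</code> fails after answering but before announcing victory and does not recover; the function name <code>elect_until_stable</code> is likewise an assumption. At least one process must survive.

```python
def elect_until_stable(alive, crashes_before_victory):
    """Repeat the election until a candidate survives to announce victory.

    Returns (coordinator_id, number_of_rounds).
    """
    alive = set(alive)
    rounds = 0
    while True:
        rounds += 1
        candidate = max(alive)           # the highest live ID wins the round
        if candidate in crashes_before_victory:
            alive.discard(candidate)     # lower-ID processes time out and restart
            continue
        return candidate, rounds


# Processes 4 and 3 each crash before sending their Coordinator message.
print(elect_until_stable({1, 2, 3, 4}, crashes_before_victory={4, 3}))
```

Each failed round removes one process, so with finitely many processes the loop terminates as long as some process stays alive.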
 
====Network bandwidth utilization====
{{see also|network bandwidth}}
Assuming that the bully algorithm messages are of a fixed (known, invariant) size, the largest number of messages is exchanged in the group when the process with the lowest ID initiates an election. This process sends (N−1) Election messages, the next higher ID sends (N−2) messages, and so on, resulting in <math>\Theta\left(N^2\right)</math> Election messages. There are also <math>\Theta\left(N^2\right)</math> Alive messages and <math>\Theta\left(N\right)</math> Coordinator messages, making the overall number of messages exchanged in the worst case <math>\Theta\left(N^2\right)</math>.
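
The worst-case count can be checked numerically. The sketch below is in Python and assumes that all N processes are alive, that the lowest ID initiates, and that every process runs its election exactly once; the function name <code>worst_case_messages</code> is hypothetical. Under these assumptions the totals are N(N−1)/2 Election messages, the same number of Alive messages, and N−1 Coordinator messages, or N²−1 messages overall.

```python
def worst_case_messages(n):
    """Count messages when the lowest of n live processes starts the election."""
    # Process i (numbered 1..n) sends an Election message to the n - i higher IDs.
    elections = sum(n - i for i in range(1, n + 1))
    answers = elections      # every Election reaches a live process and is answered
    coordinator = n - 1      # the winner notifies every other process
    return elections, answers, coordinator


for n in (2, 5, 10):
    e, a, c = worst_case_messages(n)
    assert e == n * (n - 1) // 2       # closed form of (N-1) + (N-2) + ... + 0
    assert e + a + c == n * n - 1      # quadratic growth overall
```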
 
== See also ==
*[[Leader election]]
*[[Chang and Roberts algorithm]]
 
== References ==
* Witchel, Emmett (2005). [http://www.cs.utexas.edu/users/witchel/372/lectures/25.DistributedCoordination.ppt "Distributed Coordination"]. Retrieved May 4, 2005.
* Hector Garcia-Molina, Elections in a Distributed Computing System, IEEE Transactions on Computers, Vol. C-31, No. 1, January (1982) 48–59
* L. Lamport, R. Shostak, and M. Pease, [http://research.microsoft.com/en-us/um/people/lamport/pubs/byz.pdf "The Byzantine Generals Problem"] ACM Transactions on Programming Languages and Systems, Vol. 4, No. 3, July 1982.
 
==External links==
*{{Commonscatinline}}
[[Category:Distributed algorithms]]
[[Category:Graph algorithms]]