Excellent questions. For starters, let's consider the following:
- The cluster quorum ensures that only one subset of nodes ("a quorum") can run the cluster services to avoid "split-brain" scenarios.
- Quorum is based on votes. Each node typically gets 1 vote, and a witness (disk, file share, or cloud witness) can add one extra vote.
- A majority of votes is needed for the cluster to stay online.
- Do you need a quorum witness in this 3-node cluster?
  - Technically, no: 3 nodes with 1 vote each give an odd vote count, so the cluster can form quorum without one.
  - However, because 2 nodes are in one site and 1 node is in the other, a witness is highly recommended to help maintain quorum in failover scenarios.
  - The witness contributes an additional vote, which makes it easier to keep a majority as nodes go down.
- What happens if the production datacenter (2 nodes) goes down?
  - The remaining node in Datacenter B holds 1 vote.
  - Without a witness, that is 1 vote out of 3, which is not a majority (a majority of 3 is 2).
  - The cluster will therefore NOT stay online on that one node, and you have downtime until the production datacenter returns or you force quorum manually on the surviving node.
  - This is a big problem for disaster recovery scenarios.
- Can the single surviving node vote for itself and come online?
  - No. A single node's own vote is not sufficient to form quorum in a 3-node cluster without a witness.
  - The cluster needs a majority of votes to stay online.
- Can you create 2 witnesses?
  - No, a cluster can have only one quorum witness.
  - The quorum model supports a single witness, which can be a disk, a file share, or a cloud witness.
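
To make the vote arithmetic above concrete, here is a minimal sketch in Python. It is not a cluster API; the function names `majority` and `has_quorum` are made up for illustration, and it only models simple majority voting with 1 vote per node.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half."""
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when the surviving votes form a strict majority."""
    return votes_present >= majority(total_votes)


# Layout from the question: 2 production nodes + 1 DR node, no witness.
TOTAL_VOTES = 3

print(majority(TOTAL_VOTES))       # 2
print(has_quorum(1, TOTAL_VOTES))  # False: DR node alone
print(has_quorum(2, TOTAL_VOTES))  # True: both production nodes
```

The same two functions work for any vote count, so you can plug in your own node and witness totals to check a failure scenario before relying on it.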
In short, I'd suggest adding a quorum witness, preferably a cloud witness or a file share witness, ideally hosted in a third location so that it does not fail together with either datacenter. Keep in mind what the witness does and does not give you. With static voting it raises the total to 4 votes (3 nodes + 1 witness), so the majority becomes 3. If the production datacenter goes down all at once (2 nodes lost), the node in the DR datacenter plus the witness hold 2 votes, which is still not a majority of 4, and no witness placement changes that arithmetic. Where the witness helps is with dynamic quorum (the default in Windows Server 2012 and later): if the production nodes fail one after another, the cluster recalculates the votes after each loss, and the DR node plus the witness can keep the cluster online. If both production nodes are lost at the same moment, the cluster stops on the DR node and you need to force quorum manually on that node to bring it back online.
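
The difference between losing both production nodes at once and losing them one at a time can be sketched as follows. This is a deliberately simplified model and not how the cluster service is implemented: it assumes the total is reduced by one vote after every failure the cluster survives, and it ignores dynamic witness and tie-breaker vote adjustments. The names `survives` and `sequential_failures` are hypothetical.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half."""
    return total_votes // 2 + 1


def survives(total_votes: int, votes_lost: int) -> bool:
    """Does the cluster keep quorum after losing votes_lost votes at once?"""
    return total_votes - votes_lost >= majority(total_votes)


def sequential_failures(node_votes: int, witness_vote: int, failures: int) -> bool:
    """Simplified dynamic quorum: nodes fail one at a time, and after each
    survived failure the failed node's vote is dropped from the total."""
    total = node_votes + witness_vote
    for _ in range(failures):
        if not survives(total, 1):
            return False
        total -= 1  # assumption: the cluster recalculates before the next failure
    return True


# 3 nodes + 1 witness = 4 votes, both production nodes lost simultaneously.
print(survives(4, 2))                # False

# Same cluster, production nodes lost one after another.
print(sequential_failures(3, 1, 2))  # True
```

Under these assumptions the witness does not save the cluster from a sudden site loss, but it does let the DR node carry on when the production nodes drop out in sequence.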
If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.
hth
Marcin