Windows Cluster: Understanding the Quorum Settings

In my earlier post, I explained Windows clustering and how SQL Server works in a clustered environment. In this post, let us try to understand the quorum settings of a Windows cluster. When I say quorum, do not read it as the quorum disk: quorum keeps its literal meaning in a cluster. Throughout this post I use the term witness disk to refer to the quorum disk. Let us see what quorum settings are available and how each one affects the Windows cluster.

What is a quorum?
As per Wikipedia, a quorum is the minimum number of members of a deliberative assembly necessary to conduct the business of that group. In short, quorum is the minimum number of votes required for a majority.

As I explained in my earlier post, the nodes participating in a Windows cluster are connected through a private network and communicate over User Datagram Protocol (UDP) port 3343. The quorum configuration of a failover cluster determines the number of failures (node failures) that the cluster can sustain while still remaining online. If one more failure occurs beyond this threshold, the cluster stops running.

Quorum is designed to handle the split-brain scenario. When nodes are unable to communicate with each other, each node assumes that the resource groups owned by the other nodes have to be brought online. When the same resource is brought online on multiple nodes at the same time, data corruption can occur. This scenario is called split brain.

Let us assume that we have a four-node cluster with one instance of SQL Server running on each node. Node1 and Node2 lose communication with Node3 and Node4. Node1 and Node2 can still communicate with each other, and so can Node3 and Node4. Neither group knows what happened to the other two nodes: are they offline, or is it just a communication failure? Without quorum, Node1 and Node2 would try to bring online the SQL Server instances (resources) owned by Node3 and Node4, and in the same way Node3 and Node4 would try to bring online the instances owned by Node1 and Node2, which would lead to disk corruption and many other issues. The Windows cluster quorum setting is designed to prevent this kind of scenario. With quorum in place, the cluster forces the cluster service to stop in one of the subsets of nodes, so that there is only one true owner for a particular resource group.

Voting
Having quorum (a majority) is based on a voting algorithm in which more than half of the voters must be online and able to communicate with each other. The cluster knows how many nodes were used to form the cluster, and therefore how many votes constitute a quorum. If the number of votes drops below the majority, the cluster service stops on the nodes of that group. The cluster requires more than half of the total votes to achieve quorum, which avoids a tie in the number of votes. In an 8-node cluster, 5 voters must be online and able to communicate with each other to have quorum. Because of this logic, it is recommended to always have an odd number of total voters in the cluster; the quorum setting defines who the voters are. This does not necessarily mean that an odd number of nodes is needed to form the cluster, since a witness disk (quorum disk) or a file share can contribute a vote, depending on the quorum setting.
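
The majority rule is plain arithmetic, so it can be sketched in a few lines of Python. The helper names `votes_needed` and `has_quorum` are made up for this illustration and are not part of any Windows API:

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half."""
    return total_votes // 2 + 1


def has_quorum(votes_online: int, total_votes: int) -> bool:
    """True when the voters that can see each other form a majority."""
    return votes_online >= votes_needed(total_votes)


print(votes_needed(8))   # 5, the eight-node example
print(has_quorum(4, 8))  # False: exactly half is a tie, not a majority
print(has_quorum(5, 9))  # True: eight nodes plus one witness vote
```

With an even number of votes, an even split leaves both halves without quorum, which is why an odd total is preferred.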

Quorum Settings
Windows Server 2008 clusters support four quorum models:

  1. Node Majority
  2. Node and Disk Majority
  3. Node and File Share Majority
  4. No Majority (Disk Only)

Node Majority: The Node Majority option is recommended for clusters with an odd number of nodes. This configuration can handle the loss of half of the cluster nodes, rounded down. For example, a five-node cluster can handle the failure of two nodes. Suppose three of the nodes (N1, N2, N3) can communicate with each other but the other two (N4 and N5) cannot reach them. The group formed by the three nodes has quorum (a majority), so the cluster remains active there, and the cluster service is stopped on the other two nodes (N4 and N5). The resource groups (SQL Server instances) hosted on those two nodes go offline and come online on one of the three remaining nodes, based on the possible-owner settings.
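
The five-node example can be replayed with a small Python sketch. It models Node Majority only (one vote per node, no witness); the function names `tolerated_failures` and `surviving_partitions` are hypothetical and chosen just for this illustration:

```python
def tolerated_failures(total_nodes: int) -> int:
    """Nodes that can fail while the survivors still hold a strict majority."""
    needed = total_nodes // 2 + 1
    return total_nodes - needed


def surviving_partitions(partitions: list[set[str]]) -> list[list[str]]:
    """Return the partitions (sorted for display) that keep quorum."""
    total_nodes = sum(len(p) for p in partitions)
    needed = total_nodes // 2 + 1
    return [sorted(p) for p in partitions if len(p) >= needed]


print(tolerated_failures(5))  # 2
print(surviving_partitions([{"N1", "N2", "N3"}, {"N4", "N5"}]))  # [['N1', 'N2', 'N3']]
print(surviving_partitions([{"N1", "N2"}, {"N3", "N4"}]))        # []
```

The last line shows why this model is a poor fit for an even number of nodes: in a 2/2 split neither side has a majority, so the cluster service stops everywhere.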

Node and Disk Majority: This option is recommended for clusters with an even number of nodes. In this configuration every node gets one vote and the witness disk (quorum disk) gets one vote, which makes the total number of votes odd. The witness disk is a small (approximately 1 GB) clustered disk. It is highly available and can fail over between nodes, and it is part of the cluster core resource group. In a four-node cluster, if there is a partition between two subsets of two nodes, one subset will own the witness disk; that subset has quorum and the cluster remains online. This means that the cluster can lose any two voters, whether they are two nodes or one node and the witness disk.
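
The claim that a four-node cluster with a witness disk can lose any two voters can be checked by brute force. This is only a sketch of the vote counting, with hypothetical voter labels, not a model of how the cluster service arbitrates the disk:

```python
from itertools import combinations

VOTERS = ["N1", "N2", "N3", "N4", "witness_disk"]  # four nodes + one witness vote
NEEDED = len(VOTERS) // 2 + 1                      # 3 of 5


def survives_loss_of(lost: tuple[str, ...]) -> bool:
    """True if the voters that remain still form a majority."""
    return len(VOTERS) - len(lost) >= NEEDED


def partition_has_quorum(nodes: list[str], owns_witness: bool) -> bool:
    """Count one vote per node, plus one if this side owns the witness disk."""
    votes = len(nodes) + (1 if owns_witness else 0)
    return votes >= NEEDED


# Every possible pair of lost voters leaves quorum intact ...
print(all(survives_loss_of(pair) for pair in combinations(VOTERS, 2)))      # True
# ... but no combination of three lost voters does.
print(any(survives_loss_of(triple) for triple in combinations(VOTERS, 3)))  # False

# The 2/2 split: only the side that owns the witness disk keeps running.
print(partition_has_quorum(["N1", "N2"], owns_witness=True))   # True
print(partition_has_quorum(["N3", "N4"], owns_witness=False))  # False
```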


Node and File Share Majority: This configuration is similar to Node and Disk Majority, but the witness disk is replaced with a file share, also known as a File Share Witness (FSW) resource. This quorum configuration is usually used in multi-site clusters (nodes in different physical locations) or where there is no common storage. The File Share Witness resource is a file share on any server in the same Active Directory to which all cluster nodes have access. One of the nodes in the cluster places a lock on the file share and is thereby considered the owner of the file share. When this node goes offline or loses connectivity, another node grabs the lock and owns the file share. On a standalone server the file share is not highly available; however, it can also be placed on a clustered file share on an independent cluster, which makes the FSW clustered and able to fail over between nodes. It is important not to place this file share on a node of the same cluster, because losing that node would mean losing two votes. Unlike a witness disk, an FSW does not store the cluster configuration data; it only contains information about which version of the cluster configuration database is the most recent.

No Majority (Disk Only): This configuration was available in Windows Server 2003 and has been retained for compatibility reasons; it is strongly recommended not to use it. In this configuration only the witness disk has a vote, and there are no other voters in the cluster. That means that even if all nodes are online and able to communicate, the entire cluster goes offline when the witness disk fails or becomes corrupted. The witness disk is therefore a single point of failure.
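
To see the single point of failure next to the other models, here is a rough Python sketch of the vote counting. The model names are labels invented for this example, and the disk-only branch assumes, as a simplification, that one node able to reach the witness disk is enough to keep the cluster up:

```python
def cluster_online(model: str, nodes_online: int, total_nodes: int,
                   witness_online: bool) -> bool:
    """Rough vote count for the four quorum models (illustrative only)."""
    if model == "disk_only":
        # The witness disk is the only voter: healthy nodes cannot
        # compensate once the disk is gone.
        return witness_online and nodes_online >= 1
    has_witness = model in ("node_and_disk", "node_and_file_share")
    total_votes = total_nodes + (1 if has_witness else 0)
    votes_online = nodes_online + (1 if has_witness and witness_online else 0)
    return votes_online >= total_votes // 2 + 1


# All four nodes healthy, witness disk lost:
print(cluster_online("disk_only", 4, 4, witness_online=False))      # False
print(cluster_online("node_and_disk", 4, 4, witness_online=False))  # True
# Three of four nodes lost, witness disk healthy:
print(cluster_online("disk_only", 1, 4, witness_online=True))       # True
```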



I hope this gave you a fair idea of the various quorum settings available in a Windows Server 2008 cluster.

Windows 2008 Server: Moving Cluster Quorum Disk to New SAN

In my earlier post, I explained how to move the MSDTC disk to a new SAN. In this post we will go through the procedure for moving the quorum drive to the new SAN. Follow the steps mentioned in the earlier post to add the new disk to the Available Storage group.

In our environment the existing quorum drive is Q and the new designated drive is X. Once the new drive is available in the Available Storage group, follow the steps given below:

  • Open the cluster manager and select the cluster group in the left pane.
  • In the right pane, you will see an option called More Actions. Clicking it opens a popup menu.
  • Click the first option, Configure Cluster Quorum Settings, which opens a new screen.
  • Click Next, which leads to the quorum configuration settings screen. Select the appropriate setting for your environment. The default is the second option, which should suit almost all environments.
  • Click Next to open the disk selection page and select the appropriate disk. In our case we have to select the X drive. You can expand the disk to see its drive letter.
  • After selecting the appropriate disk, click Next to reach the Confirmation page. Clicking Next on the Confirmation page moves the quorum to the new disk. You can now see a folder named Cluster on the new drive (X). The old drive will be available in the Available Storage group.
Moving the quorum is a completely online operation and does not require any downtime. If you really want to keep the drive letter Q for the quorum drive, you can do it in two ways:
  • Unassign the drive letter Q from the old drive and change the drive letter of the new quorum drive (X) to Q. You will get a warning message, though, and I did not proceed with this because I did not want to take any risk with our cluster environment.
  • Change the drive letter of the old disk to any available drive letter, for example Y. Follow the steps mentioned earlier to move the quorum to the Y drive. Change the drive letter X to Q. Then follow the same steps again to move the quorum to the Q drive, which is the new quorum drive.
I followed the above steps to move the quorum disk to the new SAN and it worked well.
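
The order of the steps in the second method is easy to get wrong, so here is a small Python sketch that only tracks state; it does not touch a real cluster. The disk labels `old_san_disk` and `new_san_disk` and the function name are hypothetical, and the rule that a disk is re-lettered only while it is not the witness is my reading of why this method avoids the warning:

```python
def move_witness_keeping_letter(old_letter="Q", new_letter="X", spare_letter="Y"):
    """Replay the second method and return (letters, witness, log)."""
    letters = {"old_san_disk": old_letter, "new_san_disk": new_letter}
    witness = "new_san_disk"  # starting point: the wizard already moved quorum to X
    log = []

    def reletter(disk, letter):
        # Only a disk that is not the current witness gets a new letter,
        # and only a letter that no other disk is using.
        if disk == witness:
            raise ValueError(f"{disk} is the witness; move the quorum first")
        if letter in letters.values():
            raise ValueError(f"drive letter {letter} is already in use")
        log.append(f"change {letters[disk]} to {letter}")
        letters[disk] = letter

    def move_quorum(disk):
        nonlocal witness
        witness = disk
        log.append(f"move quorum to {letters[disk]}")

    reletter("old_san_disk", spare_letter)  # Q -> Y frees the letter Q
    move_quorum("old_san_disk")             # quorum temporarily on the old disk
    reletter("new_san_disk", old_letter)    # X -> Q while X is not the witness
    move_quorum("new_san_disk")             # quorum ends on the new disk, now Q
    return letters, witness, log


letters, witness, log = move_witness_keeping_letter()
print(log)
print(letters[witness])  # Q
```

Each `move quorum` entry in the log stands for one pass through the Configure Cluster Quorum Settings wizard described earlier.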